Feature #1459

Feature #1234: Avoid sender-side copy for large contiguous messages. API for charm and converse layers

Zero-copy send support for the netlrts machine layer

Added by Phil Miller 7 months ago. Updated 26 days ago.

Status: In Progress
Priority: Normal
Assignee:
Category: Machine Layers
Target version:
Start date: 04/05/2017
Due date:
% Done: 0%


Description

In the netlrts machine layer, it's pretty easy to stream data from an arbitrary address to a remote recipient on request. We can use this to implement zero-copy sends, which would reduce the memory footprint and move the copying time overhead off the worker threads and onto the comm thread.

History

#1 Updated by Phil Miller 7 months ago

  • Target version set to 6.8.0

#2 Updated by Phil Miller 7 months ago

  • Tags changed from lrts netlrts rdma to lrts machine-layers/netlrts rdma

#3 Updated by Phil Miller 7 months ago

  • Tags changed from lrts machine-layers/netlrts rdma to lrts machine-layers netlrts rdma

#4 Updated by Eric Bohm 6 months ago

  • Assignee set to Vipul Harsh

#5 Updated by Vipul Harsh 6 months ago

  • Assignee changed from Vipul Harsh to Nitin Bhat

#6 Updated by Nitin Bhat 6 months ago

  • Tags set to #rdma

#7 Updated by Nitin Bhat 5 months ago

  • Status changed from New to In Progress

The current netlrts layer (UDP) in machine-eth.c sends a datagram header with every packet.
For every packet, it creates the header at (char *ptr - DGRAM_HEADER_SIZE) and sends the header and payload together. The receiver uses the header information to reassemble the packets into a message.

I am guessing that we’re able to use the DGRAM_HEADER_SIZE bytes before the packet ptr for two reasons:
1. We know that the previous packet has already been delivered (using acks)?
2. The buffer is the Charm++-owned copy, so it is free from user intervention and will be freed after the message is sent.

But if we’re to send a user buffer using this scheme, how do we send a header with every packet? We can’t touch the user's buffer to write the header info at a negative offset.
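
To make the concern concrete, here is a minimal sketch of in-place framing in a runtime-owned buffer. It is not the machine-eth.c code: InplaceHeader, INPLACE_HEADER_SIZE and the inplace_* functions are hypothetical names, and the two-field header layout is an assumption made only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header layout; the real datagram header may differ. */
typedef struct {
    uint32_t seqno;   /* packet sequence number */
    uint32_t length;  /* payload bytes in this packet */
} InplaceHeader;

#define INPLACE_HEADER_SIZE sizeof(InplaceHeader)

/* Allocate a runtime-owned message with headroom in front of the payload,
 * so the first packet's header has somewhere to live. */
static char *inplace_alloc(size_t payload_len) {
    char *base = malloc(INPLACE_HEADER_SIZE + payload_len);
    return base ? base + INPLACE_HEADER_SIZE : NULL;
}

static void inplace_free(char *payload) {
    if (payload) free(payload - INPLACE_HEADER_SIZE);
}

/* Write a header into the bytes immediately before ptr and return the
 * start of the packet as it would go on the wire. This overwrites
 * whatever was stored in those bytes. */
static char *inplace_frame(char *ptr, uint32_t seqno, uint32_t length) {
    InplaceHeader h = { seqno, length };
    char *wire;
    assert(ptr != NULL);
    wire = ptr - INPLACE_HEADER_SIZE;
    memcpy(wire, &h, INPLACE_HEADER_SIZE);
    return wire;
}
```

In this sketch, framing the second packet of a message overwrites the last INPLACE_HEADER_SIZE bytes of the first packet's payload. That is harmless for a buffer the runtime owns and will free, but it would corrupt a user's buffer.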

#8 Updated by Phil Miller 5 months ago

  • Tags changed from #rdma to #rdma, #netlrts, #lrts, #machine-layers

We could do the packetization in a set-aside buffer that we copy the user's data through as we send it. The key is to tightly constrain the size of that buffer, making it just large enough to reach full network bandwidth. If packet injection is synchronous, then that buffer only needs to hold one packet.
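
A rough sketch of that idea, assuming synchronous injection: each chunk of the user's data is copied into a single packet-sized staging buffer behind a header and handed to an injection callback, so the user's buffer is only ever read. All names here (StagingHeader, STAGING_MAX_PAYLOAD, staging_stream, StagingSink) are hypothetical and not taken from the netlrts code; the loopback sink exists only to exercise the sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical header and payload size, for illustration only. */
typedef struct {
    uint32_t seqno;   /* packet sequence number */
    uint32_t length;  /* payload bytes in this packet */
} StagingHeader;

enum { STAGING_MAX_PAYLOAD = 256 };

/* Synchronous injection: the callee must be done with pkt on return. */
typedef int (*staging_inject_fn)(const char *pkt, size_t pkt_len, void *ctx);

/* Stream user_buf through one packet-sized staging buffer.
 * Returns the number of packets sent, or -1 if injection fails. */
static long staging_stream(const char *user_buf, size_t len,
                           staging_inject_fn inject, void *ctx) {
    char staging[sizeof(StagingHeader) + STAGING_MAX_PAYLOAD];
    uint32_t seqno = 0;
    size_t off = 0;
    assert(user_buf != NULL && inject != NULL);
    while (off < len) {
        size_t chunk = len - off;
        StagingHeader h;
        if (chunk > STAGING_MAX_PAYLOAD) chunk = STAGING_MAX_PAYLOAD;
        h.seqno = seqno;
        h.length = (uint32_t)chunk;
        /* Header and payload are assembled in the staging buffer;
         * the user's buffer is never written. */
        memcpy(staging, &h, sizeof h);
        memcpy(staging + sizeof h, user_buf + off, chunk);
        if (inject(staging, sizeof h + chunk, ctx) != 0) return -1;
        off += chunk;
        seqno++;
    }
    return (long)seqno;
}

/* Loopback receiver for testing: appends payloads in arrival order. */
typedef struct {
    char *dst;
    size_t cap;
    size_t used;
} StagingSink;

static int staging_sink_inject(const char *pkt, size_t pkt_len, void *ctx) {
    StagingSink *s = ctx;
    StagingHeader h;
    if (pkt_len < sizeof h) return -1;
    memcpy(&h, pkt, sizeof h);
    if (h.length != pkt_len - sizeof h) return -1;
    if (s->used + h.length > s->cap) return -1;
    memcpy(s->dst + s->used, pkt + sizeof h, h.length);
    s->used += h.length;
    return 0;
}
```

The extra memory is bounded by one header plus STAGING_MAX_PAYLOAD bytes regardless of message size. If injection were asynchronous, the staging area would need to hold as many packets as can be in flight at once.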

#9 Updated by Phil Miller 5 months ago

  • Target version changed from 6.8.0 to 6.8.1

#10 Updated by Sam White 26 days ago

  • Target version changed from 6.8.1 to 6.9.0
  • Category set to Machine Layers
