Feature #1234: Avoid sender-side copy for large contiguous messages. API for charm and converse layers
Zero-copy send support for the netlrts machine layer
In the netlrts machine layer, it's pretty easy to stream data from an arbitrary address to a remote recipient on request. We can use this to implement zero-copy sends, which reduces the memory footprint and moves the copying overhead off the worker threads and onto the comm thread.
#7 Updated by Nitin Bhat 3 months ago
- Status changed from New to In Progress
The current netlrts layer (UDP) in machine-eth.c sends a datagram header with every packet.
For every packet, it writes the header at (char *ptr - DGRAM_HEADER_SIZE), immediately before the packet data, and sends header and data together. The receiver uses the header information to reassemble the packets.
I am guessing that we’re able to use the DGRAM_HEADER_SIZE bytes before the packet ptr for two reasons:
1. We know that the previous packet has been delivered? (using acks)
2. The buffer is a Charm++-owned copy of the message, so it is free from user intervention and will be freed after the message send.
But if we’re to send a user buffer using this scheme, how do we send a header with every packet? We can’t write into the user’s buffer to place the header info at a negative offset.
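To make the concern concrete, here is a minimal sketch of in-place framing. It is not the machine-eth.c code: HDR_SIZE stands in for DGRAM_HEADER_SIZE, and PKT_DATA, hdr_t, and frame_in_place are hypothetical names with an assumed header layout. It shows that framing packet 1 overwrites the last HDR_SIZE bytes of packet 0's payload, which is acceptable for a runtime-owned copy but not for a user buffer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE 8   /* hypothetical stand-in for DGRAM_HEADER_SIZE */
#define PKT_DATA 16  /* hypothetical payload bytes per packet */

/* Hypothetical header layout: sequence number and payload length. */
typedef struct { uint32_t seqno; uint32_t len; } hdr_t;

/*
 * Frame one packet in place: the header is written into the HDR_SIZE
 * bytes immediately before `payload`, so header and payload are
 * contiguous and can go out in a single send. Whatever lived in those
 * bytes before (headroom, or the tail of the previous packet) is lost.
 * Returns the start of the on-wire packet.
 */
static char *frame_in_place(char *payload, uint32_t seqno, uint32_t len) {
    hdr_t h = { seqno, len };
    char *wire = payload - HDR_SIZE;
    assert(len <= PKT_DATA);
    memcpy(wire, &h, sizeof h);
    return wire;
}
```

With a runtime-owned buffer that reserves HDR_SIZE bytes of headroom, the overwrite is harmless as long as the previous packet is no longer needed. With a user buffer, the same write would corrupt the user's data.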
#8 Updated by Phil Miller 3 months ago
- Tags changed from #rdma to #rdma, #netlrts, #lrts, #machine-layers
We could do the packetization in a set-aside buffer that we copy the user's data through as we send it. The key is to tightly constrain the size of that buffer, making it just large enough to get full network bandwidth. If the packet injection is synchronous, then that buffer only needs to hold one packet.
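A rough sketch of that idea, assuming synchronous injection. The names here (send_via_staging, inject_fn, sink_t, collect, HDR_SIZE, PKT_DATA) and the header layout are hypothetical and not taken from machine-eth.c. The user buffer is only read; each chunk is copied into a one-packet staging buffer behind its header and handed to the injection hook.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE 8   /* hypothetical stand-in for DGRAM_HEADER_SIZE */
#define PKT_DATA 16  /* hypothetical payload bytes per packet */

/* Hypothetical header layout: sequence number and payload length. */
typedef struct { uint32_t seqno; uint32_t len; } hdr_t;

/* Assumed synchronous hook: it must be finished with `pkt` on return. */
typedef void (*inject_fn)(const char *pkt, size_t pkt_len, void *ctx);

/* Stream `len` bytes of `user` through a one-packet staging buffer.
 * The user buffer is never written. Returns the number of packets sent. */
static uint32_t send_via_staging(const char *user, size_t len,
                                 inject_fn inject, void *ctx) {
    char staging[HDR_SIZE + PKT_DATA];   /* holds exactly one packet */
    uint32_t seqno = 0;
    size_t off = 0;
    while (off < len) {
        size_t n = (len - off < PKT_DATA) ? (len - off) : PKT_DATA;
        hdr_t h = { seqno, (uint32_t)n };
        memcpy(staging, &h, sizeof h);               /* header first */
        memcpy(staging + HDR_SIZE, user + off, n);   /* then the chunk */
        inject(staging, HDR_SIZE + n, ctx);          /* synchronous send */
        off += n;
        seqno++;
    }
    return seqno;
}

/* Demo receiver: strips each header and appends payloads in arrival order. */
typedef struct { char data[256]; size_t used; } sink_t;

static void collect(const char *pkt, size_t pkt_len, void *ctx) {
    sink_t *s = (sink_t *)ctx;
    hdr_t h;
    assert(pkt_len >= HDR_SIZE);
    memcpy(&h, pkt, sizeof h);
    assert(h.len == pkt_len - HDR_SIZE);
    assert(s->used + h.len <= sizeof s->data);
    memcpy(s->data + s->used, pkt + HDR_SIZE, h.len);
    s->used += h.len;
}
```

This still copies every byte once, but the copy goes through a small fixed-size buffer rather than a full-size duplicate of the message. If injection were asynchronous, the staging area would presumably need to hold as many packets as can be in flight at once.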