Avoid memory pinning overhead for RDMA sends within a process
When an RDMA message is sent to another object in the same process, we already do a direct memcpy rather than an rget() or equivalent RDMA operation, but the sender still pays the cost of memory pinning. Pinning has been shown to be expensive on Verbs and GNI, and it can be avoided entirely for sends within the same process. The sender should therefore check the receiver's last known location before pinning and sending the metadata message.
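A minimal sketch of the proposed sender-side check, in plain C. Every name here (pin_memory, rdma_metadata, prepare_send, pin_calls) is hypothetical and does not come from the Charm++ source; pinning is simulated with a counter so the effect of the check is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a registration call; counts simulated pins. */
static int pin_calls = 0;

static void pin_memory(const void *buf, size_t len) {
    (void)buf;
    (void)len;
    pin_calls++; /* a real layer would register the buffer with the NIC here */
}

/* Hypothetical metadata describing the source buffer of an RDMA send. */
typedef struct {
    const void *src;
    size_t len;
    int pinned; /* 1 if the source buffer was pinned, 0 otherwise */
} rdma_metadata;

/* Build the metadata for a send, pinning only when the receiver's
 * last known location is a different process (assumed to be tracked
 * by the caller as a node id). */
static rdma_metadata prepare_send(int my_node, int dest_last_known_node,
                                  const void *src, size_t len) {
    assert(src != NULL);
    rdma_metadata md = { src, len, 0 };
    if (dest_last_known_node != my_node) {
        pin_memory(src, len);
        md.pinned = 1;
    }
    return md;
}
```

With this shape, prepare_send(0, 0, buf, n) leaves pin_calls unchanged, while prepare_send(0, 1, buf, n) increments it once. Because the location is only the last known one, a real implementation would also need a fallback for a receiver that has since migrated; that case is not handled here.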
#1 Updated by Nitin Bhat 6 months ago
- Status changed from New to Rejected
RDMA exchanges within a process (Charm node) do not pin memory in the existing code. Memory pinning happens only in the machine layer code, which is on the code path for between-process sends only.
The existing within-process (Charm node) code path uses the existing infrastructure for within-process sends (CmiInterFreeSendFn in machine-common-core.c) to send the metadata message. The receiver then intercepts the metadata message in the process handler and calls CkRdmaCopyMsg, which issues a memcpy.
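The flow described above can be sketched roughly as follows, in plain C. This is an illustration only: md_msg, local_send, and receive_one are hypothetical stand-ins for the metadata message, the within-process send, and the receiver-side handler. They do not reproduce the signatures of CmiInterFreeSendFn or CkRdmaCopyMsg, and the fixed-size queue is an assumption made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical metadata message: describes the sender's buffer only. */
typedef struct {
    const void *src;
    size_t len;
} md_msg;

/* Hypothetical in-process message queue (no machine layer, no pinning). */
#define QUEUE_CAP 8
static md_msg queue[QUEUE_CAP];
static int q_head = 0;
static int q_tail = 0;

/* Sender side: enqueue the metadata message for a receiver in the
 * same process. Returns 0 on success, -1 if the queue is full. */
static int local_send(md_msg m) {
    if (q_tail - q_head == QUEUE_CAP) {
        return -1;
    }
    queue[q_tail % QUEUE_CAP] = m;
    q_tail++;
    return 0;
}

/* Receiver side: take one metadata message and copy the payload
 * straight from the sender's buffer with memcpy. Returns the number
 * of bytes copied, or -1 if no message is pending or the payload
 * does not fit in dst. */
static long receive_one(void *dst, size_t cap) {
    assert(dst != NULL);
    if (q_head == q_tail) {
        return -1;
    }
    md_msg m = queue[q_head % QUEUE_CAP];
    q_head++;
    if (m.len > cap) {
        return -1;
    }
    memcpy(dst, m.src, m.len);
    return (long)m.len;
}
```

Since sender and receiver share an address space, the pointer carried in the metadata message is directly usable by the receiver, which is why no registration step appears anywhere on this path.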