Enable pre-pinning memory for RDMA message sends
The cost of memory pinning on Verbs and GNI is high, so we'd like to enable users to pre-pin memory for use in later RDMA message sends. This is a step toward a persistent messaging API in that users can pin once and send multiple times.
We'd like to let the user allocate however they like, and then underneath we avoid re-pinning memory that is already pinned. Unfortunately, Verbs does not seem to make re-pinning already-pinned memory any faster, and doesn't seem to have a method to query the pinned-ness of some memory, so we'll have to track that ourselves. We can provide a CkAlloc routine that wraps something like infi_CmiAlloc() or that accepts a parameter that says whether memory should be pinned or not.
#1 Updated by Jaemin Choi about 2 months ago
- Status changed from New to In Progress
infi_CmiAlloc() underneath, which in turn calls
getInfiCmiChunk() where the memory is actually pinned through
So this is not a memory pool per se, since pinning occurs at memory allocation time and not at program initialization time.
If this is fine, we could provide a CkAlloc() (which doesn't seem to exist currently) for the user to use when pinned memory is needed.
#3 Updated by Sam White about 1 month ago
Phil suggested a complementary optimization on the GNI RDMA patch (https://charm.cs.illinois.edu/gerrit/#/c/1908/) where the runtime would lazily deregister memory. That is, register memory that is not already registered, keep a cache of registered memory (with some configurable limit on the number of buffers and the total size of those buffers), and only deregister memory when that cache is full.
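For discussion, here is a minimal sketch of what such a lazy-deregistration cache could look like. Everything in it is hypothetical: RegistrationCache, fake_register() and fake_deregister() are placeholder names, the fake calls only count invocations instead of touching GNI or Verbs, and least-recently-used eviction is an assumption (the suggestion above does not say which buffer to drop when the cache is full).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <unordered_map>

// Hypothetical stand-ins for the machine layer's register/deregister calls.
// They only count invocations so the sketch runs without RDMA hardware.
static int g_register_calls = 0;
static int g_deregister_calls = 0;
static void fake_register(void* /*ptr*/, std::size_t /*size*/) { ++g_register_calls; }
static void fake_deregister(void* /*ptr*/) { ++g_deregister_calls; }

struct Entry {
  void* ptr;
  std::size_t size;
};

// Keeps buffers registered until a limit on buffer count or total bytes
// forces the least recently used one to be deregistered.
class RegistrationCache {
 public:
  RegistrationCache(std::size_t maxBuffers, std::size_t maxBytes)
      : maxBuffers_(maxBuffers), maxBytes_(maxBytes), totalBytes_(0) {}

  // Makes sure [ptr, ptr + size) is registered.
  // Returns true on a cache hit (no registration call was needed).
  bool acquire(void* ptr, std::size_t size) {
    assert(ptr != nullptr);
    auto it = index_.find(ptr);
    if (it != index_.end() && it->second->size >= size) {
      lru_.splice(lru_.begin(), lru_, it->second);  // mark as most recently used
      return true;
    }
    if (it != index_.end()) evict(it->second);  // cached region is too small
    // Deregister only when the cache is full. A single buffer larger than
    // maxBytes_ empties the cache and is then registered anyway.
    while (!lru_.empty() &&
           (lru_.size() + 1 > maxBuffers_ || totalBytes_ + size > maxBytes_)) {
      evict(std::prev(lru_.end()));
    }
    fake_register(ptr, size);
    lru_.push_front(Entry{ptr, size});
    index_[ptr] = lru_.begin();
    totalBytes_ += size;
    return false;
  }

  std::size_t buffers() const { return lru_.size(); }
  std::size_t bytes() const { return totalBytes_; }

 private:
  void evict(std::list<Entry>::iterator it) {
    fake_deregister(it->ptr);
    totalBytes_ -= it->size;
    index_.erase(it->ptr);
    lru_.erase(it);
  }

  std::size_t maxBuffers_;
  std::size_t maxBytes_;
  std::size_t totalBytes_;
  std::list<Entry> lru_;  // front = most recently used
  std::unordered_map<void*, std::list<Entry>::iterator> index_;
};
```

With a limit of two buffers, acquiring a third buffer deregisters whichever of the first two was used least recently; re-acquiring a buffer that is still cached costs only a map lookup. A real version would also have to handle buffers that are freed while still cached.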
#5 Updated by Sam White about 1 month ago
Is this more complicated than providing the following?
A) Add a CkAlloc() routine that just calls CmiAlloc()
B) Maintain an unordered_map or something that keeps track of pointers that are already pinned on a particular PE.
C) In the machine layers in which it is relevant (Verbs and GNI), check the map before pinning any memory.
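To make A) through C) concrete, here is a rough sketch under some loud assumptions: sketchAlloc()/sketchFree() stand in for the proposed CkAlloc() (they call malloc rather than CmiAlloc), PinnedTable is a hypothetical per-PE table, and fake_pin()/fake_unpin() are placeholders that only count calls instead of pinning through Verbs or GNI.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_map>

// Hypothetical stand-ins for the machine layer's pin/unpin calls.
// They only count invocations so the sketch runs without RDMA hardware.
static int g_pin_calls = 0;
static void fake_pin(void* /*ptr*/, std::size_t /*size*/) { ++g_pin_calls; }
static void fake_unpin(void* /*ptr*/) {}

// B) Table of buffers already pinned on this PE (one instance per PE).
class PinnedTable {
 public:
  // C) Check the table before pinning. Returns true only if a pin call
  // was actually made.
  bool ensurePinned(void* ptr, std::size_t size) {
    assert(ptr != nullptr);
    auto it = table_.find(ptr);
    if (it != table_.end() && it->second >= size) return false;  // already pinned
    if (it != table_.end()) fake_unpin(ptr);  // pinned region too small: re-pin
    fake_pin(ptr, size);
    table_[ptr] = size;
    return true;
  }

  bool isPinned(const void* ptr) const {
    return table_.count(const_cast<void*>(ptr)) != 0;
  }

  // Unpin and forget a buffer; a no-op if it was never pinned.
  void release(void* ptr) {
    auto it = table_.find(ptr);
    if (it == table_.end()) return;
    fake_unpin(ptr);
    table_.erase(it);
  }

 private:
  std::unordered_map<void*, std::size_t> table_;  // pointer -> pinned size
};

// A) Hypothetical allocation wrapper with a "pinned" parameter.
void* sketchAlloc(PinnedTable& table, std::size_t size, bool pinned) {
  void* p = std::malloc(size);
  if (p != nullptr && pinned) table.ensurePinned(p, size);
  return p;
}

// Matching free: drops the table entry before releasing the memory.
void sketchFree(PinnedTable& table, void* p) {
  table.release(p);
  std::free(p);
}
```

The point of the table is that a later RDMA send on the same buffer calls ensurePinned() again and gets back false without paying the pinning cost. The sketch keys on the base pointer only, so it would not recognize a send from an offset inside a pinned buffer.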
Since the accel/GPU manager updates aren't going to be merged before 6.8.0 and RDMA is one of the most important new features of 6.8.0, it would be good to have this before then.
#6 Updated by Jaemin Choi about 1 month ago
If we just provide a wrapper for malloc and keep track of its pinned-ness, it should be relatively straightforward like you said.
I was more worried about Phil's suggestion of keeping a cache of pre-registered memory, because that sounds similar to creating a mempool of varying sizes.
I have a question though; where should the new CkAlloc routine be defined? Should I create a new file called ckalloc.C in ck-core and somehow add it to the build chain?