Feature #1468

Enable pre-pinning memory for zero-copy message sends

Added by Sam White 5 months ago. Updated 9 days ago.

The cost of memory pinning on Verbs and GNI is high, so we'd like to enable users to pre-pin memory for use in later RDMA message sends. This is a step toward a persistent messaging API in that users can pin once and send multiple times.

We'd like to let the user allocate however they like, and then underneath we avoid re-pinning memory that is already pinned. Unfortunately, Verbs does not seem to handle re-pinning of already-pinned memory any faster, and doesn't seem to have a method to query the pinned-ness of some memory, so we'll have to track that ourselves. We can provide a CkAlloc routine that wraps something like infi_CmiAlloc() or that accepts a parameter that says whether memory should be pinned or not.


#1 Updated by Jaemin Choi 3 months ago

  • Status changed from New to In Progress

CmiAlloc() calls infi_CmiAlloc() underneath, which in turn calls getInfiCmiChunk() where the memory is actually pinned through ibv_reg_mr().
So this is not a memory pool per se, since pinning occurs at memory allocation time and not at program initialization time.
If this is fine, we could provide a CkAlloc() (which doesn't seem to exist currently) for the user to use when pinned memory is needed.

#2 Updated by Sam White 3 months ago

Yes, I think a CkAlloc is what we want. We can eventually provide a pre-pinned memory pool behind that, but at first just pinning inside the call to CkAlloc is okay.

#3 Updated by Sam White 3 months ago

Phil suggested a complementary optimization on the GNI RDMA patch, in which the runtime would lazily deregister memory. That is, register memory that is not already registered, keep a cache of pre-registered memory (with some configurable limit on the number of buffers and the total size of those buffers), and only de-register memory when that cache is full.
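
For discussion, here is a rough sketch of what such a lazy-deregistration cache might look like. Everything in it is an assumption for illustration: the names LazyRegCache and RegStats are hypothetical, the real register/deregister calls are replaced by counters, and least-recently-used eviction is just one possible policy (the suggestion above does not specify one).

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <unordered_map>

// Hypothetical counters standing in for the real register/deregister calls.
struct RegStats { int registered = 0; int deregistered = 0; };

// Hypothetical cache: keeps registrations alive until a limit is hit.
class LazyRegCache {
 public:
  LazyRegCache(std::size_t maxBuffers, std::size_t maxBytes, RegStats& stats)
      : maxBuffers_(maxBuffers), maxBytes_(maxBytes), stats_(stats) {}

  // Returns true on a hit (buffer already registered), false if it
  // had to be registered now.
  bool acquire(const void* ptr, std::size_t len) {
    auto it = index_.find(ptr);
    if (it != index_.end() && it->second->len >= len) {
      lru_.splice(lru_.begin(), lru_, it->second);  // mark most recently used
      return true;
    }
    if (it != index_.end()) evict(it->second);  // cached entry is too small
    // Deregister only when the cache is full (assumed policy: oldest first).
    // A buffer larger than maxBytes_ empties the cache and is still admitted.
    while (!lru_.empty() &&
           (lru_.size() + 1 > maxBuffers_ || bytes_ + len > maxBytes_)) {
      evict(std::prev(lru_.end()));
    }
    ++stats_.registered;  // real code would register the memory here
    lru_.push_front(Entry{ptr, len});
    index_[ptr] = lru_.begin();
    bytes_ += len;
    return false;
  }

  std::size_t size() const { return lru_.size(); }
  std::size_t bytes() const { return bytes_; }

 private:
  struct Entry { const void* ptr; std::size_t len; };
  using Iter = std::list<Entry>::iterator;

  void evict(Iter it) {
    assert(bytes_ >= it->len);
    ++stats_.deregistered;  // real code would deregister the memory here
    bytes_ -= it->len;
    index_.erase(it->ptr);
    lru_.erase(it);
  }

  std::size_t maxBuffers_, maxBytes_, bytes_ = 0;
  RegStats& stats_;
  std::list<Entry> lru_;  // front = most recently used
  std::unordered_map<const void*, Iter> index_;
};
```

With a limit of two buffers, acquiring a, b, a, c in that order would register three times and deregister only b, since a was touched more recently.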

#4 Updated by Jaemin Choi 3 months ago

Can we move this to 6.8.1? I don't think it will be finished anytime soon.

#5 Updated by Sam White 3 months ago

Is this more complicated than providing the following?
A) Add a CkAlloc() routine that just calls CmiAlloc()
B) Maintain an unordered_map or something that keeps track of pointers that are already pinned on a particular PE.
C) In the machine layers in which it is relevant (Verbs and GNI), check the map before pinning any memory.
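
Steps B and C could be sketched roughly as follows. This is only an illustration under assumptions: PinRegistry, ensurePinned, and fakePin are hypothetical names, and fakePin merely counts calls where a machine layer would perform the actual pinning (ibv_reg_mr() on Verbs, for example).

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for the machine-layer pinning call; it only
// counts how often pinning would have happened.
static int g_pinCalls = 0;
static void fakePin(const void* /*ptr*/, std::size_t /*len*/) { ++g_pinCalls; }

// Hypothetical per-PE registry of buffers that are already pinned.
class PinRegistry {
 public:
  // Pins the buffer unless a pin starting at the same address and
  // covering at least len bytes is already recorded.
  // Returns true if a new pin was performed.
  bool ensurePinned(const void* ptr, std::size_t len) {
    assert(ptr != nullptr);
    auto it = pinned_.find(ptr);
    if (it != pinned_.end() && it->second >= len) return false;  // skip re-pin
    // Real code would unpin a too-small existing registration first.
    fakePin(ptr, len);
    pinned_[ptr] = len;
    return true;
  }

  bool isPinned(const void* ptr) const { return pinned_.count(ptr) != 0; }

  // Call when the buffer is freed or unpinned.
  void forget(const void* ptr) { pinned_.erase(ptr); }

 private:
  std::unordered_map<const void*, std::size_t> pinned_;  // base address -> length
};
```

Keying on the base address keeps the lookup O(1) on average, but it would not recognize a send from the middle of a pinned buffer; an interval structure would be needed for that case.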

Since accel/gpu manager updates aren't going to be merged before 6.8.0 and rdma is one of the most important new features of 6.8.0, it would be good to have this before then.

#6 Updated by Jaemin Choi 3 months ago

If we just provide a wrapper for malloc and keep track of its pinned-ness, it should be relatively straightforward like you said.
I was more worried about Phil's suggestion of keeping a cache of pre-registered memory, because that sounds similar to creating a mempool of varying sizes.
I have a question though; where should the new CkAlloc routine be defined? Should I create a new file called ckalloc.C in ck-core and somehow add it to the build chain?

#7 Updated by Phil Miller 3 months ago

  • Target version changed from 6.8.0 to 6.9.0

The automatic caching approach could go in 6.8.1, but an explicit API would have to be the next feature release.

#8 Updated by Sam White 9 days ago

  • Assignee changed from Jaemin Choi to Nitin Bhat
  • Subject changed from Enable pre-pinning memory for RDMA message sends to Enable pre-pinning memory for zero-copy message sends
  • Status changed from In Progress to New
