Feature #1657

shm support for nocopy sends using the Entry Method API across processes on the same host

Added by Sam White 4 months ago. Updated about 1 month ago.

Status: New
Priority: Normal
Assignee:
Category: Machine Layers
Target version:
Start date: 08/08/2017
Due date:
% Done: 0%
Tags:

Description

This should be straightforward to implement, at least for the nocopy payload: the small metadata message can still go through the network, but the large nocopy array parameter should be transferred using pxshm or xpmem when both processes are on the same host.


Related issues

Related to Charm++ - Feature #1655: Enable use of shm transport for regular messages in LRTS (New, 10/25/2017)

History

#1 Updated by Sam White 2 months ago

  • Assignee set to Nitin Bhat

#2 Updated by Sam White 2 months ago

See the following paper for a description of how to use XPMEM efficiently. The key is that you can register the entire virtual address space with xpmem_make() during startup; after that, memory registration, deregistration, and copies are cheap at runtime.

See section "XPMEM BTL – Vader" on pages 2-3 here: https://www.open-mpi.org/papers/cug-2012/cug_2012_open_mpi_for_cray_xe_xk.pdf
The implementation for this is in: openmpi-2.0.0/opal/mca/btl/vader/btl_vader_xpmem.{h,c}

We also have some existing code using xpmem in charm/src/arch/util/machine-xpmem.c, but that code performs an extra copy into an intermediate xpmem buffer. The same is true of machine-pxshm.c.

#3 Updated by Nitin Bhat about 1 month ago

  • Subject changed from pxshm/xpmem support for nocopy sends across processes on the same host to shm support for nocopy sends across processes on the same host

Experimenting with different models has shown that CMA (Cross Memory Attach) is a good candidate for exploiting shm for within-host communication. CMA usage for nocopy sends through the direct API has been implemented here: https://charm.cs.illinois.edu/gerrit/#/c/3116/8/src/arch/util/machine-rdma.h

Was this request to add CMA support to the Zerocopy Entry Method API?
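
For reference, here is a minimal sketch of what a CMA-based pull of a nocopy payload could look like on Linux. It is not the code from the Gerrit change above. The `nocopy_meta_t` struct and the `cma_fetch` helper are hypothetical names made up for illustration; the only real API used is `process_vm_readv(2)`. The assumption is that the small metadata message (sent over the network as before) carries the sender's pid, source address, and length, and that the receiver then copies the payload straight out of the sender's address space without an intermediate shared buffer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>   /* process_vm_readv */
#include <unistd.h>    /* getpid, used in the usage note below */

/* Hypothetical metadata for one nocopy payload. In this sketch it stands in
 * for whatever the small metadata message would carry. */
typedef struct {
    pid_t  src_pid;   /* pid of the sending process on the same host */
    void  *src_addr;  /* payload address in the sender's address space */
    size_t len;       /* payload size in bytes */
} nocopy_meta_t;

/* Copy the payload described by meta directly from the sender's address
 * space into dst. Returns 0 on success, -1 on failure with errno set. */
static int cma_fetch(const nocopy_meta_t *meta, void *dst)
{
    assert(meta != NULL && dst != NULL);

    size_t done = 0;
    while (done < meta->len) {
        struct iovec local  = { (char *)dst + done,            meta->len - done };
        struct iovec remote = { (char *)meta->src_addr + done, meta->len - done };

        /* One local and one remote iovec; the flags argument must be 0. */
        ssize_t n = process_vm_readv(meta->src_pid, &local, 1, &remote, 1, 0);
        if (n < 0)
            return -1;      /* e.g. EPERM if ptrace access is not permitted */
        if (n == 0) {
            errno = EIO;    /* no progress; avoid looping forever */
            return -1;
        }
        done += (size_t)n;  /* partial transfers are possible, so continue */
    }
    return 0;
}
```

To try it without a second process, fill a `nocopy_meta_t` with `getpid()`, the address of a local source buffer, and its size, then call `cma_fetch` into a destination buffer. Across real processes, the call is subject to the kernel's ptrace access checks, so it can fail with `EPERM` depending on system configuration.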

#4 Updated by Nitin Bhat about 1 month ago

  • Subject changed from shm support for nocopy sends across processes on the same host to shm support for nocopy sends using the Entry Method API across processes on the same host

#5 Updated by Nitin Bhat about 1 month ago

  • Related to Feature #1655: Enable use of shm transport for regular messages in LRTS added
