Feature #1655

Feature #1497: CMA support for passing data between processes on the same node

Enable use of shm transport for regular messages in LRTS

Added by Sam White 12 months ago. Updated 5 months ago.

Status: Merged
Priority: Normal
Assignee:
Category: Machine Layers
Target version:
Start date: 10/25/2017
Due date:
% Done: 100%
Tags:

Description

Experimenting with different models has shown that CMA (Cross Memory Attach) is a good candidate for exploiting shm for within-host communication. Shm transport over CMA has already been implemented for the Nocopy Direct API. An LRTS-based implementation can greatly improve intra-host, inter-process performance for large messages (regular, parameter-marshalled) across all LRTS-based layers.
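
For context, here is a minimal sketch of a CMA copy on Linux, assuming the `process_vm_readv` system call. The helper names `cma_read` and `cma_self_copy` are hypothetical and are not the Charm++ LRTS API. In a real transport the receiver would learn the sender's pid and buffer address through some other channel; so that the sketch can run standalone, the demo helper reads from its own process and falls back to `memcpy` if the kernel refuses the call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not the Charm++ API): copy `len` bytes that live at
 * `remote_addr` in process `pid` into `local_buf` using CMA.
 * Returns 0 on success, -1 if the kernel call fails or makes no progress. */
static int cma_read(pid_t pid, void *local_buf, void *remote_addr, size_t len)
{
    assert(local_buf != NULL && remote_addr != NULL);

    size_t done = 0;
    while (done < len) {
        /* Describe the bytes still missing on both sides. */
        struct iovec local  = { .iov_base = (char *)local_buf + done,
                                .iov_len  = len - done };
        struct iovec remote = { .iov_base = (char *)remote_addr + done,
                                .iov_len  = len - done };

        /* Single kernel-mediated copy, no intermediate shared buffer. */
        ssize_t n = process_vm_readv(pid, &local, 1, &remote, 1, 0);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Demo only: treat our own process as the "remote" one. Returns 1 if CMA
 * performed the copy, 0 if we fell back to memcpy (for example when the
 * call is blocked by a sandbox or a ptrace restriction). */
static int cma_self_copy(void *dst, const void *src, size_t len)
{
    if (cma_read(getpid(), dst, (void *)src, len) == 0)
        return 1;
    memcpy(dst, src, len);
    return 0;
}
```

Between two distinct processes the call is subject to the same permission check as ptrace attach, so a transport built this way would likely need a non-CMA path for hosts where that check fails.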


Subtasks

Feature #1721: pxshm in OFI Rejected Nitin Bhat

Feature #1722: pxshm for mpi layer Rejected Sam White


Related issues

Related to Charm++ - Feature #1657: CMA support for nocopy sends using the Entry Method API across processes on the same host New 08/08/2017
Related to Charm++ - Feature #1667: Direct API for nocopy operations on sender-side and receiver-side In Progress 02/20/2018

History

#1 Updated by Sam White 12 months ago

  • Parent task set to #1497

#2 Updated by Sam White 12 months ago

  • Subject changed from Enable use of pxshm on mpi and verbs builds to Enable use of pxshm/xpmem on mpi and verbs builds

Also, './build charm++ gni-crayxe xpmem' fails to build because it tries to build both pxshm and xpmem. The issue is that we build with pxshm by default for gni-crayx* builds and don't disable that when explicitly building with xpmem. From what I've seen, xpmem offers performance nearly on par with user-space memcpy for Cray MPI, so it could potentially become the default on gni builds instead of pxshm if we implement it correctly. The key is to call xpmem_make() on the entire virtual address space during startup, avoiding the high cost of memory registration/deregistration at runtime.

#3 Updated by Sam White 9 months ago

  • Subject changed from Enable use of pxshm/xpmem on mpi and verbs builds to Enable use of pxshm/xpmem on mpi, ofi, and verbs builds

#4 Updated by Sam White 9 months ago

The current plan is to use CMA for interprocess copies. Nitin is working on it now.

#5 Updated by Nitin Bhat 8 months ago

  • Tags set to #lrts
  • Subject changed from Enable use of pxshm/xpmem on mpi, ofi, and verbs builds to Enable use of shm transport for regular messages in LRTS

Using CMA, we don't need a layer-dependent shm implementation and can have a generic implementation in the LRTS layer.

#6 Updated by Nitin Bhat 8 months ago

  • Related to Feature #1657: CMA support for nocopy sends using the Entry Method API across processes on the same host added

#7 Updated by Nitin Bhat 8 months ago

  • Related to Feature #1667: Direct API for nocopy operations on sender-side and receiver-side added

#8 Updated by Nitin Bhat 8 months ago

There are three LRTS-based use cases for shm (using CMA) in intra-host communication:
1. Large messages using the Nocopy Direct API: https://charm.cs.illinois.edu/redmine/issues/1667 (already implemented)
2. Large messages using the Nocopy Entry Method API: https://charm.cs.illinois.edu/redmine/issues/1657
3. Large messages using the regular API (this feature)

Reference Links:
Basic CMA infrastructure: https://charm.cs.illinois.edu/gerrit/#/c/3168/
Implementation of Shm transport for Nocopy Direct API: https://charm.cs.illinois.edu/gerrit/#/c/3116/

#9 Updated by Nitin Bhat 8 months ago

  • Description updated (diff)

#10 Updated by Eric Bohm 8 months ago

  • Assignee set to Nitin Bhat

#11 Updated by Nitin Bhat 7 months ago

  • Status changed from New to In Progress

#12 Updated by Nitin Bhat 5 months ago

  • Status changed from In Progress to Implemented

This patch supports using CMA for regular messages. However, since CMA has an advantage over Charm's regular network messages only at specific message sizes, I have disabled CMA in this patch. In the future, we need to determine the message size thresholds for which CMA messaging can be enabled. These thresholds can be determined by experimenting with applications across different machines and networks.
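
As an illustration of the threshold idea above, here is a minimal sketch of a size-based gate. The names `cma_policy` and `cma_should_use` are hypothetical and do not come from the patch, and no threshold values are implied: the bounds would have to come from the experiments described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tuning knobs. `enabled` mirrors the fact that CMA is
 * switched off in the merged patch; the byte bounds are placeholders to be
 * filled in per machine and network after benchmarking. */
typedef struct {
    bool   enabled;
    size_t min_bytes;  /* smallest message size for which CMA is used */
    size_t max_bytes;  /* largest message size for which CMA is used  */
} cma_policy;

/* Decide whether a regular message should take the CMA path.
 * CMA only applies between processes on the same host. */
static bool cma_should_use(const cma_policy *p, size_t msg_bytes, bool same_host)
{
    assert(p != NULL && p->min_bytes <= p->max_bytes);
    return p->enabled && same_host &&
           msg_bytes >= p->min_bytes && msg_bytes <= p->max_bytes;
}
```

With `enabled` set to false, every message takes the regular path, which matches the behavior described for this patch. Turning CMA on later would then be a matter of setting the flag and the two bounds.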

#13 Updated by Nitin Bhat 5 months ago

  • Status changed from Implemented to Merged
