Enabling Support for Zero Copy Semantics in an Asynchronous Task-Based Programming Model
AMTE (Asynchronous Many-Task Systems for Exascale Workshop) 2021
Publication Type: Paper
Abstract
Communication is critical to the scalable and efficient performance
of scientific simulations on extreme scale computing systems.
Part of the promise of task-based programming models is that they can
naturally overlap communication with computation and exploit locality
between tasks. Copy-based semantics using eager communication protocols
easily enable such asynchrony by relieving the user of responsibility for
buffer management, on both the sender and the receiver.
However, these semantics increase memory allocations and copies and
in turn affect application memory footprint and performance, especially
with large message buffers.
In this work we describe how the so-called “zero copy” messaging semantics
can be supported in Converse, the message-driven parallel programming
framework that is used by Charm++, by implementing support
for user-owned buffer transfers in its lower level runtime system, LRTS.
These semantics work on user-provided buffers and do not semantically
require copies by either the user or the runtime system. We motivate our
work by reviewing the existing messaging model in Converse/Charm++,
identify its semantic shortcomings, and define new LRTS and Converse
APIs to support zero copy communication based on RDMA capabilities.
We demonstrate the utility of our new communication interfaces with
benchmarks written in Converse. The result is an improvement of up to 91%
in message latency, together with improved memory usage. These advances will
enable future work on user-facing APIs in Charm++.
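
The contrast between copy-based and zero copy semantics described in the abstract can be illustrated with a toy, in-process model. This is only a sketch under stated assumptions: `ToyChannel`, `send_eager`, `send_zero_copy`, and `compare` are hypothetical names invented for illustration and are not Converse, LRTS, or Charm++ APIs, and a plain buffer assignment stands in for the network transfer. The sketch counts only the extra staging copies that copy-based semantics introduce.

```python
"""Toy model of copy-based (eager) versus zero copy message delivery.

All names here are hypothetical; none of them are Converse, LRTS, or
Charm++ APIs. A buffer assignment stands in for the network transfer.
"""


class ToyChannel:
    """In-process stand-in for a link that counts extra staging copies."""

    def __init__(self):
        # Bytes copied into or out of runtime-owned staging buffers.
        # The wire transfer itself is not counted in either mode.
        self.staging_bytes_copied = 0

    @staticmethod
    def _wire_transfer(src, dst):
        # Stand-in for moving bytes across the network.
        if len(src) != len(dst):
            raise ValueError("source and destination sizes must match")
        dst[:] = src

    def send_eager(self, user_buf):
        # Sender side: the runtime copies the payload into a message it
        # owns, so the user may reuse user_buf as soon as this returns.
        staged = bytearray(user_buf)
        self.staging_bytes_copied += len(staged)

        # The runtime-owned message travels to a runtime-owned buffer.
        arrived = bytearray(len(staged))
        self._wire_transfer(staged, arrived)

        # Receiver side: the payload is copied out into the user's buffer.
        delivered = bytearray(arrived)
        self.staging_bytes_copied += len(delivered)
        return delivered

    def send_zero_copy(self, user_buf, recv_buf, on_done):
        # The transfer reads from and writes to user-owned buffers
        # directly; memoryview exposes user_buf without copying it.
        # The user must leave both buffers alone until on_done fires.
        self._wire_transfer(memoryview(user_buf), recv_buf)
        on_done()


def compare(nbytes):
    """Return (eager, zero copy) staging bytes for one nbytes message."""
    payload = bytearray(b"x" * nbytes)

    eager = ToyChannel()
    eager.send_eager(payload)

    zero = ToyChannel()
    destination = bytearray(nbytes)
    zero.send_zero_copy(payload, destination, lambda: None)

    return eager.staging_bytes_copied, zero.staging_bytes_copied
```

In this model the eager path stages every payload byte twice, once on each side, while the zero copy path stages none. The price is that buffer ownership stays with the user, who has to wait for the completion callback before reusing either buffer.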