PPL/Charm++ at Booth #2030

Charm++ and AMPI: Adaptive and Asynchronous Parallel Programming

Birds of a Feather (BoF) at SC21
Tuesday, 11/16/2021 12:15 PM - 1:15 PM, Room 227-228

Outline/Schedule

  • I. Introduction to Charm++/AMPI and Recent Work, by Laxmikant Kale (8 min)
  • II. Application & Developer Talks (40 min)
    • BS-SOLCTRA by Esteban Meneses (4 min)
    • NAMD by David Hardy (4 min)
    • ExaM2M by Eric Mikida (4 min)
    • Enzo-P/Cello by James Bordner (4 min)
    • AMPI by Sam White (4 min)
    • GPGPU support in Charm++ by Jaemin Choi (4 min)
    • Loimos by Abhinav Bhatele (4 min)
    • Charm4Py by Zane Fink (4 min)
    • Load Balancing by Ronak Buch (4 min)
    • Paratreet by Simeng Liu (4 min)
  • III. Open Discussion, Charm++ v7.0.0 and Q&A session (12 min)

Parallel Programming Laboratory (PPL) will be hosting a BoF for the community interested in parallel programming using Charm++, Adaptive MPI, and the associated ecosystem (mini-languages, tools, etc.), along with parallel applications developed using them. The BoF is intended to engage a broader audience and drive adoption.

Charm++ is a parallel programming system with growing usage. Next to MPI (and now, possibly OpenMP), it is one of the most widely used programming systems deployed on parallel supercomputers, consuming a significant fraction of their CPU cycles. It offers a unified programming model with multicore and accelerator support; its capabilities include dynamic load balancing, fault tolerance, latency hiding, interoperability with MPI, and overall support for adaptivity and modularity.

Charm++ and Capabilities

Charm++ is a highly capable, general-purpose parallel programming system based on the notion of overdecomposed, migratable units of work/data that are orchestrated by an intelligent, adaptive runtime system. The combination of these two design principles allows it to offer several proven performance and productivity benefits. Charm++ can automatically hide communication latencies, balance dynamically changing loads, exploit accelerators and balance load across devices, provide checkpoint/restart, and seamlessly overlap the execution of independently developed modules.
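To make the programming model concrete, here is a minimal sketch of a Charm++ chare array, assuming a standard Charm++ installation; the module name, array size, and entry-method names are illustrative and not drawn from any PPL application. A collection of chares is overdecomposed over the available processors, each element defines a migration constructor so the runtime can move it for load balancing, and communication happens through asynchronous entry-method invocations on proxies.

// --- hello.ci (Charm++ interface file; illustrative names) ---
mainmodule hello {
  readonly CProxy_Main mainProxy;
  mainchare Main {
    entry Main(CkArgMsg* m);
    entry void done();
  };
  array [1D] Hello {
    entry Hello();
    entry void sayHi();
  };
};

// --- hello.C ---
#include "hello.decl.h"

/*readonly*/ CProxy_Main mainProxy;
static const int NUM_CHARES = 64;    // many more chares than cores: overdecomposition

class Main : public CBase_Main {
  int count;
 public:
  Main(CkArgMsg* m) : count(0) {
    delete m;
    mainProxy = thisProxy;
    // Create a 1D array of migratable chares; the runtime maps them to processors.
    CProxy_Hello arr = CProxy_Hello::ckNew(NUM_CHARES);
    arr.sayHi();                     // asynchronous broadcast to all elements
  }
  void done() { if (++count == NUM_CHARES) CkExit(); }
};

class Hello : public CBase_Hello {
 public:
  Hello() {}
  Hello(CkMigrateMessage* m) {}      // migration constructor: lets the runtime move this chare
  void sayHi() {
    CkPrintf("chare %d running on PE %d\n", thisIndex, CkMyPe());
    mainProxy.done();                // asynchronous entry-method invocation
  }
};

#include "hello.def.h"

A program like this is typically built by running charmc on the interface file and then on the C++ source; the runtime, not the programmer, decides which processor each chare runs on and when it migrates.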

It also provides other sorely needed capabilities such as topology awareness, non-blocking collectives, and recovery from faults. Charm++ advocates interoperability in the parallel ecosystem, with the runtime as the vehicle to realize this interoperation. It can interact with components written using MPI, providing an adoption path for existing applications. These capabilities are available in production releases on a spectrum of hardware that includes most supercomputer architectures, commodity clusters, workstations, multi-core desktops, and accelerator devices.
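As a hedged illustration of that adoption path via Adaptive MPI (AMPI), the sketch below shows an ordinary MPI program built unchanged with AMPI's compiler wrapper and run with more virtual ranks than physical processors; the file name and rank counts are placeholders.

// ranks.C -- plain MPI code; no Charm++-specific calls required
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int rank, size;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // Under AMPI each rank is a lightweight, migratable user-level thread,
  // so 'size' can exceed the number of physical cores.
  std::printf("virtual rank %d of %d\n", rank, size);

  MPI_Finalize();
  return 0;
}

// Build and run with AMPI (counts are illustrative):
//   ampicxx -o ranks ranks.C
//   ./charmrun +p4 ./ranks +vp16    # 16 virtual ranks on 4 processors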

Existing Impact

Charm++ capabilities have been demonstrated in several petascale science and engineering applications. NAMD is a Gordon Bell-winning biophysical simulation application that has scaled to 500,000 cores and is very widely used (with over 100,000 users). Other Charm++ applications that have scaled to hundreds of thousands of cores include ROSS (discrete event simulation), EpiSimdemics (contagion modeling), SpECTRE (relativistic fluid dynamics with magnetic fields), Enzo-P/Cello (AMR astrophysics and cosmology), ChaNGa (N-body gravity simulation), and OpenAtom (quantum chemistry). Charm++ is also used in other highly scalable applications for fluid dynamics, hydrology, solid material formation and fracture, and many more. Charm++ applications have accounted for 10-20% of CPU cycles on some of the largest supercomputers in recent years.

Benefits and Content

This BoF will provide an opportunity for interaction among the developers and users of Charm++. Key features of Charm++ will be highlighted to introduce them to potential users. New features will be presented to solicit feedback on their future development. Late-breaking application achievements will be discussed to elucidate how they necessitated improvements in Charm++. Future directions will be proposed for discussion by the entire community.

Relevance and Timeliness

The need for a modern parallel programming solution that will carry CSE applications into the exascale era is strongly felt by the HPC community. However, hardware diversity, and the number of programming paradigms needed to manage it, are increasing. In this context, Charm++ provides a unified programming model that has remained stable for the last 15 years, yet has also evolved quickly in response to user feedback. The design of existing Charm++ applications has remained fundamentally unchanged as they have been ported across a decade's worth of hardware, including accelerators. Performance enhancements are delivered largely through runtime strategies that embody the fruits of runtime system research. A Gordon Bell Prize in 2002 and an HPC Challenge award in 2011 speak to the staying power of Charm++.

Several of the challenges faced by parallel application developers can be solved by adopting Charm++. We believe it is opportune to expand awareness and drive adoption by engaging a larger community. We intend this BoF to become a staple event that pushes the Charm++ community towards a self-sustaining critical mass.

Session Format

The BoF will be structured as a series of short presentations followed by a discussion session. The presentations will include a broad introduction for the audience, followed by material of interest to the existing and potential user community. We will discuss some recently added or extended features, along with news about the latest release. These will be followed by short updates and reports from application developers, and plans for the near future. The discussion will be structured as a moderated panel answering questions from the audience.

Important Information: