Feature #1546

RDMA example with migration

Added by Sam White about 2 years ago. Updated about 2 years ago.

Status: Merged
Priority: Urgent
Assignee:
Category: Build & Test Automation
Target version:
Start date: 05/02/2017
Due date:
% Done: 100%
Tags:

Description

The failures in AMPI RDMA have raised concern that RDMA combined with migration is broken, even when we ensure that no sends are outstanding during migration and run with +LBSyncResume. The only existing uses of RDMA are two pingpong examples/tests and AMPI, so more coverage is needed. Add rdma entry method calls to jacobi or some other example that uses load balancing...


Related issues

Related to Charm++ - Bug #1539: Failure in migration when using RDMA sends in AMPI Merged 04/28/2017

History

#1 Updated by Sam White about 2 years ago

  • Priority changed from High to Urgent

This is really needed now to help debug current and future issues with RDMA before 6.8.0.

#2 Updated by Phil Miller about 2 years ago

  • Related to Bug #1539: Failure in migration when using RDMA sends in AMPI added

#3 Updated by Sam White about 2 years ago

  • Assignee changed from Vipul Harsh to Nitin Bhat

I believe Nitin modified the stencil load balancing example to use RDMA, and that change is pending a fix for RDMA entry methods in SDAG: https://charm.cs.illinois.edu/redmine/issues/1553

#4 Updated by Phil Miller about 2 years ago

The bug that spawned this request, #1539, has now been fixed. Is it still critical to have a new test/example that specifically exercises the interaction of RDMA sends and migration? If not, this can probably be de-prioritized, delayed, or just closed/rejected (as AMPI actually does exercise this case).

#5 Updated by Sam White about 2 years ago

I think it's important to have an SDAG + Migration + RDMA example/test, but it's up to you whether that belongs in this issue or not.

#6 Updated by Nitin Bhat about 2 years ago

  • % Done changed from 0 to 100
  • Status changed from New to Implemented

Added the stencil3d example as part of the SDAG support for rdma-marked entry methods. This example tests both migration and SDAG capabilities for an entry method with an 'rdma' parameter.

Implementation: https://charm.cs.illinois.edu/gerrit/#/c/2572/

#7 Updated by Phil Miller about 2 years ago

  • Status changed from Implemented to Merged
