Bug #1097

Attributes are not duplicated in MPI_Comm_dup

Added by Sam White about 2 years ago. Updated 21 days ago.

Status: Implemented
Priority: Normal
Assignee:
Category: AMPI
Target version: -
Start date: 06/09/2016
Due date:
% Done: 0%


Description

Running tests/ampi/mpich-tests/context/attrt gives the following:

$ ./attrt +vp2
*** Communicators ***
    Comm_create
    Comm_dup
dup_comm key_1 not found on 0
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: AMPI: User called MPI_Abort!

[0] Stack Traceback:
  [0:0] CmiAbortHelper+0x6b  [0x586e3b]
  [0:1] AMPI_Abort+0x26  [0x49f416]
  [0:2] test_communicators+0x2b8  [0x482197]
  [0:3] AMPI_Main+0x27  [0x481e34]
  [0:4] AMPI_Fallback_Main+0x1a  [0x48d82a]
  [0:5] AMPI_threadstart+0xa8  [0x490e88]
  [0:6]   [0x488d26]
  [0:7] CthStartThread+0x1e  [0x58467e]
  [0:8] +0x49800  [0x7ffff723e800]
Charm++ fatal error:
AMPI: User called MPI_Abort!

History

#1 Updated by Sam White about 2 years ago

From the MPI standard:

For each key value, the respective copy callback function determines the attribute value associated with this key in the new communicator; one particular action that a copy callback may take is to delete the attribute from the new communicator. Returns in newcomm a new communicator with the same group or groups, same topology, same info hints, any copied cached information, but a new context.

Note that AMPI currently does not duplicate info hints either (MPI-3 has MPI_Comm_dup_with_info as well) ...

#2 Updated by Sam White about 2 years ago

  • Priority changed from Normal to Low

#3 Updated by Sam White almost 2 years ago

  • Status changed from New to In Progress
  • Priority changed from Low to Normal

This has come back to bite us in ROMIO. ROMIO duplicates a communicator and expects the attributes of the old one to be there on the new comm as well, in romio/adio/common/cb_config_list.c line 121.

Right now AMPI maintains a Ckpv set of builtin/predefined keyvals on ampiParent. It also maintains a vector of KeyvalNodes for user-created keyvals, with one such vector per ampiParent. We should instead have one vector of keyvals per comm/win/type, and these need to be copied at MPI_Comm_dup and other appropriate places.
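A rough sketch of the per-communicator layout proposed above, in plain C with no MPI dependency. Every name here (toy_comm, toy_attr, toy_comm_dup, the copy callbacks) is hypothetical and is not taken from AMPI's source. It only illustrates the idea: each communicator owns its attribute list, and dup runs each attribute's copy callback to decide whether the attribute appears on the new communicator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model; none of these names come from AMPI. */
#define TOY_MAX_ATTRS 8

/* Copy callback: sets *flag to 1 to propagate the attribute, 0 to drop it. */
typedef void (*toy_copy_fn)(int keyval, void *in, void **out, int *flag);

typedef struct {
    int keyval;
    void *value;
    toy_copy_fn copy;
} toy_attr;

/* Each communicator carries its own attribute list (not one per parent). */
typedef struct {
    int context_id;
    toy_attr attrs[TOY_MAX_ATTRS];
    size_t nattrs;
} toy_comm;

static void copy_always(int keyval, void *in, void **out, int *flag) {
    (void)keyval;
    *out = in;
    *flag = 1;
}

static void copy_never(int keyval, void *in, void **out, int *flag) {
    (void)keyval; (void)in; (void)out;
    *flag = 0;
}

/* Returns 0 on success, -1 if the attribute list is full. */
static int toy_set_attr(toy_comm *c, int keyval, void *value, toy_copy_fn copy) {
    for (size_t i = 0; i < c->nattrs; i++) {
        if (c->attrs[i].keyval == keyval) {   /* overwrite existing entry */
            c->attrs[i].value = value;
            c->attrs[i].copy = copy;
            return 0;
        }
    }
    if (c->nattrs == TOY_MAX_ATTRS) return -1;
    c->attrs[c->nattrs++] = (toy_attr){ keyval, value, copy };
    return 0;
}

/* Returns 1 and stores the value if keyval is set on c, else returns 0. */
static int toy_get_attr(const toy_comm *c, int keyval, void **value) {
    for (size_t i = 0; i < c->nattrs; i++) {
        if (c->attrs[i].keyval == keyval) {
            *value = c->attrs[i].value;
            return 1;
        }
    }
    return 0;
}

/* Dup: new context, and attributes copied only when the callback says so. */
static void toy_comm_dup(const toy_comm *oldc, int new_context, toy_comm *newc) {
    newc->context_id = new_context;
    newc->nattrs = 0;
    for (size_t i = 0; i < oldc->nattrs; i++) {
        const toy_attr *a = &oldc->attrs[i];
        void *v = NULL;
        int flag = 0;
        a->copy(a->keyval, a->value, &v, &flag);
        if (flag) newc->attrs[newc->nattrs++] = (toy_attr){ a->keyval, v, a->copy };
    }
}

/* Returns 1 if key 1 survives the dup and key 2 is dropped. */
int demo_dup_copies_attrs(void) {
    static int payload = 42;
    toy_comm world = { .context_id = 0, .nattrs = 0 };
    toy_comm dup;
    void *v = NULL;
    toy_set_attr(&world, 1, &payload, copy_always);
    toy_set_attr(&world, 2, &payload, copy_never);
    toy_comm_dup(&world, 1, &dup);
    return toy_get_attr(&dup, 1, &v) && v == &payload && !toy_get_attr(&dup, 2, &v);
}
```

In this model the failing attrt check ("dup_comm key_1 not found") corresponds to toy_get_attr returning 0 on the duplicate. Windows and datatypes would presumably carry an analogous list.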

Edit: apparently this wasn't the issue with ROMIO.

#4 Updated by Sam White almost 2 years ago

  • Status changed from In Progress to New
  • Target version changed from 6.8.0 to Unscheduled

#5 Updated by Sam White over 1 year ago

  • Priority changed from Normal to Low

#6 Updated by Sam White 10 months ago

  • Target version deleted (Unscheduled)

#7 Updated by Sam White about 1 month ago

  • Priority changed from Low to Normal

PETSc uses communicator attributes and expects them to be duplicated across calls to MPI_Comm_dup. We don't really implement non-built-in attributes at all in AMPI; they should be definable per communicator, window, and type.

#8 Updated by Sam White 24 days ago

  • Status changed from New to In Progress

#9 Updated by Sam White 21 days ago

  • Status changed from In Progress to Implemented
