Fix bug in MPI machine layer related to the Ncpy API
author Nitin Bhat <nbhat4@illinois.edu>
Mon, 27 Aug 2018 17:25:55 +0000 (13:25 -0400)
committer Nitin Bhat <nbhat4@illinois.edu>
Tue, 28 Aug 2018 14:02:53 +0000 (09:02 -0500)
The bug concerns the state of the sent-message linked list when a
message is sent from the callback invoked inside ReleasePostedMessages.
Previously, end_sent was updated only after the loop, so it was stale
if the callback added a new message to the list while
ReleasePostedMessages was still iterating across it. With the fix,
end_sent is updated before the acknowledgement function is invoked, so
that sends issued during the iteration see a consistent list.

Change-Id: Ib06b3ab7375c13f328df9502976b5fc900fbc03f

src/arch/mpi/machine.C

index a2e927d582cd11f72fba1d5c80a2f76ade384113..927d3c3a7556e5133f397af31a115588022bab19 100644
@@ -632,6 +632,10 @@ static void ReleasePostedMessages(void) {
             else
                 prev->next = temp;
 #if CMK_ONESIDED_IMPL
+            // Update end_sent for consistent states during possible insertions
+            if(CpvAccess(end_sent) == msg_tmp) {
+              CpvAccess(end_sent) = prev;
+            }
             //if rdma msg, call the callback
             if(msg_tmp->type == ONESIDED_BUFFER_SEND) {
                 CmiMPIRzvRdmaOpInfo_t *rdmaOpInfo = (CmiMPIRzvRdmaOpInfo_t *)msg_tmp->ref;
@@ -693,7 +697,6 @@ static void ReleasePostedMessages(void) {
         }
 #endif
     }
-    CpvAccess(end_sent) = prev;
     MACHSTATE(2,"} ReleasePostedMessages end");
 }
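
The change above can be illustrated outside of Charm++ with a minimal, self-contained C sketch (it also compiles as C++). This is not the code in machine.C: the Node type, the done and sends_on_ack fields, and the functions post_message, ack_callback and release_posted_messages are hypothetical stand-ins, and only the names sent_msgs and end_sent are borrowed from the patch context. The sketch assumes that new messages are appended through a tail pointer, and shows why that pointer has to be repaired before the callback runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an entry in the sent-message list. */
typedef struct Node {
    int id;
    int done;          /* stands in for "the send has completed" */
    int sends_on_ack;  /* if nonzero, the ack callback posts a message with this id */
    struct Node *next;
} Node;

static Node *sent_msgs = NULL;  /* head of the list */
static Node *end_sent = NULL;   /* tail of the list (assumed role) */

/* Append a message at the tail. */
static void post_message(int id, int done, int sends_on_ack) {
    Node *n = (Node *)malloc(sizeof *n);
    assert(n != NULL);  /* sketch only: no real error handling */
    n->id = id;
    n->done = done;
    n->sends_on_ack = sends_on_ack;
    n->next = NULL;
    if (sent_msgs == NULL)
        sent_msgs = n;
    else
        end_sent->next = n;
    end_sent = n;
}

/* The acknowledgement callback may itself post a new message. */
static void ack_callback(const Node *n) {
    if (n->sends_on_ack)
        post_message(n->sends_on_ack, 0, 0);
}

/* Unlink completed entries; fix the tail BEFORE calling the callback. */
static void release_posted_messages(void) {
    Node *cur = sent_msgs;
    Node *prev = NULL;
    while (cur != NULL) {
        if (cur->done) {
            Node *next = cur->next;
            if (prev == NULL)
                sent_msgs = next;
            else
                prev->next = next;
            /* Without this, a post from the callback would append to the
               entry that is about to be freed. */
            if (end_sent == cur)
                end_sent = prev;
            ack_callback(cur);
            free(cur);
            cur = next;
        } else {
            prev = cur;
            cur = cur->next;
        }
    }
}

/* Number of entries reachable from the head. */
static int list_length(void) {
    int n = 0;
    for (Node *p = sent_msgs; p != NULL; p = p->next)
        n++;
    return n;
}

/* Returns 1 if end_sent is the last reachable entry (or the list is empty). */
static int list_is_consistent(void) {
    if (sent_msgs == NULL)
        return end_sent == NULL;
    Node *last = sent_msgs;
    while (last->next != NULL)
        last = last->next;
    return last == end_sent;
}

/* Free all entries and reset the list. */
static void clear_list(void) {
    while (sent_msgs != NULL) {
        Node *next = sent_msgs->next;
        free(sent_msgs);
        sent_msgs = next;
    }
    end_sent = NULL;
}
```

In this sketch, posting a pending message 1 and a completed message 2 whose callback posts message 3, then calling release_posted_messages(), leaves the list as 1 followed by 3 with end_sent pointing at 3. If the tail were instead assigned once after the loop, as in the removed line, the callback would attach message 3 to the entry being freed, and the final assignment would then point end_sent back at message 1.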