Bug #668

Bug #259: Bugs exposed by use of randomized Q

ampi/megampi test fails with randomized queues

Added by Eric Mikida over 3 years ago. Updated 17 days ago.

In Progress

See parent task for full details.


#1 Updated by Eric Bohm over 3 years ago

  • Assignee changed from PPL to Phil Miller

#2 Updated by Phil Miller almost 3 years ago

Reproduced and captured record/replay logs. Will attempt to run under charmdebug to understand what goes wrong.

#3 Updated by Phil Miller about 2 years ago

  • Assignee changed from Phil Miller to Sam White

Passing off an AMPI bug

#4 Updated by Phil Miller about 2 years ago

Per the parent task,

./build AMPI net-linux-x86_64 --with-prio-type=int --enable-randomized-msgq -j16 --suffix randq-debug -O3 -g

ampi/megampi: Crashes rarely due to a failed assertion: Broadcast integer from master> expected 123, got 4!

It might be worthwhile to run the various mpich-test, IMB, and other conformance tests under randomized queues. Any further failures would indicate substantial robustness issues that we will either have to fix or else leave users exposed to unpredictable failures and wrong results.

#5 Updated by Sam White about 2 years ago

I built as above and ran megampi for 1000 iterations 10 times (>1 hour) with no failures. The mpich-tests/coll tests I tried showed no failures either. I can try IMB next.
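For anyone repeating this kind of soak run, a minimal sketch of a loop that reruns a command and counts non-zero exits might look like the following. The helper name `count_failures` is hypothetical and not part of Charm++ or AMPI, and the sketch assumes the test signals a failed assertion through its exit status.

```shell
#!/bin/sh
# Hypothetical helper (not part of Charm++/AMPI): run a command repeatedly
# and print how many of the runs exited with a non-zero status.
# Usage: count_failures RUNS COMMAND [ARGS...]
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's own output; only the exit status is counted.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

Calling `count_failures 10` followed by whatever command launches megampi in the randq-debug build would print the number of failing runs out of 10. Because the crash is reported as rare, a count of zero over a modest number of runs does not show that the bug is gone.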

#6 Updated by Sam White about 2 years ago

  • Status changed from New to In Progress

#7 Updated by Sam White over 1 year ago

  • Target version set to 6.8.1

#8 Updated by Sam White 11 months ago

  • Target version changed from 6.8.1 to 6.9.0

#9 Updated by Sam White 9 months ago

  • Target version deleted (6.9.0)

#10 Updated by Sam White 17 days ago

  • Priority changed from Normal to Low
