Bug #1960

tests/charm++/megatest hangs on netlrts-win-x86_64-smp build

Added by Nitin Bhat 2 months ago. Updated 10 days ago.

Status: Merged
Priority: High
Assignee:
Category: -
Target version:
Start date: 08/08/2018
Due date:
% Done: 0%


Description

../../../bin/testrun  ./pgm +p4  ++local
Charmrun> started all node programs in 0.702 seconds.
Converse/Charm++ Commit ID: v6.8.2-853-g4146bf788
Charm++> Disabling isomalloc because mmap() does not work.
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 4 cores x 2 PUs = 8-way SMP)
Charm++> cpu topology info is gathered in 0.219 seconds.
Megatest is running on 4 nodes 4 processors.
test 0: initiated [groupring (milind)]

This was also seen in last night's autobuild output: http://charm.cs.illinois.edu/autobuild/old.2018_08_08__01_01/netlrts-win-x86_64-smp.txt

History

#1 Updated by Nitin Bhat 2 months ago

I was unable to reproduce this with a non-production debug build (-g -O0).

#2 Updated by Evan Ramos 2 months ago

  • Target version deleted (6.9.0)

#3 Updated by Eric Bohm about 1 month ago

Last night it got through megatest but hung in simplearrayhello, so something peculiar is going on in our Windows SMP target.

make[3]: Entering directory '/home/nikhil/autobuild/netlrts-win-x86_64-smp/charm/netlrts-win-x86_64-smp/tests/charm++/simplearrayhello'
../../../bin/charmc -optimize -production  -optimize -production   hello.ci
../../../bin/charmc -optimize -production  -optimize -production  -c hello.C
hello.C
../../../bin/charmc -optimize -production  -optimize -production  -language charm++ -o hello hello.o
moduleinit7476.C
   Creating library hello.lib and object hello.exp
../../../bin/testrun  ./hello +p4 10  ++local
Charmrun> started all node programs in 1.045 seconds.
Converse/Charm++ Commit ID: bb22b1d
Charm++> Disabling isomalloc because mmap() does not work.
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 4 cores x 2 PUs = 8-way SMP)
Charm++> cpu topology info is gathered in 0.032 seconds.
Running Hello on 4 processors for 10 elements
[0] Hello 0 created
[0] Hello 1 created
[0] Hello 2 created
[0] Hi[17] from element 0
[0] Hi[18] from element 1
[0] Hi[19] from element 2

#4 Updated by Eric Bohm about 1 month ago

  • Assignee set to Evan Ramos

#5 Updated by Evan Ramos 12 days ago

  • Assignee changed from Evan Ramos to Juan Galvez
  • Status changed from New to In Progress

Juan found what looks to be the culprit, a buggy implementation of CmiNodeBarrierCount on Windows.
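For context, this ticket does not show the faulty code or the eventual fix, so the following is only a generic sketch of what a count-based thread barrier has to guarantee. It is not the Charm++ implementation. The names CountingBarrier and run_barrier_rounds are hypothetical, and standard C++ threading primitives are an assumption made for illustration. A barrier that fails to release every waiter, or that confuses one round with the next, can produce the kind of hang seen in the logs above.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical reusable barrier for illustration only.
// This is NOT the Charm++ CmiNodeBarrierCount implementation.
class CountingBarrier {
public:
    explicit CountingBarrier(int participants)
        : participants_(participants), waiting_(0), generation_(0) {
        assert(participants > 0);
    }

    // Blocks until `participants` threads have called wait() for this round.
    void wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        const unsigned long my_generation = generation_;
        if (++waiting_ == participants_) {
            // Last arrival: reset the count and open the next round.
            waiting_ = 0;
            ++generation_;
            cv_.notify_all();
        } else {
            // Wait on the generation, not on the count, so a fast thread
            // re-entering the barrier cannot strand a slower one.
            cv_.wait(lock, [&] { return generation_ != my_generation; });
        }
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    const int participants_;
    int waiting_;
    unsigned long generation_;
};

// Runs `threads` workers through `rounds` barrier crossings and returns
// the total number of crossings completed.
inline int run_barrier_rounds(int threads, int rounds) {
    CountingBarrier barrier(threads);
    std::atomic<int> crossings{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int r = 0; r < rounds; ++r) {
                barrier.wait();
                crossings.fetch_add(1);
            }
        });
    }
    for (auto& th : pool) th.join();
    return crossings.load();
}
```

Calling run_barrier_rounds(4, 1000) mirrors the +p4 runs above: four threads repeatedly cross the same barrier, and the call only returns if no thread is left waiting.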

#6 Updated by Evan Ramos 12 days ago

  • Target version set to 6.9.0

#7 Updated by Sam White 10 days ago

  • Status changed from In Progress to Merged
