Bug #1710

syncft tests: warning and crash on init_checkpt

Added by Phil Miller 8 days ago. Updated 6 days ago.

Status: New
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 10/10/2017
Due date:
% Done: 0%


Description

http://ppl-jenkins:8080/job/Nightly-Build/label=trusty,platform=net-linux-x86_64-syncft/1346/console

../../../bin/testrun  ./jacobi 4 2 2 200 +vp16 +p8 +balancer DummyLB +isomalloc_sync +killFile kill_02.txt  
DISPLAY "(null)" invalid; disabling X11 forwarding
Charmrun> scalable start enabled. 
Charmrun> started all node programs in 1.658 seconds.
Converse/Charm++ Commit ID: ee8fe05
Charm++> synchronizing isomalloc memory region...
[3] To be killed after 70.000000 s (MEMCKPT) 
Charmrun> error on request socket to node 3 'localhost'--
Socket closed before recv.
Socket 5 failed 
DISPLAY "(null)" invalid; disabling X11 forwarding
charmrun says Processor 3 failed on Node 3
socket_index 1 crashed_node 3 reconnected fd 5  
Charmrun finished launching new process in 1.185174s
[0] consolidated Isomalloc memory region: 0x410000000 - 0x7ffb80000000 (134182656 megs)
Warning> net-* deprecated (Charm >= 6.8.0), please use netlrts
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
[0] killFlag set to true for file kill_02.txt
[0] DummyLB created
Charm++> CkMemCheckPTInit mainchare is created!
[11] Warning: init_checkpt called during restart, possible bug in migration constructor!
[11] Warning: init_checkpt called during restart, possible bug in migration constructor!
------------- Processor 0 Exiting: Called CmiAbort ------------
Reason: CmiFree reference count was zero-- is this a duplicate free?
[0] Stack Traceback:
  [0:0] CmiAbort+0x39  [0x60caa9]
  [0:1] CmiFree+0x4a  [0x611c9a]
  [0:2] CsdScheduleForever+0x50  [0x611dc0]
  [0:3] CsdScheduler+0x2d  [0x6120bd]
  [0:4] ConverseInit+0xc47  [0x60fee7]
  [0:5] main+0x21  [0x4a6b51]
  [0:6] __libc_start_main+0xf5  [0x7ffff7212f45]
  [0:7]   [0x4a7400]
Fatal error on PE 0> CmiFree reference count was zero-- is this a duplicate free?
make[3]: *** [syncfttest] Error 1


Related issues

Related to Charm++ - Bug #1711: syncft tests: unclear failure (New, 10/10/2017)

History

#1 Updated by Phil Miller 8 days ago

  • Related to Bug #1711: syncft tests: unclear failure added

#2 Updated by Sam White 8 days ago

I think the flag '+restartisomalloc' may be needed here. If so, we need to try to automate that, or at least document it.

#3 Updated by Eric Bohm 6 days ago

  • Assignee set to Juan Galvez
