Bug #1493

Deleting an array also deletes all common elements from its bound arrays

Added by Eric Mikida over 1 year ago. Updated over 1 year ago.

Status: Merged
Priority: Urgent
Assignee:
Category: -
Target version:
Start date: 04/10/2017
Due date:
% Done: 0%


Description

When you call ckDestroy on an array proxy, it deletes all location records associated with its elements. As a result, every CkMigratable object associated with those records is deleted, even if it belongs to a different bound array.
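The behavior described above can be illustrated with a small standalone model. This is only a sketch and not Charm++ code: `Element`, `LocationRecord`, `LocationManager`, and `makeBoundArrays` are hypothetical names, and `destroyArrayOnly` shows one possible way to avoid the problem, not necessarily what the eventual patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>

// Hypothetical stand-in for a migratable array element.
struct Element {
    int arrayId;  // the array this element belongs to
};

// Hypothetical stand-in for a location record: one record per index,
// shared by the elements of every array bound at that location.
struct LocationRecord {
    std::map<int, std::unique_ptr<Element>> elementsByArray;
};

class LocationManager {
public:
    void insert(int index, int arrayId) {
        records_[index].elementsByArray[arrayId] =
            std::unique_ptr<Element>(new Element{arrayId});
    }

    // Models the reported behavior: every record that holds an element of
    // the destroyed array is dropped, along with bound-array elements.
    void destroyArrayDroppingRecords(int arrayId) {
        for (auto it = records_.begin(); it != records_.end();) {
            if (it->second.elementsByArray.count(arrayId) != 0) {
                it = records_.erase(it);
            } else {
                ++it;
            }
        }
    }

    // One possible alternative (an assumption, not the actual fix): remove
    // only the destroyed array's elements and free a record once it is empty.
    void destroyArrayOnly(int arrayId) {
        for (auto it = records_.begin(); it != records_.end();) {
            it->second.elementsByArray.erase(arrayId);
            if (it->second.elementsByArray.empty()) {
                it = records_.erase(it);
            } else {
                ++it;
            }
        }
    }

    std::size_t countElements(int arrayId) const {
        std::size_t n = 0;
        for (const auto& entry : records_) {
            n += entry.second.elementsByArray.count(arrayId);
        }
        return n;
    }

private:
    std::map<int, LocationRecord> records_;
};

// Builds two bound arrays (ids 1 and 2) with n elements each, so that
// every index has a single record shared by both arrays.
LocationManager makeBoundArrays(int n) {
    LocationManager mgr;
    for (int i = 0; i < n; ++i) {
        mgr.insert(i, 1);
        mgr.insert(i, 2);
    }
    return mgr;
}
```

In this model, destroying array 1 with `destroyArrayDroppingRecords` also leaves array 2 with no elements, while `destroyArrayOnly` leaves array 2 intact.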

History

#1 Updated by Eric Mikida over 1 year ago

  • Status changed from New to Implemented

Implemented a potential fix in https://charm.cs.illinois.edu/gerrit/2381.

#2 Updated by Phil Miller over 1 year ago

  • Status changed from Implemented to Merged

#3 Updated by Phil Miller over 1 year ago

  • Status changed from Merged to In Progress

So, the merged change results in a use-after-free error, reported by Valgrind as follows:

==32415== Invalid read of size 4
==32415==    at 0x300992: CkLocRec::getLBDB() const (cklocrec.h:51)
==32415==    by 0x2F0585: CkMigratable::~CkMigratable() (cklocation.C:1537)
==32415==    by 0x32EFA6: ArrayElement::~ArrayElement() (ckarray.C:419)
==32415==    by 0x280B7D: Stencil::~Stencil() (in /home/phil/PPL/charm/netlrts-linux-smp/examples/charm++/load_balancing/stencil3d/stencil3d)
==32415==    by 0x30127D: CkArray::deleteElt(unsigned long long) (ckarray.h:718)
==32415==    by 0x2F52DC: CkLocMgr::emigrate(CkLocRec*, int) (cklocation.C:3091)
==32415==    by 0x2F11A5: CkLocRec::migrateMe(int) (cklocation.C:1883)
==32415==    by 0x2F1784: CkLocRec::recvMigrate(int) (cklocation.C:2019)
==32415==    by 0x2F1752: CkLocRec::staticMigrate(LDObjHandle, int) (cklocation.C:2012)
==32415==    by 0x388963: LBOM::Migrate(LDObjHandle, int) (LBOM.h:37)
==32415==    by 0x3877E9: LBDB::Migrate(LDObjHandle, int) (LBDBManager.C:324)
==32415==    by 0x36CD99: LDMigrate (lbdb.C:399)
==32415==  Address 0x4e318ac is 36 bytes inside a block of size 88 free'd
==32415==    at 0x482F938: operator delete(void*) (vg_replace_malloc.c:576)
==32415==    by 0x48F9367: operator delete(void*, unsigned int) (del_ops.cc:32)
==32415==    by 0x2F3552: CkLocMgr::reclaim(CkLocRec*) (cklocation.C:2537)
==32415==    by 0x2F12DC: CkLocRec::destroy() (cklocation.C:1915)
==32415==    by 0x32EF8D: ArrayElement::~ArrayElement() (ckarray.C:421)
==32415==    by 0x280B7D: Stencil::~Stencil() (in /home/phil/PPL/charm/netlrts-linux-smp/examples/charm++/load_balancing/stencil3d/stencil3d)
==32415==    by 0x30127D: CkArray::deleteElt(unsigned long long) (ckarray.h:718)
==32415==    by 0x2F52DC: CkLocMgr::emigrate(CkLocRec*, int) (cklocation.C:3091)
==32415==    by 0x2F11A5: CkLocRec::migrateMe(int) (cklocation.C:1883)
==32415==    by 0x2F1784: CkLocRec::recvMigrate(int) (cklocation.C:2019)
==32415==    by 0x2F1752: CkLocRec::staticMigrate(LDObjHandle, int) (cklocation.C:2012)
==32415==    by 0x388963: LBOM::Migrate(LDObjHandle, int) (LBOM.h:37)
==32415==  Block was alloc'd at
==32415==    at 0x482E8DC: operator new(unsigned int) (vg_replace_malloc.c:328)
==32415==    by 0x2F266E: CkLocMgr::createLocal(CkArrayIndex const&, bool, bool, bool) (cklocation.C:2274)
==32415==    by 0x2F29D9: CkLocMgr::addElement(CkArrayID, CkArrayIndex const&, CkMigratable*, int, void*) (cklocation.C:2337)
==32415==    by 0x3328B8: CkArray::insertElement(CkArrayMessage*, CkArrayIndex const&, int*) (ckarray.C:1189)
==32415==    by 0x332DF4: CkArray::insertInitial(CkArrayIndex const&, void*) (ckarray.C:1264)
==32415==    by 0x2EF5F3: CkArrayMap::populateInitial(int, CkArrayOptions&, void*, CkArray*) (cklocation.C:259)
==32415==    by 0x33B373: CkLocMgr::populateInitial(CkArrayOptions&, void*, CkArray*) (cklocation.h:368)
==32415==    by 0x331EFE: CkArray::CkArray(CkArrayOptions&, CkMarshalledMessage&) (ckarray.C:1022)
==32415==    by 0x335698: CkIndex_CkArray::_call_CkArray_marshall1(void*, void*) (CkArray.def.h:546)
==32415==    by 0x2D1D09: CkDeliverMessageFree (ck.C:593)
==32415==    by 0x2D1E75: _invokeEntryNoTrace(int, envelope*, void*) (ck.C:637)
==32415==    by 0x2D258D: CkCreateLocalGroup (ck.C:733)

#4 Updated by Phil Miller over 1 year ago

This is the autobuild error seen on netlrts-linux-smp in examples/charm++/load_balancing/stencil3d.

#5 Updated by Michael Robson over 1 year ago

  • Tags set to namd, openatom, changa

#6 Updated by Eric Mikida over 1 year ago

This bug appears to be caused by ~CkMigratable() accessing myRec to get the LBDB database for removing the local barrier client, even though the location record may already have been deleted at that point. It is very hard to reproduce, even with tens of thousands of chares and RotateLB, so I'm not sure this is the only issue, but it appears to be.
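The destructor ordering described above can be sketched in a standalone model. This is not Charm++ code: `RecordTable`, `Migratable`, `EagerElement`, and `DeferredElement` are hypothetical names. To stay free of undefined behavior, the base destructor checks a table of live record ids instead of dereferencing a freed pointer. Deferring the reclaim is just one illustrative ordering, not a claim about what the patch does.

```cpp
#include <cassert>
#include <set>

// Hypothetical registry of live location records, keyed by id.
class RecordTable {
public:
    int create() {
        live_.insert(next_);
        return next_++;
    }
    void reclaim(int id) { live_.erase(id); }
    bool isLive(int id) const { return live_.count(id) != 0; }

private:
    std::set<int> live_;
    int next_ = 0;
};

// Hypothetical base class whose destructor consults its record.
class Migratable {
public:
    Migratable(RecordTable& table, int recId, bool& staleAccess)
        : table_(table), recId_(recId), staleAccess_(staleAccess) {}

    virtual ~Migratable() {
        // Runs after the derived destructor body. Rather than touching a
        // possibly freed record, note whether it was already reclaimed.
        staleAccess_ = !table_.isLive(recId_);
    }

protected:
    RecordTable& table_;
    int recId_;

private:
    bool& staleAccess_;
};

// Reclaims the record in its own destructor, before the base destructor
// runs, mirroring the ordering in the trace.
class EagerElement : public Migratable {
public:
    using Migratable::Migratable;
    ~EagerElement() override { table_.reclaim(recId_); }
};

// Leaves the record alone; the owner reclaims it after destruction.
class DeferredElement : public Migratable {
public:
    using Migratable::Migratable;
};

bool staleAccessWhenReclaimingInDestructor() {
    RecordTable table;
    bool stale = false;
    int id = table.create();
    { EagerElement e(table, id, stale); }
    return stale;
}

bool staleAccessWhenReclaimingAfterDestructor() {
    RecordTable table;
    bool stale = false;
    int id = table.create();
    { DeferredElement e(table, id, stale); }
    table.reclaim(id);  // record outlives every destructor that reads it
    return stale;
}
```

Because a derived destructor body always runs before the base class destructor, anything the base destructor needs from the record must either be read before the record is reclaimed or cached beforehand.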

#7 Updated by Eric Mikida over 1 year ago

  • Status changed from In Progress to Implemented

The fix was very straightforward, and is implemented here: https://charm.cs.illinois.edu/gerrit/2448

The only issue is that the error was hard to reproduce. Still, I think manual inspection was enough to figure out why the error popped up in the first place, and the patch linked above should address it.

#8 Updated by Ronak Buch over 1 year ago

  • Status changed from Implemented to Merged
