Bug #1202

Memory leaks in converse's cldb

Added by Sam White almost 3 years ago. Updated almost 2 years ago.

Status: New
Priority: Low
Assignee:
Category: -
Target version:
Start date: 09/08/2016
Due date:
% Done: 0%


Description

Running Valgrind on the netlrts-linux-x86_64 examples reports two possible memory leaks (32 bytes each) in initialization routines. One is at src/conv-ldb/cldb.c line 360, the other is at src/conv-core/cputopology.C line 183.

==13005== 32 bytes in 1 blocks are possibly lost in loss record 60 of 215
==13005==    at 0x4C27AAA: malloc (vg_replace_malloc.c:291)
==13005==    by 0x53FC64: CmiAlloc (convcore.c:3035)
==13005==    by 0x545102: CldModuleGeneralInit (cldb.c:360)
==13005==    by 0x5413A0: ConverseCommonInit (convcore.c:3791)
==13005==    by 0x53D822: ConverseInit (machine-common-core.c:1261)
==13005==    by 0x4914C6: main (main.C:18)
==13005== 
==13005== 32 bytes in 1 blocks are possibly lost in loss record 61 of 215
==13005==    at 0x4C28222: operator new[](unsigned long) (vg_replace_malloc.c:384)
==13005==    by 0x54FB7F: cpuTopoRecvHandler(void*) (cputopology.C:183)
==13005==    by 0x53F3BC: CsdSchedulePoll (convcore.c:1783)
==13005==    by 0x54E3E4: LrtsInitCpuTopo (cputopology.C:582)
==13005==    by 0x4945E2: _initCharm(int, char**) (init.C:1393)
==13005==    by 0x53D92D: ConverseInit (machine-common-core.c:1294)
==13005==    by 0x4914C6: main (main.C:18)

History

#1 Updated by Eric Bohm almost 3 years ago

  • Assignee set to Kavitha Chandrasekar

#2 Updated by Juan Galvez over 2 years ago

  • Assignee changed from Kavitha Chandrasekar to Juan Galvez

#3 Updated by Juan Galvez over 2 years ago

  • Priority changed from Normal to Low

Fix for cputopology mem leak:
https://charm.cs.illinois.edu/gerrit/#/c/2202/

The mem leak in cldb is just memory that is allocated once at init and never explicitly freed at exit. Fixing it would require adding an explicit cleanup or finalize function for the module and having Converse call it at exit.
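
For illustration, the init/finalize pairing described above could look roughly like the sketch below. This is not the actual cldb code: the names lb_state_t, lb_module_init, lb_module_finalize and lb_module_is_initialized are hypothetical, and plain malloc/free stand in for CmiAlloc/CmiFree so the example compiles on its own.

```c
#include <stdlib.h>

/* Hypothetical stand-in for whatever the module allocates once at init.
 * The real allocation in CldModuleGeneralInit goes through CmiAlloc. */
typedef struct {
    int load;
    int tokens;
} lb_state_t;

static lb_state_t *lb_state = NULL;

/* Called once at startup. Returns 0 on success, -1 if allocation fails. */
int lb_module_init(void)
{
    if (lb_state != NULL)
        return 0;                       /* already initialized */

    lb_state = malloc(sizeof *lb_state);
    if (lb_state == NULL)
        return -1;

    lb_state->load = 0;
    lb_state->tokens = 0;
    return 0;
}

/* Hypothetical finalize hook that the runtime would call at exit.
 * Safe to call more than once: free(NULL) is a no-op. */
void lb_module_finalize(void)
{
    free(lb_state);
    lb_state = NULL;
}

/* Small helper so callers can check the module state. */
int lb_module_is_initialized(void)
{
    return lb_state != NULL;
}
```

In this sketch the runtime would call lb_module_init() during startup and lb_module_finalize() on its exit path. Where exactly Converse would invoke such a hook is left open here.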

#4 Updated by Juan Galvez over 2 years ago

  • Subject changed from Memory leaks in converse's cldb and cputopology to Memory leaks in converse's cldb

#5 Updated by Phil Miller over 2 years ago

  • Target version changed from 6.8.0 to 6.8.1

#6 Updated by Sam White about 2 years ago

I see memory leaks from the new topology code that was merged in the last couple of weeks. I'm not sure if they are really new, or if they just have new names...

#7 Updated by Sam White about 2 years ago

Here's the new output:

==25396== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==25396==    at 0x5756183: __sendto_nocancel (syscall-template.S:81)
==25396==    by 0x608A2A: TransmitImplicitDgram1 (machine-eth.c:200)
==25396==    by 0x608DA0: TransmitDatagram (machine-eth.c:285)
==25396==    by 0x609CC8: CommunicationServerNet (machine-eth.c:734)
==25396==    by 0x60A110: LrtsAdvanceCommunication (machine.c:1707)
==25396==    by 0x6055A2: AdvanceCommunication (machine-common-core.c:1317)
==25396==    by 0x605820: CmiGetNonLocal (machine-common-core.c:1487)
==25396==    by 0x60C4E7: CsdNextMessage (convcore.c:1781)
==25396==    by 0x60C835: CsdSchedulePoll (convcore.c:1972)
==25396==    by 0x622743: LrtsInitCpuTopo (cputopology.C:593)
==25396==    by 0x62291C: CmiInitCPUTopology (cputopology.C:679)
==25396==    by 0x52AE4B: _initCharm(int, char**) (init.C:1364)
==25396==  Address 0x5ae40c5 is 21 bytes inside a block of size 76 alloc'd
==25396==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25396==    by 0x601DD1: malloc_nomigrate (libmemory-default.c:724)
==25396==    by 0x60E96A: CmiAlloc (convcore.c:2939)
==25396==    by 0x622694: LrtsInitCpuTopo (cputopology.C:580)
==25396==    by 0x62291C: CmiInitCPUTopology (cputopology.C:679)
==25396==    by 0x52AE4B: _initCharm(int, char**) (init.C:1364)
==25396==    by 0x60555C: ConverseRunPE (machine-common-core.c:1296)
==25396==    by 0x60547A: ConverseInit (machine-common-core.c:1198)
==25396==    by 0x528B47: main (main.C:18)

==25395== Syscall param socketcall.sendto(msg) points to uninitialised byte(s)
==25395==    at 0x5756183: __sendto_nocancel (syscall-template.S:81)
==25395==    by 0x6088F6: TransmitImplicitDgram (machine-eth.c:174)
==25395==    by 0x608C7F: TransmitDatagram (machine-eth.c:265)
==25395==    by 0x609CC8: CommunicationServerNet (machine-eth.c:734)
==25395==    by 0x60A110: LrtsAdvanceCommunication (machine.c:1707)
==25395==    by 0x6055A2: AdvanceCommunication (machine-common-core.c:1317)
==25395==    by 0x605820: CmiGetNonLocal (machine-common-core.c:1487)
==25395==    by 0x60C4E7: CsdNextMessage (convcore.c:1781)
==25395==    by 0x60C835: CsdSchedulePoll (convcore.c:1972)
==25395==    by 0x622743: LrtsInitCpuTopo (cputopology.C:593)
==25395==    by 0x62291C: CmiInitCPUTopology (cputopology.C:679)
==25395==    by 0x52AE4B: _initCharm(int, char**) (init.C:1364)
==25395==  Address 0x5af8d05 is 21 bytes inside a block of size 64 alloc'd
==25395==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==25395==    by 0x601DD1: malloc_nomigrate (libmemory-default.c:724)
==25395==    by 0x60E96A: CmiAlloc (convcore.c:2939)
==25395==    by 0x60593F: CopyMsg (machine-common-core.c:1579)
==25395==    by 0x603A94: SendSpanningChildren (machine-broadcast.c:117)
==25395==    by 0x603B12: SendSpanningChildrenProc (machine-broadcast.c:176)
==25395==    by 0x603BAD: CmiSyncBroadcastFn1 (machine-broadcast.c:219)
==25395==    by 0x603C87: CmiFreeBroadcastAllFn (machine-broadcast.c:290)
==25395==    by 0x621EAF: cpuTopoHandler(void*) (cputopology.C:288)
==25395==    by 0x60D6F9: CmiSendReduce (convcore.c:2446)
==25395==    by 0x60E196: CmiHandleReductionMessage (convcore.c:2627)
==25395==    by 0x60C3D9: CmiHandleMessage (convcore.c:1672)

#8 Updated by Eric Bohm almost 2 years ago

  • Target version changed from 6.8.1 to 6.9.0

#9 Updated by Juan Galvez almost 2 years ago

  • Target version changed from 6.9.0 to Unscheduled
