Bug #1950

Isomalloc produces misleading memory usage numbers

Added by Sam White 25 days ago. Updated 8 days ago.

Status: Merged
Priority: Normal
Assignee:
Category: AMPI
Target version:
Start date: 07/25/2018
Due date:
% Done: 0%
Tags:

Description

Without Isomalloc (using PUP instead):

../../../bin/testrun  +p3 ./jacobi 2 2 2 40 +vp8 +balancer RotateLB +LBDebug 1  ++local
Charmrun> scalable start enabled. 
Charmrun> started all node programs in 0.024 seconds.
Charm++> Running in non-SMP mode: 3 processes (PEs)
Converse/Charm++ Commit ID: 0c0404f
Charm++> scheduler running in netpoll mode.
CharmLB> Verbose level 1, load balancing period: 0.5 seconds
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (2 sockets x 4 cores x 1 PUs = 8-way SMP)
Charm++> cpu topology info is gathered in 0.001 seconds.
CharmLB> RotateLB created.
iter 1 time: 0.138610 maxerr: 2020.200000
iter 2 time: 0.121927 maxerr: 1696.968000
iter 3 time: 0.121899 maxerr: 1477.170240
iter 4 time: 0.121337 maxerr: 1319.433024
iter 5 time: 0.121277 maxerr: 1200.918072

CharmLB> RotateLB: PE [0] step 0 starting at 0.747446 Memory: 27.046875 MB
CharmLB> RotateLB: PE [0] strategy starting at 0.747604
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 2 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 0.747634 duration 0.000030 s
CharmLB> RotateLB: PE [0] step 0 finished at 0.910256 duration 0.162810 s

iter 6 time: 0.253832 maxerr: 1108.425519
iter 7 time: 0.122065 maxerr: 1033.970839
iter 8 time: 0.120971 maxerr: 972.509242
iter 9 time: 0.122138 maxerr: 920.721889
iter 10 time: 0.121377 maxerr: 876.344030
iter 11 time: 0.157772 maxerr: 837.779089
iter 12 time: 0.121630 maxerr: 803.868831
iter 13 time: 0.121461 maxerr: 773.751705
iter 14 time: 0.120420 maxerr: 746.772667
iter 15 time: 0.121361 maxerr: 722.424056

CharmLB> RotateLB: PE [0] step 1 starting at 2.265507 Memory: 19.700577 MB
CharmLB> RotateLB: PE [0] strategy starting at 2.265998
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 2 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 2.266031 duration 0.000033 s
CharmLB> RotateLB: PE [0] step 1 finished at 2.439914 duration 0.174407 s

iter 16 time: 0.188801 maxerr: 700.305763
iter 17 time: 0.120613 maxerr: 680.097726
iter 18 time: 0.158139 maxerr: 661.540528
iter 19 time: 0.120192 maxerr: 644.421422
iter 20 time: 0.128632 maxerr: 628.564089
iter 21 time: 0.145454 maxerr: 613.821009
iter 22 time: 0.144852 maxerr: 600.067696
iter 23 time: 0.144974 maxerr: 587.198273
iter 24 time: 0.145015 maxerr: 575.122054
iter 25 time: 0.145463 maxerr: 563.760848

CharmLB> RotateLB: PE [0] step 2 starting at 3.827090 Memory: 28.748627 MB
CharmLB> RotateLB: PE [0] strategy starting at 3.827624
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 2 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 3.827733 duration 0.000109 s
CharmLB> RotateLB: PE [0] step 2 finished at 4.027472 duration 0.200382 s

iter 26 time: 0.250348 maxerr: 553.046836
iter 27 time: 0.166564 maxerr: 542.920870
iter 28 time: 0.129685 maxerr: 533.331094
iter 29 time: 0.166664 maxerr: 524.231833
iter 30 time: 0.167105 maxerr: 515.582675
iter 31 time: 0.135257 maxerr: 507.347718
iter 32 time: 0.159439 maxerr: 499.494943
iter 33 time: 0.166295 maxerr: 491.995690
iter 34 time: 0.135089 maxerr: 484.824219
iter 35 time: 0.165992 maxerr: 477.957338

CharmLB> RotateLB: PE [0] step 3 starting at 5.618538 Memory: 28.766556 MB
CharmLB> RotateLB: PE [0] strategy starting at 5.618781
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 2 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 5.618891 duration 0.000110 s
CharmLB> RotateLB: PE [0] step 3 finished at 5.745993 duration 0.127455 s

iter 36 time: 0.241410 maxerr: 471.374089
iter 37 time: 0.127693 maxerr: 465.055477
iter 38 time: 0.127682 maxerr: 458.984241
iter 39 time: 0.127650 maxerr: 453.144656
iter 40 time: 0.127653 maxerr: 447.522361
[Partition 0][Node 0] End of program

With Isomalloc:

[ ! -s "jacobi.iso" ] || ../../../bin/testrun  +p3 ./jacobi.iso 2 2 2 40 +vp8 +balancer RotateLB +LBDebug 1 ++local
Charmrun> scalable start enabled. 
Charmrun> started all node programs in 0.026 seconds.
Charm++> Running in non-SMP mode: 3 processes (PEs)
Converse/Charm++ Commit ID: 0c0404f
Charm++> scheduler running in netpoll mode.
CharmLB> Verbose level 1, load balancing period: 0.5 seconds
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (2 sockets x 4 cores x 1 PUs = 8-way SMP)
Charm++> cpu topology info is gathered in 0.000 seconds.
CharmLB> RotateLB created.
iter 1 time: 0.140234 maxerr: 2020.200000
iter 2 time: 0.122841 maxerr: 1696.968000
iter 3 time: 0.123207 maxerr: 1477.170240
iter 4 time: 0.122113 maxerr: 1319.433024
iter 5 time: 0.159414 maxerr: 1200.918072

CharmLB> RotateLB: PE [0] step 0 starting at 0.786396 Memory: 17.490555 MB
CharmLB> RotateLB: PE [0] strategy starting at 0.786549
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 2 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 0.786578 duration 0.000029 s
CharmLB> RotateLB: PE [0] step 0 finished at 1.054772 duration 0.268376 s

iter 6 time: 0.332162 maxerr: 1108.425519
iter 7 time: 0.120714 maxerr: 1033.970839
iter 8 time: 0.120492 maxerr: 972.509242
iter 9 time: 0.120599 maxerr: 920.721889
iter 10 time: 0.120757 maxerr: 876.344030
iter 11 time: 0.121028 maxerr: 837.779089
iter 12 time: 0.121602 maxerr: 803.868831
iter 13 time: 0.120665 maxerr: 773.751705
iter 14 time: 0.120744 maxerr: 746.772667
iter 15 time: 0.120690 maxerr: 722.424056

CharmLB> RotateLB: PE [0] step 1 starting at 2.432781 Memory: 118.170074 MB
CharmLB> RotateLB: PE [0] strategy starting at 2.433278
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 2 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 2.433386 duration 0.000108 s
CharmLB> RotateLB: PE [0] step 1 finished at 2.833001 duration 0.400220 s

iter 16 time: 0.340377 maxerr: 700.305763
iter 17 time: 0.120955 maxerr: 680.097726
iter 18 time: 0.121006 maxerr: 661.540528
iter 19 time: 0.121059 maxerr: 644.421422
iter 20 time: 0.120938 maxerr: 628.564089
iter 21 time: 0.121209 maxerr: 613.821009
iter 22 time: 0.120912 maxerr: 600.067696
iter 23 time: 0.120837 maxerr: 587.198273
iter 24 time: 0.120891 maxerr: 575.122054
iter 25 time: 0.120792 maxerr: 563.760848

CharmLB> RotateLB: PE [0] step 2 starting at 4.096562 Memory: 227.546524 MB
CharmLB> RotateLB: PE [0] strategy starting at 4.096854
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 2 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 4.096883 duration 0.000029 s
CharmLB> RotateLB: PE [0] step 2 finished at 4.474527 duration 0.377965 s

iter 26 time: 0.334834 maxerr: 553.046836
iter 27 time: 0.121427 maxerr: 542.920870
iter 28 time: 0.121385 maxerr: 533.331094
iter 29 time: 0.122084 maxerr: 524.231833
iter 30 time: 0.121842 maxerr: 515.582675
iter 31 time: 0.122087 maxerr: 507.347718
iter 32 time: 0.122118 maxerr: 499.494943
iter 33 time: 0.121870 maxerr: 491.995690
iter 34 time: 0.121867 maxerr: 484.824219
iter 35 time: 0.122244 maxerr: 477.957338

CharmLB> RotateLB: PE [0] step 3 starting at 5.762272 Memory: 349.500839 MB
CharmLB> RotateLB: PE [0] strategy starting at 5.762491
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 2 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 5.762599 duration 0.000108 s
CharmLB> RotateLB: PE [0] step 3 finished at 6.027424 duration 0.265152 s

iter 36 time: 0.291537 maxerr: 471.374089
iter 37 time: 0.120020 maxerr: 465.055477
iter 38 time: 0.120595 maxerr: 458.984241
iter 39 time: 0.120624 maxerr: 453.144656
iter 40 time: 0.120467 maxerr: 447.522361
[Partition 0][Node 0] End of program

History

#1 Updated by Sam White 25 days ago

  • Subject changed from Isomalloc to Isomalloc produces misleading memory usage numbers

#2 Updated by Sam White 25 days ago

In the output above, the memory usage reported without Isomalloc stays roughly constant across the load balancing steps (between about 20 MB and 29 MB), whereas with Isomalloc it appears to grow at every step (from about 17 MB to about 350 MB). Isomalloc is not actually using that much memory; something is confusing CmiMemoryUsage(). This may be due to how Isomalloc uses mmap and how CmiMemoryUsage() accounts for mmapped memory.
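To make the trend concrete, here is a minimal sketch that takes the "Memory:" values logged at the start of each load balancing step in the two runs above and checks whether they grow monotonically. The helper names `net_growth_mb` and `strictly_increasing` are hypothetical and exist only for this illustration; they are not part of Charm++.

```c
#include <stddef.h>

/* "Memory:" values in MB at the start of LB steps 0..3, copied from the logs above. */
static const double pup_mb[] = {27.046875, 19.700577, 28.748627, 28.766556};
static const double iso_mb[] = {17.490555, 118.170074, 227.546524, 349.500839};

/* Net change in reported memory between the first and the last sample. */
double net_growth_mb(const double *samples, size_t n)
{
    return n < 2 ? 0.0 : samples[n - 1] - samples[0];
}

/* Returns 1 if every sample is strictly larger than the one before it, else 0. */
int strictly_increasing(const double *samples, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (samples[i] <= samples[i - 1])
            return 0;
    }
    return 1;
}
```

Applied to the four samples of each run, the Isomalloc series is strictly increasing with a net growth of roughly 332 MB, while the PUP series fluctuates and ends less than 2 MB above where it started.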

#3 Updated by Evan Ramos 10 days ago

  • Status changed from New to In Progress

The cause is that -memory isomalloc defines its own CmiMemoryUsage, which reports a statistic internal to the ptmalloc3 allocator in memory-gnu.c:

/** Return number of bytes currently allocated, if possible. */
CMK_TYPEDEF_UINT8 CmiMemoryUsage(void)
{
  return _memory_allocated;
}

All handling of this variable in memory-gnu.c is surrounded by /* CHARM++ ADD BEGIN */ and /* CHARM++ ADD END */ markers, meaning it is a Charm++ addition rather than part of the original allocator, so it is possible that it is not reporting the desired values.
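One way a hand-maintained counter like this could drift upward is sketched below. This is a purely illustrative model, not the actual logic in memory-gnu.c, and this report does not establish which code path is responsible. The `mem_stats` type and the `record_alloc`, `record_free`, and `simulate` functions are all hypothetical. The assumption being modeled is that some release path returns memory without decrementing the counter.

```c
#include <stddef.h>

/* Hypothetical bookkeeping: what a counter reports vs. what is really live. */
typedef struct {
    size_t reported; /* value a CmiMemoryUsage-style counter would return */
    size_t live;     /* bytes that are actually still allocated */
} mem_stats;

/* Every allocation is added to both figures. */
void record_alloc(mem_stats *s, size_t n)
{
    s->reported += n;
    s->live += n;
}

/* A release always lowers the live bytes. With accounted == 0 it models
 * an assumed release path that forgets to update the counter. */
void record_free(mem_stats *s, size_t n, int accounted)
{
    s->live -= n;
    if (accounted)
        s->reported -= n;
}

/* Simulate `steps` rounds that each allocate and then release `n` bytes. */
mem_stats simulate(size_t steps, size_t n, int accounted)
{
    mem_stats s = {0, 0};
    for (size_t i = 0; i < steps; i++) {
        record_alloc(&s, n);
        record_free(&s, n, accounted);
    }
    return s;
}
```

In this model, `simulate(4, 100, 1)` ends with both fields at 0, whereas `simulate(4, 100, 0)` ends with `reported` at 400 and `live` at 0. That is the same qualitative pattern as a reported figure that climbs at every load balancing step while real usage stays flat.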

#4 Updated by Evan Ramos 10 days ago

  • Status changed from In Progress to Implemented

https://charm.cs.illinois.edu/gerrit/4470

$ ./charmrun +p3  ./jacobi.iso 2 2 2 40 +vp8 +balancer RotateLB +LBDebug 1 ++local
Charmrun> scalable start enabled. 
Charmrun> started all node programs in 0.017 seconds.
Charm++> Running in non-SMP mode: 3 processes (PEs)
Converse/Charm++ Commit ID: v6.8.2-869-g982734014
Charm++> scheduler running in netpoll mode.
CharmLB> Verbose level 1, load balancing period: 0.5 seconds
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 4 cores x 2 PUs = 8-way SMP)
Charm++> cpu topology info is gathered in 0.002 seconds.
CharmLB> RotateLB created.
iter 1 time: 0.142453 maxerr: 2020.200000
iter 2 time: 0.148573 maxerr: 1696.968000
iter 3 time: 0.156102 maxerr: 1477.170240
iter 4 time: 0.143621 maxerr: 1319.433024
iter 5 time: 0.272285 maxerr: 1200.918072

CharmLB> RotateLB: PE [0] step 0 starting at 0.963085 Memory: 1.710876 MB
CharmLB> RotateLB: PE [0] strategy starting at 0.963746
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 3 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 0.963773 duration 0.000027 s
CharmLB> RotateLB: PE [0] step 0 finished at 1.403715 duration 0.440630 s

iter 6 time: 0.402070 maxerr: 1108.425519
iter 7 time: 0.153616 maxerr: 1033.970839
iter 8 time: 0.280984 maxerr: 972.509242
iter 9 time: 0.183324 maxerr: 920.721889
iter 10 time: 0.151594 maxerr: 876.344030
iter 11 time: 0.135219 maxerr: 837.779089
iter 12 time: 0.160992 maxerr: 803.868831
iter 13 time: 0.172616 maxerr: 773.751705
iter 14 time: 0.157789 maxerr: 746.772667
iter 15 time: 0.171500 maxerr: 722.424056

CharmLB> RotateLB: PE [0] step 1 starting at 3.327597 Memory: 1.825897 MB
CharmLB> RotateLB: PE [0] strategy starting at 3.328260
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 3 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 3.328281 duration 0.000021 s
CharmLB> RotateLB: PE [0] step 1 finished at 3.868468 duration 0.540871 s

iter 16 time: 0.343915 maxerr: 700.305763
iter 17 time: 0.272625 maxerr: 680.097726
iter 18 time: 0.154494 maxerr: 661.540528
iter 19 time: 0.170355 maxerr: 644.421422
iter 20 time: 0.272141 maxerr: 628.564089
iter 21 time: 0.134999 maxerr: 613.821009
iter 22 time: 0.131548 maxerr: 600.067696
iter 23 time: 0.240853 maxerr: 587.198273
iter 24 time: 0.188747 maxerr: 575.122054
iter 25 time: 0.271002 maxerr: 563.760848

CharmLB> RotateLB: PE [0] step 2 starting at 5.864200 Memory: 1.860291 MB
CharmLB> RotateLB: PE [0] strategy starting at 5.864732
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 3 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 5.864749 duration 0.000017 s
CharmLB> RotateLB: PE [0] step 2 finished at 6.444895 duration 0.580695 s

iter 26 time: 0.400345 maxerr: 553.046836
iter 27 time: 0.225215 maxerr: 542.920870
iter 28 time: 0.151393 maxerr: 533.331094
iter 29 time: 0.156504 maxerr: 524.231833
iter 30 time: 0.207614 maxerr: 515.582675
iter 31 time: 0.154018 maxerr: 507.347718
iter 32 time: 0.159434 maxerr: 499.494943
iter 33 time: 0.164532 maxerr: 491.995690
iter 34 time: 0.158423 maxerr: 484.824219
iter 35 time: 0.164938 maxerr: 477.957338

CharmLB> RotateLB: PE [0] step 3 starting at 8.185502 Memory: 1.876205 MB
CharmLB> RotateLB: PE [0] strategy starting at 8.186074
CharmLB> RotateLB: PE [0] Memory: LBManager: 890 KB CentralLB: 3 KB
CharmLB> RotateLB: PE [0] #Objects migrating: 8, LBMigrateMsg size: 0.00 MB
CharmLB> RotateLB: PE [0] strategy finished at 8.186094 duration 0.000020 s
CharmLB> RotateLB: PE [0] step 3 finished at 8.585086 duration 0.399584 s

iter 36 time: 0.368134 maxerr: 471.374089
iter 37 time: 0.230851 maxerr: 465.055477
iter 38 time: 0.136330 maxerr: 458.984241
iter 39 time: 0.130112 maxerr: 453.144656
iter 40 time: 0.166099 maxerr: 447.522361
[Partition 0][Node 0] End of program

#5 Updated by Sam White 8 days ago

  • Target version set to 6.9.0
  • Status changed from Implemented to Merged
