charm.git
Chao Mei [Thu, 27 Aug 2009 20:09:39 +0000 (20:09 +0000)]
Port the general way of getting memory usage information from NAMD into Charm++ when GNU_MEMORY is not used.

Esteban Meneses [Thu, 27 Aug 2009 15:56:24 +0000 (15:56 +0000)]
Inserted a few comments in the code.

Gengbin Zheng [Thu, 27 Aug 2009 13:51:45 +0000 (13:51 +0000)]
fixed a silly bug (compiler bug?) that treated a commented-out line as the if's else branch.
also includes a lot of cleanup.

Gengbin Zheng [Thu, 27 Aug 2009 13:50:10 +0000 (13:50 +0000)]
minor cleanup

Isaac Dooley [Wed, 26 Aug 2009 22:30:20 +0000 (22:30 +0000)]
Fixing performance problems with OneTimeMulticast algorithms.

Gengbin Zheng [Wed, 26 Aug 2009 20:28:01 +0000 (20:28 +0000)]
use slightly better version of quickthreads on bgp (generic-light)

Esteban Meneses [Wed, 26 Aug 2009 20:20:12 +0000 (20:20 +0000)]
Included basic support for group message logging.

Gengbin Zheng [Wed, 26 Aug 2009 18:55:26 +0000 (18:55 +0000)]
comm stats collection in array location manager calls idx2LDObjid in its argument (which is expensive, e.g. about 2us on BG/P). Even when comm stats collection is not on (for example, when the lb module is not even linked in), one still has to pay the overhead of calling idx2LDObjid().
eliminated this by moving the guard (that comm stats collection is on) up to the location manager.
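
The overhead described above follows from C++ evaluating call arguments before the callee gets a chance to test any flag. A minimal sketch of hoisting the guard into the caller is shown here; `expensiveIdLookup`, `recordSend`, `collectingStats`, and the two `send*` functions are hypothetical stand-ins for illustration, not the actual Charm++ code.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the actual Charm++ code.
static int lookupCalls = 0;          // counts how often the costly lookup runs
static bool collectingStats = false; // stands in for "comm stats collection is on"

// Pretend this is the expensive index-to-object-id translation.
int expensiveIdLookup(int idx) { ++lookupCalls; return idx * 2; }

// The callee checks the flag, but by then its argument was already computed.
void recordSend(int objId, int bytes) {
    assert(bytes >= 0);
    if (!collectingStats) return;
    (void)objId; // a real implementation would update statistics here
}

// Before: the lookup runs on every send, even when stats are off.
void sendUnguarded(int idx, int bytes) {
    recordSend(expensiveIdLookup(idx), bytes);
}

// After: the guard is hoisted to the caller, so the lookup is skipped entirely.
void sendGuarded(int idx, int bytes) {
    if (collectingStats) recordSend(expensiveIdLookup(idx), bytes);
}
```

With the flag off, `sendUnguarded` still pays for one lookup per call while `sendGuarded` pays for none.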

Gengbin Zheng [Wed, 26 Aug 2009 18:52:05 +0000 (18:52 +0000)]
added a new function CollectingCommStats() which returns true when comm stats collection is actually happening.

Pritish Jetley [Wed, 26 Aug 2009 18:11:41 +0000 (18:11 +0000)]
tstop param

Pritish Jetley [Wed, 26 Aug 2009 16:50:21 +0000 (16:50 +0000)]
final version

Gengbin Zheng [Wed, 26 Aug 2009 15:54:59 +0000 (15:54 +0000)]
charmrun now can be called from job script for multiple runs

Gengbin Zheng [Mon, 24 Aug 2009 19:20:55 +0000 (19:20 +0000)]
kill charmrun if jacobi.iso hangs. (this does not necessarily have to work on all platforms)

Gengbin Zheng [Mon, 24 Aug 2009 15:41:43 +0000 (15:41 +0000)]
fixed for uth when ccs is not supported

Gengbin Zheng [Mon, 24 Aug 2009 15:21:33 +0000 (15:21 +0000)]
fixed for VC++

Gengbin Zheng [Mon, 24 Aug 2009 15:19:01 +0000 (15:19 +0000)]
a dummy CmiPushPE to fix the link error. It looks like CmiPushPE was missing on vmi layer, and now is required by recent changes to exist on every machine layer

Gengbin Zheng [Sun, 23 Aug 2009 03:36:02 +0000 (03:36 +0000)]
fixed VC++ errors

Gengbin Zheng [Sun, 23 Aug 2009 03:34:03 +0000 (03:34 +0000)]
make killpe portable to platforms where kill and getpid are missing

Gengbin Zheng [Sun, 23 Aug 2009 03:28:13 +0000 (03:28 +0000)]
fix for VC++

Gengbin Zheng [Sun, 23 Aug 2009 03:24:59 +0000 (03:24 +0000)]
fixed VC++ errors

Gengbin Zheng [Sun, 23 Aug 2009 03:23:19 +0000 (03:23 +0000)]
fixed syntax errors for VC++

Filippo Gioachin [Fri, 21 Aug 2009 23:03:10 +0000 (23:03 +0000)]
fix for machines without CCS

Filippo Gioachin [Fri, 21 Aug 2009 23:02:56 +0000 (23:02 +0000)]
comment style

Filippo Gioachin [Fri, 21 Aug 2009 23:01:14 +0000 (23:01 +0000)]
making CmiPushPE globally available

Esteban Meneses [Fri, 21 Aug 2009 19:41:35 +0000 (19:41 +0000)]
Fixed a bug with a comment and a macro.

Filippo Gioachin [Fri, 21 Aug 2009 00:52:57 +0000 (00:52 +0000)]
using extended header

Filippo Gioachin [Fri, 21 Aug 2009 00:47:32 +0000 (00:47 +0000)]
Changing CMK_MSG_HEADER_BASIC to be identical to CMK_MSG_HEADER_EXT.
The reasons behind this change:
- net is the only machine layer which does not define them identically
- we are saving 4 bytes on headers in pure converse programs
- it is an unnecessary complication
- it is not even 16-byte aligned

Filippo Gioachin [Thu, 20 Aug 2009 23:56:45 +0000 (23:56 +0000)]
update signature of merge functions

Filippo Gioachin [Thu, 20 Aug 2009 23:55:41 +0000 (23:55 +0000)]
Added a new field to the converse header. This is used for Converse reductions. (before I tried to use the "root" field, but apparently this is not possible at least in MPI)

Filippo Gioachin [Thu, 20 Aug 2009 23:52:01 +0000 (23:52 +0000)]
print stack trace in MPI layer when a signal kills the program

Filippo Gioachin [Thu, 20 Aug 2009 23:49:32 +0000 (23:49 +0000)]
missing include

Filippo Gioachin [Thu, 20 Aug 2009 22:46:04 +0000 (22:46 +0000)]
Fixed wrong call

Filippo Gioachin [Thu, 20 Aug 2009 01:54:05 +0000 (01:54 +0000)]
*** empty log message ***

Filippo Gioachin [Thu, 20 Aug 2009 01:52:42 +0000 (01:52 +0000)]
Unified the file to support four different builds, with all the combinations of on/off of the two compile time flags CMK_SEPARATE_SLOT and CPD_USE_MMAP.

Filippo Gioachin [Thu, 20 Aug 2009 01:51:29 +0000 (01:51 +0000)]
Changed CCS to support Converse-level broadcasts and multicasts (i.e. the same handler executed on multiple processors). A function can be set by the application to decide how the replies from the various processors ought to be merged. The reduction to merge all the replies is performed by the system, and the user only has to call CcsReply on every processor. As a consequence, the CcsDelayedReply data structure has also changed.
CpdDebug now uses the new CCS broadcast to reply to queries.
Changed how notifications are sent to CharmDebug: a new generic function "CpdNotify" is introduced.
Enabled "EmergencyExit" on net and mpi layers. This function is called whenever CmiAbort is called or a signal is received. In the MPI layer, signal handlers are now registered, and the shutdown process has a barrier to ensure all processors have a chance of calling EmergencyExit.

Filippo Gioachin [Thu, 20 Aug 2009 01:43:43 +0000 (01:43 +0000)]
Added function to kill a processor with signal 9, signature: "ccs_killpe"

Filippo Gioachin [Thu, 20 Aug 2009 01:41:04 +0000 (01:41 +0000)]
New implementation of CmiReduce routine (and the like).
Now they support multiple simultaneous reductions. They also have forms for subsets of processors involved in the reduction process.
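
One common way to keep several reductions in flight at once is to key the partial results by a reduction identifier. The sketch below illustrates that idea only; `ReductionTable`, its `contribute` method, and the integer ids are hypothetical and do not reflect the actual CmiReduce interface or implementation. The contributor count passed to the constructor stands in for the size of the participating subset.

```cpp
#include <cassert>
#include <map>

// Hypothetical bookkeeping for illustration; not the CmiReduce implementation.
class ReductionTable {
public:
    explicit ReductionTable(int contributors) : contributors_(contributors) {
        assert(contributors > 0);
    }

    // Add one contribution to reduction `id`. Returns true when that
    // reduction is complete and writes the summed result to `result`.
    bool contribute(int id, long value, long* result) {
        State& s = pending_[id]; // default-initialized on first use
        s.sum += value;
        if (++s.received < contributors_) return false;
        *result = s.sum;
        pending_.erase(id);
        return true;
    }

private:
    struct State { long sum = 0; int received = 0; };
    int contributors_;
    std::map<int, State> pending_; // one entry per in-flight reduction
};
```

Because each id has its own entry, contributions to different reductions can interleave in any order without corrupting each other.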

Filippo Gioachin [Thu, 20 Aug 2009 01:37:59 +0000 (01:37 +0000)]
building a couple more versions of memory-charmdebug

Filippo Gioachin [Thu, 20 Aug 2009 01:37:28 +0000 (01:37 +0000)]
Deleted signal handler: this functionality is now part of EmergencyExit

Filippo Gioachin [Thu, 20 Aug 2009 01:36:59 +0000 (01:36 +0000)]
More information pupped for messages to the debugger

Filippo Gioachin [Thu, 20 Aug 2009 01:35:31 +0000 (01:35 +0000)]
Added EmergencyExit function, to be called upon CmiAbort or signal received

Filippo Gioachin [Thu, 20 Aug 2009 01:34:43 +0000 (01:34 +0000)]
*** empty log message ***

Filippo Gioachin [Thu, 20 Aug 2009 01:09:41 +0000 (01:09 +0000)]
enable the tracing of nested begin/end of entry methods.
This solution does not take care of message dependencies: if B is executed inside A, it records
begA, endA, begB, endB, begA, endA
(the second and fifth log entries are the addition)
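
The flattening described above can be sketched with a stack of active entry methods: beginning a nested method first closes the outer interval, and ending it reopens the outer one. `NestedTracer` and its members are hypothetical names for illustration, not the tracing code in Charm++.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical tracer for illustration; names are not from the Charm++ sources.
class NestedTracer {
public:
    // Begin an entry method; if another one is running, close its interval first.
    void begin(const std::string& name) {
        if (!active_.empty()) log_.push_back("end" + active_.back());
        active_.push_back(name);
        log_.push_back("beg" + name);
    }

    // End the innermost entry method; if it was nested, reopen the outer one.
    void end() {
        assert(!active_.empty());
        log_.push_back("end" + active_.back());
        active_.pop_back();
        if (!active_.empty()) log_.push_back("beg" + active_.back());
    }

    const std::vector<std::string>& log() const { return log_; }

private:
    std::vector<std::string> active_; // stack of entry methods currently executing
    std::vector<std::string> log_;    // flat, non-overlapping begin/end records
};
```

Calling `begin("A")`, `begin("B")`, `end()`, `end()` yields the six records listed in the commit message, with the extra `endA` and `begA` in the second and fifth positions.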

Gengbin Zheng [Wed, 19 Aug 2009 16:59:08 +0000 (16:59 +0000)]
added more comments

Gengbin Zheng [Wed, 19 Aug 2009 16:52:53 +0000 (16:52 +0000)]
a simple change to make it SMP node aware

Gengbin Zheng [Wed, 19 Aug 2009 13:18:09 +0000 (13:18 +0000)]
unistd.h does not exist on windows

Gengbin Zheng [Wed, 19 Aug 2009 13:16:41 +0000 (13:16 +0000)]
test sleep call

Gengbin Zheng [Wed, 19 Aug 2009 02:32:05 +0000 (02:32 +0000)]
recognize some errors from qsub to avoid calling qsub infinitely

Gengbin Zheng [Tue, 18 Aug 2009 21:10:54 +0000 (21:10 +0000)]
minor

Gengbin Zheng [Tue, 18 Aug 2009 21:06:47 +0000 (21:06 +0000)]
#include <unistd.h> for sleep

Gengbin Zheng [Tue, 18 Aug 2009 21:03:52 +0000 (21:03 +0000)]
minor change

Gengbin Zheng [Tue, 18 Aug 2009 20:59:29 +0000 (20:59 +0000)]
use ampicxx

Gengbin Zheng [Tue, 18 Aug 2009 20:50:23 +0000 (20:50 +0000)]
use ampicxx to resolve mpi.h search issue

Gengbin Zheng [Tue, 18 Aug 2009 20:47:44 +0000 (20:47 +0000)]
use ampicc to compile femmain.C to avoid conflict in searching mpi.h

Gengbin Zheng [Tue, 18 Aug 2009 20:44:54 +0000 (20:44 +0000)]
changes to compile on bgp

Gengbin Zheng [Tue, 18 Aug 2009 20:35:51 +0000 (20:35 +0000)]
uses ampicxx to compile

Gengbin Zheng [Tue, 18 Aug 2009 20:34:55 +0000 (20:34 +0000)]
added -DMPICH_IGNORE_CXX_SEEK

Gengbin Zheng [Tue, 18 Aug 2009 18:22:31 +0000 (18:22 +0000)]
disable isomalloc since it does not work

Esteban Meneses [Tue, 18 Aug 2009 15:56:58 +0000 (15:56 +0000)]
Removed a condition for the FT versions. Now, all of them are free to use any chare distribution algorithm (not just round-robin).

Gengbin Zheng [Tue, 18 Aug 2009 15:38:36 +0000 (15:38 +0000)]
delete the old charmrun that was copied from mpi. Create a new charmrun script that submits the job, waits until it finishes and prints the output.

Gengbin Zheng [Sat, 15 Aug 2009 01:51:34 +0000 (01:51 +0000)]
QD was broken if a charm message was sent from an immediate msg handler. Fix by sending its count to the rank 0 processor on the same node.

Esteban Meneses [Tue, 11 Aug 2009 21:53:21 +0000 (21:53 +0000)]
The size of the checkpoint was reduced by not storing the message log. Only unacked local messages are saved as part of the checkpoint of an object.

Gengbin Zheng [Tue, 11 Aug 2009 21:49:40 +0000 (21:49 +0000)]
don't apply -fPIC on mpicxx/icpc anymore since it totally breaks.

Abhinav Bhatele [Mon, 10 Aug 2009 05:34:55 +0000 (05:34 +0000)]
bug fix for index calculation

Abhinav Bhatele [Mon, 10 Aug 2009 05:33:33 +0000 (05:33 +0000)]
bug fix in index calculation

Abhinav Bhatele [Sun, 9 Aug 2009 23:27:37 +0000 (23:27 +0000)]
updated compilers and cleanup

Gengbin Zheng [Thu, 6 Aug 2009 02:41:01 +0000 (02:41 +0000)]
minor change in the warning about stack randomization.

Isaac Dooley [Wed, 5 Aug 2009 21:27:14 +0000 (21:27 +0000)]
Reverting accidental enabling of critical path code in previous commit.

Isaac Dooley [Wed, 5 Aug 2009 21:02:57 +0000 (21:02 +0000)]
Making critical path auto-prioritization work better. Now handles ForChareMsg envelopes.

Gengbin Zheng [Wed, 5 Aug 2009 18:03:42 +0000 (18:03 +0000)]
remove ssize_t to make it compile on windows

Gengbin Zheng [Wed, 5 Aug 2009 02:07:04 +0000 (02:07 +0000)]
print memory usage in %f instead of %d

Isaac Dooley [Tue, 4 Aug 2009 21:04:56 +0000 (21:04 +0000)]
Adding support for automatic message prioritization.

Isaac Dooley [Tue, 4 Aug 2009 20:48:17 +0000 (20:48 +0000)]
Adding the ability to disable and enable trace log output.

Gengbin Zheng [Tue, 4 Aug 2009 17:10:53 +0000 (17:10 +0000)]
handle gfortran's mangled name (which incidentally starts with '_'), a patch from Edurado with modification

Gengbin Zheng [Tue, 4 Aug 2009 17:08:26 +0000 (17:08 +0000)]
check if regex.h exists, for elfgot

Gengbin Zheng [Mon, 3 Aug 2009 04:11:15 +0000 (04:11 +0000)]
timer cost can now be set in bg_config and command line option (+bgtimercost)

Gengbin Zheng [Mon, 3 Aug 2009 03:36:42 +0000 (03:36 +0000)]
added an option for using native AIX timer calls

Gengbin Zheng [Mon, 3 Aug 2009 03:34:18 +0000 (03:34 +0000)]
skip the walltime in making array broadcast in bigsim

Gengbin Zheng [Mon, 3 Aug 2009 03:32:22 +0000 (03:32 +0000)]
skip the time cost in tracing bigsim

Abhinav Bhatele [Sun, 2 Aug 2009 00:08:25 +0000 (00:08 +0000)]
minor changes

Isaac Dooley [Sat, 1 Aug 2009 22:24:15 +0000 (22:24 +0000)]
Adding new capability to automatically set message priorities based on critical path profile.

Filippo Gioachin [Fri, 31 Jul 2009 19:36:55 +0000 (19:36 +0000)]
changed FAQ manual to be split in several html pages

Filippo Gioachin [Thu, 30 Jul 2009 22:19:27 +0000 (22:19 +0000)]
Removed old code that came from ChaNGa

Gengbin Zheng [Thu, 30 Jul 2009 21:11:31 +0000 (21:11 +0000)]
various changes to reduce bigsim tracing overhead, including:
1.  tracing a thread used to include a context switch overhead from Cth back to scheduler. This can incidentally include charm scheduling overhead
2.  reduce timer overhead by subtracting timer cost from elapsed time.
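
Item 2 amounts to removing the cost of the timer calls themselves from each measured interval. A minimal sketch, assuming a fixed per-call timer cost; `correctedElapsed` and its parameters are hypothetical names, and the clamp at zero is an added safeguard rather than something the commit describes.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: removes the cost of the timer calls themselves
// from a measured interval. All names here are illustrative.
double correctedElapsed(double start, double stop, double timerCost, int timerCalls) {
    assert(stop >= start && timerCost >= 0.0 && timerCalls >= 0);
    double raw = stop - start;
    double overhead = timerCost * timerCalls;
    // Clamp at zero so a very short interval never turns negative.
    return std::max(0.0, raw - overhead);
}
```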

Gengbin Zheng [Thu, 30 Jul 2009 20:53:15 +0000 (20:53 +0000)]
endTime of event log now starts with -1 instead of 0, denoting an uninitialized value.

Gengbin Zheng [Thu, 30 Jul 2009 20:26:29 +0000 (20:26 +0000)]
don't need to create a nodegroup for reduction if the group level reduction is enabled in bigsim

Lukasz Wesolowski [Thu, 30 Jul 2009 16:48:56 +0000 (16:48 +0000)]
Added some operators which were missing from SSE-Float.h

Gengbin Zheng [Thu, 30 Jul 2009 16:40:29 +0000 (16:40 +0000)]
added old way of doing reduction via processor level tree (no nodegroup) for bigsim (defined in macro GROUP_LEVEL_REDUCTION)

also make recvMsg in ckreduction.ci expedited

Gengbin Zheng [Thu, 30 Jul 2009 16:13:20 +0000 (16:13 +0000)]
exit with real compilation status instead of the status of the last command

Gengbin Zheng [Wed, 29 Jul 2009 20:01:45 +0000 (20:01 +0000)]
skip thread listener hook functions if they are NULL pointers.

Gengbin Zheng [Wed, 29 Jul 2009 19:31:36 +0000 (19:31 +0000)]
remove static in a Cpv to make icpc happy without internal error with -O and -fPIC

Gengbin Zheng [Wed, 29 Jul 2009 17:05:49 +0000 (17:05 +0000)]
add magic number to barrier message. we still need a reliable cmibarrier for udp layer
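
A magic number lets a receiver reject packets that are not barrier messages, which is presumably the motivation on an unreliable transport such as UDP. The sketch below shows the general technique only; `BarrierMsg`, `kBarrierMagic`, `parseBarrier`, and the field layout are hypothetical and not the actual Charm++ wire format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical wire format for illustration; not the actual Charm++ layout.
constexpr std::uint32_t kBarrierMagic = 0x42415252u; // arbitrary tag value

struct BarrierMsg {
    std::uint32_t magic;  // identifies the packet as a barrier message
    std::uint32_t sender; // rank of the sending processor
};

// Returns true and fills `out` only if the buffer holds a well-formed barrier message.
bool parseBarrier(const unsigned char* buf, std::size_t len, BarrierMsg* out) {
    assert(buf != nullptr && out != nullptr);
    if (len != sizeof(BarrierMsg)) return false;
    BarrierMsg msg;
    std::memcpy(&msg, buf, sizeof msg);           // avoid unaligned access
    if (msg.magic != kBarrierMagic) return false; // stray or corrupted packet
    *out = msg;
    return true;
}
```

The check only filters unrelated or damaged packets; it does not by itself make the barrier reliable, which matches the commit's note that a reliable barrier is still needed.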

Gengbin Zheng [Wed, 29 Jul 2009 16:48:38 +0000 (16:48 +0000)]
make the new code that finds the max isomalloc region using mmap probing scheme more robust.

Gengbin Zheng [Wed, 22 Jul 2009 15:35:07 +0000 (15:35 +0000)]
test if a shared library can be built when the mpi library is linked

Gengbin Zheng [Wed, 22 Jul 2009 03:12:17 +0000 (03:12 +0000)]
added fortran interface for AMPI_Setmigratable

Filippo Gioachin [Wed, 22 Jul 2009 00:26:15 +0000 (00:26 +0000)]
fixed long standing (9 years!) bug in CCS when running without charmrun (like in MPI build)

Gengbin Zheng [Tue, 21 Jul 2009 19:18:01 +0000 (19:18 +0000)]
make sure -1 is returned when all topo function APIs are not supported

Gengbin Zheng [Tue, 21 Jul 2009 19:02:27 +0000 (19:02 +0000)]
fixed a typo in previous checkin

Gengbin Zheng [Tue, 21 Jul 2009 18:21:52 +0000 (18:21 +0000)]
a bug in numUniqNodes() that returns wrong num of physical nodes when called multiple times

Gengbin Zheng [Tue, 21 Jul 2009 15:49:11 +0000 (15:49 +0000)]
remove the hack for autobuild on vmi