Bug #1577: User-level thread based OpenMP integration support on Mac
User-level thread implementation based on Boost context library
Currently, Converse has several user-level thread implementations:
3. uJcontext (setjmp/longjmp based)
4. pthread based
5. stack copy
ucontext_t is the default when it is available; uJcontext is used on Mac and Quickthread is used on ARM.
ucontext_t is deprecated and is no longer maintained in the Unix and POSIX standards. The behavior of these system calls is not defined consistently across environments.
For example, ucontext_t allows a user-level context to migrate across threads on Linux, but it does not allow this on Mac, because the implementations differ.
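For reference, here is a minimal sketch of what a ucontext_t-based ULT looks like on a single OS thread. It only shows the getcontext/makecontext/swapcontext pattern; it does not exercise the cross-thread migration discussed above. The names run_ult_demo, ult_body and ULT_STACK_SIZE are made up for this illustration and are not Converse identifiers.

```c
#ifdef __APPLE__
#define _XOPEN_SOURCE 700 /* macOS only declares the ucontext routines with this set */
#endif
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define ULT_STACK_SIZE (64 * 1024) /* illustrative size, not a Converse default */

static ucontext_t main_ctx, ult_ctx;
static volatile int steps;

/* Body of the user-level thread: run, yield once, then finish. */
static void ult_body(void) {
    steps = 1;
    swapcontext(&ult_ctx, &main_ctx); /* yield back to the scheduler */
    steps = 2;
    /* Returning from here resumes uc_link (main_ctx). */
}

/* Returns 12 when the ULT ran to its yield point and then to completion. */
int run_ult_demo(void) {
    void *stack = malloc(ULT_STACK_SIZE);
    if (stack == NULL) return -1;
    steps = 0;

    getcontext(&ult_ctx);
    ult_ctx.uc_stack.ss_sp = stack;
    ult_ctx.uc_stack.ss_size = ULT_STACK_SIZE;
    ult_ctx.uc_link = &main_ctx;
    makecontext(&ult_ctx, ult_body, 0);

    swapcontext(&main_ctx, &ult_ctx); /* run the ULT until it yields */
    int after_first = steps;
    swapcontext(&main_ctx, &ult_ctx); /* resume the ULT until it returns */
    int after_second = steps;

    assert(after_first == 1 && after_second == 2);
    free(stack);
    return after_first * 10 + after_second;
}
```

Calling run_ult_demo() from main() on one thread is the portable case; resuming ult_ctx from a different OS thread is the part whose behavior differs between platforms.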
uJcontext, which is based on setjmp/longjmp, cannot be used for this kind of migration. setjmp/longjmp is designed for signal handling and nonlocal goto, so a context has to be resumed on the same thread where it was previously suspended.
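To illustrate the nonlocal-goto use that setjmp/longjmp is designed for, here is a small sketch. The function names (deep, run_nonlocal_goto) are hypothetical and unrelated to the uJcontext code. The jump unwinds back to a frame that is still live on the same stack, which is why the same thread has to do the resuming.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf env;                /* context saved by the caller */
static volatile int depth_reached; /* set by deep() before it jumps */

/* Recurse a few levels, then jump straight back to the setjmp point. */
static void deep(int n) {
    depth_reached = n;
    if (n == 3) {
        longjmp(env, 1); /* unwinds all frames of deep() at once */
    }
    deep(n + 1);
}

/* Returns the recursion depth at which longjmp was taken. */
int run_nonlocal_goto(void) {
    if (setjmp(env) == 0) {
        deep(0);
        return -1; /* not reached: deep() always leaves via longjmp */
    }
    /* We arrive here via longjmp, still inside this live stack frame. */
    assert(depth_reached == 3);
    return depth_reached;
}
```

The saved jmp_buf is only meaningful while the frame that called setjmp still exists, so it is not a general-purpose way to move a suspended context elsewhere.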
Like ucontext_t, Quickthread does not behave the same way across environments.
So Boost context is now the only user-level context implementation available for most of the environments we are targeting.
1. Supported OS: Windows, Mac, Linux/UNIX, iOS
2. Supported architectures: details in the following link (most architectures on each supported OS)
The license of the Boost library is similar to the BSD and MIT licenses, which means we are not obliged to disclose our code that uses Boost.
In addition, Boost provides assembly sources for this context implementation, so we don't need to build the Boost library itself; we can use the implementation with these assembly files alone.
Argobots also adopts fcontext_t and uses assembly code for its user-level threads.
#4 Updated by Seonmyeong Bak 3 months ago
I have now implemented uFcontext using the assembly code from the Boost context library,
and changed the build script to choose the appropriate assembly source files depending on the build target of Charm++.
The Boost context assembly code supports PPC, ARM, MIPS and x86_64 on Linux, Mac OS X and Windows.
The assembly code included in this patch does not yet cover Windows; that will be added in a separate patch.
#5 Updated by Seonmyeong Bak 3 months ago
With this library, the OpenMP integration works well on Mac OS X with GCC, and barrier-related directives also work on Linux and Mac OS X.
Boost context doesn't guarantee thread-local variables, so I turned off the TLS-based Cpv variable option (CMK_NOT_USE_TLS_THREAD 1).
#7 Updated by Seonmyeong Bak about 2 months ago
Copied reply from gerrit
This ULT implementation stores register values on the stack (some space on the stack is reserved for them). So, if the stack of each ULT is migrated properly via CthPup and can be accessed at the same address via Isomalloc, then this ULT works correctly. AMPI uses MEMPOOL_ISOMALLOC by default (at least on netlrts-x86_64 smp; I'm not sure about other environments), so CthPup doesn't need to pack and unpack the stack data. It only packs the pointer to the ULT stack.
So, each ULT continues to use its stack at the same address after it is migrated, and fctx (the pointer to the context data in the ULT stack) is also still valid after migration.
To sum up, this ULT implementation doesn't need to store and restore register values separately.
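A rough sketch of the argument above, not the actual Converse or Boost code: saved_regs_t, ult_t and the functions below are hypothetical stand-ins. The point is only that the context pointer points into the ULT's own stack, so if the stack stays at the same address, copying the descriptor (pointers only) is enough.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the register save area that a context switch
 * writes near the top of the ULT stack; not a Boost or Converse type. */
typedef struct {
    uintptr_t regs[8];
} saved_regs_t;

/* Hypothetical ULT descriptor. */
typedef struct {
    char *stack;        /* base of the ULT stack */
    size_t size;        /* stack size in bytes */
    saved_regs_t *fctx; /* points INTO the stack, not to separate storage */
} ult_t;

/* Allocate a stack and reserve the save area at its 16-byte-aligned top. */
int ult_init(ult_t *t, size_t size) {
    if (size < 2 * sizeof(saved_regs_t)) return -1;
    t->stack = malloc(size);
    if (t->stack == NULL) return -1;
    t->size = size;
    uintptr_t top = ((uintptr_t)t->stack + size) & ~(uintptr_t)15;
    t->fctx = (saved_regs_t *)(top - sizeof(saved_regs_t));
    return 0;
}

/* Returns 1 if the save area lies entirely inside [stack, stack + size). */
int fctx_in_stack(const ult_t *t) {
    uintptr_t lo = (uintptr_t)t->stack;
    uintptr_t p = (uintptr_t)t->fctx;
    return p >= lo && p + sizeof(saved_regs_t) <= lo + t->size;
}

/* "Migration" when the stack keeps its address (the Isomalloc-style case):
 * only the descriptor is copied, and fctx stays valid with no fix-up. */
ult_t ult_migrate_same_address(const ult_t *src) {
    ult_t dst = *src;
    assert(fctx_in_stack(&dst));
    return dst;
}
```

If the stack were instead copied to a different address, fctx would have to be rebased by the address difference, and any saved stack-relative register values would need the same treatment. Keeping the address fixed avoids that.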