Bug #1929

AMPIF print and write statements break when tlsglobals is enabled

Added by Evan Ramos 13 days ago. Updated 13 days ago.

Status: New
Priority: Normal
Assignee:
Category: AMPI
Target version: -
Start date: 06/08/2018
Due date:
% Done: 0%
Tags:

Description

When running an AMPI Fortran program with tlsglobals, formatted output statements of the form WRITE(*,[fmt]) and PRINT [fmt] crash inside libgfortran. Changing these to the list-directed forms WRITE(*,*) and PRINT * works around the crash, but it loses the intended formatting of the output and is not something we should require of users in order to AMPI-ize their code.

History

#1 Updated by Evan Ramos 13 days ago

I built and am using a debug build of libgfortran to investigate further. For reference, this is how I built it:

sudo apt install flex bison libgmp-dev libgmp3-dev libgmp10 libmpc-dev libmpc3 libmpfr-dev libmpfr6
git clone git://gcc.gnu.org/git/gcc.git
mkdir build
cd build
../gcc/configure --disable-multilib
make -j

libgfortran will be one of the last pieces to build. For partial rebuilding, make can be run from build/x86_64-pc-linux-gnu/libgfortran/.

To run against it, copy build/x86_64-pc-linux-gnu/libgfortran/.libs/libgfortran.* into the directory containing your binary, then manually re-run the link command with -L. -rpath-origin added to the ampif90 invocation.

#2 Updated by Evan Ramos 13 days ago

Compiling libgfortran with -O0 instead of -O2 seems to delay crashing until later in MiniGhost's execution.

-O2:

Charm++: standalone mode (not using charmrun)
Charm++> Running in non-SMP mode: 1 processes (PEs)
Converse/Charm++ Commit ID: v6.8.2-743-gb94e029c7
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 6 cores x 2 PUs = 12-way SMP)
Charm++> cpu topology info is gathered in 0.000 seconds.
Charm++> -tlsglobals enabled for privatization of thread-local variables.

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b7ec2c in next_char (fmt=fmt@entry=0x555555ddf6d0, literal=literal@entry=0) at ../../../src/libgfortran/io/format.c:196
196          c = toupper (*fmt->format_string++);
(gdb) bt
#0  0x00007ffff7b7ec2c in next_char (fmt=fmt@entry=0x555555ddf6d0, literal=literal@entry=0) at ../../../src/libgfortran/io/format.c:196
#1  0x00007ffff7b7ed44 in format_lex (fmt=fmt@entry=0x555555ddf6d0) at ../../../src/libgfortran/io/format.c:309
#2  0x00007ffff7b805fe in format_lex (fmt=0x555555ddf6d0) at ../../../src/libgfortran/io/format.c:1346
#3  _gfortrani_parse_format (dtp=dtp@entry=0x4010ff400) at ../../../src/libgfortran/io/format.c:1348
#4  0x00007ffff7b8fa28 in data_transfer_init (dtp=dtp@entry=0x4010ff400, read_flag=read_flag@entry=0) at ../../../src/libgfortran/io/transfer.c:2793
#5  0x00007ffff7b904a4 in _gfortran_st_write (dtp=dtp@entry=0x4010ff400) at ../../../src/libgfortran/io/transfer.c:4133
#6  0x000055555572b7b7 in mg_utils_mod::mg_print_header (comm_method=10, stencil=21, ierr=0) at MG_UTILS.F:323
#7  0x00005555557408ac in mini_ghost (scaling_in=<optimized out>, nx_in=<optimized out>, ny_in=<optimized out>, nz_in=<optimized out>, nvars_in=<optimized out>, percent_sum_in=<optimized out>, nspikes_in=1, ntsteps_in=100, stencil_in=21, comm_method_in=10, bc_in=31, error_tol_in=8, report_diffusion_in=20, npx_in=1, npy_in=1, npz_in=1, report_perf_in=0, 
    cp_method_in=1, cp_interval_in=0, cp_file_in=..., restart_cp_num_in=-2, restart_file_in=..., debug_grid_in=0, _cp_file_in=1433575663, _restart_file_in=1439154320) at DRIVER.F:175
#8  0x000055555572a208 in AMPI_Main (argc=1, argv=0x555555c7c090) at main.c:375
#9  0x00005555557cb142 in AMPI_Main_c (argc=1, argv=0x555555c7c090) at compat_ampi.c:15
#10 0x0000555555754655 in AMPI_Fallback_Main (argc=1, argv=0x555555c7c090) at ampi.C:826
#11 0x0000555555797c44 in MPI_threadstart_t::start (this=0x401100098) at ampi.C:1031
#12 0x0000555555754c0f in AMPI_threadstart (data=0x555555dcef40) at ampi.C:1051
#13 0x0000555555741366 in startTCharmThread (msg=0x555555dcef20) at tcharm.C:175
#14 0x00005555558c5261 in CthStartThread (arg=...) at libthreads-default-tls.c:1770
#15 0x00005555558c56ff in make_fcontext () at make_x86_64_sysv_elf_gas.S:70
#16 0x0000000000000000 in ?? ()
(gdb) disas
Dump of assembler code for function next_char:
   0x00007ffff7b7ebf0 <+0>:    xor    $0x1,%esi
   0x00007ffff7b7ebf3 <+3>:    push   %r12
   0x00007ffff7b7ebf5 <+5>:    push   %rbp
   0x00007ffff7b7ebf6 <+6>:    mov    %esi,%r12d
   0x00007ffff7b7ebf9 <+9>:    push   %rbx
   0x00007ffff7b7ebfa <+10>:    mov    0x24(%rdi),%ebp
   0x00007ffff7b7ebfd <+13>:    mov    %rdi,%rbx
   0x00007ffff7b7ec00 <+16>:    and    $0x1,%r12d
   0x00007ffff7b7ec04 <+20>:    jmp    0x7ffff7b7ec47 <next_char+87>
   0x00007ffff7b7ec06 <+22>:    nopw   %cs:0x0(%rax,%rax,1)
   0x00007ffff7b7ec10 <+32>:    sub    $0x1,%ebp
   0x00007ffff7b7ec13 <+35>:    mov    %ebp,0x24(%rbx)
   0x00007ffff7b7ec16 <+38>:    callq  0x7ffff7a1b0f0 <__ctype_toupper_loc@plt>
   0x00007ffff7b7ec1b <+43>:    mov    (%rax),%rdx
   0x00007ffff7b7ec1e <+46>:    mov    (%rbx),%rax
   0x00007ffff7b7ec21 <+49>:    lea    0x1(%rax),%rcx
   0x00007ffff7b7ec25 <+53>:    mov    %rcx,(%rbx)
   0x00007ffff7b7ec28 <+56>:    movsbq (%rax),%rax
=> 0x00007ffff7b7ec2c <+60>:    mov    (%rdx,%rax,4),%eax
   0x00007ffff7b7ec2f <+63>:    cmp    $0x20,%eax
   0x00007ffff7b7ec32 <+66>:    mov    %al,0x18(%rbx)
   0x00007ffff7b7ec35 <+69>:    sete   %cl
   0x00007ffff7b7ec38 <+72>:    cmp    $0x9,%eax
   0x00007ffff7b7ec3b <+75>:    sete   %dl
   0x00007ffff7b7ec3e <+78>:    or     %dl,%cl
   0x00007ffff7b7ec40 <+80>:    je     0x7ffff7b7ec50 <next_char+96>
   0x00007ffff7b7ec42 <+82>:    test   %r12b,%r12b
   0x00007ffff7b7ec45 <+85>:    je     0x7ffff7b7ec50 <next_char+96>
   0x00007ffff7b7ec47 <+87>:    test   %ebp,%ebp
   0x00007ffff7b7ec49 <+89>:    jne    0x7ffff7b7ec10 <next_char+32>
   0x00007ffff7b7ec4b <+91>:    mov    $0xffffffff,%eax
   0x00007ffff7b7ec50 <+96>:    pop    %rbx
   0x00007ffff7b7ec51 <+97>:    pop    %rbp
   0x00007ffff7b7ec52 <+98>:    pop    %r12
   0x00007ffff7b7ec54 <+100>:    retq   
End of assembler dump.

-O0:

Charm++: standalone mode (not using charmrun)
Charm++> Running in non-SMP mode: 1 processes (PEs)
Converse/Charm++ Commit ID: v6.8.2-743-gb94e029c7
Charm++> scheduler running in netpoll mode.
CharmLB> Load balancer assumes all CPUs are same.
Charm++> Running on 1 hosts (1 sockets x 6 cores x 2 PUs = 12-way SMP)
Charm++> cpu topology info is gathered in 0.000 seconds.
Charm++> -tlsglobals enabled for privatization of thread-local variables.

 ========================================================
           Mantevo miniapp MiniGhost experiment
 ========================================================

 Communication strategy: full message aggregation (COMM_METHOD_BSPMA)

 Computation: 5 pt difference stencil on a 2D grid (STENCIL_2D5PT)

Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7b67dfb in format_lex (fmt=0x555555de0730) at ../../../src/libgfortran/io/format.c:370
370          if (!isdigit (c))
(gdb) bt
#0  0x00007ffff7b67dfb in format_lex (fmt=0x555555de0730) at ../../../src/libgfortran/io/format.c:370
#1  0x00007ffff7b68f7f in parse_format_list (dtp=0x4010ff400, seen_dd=0x4010ff1a7) at ../../../src/libgfortran/io/format.c:1096
#2  0x00007ffff7b695e2 in _gfortrani_parse_format (dtp=0x4010ff400) at ../../../src/libgfortran/io/format.c:1349
#3  0x00007ffff7b7cf52 in data_transfer_init (dtp=0x4010ff400, read_flag=0) at ../../../src/libgfortran/io/transfer.c:2793
#4  0x00007ffff7b7fd21 in _gfortran_st_write (dtp=0x4010ff400) at ../../../src/libgfortran/io/transfer.c:4133
#5  0x000055555572ba15 in mg_utils_mod::mg_print_header (comm_method=<optimized out>, stencil=<optimized out>, ierr=0) at MG_UTILS.F:378
#6  0x00005555557408ac in mini_ghost (scaling_in=<optimized out>, nx_in=<optimized out>, ny_in=<optimized out>, nz_in=<optimized out>, nvars_in=<optimized out>, percent_sum_in=<optimized out>, nspikes_in=1, ntsteps_in=100, stencil_in=21, comm_method_in=10, bc_in=31, error_tol_in=8, report_diffusion_in=20, npx_in=1, npy_in=1, npz_in=1, report_perf_in=0, 
    cp_method_in=1, cp_interval_in=0, cp_file_in=..., restart_cp_num_in=-2, restart_file_in=..., debug_grid_in=0, _cp_file_in=1433575663, _restart_file_in=1439154320) at DRIVER.F:175
#7  0x000055555572a208 in AMPI_Main (argc=1, argv=0x555555c7c090) at main.c:375
#8  0x00005555557cb142 in AMPI_Main_c (argc=1, argv=0x555555c7c090) at compat_ampi.c:15
#9  0x0000555555754655 in AMPI_Fallback_Main (argc=1, argv=0x555555c7c090) at ampi.C:826
#10 0x0000555555797c44 in MPI_threadstart_t::start (this=0x401100098) at ampi.C:1031
#11 0x0000555555754c0f in AMPI_threadstart (data=0x555555dcef40) at ampi.C:1051
#12 0x0000555555741366 in startTCharmThread (msg=0x555555dcef20) at tcharm.C:175
#13 0x00005555558c5261 in CthStartThread (arg=...) at libthreads-default-tls.c:1770
#14 0x00005555558c56ff in make_fcontext () at make_x86_64_sysv_elf_gas.S:70
#15 0x0000000000000000 in ?? ()
(gdb) disas
Dump of assembler code for function format_lex:
[...]
   0x00007ffff7b67de7 <+447>:    callq  0x7ffff79abc70 <__ctype_b_loc@plt>
   0x00007ffff7b67dec <+452>:    mov    (%rax),%rax
   0x00007ffff7b67def <+455>:    mov    -0xc(%rbp),%edx
   0x00007ffff7b67df2 <+458>:    movslq %edx,%rdx
   0x00007ffff7b67df5 <+461>:    add    %rdx,%rdx
   0x00007ffff7b67df8 <+464>:    add    %rdx,%rax
=> 0x00007ffff7b67dfb <+467>:    movzwl (%rax),%eax
   0x00007ffff7b67dfe <+470>:    movzwl %ax,%eax
   0x00007ffff7b67e01 <+473>:    and    $0x800,%eax
   0x00007ffff7b67e06 <+478>:    test   %eax,%eax
   0x00007ffff7b67e08 <+480>:    je     0x7ffff7b67e2d <format_lex+517>
[...]

Interestingly, the section of assembly code crashing with -O0 matches a commented explanation here: https://stackoverflow.com/a/50296176
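For readers less familiar with glibc internals, here is a minimal sketch of what the two faulting instructions appear to be doing. It assumes glibc on x86-64; the helper names digit_via_table and upper_via_table are hypothetical and are not part of libgfortran or AMPI. The first helper mirrors the -O0 sequence (call __ctype_b_loc, load the table pointer, add 2*c, load a 16-bit entry, mask with 0x800, which is consistent with the digit bit on little-endian glibc). The second mirrors the -O2 sequence (call __ctype_toupper_loc, load the table pointer, load a 32-bit entry at index c).

```c
/* glibc-specific sketch: performs by hand the table lookups that
   isdigit() and toupper() perform in the disassembly above.
   Helper names are illustrative only. */
#include <ctype.h>

/* Equivalent of isdigit(c) != 0: index the classification table
   (16-bit entries) and test the digit bit. */
int digit_via_table(int c)
{
    const unsigned short *table = *__ctype_b_loc();
    return (table[c] & _ISdigit) != 0;
}

/* Equivalent of toupper(c): index the case-mapping table
   (32-bit entries). */
int upper_via_table(int c)
{
    return (*__ctype_toupper_loc())[c];
}
```

If the pointer returned through __ctype_b_loc or __ctype_toupper_loc is not valid in the current thread's context, the indexed load is where a fault would be expected, which is where both backtraces stop.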

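Both crashes share one characteristic: the segfault occurs while reading the lookup table behind a ctype.h function (toupper with -O2, isdigit with -O0). These tables are reached through __ctype_toupper_loc and __ctype_b_loc, an implementation detail that happens to use thread-local storage.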
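To illustrate the thread-local aspect outside of AMPI, the following sketch checks whether a second POSIX thread sees a different storage slot for the ctype table pointer than its creator does. It assumes glibc with pthreads; the function names are hypothetical. It does not reproduce the crash and says nothing about how tlsglobals switches TLS for user-level threads; it only shows that the pointer fetched by __ctype_b_loc lives in per-thread storage.

```c
/* glibc/pthreads sketch: each thread has its own slot holding the
   pointer to the ctype classification table.
   Function names are illustrative only. */
#include <ctype.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Thread body: compare this thread's slot address with the caller's
   and return 1 if they differ, 0 if they are the same. */
static void *compare_ctype_slot(void *caller_slot)
{
    const void *my_slot = (const void *)__ctype_b_loc();
    return (void *)(intptr_t)(my_slot != caller_slot);
}

/* Returns 1 if a second thread uses a different slot than the caller,
   0 if it uses the same one, -1 if the thread could not be run. */
int ctype_slot_is_per_thread(void)
{
    pthread_t worker;
    void *result = NULL;
    void *caller_slot = (void *)__ctype_b_loc();

    if (pthread_create(&worker, NULL, compare_ctype_slot, caller_slot) != 0)
        return -1;
    if (pthread_join(worker, &result) != 0)
        return -1;
    return (int)(intptr_t)result;
}
```

On older glibc versions this may need to be linked with -pthread.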

#3 Updated by Evan Ramos 13 days ago

This seems relevant, but doesn't specifically help narrow down a solution: https://gcc.gnu.org/onlinedocs/gfortran/Thread-safety-of-the-runtime-library.html
