HDF5 issues in AMPI
The HDF5 library is available for AMPI at
This bug tracks the issues that still need to be resolved for complete support.
#1 Updated by Matthias Diener 6 months ago
The issues to test and improve are:
- Test if applications work with the shared library (currently, only the static hdf5 library is built)
- This is currently blocked by the lack of a shared-library ROMIO
- SMP mode (seems to work 07/16)
- Virtualization (seems to work 07/16)
- Some spurious crashes/segfaults at hdf5 library termination
- Test architectures other than linux/netlrts. Currently works with:
- netlrts-linux-x86_64 smp
- '-tlsglobals' currently requires static linking: https://charm.cs.illinois.edu/redmine/issues/1220
For migration, the main concern is migrating with open files. So far we've told users to explicitly close files before migration and re-open them afterwards (or, if doing serial I/O, to make that rank non-migratable), but we could potentially do that for them in our ROMIO and HDF5 distributions.
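A minimal sketch of the close-and-reopen pattern described above, using plain Python file I/O as a stand-in for an HDF5 or MPI-IO handle. The `ReopenableFile` wrapper, `migrate_with_open_file`, and the `migrate` callback are hypothetical names for illustration only; in a real AMPI program the callback would presumably be the runtime's migration call, and the open/close steps would be the matching `H5Fopen`/`H5Fclose` or `MPI_File_open`/`MPI_File_close` calls.

```python
import os
import tempfile


class ReopenableFile:
    """Hypothetical wrapper that keeps just enough state to reopen a file."""

    def __init__(self, path, mode="r+b"):
        # Assumption: the mode must not truncate on open (so not "w"),
        # otherwise reopening after migration would discard the data.
        self.path = path
        self.mode = mode
        self.offset = 0
        self.handle = open(path, mode)

    def close_for_migration(self):
        # Flush pending writes, remember the position, and drop the handle
        # so no OS-level file descriptor has to travel with the rank.
        self.handle.flush()
        self.offset = self.handle.tell()
        self.handle.close()
        self.handle = None

    def reopen_after_migration(self):
        # Reopen by path on the destination and restore the saved position.
        self.handle = open(self.path, self.mode)
        self.handle.seek(self.offset)


def migrate_with_open_file(f, migrate):
    """Close f, run the (placeholder) migration callback, then reopen f."""
    f.close_for_migration()
    migrate()  # stand-in for the runtime's migration point
    f.reopen_after_migration()
```

The wrapper only works if the file is reachable under the same path after migration (for example on a shared file system), which is an assumption of this sketch rather than something the runtime guarantees.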
#5 Updated by Matthias Diener 5 months ago
I have a patch to update ROMIO to 1.2.6 (shipped with the last version of MPICH1) that compiles successfully with the current AMPI and passes the entire ROMIO test suite. More advanced features not currently supported by AMPI (such as generalized requests) are still optional in 1.2.6.
Getting it to actually build a shared library is not so easy though. The current Makefile generates some weird libtool archive that I haven't been able to convert to a .so yet, which is why I haven't submitted the patch to gerrit yet.
#7 Updated by Matthias Diener 5 months ago
HDF5 serial tests working (all 62):
- testhdf5, cache, cache_api, cache_image, cache_tagging, lheap, ohdr, stab, gheap, evict_on_close, farray, earray, btree2, fheap, pool, accum, hyperslab, istore, bittests, dt_arith, page_buffer, dtypes, dsets, cmpd_dset, filter_fail, extend, external, efc, objcopy, links, unlink, twriteorder, big, mtime, fillval, mount, flush1, flush2, app_ref, enum, set_extent, ttsafe, enc_dec_plist, enc_dec_plist_cross_platform, getname, vfd, ntypes, dangle, dtransform, reserved, cross_read, freespace, mf, vds, file_image, unregister, cache_logging, cork, swmr, testerror.sh, testlibinfo.sh, testcheck_version.sh
NB: testcheck_version.sh shows some discrepancies in the exit codes returned on failure, but this is not significant for application execution (and can't be fixed for now).