ROMIO and “large counts”
Hey, remember a long time ago when I went through ROMIO to make it “big data” clean? (Large transfers in ROMIO, or this paper about fun with datatypes: https://ieeexplore.ieee.org/document/7018162) So I thought I was done when MPI-4 introduced new “large count” routines that now take an MPI_Count type for counts of items instead of int.
Well, it turns out I was *not* done. A vendor reported problems and we started digging. Thankfully compilers have gotten more helpful in the last decade. -fsanitize=undefined catches all the runtime overflows, but collaborators kept finding problems. Clang still has -Wshorten-64-to-32, though, and ROMIO can also call the _c versions of MPI routines, so it was time to go tackle those lingering warnings.
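To make that class of bug concrete, here is a minimal sketch in plain C. It is not ROMIO code: big_count_t is a hypothetical stand-in for MPI_Count (assumed here to be a 64-bit signed integer), and the helper names are made up for illustration. It shows a count computed in 64 bits and a checked conversion back to int, which is roughly the kind of explicit handling that quiets an implicit-narrowing warning such as -Wshorten-64-to-32.

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for MPI_Count: assumed to be a 64-bit signed type. */
typedef int64_t big_count_t;

/* Returns 1 if the count survives a round trip through an int, else 0. */
int fits_in_int(big_count_t count)
{
    return count >= INT_MIN && count <= INT_MAX;
}

/*
 * Checked narrowing. On success stores the value in *out and returns 0.
 * Returns -1 (leaving *out untouched) if the count does not fit in an int.
 * Writing "int n = count;" instead would compile, silently truncate, and
 * is the pattern an implicit 64-to-32 conversion warning points at.
 */
int checked_narrow(big_count_t count, int *out)
{
    if (!fits_in_int(count))
        return -1;
    *out = (int) count;   /* explicit cast, guarded by the range check */
    return 0;
}

/*
 * Total elements in nblocks blocks of blocklen elements each.
 * Both operands are widened before multiplying, so the product is
 * computed in 64 bits rather than overflowing in int arithmetic.
 */
big_count_t total_elements(int nblocks, int blocklen)
{
    return (big_count_t) nblocks * (big_count_t) blocklen;
}
```

For example, total_elements(65536, 65536) is 2^32 elements, which no int parameter can carry, so checked_narrow reports failure for it instead of quietly wrapping. The real fix in a library is usually to keep the wide type all the way down to a _c routine rather than narrowing at all; the checked conversion is only for the places where an int-based interface cannot be avoided.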
The result is a pretty large and invasive patch set (https://github.com/pmodels/mpich/pull/6928), but now you (or more likely your I/O library, like Parallel-NetCDF or HDF5) can pass 2 billion or more items to ROMIO routines. I think we are going to back-port this to the MPICH-4.1 series as well as the next MPICH-4.2 maintenance update.