Last week I did some comparative runs of the “testpio” kernel to find out why pnetcdf I/O was slower than raw binary MPI-IO. In this scenario, 512 cores write a 51 MB file ten times.
There were some minor differences: the binary (MPI-IO) path describes its data with a block-indexed datatype, while pnetcdf uses a subarray datatype. Pnetcdf also syncs the file a few more times: it calls MPI_FILE_SYNC when exiting define mode, though I think we will change that soon.
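To make the datatype difference concrete, here is a minimal sketch in plain Python (no MPI required) of how a subarray description and a block-indexed description can select exactly the same elements. The 8x8 global array, the 2x4 local block, and the helper names are all made up for illustration; I am not reproducing the actual testpio decomposition here. The offsets are in units of elements, which is also how MPI_Type_create_indexed_block interprets its displacements.

```python
from itertools import product


def subarray_offsets(sizes, subsizes, starts):
    """Row-major (C order) element offsets selected by a subarray.

    Mirrors the sizes/subsizes/starts arguments of a subarray datatype.
    """
    offsets = []
    ranges = (range(s, s + n) for s, n in zip(starts, subsizes))
    for idx in product(*ranges):
        off = 0
        for i, dim in zip(idx, sizes):
            off = off * dim + i  # flatten the N-d index
        offsets.append(off)
    return offsets


def indexed_block_offsets(blocklength, displacements):
    """Element offsets selected by equal-length blocks at given displacements."""
    offsets = []
    for d in displacements:
        offsets.extend(range(d, d + blocklength))
    return offsets


def row_displacements(sizes, subsizes, starts):
    """One displacement per contiguous run along the last dimension."""
    return subarray_offsets(sizes, list(subsizes[:-1]) + [1], starts)


# Hypothetical layout: one rank owns a 2x4 block of an 8x8 global array.
sizes, subsizes, starts = [8, 8], [2, 4], [4, 4]

sub = subarray_offsets(sizes, subsizes, starts)
disps = row_displacements(sizes, subsizes, starts)
blk = indexed_block_offsets(subsizes[-1], disps)

print(disps)       # [36, 44]
print(sub == blk)  # True
```

In this toy case the two descriptions are equivalent, which is why I count the datatype choice among the minor differences: the library sees the same set of file offsets either way, only encoded differently.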