
Collective I/O to overlapping regions

September 6, 2018 by Robert J. Latham

It is an error for multiple MPI processes to write to the same or overlapping regions of a file. ROMIO will let you get away with it, but if your processes are writing different data, I can’t tell you what will end up in the file.
What about reads, though?
ROMIO’s two-phase collective buffering algorithm handles overlapping read requests the way you would hope: I/O aggregators read from the file and send the data to the right MPI process. N processes reading a config file, for example, will result in one read followed by a network exchange.
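Here is a minimal sketch of that pattern (the file name and region size are just placeholders): every rank collectively reads the same leading chunk of a shared file.

/* Sketch: every rank reads the same region of a shared file collectively.
 * File name and region size are placeholders, not from the post. */
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Status status;
    const int region_size = 1 << 20;   /* 1 MiB, arbitrary for the example */
    char *buf;

    MPI_Init(&argc, &argv);
    buf = malloc(region_size);

    MPI_File_open(MPI_COMM_WORLD, "config.dat",
                  MPI_MODE_RDONLY, MPI_INFO_NULL, &fh);

    /* Every rank requests bytes [0, region_size).  Because the read is
     * collective, the aggregators can satisfy the overlap with one file
     * system read and then ship the data to the other ranks. */
    MPI_File_read_at_all(fh, 0, buf, region_size, MPI_BYTE, &status);

    MPI_File_close(&fh);
    free(buf);
    MPI_Finalize();
    return 0;
}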
As an aside, ROMIO’s two-phase algorithm is general and so not as good as a “read and broadcast”: if you, the application/library writer, know that every process is going to read the file, this is one spot (maybe the only spot) where I’d encourage you to read the file independently from one process and broadcast it to everyone else.
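A sketch of that read-and-broadcast pattern, again with a made-up file name and the assumption that the file size is known ahead of time:

/* Sketch: one process reads the file independently, MPI_Bcast fans it out.
 * File name and fixed size are assumptions for the example. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    const int file_size = 64 * 1024;   /* assumed known in advance */
    char *buf;
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    buf = malloc(file_size);

    if (rank == 0) {
        /* Only one process touches the file system... */
        FILE *fp = fopen("config.dat", "rb");
        fread(buf, 1, file_size, fp);
        fclose(fp);
    }

    /* ...and everyone else gets the contents over the network. */
    MPI_Bcast(buf, file_size, MPI_BYTE, 0, MPI_COMM_WORLD);

    free(buf);
    MPI_Finalize();
    return 0;
}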
I bet you are excited to go try this out on some code. Maybe you will have every process read the same elements out of an array. Did you get the performance you expected? Probably not. ROMIO tries to be clever. If the requests from processes are interleaved, ROMIO will carry on with two-phase collective I/O. If the requests are not interleaved, ROMIO falls back to independent I/O on the assumption that the expense of the two-phase algorithm will not be worth it.
You can see the check here: https://github.com/pmodels/mpich/blob/master/src/mpi/romio/adio/common/ad_read_coll.c#L149
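If it helps, here is a simplified, self-contained illustration of what “interleaved” means in that decision. This is not ROMIO’s actual code, just the gist of comparing each rank’s start offset against the previous rank’s end offset:

/* Not ROMIO's code: a toy version of the interleaving test.  A request
 * counts as interleaved if it starts before the previous rank's ends. */
#include <stdio.h>

static int count_interleaved(const long long *st, const long long *end, int nprocs)
{
    int interleave_count = 0;
    for (int i = 1; i < nprocs; i++)
        if (st[i] < end[i - 1])
            interleave_count++;
    return interleave_count;
}

int main(void)
{
    /* Four ranks reading disjoint, in-order 1 KiB blocks: not interleaved,
     * so with the "automatic" default ROMIO would go independent. */
    long long st_a[]  = {0, 1024, 2048, 3072};
    long long end_a[] = {1023, 2047, 3071, 4095};

    /* Four ranks reading round-robin 256-byte pieces of the same range:
     * interleaved, so two-phase collective buffering stays on. */
    long long st_b[]  = {0, 256, 512, 768};
    long long end_b[] = {3327, 3583, 3839, 4095};

    printf("disjoint blocks : interleave_count = %d\n",
           count_interleaved(st_a, end_a, 4));
    printf("round-robin     : interleave_count = %d\n",
           count_interleaved(st_b, end_b, 4));
    return 0;
}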
In 2018, two-phase is almost always a good idea — even if the requests are well-formed, collective buffering will map request sizes to underlying file system quirks, reduce the overall client count thanks to I/O aggregators, and probably place those aggregators strategically.
You can force ROMIO to always use collective buffering by setting the hint "romio_cb_read" to "enable". On Blue Gene systems, that is already the default. On other platforms, the default is "automatic", which triggers the check mentioned above.
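Something like the following (file name is a placeholder) passes that hint at open time:

/* Sketch: force collective buffering for reads via the "romio_cb_read"
 * hint.  File name is a placeholder. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    MPI_Info_create(&info);
    /* Skip the interleaving heuristic and always use two-phase reads. */
    MPI_Info_set(info, "romio_cb_read", "enable");

    MPI_File_open(MPI_COMM_WORLD, "config.dat",
                  MPI_MODE_RDONLY, info, &fh);

    /* ... collective reads as usual ... */

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}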
