Quarterly newsletter, April 2021

New presentation materials:

GitHub migration complete:

New software releases:

  • Argobots 1.1
    • Underlying user-level threading package for Mochi
    • includes performance improvements, broader platform support, and new profiling and debugging capabilities (more on that later)
  • Mercury 2.0.1rc3
    • Underlying RPC communication package for Mochi
    • improved logging and several performance optimizations
    • final 2.0.1 release coming soon
  • Mochi-sdskv 0.1.12
    • Key/Value store microservice
    • Bedrock support
    • various packaging (cmake, pkgconfig, and dependency) improvements
  • Bedrock 0.2.1
    • Flexible service composition tool
    • various packaging (cmake, pkgconfig) improvements
  • Sonata 0.6.2
    • Document store microservice
    • various packaging (cmake) improvements

Performance regressions from previous quarterly newsletter resolved:

  • Power9 CPU mutex locking performance regression is resolved in Argobots 1.1
  • OmniPath network performance regression is resolved in Mercury 2.0.1rc3

New debugging/profiling/maintenance features:

  • Margo is now using munit for unit testing
    • Available in origin/main (or mochi-margo@main in Spack)
    • Coverage is limited for now but will be expanded over time
    • We will also be leveraging this framework in additional components over time
  • Recent Argobots updates include multiple (optional) stack guard methods
    • See Argobots documentation or Spack package variants. Notable options:
      • “mprotect”: real-time detection of stack overruns (with some performance overhead; just use this for debugging)
      • “canary”: lightweight deferred stack overrun detection (lower overhead, but a stack overrun is not reported until shutdown)
  • margo_state_dump() function
    • Available in origin/main (or mochi-margo@main in Spack)
    • function that can be called at any time to dump point-in-time state to a text file or stdout for debugging purposes
    • includes Margo JSON configuration, Argobots configuration, current Argobots ES layout, Argobots performance profile, in-flight RPC counts, stack dump for blocked user-level threads, etc. See https://github.com/mochi-hpc/mochi-margo/blob/main/doc/debugging.md for details.
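
The “canary” stack guard option listed above relies on a general technique that can be sketched in a few lines of C. This is a minimal illustration of the idea only, not the Argobots implementation: the names guarded_stack, CANARY_BYTES, CANARY_VALUE, and the helper functions are hypothetical, and the "stack" here is an ordinary heap buffer whose use is simulated with memset. A known byte pattern is written at the low end of the region, and a later check reports whether anything has overwritten it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; none of this is the Argobots API. */
#define CANARY_BYTES 16
#define CANARY_VALUE 0xA5

typedef struct {
    unsigned char *base; /* lowest address of the region */
    size_t size;         /* total bytes, canary included */
} guarded_stack;

/* Allocate a region and fill its lowest bytes with the canary pattern.
 * Stacks grow downward on most platforms, so an overrun reaches the
 * lowest addresses first. Returns 0 on success, -1 on allocation failure. */
static int guarded_stack_init(guarded_stack *s, size_t size)
{
    assert(size > CANARY_BYTES);
    s->base = malloc(size);
    if (s->base == NULL)
        return -1;
    s->size = size;
    memset(s->base, CANARY_VALUE, CANARY_BYTES);
    return 0;
}

/* Simulate a thread consuming `depth` bytes from the top of the region
 * downward. The depth is clamped so the sketch never writes outside
 * the allocation. */
static void simulate_stack_use(guarded_stack *s, size_t depth)
{
    if (depth > s->size)
        depth = s->size;
    memset(s->base + (s->size - depth), 0, depth);
}

/* Deferred check: returns 1 if the canary is intact, 0 if it was
 * overwritten at some earlier point. */
static int guarded_stack_check(const guarded_stack *s)
{
    for (size_t i = 0; i < CANARY_BYTES; i++) {
        if (s->base[i] != CANARY_VALUE)
            return 0;
    }
    return 1;
}

static void guarded_stack_free(guarded_stack *s)
{
    free(s->base);
    s->base = NULL;
    s->size = 0;
}
```

In this sketch, calling guarded_stack_check() after simulate_stack_use() with a depth that stays above the canary returns 1, while a depth that reaches the lowest CANARY_BYTES of the region makes it return 0. The check costs nothing while the thread runs, which is why this style of guard is cheap, but it also means the damage is only noticed when the check finally happens. A page-protection approach such as “mprotect” instead faults at the moment of the bad access.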

The Mochi GitHub migration is complete

All Mochi source code repositories have been migrated to github.com at https://github.com/mochi-hpc/ as of March 22, 2021.

If you are already using Spack to install Mochi components, please update your Mochi repository at your earliest convenience:

spack repo rm mochi
git clone https://github.com/mochi-hpc/mochi-spack-packages.git
spack repo add mochi-spack-packages

The package names have not changed; this will just enable you to retrieve new versions as they are released by updating your cloned copy of the mochi-spack-packages repo.

Mochi BoF at the ECP Annual Meeting

The Mochi team will be hosting a BoF entitled “Using Mochi to build data services: Overview and Updates” at the (virtual) ECP Annual Meeting on Tue. Apr 13, 2021 at 2:30 PM.

If you are an ECP project member attending the meeting, you can find more information about the BoF in the meeting agenda.

We will be providing an overview of Mochi, highlighting new capabilities, and offering sign-ups for one-on-one sessions for anyone who would like more detailed information or help.

Quarterly newsletter, January 2021

Platform notes:

Logistics:

  • All of the ANL-hosted git repositories (https://xgitlab.cels.anl.gov/sds/) will be moving within the next few months. We will communicate when that happens.
    • There are no changes to policy or access (in fact, access will likely change for the better); it’s just that the xgitlab.cels instance that we are using is being decommissioned
  • We are working on landing a Spack PR (https://github.com/spack/spack/pull/20273) that will introduce a “mochi-margo” package, maintained by us, to replace the out-of-date “margo” package
    • Once this is done, we will likely start upstreaming more packages that depend on mochi-margo

Mochi service development news:

  • Work continues on a new component called “Bedrock” that can be used to more easily bootstrap microservice compositions (https://xgitlab.cels.anl.gov/sds/bedrock).
    • Bedrock is already available, and we are in the process of updating existing services to use it.
    • You can think of Bedrock as a general-purpose Mochi daemon that takes a JSON configuration file describing how to spin up embedded microservices
  • We are actively working on performance tuning of “Benvolio”, which you can think of as a runtime I/O delegation service, i.e., one that provides a more generic version of MPI-IO aggregation capabilities (https://xgitlab.cels.anl.gov/sds/benvolio).

Upcoming training events:

  • We plan to host a BoF session at this year’s (virtual) ECP annual meeting in mid-April, with a mechanism for people to sign up for one-on-one sessions for more detailed interaction. (https://ecpannualmeeting.com/)
  • Please let us know what other kinds of outreach/training you are interested in this year.

Mochi at SC 2020

Highlighted events for the Mochi project at this year’s SC conference included the following:

  • Pascal Grosset, Jesus Pulido, and James Ahrens. 2020. “Personalized In Situ Steering for Analysis and Visualization,” in ISAV’20 In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization (ISAV’20). Association for Computing Machinery, New York, NY, USA, 1–6. DOI:https://doi.org/10.1145/3426462.3426463
  • Christopher Kelly, Sungsoo Ha, Kevin Huck, Hubertus Van Dam, Line Pouchard, Gyorgy Matyasfalvi, Li Tang, Nicholas D’Imperio, Wei Xu, Shinjae Yoo, and Kerstin Kleese Van Dam. 2020. “Chimbuko: A Workflow-Level Scalable Performance Trace Analysis Tool,” in ISAV’20 In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization (ISAV’20). Association for Computing Machinery, New York, NY, USA, 15–19. DOI:https://doi.org/10.1145/3426462.3426465
  • P. Carns, K. Harms, B. Settlemyer, B. Atkinson, and R. Ross. 2020. “Keeping It Real: Why HPC Data Services Don’t Achieve I/O Microbenchmark Performance,” in 2020 IEEE/ACM Fifth International Parallel Data Systems Workshop (PDSW), GA, USA, pp. 1–6. DOI: 10.1109/PDSW51947.2020.00006

Mochi at SC 2019

Highlighted events for the Mochi project at this year’s Supercomputing included:

  • Srinivasan Ramesh, Philip Carns, Robert Ross, Shane Snyder, and Allen Malony. “Profiling Composable HPC Data Services”, Work-In-Progress report at the 4th International Parallel Data Systems Workshop (PDSW 2019).
  • Jerome Soumagne et al. “Data Services for High Performance Computing”, Birds of a Feather session.