Mochi team activities at SC24

The Mochi team participated in the following activities in the SC24 program in Atlanta in November 2024:

Quarterly meeting and newsletter, October 2024

Please join us for the next Mochi quarterly meeting on Thursday, October 31, 2024, at 10am CT. Mochi quarterly meetings are a great opportunity to learn about community activities, share best practices, get help with problems, and find out what’s new in Mochi.

Please suggest agenda items on the Mochi slack space or the [email protected] mailing list.


Microsoft Teams meeting
Join on your computer or mobile app
Click here to join the meeting

Mochi updates and agenda items

  • Events at SC24 (November 17-22, Atlanta GA): Look for the Mochi team at the following events at SC24:
  • Platform updates:
    • Most HPE Slingshot-equipped systems (including Aurora@ALCF, Perlmutter@NERSC, and Frontier@OLCF) have now been updated to Libfabric version 1.20.1. We’ve updated our example installation recipes accordingly; see the platform-configurations repo for updates. You must use Mercury version 2.4.0rc3 or higher for compatibility with this update.
    • The same repository now also includes an example of how to install Mochi in an Apptainer image so that it can still leverage the native host networking stack.
    • We (and HPE) are aware of an issue in the Libfabric CXI provider that causes processes using Libfabric (and Mercury and Mochi) to consume 100% CPU at all times when waiting for network events. We expect this to be resolved in a future HPE software release.
  • Software updates:
    • Mercury version 2.4.0 was released in October 2024. It includes a variety of new features, bug fixes, and performance optimizations, and is compatible with the Libfabric 1.20.1 CXI provider on HPE platforms.
    • Margo version 0.18.2 was released in October 2024. It includes a variety of minor bug fixes and enhancements. It also introduces a new polling mode with a configurable “spindown” parameter that controls how long Margo waits before idling after servicing requests. This polling mode is enabled by default and significantly speeds up bursty or sequential workloads.
    • Mofka version 0.3.3 was released in October 2024. Mofka is a distributed event streaming service for HPC systems. Recent releases include a variety of performance optimizations and features, including support for access to Kafka via the Mofka API.
    • Bedrock version 0.15.1 was released in October 2024. Bedrock provides bootstrapping and configuration management capabilities for Mochi services. Recent releases include significant refactoring and API revisions to help simplify service configurations.
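
The spindown idea mentioned in the Margo 0.18.2 note above can be pictured with a small sketch. This is only an illustration under assumptions: the helper names pick_wait_mode and simulate, the millisecond units, and the "poll"/"block" labels are hypothetical and do not reflect Margo's actual implementation or configuration keys. It shows a progress loop that keeps polling for a short window after the last serviced request and only then falls back to an idle, blocking wait.

```python
def pick_wait_mode(now_ms, last_activity_ms, spindown_ms):
    """Return "poll" while inside the spindown window, else "block".

    Hypothetical helper: names and units are illustrative only.
    """
    elapsed = now_ms - last_activity_ms
    return "poll" if elapsed < spindown_ms else "block"


def simulate(event_times_ms, spindown_ms, end_ms, step_ms=1):
    """Walk a fake clock and record the wait mode chosen at each tick."""
    events = set(event_times_ms)
    last_activity = None  # no request has been serviced yet
    modes = []
    for now in range(0, end_ms, step_ms):
        if now in events:
            last_activity = now  # servicing a request restarts the window
        if last_activity is None:
            modes.append("block")
        else:
            modes.append(pick_wait_mode(now, last_activity, spindown_ms))
    return modes


if __name__ == "__main__":
    # Two requests arriving 3 ms apart, with a 5 ms spindown window.
    print(simulate(event_times_ms=[2, 5], spindown_ms=5, end_ms=12))
```

In this toy run the second request arrives while the loop is still polling, so it would be picked up without first waking from a blocking wait. That is one plausible reason a spindown window helps bursty or sequential workloads.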

Quarterly meeting and newsletter, July 2024

Please join us for the next Mochi quarterly meeting on Thursday, July 25, 2024, at 10am CT. Mochi quarterly meetings are a great opportunity to learn about community activities, share best practices, get help with problems, and find out what’s new in Mochi.

Please suggest agenda items on the Mochi slack space or the [email protected] mailing list.


Microsoft Teams meeting
Join on your computer or mobile app
Click here to join the meeting

Mochi updates and agenda items

  • Upcoming publications:
    • Ankush Jain, Chuck Cranor, Qing Zheng, Brad Settlemyer, George Amvrosiadis, Gary Grider, “CARP: A Streaming Partitioner for Range Queries”, in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis 24 (2024).
  • Recent publications:
    • Matthieu Dorier, Philip Carns, Robert Ross, Shane Snyder, Rob Latham, Amal Gueroudji, George Amvrosiadis, Chuck Cranor, Jerome Soumagne, “Extending the Mochi Methodology to Enable Dynamic HPC Data Services,” in Proceedings of the 5th Workshop on Extreme-Scale Storage and Analysis (ESSA 2024), May 2024.
  • Software updates:
    • mochi-margo 0.17.0 was released in May 2024. This is the core RPC management component that ties together Mercury RPCs and Argobots user-level threads. The 0.17.0 release features a new margo_monitor_dump() function that can be used to emit (and optionally reset) integrated Margo monitoring data at runtime rather than waiting for program termination.
    • Mochi-flock 0.3.1 was released in July 2024. It is a new implementation of group membership functionality for Mochi services (replacing mochi-ssg). Recent releases have added a variety of new features as well as Python and C++ APIs.
    • Mochi-bedrock 0.13.1 was released in July 2024. It is a bootstrapping and configuration management component for Mochi services. Recent releases have added mochi-flock support (enabled by default), added support for TOML configurations, and split the module API out into a separate component called mochi-bedrock-module-api.
    • Mofka 0.1.1 was released in July 2024. Mofka is a new top-level Mochi service that implements a distributed event streaming model for HPC services. Mofka is still under active development but includes documentation and a stable API.
    • Mercury 2.4.0rc3 was released in June 2024. Mercury is the underlying RPC and bulk RDMA transfer framework used for communication in Mochi. This preview release includes several new tuning parameters and a new API that enables users to wait on a file descriptor for new events. We plan to leverage this feature in Margo in the future to improve network polling efficiency.
  • Featured topics:
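
Waiting on a file descriptor, as mentioned in the Mercury 2.4.0rc3 note above, is a general operating-system technique for idling without consuming CPU. The sketch below illustrates that technique using only the Python standard library on a POSIX-like system. It does not call Mercury or Margo; the pipe is a stand-in for a network event source, and the helper name wait_for_event is hypothetical.

```python
import os
import selectors


def wait_for_event(fd, timeout_s):
    """Block until fd is readable or the timeout expires; return True if ready."""
    sel = selectors.DefaultSelector()
    try:
        sel.register(fd, selectors.EVENT_READ)
        ready = sel.select(timeout=timeout_s)
        return len(ready) > 0
    finally:
        sel.close()


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()  # stand-in for a network event source
    try:
        print(wait_for_event(read_fd, 0.05))  # nothing is pending yet
        os.write(write_fd, b"x")              # simulate an incoming event
        print(wait_for_event(read_fd, 0.05))  # the descriptor is now readable
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

The process sleeps inside select() until the kernel reports activity, rather than repeatedly checking for events in a tight loop.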

Quarterly Meeting and Newsletter, April 2024

Please join us for the next Mochi quarterly meeting on Thursday, April 25, 2024, at 10am CT. Mochi quarterly meetings are a great opportunity to learn about community activities, share best practices, get help with problems, and find out what’s new in Mochi.

Please suggest agenda items on the Mochi slack space or the [email protected] mailing list.


Microsoft Teams meeting
Join on your computer or mobile app
Click here to join the meeting
Or call in (audio only)
+1 630-556-7958,,254649841#

Mochi updates and agenda items

  • Upcoming publications:
    • Matthieu Dorier, Philip Carns, Robert Ross, Shane Snyder, Rob Latham, Amal Gueroudji, George Amvrosiadis, Chuck Cranor, Jerome Soumagne, “Extending the Mochi Methodology to Enable Dynamic HPC Data Services,” to appear at the 5th Workshop on Extreme-Scale Storage and Analysis (ESSA 2024) in May, 2024.
  • Recent publications and presentations:
    • P. Carns, M. Dorier, R. Latham, R. B. Ross, S. Snyder and J. Soumagne, “Mochi: A Case Study in Translational Computer Science for High-Performance Computing Data Management,” in Computing in Science & Engineering, vol. 25, no. 4, pp. 35-41, July-Aug. 2023, doi: 10.1109/MCSE.2023.3326436.
    • Philip Carns. “Harnessing Programmable Devices in the Data Path with Composable Services”. Short presentation at the Joint Laboratory for Extreme-Scale Computing workshop (JLESC16), April 17, 2024. PDF
    • Philip Carns. “HPC Storage: Adapting to Change”. Keynote presentation at the 3rd Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads (REX-IO), October 31, 2023. PDF
  • Software updates:
    • Mochi-abt-io 0.7.0 was released in February 2024. This package provides an Argobots-aware abstraction of common POSIX I/O functions to enable efficient, highly-concurrent I/O in Mochi services. The 0.7.0 release features support for the high-performance liburing Linux I/O interface. Documentation on how to use this feature is available on the Mochi readthedocs page.
    • Mochi-flock 0.1.0 was released in April 2024. It is a new component meant to replace SSG for group membership in Mochi services. The initial release includes Bedrock integration and support for static groups.
    • Mochi-margo 0.16.0 was released in April 2024. Margo is the core Mochi component that combines Mercury RPCs and Argobots lightweight threads into a coherent data service programming model. The 0.16.0 release features improved timer support with a particular emphasis on more robust handling of cancelled timers.
    • Argobots 1.2 was released in March 2024. Argobots is the foundational user-level threading package used by Mochi. The 1.2 release features a large collection of bug fixes, performance enhancements, and new elasticity features. Mochi has relied on a stable 1.2 release candidate as the preferred version of Argobots prior to this release; most users should not notice any changes.

Quarterly Newsletter, January 2024

The Mochi quarterly meeting for Thursday, January 25, 2024 has been cancelled. Please reach out to us on the mailing list or Slack space if you have anything that you would like to discuss with the Mochi team. Otherwise we hope to see you at our next quarterly meeting on April 25.

Mochi quarterly meetings are a great opportunity to learn about community activities, share best practices, get help with problems, and find out what’s new in Mochi.

Please suggest agenda items on the Mochi slack space or the [email protected] mailing list.

Mochi updates

  • Software updates
    • Mofka version 0.0.2 was released on January 19, 2024. Mofka is a distributed event streaming service with semantics and data models similar to that of Kafka, but geared towards scientific computing on HPC platforms. It is still in an early alpha stage but we invite questions, feedback, and comments. We will cover Mofka in more detail in a future meeting.
    • Mercury version 2.3.1 was released in October 2023. Mercury is the RPC framework used by Mochi for all communication and data transfer. This point release includes a collection of bug fixes and optimizations as well as support for more Slingshot VNI configurations.
  • Platform updates
    • The Mochi team is now using the ALCF Gitlab CI infrastructure to execute nightly performance regression tests on the Polaris system at the ALCF. We hope to expand this testing in the future. For now its primary objective is to monitor Slingshot network performance across not only Mochi updates but also HPE system software updates.

Quarterly meeting and newsletter, October 2023

Please join us for the next Mochi quarterly meeting on Thursday, October 26, 2023, at 10am CT. Mochi quarterly meetings are a great opportunity to learn about community activities, share best practices, get help with problems, and find out what’s new in Mochi.

Please suggest agenda items on the Mochi slack space or the [email protected] mailing list.


Microsoft Teams meeting
Join on your computer or mobile app
Click here to join the meeting
Or call in (audio only)
+1 630-556-7958,,254649841#

Mochi updates and agenda items

  • Upcoming Publications and Presentations
    • “Mochi: A Case Study in Translational Computer Science for High-Performance Computing Data Management” (under preparation for an upcoming issue of IEEE Computing in Science and Engineering)
    • If you are attending IEEE Cluster 2023 in Santa Fe, please consider stopping by the REX-IO workshop on Tuesday October 31 for the keynote presentation “Anticipating and Adapting to Change in HPC Storage” by Phil Carns.
  • New Mochi Microservices
    • Matthieu Dorier will present an overview of Warabi, a new blob storage microservice. Warabi has similarities to Bake, but has been designed from the ground up with a cleaner, more comprehensive API and seamless integration with the Bedrock ecosystem.
  • HPE Slingshot Status Update
    • Communicating on a Slingshot network requires access to a Virtual Network Interface, or VNI, to authorize communication across processes. You may need to take additional steps to configure the VNI depending on your use case.
      • Communicating across processes that were launched together (e.g. in the same srun or mpiexec invocation):
        • Mercury and thus Mochi will use the same VNI allocated for use by MPI with no additional configuration needed.
        • You may need to use a “--single-node-vni” argument to mpiexec or a “--network=single_node_vni” argument to srun, depending on your platform, to make sure that a VNI is allocated even if the launcher believes that all processes will be executing on the same node.
      • Communicating across independently-launched processes within a job:
        • On the Aurora or Sunspot systems at ANL, no additional configuration is needed.
        • On HPE/SLURM-based systems (i.e. Frontier and Perlmutter) additional configuration is needed, because these systems use a unique VNI for each job step by default. You can instruct Mercury to use a job-level VNI instead by passing a special value of “0:0” in the “auth_key” field of the Mercury JSON configuration in Mochi. This feature is already available in mercury@master and will also be available in the next point release. You must also enable the job-level VNI with the --network=job_vni option to the sbatch command or as a directive at the top of your job script.
      • Communicating across jobs:
        • We are still working with HPE on a general solution to enable communication across jobs.
  • General platform updates:
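
For the job-level VNI case described in the Slingshot status update above, the configuration fragment might be generated as follows. This is a sketch under assumptions: it places "auth_key" inside a "mercury" section of a Margo-style JSON document, which is one reading of "the Mercury JSON configuration in Mochi", so please check the Margo documentation for the exact layout in your version. The helper name job_vni_config is hypothetical; only the "auth_key" field and the "0:0" value come from the text above.

```python
import json


def job_vni_config():
    """Build a JSON config fragment requesting the job-level VNI.

    Assumption: "auth_key" lives under a "mercury" section.
    The special value "0:0" is taken from the newsletter text.
    """
    return {"mercury": {"auth_key": "0:0"}}


if __name__ == "__main__":
    # Print the fragment so it can be merged into a service configuration.
    print(json.dumps(job_vni_config(), indent=2))
```

As noted above, the configuration alone is not sufficient on these systems: the job itself must also request the job-level VNI through the --network=job_vni option to sbatch.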

Quarterly meeting and newsletter, July 2023

Please join us for the next Mochi quarterly meeting on Thursday, July 27, 2023, at 10am CT. Mochi quarterly meetings are a great opportunity to learn about community activities, share best practices, get help with problems, and find out what’s new in Mochi.

Please suggest agenda items on the Mochi slack space or the [email protected] mailing list.


Microsoft Teams meeting
Join on your computer or mobile app
Click here to join the meeting
Or call in (audio only)
+1 630-556-7958,,254649841#

Mochi updates and agenda items

  • Recent presentations
    • The Mochi team presented two seminars in June 2023 as part of the Mathematics and Computer Science Division’s CS seminar series. These seminars provide an overview of the state of Mochi and how it can be used in 2023. The first covers high-level motivation, concepts, and key technologies, while the second describes the Mochi methodology for composable data services and highlights success stories in domain-specific data services (HEPnOS) and elastic in situ visualization (Colza).
      • Philip Carns. “Mochi Project Overview: the Democratization of Data Services in HPC”, CS seminar series, Argonne National Laboratory Mathematics and Computer Science division, June 13, 2023. ABSTRACT PDF VIDEO
      • Matthieu Dorier. “Mochi in Practice: Data Services for High-Energy Physics and Elastic In Situ Visualization Workflows”, CS seminar series, Argonne National Laboratory Mathematics and Computer Science division, June 20, 2023. ABSTRACT PDF VIDEO
  • Recent tutorials
    • Matthieu Dorier, Philip Carns, and Marc-André Vef presented the following tutorial on May 21 at ISC High Performance 2023:
  • New features
    • Yokan now offers four new families of functions: yk_fetch, yk_doc_fetch, yk_iter and yk_doc_iter (each with variants to access multiple key/value pairs or documents at once). These functions are equivalent to yk_get, yk_doc_load, yk_list_keyvals and yk_doc_list, respectively, but take a callback that is invoked on each key/value pair or document, instead of taking a buffer into which the key/value pair or document is copied. These functions allow for fewer memory copies and simpler code (no need for the caller to manage their own buffer or call other functions to query the size of values/documents first). The _iter functions also provide automatic pipelining and batching.
    • Mercury 2.3.0 is out now, including several notable performance enhancements for libfabric and CXI:
      • new “multi-recv” optimization to improve RPC throughput
      • avoid performance degradation in FI_SOURCE
      • use WAIT_FD for graceful idling on Slingshot (CXI) transports
  • HPE Slingshot status update
    • Mercury support for Slingshot (CXI) is feature complete and performing well with Mercury 2.3.0, but there are some important usability issues to be aware of regarding Virtual Network Interfaces (VNIs). VNIs are a mandatory method for Slingshot network access control between compute nodes.
    • The default job launcher on HPE systems will automatically provision a VNI for MPI. Mercury will inherit and use this same VNI without any additional action on your part.
    • However, this default, launcher-provided VNI is not sufficient for communicating across MPI jobs or among manually-launched processes.
    • We are in communication with HPE about this issue, but they are still working on a general solution. If you encounter problems, please alert your facility or vendor contacts and let us know about your experience!
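
The difference between the buffer-based and callback-based access styles described for Yokan under "New features" above can be illustrated with a toy in-memory store. Everything in this sketch is hypothetical: ToyStore, length, get_into, and fetch are not Yokan functions and do not mirror its C signatures. The sketch only shows why a callback interface can spare the caller a size query and a buffer of its own.

```python
class ToyStore:
    """Hypothetical in-memory key/value store used only for illustration."""

    def __init__(self, items):
        self._items = dict(items)

    def length(self, key):
        """Size query a caller needs before allocating a buffer."""
        return len(self._items[key])

    def get_into(self, key, buf):
        """Buffer style: copy the value into a caller-provided bytearray."""
        value = self._items[key]
        if len(buf) < len(value):
            raise ValueError("buffer too small")
        buf[: len(value)] = value
        return len(value)

    def fetch(self, key, callback):
        """Callback style: hand the callback a view of the stored value."""
        callback(key, memoryview(self._items[key]))


if __name__ == "__main__":
    store = ToyStore({b"run": b"42", b"site": b"ANL"})

    # Buffer style: a size query, a caller-managed buffer, then the copy.
    buf = bytearray(store.length(b"site"))
    store.get_into(b"site", buf)
    print(bytes(buf))

    # Callback style: one call, no caller-side buffer management.
    seen = []
    store.fetch(b"run", lambda k, v: seen.append((k, bytes(v))))
    print(seen)
```

In the callback path the store decides when and how the value is exposed, which is the property that lets a real service avoid intermediate copies.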

Mochi CS seminars at Argonne National Laboratory

The Mochi team presented the following two seminars in June 2023 as part of the Mathematics and Computer Science Division’s CS seminar series:

  • Philip Carns. “Mochi Project Overview: the Democratization of Data Services in HPC”, CS seminar series, Argonne National Laboratory Mathematics and Computer Science division, June 13, 2023. ABSTRACT PDF VIDEO
  • Matthieu Dorier. “Mochi in Practice: Data Services for High-Energy Physics and Elastic In Situ Visualization Workflows”, CS seminar series, Argonne National Laboratory Mathematics and Computer Science division, June 20, 2023. ABSTRACT PDF VIDEO

The first is a high-level presentation of the motivation, concepts, and key technologies that make Mochi possible. The second describes the Mochi methodology for composable data services and highlights success stories in domain-specific data services (HEPnOS) and elastic in situ visualization (Colza).

Together, these presentations give a nice overview of the state of Mochi and how it can be used in 2023. See the links above for more detailed abstracts, slides, and video recordings of the presentations.

Quarterly meeting and newsletter, April 2023

Please join us for the next Mochi quarterly meeting on Thursday, April 27, 2023, at 10am CT. Mochi quarterly meetings are a great opportunity to learn about community activities, share best practices, get help with problems, and find out what’s new in Mochi.

Please suggest agenda items on the Mochi slack space or the [email protected] mailing list.


Microsoft Teams meeting
Join on your computer or mobile app

Click here to join the meeting

Or call in (audio only)
 

Mochi updates and agenda items


Quarterly meeting and newsletter, January 2023

Please join us for the next Mochi quarterly meeting on Thursday, January 26, 2023, at 10am CT. Mochi quarterly meetings are a great opportunity to learn about community activities, share best practices, get help with problems, and find out what’s new in Mochi.

Please suggest agenda items on the Mochi slack space or the [email protected] mailing list.


Microsoft Teams meeting
Join on your computer or mobile app

Click here to join the meeting

Or call in (audio only)
+1 630-556-7958,,254649841#   United States, Big Rock

Phone Conference ID: 254 649 841#


We plan to discuss the following topics at this meeting:

  • Call for lightning presentations:
    • Do you have something that you would like to present at the Mochi quarterly meeting? We would love to hear about interesting services you have built using Mochi technology, performance results, challenges and obstacles, or all of the above! It may be short notice for this meeting, but please let us know if you would like to share a presentation this week or request a slot for a future meeting.
  • Elasticity support in Margo (Matthieu Dorier):
  • Call for feedback on Mochi tutorial topics
    • The Mochi team is planning to introduce new tutorial material this year (venues TBA).
    • Upcoming tutorials will focus on hands-on exercises that use containers and Mochi service templates to get up and running quickly.
    • What suggestions do you have for us on what points we should cover as we develop this new material?
    • Examples of previous Mochi tutorials can be found at https://www.mcs.anl.gov/research/projects/mochi/tutorials/