The following external projects are using Mochi components:
- UnifyFS (LLNL): Distributed burst buffer file system
- https://github.com/LLNL/UnifyCR
- Michael Brim, Adam Moody, Seung-Hwan Lim, Ross Miller, Swen Boehm, Cameron Stanavige, Kathryn Mohror, Sarp Oral, “UnifyFS: A User-level Shared File System for Unified Access to Distributed Local Storage,” 37th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2023), St. Petersburg, FL, May 2023.
- Proactive Data Containers (LBNL): Novel data abstraction for storing science data in an object-oriented manner
- https://github.com/hpc-io/pdc
- Houjun Tang, Suren Byna, Francois Tessier, Teng Wang, Bin Dong, Jingqing Mu, Quincey Koziol, Jerome Soumagne, Venkatram Vishwanath, Jialin Liu, and Richard Warren, “Toward Scalable and Asynchronous Object-centric Data Management for HPC”, 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid) 2018
- GekkoFS (JGU Mainz): Temporary distributed file system for HPC applications
- https://storage.bsc.es/gitlab/hpc/gekkofs
- Vef, Marc-André & Moti, Nafiseh & Süß, Tim & Tocci, Tommaso & Nou, Ramon & Miranda, Alberto & Cortes, Toni & Brinkmann, André. “GekkoFS – A temporary distributed file system for HPC applications”, IEEE Cluster 2018, Belfast
- DAOS (Intel): Distributed Asynchronous Object Storage
- IOF (Intel): POSIX I/O forwarding
- Hermes (IIT, the HDF Group, and UIUC): management of I/O storage tiers
- https://github.com/HDFGroup/hcl
- H. Devarajan, A. Kougkas, K. Bateman, and X. Sun. “HCL: Distributing Parallel Data Structures in Extreme Scales.” In 2020 IEEE International Conference on Cluster Computing (CLUSTER). IEEE, 2020.
- Seer (LANL): lightweight in situ wrapper library adding in situ capabilities to simulations
- https://github.com/lanl/seer
- Pascal Grosset, Jesus Pulido, and James Ahrens. “Personalized In Situ steering for Analysis and Visualization.” In Proceedings of ISAV 2020: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization.
- Chimbuko (BNL): in situ performance analysis for HPC applications
- https://github.com/CODARcode/Chimbuko
- Christopher Kelly, Sungsoo Ha, Kevin Huck, Hubertus Van Dam, Line Pouchard, Gyorgy Matyasfalvi, Li Tang, Nicholas D’Imperio, Wei Xu, Shinjae Yoo, and Kerstin Van Dam. “Chimbuko: A Workflow-Level Scalable Performance Trace Analysis Tool”. In Proceedings of ISAV 2020: In Situ Infrastructures for Enabling Extreme-Scale Analysis and Visualization.
- DataSpaces (Rutgers): shared tuple-space abstraction for use between HPC applications
- https://github.com/rdi2dspaces/dspaces
- Zhang, B., Davis, P.E., Morales, N., Zhang, Z., Teranishi, K., Parashar, M. (2023). Optimizing Data Movement for GPU-Based In-Situ Workflow Using GPUDirect RDMA. In: Cano, J., Dikaiakos, M.D., Papadopoulos, G.A., Pericàs, M., Sakellariou, R. (eds) Euro-Par 2023: Parallel Processing. Euro-Par 2023. Lecture Notes in Computer Science, vol 14100. Springer, Cham.
- CHFS (Tsukuba): ad hoc file system for persistent memory based on consistent hashing
- https://github.com/otatebe/chfs
- Osamu Tatebe, Kazuki Obata, Kohei Hiraga, Hiroki Ohtsuji, “CHFS: Parallel Consistent Hashing File System for Node-local Persistent Memory”, Proceedings of the ACM International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia 2022), 2022.
- SERVIZ (University of Oregon): A Shared In Situ Visualization Service
- https://github.com/srini009/serviz
- S. Ramesh, H. Childs and A. Malony, “SERVIZ: A Shared In Situ Visualization Service,” in 2022 SC22: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Dallas, TX, US, 2022, pp. 277-290.
- HXHIM (LANL): Hexadimensional hashing indexing middleware
- SOMA (University of Oregon): Framework for in situ monitoring and analysis
- https://github.com/soma-monitoring-toolbox
- “SOMA: Observability, monitoring, and in situ analytics for exascale applications.” Concurrency Computat Pract Exper. 2024; 36(19):e8141.
- ProxyStore (University of Chicago): wide-area object reference management for distributed applications
- https://github.com/proxystore/proxystore
- Pauloski, J. Gregory, Valerie Hayot-Sasson, Logan Ward, Nathaniel Hudson, Charlie Sabino, Matt Baughman, Kyle Chard, and Ian Foster. “Accelerating communications in federated applications with transparent object proxies.” In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-15. 2023.
- DStore (Johns Hopkins University and ANL): Distributed deep learning model repository
- Meghana Madhyastha, Robert Underwood, Randal Burns, and Bogdan Nicolae. 2023. DStore: A Lightweight Scalable Learning Model Repository with Fine-Grain Tensor-Level Access. In Proceedings of the 37th ACM International Conference on Supercomputing (ICS ’23). Association for Computing Machinery, New York, NY, USA, 133–143.
- EvoStore (ANL and Johns Hopkins University): Distributed deep learning model repository with metadata and provenance
- Robert Underwood, Meghana Madhyastha, Randal Burns, and Bogdan Nicolae. 2024. EvoStore: Towards Scalable Storage of Evolving Learning Models. In Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing (HPDC ’24). Association for Computing Machinery, New York, NY, USA, 148–159.
- Neomem (Inria/Rennes and ANL): Machine learning data loader for continuous learning with rehearsal buffers
- https://github.com/thomas-bouvier/neomem
- T. Bouvier et al., “Efficient Data-Parallel Continual Learning with Asynchronous Distributed Rehearsal Buffers,” in 2024 IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), Philadelphia, PA, USA, 2024, pp. 245-254.
- ChronoLog (Illinois Institute of Technology): Distributed tiered log-ordered storage system
- https://github.com/grc-iit/ChronoLog
- A. Kougkas, H. Devarajan, K. Bateman, J. Cernuda, N. Rajesh, X.-H. Sun. “ChronoLog: A Distributed Shared Tiered Log Store with Time-based Data Ordering,” Proceedings of the 36th International Conference on Massive Storage Systems and Technology (MSST 2020).
- Copper (Argonne National Laboratory): Cooperative caching service for large scale data loading
- https://github.com/argonne-lcf/copper
- Noah Lewis, Kevin Harms, Kaushik Velusamy, and Huihuo Zheng. “Copper: Cooperative Caching Layer for Scalable Data Loading in Exascale Supercomputers”, Proceedings of the 9th International Parallel Data Systems Workshop (PDSW 2024).
- HVAC (Oak Ridge National Laboratory): a distributed read cache for node-local storage
- https://code.ornl.gov/42z/hvac-high-velocity-ai-cache
- A. Khan et al., “HVAC: Removing I/O Bottleneck for Large-Scale Deep Learning Applications,” 2022 IEEE International Conference on Cluster Computing (CLUSTER), Heidelberg, Germany, 2022, pp. 324-335
- Diaspora (ANL, University of Chicago, SLAC, ORNL, and Johns Hopkins University): Distributed resilient event fabric
- https://diaspora-project.github.io/
- Bogdan Nicolae, Justin M Wozniak, Tekin Bicer, Hai Nguyen, Parth Patel, Haochen Pan, Amal Gueroudji, Maxime Gonthier, Valerie Hayot-Sasson, Eliu Huerta, Kyle Chard, Ryan Chard, Matthieu Dorier, Nageswara SV Rao, Anees Al-Najjar, Alessandra Corsi, Ian Foster, Diaspora: Resilience-Enabling Services for Real-Time Distributed Workflows, 2024 IEEE 20th International Conference on e-Science (e-Science).
- RECUP (SNL, BNL, ANL, and Texas State University): Scalable Metadata and Provenance for Reproducible Hybrid Workflows
- https://sites.google.com/view/recup-reproducibility/
- Nicolae B, Islam TZ, Ross R, Pouchard LC, et al (2023) Building the I (Interoperability) of FAIR for Performance Reproducibility of Large-Scale Composable Workflows in RECUP. 2023 IEEE 19th International Conference on e-Science (e-Science).
In addition, the Mochi project itself has produced the following user-facing data services:
- CARP: dynamic indexing of streaming data
- https://github.com/pdlfs/carp
- Ankush Jain, Chuck Cranor, Qing Zheng, Brad Settlemyer, George Amvrosiadis, Gary Grider, “CARP: A Streaming Partitioner for Range Queries”, in Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis (SC24), 2024.
- DeltaFS: Scalable file system with in situ indexing
- https://github.com/pdlfs/deltafs
- Q. Zheng, C. Cranor, A. Jain, G. Ganger, G. Gibson, G. Amvrosiadis, B. Settlemyer, G. Grider. “Streaming Data Reorganization at Scale with DeltaFS”, In ACM Transactions on Storage, Volume 16, Issue 4, No. 23, September 2020.
- Mofka: distributed event streaming for HPC platforms
- Colza: Elastic in situ visualization service
- https://github.com/mochi-hpc/mochi-colza
- Matthieu Dorier, Zhe Wang, Utkarsh Ayachit, Shane Snyder, Robert Ross, Manish Parashar. “Colza: Enabling Elastic In Situ Visualization for High-performance Computing Simulations.” in Proceedings of the 36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022).
- HEPnOS: High-Energy Physics’s new Object Store
- https://github.com/hepnos/HEPnOS
- Sajid Ali, Steven Calvez, Philip Carns, Matthieu Dorier, Pengfei Ding, James Kowalkowski, Robert Latham, Andrew Norman, Marc Paterno, Robert Ross, Saba Sehrish, Shane Snyder, and Jerome Soumagne, “HEPnOS: a Specialized Data Service for High Energy Physics Analysis,” in Proceedings of the 4th Workshop on Extreme-Scale Storage and Analysis (ESSA 2023).
- FlameStore: Object-based storage system for Keras models used in CANDLE cancer research workflows
- Mobject: In-system object store