Publications

If you use Mochi in scholarly work, we recommend citing one of the papers highlighted in blue below.

Papers:

  • Ankush Jain, Chuck Cranor, Qing Zheng, Brad Settlemyer, George Amvrosiadis, Gary Grider, “CARP: A Streaming Partitioner for Range Queries”, in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC24), 2024.
  • Matthieu Dorier, Philip Carns, Robert Ross, Shane Snyder, Rob Latham, Amal Gueroudji, George Amvrosiadis, Chuck Cranor, Jerome Soumagne, “Extending the Mochi Methodology to Enable Dynamic HPC Data Services,” in Proceedings of the 5th Workshop on Extreme-Scale Storage and Analysis (ESSA 2024), May 2024.
  • P. Carns, M. Dorier, R. Latham, R. B. Ross, S. Snyder and J. Soumagne, “Mochi: A Case Study in Translational Computer Science for High-Performance Computing Data Management,” in Computing in Science & Engineering, vol. 25, no. 4, pp. 35-41, July-Aug. 2023, doi: 10.1109/MCSE.2023.3326436.
  • Sajid Ali, Steven Calvez, Philip Carns, Matthieu Dorier, Pengfei Ding, James Kowalkowski, Robert Latham, Andrew Norman, Marc Paterno, Robert Ross, Saba Sehrish, Shane Snyder, and Jerome Soumagne, “HEPnOS: a Specialized Data Service for High Energy Physics Analysis,” in Proceedings of the 4th Workshop on Extreme-Scale Storage and Analysis (ESSA 2023).
  • Matthieu Dorier, Romain Egele, Prasanna Balaprakash, Jaehoon Koo, Sandeep Madireddy, Srinivasan Ramesh, Allen D. Malony, and Robert Ross, “HPC Storage Service Autotuning Using Variational-Autoencoder-Guided Asynchronous Bayesian Optimization,” 2022 IEEE International Conference on Cluster Computing (CLUSTER), 2022, pp. 381-393, doi: 10.1109/CLUSTER51413.2022.00049.
  • Matthieu Dorier, Zhe Wang, Utkarsh Ayachit, Shane Snyder, Robert Ross, Manish Parashar, “Colza: Enabling Elastic In Situ Visualization for High-performance Computing Simulations,” in Proceedings of the 36th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2022).
  • P. Matri and R. Ross, “Neon: Low-Latency Streaming Pipelines for HPC,” 2021 IEEE 14th International Conference on Cloud Computing (CLOUD), 2021, pp. 698-707, doi: 10.1109/CLOUD53861.2021.00089.
  • Bradley Settlemyer, George Amvrosiadis, Philip Carns, and Robert Ross. It’s time to talk about HPC storage: Perspectives on the past and future. Computing in Science & Engineering, 23(6):63–68, 2021.
  • Z. Wang, P. Subedi, M. Dorier, P. E. Davis and M. Parashar, “Adaptive Placement of Data Analysis Tasks For Staging Based In-Situ Processing,” 2021 IEEE 28th International Conference on High Performance Computing, Data, and Analytics (HiPC), 2021, pp. 242-251, doi: 10.1109/HiPC53243.2021.00038.
  • Z. Wang, M. Dorier, P. Subedi, P. E. Davis and M. Parashar, “An Adaptive Elasticity Policy For Staging Based In-Situ Processing,” 2021 IEEE Workshop on Workflows in Support of Large-Scale Science (WORKS), 2021, pp. 33-41, doi: 10.1109/WORKS54523.2021.00010.
  • Srinivasan Ramesh, Robert B Ross, Matthieu Dorier, Allen D Malony, Philip Carns, and Kevin Huck. SYMBIOMON: A High Performance, Composable Monitoring Service. In 28th IEEE International Conference on High Performance Computing, Data, & Analytics (HiPC). IEEE, 2021.
  • Srinivasan Ramesh, Allen D Malony, Philip Carns, Robert B Ross, Matthieu Dorier, Jerome Soumagne, and Shane Snyder. SYMBIOSYS: A methodology for performance analysis of composable HPC data services. In 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pages 35–45. IEEE, 2021.
  • Q. Zheng, C. Cranor, A. Jain, G. Ganger, G. Gibson, G. Amvrosiadis, B. Settlemyer, G. Grider. “Streaming Data Reorganization at Scale with DeltaFS”, in ACM Transactions on Storage, Volume 16, Issue 4, No. 23, September 2020.
  • Philip Carns, Kevin Harms, Bradley W. Settlemyer, Brian Atkinson, and Robert B. Ross.  “Keeping It Real: Why HPC Data Services Don’t Achieve I/O Microbenchmark Performance”, in Proceedings of the 5th International Parallel Data Systems Workshop (PDSW20).  LINK (paper), LINK (slides), LINK (video)
  • Jerome Soumagne, Philip Carns and Robert B. Ross.  “Advancing RPC for Data Services at Exascale”, in IEEE Data Engineering Bulletin.  43, 23-34 (2020). LINK
  • Nathanael Cheriere, Matthieu Dorier, Gabriel Antoniu, Stefan M Wild, Sven Leyffer, Robert Ross. “Pufferscale: Rescaling HPC Data Services for High Energy Physics Applications”, in Proceedings of the 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2020).
  • Robert B. Ross et al. “Mochi: Composing Data Services for High-Performance Computing Environments”,  Journal of Computer Science and Technology. 35, 121–144 (2020). https://doi.org/10.1007/s11390-020-9802-0  LINK
  • Q. Zheng, C. D. Cranor, A. Jain, G. R. Ganger, G. A. Gibson, G. Amvrosiadis, B. W. Settlemyer, G. A. Grider. “Compact Filters for Fast Online Data Partitioning.” In Proceedings of 2019 IEEE International Conference on Cluster Computing (CLUSTER), September 2019.
  • Srinivasan Ramesh, Philip Carns, Robert Ross, Shane Snyder, and Allen Malony. “Profiling Composable HPC Data Services”, Work-In-Progress report at the 4th International Parallel Data Systems Workshop (PDSW 2019). LINK (paper) LINK (slides)
  • Qing Zheng, Charles D. Cranor, Danhao Guo, Gregory R. Ganger, George Amvrosiadis, Garth A. Gibson, Bradley W. Settlemyer, Gary Grider, Fan Guo. “Scaling Embedded In Situ Indexing with DeltaFS”, in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC18), November 2018. LINK
  • Matthieu Dorier, Philip Carns, Robert Latham, Robert Ross, Shane Snyder, Justin Wozniak, Samuel K. Gutiérrez, Bob Robey, Brad Settlemyer, Galen Shipman, Jerome Soumagne, James Kowalkowski, Marc Paterno, Saba Sehrish. “Methodology for the Rapid Development of Scalable HPC Data Services”, in Proceedings of the 3rd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS ’18). LINK (paper) LINK (slides)
  • Nathanael Cheriere, Matthieu Dorier, Gabriel Antoniu. “Pufferbench: Evaluating and Optimizing Malleability of Distributed Storage”, in Proceedings of the 3rd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS ’18). LINK (paper) LINK (slides)
  • P. Matri, Y. Alforov, Á. Brandon, M.S. Pérez, A. Costan, G. Antoniu, M. Kuhn, P. Carns, T. Ludwig, “Mission possible: Unify HPC and big data stacks towards application-defined blobs at the storage layer”, Future Generation Computer Systems (2018), https://doi.org/10.1016/j.future.2018.07.035
  • M. A. Sevilla, C. Maltzahn, P. Alvaro, R. Nasirigerdeh, B. W. Settlemyer, D. Perez, D. Rich, and G. M. Shipman, “Programmable Caches with a Data Management Language & Policy Engine,” In Proceedings of the 18th International Symposium on Cluster, Cloud and Grid Computing (CCGrid ’18). IEEE/ACM Washington, DC, USA.
  • Sangmin Seo, Abdelhalim Amer, Pavan Balaji, Cyril Bordage, George Bosilca, Alex Brooks, Philip Carns, Adrian Castello, Damien Genet, Thomas Herault, et al. “Argobots: A lightweight low-level threading and tasking framework,” in IEEE Transactions on Parallel and Distributed Systems, vol. 29, no. 3, pp. 512-526, 2018. doi: 10.1109/TPDS.2017.2766062
  • Qing Zheng, George Amvrosiadis, Saurabh Kadekodi, Garth A. Gibson, Charles D. Cranor, Bradley W. Settlemyer, Gary Grider, and Fan Guo. “Software-defined storage for fast trajectory queries using a DeltaFS indexed massive directory,” In Proceedings of the 2nd Joint International Workshop on Parallel Data Storage & Data Intensive Scalable Computing Systems (PDSW-DISCS ’17). ACM, New York, NY, USA, 7-12. DOI: https://doi.org/10.1145/3149393.3149398. LINK (paper) LINK (slides)
  • John Jenkins, Galen Shipman, Jamaludin Mohd-Yusof, Kipton Barros, Philip Carns, and Robert Ross. “A Case Study in Computational Caching Microservices for HPC,” in 2nd Annual Workshop on Emerging Parallel and Distributed Runtime Systems and Middleware (IPDRM 17), 2017.
  • F. J. R. Duro, J. G. Blas, F. Isaila, J. Carretero, J. M. Wozniak, and R. Ross. Experimental evaluation of a flexible I/O architecture for accelerating workflow engines in ultrascale environments. Parallel Computing, 61:52–67, January 2017.
  • F. Isaila, J. Garcia, J. Carretero, R. Ross, and D. Kimpe. Making the case for reforming the I/O software stack of extreme-scale systems. Advances in Engineering Software, 111:26–31, 2017.
  • P. Carns, J. Jenkins, S. Seo, S. Snyder, R. B. Ross, C. D. Cranor, S. Atchley, and T. Hoefler, “Enabling NVM for data-intensive scientific services,” in 4th Workshop on Interactions of NVM/Flash with Operating Systems and Workloads (INFLOW 16), 2016. LINK
  • T. Hoefler, R. B. Ross, and T. Roscoe. Distributing the Data Plane for Remote Storage Access. In Proceedings of the 15th Workshop on Hot Topics in Operating Systems, HotOS’15. USENIX Association, May 2015.

Presentations:

  • Philip Carns. “Harnessing Programmable Devices in the Data Path with Composable Services”. Short presentation at the Joint Laboratory for Extreme-Scale Computing workshop (JLESC16), April 17, 2024. PDF
  • Philip Carns. “HPC Storage: Adapting to Change”. Keynote presentation at the 3rd Workshop on Re-envisioning Extreme-Scale I/O for Emerging Hybrid HPC Workloads (REX-IO), October 31, 2023. PDF
  • Matthieu Dorier. “Mochi in Practice: Data Services for High-Energy Physics and Elastic In Situ Visualization Workflows”, CS seminar series, Argonne National Laboratory Mathematics and Computer Science division, June 20, 2023. ABSTRACT PDF VIDEO
  • Philip Carns. “Mochi Project Overview: the Democratization of Data Services in HPC”, CS seminar series, Argonne National Laboratory Mathematics and Computer Science division, June 13, 2023. ABSTRACT PDF VIDEO
  • Robert Ross, Philip Carns, Matthieu Dorier, and Jerome Soumagne “Using Mochi to build data services: Overview and Updates (BoF)”, Presented at the 2021 ECP Annual Meeting, April 13, 2021. LINK
  • P. Carns. “BYOFS: The Opportunities and Dangers of Specialization in the Age of Exascale Data Storage”, SOS23 Workshop, March 2019. LINK
  • R. B. Ross. “Versatile Data Services for Computational Science”, Workshop on Clusters, Clouds, and Data for Scientific Computing (CCDSC), September 2018. LINK
  • G. Amvrosiadis. “How I Learned to Stop Worrying and Love the Exascale”, Microsoft Research Lab, Redmond, WA. June 2018. LINK
  • P. Carns. “Understanding and Tuning HPC I/O: How hard can it be?”, keynote presentation at the 4th annual HPC I/O in the Data Center Workshop (HPC-IODC) and Workshop on Performance and Scalability of Storage Systems (WOPSSS), June 2018. LINK
  • M. Dorier. “Composing HPC Micro-Services to Build Application-Tailored Distributed Object Stores”, presented at the inaugural SIG-IO-UK workshop, June 2018.  LINK
  • P. Carns. “Incorporating NVM into Data-Intensive Scientific Computing”, presented at the 34th International Conference on Massive Storage Systems and Technology (MSST 2018), May 2018.  LINK
  • S. Snyder. “Fault Detection and Group Membership in HPC Data Services”, presentation at the 7th Joint Laboratory on Extreme Scale Computing (JLESC) workshop, July 2017. LINK
  • R. B. Ross. “A Renaissance for Data Management in HPC?”, keynote presentation at the 29th International Conference on Scientific and Statistical Database Management, June 2017. LINK
  • P. Carns. “Building blocks for user-level HPC storage systems”, presentation at Dagstuhl Seminar 17202: Challenges and Opportunities of User-Level File Systems for HPC, May 2017. LINK
  • R. B. Ross. “To FS or not to FS…”, presentation at Dagstuhl Seminar 17202: Challenges and Opportunities of User-Level File Systems for HPC, May 2017. LINK
  • P. Carns. “Mochi: composable lightweight data services for HPC”, presentation at the 6th Joint Laboratory on Extreme Scale Computing (JLESC) workshop, November 2016.  LINK
  • R. B. Ross. “From file systems to services: Changing the data management model in HPC.” Presented at the Workshop on Clusters, Clouds, and Data for Scientific Computing, October 2016. LINK
  • R. B. Ross. “From file systems to services: Changing the data management model in HPC.” Presented at the Joint Laboratory for Extreme-Scale Computing Workshop, Lyon, France, June 2016. LINK
  • R. B. Ross. “From file systems to services: Changing the data management model in HPC.” Presented at the Salishan Conference on High-Speed Computing, 2016. LINK

Press:

  • Jo Napolitano. “Argonne’s new menu of data storage software helps scientists realize findings earlier.” Argonne National Laboratory press release, June 1, 2020. LINK
  • George Amvrosiadis. “Building the Software Infrastructure of the Future.” Carnegie Mellon University College of Engineering YouTube Channel, February 27, 2020. LINK
  • Sally Johnson. “Scaling up metadata.” Deixis Magazine, May 2020. LINK
  • Nancy Ambrosiano. “Handling trillions of supercomputer files just got simpler.” Los Alamos National Laboratory press release, March 14, 2019. LINK
  • Bradley Wade Settlemyer. “Using 1 trillion files helps scientist find a needle in a haystack.” Albuquerque Journal, June 22, 2018. LINK
  • Sarah Scoles. “This Bomb-Simulating US Supercomputer Broke a World Record.” WIRED Magazine, July 23, 2018. LINK
  • Dan Carroll. “One framework to rule them all.” Carnegie Mellon University College of Engineering News, November 11, 2018. LINK