Darshan 2.2.0 release

Darshan 2.2.0 is now available for download. The biggest change is that this release splits the code into separate components for instrumentation and analysis. There are also several improvements in documentation and portability, along with an assortment of bug fixes.
Changelog:

  • split darshan into separate packages:
    • darshan-runtime: for runtime instrumentation
    • darshan-util: for processing darshan log files
  • changed default output file name for darshan-job-summary.pl to be based on input file name rather than summary.pdf
  • reorganized init and finalize routines so that they can be linked separately (to allow for easier integration with other instrumentation tools)
  • add -cc, -cxx, -f77, -f90, and -fc support to compiler scripts generated by the darshan-gen-*.pl scripts
  • bug fixes:
    • potential MAX_BYTE overflow on 32-bit systems
    • incorrect pread and pwrite offset tracking
    • corrections to darshan-job-summary variance table
    • better runtime error handling if bzip or gnuplot tools are insufficient
    • improvements to time range in darshan-job-summary graphs
  • documentation:
    • improved documentation for both the darshan-runtime and darshan-util portions of Darshan can be found in the respective doc/ subdirectory for each
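
As a rough illustration of the new darshan-job-summary.pl naming behavior, the sketch below derives a PDF name from a log file name. The exact rule used by the script is not spelled out in this changelog, so the suffix list and the helper name `summary_output_name` are assumptions made for illustration only.

```python
import os

# Hypothetical list of log suffixes to strip; the real script may differ.
LOG_SUFFIXES = (".darshan.gz", ".darshan.bz2", ".darshan")


def summary_output_name(log_path):
    """Return a PDF file name derived from the input log's base name."""
    base = os.path.basename(log_path)
    for suffix in LOG_SUFFIXES:
        if base.endswith(suffix):
            # Drop the recognized log suffix before appending ".pdf".
            base = base[: -len(suffix)]
            break
    return base + ".pdf"
```

With this sketch, two different input logs produce two different PDF names instead of both overwriting summary.pdf.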

Updated documentation

The latest Darshan documentation can be found at the documentation link above.  We’ve made several improvements to the documentation, including “recipes” to help get started on various systems including Blue Gene, Cray, and Linux clusters using MPICH, OpenMPI, or Intel MPI.

Darshan 2.1.2 Release

Darshan 2.1.2 is a minor bug fix release to improve error handling in cases where Darshan is unable to write a log file.
Changelog:

  • improved error handling when writing log files.  If a write fails on any process then the log file will be deleted and a warning will be printed to stderr.
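
The behavior described above can be sketched in a few lines. This is not Darshan's implementation (which is written in C and runs across MPI processes); the function name `finalize_log` and the per-rank status list are hypothetical stand-ins for a collective check across processes.

```python
import os
import sys


def finalize_log(log_path, write_ok_per_rank):
    """Keep the log only if every rank reported a successful write.

    write_ok_per_rank stands in for a collective reduction (for example a
    logical AND across MPI ranks); it is a simplification for this sketch.
    """
    if all(write_ok_per_rank):
        return True
    # At least one rank failed: discard the partial log and warn on stderr.
    if os.path.exists(log_path):
        os.remove(log_path)
    print(f"warning: unable to write log file {log_path}", file=sys.stderr)
    return False
```

Deleting the file on any failure means a log that exists can be assumed complete, at the cost of losing data from the ranks that did succeed.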

Darshan Data Repository now online

We are pleased to announce the public release of the Darshan Data Repository.  The Darshan Data Repository is a collection of anonymized I/O characterization data captured from production systems. The first data set to be made available covers three months of activity (as recorded by Darshan) from the Intrepid Blue Gene/P system at the Argonne Leadership Computing Facility.  We hope to add more data in the future.  See the Darshan Publications page for examples of analysis that can be performed with this data.

Darshan 2.1.1 Release

This release includes performance and bug fixes.  It also includes a new utility to convert Darshan log files, while also optionally anonymizing them or re-compressing them in bzip2 format.
Changelog:

  • new darshan-convert command line utility for converting existing log files, with optional anonymization and optional bzip2 compression
  • bzip2 support in command line utilities (but not in the darshan library itself)
  • updated log file format that allows for string key/value pairs to be stored in the header
  • added ability to set MPI-IO hints when writing the darshan log
    • at configure time: --with-log-hints
    • at run time: DARSHAN_LOGHINTS environment variable
  • bug fix contributed by Sandra Schröder: use case-insensitive search for MPI symbols in Fortran wrapper script
  • performance bug fix: remove unnecessary call to MPI_File_set_size when writing the log
  • added --with-logpath-by-env configure option to allow an absolute log path to be specified via environment variable
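
To make the MPI-IO hints option more concrete, here is a small sketch that splits a hint string into key/value pairs. It assumes the hints are written as semicolon-separated `key=value` entries; consult the darshan-runtime documentation for the authoritative format. The helper name `parse_log_hints` is hypothetical, and the hint names in the usage note are ordinary ROMIO hints chosen only as examples.

```python
def parse_log_hints(hints):
    """Split 'key=value;key=value' into a dictionary, skipping empty entries."""
    parsed = {}
    for entry in hints.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        key, sep, value = entry.partition("=")
        if not sep or not key.strip():
            # An entry without '=' or without a key cannot be a valid hint.
            raise ValueError(f"malformed hint: {entry!r}")
        parsed[key.strip()] = value.strip()
    return parsed
```

For example, `parse_log_hints("romio_no_indep_rw=true;cb_nodes=4")` yields a dictionary with the two hints as string values, which is the form in which they would be handed to an MPI info object.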

Best Paper Award, MSST 2011

A paper featuring Darshan (“Understanding and Improving Computational Science Storage Access through Continuous Characterization”) was awarded Best Paper at the 27th IEEE Symposium on Massive Storage Systems and Technologies (MSST 2011). The paper outlines a methodology for characterizing a large-scale production workload and presents a two-month study of I/O activity on the Intrepid Blue Gene/P system at Argonne National Laboratory.

Darshan 2.1.0 Release

This release primarily enhances portability and adds the option to use LD_PRELOAD for instrumentation rather than link time wrappers. This release does not add any new instrumentation or change the log file format.
Changelog:

  • additional environment variables to control log location, jobid and alignment parameters
  • additional configure tests to improve portability
  • bug fixes for darshan-parser --perf calculations
  • support for MPI 1.x
  • support for OpenMPI
  • support for PGI and Intel compilers
  • new libdarshan.so dynamic library for use with LD_PRELOAD
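
A minimal sketch of the LD_PRELOAD approach follows. It only builds the environment for a child process; the helper name `preload_env` and the library path in the usage note are hypothetical, and the right way to pass environment variables to an MPI job depends on the launcher in use.

```python
import os


def preload_env(lib_path, base_env=None):
    """Return a copy of the environment with lib_path prepended to LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # Preserve any library that was already being preloaded.
    env["LD_PRELOAD"] = f"{lib_path}:{existing}" if existing else lib_path
    return env
```

A dynamically linked program could then be started with something like `subprocess.run(["./my_app"], env=preload_env("/path/to/libdarshan.so"))`, with no relinking of the application.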

Darshan 2.0.2 release

Changelog:

  • added a random identifier to job logs (to avoid collisions from multiple application instances within a single scheduler job)
  • improved installation and library path management for darshan-job-summary.pl
  • improved error handling in darshan-job-summary.pl
  • additional derived statistics categories for darshan-parser output:
    • --all   : all sub-options are enabled
    • --base  : darshan log field data [default]
    • --file  : total file counts
    • --perf  : derived perf data
    • --total : aggregated darshan field data
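
The options above can be wrapped in a small helper when driving darshan-parser from a script. The sketch below only assembles the argument list; the helper name `parser_command` is hypothetical, and it assumes the log file is passed as the final argument after the option.

```python
# Option names taken from the list above, without the leading dashes.
PARSER_OPTIONS = ("all", "base", "file", "perf", "total")


def parser_command(log_path, option="base"):
    """Build an argument list for darshan-parser with one output option."""
    if option not in PARSER_OPTIONS:
        raise ValueError(f"unknown darshan-parser option: {option!r}")
    return ["darshan-parser", f"--{option}", log_path]
```

The resulting list can be passed to `subprocess.run` with `capture_output=True` to collect the text output for further processing.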

Darshan 2.0.1 release

Changelog:

  • bug fix to variance/minimum calculations on shared files
  • switch to automatic generation of all MPI compiler scripts using darshan-gen-* tools
  • new run time environment variable: DARSHAN_INTERNAL_TIMING. If set at job execution time, it will cause Darshan to time its own internal data aggregation routines and print the results to stdout at rank 0.
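
The pattern behind DARSHAN_INTERNAL_TIMING (time a routine only when an environment variable is set, and report from rank 0) can be sketched as follows. This is an illustration of the behavior described above rather than Darshan's own code; the function name `run_with_optional_timing` and its parameters are hypothetical.

```python
import os
import time


def run_with_optional_timing(label, func, rank=0, env=None):
    """Call func(); if DARSHAN_INTERNAL_TIMING is set, time it and report at rank 0.

    Returns (result, elapsed_seconds), where elapsed_seconds is None when
    timing is disabled.
    """
    env = os.environ if env is None else env
    if "DARSHAN_INTERNAL_TIMING" not in env:
        return func(), None
    start = time.perf_counter()
    result = func()
    elapsed = time.perf_counter() - start
    if rank == 0:
        # Only one process prints, so the output is not repeated per rank.
        print(f"{label}: {elapsed:.6f} s")
    return result, elapsed
```

Because the check is on the presence of the variable, timing adds no output to jobs that leave it unset.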