SCR: Scalable Checkpoint/Restart for MPI

Published: November 27, 2017

Multilevel checkpointing lets HPC applications take frequent, inexpensive checkpoints alongside less frequent, more resilient ones, which improves efficiency and reduces load on the parallel file system. LLNL researchers developed the Scalable Checkpoint/Restart (SCR) library to bring this approach to large-scale production systems.
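
To make the idea concrete, here is a minimal Python sketch of a two-level checkpoint schedule. It is not SCR's API: the function names, the level labels ("local" for cheap node-local storage, "pfs" for the parallel file system), and the intervals of 10 and 100 steps are all hypothetical, chosen only for illustration.

```python
def checkpoint_level(step, local_every=10, pfs_every=100):
    """Return the checkpoint level to write at this step, or None.

    Hypothetical policy: a cheap node-local checkpoint every
    `local_every` steps, and a more resilient parallel-file-system
    checkpoint every `pfs_every` steps. The resilient level takes
    priority when both intervals coincide.
    """
    if step <= 0:
        return None
    if step % pfs_every == 0:
        return "pfs"
    if step % local_every == 0:
        return "local"
    return None


def count_checkpoints(total_steps, local_every=10, pfs_every=100):
    """Count how many checkpoints of each level a run would write."""
    counts = {"local": 0, "pfs": 0}
    for step in range(1, total_steps + 1):
        level = checkpoint_level(step, local_every, pfs_every)
        if level is not None:
            counts[level] += 1
    return counts


if __name__ == "__main__":
    # Show the split between cheap and resilient checkpoints.
    print(count_checkpoints(1000))
```

Under these assumed intervals, a 1000-step run writes 100 checkpoints in total but only 10 of them touch the parallel file system, which is the load reduction the multilevel approach is after.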

Learn more on the LLNL Computing website. Read the SCR user guide and fork the code on GitHub.
