Thursday, March 11, 2010

Computational Science and Reproducibility - Part 1

When developing numerical computer models with which to do science, an easy trap to fall into is creating an experimental environment that makes it very difficult to reproduce results from an earlier time. This problem is articulated very well by the people over at Ars Technica, but given my everyday experience with this challenge, I thought I would share my perspective here.

Depending on the model's complexity and the rate of development, it can become challenging to reproduce a simulation that was performed two years ago or one that you performed just two hours ago. This may sound surprising, but those who depend on such models understand how rapidly these situations arise. All of these things (and more) can change between one run and a subsequent, seemingly identical run:
  1. The source code changes in a seemingly innocuous way. A change made for efficiency or to improve the appearance of the source can lead to this. If input parameters must be changed by editing the source itself, the results can shift in ways that are difficult to diagnose later.
  2. The input data changes. Many codes require a plethora of input data, and these files are often swapped around to see how different sources create different results. It's easy to lose track of which files gave which results. Even when one is careful, small changes in the input can create large changes in the output. Once, I updated some files that listed satellite positions so that they contained additional significant digits. Such a change sounds harmless, but the one one-hundredth change in the satellite's position produced drastically different results!
  3. The settings have changed. Grid resolution, numerical scheme settings, input parameters, smoothing and blending factors, activation of additional capabilities, and more - all of these items can be changed from run to run. As a code matures, uncountable combinations of settings are possible. Trying to reconstruct the right settings to reproduce a set of results is not an enviable task.
  4. The code was run on a different computer. Changing compilers, CPU architecture, or linking to different external libraries are all things that can change results, sometimes drastically. This problem frequently manifests itself when care is not taken to ensure the proper precision of floating-point real numbers throughout a code. Various compilers handle this differently, and numbers that you thought were double precision can become single precision without warning.
  5. Breaking a long run into several parts can also change results. When simulations take days to weeks (even on supercomputing systems), it is commonplace to develop a system that allows you to save progress and restart the simulation later. Care must be taken to ensure that stopping and restarting yields the same answer as an uninterrupted simulation.
  6. The code gives different results when run in parallel versus serial mode. Large computer simulations can be sped up drastically through parallel computation, that is, breaking the problem into smaller parts and spreading the work over several computers. If not implemented correctly, the solution can change when switching from serial (one computer) to parallel mode. Things get trickier still when results change as the number of CPUs increases, even though the code was always run in parallel.
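One simple safeguard against the input-data problem in item 2 is to record a cryptographic fingerprint of every input file alongside the results it produced. A minimal sketch using Python's standard `hashlib` (the filename and file contents here are purely illustrative):

```python
import hashlib

def file_fingerprint(path):
    """Return the SHA-256 hex digest of a file, so the exact input
    contents can be logged alongside the simulation output."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large input files don't need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

# Illustrative only: a tiny stand-in for a satellite-position input file.
with open("satellite_positions.txt", "w") as f:
    f.write("2010-03-11T00:00:00 7000.123 0.001 51.6\n")

print(file_fingerprint("satellite_positions.txt"))
```

If two runs disagree, comparing the logged digests immediately tells you whether the input files were truly identical.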
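The silent precision demotion described in item 4 is easy to simulate in a few lines. Rounding a double-precision value through IEEE 754 single precision (via Python's standard `struct` module) shows how much accuracy disappears without any warning:

```python
import struct

def to_single(x):
    """Round a Python float (double precision) through IEEE 754
    single precision and back, mimicking an unintended demotion."""
    return struct.unpack("f", struct.pack("f", x))[0]

pi_double = 3.141592653589793
pi_single = to_single(pi_double)

# The difference is the precision silently lost in the demotion.
print(pi_double - pi_single)
```

An error of this size per operation sounds negligible, but accumulated over millions of time steps it can easily dominate the physical signal you are trying to model.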
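The serial-versus-parallel discrepancy in item 6 often comes down to a basic fact: floating-point addition is not associative, so regrouping a sum the way a parallel decomposition would can change the answer. A toy illustration (the two-way split stands in for two processors):

```python
# The same four numbers, summed two ways.
values = [1e16, 1.0, -1e16, 1.0]

# "Serial": one processor sums left to right.
serial = sum(values)

# "Parallel": two processors each sum half, then the partials are combined.
parallel = sum(values[:2]) + sum(values[2:])

# The small terms are absorbed differently in each grouping,
# so the two results differ.
print(serial, parallel)
```

Real codes sum far more terms in orders that depend on the processor count, which is why results can keep shifting as the number of CPUs changes.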
When you lose reproducibility, others can no longer independently verify your work, a key tenet of well-performed science. Furthermore, you cannot return to an experiment for further analysis, and open questions about your conclusions will persist. In the best-case scenario, this leads to inconveniences in research and time lost when you are required to start from scratch. In the worst-case scenario, failure to reproduce results can strip you of your credibility.

So how do you avoid such problems? In my next post, I will outline steps that the space science modeling community frequently takes to overcome the pitfalls listed above.
