A snapshot of the then-current source for the Goddard Institute for Space Studies Global Climate Model ModelE 2.12 was analysed in August 2024. The initial hope was to analyse the HadCM4 code of the UK Met Office, but this does not appear to be publicly available, which limits the ability to critique it. US public sector code is generally available, unlike UK public sector code, although both were paid for by their respective taxpayers.
ModelE is written in Fortran. Although ostensibly Fortran 90, it includes elements familiar from older Fortran, owing to a heritage dating back to Fortran 66.
Searching for both "goto" and "go to" produces over 1500 hits. Admittedly, a number of these are in commented-out lines of code, and they have not been analysed to see whether they crudely implement loops, breaks from loops, error handling or something else. N.B. In 1968 Dijkstra wrote a letter titled "Go To Statement Considered Harmful".
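The count above could be refined by separating live statements from commented-out ones. The following is a minimal sketch of such a search, not the method used for the figure quoted above. The function names and the sample lines are hypothetical, and the comment detection is a heuristic: it treats a leading "!" or a "C", "c" or "*" in column 1 as a comment, so it will misclassify free-form lines that start in column 1 and will count a goto in a trailing comment as live.

```python
import re

# Matches "goto" or "go to" as a whole keyword, ignoring case.
GOTO = re.compile(r"\bgo\s*to\b", re.IGNORECASE)


def is_comment(line):
    """Heuristic: free-form '!' comment, or fixed-form 'C', 'c', '*' in column 1."""
    return line.lstrip().startswith("!") or line[:1] in ("C", "c", "*")


def count_gotos(lines):
    """Return (live, commented) counts of lines that contain a goto."""
    live = commented = 0
    for line in lines:
        if GOTO.search(line):
            if is_comment(line):
                commented += 1
            else:
                live += 1
    return live, commented


# Hypothetical sample lines, not taken from ModelE.
sample = [
    "      IF (X.LT.0.) GO TO 100",
    "C     GOTO 200",
    "      ! goto 300 was removed",
    "  100 CONTINUE",
    "      call forgotten(x)",
]
print(count_gotos(sample))  # → (1, 2)
```

Classifying what each live goto actually does (loop, break, error exit) would still need a parser or manual inspection.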
Some of the subroutines are large by modern coding standards, which makes untangling the code hard, and thus this analysis is necessarily shallow.
The software implements its own pseudo-random number generator, returning a uniform distribution in the range [0, 1]. This is used in place of the compiler-specific implementation of the language standard (except for computing hashes) and thus allows cross-platform testing.
It also includes a routine for advancing the random number sequence to ensure reproducible output when using parallel processing.
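To illustrate how advancing a generator can give reproducible parallel output, here is a minimal sketch under stated assumptions. It is not ModelE's generator, which was not examined in this detail. It uses a toy linear congruential generator with textbook constants, and the names Lcg, skip, serial and chunked are hypothetical. Each "worker" starts from the same seed, skips ahead to its own offset in the sequence, and so draws exactly the numbers a serial run would have drawn.

```python
class Lcg:
    """Toy linear congruential generator: x -> (A*x + C) mod M (assumed constants)."""

    A = 1664525
    C = 1013904223
    M = 2**32

    def __init__(self, seed):
        self.state = seed % self.M

    def next_uniform(self):
        """Advance one step and return a float in [0, 1)."""
        self.state = (self.A * self.state + self.C) % self.M
        return self.state / self.M

    def skip(self, n):
        """Advance the state by n steps in O(log n) by composing the affine map."""
        a_acc, c_acc = 1, 0          # identity map x -> 1*x + 0
        a, c = self.A, self.C        # current power-of-two multiple of the step map
        while n > 0:
            if n & 1:
                # Compose the accumulated map with the current map.
                a_acc, c_acc = (a * a_acc) % self.M, (a * c_acc + c) % self.M
            # Square the current map: a*(a*x + c) + c.
            a, c = (a * a) % self.M, (a * c + c) % self.M
            n >>= 1
        self.state = (a_acc * self.state + c_acc) % self.M


def serial(seed, total):
    """Draw all numbers from a single generator."""
    gen = Lcg(seed)
    return [gen.next_uniform() for _ in range(total)]


def chunked(seed, total, chunk):
    """Draw the same numbers in independent chunks, as parallel workers would."""
    out = []
    for start in range(0, total, chunk):
        gen = Lcg(seed)
        gen.skip(start)  # jump to this worker's position in the sequence
        out.extend(gen.next_uniform() for _ in range(min(chunk, total - start)))
    return out
```

With this arrangement the result does not depend on how the work is divided, which is the property needed for reproducible output across different processor counts.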
Note that the software comes supplied with a test suite; however, this has not been analysed to see how it copes with testing such large blocks of functionality. In particular, it is hard to imagine how a validation test could be written to show that a model from an academic paper has been implemented without error, given the code structure. The reproducibility of stochastic output by setting a specific seed does, however, allow the implementation of regression tests. Regression tests fail in one of two ways: either when a bug is introduced or when a bug is fixed. That is, they say nothing about the correctness of the code.
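A seeded regression test of this kind can be sketched as follows. The "model" here is a hypothetical stand-in (a seeded random walk), not anything taken from ModelE, and the names toy_model and regression_check are assumptions for illustration. The baseline is simply whatever the code produced when the baseline was recorded, so the check detects change, not error.

```python
import random


def toy_model(seed, steps=5):
    """Hypothetical stand-in for a stochastic model run: a seeded random walk."""
    rng = random.Random(seed)
    value = 0.0
    history = []
    for _ in range(steps):
        value += rng.uniform(-1.0, 1.0)
        history.append(value)
    return history


def regression_check(seed, baseline, tol=0.0):
    """Return True if a fresh run with this seed reproduces the stored baseline."""
    result = toy_model(seed)
    return len(result) == len(baseline) and all(
        abs(a - b) <= tol for a, b in zip(result, baseline)
    )


# The baseline is recorded from the code as it stands, bugs included.
baseline = toy_model(1234)
print(regression_check(1234, baseline))  # same seed, same code: reproduces
print(regression_check(1235, baseline))  # any change in the number stream: differs
```

If toy_model contained a bug when the baseline was recorded, the first check would still pass, and it would start failing only once the bug was fixed.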
The main use of random numbers in the code appears to be fuzzing, and not merely for initial conditions. A uniform distribution is used to perturb tropospheric temperatures by a random value in [-1, 1] times some scale. Note that Gauss's theory of errors results in a normal distribution, not a uniform distribution. This would appear to be an implementation of the stochastic forcing scheme of the European Centre for Medium-Range Weather Forecasts.
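The difference between the two choices of distribution can be sketched as below. This is an illustration only: the function names, the temperatures and the scale are hypothetical and are not taken from ModelE. A uniform perturbation on [-scale, scale] has variance scale²/3, so the normal alternative is given a standard deviation of scale/√3 to match it. The uniform version is strictly bounded by the scale, whereas the normal version has tails that exceed it.

```python
import math
import random


def perturb_uniform(temps, scale, rng):
    """Add scale * U(-1, 1) to each temperature (the scheme described above)."""
    return [t + scale * rng.uniform(-1.0, 1.0) for t in temps]


def perturb_normal(temps, scale, rng):
    """Add N(0, sigma) with sigma chosen to match the uniform scheme's variance."""
    sigma = scale / math.sqrt(3.0)
    return [t + rng.gauss(0.0, sigma) for t in temps]


# Hypothetical inputs: 10000 grid values at 250 K, perturbed with a 0.5 K scale.
rng = random.Random(2024)
temps = [250.0] * 10000
scale = 0.5

uniform_dev = [p - t for p, t in zip(perturb_uniform(temps, scale, rng), temps)]
normal_dev = [p - t for p, t in zip(perturb_normal(temps, scale, rng), temps)]

print(max(abs(d) for d in uniform_dev) <= scale)  # bounded by the scale
print(max(abs(d) for d in normal_dev) > scale)    # tails exceed the scale
```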
In other places random numbers are used to determine which index of an array to access.
Proper stochastic models appear in the cloud modelling, where probabilities are calculated for events, e.g. cloud seeding, and then compared to a random number. Similar tests are made in the radiation modelling.
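The pattern of comparing a calculated probability with a random number can be sketched as a single trial. The function name and the probability value are hypothetical and do not correspond to any routine in ModelE.

```python
import random


def event_occurs(probability, rng):
    """Return True with the given probability by comparing it to a draw from [0, 1)."""
    return rng.random() < probability


# Over many trials the observed frequency approaches the stated probability.
rng = random.Random(7)
trials = 100000
hits = sum(event_occurs(0.3, rng) for _ in range(trials))
frequency = hits / trials
```

Because the draw lies in [0, 1), a probability of 0 never triggers the event and a probability of 1 always does.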
The butterfly effect of Lorenz (1961) is another form of fuzzing, in which the number of significant figures was truncated; this would not lead to a uniform distribution. Given that the length scale for the effect could be of the same order as that at which the continuum hypothesis breaks down, the question has to be asked whether the effect is merely a model artefact rather than a feature of the real world. The Navier-Stokes equations for fluid motion date back to the first half of the nineteenth century (Navier 1822, Stokes 1845). The equations rely on the continuum hypothesis, which was disproved by the discovery of individual molecules, as implied in Einstein's 1905 model of Brownian motion. Contrary to his later claims, God does play dice, since Brownian motion is normally distributed.
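The sensitivity to truncated starting values can be reproduced in a few lines. As an assumption for illustration, the sketch below uses the three-variable system Lorenz published in 1963 with its usual parameters, rather than the larger model he was running in 1961, and integrates it with a fourth-order Runge-Kutta step. The starting values and function names are hypothetical. Two runs are made, one from the full starting values and one from the same values truncated to three decimal places.

```python
import math


def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz (1963) three-variable system."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(k, h):
        return tuple(s + h * ki for s, ki in zip(state, k))

    k1 = lorenz(state)
    k2 = lorenz(shifted(k1, dt / 2))
    k3 = lorenz(shifted(k2, dt / 2))
    k4 = lorenz(shifted(k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def truncate(value, digits):
    """Drop all decimal places beyond the given number of digits."""
    factor = 10 ** digits
    return math.trunc(value * factor) / factor


def separations(full, digits=3, dt=0.01, steps=4000):
    """Distance between the full-precision and truncated-start runs at each step."""
    a = full
    b = tuple(truncate(v, digits) for v in full)
    out = [math.dist(a, b)]
    for _ in range(steps):
        a, b = rk4_step(a, dt), rk4_step(b, dt)
        out.append(math.dist(a, b))
    return out


seps = separations((0.506127, 1.0, 1.0))
print(seps[0], max(seps))
```

The two runs start closer together than one part in a thousand and end up separated by a distance comparable to the size of the attractor itself.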
Its significance to weather and climate modelling is the use of ensemble forecasts, in which the simulation is repeated from the same start point but the addition of random elements causes the individual results to diverge. There are reports that some of these simulations then produce unphysical results, which are removed from the ensemble rather than being taken as evidence that the model may not be fit for purpose.