Models are not data – 2

“Why should I make the data available to you when all you want to do is find something wrong with it?” These words were written by Professor Phil Jones, a climate scientist at the centre of the 2009 “Climategate” scandal, in an email to a scientist who was sceptical of his reconstruction of past climate, and wanted to see how the models he had used to perform it worked.

In normal scientific discourse, sharing of data and methods is routine, allowing experiments to be replicated and repeated – the vital and time-honoured means by which error is detected and erroneous theories rejected. As Charles Darwin observed – “To kill an error is as good a service as, and sometimes better than, the establishing of a new truth or fact”.

Climate science, however, is not ‘normal’, in the sense that it has strong political associations, and makes strident claims to inform public policy, at huge social and economic cost. Jones was part of a coterie of scientists who suddenly found themselves feted and celebrated in a way they cannot have imagined when they began their careers, and with which they were ill-prepared to cope. Jones’ ill-judged riposte exemplifies this.

Furthermore, the availability of fast, relatively cheap computers has spawned a new meta-science in numerical modelling, allowing scientists of modest attainment to produce superficially impressive work, behind which may lie outright scientific error.

Prof Michael Kelly, of Cambridge University’s Department of Engineering, had this to say about computer models:

“I take real exception to having simulation runs described as experiments (without at least the qualification of ‘computer’ experiments). It does a disservice to centuries of real experimentation and allows simulations output to be considered as real data. This last is a very serious matter, as it can lead to the idea that real ‘real data’ might be wrong simply because it disagrees with the models! That is turning centuries of science on its head.”

He went on to say:

“My overall sympathy is with Ernest Rutherford: ‘If your experiment needs statistics, you ought to have done a better experiment.’”


Clearly, when governments are using model-derived forecasts of statistically probable outcomes to drive policy, they need to be particularly careful. They should at least require their advisors to show some evidence that the models they are using have the predictive skill claimed for them, and to make those models available for others to review. It’s not clear to me that our politicians yet understand this. The most obvious form this evidence could take is that of case studies, in which a similar phenomenon (viral contagion) was modelled, and the resulting forecasts found to agree with real-world outcomes. It must be admitted that this may in some cases simply not be possible, but in that case the scientists should be clear in their advice to government that their model has not been experimentally validated, and that its outputs should therefore be accorded lower confidence.
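As a rough illustration of what “evidence of predictive skill” might look like in practice, the sketch below compares a model’s past forecasts with what actually happened, and with a naive baseline forecast. It is only a minimal sketch under stated assumptions: the function names, the choice of mean absolute error, and all of the numbers are hypothetical and are not taken from any report or model discussed here.

```python
from statistics import mean


def mean_absolute_error(forecast, observed):
    """Average absolute gap between forecast and observed values."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("forecast and observed must be non-empty and of equal length")
    return mean(abs(f - o) for f, o in zip(forecast, observed))


def skill_score(model_forecast, naive_forecast, observed):
    """Skill relative to a naive baseline: 1 - MAE(model) / MAE(naive).

    A positive score means the model beat the baseline; zero or a
    negative score means it did no better, or did worse.
    """
    naive_error = mean_absolute_error(naive_forecast, observed)
    if naive_error == 0:
        raise ValueError("naive baseline is perfect; skill score is undefined")
    return 1 - mean_absolute_error(model_forecast, observed) / naive_error


# Hypothetical, made-up numbers purely for illustration.
observed = [100, 120, 150, 170]        # what actually happened
model_forecast = [110, 115, 160, 165]  # what the model predicted beforehand
naive_forecast = [90, 100, 120, 150]   # e.g. "same as the previous period"

print(f"skill score: {skill_score(model_forecast, naive_forecast, observed):.3f}")
```

A check of this kind needs nothing more than the forecasts as published at the time and the outcomes later recorded, which is why it is the sort of evidence a layman could reasonably ask to see.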

Covid-19 has propelled the discipline of epidemiology to prominence even more suddenly than happened with Climate ‘Science’. Much of the panic surrounding Covid-19, in the UK and globally, was kindled by the modelled forecasts of Professor Neil Ferguson of Imperial College, which turned out to be exaggerated by an order of magnitude. When Ferguson was belatedly forced to make his models available for critique, they were found by peers to be riddled with schoolboy errors.

The Federal Government is currently relying heavily on the modelling performed by the Doherty Institute, an element of Melbourne University, and their report makes interesting reading. Like most people who are not themselves practitioners of epidemiology or closely related disciplines, I cannot claim to understand all the science it embodies. However, and remembering Climategate, it is possible for the layman to see:

  • Whether, and to what extent, the modelling methods are described and experimentally justified.
  • Whether the code and formulae of the model have been made available to the public in a form that allows them to be replicated. Importantly, this is NOT the same as ‘peer review’ – a rather vaguely defined process whereby peers check the work for obvious errors of method or reasoning, but not for validity.
  • Whether the DI has cited earlier successful forecasting to inspire confidence in its methods and

In the case of the Doherty report, dealing with these in order:

On the plus side, the report does cite the authorities upon which it relies for its choice of parameter values, for example, on page 32: “Population mixing within and between age groups is configured based on widely accepted social contact matrices published by Prem et al (PLoS Computational Biology 2017)”.

  • Code, etc. available for download – as far as I could see, no attempt has been made to make the model available for replication.
  • Doherty Institute’s record of successful forecasting. As far as I can see, no attempt has been made to provide evidence of past success which would justify confidence in the present modelling.

It may be that the Doherty Institute has published elsewhere material which makes good these deficiencies. However, as detailed in “Models are not data”, I have emailed the Institute asking where any such material may be found, and have received no satisfactory reply. We are therefore left to take their skill and competence on trust.

Considering the immense implications of the advice they are giving our government, this is, to say the least, disappointing.
