Temperature Reconstructions and (Un)certainty around Warming Trends
Arnout Everts (PhD)
Geoscientist and Energy Consultant with 30+ years Experience in Oil and Gas, Geothermal Resources, CCUS, Natural Hydrogen, Techno-Commercial Advisory, Resource Certification and Data Analytics
Introduction
This paper is about surface-temperature reconstructions through recent history. Temperature reconstructions are obviously a key element of climate-change research. The objective of this post is to dive into the details of how regional and eventually “global” temperature curves through time are reconstructed from individual station records. I will demonstrate some of the issues and ambiguities involved in reconstructing “global temperature” and, from there, discuss the ranges of uncertainty that one should associate with such temperature reconstructions.
To avoid any misunderstanding, let me emphasize that I am not a “climate denier”. I concur there is a significant recent warming trend seen in the surface-temperature records of most regions, and it is very plausible that human activity, including greenhouse-gas emissions, has contributed to this. I firmly believe we should be much more conscious and responsible in dealing with the earth’s ecosystem and its natural resources, and I agree we should strive to reduce greenhouse-gas emissions as much as practically possible.
At the same time, as a geologist and exact scientist with 30+ years of experience in building “global” reconstructions from sparse, localized data, I am rather surprised by the firmness with which temperature and climate predictions are presented in climate-change reports and essays, and by the narrow uncertainty bands that are carried. A good illustration is Figure 1, the so-called IPCC “hockey-stick graph” of “global” temperature through time. It carries uncertainty bands that are incredibly narrow: a few tenths of a degree Celsius, even for reconstructions going back more than 1,000 years. This paper will zoom in on the derivation of the “observed” part of reconstructed temperature, ca. 1850 till present, to illustrate what the real magnitude of uncertainty associated with this part of the curve might be, and will then extrapolate this to the pre-1850 “reconstructed” portion of the curve.
Temperature Data and Reconstructions
Temperature reconstructions through geological time are based on temperature measurements and/or indications (temperature proxies). From around 1850 onward, people started recording surface temperatures at regular intervals using a variety of instruments (thermometers). Given the relatively high precision of the instruments deployed, temperature records from 1850 onward are typically referred to as “observed”. Temperature reconstructions further back in time, on the other hand, are generally based on temperature proxies (such as tree rings, pollen distribution, etc.). Given the indirect nature of the proxies involved, a significant measurement and interpretation uncertainty applies to such “inferred temperature” reconstructions.
However, even in the case of the 1850-to-present “global” temperature curves, use of the term “observed” is not very appropriate. There is no “measured” or “observed” global temperature curve because there is no such thing as a temperature gauge for the “globe”, or for a country or region. Temperature measurements are made at weather stations at fixed geographic locations, often near airports, harbors or other hubs of human activity. Those weather stations are not uniformly distributed across the globe and are, moreover, subject to a host of imperfections: missing records, discontinuities due to instrument upgrades, gauge relocations, and so on. Curves of “global” or “regional” temperature over time are therefore reconstructions built from non-uniformly distributed and imperfect sets of station records via a process of data quality control, filtering and editing, assigning different weights to different data (to correct for location bias, etc.) and finally mapping; all far from straightforward. The “Berkeley Earth” temperature model described in the next section is one of the most elaborate and widely used reconstructions.
Berkeley Earth
The Berkeley Earth project (www.berkeleyearth.org) created and published a series of gridded monthly temperature maps through time (1850 until 2018). The gridding is anchored to a dataset of thousands of individual temperature station records across the globe. From these temperature grids, representative temperature curves for a country, region, continent or even the entire globe can easily be extracted via averaging (since each temperature grid node represents an equal area, the average temperature of, say, a country in a given month is simply the average of all grid nodes falling within that country). Berkeley Earth was initiated in 2010 in response to criticism that the IPCC over-relied on complex climate models (as opposed to temperature reconstructions anchored to hard data). It has since become the key benchmarking and “reality checking” tool for the IPCC’s climate models.
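To make this equal-area averaging concrete, here is a minimal sketch in Python (my own illustration, not Berkeley Earth’s code); the array shapes, the country mask and the synthetic values are assumptions for demonstration only.

import numpy as np

def country_mean(temperature_grid: np.ndarray, country_mask: np.ndarray) -> float:
    """Average temperature over one country for a single monthly snapshot.

    temperature_grid : 2-D array of temperatures on an equal-area grid
                       (NaN where no value is defined).
    country_mask     : boolean array of the same shape, True for grid nodes
                       falling inside the country of interest.

    Because every node represents the same surface area, the country mean is
    simply the arithmetic mean of the nodes inside the mask.
    """
    return float(np.nanmean(temperature_grid[country_mask]))

# Illustrative use with synthetic data (a real workflow would load the
# Berkeley Earth gridded product instead):
rng = np.random.default_rng(1)
grid = rng.normal(loc=27.0, scale=0.5, size=(180, 360))  # one monthly snapshot
mask = np.zeros_like(grid, dtype=bool)
mask[80:95, 200:215] = True                               # hypothetical country footprint
print(f"Country-average temperature: {country_mean(grid, mask):.2f} degC")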
In the Berkeley Earth methodology, the temperature data for each snapshot in time are gridded using a linear least-squares estimation algorithm known as Gaussian-process regression or Kriging interpolation. To end up with smooth temperature fields and (presumably) avoid large proportions of unexplained variance (a “nugget effect”, also known as “bulls-eyes” around individual datapoints), Kriging is preceded by a process of rigorous station-data analysis, filtering and editing that is interpretative and often somewhat subjective. It involves:
Whilst the normalized, regional trend introduced in Step 3 is a key element of the Berkeley Earth approach, the publications documenting Berkeley’s approach do not make clear exactly how this trend is derived, what data are used to establish it, and to what extent the trend is adjusted as more data are analyzed. In any case, the Berkeley Earth approach assumes the earth’s temperature field over time consists of a strong regional/global temperature trend with only small local fluctuations around it. Berkeley Earth believes that most deviations from the regional trend seen in the raw, local station data are artifacts like typographical errors, instrumentation changes, station moves, and urban or agricultural development near the station. This is a conceptual assumption that is plausible but cannot be proved or disproved. Figure 2 below shows four randomly selected station examples of the different editing steps described above and documented in full on Berkeley Earth’s website (https://berkeleyearth.lbl.gov:4443/station-list/). From the hundreds of stations where I have gone through the data, my impression is that the edits cover the full spectrum, from entirely reasonable to highly interpretative and ambiguous. For example, at Kuala Lumpur Subang (Malaysia, Figure 2, leftmost set of graphs) the edits result in removal of temporal fluctuation. At Buenos Aires Observatorio (Argentina, Figure 2, graphs second from left) the edits reduce an otherwise rather strong trend of rising temperature. At St Johns (Canada, Figure 2, graphs second from right) the edits introduce a trend of rising temperature in otherwise stable temperature records. At Saig (Oman, Figure 2, rightmost set of graphs) the data edits reverse a trend of declining temperature.
In my view, this argument completely side-steps the interpretative (and hence subjective and uncertain) aspects of the data conditioning and gridding made by Berkeley Earth and others (e.g., NASA, NOAA). Edits made to the raw data are open to alternative interpretation, whilst the interpolation could also have been done differently (e.g., with a different variogram, with a longer or shorter range and/or with a nugget). Different data-conditioning choices and gridding methods could have created rather different temperature maps and hence a different “global” curve, regardless of data weighting. Berkeley Earth does not show or discuss the Kriging variance associated with their maps (a measure of the uncertainty around the interpolation solution away from datapoints). Considering the amount of local variability still remaining in the adjusted data, I reckon the Kriging variance must be considerable and hence that stochastic simulation (rather than single-solution Kriging) using the Berkeley Earth dataset would likely result in a range of equiprobable temperature reconstructions spanning a range of uncertainty much wider than 0.15-0.2 degC. This is besides the fact that the editing steps made prior to Kriging reduce a lot of the variability seen in the original, raw data.
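To illustrate how much these interpolation choices matter, below is a minimal, self-contained sketch (again my own illustration, not the Berkeley Earth code) using scikit-learn’s Gaussian-process regressor on synthetic one-dimensional “station” data. The two kernel configurations stand in for different variogram choices, a long correlation range without a nugget versus a shorter range with a nugget term, and the returned predictive standard deviation plays the role of the Kriging standard deviation.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Synthetic "station" data for one snapshot: 1-D locations and temperature anomalies.
rng = np.random.default_rng(0)
x_stations = rng.uniform(0.0, 100.0, size=25).reshape(-1, 1)
t_stations = 0.02 * x_stations.ravel() + rng.normal(0.0, 0.4, size=25)

x_grid = np.linspace(0.0, 100.0, 200).reshape(-1, 1)

# Two alternative interpolation choices: a long correlation range with no nugget
# (smooth field) versus a shorter range plus a nugget that absorbs local,
# station-scale variability.
kernels = {
    "long range, no nugget": RBF(length_scale=50.0),
    "short range + nugget":  RBF(length_scale=10.0) + WhiteKernel(noise_level=0.15),
}

for label, kernel in kernels.items():
    # optimizer=None keeps the chosen length scales fixed, so the contrast
    # between the two configurations is preserved.
    gp = GaussianProcessRegressor(kernel=kernel, optimizer=None, normalize_y=True)
    gp.fit(x_stations, t_stations)
    mean, std = gp.predict(x_grid, return_std=True)  # std ~ Kriging standard deviation
    print(f"{label:>22s}: grid-mean anomaly {mean.mean():+.2f} degC, "
          f"average predictive std {std.mean():.2f} degC")

In a real two-dimensional or spherical setting the same trade-off applies: a nugget and a shorter range honour local station variability but widen the interpolation uncertainty away from stations, whereas a long, nugget-free range produces the smooth, tightly aligned fields discussed above.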
Data-Driven Regional Temperature Curves
To develop a view on how much local and regional variability exists in the data, and to what extent this variability may be “random” or “correlated” (the latter suggesting consistent fluctuations in temperature through time), I have analyzed the data for a number of countries and compared this with the Berkeley Earth curves. The methods and results of this comparison are discussed in the following sections.
Methods
Since I don’t have the resources and time to review all the stations and curves for all countries, I have more-or-less randomly picked one country per continent/region. The countries I selected are: Argentina (South America), Canada (North America), Ghana (Africa), Malaysia (Southeast Asia), Oman (Middle East) and Poland (Europe).
For each of these countries I downloaded long-term temperature station data with an as-wide-as-possible geographic spread within the country (in most instances this equates to all available long-term stations) from the Berkeley Earth website.
Data-analysis steps are as follows. First, data are annualized by averaging the monthly temperatures. A region-average is then computed from the individual station data. To avoid artifacts where records from individual stations are interrupted (virtually all stations have certain periods of missing data), a “normalization” step is done that involves computing the mean temperature for each station over a period for which all selected stations in the region have data. For Argentina, Canada, Oman and Poland, the normalization period is 1996-2006, for Malaysia it is 2000-2010 and for Ghana it is 2008-2013.
These mean values are subsequently used to normalise the data: each station’s annual temperatures are converted to anomalies by subtracting that station’s mean over the normalization period, which puts all stations on a common baseline and makes interrupted records directly comparable (a sketch of the full workflow is given at the end of this section).
A 10-year running average is then computed to filter out some obvious spikes visible in the raw curves. The Standard Deviation in normalized temperature across the stations is also computed, to reflect variability around the cross-station temperature trend. The 10-year moving-average curves are then compared with the Berkeley Earth curves for the same region and also with curves of different regions (to look for possible cross-region correlated temporal variability).
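To make these steps concrete, the sketch below reconstructs the workflow in Python/pandas as described; the function name, the synthetic station records and the exact normalization (subtracting each station’s mean over the reference period) are my own illustrative assumptions rather than a verbatim copy of the scripts used.

import numpy as np
import pandas as pd

def regional_anomaly(monthly: pd.DataFrame, ref_start: int, ref_end: int) -> pd.DataFrame:
    """Sketch of the analysis steps described above.

    monthly : DataFrame of monthly mean temperatures indexed by a DatetimeIndex,
              one column per station (NaN where records are missing).
    ref_start, ref_end : normalization period (e.g. 1996-2006) over which all
              selected stations have data.
    """
    # 1. Annualize: average the monthly values of each calendar year.
    annual = monthly.resample("YS").mean()
    annual.index = annual.index.year

    # 2. Normalize: subtract each station's own mean over the common reference
    #    period, turning absolute temperatures into per-station anomalies.
    anomaly = annual - annual.loc[ref_start:ref_end].mean()

    # 3. Region average and cross-station spread for each year.
    out = pd.DataFrame({
        "region_mean": anomaly.mean(axis=1),
        "cross_station_std": anomaly.std(axis=1),
    })

    # 4. 10-year running average to suppress obvious year-to-year spikes.
    out["region_mean_10yr"] = out["region_mean"].rolling(window=10, center=True).mean()
    return out

# Illustrative use with synthetic station records (five stations, 1900-2018):
rng = np.random.default_rng(2)
idx = pd.date_range("1900-01-01", "2018-12-01", freq="MS")
stations = pd.DataFrame(
    {f"station_{i}": 26.0 + 0.005 * np.arange(len(idx)) / 12.0
                     + rng.normal(0.0, 0.6, len(idx)) for i in range(5)},
    index=idx,
)
print(regional_anomaly(stations, 1996, 2006).tail())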
Results
Figure 3, Figure 4 and Figure 5 show, for each country, Temperature Anomaly (relative to the cross-station-average over the normalization period) over time, for the analyzed raw stations together with the 10-year running average across all stations. For comparison, the Berkeley-Earth curves (again with a 10-year running average) are also shown. The pink shaded area indicates the P90-P10 uncertainty band around the station-average as computed from the raw, normalized station data.
Clearly, there is considerable temperature variability from station to station. The pattern of temperature variability, temporal and between stations, appears different for different regions. In Argentina and Canada there appears to be a lot of random temperature variability that is not correlated between stations. On the other hand, Poland shows a distinct alternation of slightly colder and warmer periods with a periodicity of 5 to 7 years, noticeable in (and correlated across) virtually all stations, whilst in Malaysia, Ghana and, to a lesser extent, Oman the station temperatures record a longer-term (50-70 years) temperature oscillation that is again correlated across most (if not all) stations and therefore almost certainly genuine. Note that the Berkeley Earth temperature curves do not include any of these variations, which were apparently edited out as part of the pre-gridding data conditioning.
The Standard Deviation in normalized temperature across the stations (a measure of variability around the cross-station temperature trend) ranges from 0.4degC (Malaysia, Poland) to 0.9degC (Canada, Argentina).
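One simple way to separate “random” from “correlated” station variability is to look at the cross-station spread together with pairwise correlations between the station anomaly curves. The sketch below is my own diagnostic illustration (not part of the original analysis) and assumes the normalized annual anomalies are already available as a pandas DataFrame with one column per station.

import numpy as np
import pandas as pd

def variability_diagnostics(anomaly: pd.DataFrame) -> dict:
    """Diagnostics for the normalized station anomaly curves of one region.

    anomaly : DataFrame of annual anomalies, one column per station.

    Returns the average cross-station standard deviation (spread around the
    regional trend) and the mean pairwise correlation between stations; a high
    mean correlation indicates fluctuations shared by many stations, i.e. a
    likely genuine regional signal rather than station-level noise.
    """
    cross_station_std = float(anomaly.std(axis=1).mean())

    corr = anomaly.corr()                                    # pairwise station correlations
    off_diag = corr.values[~np.eye(len(corr), dtype=bool)]   # drop the diagonal (self-correlation)
    mean_pairwise_corr = float(np.nanmean(off_diag))

    return {"cross_station_std": cross_station_std,
            "mean_pairwise_corr": mean_pairwise_corr}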
Figure 6a compares the Temperature Anomaly curves averaged from the raw station data for each of the six (6) analyzed regions. For this comparison, Temperature Anomaly was computed relative to each region’s temperature over the 1900-1920 period. Figure 6b shows the Temperature Anomaly computed from the Berkeley Earth model curves for the same regions. The pink shaded area in both graphs is the P90-P10 uncertainty interval.
The curves averaged from individual station records evidently demonstrate significant variability in temperature profiles, both temporally and spatially. Some of the temporal variability noticed in individual regions may in fact be supra-regional. Specifically, the temperature oscillation with a period of 50-70 years, involving a slight cooling from 1920 until the mid-1970s and a subsequent warming from then on, appears noticeable in all regions except Argentina. The Berkeley Earth model curves, on the other hand, are much more closely aligned, with long-term oscillations barely noticeable. The close alignment of curves from different regions and the suppression of long-term oscillations are presumably due to the firm data editing and “forced alignment” applied as part of the Berkeley Earth methodology. The Standard Deviation in normalised temperature across the regions (a measure of variability around the “global” temperature trend) is about 0.25degC for the curves averaged from raw station data. This is more than double the 0.14degC Standard Deviation computed from the Berkeley Earth model curves.
Discussion
Temperature curves for different regions, constructed by averaging the records from individual long-term observation stations, evidently show significant variability in temperature profiles, both temporally and spatially. Given that a large part of the observed temporal variability is correlated across stations and even regions (i.e., similar short-term temperature oscillations are recorded across multiple locations), much of this variability appears genuine. In the Berkeley Earth model reconstruction method, most of this variability is dismissed (edited out as “typographical errors, instrumentation changes, station moves, and urban or agricultural development near stations”). The very close alignment of Berkeley Earth model temperature curves for different regions is therefore to some extent artificial and reflective of over-editing the data.
The significant temporal and spatial variability in surface temperature seen in the data suggests that evolution of global temperature over the last 120 years is more complex than the steady, more or less linear warming trend suggested by the Berkeley Earth model and carried in the IPCC “hockey stick” graph. Given this complexity and variability and considering that temperature stations are not spread uniformly across the globe and that there are large regions (e.g., oceans, deserts, mountain ranges) without any observation stations, the uncertainty pertaining to a representative “global” temperature curve must be significant. Figure 6a shows that the Standard Deviation around a “global mean” curve computed from the 6 different regions is around 0.25degC; this may be a realistic estimate of the genuine uncertainty in “observed global” temperature. Note that this is 3 times as much as the 0.08degC Standard Deviation carried by IPCC for the “observed” part of their global curve (Figure 1).
As explained before, the pre-1850 “reconstructed” part of the temperature curve is not based on direct station temperature-measurements but on indirect temperature indications (such as tree-rings, pollen distribution, oxygen isotope evidence). Those observations are likely even more sparsely distributed than the temperature stations that underpin the 1850-to-recent “observed” curve and hence, subject to at least a similar if not larger distribution uncertainty. In addition, indirect inferences of temperature would also be subject to an interpretation uncertainty (that is, from the same data at a given station, different interpreters may estimate a different temperature) that may not be insignificant. All in all, I believe it is reasonable to assume the pre-1850 temperature curve is subject to at least double the uncertainty compared to the 1850-to-present period, i.e., a tentative Standard Deviation of 0.5 degC.
Figure 7 shows the IPCC 2021 temperature curves with the associated P85-P15 uncertainty bands (+/- one Standard Deviation) as suggested by this study (right-hand graph), compared to IPCC’s uncertainty (left-hand graph). The magnitude of temperature uncertainty around the pre-1850 curve as per this paper (right-hand graph) would in principle allow for significant temperature oscillations, either up or down, with an amplitude similar to the rise in global temperature suggested from the data in recent times.
This brings me to the final point of concern I would like to raise with regard to the IPCC 2021 temperature curves. The pre-1850 temperature curve of IPCC 2021 is essentially flat, with some minimal, short-term oscillations with an amplitude in the order of 0.1 to 0.2degC. I am not sure what these oscillations are based on, but given the indirect nature of pre-1850 temperature estimates (from tree rings and the like), I reckon 0.1 to 0.2degC is below the resolution power of such data. We know that significant regional variability, both spatial and temporal, existed in recent times (1850 to present); for example, the temperature curves of Malaysia (Figure 3), Ghana and Oman (Figure 4) show oscillations with an amplitude of around 0.5degC. There is no reason to assume such variability did not exist in earlier times of earth history; it is just that the sparse, unevenly distributed and ambiguous pre-1850 data (indirect inferences rather than hard measurements) simply do not allow it to be resolved.
Stochastic simulation is a technique that allows one to “add back” unresolved but likely present variability to things like temperature reconstructions, producing sample outcomes or “realizations” of what the temperature evolution through time may have looked like in reality.
Figure 8 shows four (4) randomly selected, equiprobable simulation outcomes of global temperature evolution through time. The pre-1850 curve anchors to the IPCC 2021 “reconstructed” curve but with added-on simulated temperature oscillations that have a periodicity similar to those recorded in the “observed” part of the curve (1850 to recent) and an amplitude contained within the P85-P15 uncertainty band. The post-1850 curve anchors to the arithmetic average of the 6 regional curves studied in this paper (Figure 6a).
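A minimal sketch of how such realizations can be generated is given below; the flat baseline, the amplitude and period ranges, and the number of realizations are placeholder assumptions standing in for the published IPCC curve and for the oscillations and uncertainty band discussed above.

import numpy as np

rng = np.random.default_rng(42)

years = np.arange(1000, 1851)    # pre-1850 "reconstructed" portion
baseline = np.zeros(len(years))  # placeholder for the (essentially flat) IPCC reconstructed
                                 # curve; the real curve would be loaded from published data
sigma = 0.5                      # tentative Standard Deviation argued above

def simulate_realization() -> np.ndarray:
    """One equiprobable realization: baseline plus a slow oscillation whose period
    mimics the 50-70 year cycles seen in the observed station data and whose
    amplitude stays within the assumed P85-P15 (roughly +/- one sigma) band."""
    period = rng.uniform(50.0, 70.0)     # years
    phase = rng.uniform(0.0, 2.0 * np.pi)
    amplitude = rng.uniform(0.2, sigma)  # bounded by the uncertainty band
    oscillation = amplitude * np.sin(2.0 * np.pi * (years - years[0]) / period + phase)
    return baseline + oscillation

realizations = np.array([simulate_realization() for _ in range(4)])
print("Peak-to-trough swing of each realization (degC):",
      np.round(realizations.max(axis=1) - realizations.min(axis=1), 2))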
This stochastic simulation experiment illustrates that, while the global warming trend in recent years is irrefutable, the IPCC’s point that this warming is “unprecedented in recent earth history” is not substantiated by the data, given its limitations. Pre-1850 temperatures may or may not have oscillated up or down with a magnitude similar to the 1970-to-recent warming. The pre-1850 temperature inferences are sparse and indirect and, on top of the already-significant spatial and temporal variability noted for the “observed” part of the curve (1850 to present), simply do not have the resolution power to resolve such temperature swings.
Conclusions
Key conclusions from this study are as follows:
1. 1850-to-present “global” temperature curves are not “measured” or “observed” as such but are reconstructions based on records of temperature at individual stations.
2. Individual temperature stations record significant variability in temperature evolution, both temporal and spatial. Some of this variability appears correlated across multiple stations in a region, or even supra-regionally. Much of this variability is therefore likely genuine and should not be dismissed as “artifacts”, as is done in the Berkeley Earth model (one of the key sources of the IPCC’s temperature charts).
3. Considering the temperature variability from station to station and region to region, and noting that the distribution of temperature stations across the globe is highly uneven, any reconstruction of “global” temperature based on actual measured station temperatures (i.e., post-1850) should carry a significant uncertainty band. This paper suggests using the Standard Deviation of around 0.25degC evident from comparison of the regional curves. This is three (3) times as wide as the uncertainty band suggested by IPCC 2021.
4. Pre-1850 temperature “reconstructions” are not based on actual temperature measurements at stations but on sparse, indirect inferences of temperature (such as tree rings, pollen distribution, etc.). Such reconstructions have limited resolution power, are subject to interpretation uncertainty and hence should carry at least double the uncertainty of the post-1850 records (i.e., a Standard Deviation of 0.5degC or more).
5. Temperature oscillations with a magnitude similar to the 1970-to-recent warming may or may not have occurred in pre-1850 times. The sparse and uncertain nature of the pre-1850 temperature inferences does not provide the resolution power to resolve such detail.