The Written Path to the Climate of the Past
Published in Earth & Environment
This is the story of the largely overlooked potential of documentary climate data for climate reconstruction.
Climate science has always relied on historical information to understand past climates and their variations. Traditionally, most of the research in this field has focused on reconstructing annual temperature trends. In recent times, though, the questions asked by climate scientists have become more diverse, encompassing changes in the water cycle, occurrences of extreme weather events, and atmospheric dynamics. As a result, new approaches are required to produce comprehensive paleoclimatic datasets that capture the complexity of Earth's climate history. One such approach is offline paleo data assimilation, which provides past climate fields at increasing spatial and temporal resolution. For this or any other approach to succeed, however, we depend heavily on sufficient high-quality input data.
A Written Path?
When we think of climate data, our minds often turn to instrumental measurements and natural proxies like tree rings and ice cores. However, while instrumental measurements provide highly precise and detailed climate information, their coverage is limited to the relatively recent past. Natural proxies, on the other hand, offer insights into climate variations (predominantly temperature) further back in time, but their temporal and spatial coverage is limited. Tree rings, for example, are widely used in climate reconstructions because they are found across much of the globe, but they primarily capture the signal of the annual growing season.
There is, however, another, often overlooked pathway to the climate of the past, one that is written and preserved within historical documents. These documents, such as diaries, chronicles, newspapers, and logbooks, hold a treasure trove of observations and descriptions that offer unique insights into past climate variability. Documentary proxies, i.e., time series derived from historical documents, could make an essential contribution since they cover seasons, regions, and combinations thereof (e.g., winter in East Asia) that are not well represented by natural proxies.
Holyoke’s diary-almanac, 1742. Harvard University Archives, Holyoke Family Collection.
How It All Began
The story of DOCU-CLIM began with recognizing the immense potential for climate reconstruction hidden within historical documents. In recent years, there has been a global effort to tap into the archives of societies for climate reconstruction. The PAGES CRIAS working group, founded in 2018, has been at the forefront of this effort, striving to unlock the wealth of information hidden within historical records.
Over the past century, historians and climatologists have meticulously investigated written documentary sources from all over the world and generated numerous document-based time series of local and regional climates. However, up until now, these valuable records have never been consolidated into a single comprehensive dataset on a global scale. That might be the reason why documentary data have seldom been used in large-scale climate reconstructions to date. To address this shortcoming, we present DOCU-CLIM, the first-ever global multi-variable collection of documentary climate records.
Let me take you behind the scenes of how this novel dataset came to be and what it is all about.
First, we needed to get a comprehensive overview of what was available. That entailed a whole lot of digging into literature and databases on historical climate evidence. Because an overwhelming amount of documentary climate evidence exists, I systematically inventoried the series with significant potential for climate reconstruction, focusing on high-resolution records (Burgdorf, 2022). Based on this selection, I began compiling the actual data series. While it is common practice nowadays to publish data alongside publications, that was rarely done back in the day. As a consequence, many of the records were not publicly available. Therefore, I reached out to the original authors of the identified record series to inquire about their data, which often meant tracking down the work of long-retired or even deceased researchers. Many researchers agreed to collaborate and became part of the initiative to make documentary climate evidence more accessible and thus more visible in the field of climate reconstruction. Aside from the data contributions and records found in repositories, we completed our dataset by rescuing and digitizing many additional time series. Ultimately, this global collaboration brings together a total of 621 document-based time series from across the globe with significant potential for climate reconstruction.
What Makes It Special?
DOCU-CLIM is the first-ever dataset combining multi-variable documentary climate records from across the globe, providing invaluable information on historical variations not only in temperature but also precipitation and wind regime.
Our evaluation using forward modeling showed just how strong the climate signal in documentary data is. Robust and highly significant correlations emerged across Europe, North America, and Asia. In fact, the correlations surpassed those found in traditional tree-ring modeling: the majority of series exhibit correlations above 0.5, and the distribution shows a striking peak between 0.7 and 0.9.
Map of Pearson correlation coefficients between documentary data and forward modelled data. Grey dotted circles indicate series where no evaluation was possible.
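To make the evaluation metric concrete, here is a minimal sketch of how a Pearson correlation between one documentary series and its forward-modelled counterpart could be computed. It is not the evaluation code used for DOCU-CLIM: the function name `pearson_r`, the handling of missing years via `None`, and the sample values are all assumptions made purely for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation over the years where both series have a value.

    Hypothetical helper: missing years are marked with None and skipped.
    Returns None if fewer than 3 overlapping values remain or if one of
    the series is constant (the correlation is undefined in that case).
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

# Made-up values, one entry per year; None marks a year without a record.
documentary = [1.0, 2.0, None, 4.0, 5.0]
modelled = [2.1, 3.9, 6.0, 8.2, 9.8]

r = pearson_r(documentary, modelled)
print("no evaluation possible" if r is None else f"r = {r:.2f}")
```

Returning `None` instead of a number mirrors the grey dotted circles on the map: a series with too little overlap simply cannot be evaluated.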
Not only are document-based records very accurate climate proxies, they also have the unique ability to fill gaps in regions and seasons that are not well represented by instrumental data and other paleoclimate proxies. Most climate field reconstructions based on natural proxies are biased toward the warm season, since tree rings represent the growing season. Consequently, the cold season is rather poorly represented. Here, document-based series, particularly records of plant and ice phenology, come into play. By studying the phenological phases of plants, including leaf coloring in fall and blossoming in spring, as well as ice phenology parameters like freezing dates in early winter and thawing dates in late winter, we can gain unprecedented insights into cold-season temperature.
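As a rough illustration of how such phenological dates become numbers, the sketch below turns freeze-up and break-up dates into a day-of-year index, an ice-cover duration, and anomalies relative to the series mean. This is a simplified assumption about the preprocessing, not the procedure behind DOCU-CLIM: the function names and the dates are invented, and the dates are assumed to have already been converted to a single calendar.

```python
from datetime import date

def day_of_year(d):
    """Day of the year (1 = 1 January) for a calendar date."""
    return d.timetuple().tm_yday

def ice_cover_days(freeze_up, break_up):
    """Number of days between freeze-up and the following break-up."""
    return (break_up - freeze_up).days

def anomalies(values):
    """Deviation of each value from the mean of the whole series."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]

# Invented example dates for one river; not taken from any real record.
break_ups = [date(1848, 4, 9), date(1849, 4, 20), date(1850, 4, 15)]

break_up_doy = [day_of_year(d) for d in break_ups]
print(break_up_doy)             # day-of-year index of each break-up
print(anomalies(break_up_doy))  # positive = later than average break-up
print(ice_cover_days(date(1849, 11, 20), date(1850, 4, 15)))
```

A later-than-average break-up date points to a colder late winter or spring, which is what makes such series useful as cold-season temperature proxies.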
Amidst my data quest, I stumbled upon two extraordinary Russian publications dating back to the days of the Russian Empire that merit special mention. Written in Cyrillic, they contain a wealth of freeze-up and break-up records of rivers and lakes across the vast expanse of the empire, with over 50 records covering the early 19th century and a total of 21 records extending back before 1800 CE. These records, most of which have so far been unknown to Western scholars, allow for unprecedented insights into past cold-season temperature variability from Eastern Europe all the way to the coast of the Russian Far East.
What's the Use?
Documentary (written) climate data hold great promise for climate research and reconstruction. By bringing together previously scattered and underutilized document-based records, we lay the foundation for incorporating documentary climate data into large-scale climate reconstructions. Alongside instrumental measurements and natural proxies, these data provide a more comprehensive understanding of past climate variability on inter-annual to decadal timescales. That, in turn, allows us to validate and refine climate models, providing crucial insights into the Earth's past and improving our ability to make informed projections about future climate scenarios.
References
Burgdorf, A.-M. A global inventory of quantitative documentary evidence related to climate since the 15th century. Climate of the Past 18, 1407–1428 (2022).
Rykachev, M. Openings and freezings of rivers in the Russian Empire. (тип. Имп. Акад. наук, 1886).