Using machine learning for climate prediction

Climate change has widespread effects around the world. Predicting those effects poses an extraordinarily hard, and extraordinarily important, challenge for Earth and atmospheric scientists. In the next few years, machine learning may provide many new tools to help make better climate predictions.

Citation: ‘Climate Prediction’ by K. Kochanski in ‘Tackling climate change with machine learning’, D. Rolnick, P. Donti, L. Kaack, K. Kochanski et al., 2019, arXiv:1906.05433.
Shaped by feedback from: David Rolnick, Karthik Mukkavilli, Priya Donti, David John Gagne II, Ben Kravitz, Ghaleb Abdulla, Goodwin Gibbons, Andrew Ross, Andrew Ng, John Platt, Jennifer Chayes, Yoshua Bengio

Machine learning for climate prediction

The first global warming prediction was made in 1896, when Arrhenius estimated that burning fossil fuels could eventually release enough CO₂ to warm the Earth by 5 °C. The fundamental physics underlying Arrhenius’s calculations has not changed, but our predictions have become far more detailed and precise. The predominant predictive tools are climate models, known as general circulation models (GCMs) or Earth system models (ESMs). These models inform local and national government decisions (see IPCC reports [4, 26, 437]), help individuals calculate their climate risks (see §10) and allow us to estimate the potential impacts of solar geoengineering (see §9).

Recent trends have created opportunities for ML to advance the art of climate prediction. First, new and cheaper satellites are creating petabytes of climate observation data. Second, massive climate modeling projects are generating petabytes of simulated climate data. Third, climate forecasts are computationally expensive [441] (the simulations in [440] took three weeks to run on NCAR supercomputers), but ML applications are driving the design of next-generation supercomputers that could ease current computational bottlenecks. As a result, climate scientists have recently begun to explore ML techniques, and are starting to team up with computer scientists to build new and exciting applications.

Uniting data, ML, and climate science

Climate models represent our understanding of Earth and climate physics. We can learn about the Earth by collecting data. To turn that data into useful predictions, we need to condense it into coherent, computationally tractable models. ML models are likely to be more accurate or less expensive than other models where: (1) there is plentiful data, but it is hard to model systems with traditional statistics, or (2) there are good models, but they are too computationally expensive to use in production.

When data is plentiful, climate scientists build many data-driven models. These models are mostly built by solving regression and classification problems, and new ML techniques may solve many problems that were previously challenging. For example, the authors of [442–444] use ML to calibrate satellite sensors, classify crop cover, and identify pollutant sources. More applications like these are likely to appear as satellite databases grow. In 2019, Reichstein et al. proposed that deep learning could be used extensively for pattern recognition, super-resolution, and short-term forecasting in climate models [445], and Mukkavilli proposed to compile a new labelled dataset of environmental imagery, called EnviroNet, that would accelerate ML work in environmental science [446]. We recommend that modellers who seek to learn directly from data see [447] for specific advice on fitting and over-fitting climate data.

Many climate prediction problems are irremediably data-limited. No matter how many weather stations we construct, how many field campaigns we run, or how many satellites we deploy, the Earth will generate at most one year of new climate data per year. Existing climate models deal with this limitation by relying heavily on physical laws, such as thermodynamics. ML models can leverage existing physics-based models as data sources to solve important climate problems.

Recent work has shown how deep neural networks and existing thermodynamics knowledge could be combined to fix the largest source of uncertainty in current climate models: clouds. Bright clouds block sunlight and cool the Earth; dark clouds catch outgoing heat and keep the Earth warm [437, 448]. These effects are controlled by small-scale processes such as cloud convection and atmospheric aerosols (see uses of aerosols for cloud seeding and solar geoengineering in §9). Physical models of these processes are far too computationally expensive to include in global climate models, but ML models are not. Gentine et al. trained a deep neural network to emulate the behavior of a high-resolution cloud simulation, and found that the network gave similar results for a fraction of the cost [449] and was stable in a simplified global model [450]. Existing scientific models have fixed trade-offs between cost and accuracy, and sometimes these trade-offs do not include any great solutions. Neural networks trained on those scientific models produce similar predictions, but offer an entirely new set of compromises between training cost, production cost, and accuracy. Replacing select climate model components with neural network approximators may thus improve both the cost and the accuracy of global climate models. Additional work is needed to optimize the cloud model above; to identify more climate model components that could be replaced by neural networks (we highlight other impactful components below); to train neural networks that replace those components; and to build pipelines that re-train these neural networks in response to errors or extrapolation (example workflow in §4.5 of [442]).
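The emulation idea can be illustrated with a toy example. The sketch below is not the network of [449]: it assumes a made-up one-dimensional function, expensive_simulation, as a stand-in for a costly physical model, and fits a tiny one-hidden-layer network to it with plain gradient descent. All names, sizes, and learning rates are illustrative assumptions.

```python
import math
import random

def expensive_simulation(x):
    # Hypothetical stand-in for a costly high-resolution simulation.
    return math.sin(3.0 * x)

def predict(params, x):
    """Return the network output and the hidden activations for input x."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2, hidden

def mean_squared_error(params, xs, ys):
    return sum((predict(params, x)[0] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def train_emulator(xs, ys, n_hidden=6, lr=0.02, epochs=1000, seed=0):
    """Fit a one-hidden-layer tanh network with full-batch gradient descent."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g_w1 = [0.0] * n_hidden
        g_b1 = [0.0] * n_hidden
        g_w2 = [0.0] * n_hidden
        g_b2 = 0.0
        for x, y in zip(xs, ys):
            y_hat, hidden = predict((w1, b1, w2, b2), x)
            # Gradient of the mean squared error with respect to the output.
            d_out = 2.0 * (y_hat - y) / n
            g_b2 += d_out
            for j, h in enumerate(hidden):
                g_w2[j] += d_out * h
                # Backpropagate through tanh: d tanh(z)/dz = 1 - tanh(z)^2.
                d_pre = d_out * w2[j] * (1.0 - h * h)
                g_w1[j] += d_pre * x
                g_b1[j] += d_pre
        w1 = [w - lr * g for w, g in zip(w1, g_w1)]
        b1 = [b - lr * g for b, g in zip(b1, g_b1)]
        w2 = [w - lr * g for w, g in zip(w2, g_w2)]
        b2 -= lr * g_b2
    return (w1, b1, w2, b2)

# Training data generated on demand by running the "expensive" model.
xs = [-1.0 + 2.0 * i / 31 for i in range(32)]
ys = [expensive_simulation(x) for x in xs]

untrained = train_emulator(xs, ys, epochs=0)  # initial random weights only
trained = train_emulator(xs, ys)

untrained_mse = mean_squared_error(untrained, xs, ys)
trained_mse = mean_squared_error(trained, xs, ys)
```

Once trained, calling predict(trained, x) costs a handful of multiplications regardless of how expensive the original simulation was, which is the trade-off described above. A real emulator would also need held-out test data and checks for extrapolation outside the training range.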

The next most important targets for climate model improvements are ice sheet dynamics and sea level rise. The Arctic and Antarctic are warming faster than anywhere else on Earth, and their climates control the future of global sea level rise and many vulnerable ecosystems [4, 26]. Unfortunately, these regions are dark and cold, and until recently they were difficult to observe. In the past few years, however, new satellite campaigns have illuminated them with hundreds of terabytes of data. These data could make it possible to use ML to solve some of the field’s biggest outstanding questions. In particular, models of mass loss from the Antarctic ice-sheet are highly uncertain [451] and models of the extent of Antarctic sea ice do not match reality well [452]. The most uncertain parts of these models, and thus the best targets for improvement, are snow reflectivity, sea ice reflectivity, ocean heat mixing and ice sheet grounding line migration rates [447, 451, 453]. Computer scientists who wish to work in this area could build models that learn snow and sea ice properties from satellite data, or use new video prediction techniques (e.g. [454]) to predict short-term changes in the extent of sea ice.

ML could also improve climate model efficiency by identifying and leveraging relationships between climate variables. For example, Nowack et al. demonstrated that ozone concentrations could be computed as a function of temperature, rather than physical transport laws, which led to considerable computational savings [455]. Pattern recognition and feature extraction techniques could allow us to identify more useful connections in the climate system, and regression models could allow us to quantify non-linear relationships between connected variables.
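As a rough illustration of replacing an explicitly computed variable with a learned relationship, the sketch below fits a one-feature ridge regression that predicts a quantity labelled "ozone" from temperature. The data are synthetic and the linear relationship is invented for the example; it is not the data or the model configuration used in [455], and fit_ridge_1d and predict_ozone are hypothetical helper names.

```python
import random

def fit_ridge_1d(x, y, lam=1e-3):
    """Fit y = intercept + slope * x with an L2 penalty on the slope only."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / (sxx + lam)  # lam shrinks the slope toward zero
    intercept = y_mean - slope * x_mean
    return intercept, slope

# Synthetic stand-in data (NOT real observations): an invented linear
# temperature-ozone relationship plus Gaussian noise.
rng = random.Random(0)
temperature = [200.0 + 40.0 * rng.random() for _ in range(500)]
ozone = [3.0 - 0.04 * (t - 220.0) + rng.gauss(0.0, 0.05) for t in temperature]

intercept, slope = fit_ridge_1d(temperature, ozone)

def predict_ozone(t):
    # A cheap lookup that stands in for an explicit transport calculation.
    return intercept + slope * t
```

In a real model the inputs would be multi-dimensional fields and the regression would need validation under climate states outside the training data, since a learned relationship is only trustworthy where it has been tested.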

Looking further ahead, the Climate Modeling Alliance has proposed to build an entirely new climate model that learns continuously from data and from high-resolution simulations [456]. The proposed model would be written in Julia, in contrast to existing models, which are mostly written in C++ and inherited Fortran. At the cost of a daunting translation workload, the Alliance aims to build a model that is more accessible to new developers and more compatible with ML libraries.

Finally, the best climate predictions are synthesized from ensembles of 20+ climate models [457]. Making good ensemble predictions is an excellent ML problem. Monteleoni et al. proposed that online ML algorithms (e.g. [458]) could select the best-performing model at any given point in time [459]; this idea has been refined in further work [460, 461]. More recently, Anderson and Lucas used random forests to make high-resolution predictions from a mix of high- and low-resolution models, thereby reducing the costs of building multi-model ensembles [462]. These studies leave room for the development of more specialized and sophisticated ensemble methods. For example, climate models serve many users with different objectives. The model in [459] optimizes the ensemble to predict global temperature; however, their solution is not necessarily optimal for users who need predictions of local temperatures, local rainfall, or the dates the Northwest Passage will open for shipping.
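To make the online-weighting idea concrete, here is a minimal sketch of an exponentially weighted average forecaster. This is a simpler relative of the algorithms in [458, 459], not a reproduction of them: it keeps one weight per model and shrinks that weight according to the model's squared error at each step. The three "models", the observations, and the learning rate eta are all invented for illustration.

```python
import math

def hedge_ensemble(model_preds, observations, eta=2.0):
    """Exponentially weighted average forecaster over a fixed set of models.

    model_preds: list of per-model prediction lists, all the same length.
    observations: list of observed values, one per time step.
    Returns (ensemble predictions, final weights).
    """
    n_models = len(model_preds)
    weights = [1.0 / n_models] * n_models
    ensemble = []
    for t, obs in enumerate(observations):
        preds_t = [preds[t] for preds in model_preds]
        # Forecast with the current weights before seeing the observation.
        ensemble.append(sum(w * p for w, p in zip(weights, preds_t)))
        # Down-weight each model by its squared error, then renormalize.
        weights = [w * math.exp(-eta * (p - obs) ** 2)
                   for w, p in zip(weights, preds_t)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble, weights

# Synthetic example: a slow warming trend and three hypothetical models.
observed = [0.02 * t for t in range(50)]
model_a = [obs + 0.5 for obs in observed]                  # constant warm bias
model_b = [obs + 0.05 * math.sin(t) for t, obs in enumerate(observed)]  # small errors
model_c = [0.0 for _ in observed]                          # misses the trend

ensemble, weights = hedge_ensemble([model_a, model_b, model_c], observed)
```

Here the weights concentrate on model_b because its cumulative squared error is smallest. Changing the loss inside the exponent, for example to an error in local rainfall rather than global temperature, is one way to tailor the ensemble to the different user objectives mentioned above.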

Forecasting extreme events

For most people, extreme event prediction means the local weather forecast and a few days’ warning to stockpile food, go home, and lock the shutters. Weather forecasts are shorter-term than climate forecasts, but they produce abundant data that makes them amenable to some ML techniques that would not work in climate models. Weather models are optimized to track the rapid, chaotic changes of the atmosphere; since these changes are fast, tomorrow’s weather forecast is made and tested every day. Climate models, in contrast, are chaotic on short time scales, but their long-term trends are driven by slow, predictable changes of ocean, land, and ice (see [463]). As a result, climate model output can only be tested against long-term observations (at the scale of years to decades).

Intermediate time scales, of weeks to months, are exceptionally difficult to predict, although Cohen et al. [464] argue that machine learning could bridge that gap by making good predictions on four to six week timescales [465]. Thus far, however, weather modelers have had hundreds of times more test data than climate modelers, and began to adopt ML techniques earlier. Numerous ML weather models are already running in production. For example, Gagne et al. recently used an ensemble of random forests to improve hail predictions within a major weather model [466].

Climate models do predict changes in long-term trends like drought frequency and storm intensity, although they cannot predict the specific dates of future events. These trends help individuals, corporations and towns to make informed decisions about infrastructure, asset valuation and disaster response plans (see also §8.4). Identifying extreme events in climate model output, however, is a classification problem with a twist: all of the available data sets are strongly skewed because extreme events are, by definition, rare. ML has been used successfully to classify some extreme weather events. Liu et al. used deep convolutional neural networks to count cyclones and weather fronts in climate data sets [467], and Lakshmanan has devised a series of techniques to track storms and tornadoes (e.g. [468]). Tools for more event types would be useful, as would online tools that work within climate models, and statistical tools that quantify the uncertainty in new extreme event forecasts.
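The effect of skewed data on evaluation can be shown with a small worked example. The sketch below uses invented labels (10 extreme-event days out of 1000) and two hypothetical classifiers: one that never predicts an event, and one that catches most events at the cost of some false alarms. The numbers are chosen for illustration and do not come from any of the cited studies.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 marks an extreme event."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def scores(y_true, y_pred):
    """Return accuracy, precision, and recall (0.0 when a ratio is undefined)."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Synthetic labels: 10 extreme-event days followed by 990 ordinary days.
y_true = [1] * 10 + [0] * 990

# Hypothetical classifier 1: never predicts an extreme event.
never_flags = [0] * 1000

# Hypothetical classifier 2: finds 8 of the 10 events, with 12 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 978

baseline_scores = scores(y_true, never_flags)  # (0.99, 0.0, 0.0)
detector_scores = scores(y_true, detector)     # (0.986, 0.4, 0.8)
```

The classifier that never flags an event has the higher accuracy, yet it is useless for extreme event detection. This is why work on rare events typically reports precision, recall, or similar skill scores rather than accuracy alone.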

Forecasts are most actionable if they are specific and local. ML is widely used to make local forecasts from coarse 10–100 km climate or weather model predictions; various authors have attempted this using support vector machines, autoencoders, Bayesian deep learning, and super-resolution convolutional neural networks (e.g. [469]). Several groups are now working to translate high-resolution climate forecasts into risk scenarios. For example, ML can predict localized flooding patterns from past data [470], which could inform individuals buying insurance or homes. Currently, flood maps from the U.S. Federal Emergency Management Agency (FEMA) (part of the National Flood Insurance Program) do not account for the effects of climate change on flooding [471]. Since ML methods like neural networks are effective at predicting local flooding during extreme weather events [472], these could be used to update local flood risk estimates to benefit individuals. The start-up Jupiter Intelligence is working to make climate predictions more accessible and actionable to companies and local governments, by translating climate forecasts into localised flood and temperature risk scores.

A full review of the applications of ML for extreme weather forecasting is beyond the scope of this article. Fortunately, that review has already been written: see [473]. The authors describe ML systems that correct bias, recognize patterns, and predict storms. Moving forward, they envision human experts working in sync with automated forecasts.

Discussion

ML may change the way that scientific modeling is done. The examples above show that many components of large climate models can be replaced with ML models at lower computational cost. From an ML standpoint, learning from an existing model has many advantages: modelers can generate new training and test data on demand, and the new ML model inherits some community trust from the old one. This is an area of active ML research. Recent papers have explored data-efficient techniques for learning dynamical systems [474], including physics-informed neural networks [475] and neural ordinary differential equations [128]. Looking further ahead, researchers are developing ML solutions for a wide range of scientific modeling challenges, including crash prediction [476], adaptive numerical meshing [477], uncertainty quantification [478, 479] and performance optimization [480]. If these solutions are effective, they may solve some of the largest structural challenges facing current climate models.

New ML models for climate will be most successful if they are closely integrated into existing scientific models. This has been emphasized, again and again, by authors who have laid future paths for artificial intelligence within climate science [450, 456, 473, 481]. New models need to leverage existing knowledge to make good predictions with limited data. In ten years, we will have more satellite data, more interpretable ML techniques, hopefully more trust from the scientific community, and possibly a new climate model written in Julia. For now, however, ML models must be creatively designed to work within existing climate models. The best of these models are likely to be built by close-knit teams including both climate and computational scientists.