At NSF Unidata, we have successfully implemented and re-used weights from several global AI-NWP (Artificial Intelligence-Numerical Weather Prediction) models (FourCastNet, Pangu-Weather) using the NVIDIA earth2mip package. We can confirm that these models are open source and can be reused on high-end, but increasingly standard, HPC hardware. While traditional numerical weather prediction requires massive supercomputing resources, these AI models can potentially deliver similar or better results using standard GPU hardware for inference; training them, however, still requires immense GPU resources. The process demanded careful consideration of computational resources, model architecture adaptations, and optimization strategies, as detailed below, and it has opened up new possibilities for research institutions to run state-of-the-art weather predictions without access to supercomputing facilities. Below, we lay out some of the roadblocks we encountered and how to address them, hoping to smooth the path for others looking to implement these pre-trained models.
Need a Large-ish GPU!
A standard desktop GPU often won't suffice for loading and running these complex models; we quickly ran into memory limitations when using GPUs with under 10GB of VRAM. Jetstream2 is a U.S. National Science Foundation (NSF) funded cloud computing system designed to provide on-demand, interactive, and programmatic cyberinfrastructure for research and education, featuring advanced AI capabilities, virtual GPUs, and high-performance storage to support a broad range of scientific and engineering applications. Our successful implementation relied on a Jetstream2 g3.large instance equipped with a 20GB GPU (see Instance Flavors in the Jetstream2 documentation), and even then we encountered limits with specific models. Cloud-based serverless options like Modal are also viable, allowing you to access the necessary resources on demand.
State-of-the-art AI-NWP models can contain billions of parameters, with some requiring upwards of 16GB of GPU memory just to load the weights before computation begins. Additionally, at least 100GB of hard drive space is recommended for storing model weights and associated data. The initial loading process can be surprisingly time-consuming, often taking several minutes as the weights are transferred from storage to the GPU's memory. We also discovered that having sufficient memory isn't the whole story — efficient memory management is crucial. Strategies like clearing the CUDA cache, setting memory limits, and using mixed precision (FP16) are essential for preventing out-of-memory errors and ensuring smooth execution. At the end of this blog, there is a link to an NSF Unidata repository that has some helpful files and scripts for your own implementation of the earth2mip project.
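As a concrete sketch of those memory-management strategies, the PyTorch wrapper below shows mixed precision (FP16), a per-process memory cap, and cache clearing in one place. This is generic PyTorch, not the earth2mip API; `model` and `x` are placeholders for an AI-NWP model and its input tensor, and the 0.9 memory fraction is an illustrative value.

```python
import torch

def run_inference(model, x, use_fp16=True):
    """Minimal sketch of GPU memory-management strategies for inference.

    `model` and `x` are placeholders for an AI-NWP model and its input
    tensor; adjust the memory fraction and precision for your hardware.
    """
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    x = x.to(device)

    # Optionally cap this process's share of GPU memory (CUDA only).
    if device == "cuda":
        torch.cuda.set_per_process_memory_fraction(0.9)

    with torch.no_grad():
        if use_fp16 and device == "cuda":
            # Mixed precision roughly halves activation memory on supported GPUs.
            with torch.autocast(device_type="cuda", dtype=torch.float16):
                out = model(x)
        else:
            out = model(x)

    if device == "cuda":
        torch.cuda.empty_cache()  # return cached blocks to the CUDA allocator
    return out
```

Wrapping inference in `torch.no_grad()` matters as much as the FP16 cast: without it, PyTorch retains activations for a backward pass that never happens, which is often the difference between fitting and not fitting a large model on a 20GB card.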
AI/ML Python Packaging is Still a Mess
These products represent the cutting edge of Earth Systems Science, where APIs and
datasets evolve rapidly to incorporate new research findings and methodological
advances. The AI/ML Python packaging ecosystem is improving significantly week by
week — a notable example being the recent ability to package PyTorch and CUDA on
Windows — but the implementation process remains intricate and heavily dependent on
specific hardware configurations, firmware versions, and package compatibilities in
ways that exceed the typical requirements of scientific software. The emergence of
modern Python package installers like uv
(from Astral) offers promising solutions to
these dependency challenges, as it provides dramatically faster installations and
more reliable dependency resolution compared to traditional tools like pip. While we
share our implementation approach below, we anticipate that these instructions may
need frequent updates to remain current with the fast-paced developments in both
AI/ML technologies and earth systems modeling. This rapid evolution means that
successful implementation often requires staying closely connected with the broader
AI/ML and Earth Systems communities to track breaking changes and emerging best
practices.
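As one example of what this looks like in practice, the commands below sketch a fresh environment setup with uv. The CUDA wheel index URL and the earth2mip install line are assumptions current as of this writing; check the PyTorch "Get Started" page and the earth2mip README for authoritative, up-to-date instructions.

```shell
# Create and activate an isolated environment with uv
uv venv .venv
source .venv/bin/activate

# Install a CUDA-enabled PyTorch build (the cu121 index URL is an example;
# pick the wheel index matching your driver and CUDA version)
uv pip install torch --index-url https://download.pytorch.org/whl/cu121

# earth2mip is installed from its GitHub repository (URL assumed current;
# see the project's README for the authoritative install command)
uv pip install "earth2mip @ git+https://github.com/NVIDIA/earth2mip.git"
```

Pinning the environment this way, rather than installing into a shared system Python, is what lets you reproduce a working configuration after the inevitable upstream breaking change.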
Input Data Sourcing and Pre-processing
Access to specialized meteorological data sources is required for many AI-NWP
frameworks. Specifically, the ECMWF ai-models repository
requires direct MARS (Meteorological Archival and Retrieval System) access, which is
typically only available to ECMWF member states and licensed institutions. While
other data access has improved through tools like the CDS API for ERA5 and the
earth2mip package's data preprocessing pipeline, significant
challenges remain. This differs significantly from simpler machine learning
approaches like random forest models that can work directly with local CSV or NetCDF
files. Organizations must obtain appropriate credentials (ECMWF MARS access for
ai-models, or CDS API credentials for ERA5) and become familiar with meteorological
file formats like GRIB2 and their associated libraries (ecCodes, cfgrib) to properly load and preprocess the data. Additionally,
downloading and storing the required ERA5 variables can demand several terabytes of
storage space, depending on the temporal and spatial resolution needed.
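To make the ERA5 route concrete, the sketch below builds a CDS API request dictionary for a few single-level fields. The variable list and 6-hourly times are illustrative, not the exact set any particular model requires, and `build_era5_request` is a hypothetical helper; with credentials in `~/.cdsapirc`, the dictionary would be submitted through the `cdsapi` client as shown in the trailing comment.

```python
from datetime import datetime

def build_era5_request(date, variables=None):
    """Build a CDS API request dict for ERA5 single-level fields.

    The default variable list is illustrative; consult your model's
    documentation for the exact fields and levels it expects.
    """
    if variables is None:
        variables = [
            "2m_temperature",
            "mean_sea_level_pressure",
            "10m_u_component_of_wind",
            "10m_v_component_of_wind",
        ]
    return {
        "product_type": "reanalysis",
        "format": "grib",
        "variable": variables,
        "year": f"{date.year}",
        "month": f"{date.month:02d}",
        "day": f"{date.day:02d}",
        "time": [f"{h:02d}:00" for h in range(0, 24, 6)],  # 6-hourly analyses
    }

# With CDS credentials in ~/.cdsapirc, the request would be submitted as:
#   import cdsapi
#   cdsapi.Client().retrieve("reanalysis-era5-single-levels",
#                            build_era5_request(datetime(2024, 1, 1)),
#                            "era5_sample.grib")
```

The resulting GRIB file can then be opened with `xarray.open_dataset(..., engine="cfgrib")`, which is where the ecCodes/cfgrib familiarity mentioned above pays off.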
Hope for the Future
This research and workflow sit at the frontier of large-scale weather prediction, where artificial intelligence meets traditional numerical methods. While today's packages and workflows require careful handling and specific expertise, we are seeing early signs of maturation in the field. The community's growing adoption of these AI-driven approaches is steadily leading to more robust implementations and better documentation. We have already witnessed significant improvements in model packaging, data access, and installation processes, suggesting that what seems complex today may become routine practice tomorrow.
While these packages and workflows are delicate today, we predict that models and methods that see wider community use will get more support to smooth out the rough edges. We have already seen improvement on this front with recent work from our CIRA colleagues using the ai-models package supported by ECMWF. Jacob Radford (https://github.com/jacob-radford) of CIRA created a great Google Colab notebook titled Running AI Weather Prediction (AIWP) models.
If You Just Want the Data
Our colleagues at CSU Fort Collins serve up these AI model outputs in an accessible S3 bucket. Their work is described in "Accelerating Community-Wide Evaluation of AI Models for Global Weather Prediction by Facilitating Access to Model Output" (Radford et al. 2025), and the data are available at https://noaa-oar-mlwp-data.s3.amazonaws.com/index.html.
We want to thank them for doing this service for the community. At the 2025 American Meteorological Society Annual Meeting we saw more than a few presentations using this data archive for interesting research. Much more to do in this space!
Use it in MetPy
With the release of MetPy 1.7, we can access this data using the code below:
```python
from datetime import datetime

from metpy.plots import MapPanel, PanelContainer, RasterPlot
from metpy.remote import MLWPArchive

###################
# Access the GraphCast forecast closest to the desired date/time
dt = datetime(2025, 3, 19, 0)  # Target datetime in UTC

# MLWPArchive accesses Machine Learning Weather Prediction models from NOAA's data archive
# get_product retrieves GraphCast data for the specified datetime
ds = MLWPArchive().get_product('graphcast', dt).access()

###################
# Plot the data using MetPy's simplified plotting interface.
raster = RasterPlot()
raster.data = ds                # Assign xarray Dataset
raster.field = 't2'             # Plot 2-meter temperature field
raster.time = dt                # Set valid time
raster.colorbar = 'horizontal'  # Position colorbar
raster.colormap = 'RdBu_r'      # Red-Blue reversed colormap (blue=cold, red=warm)

panel = MapPanel()
panel.area = 'co'               # Set geographic area to Colorado
panel.projection = 'lcc'        # Lambert Conformal Conic projection
panel.layers = ['coastline', 'borders', 'states']  # Add map features
panel.plots = [raster]          # Add raster plot to panel
panel.title = f"{ds[raster.field].attrs['long_name']} @ {dt}"  # Title with field name and time

pc = PanelContainer()
pc.size = (8, 8)                # Figure size in inches
pc.panels = [panel]             # Add panel to container
pc.draw()                       # Render the figure
pc.show()                       # Display the figure
```
Parting Thoughts
The dramatic computational efficiency gains demonstrated by AI-based weather prediction models like GraphCast and FourCastNet — showing orders of magnitude speedup over traditional physics-based models — make a compelling case for their operational integration. The hydrology community is doing something similar with GPU-based inference for complex simulations (Bennett et al. 2024); these learnings will only expand across the Earth Systems Sciences.
However, AI-NWP models remain dependent on traditional numerical models for training data, creating an interesting symbiotic relationship. While computationally expensive to run, physics-based models provide the essential foundation for training more efficient AI emulators. The path forward likely involves a hybrid approach where AI models handle routine operational forecasting on GPUs, while traditional models advance our understanding and generate training data.
The key challenges ahead involve thorough validation across different weather regimes and scales to build trust in the meteorological community. This requires continued collaboration between AI and weather prediction experts to ensure these models maintain physical consistency while capitalizing on their computational advantages. Finding this balance between efficiency and reliability will be crucial for successfully integrating AI-NWP into operational systems.
NSF Unidata-built Resources
In the Unidata/MLscratchpad GitHub repository, we have some of the resources that we used for this initial test. While it’s not a complete end-to-end implementation, it might help you get over the line. As always, feel free to get in touch with us via our support channels for more customized assistance.
References
Bauer, Peter. "What if? Numerical weather prediction at the crossroads." Journal of the European Meteorological Society 1 (2024): 100002.
Bennett, Andrew, Hoang Tran, Luis De la Fuente, Amanda Triplett, Yueling Ma, Peter Melchior, Reed M. Maxwell, and Laura E. Condon. "Spatio‐temporal machine learning for regional to continental scale terrestrial hydrology." Journal of Advances in Modeling Earth Systems 16, no. 6 (2024): e2023MS004095.
Bi, Kaifeng, Lingxi Xie, Hengheng Zhang, Xin Chen, Xiaotao Gu, and Qi Tian. "Pangu-Weather: A 3D high-resolution model for fast and accurate global weather forecast." arXiv preprint arXiv:2211.02556 (2022).
Kurth, Thorsten, Shashank Subramanian, Peter Harrington, Jaideep Pathak, Morteza Mardani, David Hall, Andrea Miele, Karthik Kashinath, and Anima Anandkumar. "FourCastNet: Accelerating global high-resolution weather forecasting using adaptive Fourier neural operators." In Proceedings of the Platform for Advanced Scientific Computing Conference, pp. 1-11. 2023.
Radford, Jacob T., Imme Ebert-Uphoff, Jebb Q. Stewart, Kate D. Musgrave, Robert DeMaria, Natalie Tourville, and Kyle Hilburn. "Accelerating Community-Wide Evaluation of AI Models for Global Weather Prediction by Facilitating Access to Model Output." Bulletin of the American Meteorological Society 106, no. 1 (2025): E68-E76.
Thomas Martin is an AI/ML Software Engineer at the NSF Unidata Program Center. Have questions? Contact support-ml@unidata.ucar.edu or book an office hours meeting with Thomas on his Calendar.