Analysis framework and collection of process-oriented diagnostics for weather and climate simulations

erileydellaripa/MDTF-diagnostics

 
 

MDTF-diagnostics: A Portable Framework for Weather and Climate Model Data Analysis

The MDTF-diagnostics package is a portable framework for running process-oriented diagnostics (PODs) on weather and climate model data.

What is a POD?

Each process-oriented diagnostic [POD; Maloney et al. (2019)] targets a specific physical process or emergent behavior. Its goals are to determine how well one or more models represent the process, to ensure that models produce the right answers for the right reasons, and to identify gaps in the understanding of phenomena. Each POD is independent of other PODs. PODs generate diagnostic figures that can be viewed as an HTML file in a web browser.

Available Diagnostics

The table below links to sample output, a brief description, and the full documentation for each currently supported POD.

| Diagnostic | Contributor |
| --- | --- |
| Blocking Neale | Rich Neale (NCAR), Dani Coleman (NCAR) |
| Convective Transition Diagnostics | J. David Neelin (UCLA) |
| Diurnal Cycle of Precipitation | Rich Neale (NCAR) |
| Eulerian Storm Track | James Booth (CUNY), Jeyavinoth Jeyaratnam |
| Extratropical Variance (EOF 500hPa Height) | CESM/AMWG (NCAR) |
| Forcing Feedback Diagnostic | Brian Soden (U. Miami), Ryan Kramer |
| Mixed Layer Depth | Cecilia Bitz (U. Washington), Lettie Roach |
| MJO Propagation and Amplitude | Xianan Jiang (UCLA) |
| MJO Spectra and Phasing | CESM/AMWG (NCAR) |
| MJO Teleconnections | Eric Maloney (CSU) |
| Moist Static Energy Diagnostic Package | H. Annamalai (U. Hawaii), Jan Hafner (U. Hawaii) |
| Ocean Surface Flux Diagnostic | Charlotte A. DeMott (Colorado State University), Chia-Weh Hsu (GFDL) |
| Precipitation Buoyancy Diagnostic | J. David Neelin (UCLA), Fiaz Ahmed |
| Rossby Wave Sources Diagnostic Package | H. Annamalai (U. Hawaii), Jan Hafner (U. Hawaii) |
| Sea Ice Suite | Cecilia Bitz (U. Washington), Lettie Roach |
| Soil Moisture-Evapotranspiration coupling | Eric Wood (Princeton) |
| Stratosphere-Troposphere Coupling: Annular Modes | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Stratosphere-Troposphere Coupling: Eddy Heat Fluxes | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Stratosphere-Troposphere Coupling: QBO and ENSO stratospheric teleconnections | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL), Dillon Elsbury (NOAA) |
| Stratosphere-Troposphere Coupling: Stratospheric Ozone and Circulation | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Stratosphere-Troposphere Coupling: Stratospheric Polar Vortex Extremes | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Stratosphere-Troposphere Coupling: Vertical Wave Coupling | Amy H. Butler (NOAA CSL), Zachary D. Lawrence (CIRES/NOAA PSL) |
| Surface Albedo Feedback | Cecilia Bitz (U. Washington), Aaron Donahoe (U. Washington), Ed Blanchard, Wei Cheng, Lettie Roach |
| Surface Temperature Extremes and Distribution Shape | J. David Neelin (UCLA), Paul C Loikith (PSU), Arielle Catalano (PSU) |
| TC MSE Variance Budget Analysis | Allison Wing (Florida State University), Jarrett Starr (Florida State University) |
| Top Heaviness Metric | Zhuo Wang (U. Illinois Urbana-Champaign), Jiacheng Ye (U. Illinois Urbana-Champaign) |
| Tropical Cyclone Rain Rate Azimuthal Average | Daehyun Kim (U. Washington), Nelly Emlaw (U. Washington) |
| Tropical Pacific Sea Level | Jianjun Yin (U. Arizona), Chia-Weh Hsu (GFDL) |
| Wavenumber-Frequency Spectra | CESM/AMWG (NCAR) |

Example POD Analysis Results

Quickstart installation instructions

See the documentation site for all other information, including more in-depth installation instructions.

Visit the GFDL YouTube Channel for tutorials on package installation and other MDTF-diagnostics-related topics.

Prerequisites

  • Anaconda3, Miniconda3, or micromamba.

  • Installation instructions are available here.

  • MDTF-diagnostics is developed for macOS and Linux systems. The package has been tested on, but is not fully supported for, the Windows Subsystem for Linux.

  • Attention macOS M-series chip users: the MDTF-diagnostics base and python3 conda environments will only build with micromamba on machines running Apple M-series chips. The NCL and R environments will NOT build on M-series machines because the conda packages do not support them at this time.

Notes

  • $ indicates strings to be substituted, e.g., the string $CODE_ROOT should be substituted by the actual path to the MDTF-diagnostics directory.
  • Consult the Getting started section to learn how to run the framework on your own data and configure general settings.
  • POD contributors can consult the Developer Cheatsheet for brief instructions and useful tips.

1. Install MDTF-diagnostics

  • Open a terminal and create a directory named mdtf, then run cd mdtf

  • Clone your fork of the MDTF repo on your machine: git clone https://github.com/[your fork name]/MDTF-diagnostics

  • Check out the latest official release: git checkout tags/[version name]

  • Run % conda info --base to determine the location of your Conda installation. This path will be referred to as $CONDA_ROOT.

  • cd $CODE_ROOT, then run

ANACONDA/MINICONDA

% ./src/conda/conda_env_setup.sh --all --conda_root $CONDA_ROOT --env_dir $CONDA_ENV_DIR

MICROMAMBA on machines that do NOT have Apple M-series chips

% ./src/conda/micromamba_env_setup.sh --all --micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR

MICROMAMBA on machines with Apple M-series chips

% ./src/conda/micromamba_env_setup.sh -e base --micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR

% ./src/conda/micromamba_env_setup.sh -e python3_base --micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR

  • Substitute the actual paths for $CODE_ROOT, $CONDA_ROOT, $MICROMAMBA_ROOT, $MICROMAMBA_EXE, and $CONDA_ENV_DIR.
  • $MICROMAMBA_ROOT is the path to the micromamba installation on your system (e.g., /home/${USER}/micromamba). This is defined by the $MAMBA_ROOT_PREFIX environment variable on your system when micromamba is installed.
  • $MICROMAMBA_EXE is the full path to the micromamba executable on your system (e.g., /home/${USER}/.local/bin/micromamba). This is defined by the $MAMBA_EXE environment variable on your system.
  • The --env_dir flag allows you to put the program files in a designated location $CONDA_ENV_DIR (for space reasons, or if you don’t have write access). You can omit this flag, and the environments will be installed within $CONDA_ROOT/envs/ by default.
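As a minimal sketch of the substitutions described above, the snippet below fills in the micromamba variables from the environment and prints the resulting setup command for review rather than running it. The fallback paths under $HOME mirror the examples in the text, and $HOME/mdtf/envs is a hypothetical location for the environments; adjust both to your system. On Apple M-series machines, replace --all with -e base and -e python3_base as shown earlier.

```shell
# Use micromamba's own environment variables when they are set; otherwise
# fall back to the example locations from the text (an assumption).
MICROMAMBA_ROOT="${MAMBA_ROOT_PREFIX:-$HOME/micromamba}"
MICROMAMBA_EXE="${MAMBA_EXE:-$HOME/.local/bin/micromamba}"

# Hypothetical location for the environments; any writable directory works.
CONDA_ENV_DIR="${CONDA_ENV_DIR:-$HOME/mdtf/envs}"

# Print the command for review instead of running it directly.
echo ./src/conda/micromamba_env_setup.sh --all \
    --micromamba_root "$MICROMAMBA_ROOT" \
    --micromamba_exe "$MICROMAMBA_EXE" \
    --env_dir "$CONDA_ENV_DIR"
```

Once the printed paths look right, run the same command from $CODE_ROOT without the leading echo.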

NOTE: The micromamba environments may differ from the conda environments because of package compatibility discrepancies between solvers.

The command % ./src/conda/micromamba_env_setup.sh --all --micromamba_root $MICROMAMBA_ROOT --micromamba_exe $MICROMAMBA_EXE --env_dir $CONDA_ENV_DIR builds the base environment, the NCL_base environment, and a limited version of the python3_base environment. The limited python3_base environment excludes the following packages and dependencies, which may be required by the POD(s) you want to run:

  • falwa
  • gridfill

2. Download the sample data

Supporting observational data and sample model data are available via anonymous FTP at ftp://ftp.cgd.ucar.edu/archive/mdtf.

  • Digested observational data: run wget ftp://ftp.cgd.ucar.edu/archive/mdtf/obs_data_latest/\* or download the collection "NCAR CGD Anon" from Globus
  • NCAR-CESM-CAM sample data (12.3 GB): model.QBOi.EXP1.AMIP.001.tar (ftp://ftp.cgd.ucar.edu/archive/mdtf/model.QBOi.EXP1.AMIP.001.tar)
  • NOAA-GFDL-CM4 sample data (4.8 GB): model.GFDL.CM4.c96L32.am4g10r8.tar (ftp://ftp.cgd.ucar.edu/archive/mdtf/model.GFDL.CM4.c96L32.am4g10r8.tar)

Note that the above paths are symlinks to the most recent versions of the data and will be reported as zero bytes in an FTP client.
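The downloads can be scripted. The sketch below only prints the two sample-data URLs listed above so they can be checked first; the function name sample_data_urls is made up for this example, and the commented loop assumes a wget build with FTP support.

```shell
# Base URL and archive names are taken from the list above.
FTP_BASE="ftp://ftp.cgd.ucar.edu/archive/mdtf"

# Print one download URL per sample-data archive.
sample_data_urls() {
    for archive in model.QBOi.EXP1.AMIP.001.tar model.GFDL.CM4.c96L32.am4g10r8.tar; do
        echo "$FTP_BASE/$archive"
    done
}

# Review the URLs first ...
sample_data_urls

# ... then fetch them from the mdtf directory (-c resumes an interrupted transfer):
#   sample_data_urls | while read -r url; do wget -c "$url"; done
```

Resuming matters here because the archives are several gigabytes each.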

Running tar -xvf [filename].tar will extract the contents in the following hierarchy under the mdtf directory:

mdtf
 ├── MDTF-diagnostics
 └── inputdata
     ├── model
     │   ├── GFDL.CM4.c96L32.am4g10r8
     │   │   └── day
     │   │       ├── GFDL.CM4.c96L32.am4g10r8.precip.day.nc
     │   │       └── (... other .nc files )
     │   └── QBOi.EXP1.AMIP.001
     │       ├── 1hr
     │       │   ├── QBOi.EXP1.AMIP.001.PRECT.1hr.nc
     │       │   └── (... other .nc files )
     │       ├── 3hr
     │       │   └── QBOi.EXP1.AMIP.001.PRECT.3hr.nc
     │       ├── day
     │       │   ├── QBOi.EXP1.AMIP.001.FLUT.day.nc
     │       │   └── (... other .nc files )
     │       └── mon
     │           ├── QBOi.EXP1.AMIP.001.PS.mon.nc
     │           └── (... other .nc files )
     └── obs_data ( = $OBS_DATA_ROOT)
         └── (... supporting data for individual PODs )
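
After extracting the archives, a quick check can confirm that the directories match the hierarchy shown above. This is a sketch only: check_layout is a hypothetical helper, and the list of directories covers just the default test case.

```shell
# check_layout ROOT: report which expected directories exist under ROOT.
# Returns non-zero if any of them is missing.
check_layout() {
    root="$1"
    missing=0
    for sub in MDTF-diagnostics \
               inputdata/model/QBOi.EXP1.AMIP.001 \
               inputdata/obs_data; do
        if [ -d "$root/$sub" ]; then
            echo "ok:      $sub"
        else
            echo "missing: $sub"
            missing=1
        fi
    done
    return "$missing"
}

# Example call, assuming the mdtf directory was created in your home directory:
#   check_layout "$HOME/mdtf"
```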

The default test case uses the QBOi.EXP1.AMIP.001 sample data. The GFDL.CM4.c96L32.am4g10r8 sample data is only needed to test the MJO Propagation and Amplitude POD.

You can put the observational data and model output in different locations (e.g., for space reasons) by changing the value of OBS_DATA_ROOT as described below in section 4.

3. Generate a data catalog for the sample input data

The MDTF-diagnostics package provides a basic catalog generator in the tools/catalog_builder directory to assist users with building data catalogs.

4. Configure framework paths

The MDTF framework supports setting configuration options in a file as well as on the command line. An example of the configuration file format is provided at templates/runtime_config.[jsonc | yml]. We recommend configuring the following settings by editing a copy of this file.

  • CATALOG_DIR: path to the ESM-intake data catalog
  • If you've saved the supporting data in the directory structure described in section 2 and use observational input data, the default value for OBS_DATA_ROOT (../inputdata/obs_data) will be correct. If you put the data in a different location, the path should be changed accordingly.
  • WORK_DIR is used as a scratch location for files generated by the PODs, and should have sufficient quota to handle the full set of model variables you plan to analyze. This includes the sample model and observational data (approx. 19 GB) plus data required for the POD(s) you are developing. No files are saved here unless you set OUTPUT_DIR to the same location as WORK_DIR, so a temporary directory would be a good choice.
  • OUTPUT_DIR should be set to the desired location for output files. OUTPUT_DIR and WORK_DIR are set to the same location by default. The output of each run of the framework will be saved in a different subdirectory in this location. As with WORK_DIR, ensure that OUTPUT_DIR has sufficient space for all POD output.
  • conda_root should be set to the value of $CONDA_ROOT used in section 1.
  • Likewise, set conda_env_root to the same location as $CONDA_ENV_DIR in section 1.
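
Before a first run, it can help to confirm that the directory chosen for WORK_DIR is writable and has enough free space. The sketch below is not part of the framework: free_gb and check_work_dir are hypothetical helpers, and the default threshold of 19 is simply the sample-data figure quoted above.

```shell
# free_gb DIR: print the free space on DIR's filesystem in whole GiB.
# df -P gives stable POSIX columns; -k reports 1024-byte blocks.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# check_work_dir DIR [NEEDED_GB]: succeed if DIR is a writable directory
# with at least NEEDED_GB free (default 19).
check_work_dir() {
    dir="$1"
    needed="${2:-19}"
    if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
        echo "not a writable directory: $dir"
        return 1
    fi
    avail="$(free_gb "$dir")"
    case "$avail" in
        ''|*[!0-9]*) echo "could not read free space for $dir"; return 1 ;;
    esac
    if [ "$avail" -lt "$needed" ]; then
        echo "only ${avail} GB free in $dir; ${needed} GB wanted"
        return 1
    fi
    echo "ok: ${avail} GB free in $dir"
}

# Example call with a hypothetical scratch location:
#   check_work_dir /tmp/mdtf_wkdir
```

The same check applies to OUTPUT_DIR when it points to a different filesystem.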

We recommend using absolute paths in runtime_config.[jsonc | yml], but relative paths are also allowed and should be relative to $CODE_ROOT.

$CODE_ROOT contains the following subdirectories:

  • diagnostics/: directory containing source code and documentation of individual PODs
  • doc/: directory containing documentation (a local mirror of the documentation site)
  • src/: source code of the framework itself
  • submodules/: location to place 3rd-party submodules to run as part of the MDTF-diagnostics workflow
  • tests/: unit tests for the framework
  • templates/: runtime configuration template files
  • tools/: helper scripts for building ESM-intake catalogs, and other utilities
  • user_scripts/: directory where users can place custom preprocessing scripts

5. Run the framework

The framework runs PODs that analyze one or more model datasets (cases), along with optional observational datasets. To run the framework on the example_multicase POD, modify the example configuration file and run

cd $CODE_ROOT
./mdtf -f templates/runtime_config.[jsonc | yml]

The above command will execute the PODs included in the pod_list block of runtime_config.[jsonc | yml].

If you re-run the above command, the result will be written to another subdirectory under $OUTPUT_DIR, i.e., output files saved previously will not be overwritten unless you change overwrite in the configuration file to true.

The output files for the test case will be written to $OUTPUT_DIR/MDTF_Output/. When the framework is finished, open $OUTPUT_DIR/MDTF_Output/[POD NAME]/index.html in a web browser to view the output report.
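
To locate the reports after a run, the path pattern given above can be expanded with a shell glob. This is a sketch for sh or bash; list_reports is a hypothetical helper and relies only on the $OUTPUT_DIR/MDTF_Output/[POD NAME]/index.html layout described in the text.

```shell
# list_reports OUTPUT_DIR: print each POD report found under
# OUTPUT_DIR/MDTF_Output, one path per line.
list_reports() {
    for report in "$1"/MDTF_Output/*/index.html; do
        # An unmatched glob stays literal in sh/bash, so confirm the file exists.
        if [ -f "$report" ]; then
            echo "$report"
        fi
    done
    return 0
}

# Example call with a hypothetical output location:
#   list_reports "$HOME/mdtf/wkdir"
```

Each printed path can be opened directly in a browser, for example with xdg-open on Linux or open on macOS.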

You can specify your own datasets in the caselist block of the runtime config file and provide a catalog with the model data, or run the example_multicase POD on the synthetic data and associated test catalog specified in the configuration file. To generate the synthetic CMIP data, run:

mamba env create --force -q -f ./src/conda/_env_synthetic_data.yml
conda activate _MDTF_synthetic_data
pip install mdtf-test-data
mkdir mdtf_test_data && cd mdtf_test_data
mdtf_synthetic.py -c CMIP --startyear 1980 --nyears 5
mdtf_synthetic.py -c CMIP --startyear 1985 --nyears 5

Then, modify the path entries in diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.csv, and the "catalog_file": path in diagnostics/example_multicase/esm_catalog_CMIP_synthetic_r1i1p1f1_gr1.json, to include the root directory locations on your file system. Full paths must be specified.
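
The path edits can be done by hand or with sed. The sketch below assumes you have inspected the catalog files and identified the root directory they currently contain; rewrite_catalog_root and its arguments are hypothetical, and the substitution is only safe when neither root contains '|', '&', or other characters that sed treats specially.

```shell
# rewrite_catalog_root FILE OLD_ROOT NEW_ROOT
# Replace every occurrence of OLD_ROOT with NEW_ROOT in FILE and keep the
# original as FILE.bak. '|' is the sed delimiter so that '/' in paths needs
# no escaping.
rewrite_catalog_root() {
    file="$1"
    old_root="$2"
    new_root="$3"
    cp "$file" "$file.bak"
    sed "s|$old_root|$new_root|g" "$file.bak" > "$file"
}

# Example call with made-up roots; apply it to both the .csv and the .json file:
#   rewrite_catalog_root catalog.csv /old/root "$PWD/mdtf_test_data"
```

Compare the result against the .bak copy before running the framework.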

Depending on the POD(s) you run, the size of your input datasets, and your system hardware, run time may be 10 to 20 minutes.

6. Next steps

For more detailed information, consult the documentation site. Users interested in contributing a POD should consult the "Developer Information" section.

Acknowledgements

Development of this code framework for process-oriented diagnostics was supported by the National Oceanic and Atmospheric Administration (NOAA) Climate Program Office Modeling, Analysis, Predictions and Projections (MAPP) Program (grant # NA18OAR4310280). Additional support was provided by University of California Los Angeles, the Geophysical Fluid Dynamics Laboratory, the National Center for Atmospheric Research, Colorado State University, Lawrence Livermore National Laboratory and the US Department of Energy.

Many of the process-oriented diagnostics modules (PODs) were contributed by members of the NOAA Model Diagnostics Task Force under MAPP support. Statements, findings or recommendations in these documents do not necessarily reflect the views of NOAA or the US Department of Commerce.

Citations

Guo, Huan; John, Jasmin G; Blanton, Chris; McHugh, Colleen; Nikonov, Serguei; Radhakrishnan, Aparna; Rand, Kristopher; Zadeh, Niki T.; Balaji, V; Durachta, Jeff; Dupuis, Christopher; Menzel, Raymond; Robinson, Thomas; Underwood, Seth; Vahlenkamp, Hans; Bushuk, Mitchell; Dunne, Krista A.; Dussin, Raphael; Gauthier, Paul PG; Ginoux, Paul; Griffies, Stephen M.; Hallberg, Robert; Harrison, Matthew; Hurlin, William; Lin, Pu; Malyshev, Sergey; Naik, Vaishali; Paulot, Fabien; Paynter, David J; Ploshay, Jeffrey; Reichl, Brandon G; Schwarzkopf, Daniel M; Seman, Charles J; Shao, Andrew; Silvers, Levi; Wyman, Bruce; Yan, Xiaoqin; Zeng, Yujin; Adcroft, Alistair; Dunne, John P.; Held, Isaac M; Krasting, John P.; Horowitz, Larry W.; Milly, P.C.D; Shevliakova, Elena; Winton, Michael; Zhao, Ming; Zhang, Rong (2018). NOAA-GFDL GFDL-CM4 model output historical. Version YYYYMMDD[1]. Earth System Grid Federation. https://doi.org/10.22033/ESGF/CMIP6.8594

Krasting, John P.; John, Jasmin G; Blanton, Chris; McHugh, Colleen; Nikonov, Serguei; Radhakrishnan, Aparna; Rand, Kristopher; Zadeh, Niki T.; Balaji, V; Durachta, Jeff; Dupuis, Christopher; Menzel, Raymond; Robinson, Thomas; Underwood, Seth; Vahlenkamp, Hans; Dunne, Krista A.; Gauthier, Paul PG; Ginoux, Paul; Griffies, Stephen M.; Hallberg, Robert; Harrison, Matthew; Hurlin, William; Malyshev, Sergey; Naik, Vaishali; Paulot, Fabien; Paynter, David J; Ploshay, Jeffrey; Schwarzkopf, Daniel M; Seman, Charles J; Silvers, Levi; Wyman, Bruce; Zeng, Yujin; Adcroft, Alistair; Dunne, John P.; Dussin, Raphael; Guo, Huan; He, Jian; Held, Isaac M; Horowitz, Larry W.; Lin, Pu; Milly, P.C.D; Shevliakova, Elena; Stock, Charles; Winton, Michael; Xie, Yuanyu; Zhao, Ming (2018). NOAA-GFDL GFDL-ESM4 model output prepared for CMIP6 CMIP historical. Version YYYYMMDD[1]. Earth System Grid Federation. https://doi.org/10.22033/ESGF/CMIP6.8597

E. D. Maloney et al. (2019): Process-Oriented Evaluation of Climate and Weather Forecasting Models. BAMS, 100 (9), 1665–1686, doi:10.1175/BAMS-D-18-0042.1.

Disclaimer

This repository is a scientific product and is not an official communication of the National Oceanic and Atmospheric Administration, or the United States Department of Commerce. All NOAA GitHub project code is provided on an ‘as is’ basis and the user assumes responsibility for its use. Any claims against the Department of Commerce or Department of Commerce bureaus stemming from the use of this GitHub project will be governed by all applicable Federal law. Any reference to specific commercial products, processes, or services by service mark, trademark, manufacturer, or otherwise, does not constitute or imply their endorsement, recommendation or favoring by the Department of Commerce. The Department of Commerce seal and logo, or the seal and logo of a DOC bureau, shall not be used in any manner to imply endorsement of any commercial product or activity by DOC or the United States Government.

Contributors ✨

Thanks goes to our code contributors.
Thanks goes to these wonderful people (emoji key):

  • Dani Coleman ⚠️
  • John Krasting 👀
  • Aparna Radhakrishnan 🤔

This project follows the all-contributors specification. Contributions of any kind welcome!
