
CEFI Computing Guide Main Page

Yi-Cheng Teng - NOAA GFDL edited this page Sep 18, 2024 · 9 revisions

These pages are under construction. Comments on the content and/or contributions are welcome.

NOAA’s Climate, Ecosystems and Fisheries Initiative (CEFI) is charged with developing an end-to-end ocean prediction and decision support system to support NOAA’s marine resource and ecosystem mandates. To meet this challenge, CEFI is leveraging transformational investments in high performance computing (HPC), high capacity data storage, robust climate-scale modeling workflows and analysis environments, and an open development approach. These pages provide a roadmap for those within CEFI’s ocean modeling community looking to harness these resources and work productively and responsibly within the NOAA/CEFI community to produce a sustainable state-of-the-art modeling system for NOAA’s coastal ocean and marine resource mandates.

An Overview of CEFI’s Computing Resources and the resources on this page

CEFI’s core HPC resources are provided by the Research and Development component of NOAA’s High Performance Computing and Communications Program (HPCC). Specifically, the Inflation Reduction Act has enabled a substantial CEFI allocation on a new partition of NOAA’s Gaea supercomputer (“Gaea/C6”). This allocation provides transformational computational power for CEFI’s coastal ocean modeling and prediction efforts. It will support simulations required for all of CEFI’s core products, including the thousands of years of retrospective simulations and forecasts required to develop a robust national-scale coastal ocean prediction system. It will also provide a numerical laboratory for understanding ocean predictability and ensuring CEFI’s modeling system remains state-of-the-art. The links “Getting and Maintaining your RDHPCS and GFDL accounts” and “Remote Access” explain how to get established on these systems and how to access them.

The “Configuring and Running Models” link provides information on how to efficiently and responsibly run CEFI models on Gaea/C6. This includes an introduction to the Extensible Markup Language (XML) scripts (i.e., “xmls”) used within CEFI to organize the complex sets of model parameters, forcing, and outputs that go into each model simulation. These scripts effectively direct the workflows associated with each flavor of CEFI simulation (e.g., retrospective ocean simulations, seasonal predictions, decadal predictions, and multi-decadal projections). They were initially developed for the Geophysical Fluid Dynamics Laboratory’s (GFDL’s) global climate and earth system modeling efforts, including climate change projections contributing to the reports of the Intergovernmental Panel on Climate Change (IPCC), and they are now being leveraged by CEFI for its climate-scale workflows.

A second key component of CEFI’s ocean modeling and prediction system is data storage. Thousands of years of simulation generate petabytes of data, and CEFI is leveraging GFDL’s Archive file system (GFDL Archive) for this purpose. Archive provides primary storage for all of GFDL’s climate and earth system modeling activities and currently holds nearly 400 petabytes of data on high-capacity tape silos. Storage is as precious as HPC, and the “Data Storage and Archive” link provides important information on responsible use of this critical shared resource. The “Moving Data” section also provides information on transferring data from GFDL to other systems, including systems at the Fisheries Science Centers, other OAR labs, and the CEFI data portal.

A third key component of CEFI’s ocean modeling and prediction system is a robust platform for analysis of model output. Rigorous model analysis and diagnostics are required to derive new insights from CEFI models, ensure they are fit for the marine ecosystem applications for which they are intended, and facilitate the translation of model outputs into decision-relevant information for stakeholders. In addition to RDHPCS accounts, members of CEFI’s core modeling team have been given accounts on GFDL’s computing system, including the Post-Processing and Analysis (PPAN) cluster. The “Analyzing Model Output” link provides guidance on how to leverage GFDL’s various file systems and analysis software (e.g., Python, MATLAB, R, Ferret) for this process. It also provides links to GitHub-managed scripts developed for past CEFI analyses and documentation papers that can be leveraged for new analyses, and directions on how to contribute to these libraries moving forward.

CEFI leverages a number of GitHub-managed code bases described under “CEFI Model Components”, including version 6 of the Modular Ocean Model (MOM6), version 2 of the Sea Ice Simulator (SIS2), the Carbon, Ocean Biogeochemistry and Lower Trophics (COBALT) ocean biogeochemistry and plankton ecosystem model, and GFDL’s Flexible Modeling System (FMS), which links these components together and provides a range of basic functionalities. MOM6/SIS2 is now a community open-development model, and CEFI is enabling COBALT to reach this standard. The “CEFI Code Management” page provides information on how to utilize these codes and participate in the co-development process.

Getting Help

The first help line for the CEFI community is the CEFI community. There are multiple means of tapping into the collective experience of this community to resolve issues, all of them “judgment-free zones”. Chances are good that others have encountered the same issues that you have, so please don’t hesitate to bring them forward! First, please check the commonly encountered issues cataloged under “Frequently Asked Questions”. If these do not address the issue, please post it on the discussion forum on GFDL’s GitHub page. This forum is actively monitored and also archives the discussion for the benefit of future users. In addition, GFDL’s CEFI modeling team holds a weekly “office hour” on Fridays from 1 to 2 PM Eastern Time. E-mail is of course also an option, but we encourage users to engage through the GitHub forums when possible.

If an issue cannot be resolved by the CEFI team, it can be elevated to the GFDL helpdesk. Some issues may also require assistance from the NOAA RDHPCS helpdesk. In these cases, discussing the issue within the CEFI community first will lead to a more effective request to the GFDL and RDHPCS helpdesk teams.

A Note on CEFI values, Responsible Use and Best Practices

The most critical requirement for CEFI’s success is bringing together interdisciplinary teams of researchers from across NOAA and its partners in common purpose. This is the only way to realize the pioneering end-to-end ocean modeling and decision support system that CEFI envisions to support resilient coastal ecosystems and communities. Creating this community requires each member of CEFI’s modeling team to commit to building a constructive, supportive and positive environment consistent with NOAA’s ideals. The documents below lay out these principles. Thank you for being a part of the CEFI modeling team. We welcome comments and suggestions.

  • Code of Conduct for the CEFI modeling community and forums
  • Fair use policy for model codes and data
  • Co-authorship Guidance
  • NOAA Resources

Topics (under construction)

Additional Resource Links