Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JCB-based obs+bias staging, Jedi class updates, and marine B-matrix refactoring #2992

Open
wants to merge 150 commits into
base: develop
Choose a base branch
from

Conversation

DavidNew-NOAA
Copy link
Contributor

@DavidNew-NOAA DavidNew-NOAA commented Oct 8, 2024

Description

This PR is a companion to GDASApp PR #1312 (merged) and JCB-GDAS PR #31 (merged).

This PR does three things:

  1. It changes the observation and bias staging for the atmospheric analysis tasks to use JCB templates instead of reading the full JEDI input configuration dictionary in order to construct a list of files to stage. This is cleaner and places fewer constraints on how to initialize the analysis.
  2. The Jedi constructor now takes as input a dictionary that is essentially subset of the task_config dictionary. This makes the code clearer and less opaque and makes debugging easier. Each dictionary is constructed from a YAML file with configuration parameters for each JEDI application that is run.
  3. All JEDI applications and their input YAMLs are now initialized in the initialize job of the AtmAnalysis and AtmEnsAnalysis. Before, in the atmensanl* jobs for example, the LETKF solver was initialized in the atmensanlinitcjob, but the LETKF solver and FV3 increment converter were both initialized and executed in the atmensanlobs and atmensanlfv3inc jobs respectively. This makes more sense in terms of resource allocation.

Addendum:

I'm now rolling in the refactoring of the marine B-matrix task into this PR. That makes it also a companion of GDASApp PR #1346 and JCB-GDAS PR #36.

These new changes introduce the Jedi class and JCB into the marine B-matrix job.

Partially resolves GDASApp issue #1296

Type of change

  • Bug fix (fixes something broken)
  • New feature (adds functionality)
  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? YES
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO
    • EMC verif-global
    • GDAS #1346
    • GFS-utils
    • GSI
    • GSI-monitor
    • GSI-utils
    • UFS-utils
    • UFS-weather-model
    • wxflow

How has this been tested?

C96C48_ufs_hybatmDA CI runs successfully
GDASApp jjob tests pass successfully

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have documented my code, including function, input, and output descriptions
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • This change is covered by an existing CI test or a new one has been added
  • I have made corresponding changes to the system documentation if necessary

RussTreadon-NOAA and others added 30 commits August 28, 2024 17:01
Copy link
Contributor

@aerorahul aerorahul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks ok to me.

@WalterKolczynski-NOAA WalterKolczynski-NOAA added the CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera label Nov 12, 2024
@emcbot emcbot added CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress and removed CI-Hera-Ready **CM use only** PR is ready for CI testing on Hera CI-Hera-Building **Bot use only** CI testing is cloning/building on Hera labels Nov 12, 2024
@emcbot
Copy link

emcbot commented Nov 13, 2024

Experiment C96C48_ufs_hybatmDA FAILED on Hera in Build# 1 with error logs:

/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_a1dcfdf4/logs/2024022400/gdas_atmanlinit.log

Follow link here to view the contents of the above file(s): (link)

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Running **Bot use only** CI testing on Hera for this PR is in-progress labels Nov 13, 2024
@emcbot
Copy link

emcbot commented Nov 13, 2024

Experiment C96C48_ufs_hybatmDA FAILED on Hera in Build# 1 in
/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/EXPDIR/C96C48_ufs_hybatmDA_a1dcfdf4

@emcbot
Copy link

emcbot commented Nov 13, 2024

Experiment C96C48_hybatmaerosnowDA FAILED on Hera in Build# 1 with error logs:

/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_a1dcfdf4/logs/2021122018/enkfgdas_esnowrecen.log
/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_a1dcfdf4/logs/2021122018/gdas_aeroanlvar.log
/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_a1dcfdf4/logs/2021122018/gfs_aeroanlvar.log

Follow link here to view the contents of the above file(s): (link)

@emcbot
Copy link

emcbot commented Nov 13, 2024

Experiment C96C48_hybatmaerosnowDA FAILED on Hera in Build# 1 in
/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/EXPDIR/C96C48_hybatmaerosnowDA_a1dcfdf4

@emcbot emcbot added CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed and removed CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed labels Nov 13, 2024
@emcbot
Copy link

emcbot commented Nov 13, 2024

CI Failed on Hera in Build# 1
Built and ran in directory /scratch1/NCEPDEV/global/CI/2992


Experiment C96C48_ufs_hybatmDA_a1dcfdf4 Terminated with 0
FAIL
FAIL tasks failed and 1 dead at Wed Nov 13 02:06:39 UTC 2024
Experiment C96C48_ufs_hybatmDA_a1dcfdf4 Terminated: *FAIL*
Error logs:
/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/COMROOT/C96C48_ufs_hybatmDA_a1dcfdf4/logs/2024022400/gdas_atmanlinit.log
Experiment C96C48_hybatmaerosnowDA_a1dcfdf4 Terminated with 0
FAIL
FAIL tasks failed and 3 dead at Wed Nov 13 02:07:17 UTC 2024
Experiment C96C48_hybatmaerosnowDA_a1dcfdf4 Terminated: *FAIL*
Error logs:
/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_a1dcfdf4/logs/2021122018/enkfgdas_esnowrecen.log
/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_a1dcfdf4/logs/2021122018/gdas_aeroanlvar.log
/scratch1/NCEPDEV/global/CI/2992/RUNTESTS/COMROOT/C96C48_hybatmaerosnowDA_a1dcfdf4/logs/2021122018/gfs_aeroanlvar.log
Experiment C48mx500_3DVarAOWCDA_a1dcfdf4 Completed 2 Cycles: *SUCCESS* at Wed Nov 13 03:31:26 UTC 2024
Experiment C96_S2SWA_gefs_replay_ics_a1dcfdf4 Completed 1 Cycles: *SUCCESS* at Wed Nov 13 04:14:29 UTC 2024
Experiment C48_S2SW_a1dcfdf4 Completed 2 Cycles: *SUCCESS* at Wed Nov 13 07:59:30 UTC 2024
Experiment C48_ATM_a1dcfdf4 Completed 2 Cycles: *SUCCESS* at Wed Nov 13 08:48:03 UTC 2024
Experiment C48_S2SWA_gefs_a1dcfdf4 Completed 1 Cycles: *SUCCESS* at Wed Nov 13 09:11:07 UTC 2024
Experiment C96_atm3DVar_a1dcfdf4 Completed 3 Cycles: *SUCCESS* at Wed Nov 13 10:19:14 UTC 2024
Experiment C96C48_hybatmDA_a1dcfdf4 Completed 3 Cycles: *SUCCESS* at Wed Nov 13 10:56:30 UTC 2024

@DavidNew-NOAA
Copy link
Contributor Author

DavidNew-NOAA commented Nov 13, 2024

@WalterKolczynski-NOAA The issue with the CI failures is the update of the GDASApp hash, since the old hash points to some patches for its submodules. We should hold off on CI until #1362 is resolved. @RussTreadon-NOAA is using this branch as his baseline and is making rapid progress. Sorry, I should have let you known ealier.

@guillaumevernieres
Copy link
Contributor

@WalterKolczynski-NOAA The issue with the CI failures is the update of the GDASApp hash, since the old hash points to some patches for its submodules. We should hold off on CI until #1362 is resolved. @RussTreadon-NOAA is using this branch as his baseline and is making rapid progress. Sorry, I should have let you known ealier.

@DavidNew-NOAA , any reason why we need to have the new variable name as part of this pr? If not I'd suggest to revert to a working gdas.cd # so we can move the other pr's through the pipeline.

@DavidNew-NOAA
Copy link
Contributor Author

DavidNew-NOAA commented Nov 13, 2024

@guillaumevernieres This PR depends on a more recent GDASApp hash due to GDASApp PR #1312 (merged), but I think we could throw together a quick GDASApp PR that would at least fix these CI failures.

The enkfgdas_esnowrecen failure in C96C48_hybatmaerosnowDA is due to a bug in the FMS2 IO interface in FV3-JEDI. I made a quick bugfix PR #1304. It's approved, but I think Francois was just waiting for JCSDA CI to pass, which it did.

The failure in gdas_atmanlinit in C96C48_ufs_hybatmDA is due to lack of bias corrections for abi_g16, but that's commented out in the latest hash by GDASApp PR #1363.

The failure in gdas_aeroanlvar and gfs_aeroanlvar in C96C48_hybatmaerosnowDA is due to https://github.com/NOAA-EMC/GDASApp/blob/develop/parm/io/fv3jedi_fieldmetadata_restart.yaml having DELP rather than delp for its io name. That's the only one I'm not sure that if we change it, we'll break other stuff. Russ' https://github.com/NOAA-EMC/GDASApp/tree/feature/resume_nightly branch changes it, but it also updates all the JEDI hashes associated with the variable naming sprint. Any thoughts on that particular detail @RussTreadon-NOAA ?

@RussTreadon-NOAA
Copy link
Contributor

I've been logging progress in getting g-w DA CI to pass with DavidNew-NOAA:feature/jcb-obsbias using GDASApp feature/resume_nightly in GDASApp issue #1362. As noted in #1362 I am currently working through issues with gdas_marineanlvar.

@guillaumevernieres
Copy link
Contributor

@guillaumevernieres This PR depends on a more recent GDASApp hash due to GDASApp PR #1312 (merged), but I think we could throw together a quick GDASApp PR that would at least fix these CI failures.

The enkfgdas_esnowrecen failure in C96C48_hybatmaerosnowDA is due to a bug in the FMS2 IO interface in FV3-JEDI. I made a quick bugfix PR #1304. It's approved, but I think Francois was just waiting for JCSDA CI to pass, which it did.

The failure in gdas_atmanlinit in C96C48_ufs_hybatmDA is due to lack of bias corrections for abi_g16, but that's commented out in the latest hash by GDASApp PR #1363.

The failure in gdas_aeroanlvar and gfs_aeroanlvar in C96C48_hybatmaerosnowDA is due to https://github.com/NOAA-EMC/GDASApp/blob/develop/parm/io/fv3jedi_fieldmetadata_restart.yaml having DELP rather than delp for its io name. That's the only one I'm not sure that if we change it, we'll break other stuff. Russ' https://github.com/NOAA-EMC/GDASApp/tree/feature/resume_nightly branch changes it, but it also updates all the JEDI hashes associated with the variable naming sprint. Any thoughts on that particular detail @RussTreadon-NOAA ?

Ha ... yes ... Thanks for the reminders. I also just remembered that @RussTreadon-NOAA had to cook a saber/vader # in a fork ... Never mind my comment above!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI-Hera-Failed **Bot use only** CI testing on Hera for this PR has failed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants