Duplicate Checking in RDAS #258

Open
delippi opened this issue Jan 6, 2025 · 1 comment
delippi commented Jan 6, 2025

Description

In GSI, RRFS uses l_closeobs in the setup routines to perform duplicate checks. This option retains, at a given lat/lon/pressure, the observation that is closest to the analysis time. In JEDI we have so far been using the Temporal Thinning filter to perform duplicate checks, with the category variable (i.e., how the observations are grouped) set to MetaData/stationIdentification. That grouping will not always work. For example, a profiler may report many observations at the same lat/lon/station ID but at different pressures, and only a single observation will be retained from that profile. The category variable must be a string or an integer, and it cannot currently be a list of variables. I thought the solution proposed below would be easier than updating the code to accept more than one category variable.
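
To illustrate the grouping problem described above, here is a minimal Python sketch. It is not the Temporal Thinning implementation or the GSI l_closeobs code; it is a simplified stand-in that keeps, per group, the observation with the smallest time offset from the analysis time. The function `keep_closest_per_group`, the dictionary fields, and the sample values are all hypothetical.

```python
def keep_closest_per_group(obs, key_fn):
    """Keep, for each group, the observation closest to the analysis time.

    Simplified stand-in for a duplicate check; 'dt' is the (assumed) offset
    in seconds from the analysis time.
    """
    best = {}
    for ob in obs:
        key = key_fn(ob)
        if key not in best or abs(ob["dt"]) < abs(best[key]["dt"]):
            best[key] = ob
    return list(best.values())


# Hypothetical profiler report: one station, three pressure levels,
# plus a duplicate of the 500 hPa level that is closer to the analysis time.
profile = [
    {"station": "PROF1", "lat": 40.0, "lon": -105.0, "pressure": 85000.0, "dt": -600},
    {"station": "PROF1", "lat": 40.0, "lon": -105.0, "pressure": 70000.0, "dt": -600},
    {"station": "PROF1", "lat": 40.0, "lon": -105.0, "pressure": 50000.0, "dt": -600},
    {"station": "PROF1", "lat": 40.0, "lon": -105.0, "pressure": 50000.0, "dt": 120},
]

# Grouping by station ID collapses the whole profile to one observation.
by_station = keep_closest_per_group(profile, lambda ob: ob["station"])

# Grouping by (lon, lat, pressure) keeps one observation per level and
# removes only the true duplicate.
by_location = keep_closest_per_group(
    profile, lambda ob: (ob["lon"], ob["lat"], ob["pressure"])
)

print(len(by_station), len(by_location))
```

With station ID as the group key, the three-level profile is reduced to a single observation; with longitude, latitude, and pressure as the key, each level survives and only the duplicate at the repeated level is dropped.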

Proposed fix:
We can add an ad hoc variable, such as a string like "longitude_latitude_pressure", to the IODA file to be used as the category variable. This would be best placed in a Python bufr2ioda converter; however, we are currently using YAML-based converters. I am therefore proposing to add it to the offline_domain check code, and we will simply run that offline tool during the early stages of development.
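
A minimal sketch of how such a composite string could be built is shown here. The function name `make_category_keys`, the `decimals` argument, and the fixed-precision formatting are assumptions for illustration only; the issue does not specify them, and the step of writing the result into the IODA file is deliberately left out.

```python
def make_category_keys(lons, lats, pressures, decimals=2):
    """Build one "longitude_latitude_pressure" string per observation.

    Values are formatted at fixed precision (an assumed choice) so that
    tiny floating-point differences do not split true duplicates into
    separate categories.
    """
    keys = []
    for lon, lat, pressure in zip(lons, lats, pressures):
        keys.append(f"{lon:.{decimals}f}_{lat:.{decimals}f}_{pressure:.0f}")
    return keys


# Hypothetical sample values: two levels of the same profile.
keys = make_category_keys(
    lons=[-105.0, -105.0],
    lats=[40.0, 40.0],
    pressures=[85000.0, 70000.0],
)
print(keys)
```

The rounding precision effectively sets how close two observations must be to count as being at the same location and level, so it would need to be chosen per observation type.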

Acceptance Criteria (Definition of Done)

The Temporal Thinning filter can use longitude, latitude, and pressure (not just station ID) as the category variable, whether by adding the ad hoc variable in the offline domain check, a separate tool, or a Python converter, or by updating the Temporal Thinning code.

  • Link any relevant pull requests here:
    • PR #
    • PR #

Dependencies

None

delippi commented Jan 6, 2025

Are there any objections to adding the variable to the offline domain check and just using that for early cycling experiments? That is likely the easiest way forward until a more permanent solution is in place.
