Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revised e3sm_to_cmip --info handling. Details within. #199

Merged
merged 7 commits into from
Feb 16, 2024

Conversation

TonyB9000
Copy link
Contributor

@TonyB9000 TonyB9000 commented Jul 21, 2023

Modifications to main.py (_run_info_mode) include:

  1. Testing for presence of "info_out_path" as first option for output (otherwise defaults to "output_path" or stdout.)
  2. Reject mode-3 data test for MPAS data with error message
  3. Revise mode-3 data test output to provide table=, variable=, and DataSupport= (True,False) for each supporting variable.
  4. Created function "get_handler_info_msg(handler)" in util.py to replace 3 repetitions of the same 10 lines.
  5. Added "info" mode to util.py print_message for colorless output.
  6. Revised help text to clarify --info functionality.

(unrelated) In lib.py, added "if do_pbar" switch before instances of progress-bar codes to reduce batch noise.

Copy link
Collaborator

@chengzhuzhang chengzhuzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codes look fine by visual check. Tony, thanks for summarizing the code change, that makes it easier to review. Regarding the progress bar hack, I recall that I saw the discussion somewhere, but can't seem to find it. Would you add here why you implemented this.

@TonyB9000
Copy link
Contributor Author

@chengzhuzhang Regarding the progress bar (hack): When examining detailed output logs, I often uncovered multiple megabytes of output (vi for instance does not compress the progress-bar output) and this output has no value when running jobs in batch mode (no human to stare at the progress bar). I wanted to implement a "--quiet" or "--no-progress-bar" command-line option but didn't want to puzzle through it at the time. Probably a "self.progressbar" (True/False) attribute would work.

@TonyB9000
Copy link
Contributor Author

@tomvothecoder (I hope I did the issue-link correctly). Are you interested in reviewing the code mods, or should I merge?

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Jul 24, 2023

@tomvothecoder (I hope I did the issue-link correctly). Are you interested in reviewing the code mods, or should I merge?

Hey @TonyB9000, I usually do code review for e3sm_to_cmip. Were you hoping to have this merged soon to integrate into datasm?

I don't expect us to have a new release candidate by 8/1 (E3SM Unified deadline) unless there is an urgent bug fix.
We can take our time working on this PR and reviewing. I'll aim to review either on Wed (7/26) before I go on vacation or sometime next week when I get back

@TonyB9000
Copy link
Contributor Author

@tomvothecoder No Hurry. For datasm operation I can (I hope) just pip install my local e2c.

@TonyB9000
Copy link
Contributor Author

@tomvothecoder Sorry I abandoned tracking this. When I create the datasm conda env with "prod.yml", the value for e2c is specified as e3sm_to_cmip >=1.9.1, and in fact, what gets included is e3sm_to_cmip 1.11.1. It is currently failing for v2_1 amip, (77 datasets) with

ERROR:root:UTIL:ERROR checking variables: No data returned from e3sm_to_cmip --info:

I had inadvertently been running an environment that "pip installed" local e2c changes, and does not reflect your latest work.
E2C_NetworkGraph

I am going to attempt the "Use the command line to resolve conflicts" before anything else, and see how far I get.

@TonyB9000
Copy link
Contributor Author

@tomvothecoder "All checks passed" (FWIW). Please review. I know some changes in the "conflict-resolved" merge to "main" are yours and some are mine. I left out my addition, around line 710, that read

                    if handler['table'] in [ "CMIP6_Omon.json", "CMIP6_SImon.json" ]:
                        stat_msg = f"Cannot presently test for variable support in MPAS files"
                        print_message(stat_msg, status="info")

because I could not tell if subsequent changes made this obsolete.

Copy link
Collaborator

@tomvothecoder tomvothecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Tony, I had some minor comments and suggestions. After reviewing and addressing, feel free to merge if your testing works out.

Comment on lines 646 to 658
def get_handler_info_msg(self, handler):
msg = {
"CMIP6 Name": handler["name"],
"CMIP6 Table": handler["table"],
"CMIP6 Units": handler["units"],
"E3SM Variables": ", ".join(handler["raw_variables"]),
}
if handler.get("unit_conversion"):
msg["Unit conversion"] = handler["unit_conversion"]
if handler.get("levels"):
msg["Levels"] = handler["levels"]
return msg

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_handler_info_msg(self, handler):
msg = {
"CMIP6 Name": handler["name"],
"CMIP6 Table": handler["table"],
"CMIP6 Units": handler["units"],
"E3SM Variables": ", ".join(handler["raw_variables"]),
}
if handler.get("unit_conversion"):
msg["Unit conversion"] = handler["unit_conversion"]
if handler.get("levels"):
msg["Levels"] = handler["levels"]
return msg

This looks like a duplicate of util.get_handler_info_msg and is not being used in this module.

It can be deleted.

@@ -648,44 +665,30 @@ def _run_info_mode(self): # noqa: C901

# if the user just asked for the handler info
if self.freq == "mon" and not self.input_path and not self.tables_path:
messages = []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

messages = [] can be defined once outside the conditional.


# if the user asked if the variable is included in the table
# but didnt ask about the files in the inpath
elif self.freq and self.tables_path and not self.input_path:
messages = []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
messages = []

file_path = next(Path(self.input_path).glob("*.nc"))

messages = []
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
messages = []


elif self.freq and self.tables_path and self.input_path:

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

messages.append(hand_msg)
else:
stat_msg = f"Table={handler['table']}:Variable={handler['name']}:DataSupport=FALSE"
print_message(stat_msg, status="info")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
print_message(stat_msg, status="info")
logger.info(stat_msg)

Can we use the logger object defined at the top of this module instead of print_message()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also should we still print out stat_msg if it is None?

Or should it be within the else: statement to avoid printing None?

# We test here against the input "freq", because atmos mon data satisfies BOTH
# CMIP6_day.json AND CMIP6_mon.json, but we only want the latter in the "hand_msg" output.
# The variables-check below is insufficient as vars "hass" and "rlut" have multiple freqs.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

Comment on lines 724 to 726
# We test here against the input "freq", because atmos mon data satisfies BOTH
# CMIP6_day.json AND CMIP6_mon.json, but we only want the latter in the "hand_msg" output.
# The variables-check below is insufficient as vars "hass" and "rlut" have multiple freqs.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if these comments are within 80 characters per line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In total, no. But does one begin measuring at the "#" or at column 1? (Never knew very long comment lines to be an issue.)

if self.freq == "day" and handler['table'] == "CMIP6_Amon.json":
continue

hand_msg = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
hand_msg = None

Don't need to set the default value of hand_msg to None.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - true of "stat_msg" as well, "Defensive Programming"? (someone later changes code in a manner that causes a exception because "None" was not assigned?)

Unless a portion of code is in a tight-loop (may be executed millions of times), I an comfortable with defensive programming techniques. Too much minimalism has its drawbacks - and most compilers (I assume) elide unnecessary codes.

continue

hand_msg = None
stat_msg = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
stat_msg = None

Probably don't need to set the default value of stat_msg to None if we don't want to print out None values.

@TonyB9000 TonyB9000 merged commit fd821e1 into master Feb 16, 2024
3 checks passed
@TonyB9000 TonyB9000 deleted the fix_e3c_info_mode branch February 16, 2024 20:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

--info option fails on some inputs with TypeError: expected str, bytes or os.PathLike object, not NoneType
3 participants