Adds netCDF3 vs netCDF4 distinction to _automatically_determine_filetype. #43
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #43 +/- ##
==========================================
+ Coverage 87.66% 88.14% +0.48%
==========================================
Files 13 13
Lines 851 869 +18
==========================================
+ Hits 746 766 +20
+ Misses 105 103 -2
with netCDF4.Dataset(filepath, "r") as dataset:
    if dataset.data_model == "NETCDF4":
        filetype = "netCDF4"
    elif dataset.data_model == "NETCDF3_CLASSIC":
        filetype = "netCDF3"
    else:
        raise NotImplementedError(
This is cool! Presumably this has to open the file to read this data_model
information? Is that likely to incur any significant cost over the opening that kerchunk
already needs to do?
I think the read done by the NetCDF4 library is only reading attributes and not loading the entire file. Not 100% sure on this though.
You can do this without opening the file with the netCDF4 library at all. Both file types start with a "magic" byte sequence at the beginning of the file.
def guess_file_type(fp) -> FileType:
    magic = fp.read(4)
    fp.seek(0)
    if magic[:3] == b"CDF":
        return FileType.netcdf3
    elif magic == b"\x89HDF":
        return FileType.hdf5
    else:
        raise ValueError(f"Unknown file type - magic {magic}")
Curious on what you think of this @TomNicholas.
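For anyone who wants to try the magic-byte idea in isolation, here is a minimal self-contained sketch. The `FileType` enum below is a hypothetical stand-in (the project's real type is not shown in this thread), and the in-memory buffers are made-up examples rather than real datasets. It assumes a seekable binary file object and an HDF5 signature at offset 0, so HDF5 files that begin with a user block would not be recognised. Note also that an HDF5 signature only says the container is HDF5, not that the contents follow the netCDF4 conventions.

```python
import io
from enum import Enum, auto


class FileType(Enum):
    # Hypothetical stand-in for the project's own file-type enum.
    netcdf3 = auto()
    hdf5 = auto()


def guess_file_type(fp) -> FileType:
    """Classify a seekable binary file object by its leading magic bytes."""
    magic = fp.read(4)
    fp.seek(0)  # rewind so the caller can read from the start afterwards
    if magic[:3] == b"CDF":
        # Classic netCDF files begin with b"CDF" followed by a version byte.
        return FileType.netcdf3
    if magic == b"\x89HDF":
        # The full HDF5 signature is b"\x89HDF\r\n\x1a\n"; only the
        # first four bytes are compared here.
        return FileType.hdf5
    raise ValueError(f"Unknown file type - magic {magic!r}")


# In-memory buffers stand in for real files in this sketch.
nc3_buffer = io.BytesIO(b"CDF\x01" + b"\x00" * 16)
hdf5_buffer = io.BytesIO(b"\x89HDF\r\n\x1a\n" + b"\x00" * 16)

print(guess_file_type(nc3_buffer))
print(guess_file_type(hdf5_buffer))
```

Only four bytes are read, so the cost is one small read plus a seek, regardless of file size. With remote storage the same function should work on any file-like object that supports `read` and `seek`, though that is untested here.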
netCDF4 dependency to tests. 🤷