-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some fixes for OPeNDAP attributes #48
Conversation
I've gotten testing setup so that we can spin up Xpublish in the background and test real clients against it. Works on xpublish-community#18
Thanks for the PR @noaaroland ! It would be great to add a test with example problematic attributes, though I think we would need to get #20 working so that we can serve a dataset to clients that would have issues. |
Where does this data set dap_client.get("/datasets/air/opendap.dds") come from? Is it part of the xpublish distribution? I have small data files which illustrate the problem. How would I reference them from a test? |
I'm guessing you are looking at? xpublish-opendap/tests/test_opendap_router.py Lines 33 to 34 in bcd0133
That's using the Py.test fixture system so that we can reuse shared code across multiple tests. Those fixtures are setup here: xpublish-opendap/tests/test_opendap_router.py Lines 9 to 30 in bcd0133
We're setting up a mock server and client that we can make some requests of. It doesn't actually expose a port, so we can't pass it to Xarray or NetCDF and try to see how different engines parse things. That's what I tried to do in #20 |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #48 +/- ##
==========================================
+ Coverage 92.04% 92.92% +0.88%
==========================================
Files 3 3
Lines 88 99 +11
==========================================
+ Hits 81 92 +11
Misses 7 7 ☔ View full report in Codecov by Sentry. |
I'm not sure what you're telling me regarding the getting #20 working or how I could help, but it occurs to me that the change I made relates to a data which which at a word in double quotes in a global attribute. See the word history in history_of_appended_files global attribute here: https://data.pmel.noaa.gov/aclim/thredds/dodsC/B10K-K20_CORECFS/Level1/1970-1974/B10K-K20_CORECFS_1970-1974_average_temp.nc.das Perhaps one could test that case by adding such an attribute to the data set you're building. E.g.
The other modifications I made were because attributes were falling through to the else which were not strings. Eventually I went testing for instanceof str directly, and falling through to the else as it was before. I feel like this might not be ideal in the end. Let me know specifically how you'd like me to help out and I'll try to help improve the plugin. |
Builds upon xpublish-community#48 by merging in the live server testing from xpublish-community#20 (with a few of the failing tests disabled from there that are in response to another issue). The live server has an additional dataset that is specifically focused on testings if attributes with double quotes are correctly round tripped via the OpenDAP plugin. Merge branch 'test-server' into quote-with-server
Ok, maybe I didn't get my thoughts down well. I just made a PR to your branch with most of the changes from #20 merged in. I went and disabled the changes from #20 that were causing failing tests (what we were trying to figure out in #18 ), but left the bits active that allow us to run a live server in our testing process. That allows us to see how real clients interpret things like attributes (and may help with debugging your question here too xpublish-community/xpublish#246 ). With the test server, I created a new dataset for the server to test how quotes get formatted in attributes, and a test that checks that. Hopefully that makes a bit more sense. If that sounds good, you can land noaaroland#1 onto your branch. Then I would suggest that you add datasets and tests for each one of the attribute format conversions to make sure they can be appropriately formatted by the server and loaded on the client. |
Use test server to make sure that attributes with quotes are escaped
for more information, see https://pre-commit.ci
Separate quote and casts tests. Co-authored-by: Alex Kerney <[email protected]>
for more information, see https://pre-commit.ci
@abkfenris, it seems that the two issues just posted by @dblodgett-usgs might be fixed by this PR: Note that the endpoint that @dblodgett-usgs was testing is our deployment of https://github.com/xpublish-experiments/Catalog-To-Xpublish developed by @xaviernogueira and that @sjordan29 will maintain. How might we help move this PR forward? |
Whoops, I hadn't seen that there had been more progress here. The first issue I believe is with a Catalog-to-Xpublish specific route, so I'm going to try to nudge that over there, but you're right in that the second one may be solved by this. @noaaroland have you managed to deal with your attribute issues with this? Sorry I had missed that you had put further work into this. |
I could get the code to run without error based on the changes I made to the way attributes are treated, but it does still not return the correct OPeNDAP data structures for my use case. See: xpublish-community/xpublish#246 If you can whip this pull request into shape and use it please do, but I pretty much moved on when I didn't get any feedback. |
Sorry, I don't know quite enough about how Xarray decides if something should be an coordinate and/or dimension when reading OpenDAP, so I pinged some folks offline about xpublish-community/xpublish#246 I'll try to give folks a nudge again to take a look. I can try to take care of the PR, but it may take me a few weeks as I'm about to head out on an expedition. |
Thank you. In my understanding of an old-school OPeNDAP server it's not really making an decisions about what is a coordinate, data variable or whatnot. The client figures out what's what by applying the convention by interpreting the attributes and looking at the structure (like is a variable the same name as the dimension). The server is just serving the attributes, structure and hunks of data as the client requests. In this implementation since it's xarray on both ends, it gets confusing as to which piece is responsible for organizing the xarray object according to the conventions. I hope we can get help sorting it out. Good luck with your expedition. |
Additionally fix pre-commit failures, and allow passing additional arguments into pytest/nox.
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Thank you again @noaaroland for your work on this, and I'll try to get someone to take a look at the dimensions/coordinates issues you've had. |
It seems that double quotes in a String attribute value need to be escaped. The THREDDS Data Server escapes double quotes and the OPeNDAP client I tested (netCDF 4 C-library) fails when reading an attribute with a double quote.
Some old-school netCDF files have np.float32 as attribute values so these need to be accounted for when building the OPeNDAP object (other data types too probably).