-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
identical descriptor files are incorrectly associated with different time axes #1984
Comments
Hello Andrew. Noted and we will be in touch with more questions. |
[editing to say the similarity may be with TSERIES on-the-fly aggregations, not FMRC aggregations.] Sure, Andrew and Eugene. I have a thought or two and will get with Joshua to see if we can see what this is about. At a guess, the change in behavior with V7.6 comes from work having to do with time-axis handling for datasets defined as TSERIES aggregations, which share some of the descriptor-file handling machinery. If so it sounds like something has just had the effect of pointing out an existing bug and making it happen consistently. |
Hi Andrew, |
Hi Andrew, Thank you for your patience as I learn the ropes. The issue here had to do with some memory allocation added along with the DSG functionality like @ACManke suggested. After patching it up, using your example, I get:
I also checked the fix against combining two netCDF files to ensure there is no redundancy in the time axis.
|
fixing dsg memory issue for time axes: issue #1984
Thanks Joshua and Ansey, this fix seems to be working as advertised. But one loose end: why do the variables appear to be on two different grids? (These are named GBB1 and GJC1 in your example.) They should be on the same grid, since they share all the same axes. Is this simply an oversight in the new code, not recognizing that the grids are the same? In my example above with the old Ferret v7.52, both variables appeared on the same grid, named GMO1 in that case. But with the new v7.63 I get:
where the two "apparently" (but not actually) different grids are here named GQV1 and GMO1. |
Any progress on this loose end? |
Hello Andrew,
my apologies for this oversight - will focus on this tomorrow.
Eugene
…--------------------------------------------------------------------
Eugene F Burger
206.526.4586
Associate Director of IT, NOAA Pacific Marine Environmental Laboratory
7600 Sand Point Way NE
Seattle WA 98115
https://www.pmel.noaa.gov
---------------------------------------------------------------------
On Thu, Feb 4, 2021 at 4:37 PM AndrewWittenberg ***@***.***> wrote:
Any progress on this loose end?
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#1984 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD5FHIVG63QH2KBOBHI7433S5M4TDANCNFSM4TU5IADA>
.
|
Hi Andrew, What do you see when you instead try I get:
So they are after all on the same grid once Ferret reads both. The grid names are given by Ferret when the datasets are loaded, and should be unique to each grid. I'll keep looking into why this behavior is different across versions. Josh Edit: this bug occurs in v7.2 as well |
Indeed you're right! I hadn't checked that. The grid of the first variable gets assigned the name of the new matching grid, rather than vice versa as I was expecting. So this is a much smaller issue that I had thought. And as you said, I just tested this in earlier versions and it also seems to be the case in v7.52 and all the way back to v6.82 (August 2012) and probably earlier. So it's not a new bug. I don't think any of my scripts use the grid names directly; instead they refer to the grid indirectly via the variable name, e.g. It seems a bit unusual to be changing the names of existing grids out from under variables that have already been loaded. Unless there's a compelling reason to do so, we might want to give the newer (matching) grid the pre-existing name, instead of giving the pre-existing grid the new name. I may dimly recall something Ansley might have said about this, but I'm not sure. Could drop her a quick note to ask. But bottom line, I'd suggest resolving this issue and then opening a new "minor" and "low-priority" one with the above suggestion. Given that this is longstanding behavior with no obvious consequences, I'd only change it if it looked really easy, with no loose ends! If it's difficult, or opens up a can of worms, then just let it be. Thanks for your detective work! |
It's always so useful to be able to go back to older versions. I think it'll be best to leave this one alone. The process of defining and checking grids for datasets has a number of steps. Grid information is kept on a linked list: get the next slot in the the list; set up the grid, filling in its axes; see if the grids are consistent for variables within the dataset that's being initialized; and finally check to see if they're identical to any existing grid. You'd think that if so that we'd use the previously defined grid, but instead the choice was made to use the new grid and assign the new grid to any variables using the old grid, and mark the old grid as unused. It may be that this was an arbitrary choice, but I think, without studying things, that it's a matter of not cleaning up old stuff but always moving forward in the list. The unused slot might be used again if needed, now that it's marked unused, but as a new element on the list. I believe the linked-list logic that's used wants to go forward not back-and-forth. I don't know if that makes any sense, or even if it's accurate, but it's my intuition. Still might be worth a new ticket, just so there can be an issue description that's correct for this bit of it, but mark it as low priority, and note that it may be more trouble than it's worth. |
Thanks Ansley for weighing in! That sounds good to me. |
This bug is crashing our automated analyses at GFDL, so I'm requesting the highest priority. It's probably worth reaching out to Ansley for initial clues, as she may have done some recent work on this.
Setup: create two NetCDF files, and then wrap them in descriptors using make_des:
The resulting descriptors point to NetCDF files with identical grids, so they ought to appear on the same grid (and axes) when we use them in Ferret.
But the new Ferret v7.6 puts them on different time axes (TIME1 and TIME2):
The previous version (Ferret v7.52, dated 12/03/2019) showed inconsistent behavior, depending on the order of commands:
That fails (giving TIME1 & TIME2), but this works (both grids on TIME2):
For both command orderings to produce identical grids (both on TIME1), we have to go all the way back to Ferret v7.3 (dated 12/04/2017). What's going on?
The text was updated successfully, but these errors were encountered: