Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patching: Increasing size of existing array produces corrupt namelist #80

Open
letmaik opened this issue May 29, 2018 · 12 comments
Open

Comments

@letmaik
Copy link
Contributor

letmaik commented May 29, 2018

Input:

&sect
ids = 1, 2
foo = 10
bar = 20
/

Patch:

patch = {'sect': {
    'ids': [3, 4, 5],
    'foo': 6.0,
    'bar': 7.0
}}

Patched output:

&sect
ids = 3, 4
5 = 6.0
bar = 7.0
/

Patch command: f90nml.patch("input.nml", patch, "output.nml")

@letmaik
Copy link
Contributor Author

letmaik commented May 29, 2018

I just noticed that this only happens when the array in the patch has a bigger size than in the input. If it's equal or smaller everything works.

@letmaik letmaik changed the title Patching of array differing in length produces corrupt namelist Patching: Increasing size of existing array produces corrupt namelist May 29, 2018
@marshallward
Copy link
Owner

Yes, I can reproduce this one.

This will be hard to fix because the function does not know if a token is a value or a variable until it reaches the next =, and the patching does not currently lookahead, instead streams the output very quickly. It will take some time to rewrite this to do the lookahead.

@chrisb13
Copy link

Firstly: amazing work @marshallward, what a great utility! Likewise to @aekiss for nmltab.

I have this issue too (version 1.0.2 and dd2491e), I came onto this page planning to report it.

If possible, I'd prefer the fix to preserve the one line per each variable name. Here's an example, currently:

Input:

!-----------------------------------------------------------------------
&nam_diaharm   !   Harmonic analysis of tidal constituents ('key_diaharm')
!-----------------------------------------------------------------------
    nit000_han = 1         ! First time step used for harmonic analysis
    nitend_han = 75        ! Last time step used for harmonic analysis
    nstep_han  = 15        ! Time step frequency for harmonic analysis
    tname(1)   = 'M2'      ! Name of tidal constituents
    tname(3)   = 'ZZ'
/

Patch (created using @aekiss's nmltab)

OrderedDict([('nam_diaharm', OrderedDict([('nit000_han', 20), ('nitend_han', 60), ('nstep_han', 540), ('tname', ['M2', 'S2', 'N2', 'K2'])]))])

Current output

!-----------------------------------------------------------------------
&nam_diaharm   !   Harmonic analysis of tidal constituents ('key_diaharm')
!-----------------------------------------------------------------------
    nit000_han = 20         ! First time step used for harmonic analysis
    nitend_han = 60        ! Last time step used for harmonic analysis
    nstep_han  = 540        ! Time step frequency for harmonic analysis
    tname(1)   = 'M2'      ! Name of tidal constituents
    'S2'(3)   = 'ZZ'
/

Ideal output after fix

!-----------------------------------------------------------------------
&nam_diaharm   !   Harmonic analysis of tidal constituents ('key_diaharm')
!-----------------------------------------------------------------------
    nit000_han = 20         ! First time step used for harmonic analysis
    nitend_han = 60        ! Last time step used for harmonic analysis
    nstep_han  = 540        ! Time step frequency for harmonic analysis
    tname(1)   = 'M2'      ! Name of tidal constituents
    tname(2)   = 'S2' 
    tname(3)   = 'N2'
    tname(4)   = 'K2'
/

The current new file (non-patch) approach compresses tname into one line (but otherwise works great), i.e.:

&nam_diaharm
    nit000_han = 20
    nitend_han = 60
    nstep_han = 540
    tname(1:4) = 'M2', 'S2', 'N2', 'K2'
/

More broadly, it would be great if there was some error checking that the patched file contains the patch contents, perhaps this already occurs? I'm thinking basic things like, number of changes in patch equals number of different lines in namelist using an independent check, e.g.:
diff -y --suppress-common-lines fileone filetwo | grep '^' | wc -l
I do a quick very basic check currently but personally found it's not easy to do without re-writing parts of f90nml. Am I re-doing something that already exists?

@marshallward
Copy link
Owner

Yeah, I'd agree this is the same issue. In your case it's sort of "working as intended" since I seem to have intentionally avoided patching values that don't already exist. (~L600). I seem to handle the case of a patch which is a subset of the original vector, but not when the patch is larger.

I guess I should probably say that I originally only considered patch as a means to modify existing content (and, most importantly, preserve comments), so there are going to be a lot more cases where adding and removing content won't work.

As for displaying indexed content, there is a separate issue related to this issue (#24) but I don't know when (or if) it could be resolved. The current index writing is very clunky and ad-hoc stuff, and to generalise it will probably take some major effort on my part. Not to say that I don't want to do it, but the issue is over 2 years old and I don't really see it happening anytime soon.

And for the error checking, I don't think I have anything to add. From my perspective, if the patched file doesn't have the contents, then it's a bug to be fixed :). But maybe there's something in the test suite.

Long story, but just a heads up that #77 and #80 (and #24) might be hard to resolve, and could take a long time to sort out. I expect it might take a bigger rewrite than I really have in me right now.

@marshallward
Copy link
Owner

marshallward commented May 30, 2018

I've pushed a change (18f5d2d) which resolves this bug, although it doesn't resolve the issue of extending the vector.

I don't think the solution is a great one, because it creates a new iterator after every index, and some of the index arithmetic seems dodgy to me, but it seems to work so far.

More importantly, there is now some awareness of the existing vector length, so it is going to be easier to extend the vector. I think appending values will probably not be difficult, but prepending, as in @chrisb13's case, could be more of a problem.

@chrisb13
Copy link

Thanks for following up.

When I run @letmaik's example the new version (18f5d2d) works, i.e.:

ids = 3, 4
foo = 6.0
bar = 7.0
/

As I think you mentioned (Re: prepending), my version still doesn't work with:

!-----------------------------------------------------------------------
&nam_diaharm   !   Harmonic analysis of tidal constituents ('key_diaharm')
!-----------------------------------------------------------------------
    nit000_han = 20         ! First time step used for harmonic analysis
    nitend_han = 60        ! Last time step used for harmonic analysis
    nstep_han  = 540        ! Time step frequency for harmonic analysis
    tname(1)   = 'M2'      ! Name of tidal constituents
    'S2'(3)   = 'ZZ'
/

Irrespective of the example and particularly if a complete fix is impractical. I'd appreciate some kind of warning that the patch was only partially applied (even with @letmaik's example ids is truncated).

@marshallward
Copy link
Owner

I'd say the main problem is that the patch method currently would not realise that anything was skipped, so could not issue a warning. This is because it pops the entire tname vector from the patch while parsing the tokens. It changes the existing values, but chucks all the rest.

It is possible in principle to save these values, since we know when the index is outside the bounds of the patch and (at the moment) skip over thosee values. Any missing values could be re-applied to the patch namelist, which is probably the easiest solution.

But I don't think it will ever be smart enough to simply replace, say, tname(1) = 'M2' with tname(1:4) = 'M2', 'S2', 'N2', 'K2'. This is because I handle each token in sequence, and either write it to the output file or replace it with a patched value. Also, these missing elements could re-appear later in the namelist.

I should add that I've been working on a new token parser which is (hopefully) much more organised, and not the ad-hoc mess of the current parser. Ideally I'd rather try to sort out these changes to patching in the new parser. But if there's some urgency here, then I can try to implement something a little more quickly.

@chrisb13
Copy link

Okay, cool, thanks for the update.

But if there's some urgency here, then I can try to implement something a little more quickly.

No urgency for my application, actually my use of the package is done for the time being (I've just hacked around the issue). I'm sure I'll use it again at some point though. Thus, happy to wait for the new parser.

@amicitas
Copy link

amicitas commented Mar 5, 2019

It turns out that this bug (the ability to expand arrays as in the original bug report) affects my application in a critical way. For the time being instead of patching the file, I will need to just use a read->update->write process. This gets the values right, but of course strips out all the formatting and comments. I look forward to the new parser!

@lisha992610
Copy link

@marshallward This is such amazing tool.

I am working with the fortran namelist for WRF. there is such string in namelist that I would like to modify
start_date = '2016-04-30_00:00:00','2016-04-30_00:00:00','2016-04-30_00:00:00','2016-04-30_00:00:00','2016-04-30_00:00:00'
while reading and parsing, it will be recognized as a list of 5 elements:
nml['start_date'] = ['2016-04-30_00:00:00', '2016-04-30_00:00:00', '2016-04-30_00:00:00', '2016-04-30_00:00:00', '2016-04-30_00:00:00']
but when I tried to replace it with a list of the same length, only half of the list is updated using "patch" function.
Result: end_date = '2017-04-28_00:00:00','2017-04-28_00:00:00','2017-04-28_00:00:00','2016-04-30_00:00:00','2016-04-30_00:00:00'

the whole list should be 5 times '2017-04-28_00:00:00', but last 2 elements are not be updated.

best regards

@marshallward
Copy link
Owner

@lisha992610 could you post an example of the namelist file and the Python commands that you are executing?

@marshallward
Copy link
Owner

@lisha992610 I've manated to reproduce this problem in #107, let's move the discussion there since it's probably an unrelated issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants