Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto-generated cython_numpy wrapper #6

Open
wants to merge 37 commits into
base: master
Choose a base branch
from
Open

Auto-generated cython_numpy wrapper #6

wants to merge 37 commits into from

Conversation

jwoillez
Copy link
Contributor

I added a new wrapping experiment 'cython_numpy_auto' that adds cython code auto-generation for the same approach as 'cython_numpy'.

To test drive:

  • Auto-generate the cython code by running python cython_generator.py in directory cython_numpy_auto
  • Install the wrapper with python setup.py install --user or similar command.
  • Run the test with python test_erfa.py in directory tests.

Note:
You must have the erfa sources installed next to erfa-wrapping-experiments for the auto-generation to work.

To Do List:

@jwoillez jwoillez changed the title Musing with automation Auto-generated cython_numpy wrapper Jul 29, 2014
@astrofrog
Copy link
Contributor

Thanks for working on this! I'm cc-ing @taldcroft, @mdboom and @embray as they might be interested in reviewing this.

def __init__(self, name, source_path):
self.name = name
self.pyname = name.split('era')[-1].lower()
self.filename = name.split("era")[-1].lower()+".c"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm following along correctly, this means that it's opening the .c file to find the signature for the function? It should probably open the .h file, as a system installation of erfa will generally have the headers, but not the source code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct. The reason for parsing the .c file instead is that SOFA/ERFA has properly documented the input/output status of each parameter. It would also let us generate the documentation for each auto-generated function.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah -- having gone and looked at the SOFA/ERFA code, I see what you're saying now, and I see why that's the only real alternative. Not to sound like a curmudgeon, but SOFA is totally doing it wrong. The .h files should include the documentation, not the .c files.

At the end of the day, it probably matters little, because I suspect we would generate the .pyx file, and its resulting .c file through Cython, and ship that in our tarball, so the end user won't need the SOFA/ERFA .c files to build astropy. But it might be worth a note to the SOFA folks about this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree with you on the documentation location. We could also relocate the documentation when deriving ERFA from SOFA. This could help those that will use or develop with ERFA directly.

Will open a separate issue against ERFA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You already did! :)
liberfa/erfa#24

@mdboom
Copy link

mdboom commented Jul 29, 2014

Cool stuff!

@mdboom
Copy link

mdboom commented Jul 29, 2014

Thanks for working on this -- hope my comments don't seem to nitpicky -- indeed I think this is a really good approach overall.

@jwoillez
Copy link
Contributor Author

http://jinja.pocoo.org is very nice indeed!

@astrofrog
Copy link
Contributor

This looks very nice! As a further test, it would be interesting to replace the Cython wrappers for SOFA that are already in Astropy with the auto-generated versions and see if the Astropy tests still pass. The routines are in https://github.com/astropy/astropy/blob/master/astropy/time/erfa_time.pyx

@jwoillez
Copy link
Contributor Author

Good idea!

I had a quick look at erfa_time.pyx. The existing wrapper does not have a standard way of converting the ERFA names into python names. Could we agree on a standard interface to help with the exercise? Maybe something like this: eraName -> name? If needed, I can work on a PR for this on the https://github.com/astropy/astropy side, as a starting point.

@eteq
Copy link
Member

eteq commented Aug 4, 2014

@mdboom - see #2803 where I just made the same comment before you said this. I think it makes sense to do something like this:

  • C-function name: eraGmst00
  • "standardized" name: astropy.erfa.gmst00
  • "long-form" name (not always present): astropy.erfa.greenwich_mean_sid_time_iau00

However, we only assign "long-form" names manually for function we actually want to use (it should be a trivial task, like just adding an entry to a dictionary).

In the docs, the function would live under its long-from name if it has one, and if not, it's standardized name.

An alternative is to have these be separate namespaces. I.e., the long form names live in astropy.erfa.useful_name, but the "standardized" names are in astropy.erfa.std.sdznm.

@eteq
Copy link
Member

eteq commented Aug 4, 2014

@jwoillez - This is fantastic! In addition to @astrofrog's suggestion, I suggest you also try timing the current tests vs. the approach here. That is, see if one runs noticeably slower or faster than the other. I would think they shouldn't "in theory", but theory and practice often disagree...

Perhaps you can just by-hand implement the mapping from what this generates to the current names, just so we can see if there are any major compatibility or speed problems? If not, I think I'm happy to just say lets go with this approach!

@eteq
Copy link
Member

eteq commented Aug 4, 2014

One procedural question for the crowd: do we want these to be auto-generated a la how we generate Cython code? That is, as part of the build process, or is this a script that should be just run by hand when we want to update the source code in astropy? I tend towards the latter, because then we don't need to put jinja in astropy.extern.

@jwoillez
Copy link
Contributor Author

jwoillez commented Aug 5, 2014

If we go forward with longer function names, I would also manually add input/output specifications for the functions arguments. The use of the documentation to deduce this seems rather fragile.

I am not quite happy yet with the way matrices and vectors are handled. At the moment, you have to use rec arrays. A tri-vector v3 would have dtype=[('','d',(3,))], rather than have its inner dimension be of size 3: [...,3]. There seems to be a new multi-iterator that would let me deal with this, but the iterator cannot, from what I understand and at the moment, return anything but scalars. I may need to poke some numpy guru...

Last, I will be away for a month or so. Be patient, or take over! ;)

@eteq
Copy link
Member

eteq commented Aug 5, 2014

Ah, good point about the input/output specifications. More generally, it probably makes sense to write actual docstrings for the "longer name" versions. When I think about it, this may just be best to be kept simple by just doing something like:

def greenwich_mean_sid_time_iau00(*args, **kwargs):
    """
    Compute the mean sidereal time at Greenwich using the IAU 2000 algorithm.

    ERFA function: gmst00

    Parameters
    ----------
    ...
    """
    from . import gmst00
   return gmst00(*args, **kwargs)

In case someone else wants to finish this while you're away, @jwoillez, one clarification question: are you saying the need for a matrix comes from a limitation of cython's conversion to c arrays, or with how ERFA consumes arrays?

@astrofrog
Copy link
Contributor

@jwoillez - just to check, are you still able to work on this? I think it is going to be very important to have this working since some of the big improvements in astropy.coordinates are going to rely on this. Would it help if we organize a hangout with a few people interested in this to talk over the options? It does seem like this is the way to go, and we just need to decide on some details?

@eteq
Copy link
Member

eteq commented Sep 29, 2014

Oh, I probably should have mentioned I've been working on porting this to astropy. See astropy/astropy#2992 .

One major difference: astropy uses the all-one-file erfa.c file, and this parser was written to handle one-function-per-file. So I had to do some mildly hacky stuff to get it to work with astropy, but it does indeed work.

@mdboom
Copy link

mdboom commented Sep 29, 2014

Do we need to use the all-in-one file? I've never personally understood the motivation for that -- it makes using an upstream ERFA have an extra step that maybe we don't need to do either.

@eteq
Copy link
Member

eteq commented Sep 29, 2014

@mdboom - because it deals more specifically with astropy/astropy#2992, I'll respond to your question there. (This PR currently does not handle the single-file case).

@mhvk
Copy link

mhvk commented Oct 21, 2014

Very late to the game here, but this is wonderful! Reading the long thread, my only comment would be not to worry about long names for this PR -- for liberfa all one really cares about is to have the wrapper, and the names should just be the same. Similarly, just copying over the documentation into a docstring should be enough.

Of course, for astropy, it is probably good to at least have @eteq's "standardized" names. I'm less in favor of long names, since it then forces us to think of a standard when people might as well get used to the IAU mandated one (even if we think it is not necessarily a good one). And within individual subpackages such as Time, a simple dict with a mapping should suffice.

OK, I better go try it!

@mhvk
Copy link

mhvk commented Oct 21, 2014

Trying to understand the Time errors, I found that d2dtf does not always cast properly:

In [58]: iy, im, id, ihms = erfa_time.jd_dtf(t.scale.upper(), 6, t.jd1+np.arange(2), t.jd2)

In [59]: iy, im, id, ihms
Out[59]: 
(array([139930034505694, 139930034505694]),
 array([139930034503690, 139930034503690]),
 array([139930034503701, 139930034503702]),
 array([([13, 59, 54, 999369],), ([13, 59, 54, 999369],)], 
       dtype=[('f0', '<i4', (4,))]))

This seems to a problem of long vs short integers (i4 vs i8):

In [60]: iy.view('<2i4')
Out[60]: 
array([[ 2014, 32580],
       [ 2014, 32580]], dtype=int32)

@eteq
Copy link
Member

eteq commented Oct 22, 2014

@mhvk - this is actually better addressed in astropy/astropy#2992 now - that has a lot of additional work on top of this.

And just to be clear, the plan is not necessarily to include anything like this in ERFA itself. The tentative thought is for astropy.erfa to be the "official" wrappers.

@mhvk
Copy link

mhvk commented Oct 22, 2014

@eteq - OK, and nice to see that you already solved this particular problem! I think it is slightly a pity to not give the python bindings separately -- better to give people freedom to just use a small part of astropy without downloading the whole package -- but as I'm not the one doing the work and personally will only use it through astropy anyway, take this with a pinch of salt.

That said, if people were to want to just use the bindings, would the .pyx file suffice for them? In that case, we could just point to it somewhere in the liberfa web site.

@eteq
Copy link
Member

eteq commented Oct 24, 2014

@mhvk - the main issue in my mind is the management overhead, because we need an "official" version for astropy. My thinking is that we don't want to include it with liberfa because that's supposed to be C-only. So it would need to be a separate project just for the wrappers, with releases timed to match liberfa. I could get behind it if someone was willing to be in charge of managing it, in which case we would just treat it as an extern package in astropy.

That said, we could consider altering the scope of liberfa to include the wrappers. I'm not sure what that would mean for the packagers though - wouldn't want to say they have to also include python dependencies...

@mhvk
Copy link

mhvk commented Oct 24, 2014

@eteq - OK, let's minimize overheads for everybody! Ideally, just point to the .pyx file would help the "I don't want all of astropy" problems anyway!

@eteq
Copy link
Member

eteq commented Oct 24, 2014

@mhvk - oh, yeah, that's a good point - anyone who wants it by itself can just pull out our .pyx file and use it directly!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants