Auto-generated cython_numpy wrapper #6

jwoillez · 2014-07-25T20:25:07Z

I added a new wrapping experiment 'cython_numpy_auto' that adds cython code auto-generation for the same approach as 'cython_numpy'.

To test drive:

Auto-generate the cython code by running python cython_generator.py in directory cython_numpy_auto
Install the wrapper with python setup.py install --user or similar command.
Run the test with python test_erfa.py in directory tests.

Note:
You must have the erfa sources installed next to erfa-wrapping-experiments for the auto-generation to work.

To Do List:

Integrate the templating engine jinja2
Automate structure generation/conversion from erfam.h
Generate full API wrapper from erfa.h
Check usability as https://github.com/astropy/astropy/blob/master/astropy/time/erfa_time.pyx

For later automation, it is easier to have all inputs vectorized.

Just because one of its parameters is an input and an output at the same time. And it also uses the complex eraASTROM structure.

…instead of empty() and copyto()

astrofrog · 2014-07-29T10:09:55Z

Thanks for working on this! I'm cc-ing @taldcroft, @mdboom and @embray as they might be interested in reviewing this.

mdboom · 2014-07-29T14:57:17Z

cython_numpy_auto/code_analyzer.py

+    def __init__(self, name, source_path):
+        self.name = name
+        self.pyname = name.split('era')[-1].lower()
+        self.filename = name.split("era")[-1].lower()+".c"


If I'm following along correctly, this means that it's opening the .c file to find the signature for the function? It should probably open the .h file, as a system installation of erfa will generally have the headers, but not the source code.

Correct. The reason for parsing the .c file instead is that SOFA/ERFA has properly documented the input/output status of each parameter. It would also let us generate the documentation for each auto-generated function.

Ah -- having gone and looked at the SOFA/ERFA code, I see what you're saying now, and I see why that's the only real alternative. Not to sound like a curmudgeon, but SOFA is totally doing it wrong. The .h files should include the documentation, not the .c files.

At the end of the day, it probably matters little, because I suspect we would generate the .pyx file, and its resulting .c file through Cython, and ship that in our tarball, so the end user won't need the SOFA/ERFA .c files to build astropy. But it might be worth a note to the SOFA folks about this.

I tend to agree with you on the documentation location. We could also relocate the documentation when deriving ERFA from SOFA. This could help those that will use or develop with ERFA directly.

Will open a separate issue against ERFA.

You already did! :)
liberfa/erfa#24

mdboom · 2014-07-29T15:02:38Z

Cool stuff!

mdboom · 2014-07-29T16:08:01Z

Thanks for working on this -- hope my comments don't seem to nitpicky -- indeed I think this is a really good approach overall.

jwoillez · 2014-07-30T07:59:33Z

http://jinja.pocoo.org is very nice indeed!

astrofrog · 2014-07-30T08:04:27Z

This looks very nice! As a further test, it would be interesting to replace the Cython wrappers for SOFA that are already in Astropy with the auto-generated versions and see if the Astropy tests still pass. The routines are in https://github.com/astropy/astropy/blob/master/astropy/time/erfa_time.pyx

jwoillez · 2014-07-30T08:32:45Z

Good idea!

I had a quick look at erfa_time.pyx. The existing wrapper does not have a standard way of converting the ERFA names into python names. Could we agree on a standard interface to help with the exercise? Maybe something like this: eraName -> name? If needed, I can work on a PR for this on the https://github.com/astropy/astropy side, as a starting point.

eteq · 2014-08-04T22:22:22Z

@mdboom - see #2803 where I just made the same comment before you said this. I think it makes sense to do something like this:

C-function name: eraGmst00
"standardized" name: astropy.erfa.gmst00
"long-form" name (not always present): astropy.erfa.greenwich_mean_sid_time_iau00

However, we only assign "long-form" names manually for function we actually want to use (it should be a trivial task, like just adding an entry to a dictionary).

In the docs, the function would live under its long-from name if it has one, and if not, it's standardized name.

An alternative is to have these be separate namespaces. I.e., the long form names live in astropy.erfa.useful_name, but the "standardized" names are in astropy.erfa.std.sdznm.

eteq · 2014-08-04T22:29:30Z

@jwoillez - This is fantastic! In addition to @astrofrog's suggestion, I suggest you also try timing the current tests vs. the approach here. That is, see if one runs noticeably slower or faster than the other. I would think they shouldn't "in theory", but theory and practice often disagree...

Perhaps you can just by-hand implement the mapping from what this generates to the current names, just so we can see if there are any major compatibility or speed problems? If not, I think I'm happy to just say lets go with this approach!

eteq · 2014-08-04T22:29:35Z

One procedural question for the crowd: do we want these to be auto-generated a la how we generate Cython code? That is, as part of the build process, or is this a script that should be just run by hand when we want to update the source code in astropy? I tend towards the latter, because then we don't need to put jinja in astropy.extern.

jwoillez · 2014-08-05T05:07:31Z

If we go forward with longer function names, I would also manually add input/output specifications for the functions arguments. The use of the documentation to deduce this seems rather fragile.

I am not quite happy yet with the way matrices and vectors are handled. At the moment, you have to use rec arrays. A tri-vector v3 would have dtype=[('','d',(3,))], rather than have its inner dimension be of size 3: [...,3]. There seems to be a new multi-iterator that would let me deal with this, but the iterator cannot, from what I understand and at the moment, return anything but scalars. I may need to poke some numpy guru...

Last, I will be away for a month or so. Be patient, or take over! ;)

eteq · 2014-08-05T15:52:00Z

Ah, good point about the input/output specifications. More generally, it probably makes sense to write actual docstrings for the "longer name" versions. When I think about it, this may just be best to be kept simple by just doing something like:

def greenwich_mean_sid_time_iau00(*args, **kwargs):
    """
    Compute the mean sidereal time at Greenwich using the IAU 2000 algorithm.

    ERFA function: gmst00

    Parameters
    ----------
    ...
    """
    from . import gmst00
   return gmst00(*args, **kwargs)

In case someone else wants to finish this while you're away, @jwoillez, one clarification question: are you saying the need for a matrix comes from a limitation of cython's conversion to c arrays, or with how ERFA consumes arrays?

astrofrog · 2014-09-28T15:55:25Z

@jwoillez - just to check, are you still able to work on this? I think it is going to be very important to have this working since some of the big improvements in astropy.coordinates are going to rely on this. Would it help if we organize a hangout with a few people interested in this to talk over the options? It does seem like this is the way to go, and we just need to decide on some details?

eteq · 2014-09-29T07:56:00Z

Oh, I probably should have mentioned I've been working on porting this to astropy. See astropy/astropy#2992 .

One major difference: astropy uses the all-one-file erfa.c file, and this parser was written to handle one-function-per-file. So I had to do some mildly hacky stuff to get it to work with astropy, but it does indeed work.

mdboom · 2014-09-29T15:29:38Z

Do we need to use the all-in-one file? I've never personally understood the motivation for that -- it makes using an upstream ERFA have an extra step that maybe we don't need to do either.

eteq · 2014-09-29T19:57:39Z

@mdboom - because it deals more specifically with astropy/astropy#2992, I'll respond to your question there. (This PR currently does not handle the single-file case).

mhvk · 2014-10-21T13:42:01Z

Very late to the game here, but this is wonderful! Reading the long thread, my only comment would be not to worry about long names for this PR -- for liberfa all one really cares about is to have the wrapper, and the names should just be the same. Similarly, just copying over the documentation into a docstring should be enough.

Of course, for astropy, it is probably good to at least have @eteq's "standardized" names. I'm less in favor of long names, since it then forces us to think of a standard when people might as well get used to the IAU mandated one (even if we think it is not necessarily a good one). And within individual subpackages such as Time, a simple dict with a mapping should suffice.

OK, I better go try it!

mhvk · 2014-10-21T14:17:50Z

Trying to understand the Time errors, I found that d2dtf does not always cast properly:

In [58]: iy, im, id, ihms = erfa_time.jd_dtf(t.scale.upper(), 6, t.jd1+np.arange(2), t.jd2)

In [59]: iy, im, id, ihms
Out[59]: 
(array([139930034505694, 139930034505694]),
 array([139930034503690, 139930034503690]),
 array([139930034503701, 139930034503702]),
 array([([13, 59, 54, 999369],), ([13, 59, 54, 999369],)], 
       dtype=[('f0', '<i4', (4,))]))

This seems to a problem of long vs short integers (i4 vs i8):

In [60]: iy.view('<2i4')
Out[60]: 
array([[ 2014, 32580],
       [ 2014, 32580]], dtype=int32)

eteq · 2014-10-22T08:34:45Z

@mhvk - this is actually better addressed in astropy/astropy#2992 now - that has a lot of additional work on top of this.

And just to be clear, the plan is not necessarily to include anything like this in ERFA itself. The tentative thought is for astropy.erfa to be the "official" wrappers.

mhvk · 2014-10-22T13:28:48Z

@eteq - OK, and nice to see that you already solved this particular problem! I think it is slightly a pity to not give the python bindings separately -- better to give people freedom to just use a small part of astropy without downloading the whole package -- but as I'm not the one doing the work and personally will only use it through astropy anyway, take this with a pinch of salt.

That said, if people were to want to just use the bindings, would the .pyx file suffice for them? In that case, we could just point to it somewhere in the liberfa web site.

eteq · 2014-10-24T15:55:46Z

@mhvk - the main issue in my mind is the management overhead, because we need an "official" version for astropy. My thinking is that we don't want to include it with liberfa because that's supposed to be C-only. So it would need to be a separate project just for the wrappers, with releases timed to match liberfa. I could get behind it if someone was willing to be in charge of managing it, in which case we would just treat it as an extern package in astropy.

That said, we could consider altering the scope of liberfa to include the wrappers. I'm not sure what that would mean for the packagers though - wouldn't want to say they have to also include python dependencies...

mhvk · 2014-10-24T16:02:48Z

@eteq - OK, let's minimize overheads for everybody! Ideally, just point to the .pyx file would help the "I don't want all of astropy" problems anyway!

eteq · 2014-10-24T16:27:27Z

@mhvk - oh, yeah, that's a good point - anyone who wants it by itself can just pull out our .pyx file and use it directly!

jwoillez added 12 commits July 28, 2014 09:08

Vectorize all d2dtf() imputs.

627719c

For later automation, it is easier to have all inputs vectorized.

Added aper()

b815ba7

Just because one of its parameters is an input and an output at the same time. And it also uses the complex eraASTROM structure.

Added a test for aper()

f1b7ce9

Postfixed all output arguments with '_out'

3209705

cdef and assign all C arguments before use

64e3068

Use copy constructor array() and broadcast_arrays()...

0cdfa62

…instead of empty() and copyto()

Added *.py[cod] to ignore list

9c9d296

Initial commit of code analyzer

3dbc8f8

Initial commit of cython_generator

7523521

Ignore erfa.pyx and erfa.c in cython_numpy_auto

2a60933

Added cython_numpy_auto extension to setup.py

6e653c1

Run test against both cython_numpy and cython_numpy_auto

bcbe709

jwoillez changed the title ~~Musing with automation~~ Auto-generated cython_numpy wrapper Jul 29, 2014

This was referenced Jul 29, 2014

More work on the python_numpy experiment #7

Closed

Explore whether various methods can be automated #5

Open

mdboom reviewed Jul 29, 2014
View reviewed changes

jwoillez added 3 commits July 30, 2014 09:53

Base all classes on object for attribute access.

d1dcc3b

Turn cython_numpy_auto into a package

729d2e0

Use jinja2 for code generation

b6da8b8

jwoillez mentioned this pull request Jul 30, 2014

Small step toward standard interface to ERFA astropy/astropy#2803

Closed

jwoillez added 12 commits July 31, 2014 08:44

Three new jinja filters: prefix, postfix, surround

763c08c

Process all ERFA functions

d03c0c1

Add eraLDBODY structure support

c9be43e

Use prefix, postfix, and surround in template.

56d80ca

Re-generate extern function signature with pointers

e3ccefd

Add returned values

e641fd4

Functions with no inputs do not need vectorisation

dade3e0

Improvements for no input functions

e16ff9a

Pointer arguments are actually not necessarily inputs

080325d

Deal with "Given and returned" documentation sections

9981fb0

Remove unwanted print statements

a85fcf9

Identify sections/subsections and process only Astronomy section

c43c83c

eteq mentioned this pull request Sep 29, 2014

Auto-generated ERFA wrappers astropy/astropy#2992

Merged

2 tasks

eteq mentioned this pull request Nov 5, 2014

New Python iterator approach: vectorise without dtype #9

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Auto-generated cython_numpy wrapper #6

Auto-generated cython_numpy wrapper #6

jwoillez commented Jul 25, 2014

astrofrog commented Jul 29, 2014

mdboom Jul 29, 2014

jwoillez Jul 29, 2014

mdboom Jul 29, 2014

jwoillez Jul 30, 2014

jwoillez Jul 30, 2014

mdboom commented Jul 29, 2014

mdboom commented Jul 29, 2014

jwoillez commented Jul 30, 2014

astrofrog commented Jul 30, 2014

jwoillez commented Jul 30, 2014

eteq commented Aug 4, 2014

eteq commented Aug 4, 2014

eteq commented Aug 4, 2014

jwoillez commented Aug 5, 2014

eteq commented Aug 5, 2014

astrofrog commented Sep 28, 2014

eteq commented Sep 29, 2014

mdboom commented Sep 29, 2014

eteq commented Sep 29, 2014

mhvk commented Oct 21, 2014

mhvk commented Oct 21, 2014

eteq commented Oct 22, 2014

mhvk commented Oct 22, 2014

eteq commented Oct 24, 2014

mhvk commented Oct 24, 2014

eteq commented Oct 24, 2014

Auto-generated cython_numpy wrapper #6

Are you sure you want to change the base?

Auto-generated cython_numpy wrapper #6

Conversation

jwoillez commented Jul 25, 2014

astrofrog commented Jul 29, 2014

mdboom Jul 29, 2014

Choose a reason for hiding this comment

jwoillez Jul 29, 2014

Choose a reason for hiding this comment

mdboom Jul 29, 2014

Choose a reason for hiding this comment

jwoillez Jul 30, 2014

Choose a reason for hiding this comment

jwoillez Jul 30, 2014

Choose a reason for hiding this comment

mdboom commented Jul 29, 2014

mdboom commented Jul 29, 2014

jwoillez commented Jul 30, 2014

astrofrog commented Jul 30, 2014

jwoillez commented Jul 30, 2014

eteq commented Aug 4, 2014

eteq commented Aug 4, 2014

eteq commented Aug 4, 2014

jwoillez commented Aug 5, 2014

eteq commented Aug 5, 2014

astrofrog commented Sep 28, 2014

eteq commented Sep 29, 2014

mdboom commented Sep 29, 2014

eteq commented Sep 29, 2014

mhvk commented Oct 21, 2014

mhvk commented Oct 21, 2014

eteq commented Oct 22, 2014

mhvk commented Oct 22, 2014

eteq commented Oct 24, 2014

mhvk commented Oct 24, 2014

eteq commented Oct 24, 2014