Add Julia bindings #23

Open · 2 tasks
Luthaf opened this issue Apr 26, 2023 · 18 comments

@Luthaf
Contributor

Luthaf commented Apr 26, 2023

This will require multiple steps:

  • Register the C package with Yggdrasil, creating a Sphericart_jll package. This will require making sure we can cross-compile the C++ code from Linux to all supported architectures and operating systems;
  • Write the pure Julia bindings, re-exposing the functions from Sphericart_jll with a nicer Julia API (a rough sketch of this step follows below).
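
For the second step, something along these lines is what I have in mind - a minimal sketch only, assuming a Sphericart_jll exists; the library handle, the C symbol name, and the signature below are placeholders, not the actual sphericart C API:

module SpheriCartBindings

using Sphericart_jll: libsphericart   # assumed name of the JLL library handle

# hypothetical C entry point: fills `sph` with all harmonics up to l_max for n points
function compute!(sph::Vector{Float64}, xyz::Vector{Float64}, l_max::Integer)
    n = length(xyz) ÷ 3
    @assert length(sph) == n * (l_max + 1)^2
    ccall((:sphericart_compute, libsphericart), Cvoid,
          (Ptr{Cdouble}, Csize_t, Cint, Ptr{Cdouble}),
          xyz, n, l_max, sph)
    return sph
end

end # module
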
@cortner
Member

cortner commented Oct 26, 2023

I have a prototype implementation which, for small basis sizes, is about a factor of 2-3 faster than my previous one based on spherical coordinates (for large bases they are about the same). I will make this public soon, once I also have the rrules done.

One could argue for having the pure Julia implementation here in this repo instead of in a separate Julia repo. I don't particularly mind either way. We will most likely want to maintain our own implementation with the interfaces we need, but if that can be made compatible with what is here, then all the better.

I certainly wouldn't mind getting input on performance optimizations from people who actually understand this, rather than relying on trial and error.

I also don't plan on a GPU implementation immediately, so that would be another area where I'd be interested in following somebody else's lead.

@cortner
Member

cortner commented Oct 26, 2023

An interesting complication is that a @generated Julia implementation is extremely fast, but then I can't exploit batched evaluation via AVX/SIMD. So I'm planning to provide two code paths: one that is @generated for a single input and returns an SVector, and a second one that takes the loop over inputs inside and vectorizes it via SIMD/AVX. (Which is faster on which hardware remains to be tested; I'm only working with an M1 right now ...)
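
To make the two call shapes concrete, here is a minimal sketch - hypothetical names, only L = 1 written out, and the usual real-harmonic prefactors; the actual @generated code unrolls the recursion for arbitrary L:

using StaticArrays

const c0 = sqrt(1 / (4π))   # Y_0^0
const c1 = sqrt(3 / (4π))   # prefactor of the three L = 1 components

# single-input path: returns an SVector, cheap enough to fully unroll / @generate
solid_harmonics_l1(r::SVector{3}) = SVector(c0, c1 * r[2], c1 * r[3], c1 * r[1])

# batched path: fills a preallocated matrix; the explicit inner loop is SIMD-friendly
function solid_harmonics_l1!(out::AbstractMatrix, Rs::AbstractVector{<:SVector{3}})
    @inbounds @simd for i in eachindex(Rs)
        x, y, z = Rs[i]
        out[i, 1] = c0
        out[i, 2] = c1 * y
        out[i, 3] = c1 * z
        out[i, 4] = c1 * x
    end
    return out
end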

@ceriottm
Contributor

If you can / are willing to "harmonize" the API, I think it'd be interesting to have it in the same repo.
The idea here is to re-use as much code as possible, so that later on, if there's a need, one can go all-out on optimizations and/or provide more hardware-specific back-ends; but I can see the argument for keeping the Julia side somewhat less tightly bound.

@cortner
Member

cortner commented Oct 26, 2023

Ok. So the procedure would be to have a Julia package live in a subfolder, which will then be registered with the Julia General registry. I haven't done this before, but maybe it's obvious; and if not, maybe @Luthaf can help, if he is interested in having this?

One other caveat is that the Julia code should probably be MIT licensed.

Regarding different hardware: in Julia that would probably be achievable via KernelAbstractions.jl later.
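
Roughly a kernel per input point - the sketch below only fills in the trivial L = 0 term and uses made-up names, it is just meant to show the launch pattern:

using KernelAbstractions

@kernel function sph_kernel!(out, @Const(xyz))
    i = @index(Global)
    # a real kernel would run the recursion up to l_max; Y_0^0 is just a constant
    out[i, 1] = sqrt(1 / (4π))
end

function compute_sph!(out, xyz)
    backend = get_backend(out)                       # CPU(), CUDABackend(), ...
    sph_kernel!(backend)(out, xyz; ndrange = size(out, 1))
    KernelAbstractions.synchronize(backend)
    return out
end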

@ceriottm
Contributor

Yeah, let's hear from @Luthaf and @frostedoyster. I think it'd be really good to consolidate the infrastructure we all use, and this is an easy starting point. When I mentioned "harmonizing the API" I just meant making sure function/class arguments and defaults are consistent, so users don't get too confused.

We have an ongoing offline discussion as to whether we should default to solid harmonics or normalize-by-default.

@cortner
Member

cortner commented Oct 26, 2023

We have an ongoing offline discussion as to whether we should default to solid harmonics or normalize-by-default

I guess for large maxL it won't matter anyhow but for small maxL it might have an impact on performance.

For modelling I would want to provide both options - it's just an easy extra transformation layer. For performance optimization, if you already have a splined Rnl basis, then the choice seems pretty obvious?

@Luthaf
Contributor Author

Luthaf commented Oct 26, 2023

So the procedure would be to have a Julia package live in a subfolder, which will then be registered with the Julia General registry.

This should be fine: the web interface at https://juliahub.com/ui/Packages can handle packages that live in sub-directories. I'm not sure if the @JuliaRegistrator bot can also do this, but there should be a way to register the package.
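
(If the bot does support sub-directories, I believe the trigger comment would be something along these lines - the path is a placeholder for whatever subfolder the Julia package ends up in, to be double-checked against the Registrator docs:)

@JuliaRegistrator register subdir=path/to/julia/package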

One other caveat is that the Julia code should probably be MIT licensed.

Sure, we could also dual-license the whole thing Apache/MIT (I know this is popular in the Rust ecosystem, but I don't remember why; I'll check).

One could argue for having the pure Julia implementation here in this repo instead of in a separate Julia repo.

Especially since you already have a prototype, I'd be happy to also have a pure Julia implementation. If we also do bindings to the native implementation, we could have something like this in the long term:

sph = spherical_harmonics(xyz; backend=:Julia)   # pure Julia, supports autodiff & multiple dispatch
sph = spherical_harmonics(xyz; backend=:Native)  # native code, potentially faster, supports fewer features

@ceriottm
Contributor

I guess for large maxL it won't matter anyhow but for small maxL it might have an impact on performance.

For modelling I would want to provide both options - it's just an easy extra transformation layer. For performance optimization, if you already have a splined Rnl basis, then the choice seems pretty obvious?

I agree entirely that for performance there's no question it should be the solid harmonics, and that it's no biggie to provide both options. The discussion was about the default. I feel quite strongly that the solid harmonics are actually the more natural thing and that it's ok to push for the better option as the default, but @frostedoyster has a (good) argument that most users will probably expect the normalized version. Anyway, for the moment it should definitely default to normalize=False for consistency - I was just curious to hear your opinion.

@cortner
Member

cortner commented Oct 27, 2023

I feel quite strongly that the solid harmonics are actually the more natural thing and that it's ok to push for the better option as the default

I'm coming around to that perspective as well.

but @frostedoyster has a (good) argument that probably most users will expect the normalized version.

In my Julia codes I always provide solid_harmonics and spherical_harmonics (named a bit differently but that's the gist). I'm not sure why you wouldn't want to just supply both? Let people choose which one to use?

EDIT: I looked at your Python docs again. Why not create two classes, one called SphericalHarmonics and the other SolidHarmonics - that would make it crystal clear. The keyword normalized=true is actually quite confusing: it could just as well be an option to normalize the basis in one of several different ways (there are different conventions in different fields...).
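
Roughly what I have in mind, as a sketch - hypothetical Julia names mirroring those class names, and only the L = 1 core written out so the snippet stays short:

using LinearAlgebra: norm
using StaticArrays

struct SolidHarmonics;     L::Int; end
struct SphericalHarmonics; L::Int; end

# toy core: real solid harmonics up to L = 1 only
function evaluate(basis::SolidHarmonics, r::SVector{3})
    @assert basis.L == 1 "sketch only covers L = 1"
    c0, c1 = sqrt(1 / (4π)), sqrt(3 / (4π))
    return SVector(c0, c1 * r[2], c1 * r[3], c1 * r[1])
end

# the (normalized) spherical harmonics are the solid harmonics on the unit sphere
evaluate(basis::SphericalHarmonics, r::SVector{3}) =
    evaluate(SolidHarmonics(basis.L), r / norm(r))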

@cortner
Member

cortner commented Oct 27, 2023

I haven't yet tested on the same CPU, so this may be premature, but I think the Julia codes are in a similar ballpark to yours. I'm guessing I land somewhere in between your general-purpose and hard-coded versions. This is on a two-year-old M1. I tried to run the benchmarks as described in your README but got no useful output, only red ...

Generated (this is just basic meta-programming, no computer-algebra-optimized code):

[ Info: static_solid_harmonics
L = 1
  2.208 ns (0 allocations: 0 bytes)
L = 2
  4.708 ns (0 allocations: 0 bytes)
L = 3
  7.925 ns (0 allocations: 0 bytes)
L = 4
  11.970 ns (0 allocations: 0 bytes)
L = 5
  34.073 ns (0 allocations: 0 bytes)
L = 6
  40.575 ns (0 allocations: 0 bytes)

And the dynamic but batched implementation (vectorized, but no multi-threading yet):

[ Info: batched evaluation vs broadcast
[ Info: nX = 32 (try a nice number)
L = 3
    single:   7.758 ns (0 allocations: 0 bytes)
broadcast!:   248.021 ns (0 allocations: 0 bytes)
   batched:   396.455 ns (0 allocations: 0 bytes)
L = 6
    single:   39.060 ns (0 allocations: 0 bytes)
broadcast!:   887.146 ns (0 allocations: 0 bytes)
   batched:   823.037 ns (0 allocations: 0 bytes)
L = 9
    single:   67.092 ns (0 allocations: 0 bytes)
broadcast!:   1.692 μs (0 allocations: 0 bytes)
   batched:   1.479 μs (0 allocations: 0 bytes)
L = 12
    single:   82.467 ns (0 allocations: 0 bytes)
broadcast!:   2.708 μs (0 allocations: 0 bytes)
   batched:   2.329 μs (0 allocations: 0 bytes)
L = 15
    single:   149.455 ns (0 allocations: 0 bytes)
broadcast!:   4.786 μs (0 allocations: 0 bytes)
   batched:   3.615 μs (0 allocations: 0 bytes)

TBH, I wouldn't mind feedback if anybody in your group is willing to look at this. I could make a WIP PR before I have the gradients finished.

@Luthaf
Contributor Author

Luthaf commented Oct 27, 2023

A WIP PR sounds like a good idea! I don't have a lot of time to look at it over the next 2 weeks, but @frostedoyster can also give the code a look!

@frostedoyster
Collaborator

I don't have much experience with Julia, but I'd be happy to take a look

@cortner
Member

cortner commented Oct 27, 2023

Thanks - I'll prepare the PR. I propose to call the subfolder SpheriCart.jl, which is the Julia naming convention, since SpheriCart will be the name of the package in the Julia package registry.

If you have any concerns about it let me know.

@frostedoyster
Collaborator

Perfect! I'm looking forward to the PR. Thank you also for the feedback on normalized. That's something we took from e3nn, but perhaps it isn't the best long-term solution. What are your feelings @ceriottm @Luthaf?

@cortner
Member

cortner commented Oct 27, 2023

Hm - maybe one more question. So far I haven't looked at your code at all, to avoid license problems. Going forward, if we were to think about optimizing, looking at it might become useful - especially for extending the Julia codes to the GPU. Does Apache-2.0 allow me to produce a derived product and license it under MIT? If not, then I would need your explicit permission to do so.

@ceriottm
Contributor

Re the license: I don't mind dual-licensing. It's also better to do it now than after we have a bunch of contributors we can't track.

Re the normalization: I suggest that for the moment we adapt the Julia API to what we have, and open an issue to refactor. As @frostedoyster says, that's a lot of refactoring and re-documenting, so maybe not top priority.

@cortner cortner mentioned this issue Oct 28, 2023
@cortner
Member

cortner commented Oct 28, 2023

I suggest that for the moment we adapt the Julia API to what we have

Initially I'm just going to do SolidHarmonics. I'll add the wrapper for SphericalHarmonics asap. I'm not convinced the two APIs need to be identical. The two languages are too different.

@cortner
Member

cortner commented Oct 28, 2023

Re the license: I don't mind dual-licensing. It's also better to do it now than after we have a bunch of contributors we can't track.

My question was more specific: I thought I'd need explicit permission from the copyright holder if I look at your code, "transpile" it, and release it as new code under a different license. But it seems Apache-2.0 is sufficiently permissive that it doesn't matter:

The Apache 2.0 License is permissive. It allows you to use, modify, and distribute the licensed software, including creating derivative works, without requiring those derivative works to be licensed under the same terms. You can release the modified parts of the code under any license you prefer

But let me know if I've misunderstood it.
