GSoC 2024 Ideas
Below we list project ideas ordered by priority. The "High Priority" section contains the projects that we are especially interested in, as they lie on the critical path to a minimal viable product: making LPython usable for simpler projects.
However, feel free to propose any project idea that would improve LPython, for example by browsing open issues:
https://github.com/lcompilers/lpython/issues
And our dedicated issue for GSoC ideas for 2024: https://github.com/lcompilers/lpython/issues/2472
If you are interested in applying, please get in touch with us at either our Zulip chat or our mailing list:
- https://lfortran.zulipchat.com (LPython channel)
- https://groups.io/g/lfortran
We will help answer questions and help with finding and refining a project idea. You do not need prior experience with compilers; we will teach you. It is fun. LPython is written in C++, but we do not use many advanced features, and if you have any programming experience you will be able to pick it up.
Here are a few projects for inspiration; they are a mix of well-developed and less-developed ideas. You are welcome to propose your own idea as well.
We have a patch requirement in order to consider your application. Please send a patch (Pull Request) to LPython that has to be merged by the time the application period closes (April 2, 2024). You can fix or improve anything you like. If you have any questions, please contact us (for example on Zulip) and we will help.
Potential mentors:
- Ondřej Čertík
- Gagandeep Singh
- Rohit Goswami
- Thirumalai Shaktivel
- Ubaid Shaikh
- Pranav Goswami
- Luthfan Lubis
- Anutosh Bhat
Compile benchmarking code written in Python with LPython and improving LPython's performance on these benchmarks
https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/python.html contains benchmark programs written for various problems such as n-body, spectral norm, and mandelbrot. The workflow would involve first fixing bugs in LPython so that it compiles the code (modifying the input code would be okay) and produces correct output. Then, improving LPython so that the compiled code performs as well as or better than the same benchmarks written in compiled languages such as C/C++.
Expected outcomes: LPython can compile as many benchmark codes as possible. Performing better than other compilers would be an additional plus.
Skills preferred: Python and C++ programming
Difficulty: intermediate/hard, 350 hours
Mentors - Gagandeep Singh (Github - @czgdp1807)
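To give a feel for the kind of code involved in the benchmarking project, here is a minimal sketch of a spectral-norm style kernel in plain, type-annotated Python. It is written for illustration only: the function names are our own, it is not copied from the benchmarks game site, and we make no claim here about whether LPython currently compiles it unmodified.

```python
from math import sqrt


def eval_a(i: int, j: int) -> float:
    # Entry (i, j) of the matrix A used by the spectral-norm problem
    return 1.0 / ((i + j) * (i + j + 1) // 2 + i + 1)


def mul_av(v: list[float]) -> list[float]:
    # Computes A * v
    n = len(v)
    return [sum(eval_a(i, j) * v[j] for j in range(n)) for i in range(n)]


def mul_atv(v: list[float]) -> list[float]:
    # Computes transpose(A) * v
    n = len(v)
    return [sum(eval_a(j, i) * v[j] for j in range(n)) for i in range(n)]


def spectral_norm(n: int) -> float:
    # Ten rounds of power iteration on transpose(A) * A
    u = [1.0] * n
    v = u
    for _ in range(10):
        v = mul_atv(mul_av(u))
        u = mul_atv(mul_av(v))
    vbv = sum(ui * vi for ui, vi in zip(u, v))
    vv = sum(vi * vi for vi in v)
    return sqrt(vbv / vv)


if __name__ == "__main__":
    print(f"{spectral_norm(100):.9f}")
```

A project would start by checking whether code like this compiles and gives the same result as CPython, and only then look at its running time.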
Data structures such as dict and list have been partially implemented in LPython, and some, like set, haven't been touched yet. This project would involve improving the support for already implemented data structures and adding new ones. We would also benchmark our implementations against their equivalents in other languages (list vs std::vector, set vs std::set, dict vs std::unordered_map).
Expected outcomes: Increased support for advanced data structures in LPython. Performing better in benchmarks is an additional plus.
Skills preferred: C++ Programming, Python. LLVM familiarity would be a plus but not necessary (you can learn this on the fly).
Mentors - Gagandeep Singh (Github - @czgdp1807)
Relevant PRs and issues - https://github.com/lcompilers/lpython/issues/983, https://github.com/lcompilers/lpython/issues/941, https://github.com/lcompilers/lpython/pull/1111. No issue for set has been opened yet, but it has the same priority as the other data structures for this particular project.
LPython has initial generics implemented. They make it possible to write functions that accept generic arguments and are instantiated at the call site using the actual types provided by the user. In this project we will extend generics to many more cases and implement enough features that the standard library and NumPy can be implemented using generics.
Expected outcomes: LPython can compile untyped code using generics.
Skills preferred: Python and C++ programming
Difficulty: intermediate, 350 hours
Mentors - Ondřej Čertík, Rohit Goswami, Gagandeep Singh, Luthfan Lubis
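As a rough illustration of the idea of a generic function that is instantiated at the call site, here is a small sketch using the standard `typing.TypeVar` from CPython. This is an assumption made for illustration: it is not necessarily the syntax LPython's generics use, and in CPython the type variable is only a hint, whereas a compiler with generics would produce a separate specialized version of `total` for each concrete type it is called with.

```python
from typing import Sequence, TypeVar

# T may stand for int or float; each call fixes it to one of them
T = TypeVar("T", int, float)


def total(xs: Sequence[T], zero: T) -> T:
    # Adds up the elements of xs, starting from a typed zero value
    acc = zero
    for x in xs:
        acc = acc + x
    return acc


if __name__ == "__main__":
    print(total([1, 2, 3], 0))      # T is int at this call site
    print(total([0.5, 1.5], 0.0))   # T is float at this call site
```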
The roadmap issue (https://github.com/lcompilers/lpython/issues/155) contains a list of Python features that we want implemented. To be complete, each feature should be implemented at the ASR level and in the LLVM backend. If AST support is missing for a given feature, it has to be implemented as well.
Here you can pick a feature or a set of features from the list and propose it as a GSoC project. In other words, this project idea can accommodate multiple student projects.
List of resources for more information and background:
- ASR.asdl, the comment at the top explains the design motivation
- asr_to_llvm.cpp is the LLVM backend
- ast_to_asr.cpp is the AST -> ASR conversion, where all semantic checks are done and compiler errors are reported to the user
- Developer Tutorial
If you have any questions, please do not hesitate to ask, we can discuss or provide more details.
Difficulty: easy/intermediate (depending on the task), can be 175 hours or 350 hours
Mentors: Ondrej Certik (@certik), Gagandeep Singh
We have a demo of LPython running in the browser using WASM here: https://www.ubaidshaikh.me/lcompilers_web_frontend/lpython. The goal of this project would be to improve the user interface. Here is a list of issues that the project can work on fixing: https://github.com/lfortran/lcompilers_frontend/issues
Skills preferred: Python and C++ programming
Difficulty: intermediate, 350 hours
Mentors - Ondřej Čertík, Rohit Goswami
This project would implement language server features, such as finding a symbol's declaration, and expose them to a language server written in TypeScript that works out of the box in VSCode.
Expected outcomes: LPython can be used as a Python language server that can be used in other software such as source code editors and IDEs.
Skills preferred: Python and C++ programming
Difficulty: intermediate, 350 hours
Mentors: Ondřej Čertík (@certik), Smit Lunagariya
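To make the "find a symbol declaration" feature concrete, here is a minimal sketch of the lookup using CPython's standard `ast` module. It only illustrates the idea: the `find_declarations` helper and the sample source are hypothetical, and an LPython-based language server would presumably answer the same query from its own AST/ASR rather than from CPython's `ast`.

```python
import ast


def find_declarations(source: str, name: str) -> list[tuple[int, int]]:
    """Return sorted (line, column) pairs where `name` is defined or assigned."""
    found = []
    for node in ast.walk(ast.parse(source)):
        # Function and class definitions carry their name directly
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name == name:
                found.append((node.lineno, node.col_offset))
        # Variables are declared where the name appears in a store context
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            if node.id == name:
                found.append((node.lineno, node.col_offset))
    return sorted(found)


# Hypothetical input file contents
SAMPLE = (
    "x: int = 1\n"
    "def area(w, h):\n"
    "    return w * h\n"
    "print(area(x, 2))\n"
)

if __name__ == "__main__":
    print(find_declarations(SAMPLE, "area"))
    print(find_declarations(SAMPLE, "x"))
```

A language server would return such a location to the editor in response to a "go to definition" request.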
The Python standard library has a lot of modules: https://docs.python.org/3/library/index.html. However, the highest-priority module is NumPy, because it enhances LPython's array programming support. Within NumPy, reduction functions (numpy.sum, numpy.mean, etc.) are the most important. We plan to implement them via the intrinsic functions infrastructure recently added in LCompilers (see src/libasr/pass/intrinsic_function_registry.h, which will soon be a part of LPython). Every other module is low priority.
The project includes discussing which modules will be needed for LPython (from a scientific computing perspective, in the beginning), creating a priority list, and then implementing each module properly. The aim of this project is to make LPython work for any Python code down the road.
See #200 as a related issue. Feel free to discuss the details with us.
Skills preferred: Python and C++ programming
Difficulty: hard, 350 hours
Mentors: Ondřej Čertík (@certik), Smit Lunagariya
LPython has a very fast WASM backend that can translate large parts of ASR to WASM. This project would work on making the WASM backend work for all of ASR, by adding tests and implementing missing features. Like every backend in LPython, the WASM backend receives the code as ASR, recursively walks over each ASR node, and generates WASM code.
Mentors: Ondrej Certik (@certik), Gagandeep Singh, Ubaid Shaikh
Difficulty: intermediate, 350 hours
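The "recursively walk each node and generate WASM" structure of the WASM backend can be sketched with a toy example. The node classes below (`Const`, `BinOp`) are hypothetical stand-ins and are much simpler than real ASR nodes, and the real backend is written in C++. The sketch only shows the shape of the traversal, emitting WebAssembly text-format instructions for 32-bit integer arithmetic.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Const:
    # Hypothetical node: an integer literal
    value: int


@dataclass
class BinOp:
    # Hypothetical node: a binary operation on two sub-expressions
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Const, BinOp]

# Map source operators to WebAssembly i32 instructions
WASM_OPS = {"+": "i32.add", "-": "i32.sub", "*": "i32.mul"}


def emit(node: Expr) -> list[str]:
    # WASM is a stack machine: emit both operands first, then the operator
    if isinstance(node, Const):
        return [f"i32.const {node.value}"]
    if isinstance(node, BinOp):
        return emit(node.left) + emit(node.right) + [WASM_OPS[node.op]]
    raise TypeError(f"unsupported node: {node!r}")


if __name__ == "__main__":
    # Represents the expression 1 + 2 * 3
    tree = BinOp("+", Const(1), BinOp("*", Const(2), Const(3)))
    print("\n".join(emit(tree)))
```

Covering "all of ASR" then amounts to adding a case like these for every remaining node kind, together with tests.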
LPython has a WASM to x86_64 code generation backend, which allows very fast compilation (many times faster than going via LLVM). The x86 backend does not do any optimizations, so it is meant to be used in Debug mode only. The backend receives WASM code and creates an x86_64 ELF binary.
The purpose of this project would be to extend this backend to cover more LPython/WASM features.
If you have any questions, please do not hesitate to ask, we can discuss or provide more details.
Mentors: Ondrej Certik (@certik), Gagandeep Singh, Ubaid Shaikh
Difficulty: intermediate, 350 hours
This project would create an initial WASM to Apple M1 backend. It would work similarly to the WASM->x86 backend, but it would generate ARM code and a Mach-O binary that works on Apple M1.
If you have any questions, please do not hesitate to ask, we can discuss or provide more details.
Mentors: Ondrej Certik (@certik), Gagandeep Singh, Ubaid Shaikh
Difficulty: intermediate, 350 hours
Add a backend to LPython that automatically exposes (eventually all) LPython module contents to Python. That will allow LPython-compiled code to be used from CPython itself.
Related issues:
- lfortran#133: Automatic wrappers ASR -> Python
Mentors: Ondrej Certik (@certik), Rohit Goswami
Difficulty: intermediate, 350 hours