Major release: v0.5.0 is finally here! 🚀

We are proud to release our latest version, 0.5.0, after much work done for NeurIPS!
(Will our paper finally get accepted? 🤞)

Changelog

New documentation released at rl4.co!
Add SOTA FJSP environment @LTluttmann
Add improvement methods and their respective environment MDPs @yining043
- N2S
- DACT
- NeuOpt
Add HetGNN model for the FJSP @LTluttmann
Add L2D model @LTluttmann
Add Multi-task VRP (MTVRP) environment
Add temperature in NARGNN policies @Furffico
Add multiple batch sizes for different datasets
Local search support, DeepACO + Local search @hyeok9855
Add MTPOMO, MVMoE model @RoyalSkye @FeiLiu36
Support the meta-learning trainer @jieyibi
Support improvement training @yining043
Add graph problems: MCP and FLP @bokveizen
New PPO versions:
- Stepwise @LTluttmann
- Improvement @yining043
PolyNet support @ahottung
Different distributions support + MDPOMO @jieyibi
Add initial support for solvers API from RL4CO (MTVRP): PyVRP, OR-Tools, LKH3 @N-Wouda @leonlan
Faster logprobs collection: by default, we now gather logprobs only for the selected nodes instead of collecting them for unused trajectories, which decreases memory consumption
Add Codecov to track test coverage
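
The logprobs change above can be illustrated with a minimal, library-free sketch. It is not RL4CO's implementation: the helper names `log_softmax` and `selected_logprobs` and the toy values are assumptions made for illustration. The idea is to keep one log-probability per decoding step (that of the chosen node) rather than the full per-node distribution.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def selected_logprobs(step_logits, actions):
    """Keep one log-probability per decoding step: that of the chosen node.

    step_logits: list of per-step score lists, one score per node
    actions: index of the node selected at each step
    (Hypothetical helper, not part of any library.)
    """
    return [log_softmax(logits)[a] for logits, a in zip(step_logits, actions)]


# Toy rollout: 3 decoding steps over 4 nodes (illustrative values only)
step_logits = [
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 0.0, 0.0, 0.0],
]
actions = [2, 3, 0]

logps = selected_logprobs(step_logits, actions)
# Stored values: 3 (one per step) instead of 3 * 4 (one per step and node)
trajectory_logp = sum(logps)
```

With tensors, the same selection is typically done with an operation such as `torch.gather` along the node dimension, so the stored tensor shrinks from (batch, steps, nodes) to (batch, steps).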

[Environment] Support generator_params arguments for environments, making them more modular and flexible.
Modularization of the Attention Model decoder’s QKV calculation for more flexibility @LTluttmann
Refactor the MatNet encoder so that the cross attention only needs to be calculated once @LTluttmann
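
As a rough picture of the generator_params pattern mentioned above, here is a self-contained sketch. `ToyGenerator` and `ToyEnv` are hypothetical classes, not RL4CO's, and the fields `num_loc`, `min_loc`, and `max_loc` are assumed names. The point is only that the environment forwards a dictionary of parameters to its instance generator, so instance settings can change without touching environment code.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyGenerator:
    """Hypothetical instance generator; field names are illustrative only."""
    num_loc: int = 20
    min_loc: float = 0.0
    max_loc: float = 1.0

    def sample(self, batch_size, seed=None):
        # Draw `batch_size` instances, each a list of (x, y) coordinates
        rng = random.Random(seed)
        return [
            [
                (rng.uniform(self.min_loc, self.max_loc),
                 rng.uniform(self.min_loc, self.max_loc))
                for _ in range(self.num_loc)
            ]
            for _ in range(batch_size)
        ]


class ToyEnv:
    """Hypothetical environment that forwards generator_params to its generator."""

    def __init__(self, generator_params=None):
        # Unspecified parameters fall back to the generator's defaults
        self.generator = ToyGenerator(**(generator_params or {}))

    def reset(self, batch_size, seed=None):
        return self.generator.sample(batch_size, seed=seed)


# Only the instance size is overridden; the coordinate range keeps its defaults
env = ToyEnv(generator_params={"num_loc": 50})
batch = env.reset(batch_size=4, seed=0)
```

Keeping generation settings in one dictionary means a new instance size or distribution is a configuration change rather than a new environment subclass.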

Fix a bug in DeepACO’s log_heuristic calculation, which improves performance. @Furffico @henry-yeh
Fix memory leak during autoregressive decoding @LTluttmann
Python versioning: drop Python 3.8, add compatibility with Python 3.12, and add poetry support @ShuN6211
Compatibility with tensordict>=0.5.0
Fix memory leak in OP and PCTSP
Fix A2C bug: optimize all parameters in module instead of only "policy" by default
Fix double logging of parameters; better logging in Wandb