Releases · Eclectic-Sheep/sheeprl
v0.5.7
v0.5.6
v0.5.6 Release Notes
- Fixed the buffer checkpoint and added the possibility to specify the pre-fill steps upon resuming; updated the how-tos accordingly in #280
- Updated how-tos in #281
- Fixed a division by zero when computing `sps-train` in #283
- Better code naming in #284
- Fixed MineDojo actions stacking (and, more generally, multi-discrete actions) and missing keys in #286
- Fixed the computation of prefill steps as policy steps in #287
- Fixed the Dreamer-V3 imagination notebook in #290
- Added the `ActionsAsObservationWrapper` to let the user add the played actions to the observations in #291 (see the sketch below)
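A minimal sketch of the idea behind such a wrapper, assuming a Gymnasium environment with a dictionary observation space (the wrapper name, options, and placeholder handling in SheepRL may differ):

```python
import gymnasium as gym
import numpy as np


class ActionsAsObservations(gym.Wrapper):
    """Illustrative wrapper: expose the last played action as an extra observation key.

    This is only a sketch of the concept behind #291, not the SheepRL implementation.
    """

    def __init__(self, env: gym.Env, key: str = "last_action"):
        super().__init__(env)
        assert isinstance(env.observation_space, gym.spaces.Dict), "expected a Dict observation space"
        self._key = key
        # The new observation space mirrors the original one plus the action space.
        self.observation_space = gym.spaces.Dict({**env.observation_space.spaces, key: env.action_space})

    def reset(self, **kwargs):
        obs, info = self.env.reset(**kwargs)
        # No action has been played yet: use a zero-valued placeholder.
        obs[self._key] = np.zeros_like(np.asarray(self.env.action_space.sample()))
        return obs, info

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        obs[self._key] = action
        return obs, reward, terminated, truncated, info
```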
v0.5.5
v0.5.5 Release Notes
- Added parallel stochastic in dv3: #225
- Updated dependencies and Python version: #230, #262, #263
- Added a dv3 notebook for imagination and observation reconstruction: #232
- Created `citation.cff`: #233
- Added the replay ratio for off-policy algorithms: #247
- Single strategy for the player (now it is instantiated in the `build_agent()` function): #244, #250, #258
- Proper `terminated` and `truncated` signals management: #251, #252, #253 (see the sketch below)
- Added the possibility to choose whether or not to learn the initial recurrent state: #256
- Added A2C benchmarks: #266
- Added a `prepare_obs()` function to all the algorithms: #267
- Improved code readability: #248, #265
- Bug fixes: #220, #222, #224, #231, #243, #255, #257
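On the `terminated`/`truncated` distinction, here is a minimal, generic sketch (not the SheepRL code) of how the two signals are typically consumed: the value target is zeroed only on real terminations, while truncations (e.g. time limits) still bootstrap from the critic.

```python
import torch

# Illustrative only: reward/value tensors for a single transition.
reward = torch.tensor([1.0])
next_value = torch.tensor([0.5])      # critic estimate of the next state
terminated = torch.tensor([0.0])      # 1.0 if a true terminal state was reached
truncated = torch.tensor([1.0])       # 1.0 if the episode was cut short (e.g. time limit)
gamma = 0.99

# Bootstrap through truncations, stop bootstrapping only on real terminations.
target = reward + gamma * next_value * (1.0 - terminated)

# The environment is reset whenever either flag is set.
done = torch.logical_or(terminated.bool(), truncated.bool())
```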
v0.5.4
v0.5.4 Release Notes
- Added Dreamer V3 configs of different sizes (#208).
- Updated the torch version: 2.2.1 or in [2.0.*, 2.1.*] (#212).
- Fixed observation normalization in Dreamer V3 and p2e_dv3 (#214).
- Updated the README (#215).
- Fixed installation and agent evaluation: new commands are available for agent evaluation, model registration, and for listing the available agents (#216).
v0.5.3
v0.5.2
v0.5.2 Release Notes
- Added the A2C algorithm (#33).
- Added a new how-to on how to add an external algorithm (no need to clone sheeprl locally) in #175.
- Added optimizations (#177):
  - Metrics are instantiated only when needed.
  - Removed the `torch.cat()` operation between empty and dense tensors in the `MultiEncoder` class.
  - Added the possibility not to test the agent after training.
- Fixed the GitHub Actions workflow (#180).
- Fixed bugs (#181, #183).
- Added benchmarks with respect to StableBaselines3 (#185).
- Added the `BernoulliSafeMode` distribution, a Bernoulli distribution where the mode is computed safely, i.e. it returns `self.probs > 0.5` without setting any NaN (#186); see the sketch below.
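A sketch of the idea (not necessarily the exact SheepRL code): `torch.distributions.Bernoulli.mode` is NaN at `probs == 0.5`, so the safe variant simply defines the mode as `probs > 0.5`.

```python
import torch
from torch.distributions import Bernoulli


class BernoulliSafeMode(Bernoulli):
    """Bernoulli distribution whose mode is always well defined (illustrative sketch)."""

    @property
    def mode(self):
        # `probs > 0.5` never produces NaN, even at probs == 0.5.
        return (self.probs > 0.5).to(self.probs.dtype)


d = BernoulliSafeMode(probs=torch.tensor([0.2, 0.5, 0.9]))
print(d.mode)  # tensor([0., 0., 1.]) -- no NaN at p == 0.5
```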
v0.5.1
v0.5.0
v0.5.0 Release Notes
- Added Numpy buffers (#169):
  - The user can now decide whether to use the `torch.as_tensor` function or the `torch.from_numpy` one to convert the Numpy buffer into tensors when sampling (#172); see the sketch below.
- Added optimizations to reduce training time (#168).
- Added the possibility to keep only the last `n` checkpoints in an experiment, to avoid filling up the disk (#171).
- Fixed bugs (#167).
- Updated the documentation.
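A short sketch of the difference between the two conversion paths mentioned in #172 (generic PyTorch behaviour, not SheepRL-specific code):

```python
import numpy as np
import torch

batch = np.random.rand(32, 4).astype(np.float32)  # a sampled Numpy batch

# torch.from_numpy always shares memory with the Numpy array (zero-copy),
# so writing into the tensor also modifies the underlying buffer.
t_shared = torch.from_numpy(batch)

# torch.as_tensor avoids a copy when it can (same dtype, CPU), but copies
# when a different dtype or device is requested.
t_maybe_copy = torch.as_tensor(batch, dtype=torch.float32)

assert t_shared.data_ptr() == batch.ctypes.data  # same underlying memory
```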
v0.4.9
v0.4.9 Release Notes
- Added `torch>=2.0` as a dependency in #161
- Let `mlflow` be an optional package, i.e. the user can directly install it with `pip install sheeprl[mlflow]`, in #164
- Fixed `resume_from_checkpoint` in #163. In particular:
  - Added a `save_configs` function to save the configs of the experiment in the `<log_dir>/config.yaml` file.
  - Fixed the resume from checkpoint of all the algorithms (restart from the correct policy step + fix decoupled).
  - Gave more flexibility to the p2e finetuning scripts regarding the fabric configs.
  - MineDojo Wrapper: avoid modifying the kwargs (to always save consistent configs in the `<log_dir>/config.yaml` file).
  - TensorBoard Logger creation: update the logger configs to always save consistent configs in the `<log_dir>/config.yaml` file.
- Added an `as_dict()` method to the `dotdict` class to get a primitive Python dictionary from a `dotdict` object (see the sketch below).
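A minimal sketch of what `as_dict()` provides (the real `dotdict` in SheepRL may be implemented differently): attribute-style access on the way in, a plain nested `dict` on the way out.

```python
class dotdict(dict):
    """Illustrative dot-access dict; not the exact SheepRL implementation."""

    __getattr__ = dict.get
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

    def as_dict(self) -> dict:
        # Recursively convert nested dotdicts back to primitive dicts.
        return {k: v.as_dict() if isinstance(v, dotdict) else v for k, v in self.items()}


cfg = dotdict({"algo": dotdict({"name": "dreamer_v3", "lr": 1e-4})})
print(cfg.algo.name)                 # attribute-style access: "dreamer_v3"
print(type(cfg.as_dict()["algo"]))   # <class 'dict'> after conversion
```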
v0.4.8
v0.4.8 Release Notes
- The following config keys have been moved to the algorithm-specific `algo` config in #158: `cnn_keys`, `mlp_keys`, `per_rank_batch_size`, `per_rank_sequence_length`, `per_rank_num_batches` and `total_steps` (see the sketch below).
- We have added the integration of the MLflowLogger in #159. This comes with new documentation and notebooks under the `example` folder on how to use it.
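As a rough illustration of the new layout (the values and exact nesting below are hypothetical; the authoritative structure is in the configs shipped with the repository), these keys are now read from the `algo` namespace:

```python
from omegaconf import OmegaConf

# Hypothetical example: the keys listed above now live under the `algo` config.
cfg = OmegaConf.create(
    {
        "algo": {
            "cnn_keys": ["rgb"],          # hypothetical values
            "mlp_keys": ["state"],
            "per_rank_batch_size": 16,
            "per_rank_sequence_length": 64,
            "total_steps": 100_000,
        }
    }
)

# Code that previously read e.g. `cfg.per_rank_batch_size` now goes through `cfg.algo.*`.
print(cfg.algo.per_rank_batch_size)
```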