From 0707e072236d955497b9f83de7c9fddd11093385 Mon Sep 17 00:00:00 2001
From: Shixian Sheng
Date: Mon, 6 May 2024 12:06:42 -0400
Subject: [PATCH] [Doc] Update README.md (#2155)

---
 README.md | 64 +++++++++++++++++++++++++++----------------------------
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/README.md b/README.md
index c28161bb949..4f2cd68b0f2 100644
--- a/README.md
+++ b/README.md
@@ -26,7 +26,7 @@ It provides pytorch and **python-first**, low and high level abstractions
 for RL that are intended to be **efficient**, **modular**, **documented** and properly **tested**.
 The code is aimed at supporting research in RL. Most of it is written in python in a highly modular way,
 such that researchers can easily swap components, transform them or write new ones with little effort.
-This repo attempts to align with the existing pytorch ecosystem libraries in that it has a dataset pillar ([torchrl/envs](torchrl/envs)), [transforms](torchrl/envs/transforms), [models](torchrl/modules), data utilities (e.g. collectors and containers), etc.
+This repo attempts to align with the existing pytorch ecosystem libraries in that it has a dataset pillar ([torchrl/envs](https://github.com/pytorch/rl/blob/main/torchrl/envs)), [transforms](https://github.com/pytorch/rl/blob/main/torchrl/envs/transforms), [models](https://github.com/pytorch/rl/blob/main/torchrl/modules), data utilities (e.g. collectors and containers), etc.
 TorchRL aims at having as few dependencies as possible (python standard library, numpy and pytorch). Common environment libraries (e.g. OpenAI gym) are only optional.
 
 On the low-level end, torchrl comes with a set of highly re-usable functionals for cost functions, returns and data processing.
@@ -141,7 +141,7 @@ lines of code*!
 Here is an example of how the [environment API](https://pytorch.org/rl/stable/reference/envs.html)
 relies on tensordict to carry data from one function to another during a rollout
 execution:
-![Alt Text](docs/source/_static/img/rollout.gif)
+![Alt Text](https://github.com/pytorch/rl/blob/main/docs/source/_static/img/rollout.gif)
 
 `TensorDict` makes it easy to re-use pieces of code across environments, models and
 algorithms.
@@ -268,11 +268,11 @@ And it is `functorch` and `torch.compile` compatible!
 
 ## Features
 
-- A common [interface for environments](torchrl/envs)
+- A common [interface for environments](https://github.com/pytorch/rl/blob/main/torchrl/envs)
   which supports common libraries (OpenAI gym, the DeepMind control suite, etc.)(1) and stateless execution
   (e.g. model-based environments).
-  The [batched environments](torchrl/envs/batched_envs.py) containers allow parallel execution(2).
-  A common PyTorch-first class of [tensor-specification class](torchrl/data/tensor_specs.py) is also provided.
+  The [batched environments](https://github.com/pytorch/rl/blob/main/torchrl/envs/batched_envs.py) containers allow parallel execution(2).
+  A common PyTorch-first [tensor specification class](https://github.com/pytorch/rl/blob/main/torchrl/data/tensor_specs.py) is also provided.
   TorchRL's environments API is simple but stringent and specific. Check the
   [documentation](https://pytorch.org/rl/stable/reference/envs.html) and
   [tutorial](https://pytorch.org/rl/stable/tutorials/pendulum.html) to learn more!
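For a concrete picture of the environment API referenced in the hunk above, here is a minimal sketch (not part of the patch) assuming torchrl and gymnasium are installed; `GymEnv` and `rollout` are the documented entry points:

```python
# Minimal sketch of the environment API: every call returns a TensorDict,
# so the same rollout code works across backends. Assumes torchrl + gymnasium.
from torchrl.envs.libs.gym import GymEnv

env = GymEnv("Pendulum-v1")
reset_td = env.reset()                  # TensorDict holding the initial observation
rollout_td = env.rollout(max_steps=10)  # TensorDict with batch_size [10]
print(rollout_td)                       # observations, actions and rewards under dedicated keys
```

Because the data travels as a `TensorDict`, swapping the simulator does not require touching the surrounding loop.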
@@ -288,7 +288,7 @@ And it is `functorch` and `torch.compile` compatible!
 
-- multiprocess and distributed [data collectors](torchrl/collectors/collectors.py)(2)
+- multiprocess and distributed [data collectors](https://github.com/pytorch/rl/blob/main/torchrl/collectors/collectors.py)(2)
   that work synchronously or asynchronously. Through the use of TensorDict, TorchRL's
   training loops are made very similar to regular training loops in supervised
@@ -315,10 +315,10 @@ And it is `functorch` and `torch.compile` compatible!
 
-  Check our [distributed collector examples](examples/distributed/collectors) to
+  Check our [distributed collector examples](https://github.com/pytorch/rl/blob/main/examples/distributed/collectors) to
   learn more about ultra-fast data collection with TorchRL.
 
-- efficient(2) and generic(1) [replay buffers](torchrl/data/replay_buffers/replay_buffers.py) with modularized storage:
+- efficient(2) and generic(1) [replay buffers](https://github.com/pytorch/rl/blob/main/torchrl/data/replay_buffers/replay_buffers.py) with modularized storage:
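To ground the collector and replay-buffer bullets above, a hedged sketch (not from the patch; assumes torchrl and gymnasium, with `policy=None` so the collector falls back to a random policy):

```python
# Sketch only: a synchronous collector feeding a replay buffer.
from torchrl.collectors import SyncDataCollector
from torchrl.data import LazyTensorStorage, ReplayBuffer
from torchrl.envs.libs.gym import GymEnv

collector = SyncDataCollector(
    GymEnv("Pendulum-v1"),
    policy=None,               # None -> a random policy over the action spec
    frames_per_batch=64,
    total_frames=256,
)
buffer = ReplayBuffer(storage=LazyTensorStorage(max_size=1_000))
for batch in collector:               # each batch is a TensorDict of 64 transitions
    buffer.extend(batch.reshape(-1))  # flatten the batch before storing
sample = buffer.sample(32)            # uniform sampling by default
```

The storage, sampler and writer arguments are the modular pieces the bullet alludes to; a memory-mapped storage changes where the data lives without changing the loop.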
@@ -357,7 +357,7 @@ And it is `functorch` and `torch.compile` compatible!
 
-- cross-library [environment transforms](torchrl/envs/transforms/transforms.py)(1),
+- cross-library [environment transforms](https://github.com/pytorch/rl/blob/main/torchrl/envs/transforms/transforms.py)(1),
   executed on device and in a vectorized fashion(2), which process and prepare the
   data coming out of the environments to be used by the agent:
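As an illustration of the transforms bullet (again a sketch, not patch content; `TransformedEnv`, `Compose`, `ObservationNorm` and `StepCounter` are documented members of `torchrl.envs`):

```python
# Sketch only: transforms stacked on an environment, run inside the rollout.
from torchrl.envs import Compose, ObservationNorm, StepCounter, TransformedEnv
from torchrl.envs.libs.gym import GymEnv

env = TransformedEnv(
    GymEnv("Pendulum-v1"),
    Compose(
        ObservationNorm(loc=0.0, scale=1.0, in_keys=["observation"]),
        StepCounter(),  # appends a "step_count" entry to every step
    ),
)
td = env.rollout(max_steps=5)  # the agent only ever sees transformed data
```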
@@ -391,7 +391,7 @@ And it is `functorch` and `torch.compile` compatible!
 
 - various tools for distributed learning (e.g. [memory mapped tensors](https://github.com/pytorch/tensordict/blob/main/tensordict/memmap.py))(2);
-- various [architectures](torchrl/modules/models/) and models (e.g. [actor-critic](torchrl/modules/tensordict_module/actors.py))(1):
+- various [architectures](https://github.com/pytorch/rl/blob/main/torchrl/modules/models/) and models (e.g. [actor-critic](https://github.com/pytorch/rl/blob/main/torchrl/modules/tensordict_module/actors.py))(1):
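To make the architectures bullet concrete, a minimal sketch (not from the patch; `MLP` is a documented `torchrl.modules` class, `TensorDictModule` comes from `tensordict.nn`, and the sizes match Pendulum's 3-dimensional observation and 1-dimensional action):

```python
# Sketch only: a plain MLP wrapped so it reads and writes TensorDict entries,
# which is how torchrl models plug into environments and losses.
import torch
from tensordict import TensorDict
from tensordict.nn import TensorDictModule
from torchrl.modules import MLP

policy = TensorDictModule(
    MLP(in_features=3, out_features=1, num_cells=[64, 64]),
    in_keys=["observation"],  # read the network input from td["observation"]
    out_keys=["action"],      # write the network output to td["action"]
)
td = policy(TensorDict({"observation": torch.randn(3)}, batch_size=[]))
assert "action" in td.keys()
```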
@@ -443,8 +443,8 @@ And it is `functorch` and `torch.compile` compatible!
 
-- exploration [wrappers](torchrl/modules/tensordict_module/exploration.py) and
-  [modules](torchrl/modules/models/exploration.py) to easily swap between exploration and exploitation(1):
+- exploration [wrappers](https://github.com/pytorch/rl/blob/main/torchrl/modules/tensordict_module/exploration.py) and
+  [modules](https://github.com/pytorch/rl/blob/main/torchrl/modules/models/exploration.py) to easily swap between exploration and exploitation(1):
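A hedged sketch of that exploration/exploitation swap, reusing the `policy` and `env` names from the sketches above (not from the patch; `EGreedyModule` is a documented torchrl class, most natural with discrete actions, and is shown here purely for the wiring pattern):

```python
# Sketch only: epsilon-greedy exploration layered on top of a deterministic
# policy; dropping the module (or annealing eps to 0) restores pure exploitation.
from tensordict.nn import TensorDictSequential
from torchrl.modules import EGreedyModule

exploration_policy = TensorDictSequential(
    policy,  # the TensorDictModule defined in the earlier sketch
    EGreedyModule(spec=env.action_spec, eps_init=1.0, eps_end=0.05),
)
```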
@@ -481,37 +481,37 @@ And it is `functorch` and `torch.compile` compatible!
 
-- a generic [trainer class](torchrl/trainers/trainers.py)(1) that
+- a generic [trainer class](https://github.com/pytorch/rl/blob/main/torchrl/trainers/trainers.py)(1) that
   executes the aforementioned training loop. Through a hooking mechanism,
   it also supports any logging or data transformation operation at any given time.
-- various [recipes](torchrl/trainers/helpers/models.py) to build models that
+- various [recipes](https://github.com/pytorch/rl/blob/main/torchrl/trainers/helpers/models.py) to build models that
   correspond to the environment being deployed.
 
 If you feel a feature is missing from the library, please submit an issue!
-If you would like to contribute to new features, check our [call for contributions](https://github.com/pytorch/rl/issues/509) and our [contribution](CONTRIBUTING.md) page.
+If you would like to contribute to new features, check our [call for contributions](https://github.com/pytorch/rl/issues/509) and our [contribution](https://github.com/pytorch/rl/blob/main/CONTRIBUTING.md) page.
 
 ## Examples, tutorials and demos
 
-A series of [examples](examples/) are provided with an illustrative purpose:
-- [DQN](sota-implementations/dqn)
-- [DDPG](sota-implementations/ddpg/ddpg.py)
-- [IQL](sota-implementations/iql/iql_offline.py)
-- [CQL](sota-implementations/cql/cql_offline.py)
-- [TD3](sota-implementations/td3/td3.py)
-- [A2C](examples/a2c_old/a2c.py)
-- [PPO](sota-implementations/ppo/ppo.py)
-- [SAC](sota-implementations/sac/sac.py)
-- [REDQ](sota-implementations/redq/redq.py)
-- [Dreamer](sota-implementations/dreamer/dreamer.py)
-- [Decision Transformers](sota-implementations/decision_transformer)
-- [RLHF](examples/rlhf)
+A series of [examples](https://github.com/pytorch/rl/blob/main/examples/) is provided for illustrative purposes:
+- [DQN](https://github.com/pytorch/rl/blob/main/sota-implementations/dqn)
+- [DDPG](https://github.com/pytorch/rl/blob/main/sota-implementations/ddpg/ddpg.py)
+- [IQL](https://github.com/pytorch/rl/blob/main/sota-implementations/iql/iql_offline.py)
+- [CQL](https://github.com/pytorch/rl/blob/main/sota-implementations/cql/cql_offline.py)
+- [TD3](https://github.com/pytorch/rl/blob/main/sota-implementations/td3/td3.py)
+- [A2C](https://github.com/pytorch/rl/blob/main/examples/a2c_old/a2c.py)
+- [PPO](https://github.com/pytorch/rl/blob/main/sota-implementations/ppo/ppo.py)
+- [SAC](https://github.com/pytorch/rl/blob/main/sota-implementations/sac/sac.py)
+- [REDQ](https://github.com/pytorch/rl/blob/main/sota-implementations/redq/redq.py)
+- [Dreamer](https://github.com/pytorch/rl/blob/main/sota-implementations/dreamer/dreamer.py)
+- [Decision Transformers](https://github.com/pytorch/rl/blob/main/sota-implementations/decision_transformer)
+- [RLHF](https://github.com/pytorch/rl/blob/main/examples/rlhf)
 
 and many more to come!
 
-Check the [examples](sota-implementations/) directory for more details
+Check the [examples](https://github.com/pytorch/rl/blob/main/sota-implementations/) directory for more details
 about handling the various configuration settings.
 
 We also provide [tutorials and demos](https://pytorch.org/rl/stable#tutorials) that give a sense of
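Before the troubleshooting hunks below, here is a hedged outline of the loop the trainer bullet describes, written out by hand and reusing the `collector`, `buffer` and `policy` names from the earlier sketches; `compute_loss` is a hypothetical stand-in for one of torchrl's loss modules (DQNLoss, PPOLoss, ...):

```python
# Sketch only: the generic training loop that the trainer class automates.
# The comments mark the points where its hooks would fire.
import torch

optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

for batch in collector:               # batch-collection hooks fire here
    buffer.extend(batch.reshape(-1))
    sample = buffer.sample(32)        # pre-optimization hooks (data transforms)
    loss = compute_loss(sample)       # hypothetical stand-in for a loss module
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()             # post-optimization hooks (logging, target updates)
```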
@@ -670,7 +670,7 @@ it means that the C++ extensions were not installed or not found.
 
 Versioning issues can cause error messages of the type ```undefined symbol```
-and such. For these, refer to the [versioning issues document](knowledge_base/VERSIONING_ISSUES.md)
+and such. For these, refer to the [versioning issues document](https://github.com/pytorch/rl/blob/main/knowledge_base/VERSIONING_ISSUES.md)
 for a complete explanation and proposed workarounds.
 
 ## Asking a question
 
 If you spot a bug in the library, please raise an issue in this repo.
 If you have a more generic question regarding RL in PyTorch, post it on
@@ -683,7 +683,7 @@ the [PyTorch forum](https://discuss.pytorch.org/c/reinforcement-learning/6).
 ## Contributing
 
 Contributions to torchrl are welcome! Feel free to fork, submit issues and PRs.
-You can checkout the detailed contribution guide [here](CONTRIBUTING.md).
+You can check out the detailed contribution guide [here](https://github.com/pytorch/rl/blob/main/CONTRIBUTING.md).
 As mentioned above, a list of open contributions can be found [here](https://github.com/pytorch/rl/issues/509).
 Contributors are recommended to install [pre-commit hooks](https://pre-commit.com/) (using `pre-commit install`).
 pre-commit will check for linting-related issues when the code is committed locally. You can disable the
 check by appending `-n` to your commit command: `git commit -m <commit message> -n`
@@ -696,4 +696,4 @@ BC-breaking changes are likely to happen but they will be introduced with a deprecation
 warranty after a few release cycles.
 
 # License
-TorchRL is licensed under the MIT License. See [LICENSE](LICENSE) for details.
+TorchRL is licensed under the MIT License. See [LICENSE](https://github.com/pytorch/rl/blob/main/LICENSE) for details.