
User guide #325

Open
TimotheeMathieu opened this issue Jun 27, 2023 · 3 comments
Labels: documentation (Improvements or additions to documentation), Marathon (To do during Marathon)

TimotheeMathieu (Collaborator) commented on Jun 27, 2023

I propose we write a user guide for rlberry. The outline would be something like this:

  • Installation
  • Basic Usage
    • Quick Start RL (a rough sketch is given after this comment)
    • Quick Start Deep RL
  • Setup of an experiment
    • AgentManager, agents, environments
    • Training phase, evaluation phase
    • Logging
    • Parallelization how-to
  • Running an experiment
    • Train an agent
    • Evaluate agents
    • Tune hyperparameters
    • Plot relevant statistics
  • Saving and loading
    • Save and load agents
    • Save and load managers
    • Writers
    • Save and load data for plots
  • Make your own agent or environment
    • Interaction with Gymnasium
    • Using environments from Gymnasium
    • Using agents from Stable-Baselines3
    • Deep RL agents
      • Neural network utils
      • Interactions with torch
    • Seeding
  • Using Bandits in rlberry

Feel free to suggest any change to this outline. Once we all agree on the outline, we can distribute the work among us.
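
For the "Quick Start RL" page, a minimal sketch of what the example could look like is below. The names (`AgentManager`, `UCBVIAgent`, `GridWorld`, `evaluate_agents`) and their arguments are written from memory of the current API and would need to be checked against the library before going into the guide:

```python
# Rough sketch for the "Quick Start RL" page (imports/arguments to be verified).
from rlberry.envs import GridWorld
from rlberry.agents import UCBVIAgent
from rlberry.manager import AgentManager, evaluate_agents

# Train several independent instances of UCBVI on a small GridWorld.
manager = AgentManager(
    UCBVIAgent,            # agent class
    (GridWorld, {}),       # (environment constructor, constructor kwargs)
    fit_budget=500,        # training budget (episodes)
    n_fit=4,               # number of independent training runs
    seed=42,               # fixed seed, so the quick start is reproducible
)
manager.fit()

# Monte-Carlo evaluation of the trained agents.
evaluate_agents([manager], show=False)
```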

TimotheeMathieu (Collaborator, Author) commented:

Also, I suggest we use rundoc or something similar to verify that the code in the user guide actually runs and exits with code 0.

I think this should go into the long tests, because the user guide will contain some code that trains agents and it would be too heavy for Azure.
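
Whether we end up using rundoc itself or a small helper of our own, the long test could look roughly like this: collect the fenced Python blocks from the user guide pages and fail if any of them raises. The `docs/basics` path and the test name are placeholders:

```python
# Hypothetical long test: execute every fenced Python block of the user guide
# and fail if any block raises (i.e. does not finish with "exit code 0").
import pathlib
import re

CODE_BLOCK = re.compile(r"```python\n(.*?)```", re.DOTALL)

def iter_code_blocks(doc_dir="docs/basics"):
    for md_file in sorted(pathlib.Path(doc_dir).glob("**/*.md")):
        for block in CODE_BLOCK.findall(md_file.read_text()):
            yield md_file, block

def test_user_guide_code_runs():
    for md_file, block in iter_code_blocks():
        # Fresh namespace per block; any exception fails the test.
        exec(compile(block, str(md_file), "exec"), {"__name__": "__main__"})
```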

KohlerHECTOR added the documentation and Marathon labels on Jul 13, 2023
KohlerHECTOR (Collaborator) commented:

An example of a user guide section from PR #276: https://rlberry--276.org.readthedocs.build/en/276/basics/comparison.html

We can try Jupytext to edit the Markdown pages in Jupyter.
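
If Jupytext works well, the round trip between the Markdown sources and notebooks can also be scripted (the paths below are just an example, based on the page linked above):

```python
# Convert a user-guide page to a notebook, edit it in Jupyter, then sync back.
import jupytext

nb = jupytext.read("docs/basics/comparison.md")      # Markdown -> notebook object
jupytext.write(nb, "docs/basics/comparison.ipynb")   # editable in Jupyter
```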

riiswa (Collaborator) commented on Jul 21, 2023

I'm adding notes concerning Philippe's remarks (check your mailbox):

  • The user guide should tell "how rlberry should be used". For example: experiments should be reproducible, and we should make sure that all the examples we give are reproducible.
  • Example of documentation that should be made clearer: `eval([eval_horizon, n_simulations, gamma])`: "Monte-Carlo policy evaluation [1] of an agent to estimate the value at the initial state."
    • What do we evaluate? Do we evaluate the initial state, or do we evaluate a policy/trained agent?
    • Define the 3 arguments (a possible docstring is sketched after this list).
  • How do we seed an agent? A call to reseed(), or some other way? The description of reseed() is very unclear to me: do we provide a sequence of numbers, or a single number/seed?
  • kwargs should be explained, and their attributes listed in all the different cases (see Handling **kwargs #334).
    • Regarding the save() method, what does "Overwrite the 'save' function to manage CPU vs GPU save/load in torch agent" mean? Does it save the rlberry agent or just its Q-network? The Q-networks in the case of DDQN? ... (a small sketch of what the docs should spell out is given at the end of this comment)
      Same thing for load(). Moreover, we don't care that it overloads any other method (see Consistent naming #341); we want to know what it does.
  • Include all the arguments in the docstring
  • Why is the default value indicated for some arguments and not for all?
  • More details about how to evaluate an agent during training.
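
To illustrate the level of detail asked for, the `eval` docstring could be expanded roughly as follows (the defaults shown are placeholders, not necessarily the actual ones):

```python
def eval(self, eval_horizon=10**5, n_simulations=10, gamma=1.0):
    """Monte-Carlo evaluation of the *trained policy*.

    Runs ``n_simulations`` episodes with the trained agent's policy,
    starting from the environment's initial state, and returns the
    average discounted sum of rewards over those episodes.

    Parameters
    ----------
    eval_horizon : int, default: 10**5
        Maximum number of steps in each evaluation episode.
    n_simulations : int, default: 10
        Number of Monte-Carlo episodes used for the estimate.
    gamma : float, default: 1.0
        Discount factor applied to the rewards during evaluation.

    Returns
    -------
    float
        Estimated value of the trained policy at the initial state.
    """
```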

Basically, we should go over each function/method and rewrite the documentation where needed, so that everything is documented and explicit.
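
And for the save()/load() point above, the documentation should spell out something like the following (the class and attribute names here are hypothetical; only the CPU/GPU handling is the point):

```python
# Hypothetical torch agent: why save()/load() are overridden at all.
import torch

class SomeTorchAgent:
    def save(self, filename):
        """Save the agent's network weights on CPU, so that a model trained
        on GPU can later be loaded on a machine without a GPU."""
        state_dict = {k: v.cpu() for k, v in self._qnet.state_dict().items()}
        torch.save({"qnet": state_dict}, filename)

    def load(self, filename, device="cpu"):
        """Reload the weights written by ``save`` onto the requested device."""
        state = torch.load(filename, map_location=device)
        self._qnet.load_state_dict(state["qnet"])
        self._qnet.to(device)
```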
