Minor_edits_doc #298

Status: Open. Wants to merge 2 commits into base: dev.
4 changes: 2 additions & 2 deletions docs/quickstart.rst
@@ -56,8 +56,8 @@ sure you look at the examples
hive_single_agent_loop -c <config-file>
hive_multi_agent_loop -c <config-file>

- Finally, if instead you want to use your own custom custom components you can
- simply register it with RLHive and run your config normally:
+ Finally, if instead you want to use your own custom components you can
+ simply register it with RLHive and run your config in the following way:

.. code-block:: python

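As a hedged illustration of the registration step this hunk describes (a minimal sketch: the ``register`` signature and the custom class are assumptions, not shown in this PR):

.. code-block:: python

   import hive
   from hive.agents.agent import Agent
   from my_package import MyCustomAgent  # hypothetical custom component

   # Register the component under a name the YAML config can refer to;
   # register(name, constructor, type) is the assumed signature.
   hive.registry.register("MyCustomAgent", MyCustomAgent, Agent)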
1 change: 1 addition & 0 deletions docs/tutorials/agent_tutorial.rst
@@ -32,6 +32,7 @@ First, we define the constructor:
self._q_values = np.zeros(obs_dim, act_dim)
self._gamma = gamma
self._alpha = alpha
+ self._act_dim = act_dim
self._epsilon_schedule = LinearSchedule(1.0, final_epsilon, explore_steps)

Comment on lines 32 to 37 (Collaborator):
This tutorial needs to be updated based on the new API. We no longer pass ``act_dim`` and ``obs_dim``; we instead pass spaces. Can you please update the tutorial?

In this constructor, we created a numpy array to keep track of the Q-values for every
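To make the reviewer's request concrete, here is a hedged sketch of the constructor taking spaces instead of raw dims (assuming discrete Gym-style spaces; the parameter names are assumptions):

.. code-block:: python

   import numpy as np

   def __init__(self, observation_space, action_space, gamma, alpha,
                final_epsilon, explore_steps):
       # np.zeros expects a shape tuple; note the snippet above passes obs_dim
       # and act_dim as two positional arguments, which makes act_dim the dtype.
       self._q_values = np.zeros((observation_space.n, action_space.n))
       self._gamma = gamma
       self._alpha = alpha
       self._act_dim = action_space.n
       self._epsilon_schedule = LinearSchedule(1.0, final_epsilon, explore_steps)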
10 changes: 5 additions & 5 deletions docs/tutorials/configuration_tutorial.rst
@@ -36,8 +36,8 @@ In this example, :py:class:`~hive.agents.dqn_agent.DQNAgent` ,
:py:class:`~hive.agents.qnets.mlp.MLPNetwork` , and
:py:class:`~hive.replays.circular_replay.CircularReplayBuffer` are all classes
registered with RLHive. Thus, we can do this configuration directly. When the
- ``registry`` getter function for agents,
- :py:meth:`~hive.utils.registry.Registry.get_agent` is then called with this config
+ ``registry`` getter function for agents
+ :py:meth:`~hive.utils.registry.Registry.get_agent`, is then called with this config
dictionary (with the missing required arguments such as ``obs_dim`` and ``act_dim``,
filled in), it will build all the inner RLHive objects automatically.
This works by using the type annotations on the constructors of the objects, so
@@ -51,8 +51,8 @@ Overriding from command lines
--------------------------------
When using the ``registry`` getter functions, RLHive automatically checks any command
line arguments passed to see if they match/override any default or yaml configured
- arguments. With ``getter`` functionyou provide a config and a prefix. That prefix
- is added prepended to any argument names when searching the command line. For example,
+ arguments. With ``getter`` function you provide a config and a prefix. That prefix
+ is added, prepended to any argument names when searching the command line. For example,
with the above config, if it were loaded and the
:py:meth:`~hive.utils.registry.Registry.get_agent` method was called as follows:

@@ -65,7 +65,7 @@ python script: ``--ag.discount_rate .95``. This can go arbitrarily deep into reg
RLHive class. For example, if you wanted to change the capacity of the replay buffer,
you could pass ``--ag.replay_buffer.capacity 100000``.

- If the type annotation the argument ``arg`` is ``List[C]`` where C is a registered
+ If the type annotation of the argument ``arg`` is ``List[C]`` where C is a registered
RLHive class, then you can override the argument of an individual object, ``foo``,
configured through YAML by passing ``--arg.0.foo <value>``.

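A hedged sketch of the override flow this file describes (the config layout and the ``prefix`` argument are assumptions based on the surrounding text):

.. code-block:: python

   from hive.utils.registry import registry

   # Config as it might be parsed from YAML; names follow the tutorial text.
   agent_config = {
       "name": "DQNAgent",
       "kwargs": {
           "discount_rate": 0.99,
           "replay_buffer": {
               "name": "CircularReplayBuffer",
               "kwargs": {"capacity": 50000},
           },
       },
   }
   # With prefix "ag", passing --ag.replay_buffer.capacity 100000 on the
   # command line overrides the configured capacity.
   agent = registry.get_agent(agent_config, "ag")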
2 changes: 1 addition & 1 deletion docs/tutorials/env_tutorial.rst
@@ -22,7 +22,7 @@ Creating an Environment

RLHive Environments
^^^^^^^^^^^^^^^^^^^
- Every environment used in RLHive should be a subclass of `~hive.envs.base.BaseEnv`.
+ Every environment used in RLHive should be a subclass of :py:class:`~hive.envs.base.BaseEnv`.
It should provide a ``reset`` function that resets the environment to a new episode
and returns a tuple of ``(observation, turn)`` and a ``step`` function that takes in
an action, performs the step in the environment, and returns a tuple of
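A hedged illustration of the ``reset``/``step`` contract described above (the full ``step`` return tuple is cut off in this diff, so its shape here is an assumption, as are the helper methods):

.. code-block:: python

   from hive.envs.base import BaseEnv

   class MyEnv(BaseEnv):
       def reset(self):
           observation = self._initial_observation()  # hypothetical helper
           return observation, 0  # (observation, turn)

       def step(self, action):
           # The exact return tuple is truncated in the diff above, so this
           # (observation, reward, done, turn, info) shape is an assumption.
           observation, reward, done = self._transition(action)  # hypothetical
           return observation, reward, done, 0, {}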
2 changes: 1 addition & 1 deletion docs/tutorials/runner_tutorial.rst
@@ -6,7 +6,7 @@ We provide two different :py:class:`~hive.runners.base.Runner` classes:
for both Runner classes can be viewed in their respective files with the
:py:meth:`set_up_experiment` functions.
The :py:meth:`~hive.utils.registry.get_parsed_args` function can be used
- to get any arguments from the command line are not part of the signatures
+ to get any arguments from the command line that are not part of the signatures
of already registered RLHive class constructors.


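A hedged usage sketch of :py:meth:`~hive.utils.registry.get_parsed_args` (the name-to-type mapping format is an assumption):

.. code-block:: python

   from hive.utils.registry import get_parsed_args

   # Pick up leftover command-line flags (e.g. --seed 42) that no registered
   # constructor's signature claims.
   args = get_parsed_args({"seed": int, "resume": bool})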
8 changes: 4 additions & 4 deletions hive/agents/ddpg.py
@@ -50,18 +50,18 @@ def __init__(
None, defaults to :py:class:`~torch.nn.Identity`.
actor_net (FunctionApproximator): The network that takes the encoded
observations from representation_net and outputs the representations
- used to compute the actions (ie everything except the last layer).
+ used to compute the actions (i.e. everything except the last layer).
critic_net (FunctionApproximator): The network that takes two inputs: the
encoded observations from representation_net and actions. It outputs
- the representations used to compute the values of the actions (ie
+ the representations used to compute the values of the actions (i.e.
everything except the last layer).
init_fn (InitializationFn): Initializes the weights of agent networks using
create_init_weights_fn.
actor_optimizer_fn (OptimizerFn): A function that takes in the list of
- parameters of the actor returns the optimizer for the actor. If None,
+ parameters of the actor and returns the optimizer for the actor. If None,
defaults to :py:class:`~torch.optim.Adam`.
critic_optimizer_fn (OptimizerFn): A function that takes in the list of
- parameters of the critic returns the optimizer for the critic. If None,
+ parameters of the critic and returns the optimizer for the critic. If None,
Comment (Collaborator):
Please make sure any changes to docstrings/comments result in lines <=88 columns.

defaults to :py:class:`~torch.optim.Adam`.
critic_loss_fn (LossFn): The loss function used to optimize the critic. If
None, defaults to :py:class:`~torch.nn.MSELoss`.
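A hedged example of the ``OptimizerFn`` contract these docstrings describe (the learning rate is an arbitrary illustration):

.. code-block:: python

   import torch

   # Takes the actor's parameter list and returns the optimizer for the actor;
   # if omitted, the agent defaults to torch.optim.Adam per the docstring.
   def actor_optimizer_fn(params):
       return torch.optim.Adam(params, lr=1e-3)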
1 change: 1 addition & 0 deletions hive/agents/rainbow.py
@@ -257,6 +257,7 @@ def act(self, observation):
def update(self, update_info):
"""
Updates the DQN agent.
+
Args:
update_info: dictionary containing all the necessary information to
update the agent. Should contain a full transition, with keys for
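A hedged sketch of the ``update_info`` dictionary this docstring describes (the docstring's key list is truncated in the diff, so these key names are assumptions):

.. code-block:: python

   # Key names are hypothetical; the full transition keys are cut off above.
   update_info = {
       "observation": observation,
       "action": action,
       "reward": reward,
       "done": done,
   }
   agent.update(update_info)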
4 changes: 2 additions & 2 deletions hive/agents/td3.py
@@ -62,10 +62,10 @@ def __init__(
None, defaults to :py:class:`~torch.nn.Identity`.
actor_net (FunctionApproximator): The network that takes the encoded
observations from representation_net and outputs the representations
- used to compute the actions (ie everything except the last layer).
+ used to compute the actions (i.e. everything except the last layer).
critic_net (FunctionApproximator): The network that takes two inputs: the
encoded observations from representation_net and actions. It outputs
- the representations used to compute the values of the actions (ie
+ the representations used to compute the values of the actions (i.e.
everything except the last layer).
init_fn (InitializationFn): Initializes the weights of agent networks using
create_init_weights_fn.
3 changes: 0 additions & 3 deletions hive/replays/circular_replay.py
@@ -43,9 +43,6 @@ def __init__(
a numpy type, a string of the form np.uint8 or numpy.uint8 is
acceptable.
action_shape: Shape of actions that will be stored in the buffer.
action_dtype: Type of actions that will be stored in the buffer. Format is
described in the description of observation_dtype.
- action_shape: Shape of actions that will be stored in the buffer.
- action_dtype: Type of actions that will be stored in the buffer. Format is
- described in the description of observation_dtype.
reward_shape: Shape of rewards that will be stored in the buffer.
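A hedged construction sketch using the shape/dtype arguments documented above (the capacity and the Atari-style observation shape are arbitrary illustrations):

.. code-block:: python

   import numpy as np
   from hive.replays.circular_replay import CircularReplayBuffer

   # Shapes and dtypes follow the docstring above; values are illustrative.
   buffer = CircularReplayBuffer(
       capacity=10000,
       observation_shape=(84, 84),
       observation_dtype=np.uint8,
       action_shape=(),
       action_dtype=np.int8,
   )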
3 changes: 0 additions & 3 deletions hive/replays/prioritized_replay.py
@@ -46,9 +46,6 @@ def __init__(
a numpy type, a string of the form np.uint8 or numpy.uint8 is
acceptable.
action_shape: Shape of actions that will be stored in the buffer.
action_dtype: Type of actions that will be stored in the buffer. Format is
described in the description of observation_dtype.
- action_shape: Shape of actions that will be stored in the buffer.
- action_dtype: Type of actions that will be stored in the buffer. Format is
- described in the description of observation_dtype.
reward_shape: Shape of rewards that will be stored in the buffer.
2 changes: 1 addition & 1 deletion hive/runners/utils.py
@@ -17,7 +17,7 @@ def load_config(
logger_config=None,
):
"""Used to load config for experiments. Agents, environment, and loggers components
- in main config file can be overrided based on other log files.
+ in main config file can be overriden based on other log files.

Args:
config (str): Path to configuration file. Either this or :obj:`preset_config`
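A hedged usage sketch of ``load_config`` (the file paths are hypothetical; only keyword names visible in this diff are used):

.. code-block:: python

   from hive.runners.utils import load_config

   # The logger sub-config overrides the corresponding component of the main
   # config file, per the docstring above.
   config = load_config(
       config="configs/my_experiment.yml",     # hypothetical path
       logger_config="configs/my_logger.yml",  # hypothetical path
   )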
1 change: 1 addition & 0 deletions hive/utils/experiment.py
@@ -72,6 +72,7 @@ def should_save(self):

def save(self, tag="current"):
"""Saves the experiment.
+
Args:
tag (str): Tag to prefix the folder.
"""
2 changes: 1 addition & 1 deletion hive/utils/registry.py
@@ -40,7 +40,7 @@ class Registry:
For example, let's consider the following scenario:
Your agent class has an argument `arg1` which is annotated to be `List[Class1]`,
`Class1` is `Registrable`, and the `Class1` constructor takes an argument `arg2`.
- In the passed yml config, there are two different Class1 object configs listed.
+ In the passed yml config, there are two different Class1 object configs listed,
the constructor will check to see if both `--agent.arg1.0.arg2` and
`--agent.arg1.1.arg2` have been passed.

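A hedged illustration of the scenario in this docstring (the ``name``/``kwargs`` layout is an assumption about the config schema; all names are hypothetical):

.. code-block:: python

   # Two Class1 configs listed under arg1.
   agent_config = {
       "name": "MyAgent",
       "kwargs": {
           "arg1": [
               {"name": "Class1", "kwargs": {"arg2": 1}},
               {"name": "Class1", "kwargs": {"arg2": 2}},
           ],
       },
   }
   # The constructor then checks --agent.arg1.0.arg2 and --agent.arg1.1.arg2
   # on the command line for per-object overrides.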