Merge pull request #52 from strakam/minor-fixes
Documentation updates
strakam authored Sep 29, 2024
2 parents 1486737 + 6a79158 commit fb3ffe9
Showing 3 changed files with 34 additions and 70 deletions.
87 changes: 30 additions & 57 deletions README.md
@@ -10,8 +10,7 @@



[Installation](#-installation) • [Customization](#-customization) • [Environment](#-environment) • [Getting Started](#-getting-started)

[Installation](#-installation) • [Getting Started](#-getting-started) • [Customization](#-custom-maps) • [Environment](#-environment)
</div>

[Generals.io](https://generals.io/) is a real-time strategy game where players compete to conquer their opponents' generals on a 2D grid. While the goal is simple — capture the enemy general — the gameplay involves a lot of depth. Players need to employ strategic planning, deception, and manage both micro and macro mechanics throughout the game. The combination of these elements makes the game highly engaging and complex.
@@ -45,35 +44,8 @@ cd Generals-RL
pip install -e .
```

## Usage Example (🦁 PettingZoo)
```python
from generals.env import pz_generals
from generals.agents import ExpanderAgent, RandomAgent

# Initialize agents
random = RandomAgent()
expander = ExpanderAgent()

agents = {
    random.name: random,
    expander.name: expander,
} # Environment calls agents by name
<h2 id="custom-usage-example">Usage example (🤸 Gymnasium)</h2>

# Create environment -- render modes: {None, "human"}
env = pz_generals(agents=agents, render_mode="human")
observations, info = env.reset()

while not env.game.is_done():
    actions = {}
    for agent in env.agents:
        # Ask agent for action
        actions[agent] = agents[agent].play(observations[agent])
    # All agents perform their actions
    observations, rewards, terminated, truncated, info = env.step(actions)
    env.render(fps=6)
```

## Usage example (🤸 Gymnasium)
```python
from generals.env import gym_generals
from generals.agents import RandomAgent, ExpanderAgent
@@ -94,9 +66,21 @@ while not done:
    done = terminated or truncated
    env.render(fps=6)
```
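
The middle of this example is collapsed in the diff. Purely as an illustration of the overall flow (the `gym_generals` keyword names `agent` and `npc`, and the exact reset/step signatures, are assumptions rather than text recovered from the collapsed lines), the complete loop could look roughly like this:

```python
from generals.env import gym_generals
from generals.agents import RandomAgent, ExpanderAgent

agent = ExpanderAgent()   # the agent you control
npc = RandomAgent()       # assumed: a built-in opponent for the single-agent env

env = gym_generals(agent=agent, npc=npc, render_mode="human")  # assumed keywords
observation, info = env.reset()

done = False
while not done:
    action = agent.play(observation)
    observation, reward, terminated, truncated, info = env.step(action)
    done = terminated or truncated
    env.render(fps=6)
```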
You can also check out an example for 🦁 [PettingZoo](./examples/pettingzoo_example.py) or a more extensive example [here](./examples/complete_example.py).

## 🎨 Custom Maps
## 🚀 Getting Started
Creating your first agent is very simple. Start by subclassing the `Agent` class, just like
[`RandomAgent`](./generals/agents/random_agent.py) or [`ExpanderAgent`](./generals/agents/expander_agent.py) do; a minimal sketch follows this list.
- Every agent must have a name, as it is the ID by which the environment calls it for actions.
- Every agent must implement a `play(observation)` method that takes an `observation` and returns an action as described in the [Environment](#-environment) section.
- You can start by copying the [Usage Example](#custom-usage-example) and replacing `agent` with your own implementation.
- When creating an environment, you can choose between two `render_modes`:
  - `None`, which omits rendering and is suitable for training,
  - `"human"`, where you can watch the game roll out.
- Also check the `Makefile`, which runs the examples, so you can get a feel for the repo 🤗.
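
Below is a minimal sketch of what such a subclass could look like. It assumes that `Agent` can be imported from `generals.agents`, that its constructor takes the agent's name, and that an action is a `(row, column, direction)` triple chosen via the `action_mask` described in the [Environment](#-environment) section; these details are assumptions, so check [`RandomAgent`](./generals/agents/random_agent.py) and [`ExpanderAgent`](./generals/agents/expander_agent.py) for the real API.

```python
import numpy as np

from generals.agents import Agent  # assumed import path, mirroring RandomAgent


class LazyAgent(Agent):
    """Illustrative agent that plays a random valid move every turn."""

    def __init__(self, name="Lazy"):
        super().__init__(name)  # assumed: the base class stores the name (agent ID)

    def play(self, observation):
        # `action_mask` has shape (N, M, 4); a 1 at [i, j, d] means that moving
        # from cell (i, j) in direction d (UP, DOWN, LEFT, RIGHT) is valid.
        # Assumes at least one valid move exists on the agent's turn.
        valid = np.argwhere(observation["action_mask"] == 1)
        i, j, direction = valid[np.random.randint(len(valid))]
        return i, j, direction  # assumed action format -- verify against ExpanderAgent
```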

## 🎨 Custom maps
Maps are handled via the `Mapper` class. You can instantiate it with the desired map properties, and it will generate
maps with those properties for each run.
```python
@@ -134,8 +118,8 @@ options = {'map' : map}
env.reset(options=options)
```
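
The instantiation of `Mapper` is collapsed in this diff, so the following is only a rough sketch of the idea; the property names (`grid_dims`, `mountain_density`, `city_density`, `general_positions`) and the way the mapper is handed to the environment are assumptions, so consult the `Mapper` class and the examples for the real parameters.

```python
from generals.env import pz_generals
from generals.agents import ExpanderAgent, RandomAgent
from generals.map import Mapper  # assumed import path

random, expander = RandomAgent(), ExpanderAgent()
agents = {random.name: random, expander.name: expander}

# Assumed property names -- the idea is that the Mapper generates a fresh map
# with these properties for every run.
mapper = Mapper(
    grid_dims=(16, 16),
    mountain_density=0.2,
    city_density=0.05,
    general_positions=[(2, 3), (13, 12)],
)
env = pz_generals(agents=agents, mapper=mapper, render_mode=None)  # assumed wiring
```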
Maps are encoded using these symbols:
- `.` for passable terrain
- `#` for non-passable terrain
- `.` for cells where you can move your army
- `#` for mountains (terrain that cannot be passed)
- `A,B` are the positions of the generals
- digits `0-9` represent the cost of cities, calculated as `(40 + digit)`
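
For illustration only (this layout is made up, not taken from the repository), a small map built from these symbols could look like the string below; it can then be passed to the environment via `options = {'map': map}` as shown above.

```python
# Hypothetical map: generals A and B, cities of cost 42 and 43, and a mountain wall.
map = """
A..#..
.2.#..
...#3.
....#B
"""
```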

@@ -168,18 +152,21 @@ replay.play()

## 🌍 Environment
### 🔭 Observation
An observation for one agent is a dictionary of 13 key/value pairs. Each key/value pair contains information about part of the game that is accessible to the agent.
An observation for one agent is a dictionary of 13 key/value pairs.
Each key/value pair contains information about part of the game that is accessible to the agent.
Values are `numpy` matrices with shape `(N, M)`, where `N` is the height of the map and `M` is the width.

| Key | Shape/Type | Description |
| --- | --- | --- |
| `army` | `(N,N,1)` | Number of units in a cell regardless of owner |
| `general` | `(N,N,1)` | Mask of cells that are visible to the agent |
| `city` | `(N,N,1)` | Mask saying whether a city is in a cell |
| `visibile_cells` | `(N,N,1)` | Mask indicating cells that are visible to the agent |
| `owned_cells` | `(N,N,1)` | Mask indicating cells controlled by the agent |
| `opponent_cells` | `(N,N,1)` | Mask indicating cells owned by the opponent |
| `neutral_cells` | `(N,N,1)` | Mask indicating cells that are not owned by agents |
| `structure` | `(N,N,1)` | Mask indicating whether cells contain cities or mountains, even out of FoV |
| `action_mask` | `(N,N,4)` | Mask where `[i,j,d]` indicates whether you can move from a cell `[i,j]` to direction `d` where directions are in order (UP, DOWN, LEFT, RIGHT) |
| `army` | `(N,M)` | Number of units in a cell regardless of owner |
| `general` | `(N,M)` | Mask indicating cells that contain a general |
| `city` | `(N,M)` | Mask saying whether a city is in a cell |
| `visibile_cells` | `(N,M)` | Mask indicating cells that are visible to the agent |
| `owned_cells` | `(N,M)` | Mask indicating cells controlled by the agent |
| `opponent_cells` | `(N,M)` | Mask indicating cells owned by the opponent |
| `neutral_cells` | `(N,M)` | Mask indicating cells that are not owned by any agent |
| `structure` | `(N,M)` | Mask indicating whether cells contain cities or mountains, even outside the agent's field of view |
| `action_mask` | `(N,M,4)` | Mask where `[i,j,d]` indicates whether you can move from cell `[i,j]` in direction `d`, with directions ordered (UP, DOWN, LEFT, RIGHT) |
| `owned_land_count` | `Int` | Number of cells the agent owns |
| `owned_army_count` | `Int` | Total number of units the agent has across all cells |
| `opponent_land_count`| `Int` | Number of cells owned by the opponent |
@@ -207,17 +194,3 @@ def custom_reward_fn(observations):
env = generals_v0(reward_fn=custom_reward_fn)
observations, info = env.reset()
```
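
The body of `custom_reward_fn` is collapsed in this diff. As a minimal sketch only, assuming the function receives the per-agent observation dictionaries described above and returns one reward per agent (the exact signature and return format are assumptions), a land-based shaping reward could look like this:

```python
def custom_reward_fn(observations):
    # Illustrative only: reward each agent in proportion to the land it owns.
    return {
        agent: float(obs["owned_land_count"])
        for agent, obs in observations.items()
    }
```
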
## 🚀 Getting Started
Creating your first agent is very simple. Start by subclassing an `Agent` class just like `RandomAgent` [here](./generals/agents.py).
- Every agent must have a name, as it is the ID by which the environment calls it for actions.
- Every agent must implement a `play(observation)` method that takes an `observation` and returns an action as described above.
- You can simply follow the examples to get your bot running.
- When creating an environment, you can choose between two `render_modes`:
  - `none`, which omits rendering and is suitable for training,
  - `human`, where you can watch the game roll out.

## 🛠️ Coming Soon
- Examples and baselines using RL
- Add human control to play against

Requests for useful features and additions are welcome 🤗.
12 changes: 0 additions & 12 deletions TODO.md

This file was deleted.

5 changes: 4 additions & 1 deletion examples/pettingzoo_example.py
@@ -14,11 +14,14 @@
env = pz_generals(agents=agents, render_mode="human")
observations, info = env.reset()

while not env.game.is_done():
done = False

while not done:
    actions = {}
    for agent in env.agents:
        # Ask agent for action
        actions[agent] = agents[agent].play(observations[agent])
    # All agents perform their actions
    observations, rewards, terminated, truncated, info = env.step(actions)
    done = any(terminated.values()) or any(truncated.values())
    env.render(fps=6)
