Commit

docs(env): fix action format
Signed-off-by: Jonas Dujava <[email protected]>
jdujava committed Sep 25, 2024
1 parent df185cb commit f2f5af4
Showing 2 changed files with 5 additions and 5 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -212,11 +212,11 @@ An observation for one agent is a dictionary of 13 key/value pairs. Each key/val
| `timestep` | `(1,)` | Timestep |

### ⚡ Action
-Action is an `np.array([p,i,j,d,s])`:
-- Value of `p` is `1 (play)` or `0 (pass)`.
+Action is an `np.array([pass,i,j,d,split])`:
+- Value of `pass` is `0 (play)` or `1 (pass)`.
 - Indices `i,j` say that you want to move from cell with index `[i,j]`.
-- Value of `d` is a direction you want to choose: `0 (up)`, `1 (down)`, `2 (left)`, `3 (right)`
-- Value of `s` says whether you want to split units. Value `1` sends half of units and value `0` sends all possible units to the next cell.
+- Value of `d` is a direction of the movement: `0 (up)`, `1 (down)`, `2 (left)`, `3 (right)`
+- Value of `split` says whether you want to split units. Value `1` sends half of units and value `0` sends all possible units to the next cell.

### 🎁 Reward
It is possible to implement custom reward function. The default is `1` for winner and `-1` for loser, otherwise `0`.
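
The corrected action layout `[pass,i,j,d,split]` from this hunk can be illustrated with a small sketch. It is not part of the commit: `Direction`, `encode_action`, and `describe_action` are hypothetical names chosen for illustration, and the helper returns a plain list that would still need wrapping in `np.array(...)` to match the README. Which values `i`, `j`, `d`, and `split` should carry when passing is not stated in the diff, so the sketch simply leaves them as given.

```python
from enum import IntEnum
from typing import List


class Direction(IntEnum):
    """Direction codes as listed in the README hunk."""
    UP = 0
    DOWN = 1
    LEFT = 2
    RIGHT = 3


def encode_action(i: int, j: int, direction: Direction,
                  split: bool = False, to_pass: bool = False) -> List[int]:
    """Build [pass, i, j, d, split] as plain ints (hypothetical helper)."""
    return [int(to_pass), i, j, int(direction), int(split)]


def describe_action(action: List[int]) -> str:
    """Turn an encoded action back into a readable phrase."""
    to_pass, i, j, d, split = action
    if to_pass == 1:
        # pass == 1 means the agent passes; the other fields are ignored here
        return "pass"
    amount = "half of the units" if split == 1 else "all possible units"
    return f"move {amount} from [{i},{j}] {Direction(d).name.lower()}"


# Move half of the units from cell [2,3] to the right.
move = encode_action(2, 3, Direction.RIGHT, split=True)
print(move)
print(describe_action(move))
```

Note that the first element is `0` for an ordinary move: under the corrected format `0` means play and `1` means pass, the reverse of what the old README text said.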
2 changes: 1 addition & 1 deletion TODO.md
@@ -21,5 +21,5 @@
 - [x] Pre-commit hooks for conventional commit checks (enforcing conventional commits)
 - [x] Add CI for running tests (pre commit)
 - [x] Add CI passing badge to README
-- [ ] Document agent move format
+- [x] Document agent action/move format
 - [ ] Split game step tests into more specific tests
