[RL-baseline] Model v5 #42

ziritrion · 2021-04-02T09:37:36Z

The policy network for model v5 is completely redesigned. It has 6 convolutional layers instead of 3 and drops the pooling layers, since the convolutions already reduce the dimensionality of the input. The convolutions are followed by 2 fully connected layers, and the actor head and the critic head each have 2 fully connected layers as well.
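To illustrate why pooling becomes unnecessary, here is a minimal sketch of the spatial-size arithmetic for a stack of six convolutions. The 96x96 input size and the (kernel, stride) pairs are hypothetical values chosen for illustration; they are not taken from this PR's code.

```python
def conv_out(size, kernel, stride, padding=0):
    """Spatial size after one convolution (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


# Hypothetical (kernel, stride) pairs for six convolutional layers.
# These are assumptions for illustration, not the values used in model v5.
CONV_LAYERS = [(4, 2), (3, 2), (3, 2), (3, 2), (3, 1), (3, 1)]


def trace_sizes(input_size, layers=CONV_LAYERS):
    """Return the spatial size before the first layer and after each layer."""
    sizes = [input_size]
    for kernel, stride in layers:
        sizes.append(conv_out(sizes[-1], kernel, stride))
    return sizes


if __name__ == "__main__":
    # Assumed 96x96 input frame.
    print(trace_sizes(96))
```

With these assumed values the feature map shrinks at every layer without any pooling, so the fully connected layers that follow receive a small flattened input.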

This model is based on the model featured in this example. That code is meant for PPO, but since it trained successfully without significantly modifying the environment state, we took it as a base to see how it would work for REINFORCE with Baseline.
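For context, here is a rough sketch of how the actor and critic outputs could enter a REINFORCE with Baseline update. The function names, the default `gamma` and the plain-list inputs are assumptions for illustration; the training code in this branch may compute things differently.

```python
def discounted_returns(rewards, gamma=0.99):
    """Discounted return G_t for every step of one episode."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_baseline_losses(log_probs, values, rewards, gamma=0.99):
    """Policy and value losses for one episode.

    log_probs: log pi(a_t | s_t) from the actor head (hypothetical inputs)
    values:    V(s_t) from the critic head, used as the baseline
    """
    returns = discounted_returns(rewards, gamma)
    # Advantage = return minus baseline. In an autograd framework the
    # advantage is usually treated as a constant in the policy loss.
    advantages = [g - v for g, v in zip(returns, values)]
    policy_loss = -sum(lp * a for lp, a in zip(log_probs, advantages))
    # The critic is regressed towards the observed returns (mean squared error).
    value_loss = sum(a * a for a in advantages) / len(advantages)
    return policy_loss, value_loss


if __name__ == "__main__":
    print(reinforce_baseline_losses([-1.0, -1.0, -1.0], [0.0, 0.0, 0.0],
                                    [1.0, 1.0, 1.0], gamma=0.5))
```

Subtracting the critic's value estimate does not change the expected policy gradient, but it typically lowers its variance compared with plain REINFORCE.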

xeviknal and others added 27 commits March 11, 2021 18:30

Add metrics and logsoftmax

2106b4b

Updating the model

0f2f195

Add new model to baseline

39b3ad3

The line that fixes all

907af7a

Add mean entropy - to reduce tensorboard runs

8e4ee6c

Add action prob mean: mean of prob of actions taken in the episode

3970702

Added simple directory check to params folder

957a3b4

Added additional param save conditions (end of log_interval, last episode, successful training)

0105564

Merge branch 'RL-baseline-new-model' of github.com:xeviknal/aidl-2021-wo-rl into RL-baseline-new-model. Trying to solve conflicts

b5a5184

Removing old runs; they don't apply to this branch

c6954ec

RL-baseline-NM-save-optim

a7c907c

Load optimizer params

d5b676c

8k runs

bd7f6c0

Fresh start with latest checkpoint load-save changes. Also, small gitignore change

5f246c0

bugfix

e8aa5e4

10k runs

c52a4f2

Almost 20k runs. Reward is starting to improve little by little

192bece

Fixed runner.py for generating videos

11b46c4

25k runs. Slight improvement but far from desirable

befd201

Adding more linear layers for each head

610d5a8

Cleaned up the code and removed params and runs in order to ease merging with other branches

e9ad814

Modified actions to include the no_action possibility, as well as finetuning

4409cf6

Tweaked the way the params filename is generated, as well as tensorboard references for ease of use

db1bd06

In train_episode, moved some vars to GPU that weren't being moved before

e5a9707

Base start for Baseline model v4

c3aa2c7

Base commit for v5

81ea966

Model v5 finalized

77db903

ziritrion changed the base branch from main to RL-with-baseline April 2, 2021 09:37

Added additional action set with greater granularity

5edc9ca
