on-policy optimization baselines for deep reinforcement learning

robintyh1/onpolicybaselines

On-Policy Optimization Baselines for Deep Reinforcement Learning

On-Policy Optimization Baselines offers a suite of on-policy optimization algorithms built on top of OpenAI baselines. In addition to the original on-policy optimization baselines, this repository offers implementations of trust region search algorithms (TRPO, ACKTR) combined with Gaussian Mixture Model (GMM) and normalizing flow policies. It also contains the wrappers needed to discretize continuous action spaces for on-policy optimization.
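To illustrate the idea of discretizing a continuous action space, here is a minimal sketch in plain NumPy. It is not the wrapper shipped in this repository: the function names `make_action_grid` and `to_continuous`, and the choice of evenly spaced bins per action dimension, are assumptions made for illustration only.

```python
import numpy as np


def make_action_grid(low, high, bins):
    """Build a (dims, bins) grid of evenly spaced actions per dimension.

    Hypothetical helper: `low` and `high` are the per-dimension action
    bounds, `bins` is the number of discrete choices in each dimension.
    """
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    # Fractions 0..1 shared by all dimensions, then scaled to each range.
    fractions = np.linspace(0.0, 1.0, bins)
    return low[:, None] + (high - low)[:, None] * fractions[None, :]


def to_continuous(indices, grid):
    """Map one discrete index per dimension to a continuous action vector."""
    indices = np.asarray(indices, dtype=np.int64)
    # Pick, for each dimension d, the grid value at column indices[d].
    return grid[np.arange(grid.shape[0]), indices]


if __name__ == "__main__":
    grid = make_action_grid(low=[-1.0, 0.0], high=[1.0, 2.0], bins=5)
    # The policy would output one categorical index per action dimension.
    print(to_continuous([0, 4], grid))
```

With a mapping like this, a policy can output one categorical distribution per action dimension, and the environment still receives a continuous action vector.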

These ideas are based on the following papers. The code for each is in the corresponding sub-directory.

The repository also provides several recent baselines (e.g. Beta distribution policies) that are used for comparison in the papers.
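As a rough illustration of what a Beta distribution policy does at sampling time, the sketch below draws a value in (0, 1) from a Beta distribution and rescales it to a bounded action range. It is not taken from this repository: the function name `sample_beta_action` and its parameters are hypothetical, and in practice the concentration parameters would be produced by a policy network rather than passed in as constants.

```python
import numpy as np


def sample_beta_action(alpha, beta, low, high, rng):
    """Sample a bounded action from per-dimension Beta distributions.

    Hypothetical helper: `alpha` and `beta` are positive concentration
    parameters (one per action dimension), `low` and `high` are the
    action bounds, `rng` is a NumPy random Generator.
    """
    alpha = np.asarray(alpha, dtype=np.float64)
    beta = np.asarray(beta, dtype=np.float64)
    low = np.asarray(low, dtype=np.float64)
    high = np.asarray(high, dtype=np.float64)
    # A Beta sample lies in [0, 1], so the rescaled action never leaves
    # [low, high] and no clipping is needed.
    unit = rng.beta(alpha, beta)
    return low + (high - low) * unit


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    action = sample_beta_action([2.0, 2.0], [2.0, 5.0], [-2.0, 0.0], [2.0, 1.0], rng)
    print(action)
```

Because the support of the Beta distribution is bounded, the sampled action respects the action limits by construction, which is the property that distinguishes it from an unbounded Gaussian policy.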

Citations

If you use this repo for academic research, you are highly encouraged to cite the following papers:
