Pessimistic policy iteration with bounded uncertainty

qsa-fox/bup

Code for the paper "Pessimistic policy iteration with bounded uncertainty".

BUP is an uncertainty-based offline RL algorithm built on TD3.

This code builds on the official TD3_BC repository and follows the EDAC code for the ensemble critic.
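
For readers unfamiliar with ensemble critics, the following is a generic sketch of how an ensemble of Q-estimates can be turned into a pessimistic value: the ensemble mean minus a multiple of the ensemble standard deviation. It is not the BUP update from the paper and is not taken from this repository; the function name `pessimistic_q` and the weight `beta` are hypothetical and used only for illustration.

```python
from statistics import mean, pstdev


def pessimistic_q(q_values, beta=1.0):
    """Illustrative lower-confidence-bound estimate for one (state, action) pair.

    q_values: Q-estimates from each critic in the ensemble.
    beta:     hypothetical weight on the uncertainty penalty (an assumption,
              not a hyperparameter of this repository).
    """
    # Disagreement between critics serves as the uncertainty estimate.
    uncertainty = pstdev(q_values)
    # Penalize the mean estimate in proportion to that uncertainty.
    return mean(q_values) - beta * uncertainty


if __name__ == "__main__":
    # Critics that agree yield no penalty; critics that disagree are penalized.
    print(pessimistic_q([2.0, 2.0, 2.0]))
    print(pessimistic_q([1.0, 3.0]))
```

When the critics agree, the estimate equals their common value; the more they disagree (typically on actions far from the dataset), the lower the estimate becomes.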

Usage

Run main_uncertain.py to reproduce the results in the paper.
