Code for the paper "Pessimistic policy iteration with bounded uncertainty".

BUP is an uncertainty-based offline RL algorithm built on TD3. This code is based on the official TD3_BC repository and draws on the EDAC code for the ensemble critic.

Usage

Run main_uncertain.py to reproduce the results in the paper: