From bc57f1a59b7cbf34d691ee7a1a0b79e5b4bae1e2 Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+yaboidav3@users.noreply.github.com>
Date: Wed, 12 Jun 2024 20:33:52 -0500
Subject: [PATCH 1/9] Update README.md

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 9a3bb07..12e839c 100644
--- a/README.md
+++ b/README.md
@@ -56,20 +56,20 @@ Execute these scripts in your terminal:
 
 ### Main Experiments
 
-This graph shows the comparison between different reward labeling methods: Oracle True Reward, ORL, Latent Reward Model, and IPL with True Reward.
+Training log of learning with different methods on different datasets: Oracle True Reward, ORL, Latent Reward Model, and IPL with True Reward
 
 ![Graph 1](results/graphs/main_exp.png)
 
 ### Ablation Studies
 
-This graph demonstrates the impact of using datasets of different sizes on the performance of the reward labeling method.
+raining log of learning with a method on datasets of different sizes
 
 ![Graph 2](results/graphs/size.png)
 
-This graph illustrates the performance of the reward labeling method when different Offline RL algorithms are applied.
+Comparison between the learning efficiency of ORL combined with different standard offline RL algorithms
 
 ![Graph 3](results/graphs/algo.png)
 
-This graph showcases the effect of performing multiple Bernoulli samples to generate preference labels on the performance of the reward labeling method.
+Comparison between the cases where single or multiple preference labels are given to each pair of trajectories
 
 ![Graph 4](results/graphs/bernoulli.png)

From 3c43adabec6e9b80b940f6cc0d6ba6a3c7581e31 Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+yaboidav3@users.noreply.github.com>
Date: Wed, 12 Jun 2024 20:34:10 -0500
Subject: [PATCH 2/9] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 12e839c..d1cd139 100644
--- a/README.md
+++ b/README.md
@@ -62,7 +62,7 @@ Training log of learning with different methods on different datasets: Oracle Tr
 
 ### Ablation Studies
 
-raining log of learning with a method on datasets of different sizes
+Training log of learning with a method on datasets of different sizes
 
 ![Graph 2](results/graphs/size.png)
 

From 1a25a3c9b9bb2ebc158986f174a388fba0a0f8ac Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+davidzhu27@users.noreply.github.com>
Date: Wed, 12 Jun 2024 20:37:59 -0500
Subject: [PATCH 3/9] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index d1cd139..180ff54 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@ Set up and install the d4rl environments by following the instructions provided
 Clone the GitHub repository and install required packages:
 
 ```bash
-git clone https://github.com/yaboidav3/ORL.git && cd ORL
+git clone https://github.com/davidzhu27/ORL.git && cd ORL
 pip install -r requirements/requirements_dev.txt
 ```
 

From 708d3073e914fa25fa78387772f03700cbd51405 Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+davidzhu27@users.noreply.github.com>
Date: Sat, 15 Jun 2024 20:04:38 -0500
Subject: [PATCH 4/9] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 180ff54..cd58461 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@ Set up and install the d4rl environments by following the instructions provided
 Clone the GitHub repository and install required packages:
 
 ```bash
-git clone https://github.com/davidzhu27/ORL.git && cd ORL
+git clone https://github.com/uiuc-focal-lab/ORL.git && cd ORL
 pip install -r requirements/requirements_dev.txt
 ```
 

From 8335a510c9f7d9d6b3cc183731b9695b96e32b96 Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+davidzhu27@users.noreply.github.com>
Date: Mon, 1 Jul 2024 00:59:05 -0500
Subject: [PATCH 5/9] Create placeholder.txt

---
 saved/placeholder.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 saved/placeholder.txt

diff --git a/saved/placeholder.txt b/saved/placeholder.txt
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/saved/placeholder.txt
@@ -0,0 +1 @@
+

From b243a1982a6f567475662e08c715fca63db3b19a Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+davidzhu27@users.noreply.github.com>
Date: Mon, 1 Jul 2024 01:00:59 -0500
Subject: [PATCH 6/9] Update README.md

---
 README.md | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index cd58461..ae2101c 100644
--- a/README.md
+++ b/README.md
@@ -25,6 +25,15 @@ Follow the prompts to create a new project or connect to an existing one. Make s
 
 For more information on how to use Wandb, refer to the [Wandb documentation](https://docs.wandb.ai/).
 
+## Generate Preference Datasets
+
+Run the shell files. They will be written into the `saved` folder.
+
+```bash
+. generate_pbrl_datasets.sh
+. generate_pbrl_datasets_no_overlap.sh
+
+```
 ## Run Example
 
 Run the sample Python command. Make sure you have the necessary dependencies installed and the Python environment properly configured.
@@ -32,7 +41,6 @@ Run the sample Python command. Make sure you have the necessary dependencies ins
 ```bash
 . example.sh
 ```
-
 ## Full Experiment and Ablation Study Scripts
 
 To run the full experiment and ablation study, use the following scripts:

From 6aafc16386d3e8962fe043884d53fd051918c31c Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+davidzhu27@users.noreply.github.com>
Date: Mon, 1 Jul 2024 01:01:36 -0500
Subject: [PATCH 7/9] Create placeholder.txt

---
 saved/pbrl_datasets/placeholder.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 saved/pbrl_datasets/placeholder.txt

diff --git a/saved/pbrl_datasets/placeholder.txt b/saved/pbrl_datasets/placeholder.txt
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/saved/pbrl_datasets/placeholder.txt
@@ -0,0 +1 @@
+

From 856b8ea87371457c100c504dac1864c10b6da7ad Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+davidzhu27@users.noreply.github.com>
Date: Mon, 1 Jul 2024 01:01:55 -0500
Subject: [PATCH 8/9] Create placeholder.txt

---
 saved/pbrl_datasets_no_overlap/placeholder.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 saved/pbrl_datasets_no_overlap/placeholder.txt

diff --git a/saved/pbrl_datasets_no_overlap/placeholder.txt b/saved/pbrl_datasets_no_overlap/placeholder.txt
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/saved/pbrl_datasets_no_overlap/placeholder.txt
@@ -0,0 +1 @@
+

From 372a61bcf41acdb1282a1463d9af8080d6006751 Mon Sep 17 00:00:00 2001
From: David Zhu <45285516+davidzhu27@users.noreply.github.com>
Date: Mon, 1 Jul 2024 01:02:11 -0500
Subject: [PATCH 9/9] Delete saved/placeholder.txt

---
 saved/placeholder.txt | 1 -
 1 file changed, 1 deletion(-)
 delete mode 100644 saved/placeholder.txt

diff --git a/saved/placeholder.txt b/saved/placeholder.txt
deleted file mode 100644
index 8b13789..0000000
--- a/saved/placeholder.txt
+++ /dev/null
@@ -1 +0,0 @@
-