diff --git a/README.md b/README.md
index 9a3bb07..ae2101c 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@ Set up and install the d4rl environments by following the instructions provided
 Clone the GitHub repository and install required packages:
 
 ```bash
-git clone https://github.com/yaboidav3/ORL.git && cd ORL
+git clone https://github.com/uiuc-focal-lab/ORL.git && cd ORL
 pip install -r requirements/requirements_dev.txt
 ```
 
@@ -25,6 +25,15 @@ Follow the prompts to create a new project or connect to an existing one. Make s
 
 For more information on how to use Wandb, refer to the [Wandb documentation](https://docs.wandb.ai/).
 
+## Generate Preference Datasets
+
+Run the shell scripts below; the generated preference datasets will be written into the `saved` folder.
+
+```bash
+. generate_pbrl_datasets.sh
+. generate_pbrl_datasets_no_overlap.sh
+```
+
 ## Run Example
 
 Run the sample Python command. Make sure you have the necessary dependencies installed and the Python environment properly configured.
@@ -32,7 +41,6 @@ Run the sample Python command. Make sure you have the necessary dependencies ins
 ```bash
 . example.sh
 ```
-
 ## Full Experiment and Ablation Study Scripts
 
 To run the full experiment and ablation study, use the following scripts:
@@ -56,20 +64,20 @@ Execute these scripts in your terminal:
 
 ### Main Experiments
 
-This graph shows the comparison between different reward labeling methods: Oracle True Reward, ORL, Latent Reward Model, and IPL with True Reward.
+Training curves of the different methods on different datasets: Oracle True Reward, ORL, Latent Reward Model, and IPL with True Reward.
 
 ![Graph 1](results/graphs/main_exp.png)
 
 ### Ablation Studies
 
-This graph demonstrates the impact of using datasets of different sizes on the performance of the reward labeling method.
+Training curves of a single method on datasets of different sizes.
 
 ![Graph 2](results/graphs/size.png)
 
-This graph illustrates the performance of the reward labeling method when different Offline RL algorithms are applied.
+Comparison of the learning efficiency of ORL combined with different standard offline RL algorithms.
 
 ![Graph 3](results/graphs/algo.png)
 
-This graph showcases the effect of performing multiple Bernoulli samples to generate preference labels on the performance of the reward labeling method.
+Comparison between giving a single preference label and giving multiple preference labels to each pair of trajectories.
 
 ![Graph 4](results/graphs/bernoulli.png)
diff --git a/saved/pbrl_datasets/placeholder.txt b/saved/pbrl_datasets/placeholder.txt
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/saved/pbrl_datasets/placeholder.txt
@@ -0,0 +1 @@
+
diff --git a/saved/pbrl_datasets_no_overlap/placeholder.txt b/saved/pbrl_datasets_no_overlap/placeholder.txt
new file mode 100644
index 0000000..8b13789
--- /dev/null
+++ b/saved/pbrl_datasets_no_overlap/placeholder.txt
@@ -0,0 +1 @@
+