diff --git a/README.md b/README.md
index e7b9bcd..6de9535 100644
--- a/README.md
+++ b/README.md
@@ -11,15 +11,100 @@ This README provides instructions on how to replicate our work.
 # Setup
 
-## anserini
+Clone [dstlr](https://github.com/dstlry/dstlr):
 
-Download and build [Anserini](http://anserini.io) and then follow the [Solrini](https://github.com/castorini/anserini/blob/master/docs/solrini.md) instructions to get a Solr instance running for indexing text documents. Index a document collection with Anserini, such as the Washington Post collection, and ensure the appropriate Solr [command-line parameters](https://github.com/dstlry/dstlr/blob/master/src/main/scala/io/dstlr/package.scala) for `dstlr` are adjusted if use non-default options.
+```
+git clone https://github.com/dstlry/dstlr.git
+```
+
+[sbt](https://www.scala-sbt.org/) is the build tool used for Scala projects; download it if you don't have it yet.
+
+Build the JAR using sbt:
+
+```
+sbt assembly
+```
+
+There is a [known issue](https://github.com/stanfordnlp/CoreNLP/issues/556) between recent Spark versions and CoreNLP 3.8. To fix this, delete the `protobuf-java-2.5.0.jar` file in `$SPARK_HOME/jars` and replace it with [version 3.0.0](https://repo1.maven.org/maven2/com/google/protobuf/protobuf-java/3.0.0/protobuf-java-3.0.0.jar).
+
+## Anserini
+
+### Download and build Anserini
+
+Clone [Anserini](http://anserini.io):
+
+```
+git clone https://github.com/castorini/anserini.git
+
+cd anserini
+```
+
+Change the [config file](https://github.com/castorini/anserini/blob/master/src/main/resources/solr/anserini/conf/managed-schema#L521) so that the "contents" field is stored as well as indexed:
+
+```
+sed -i.bak 's/field name="contents" type="text_en_anserini" indexed="true" stored="false" multiValued="false"/field name="contents" type="text_en_anserini" indexed="true" stored="true" multiValued="false"/g' src/main/resources/solr/anserini/conf/managed-schema
+```
+
+Build Anserini using Maven:
+
+```
+mvn clean package appassembler:assemble
+```
+
+### Setting up a SolrCloud Instance for indexing text documents
+
+From the Solr [archives](https://archive.apache.org/dist/lucene/solr/), find the Solr version that matches Anserini's [Lucene version](https://github.com/castorini/anserini/blob/master/pom.xml#L36), download the `solr-[version].tgz` (not the `-src` version), and move it into the `anserini/` directory.
+
+Extract the archive:
+
+```
+mkdir solrini && tar -zxvf solr*.tgz -C solrini --strip-components=1
+```
+
+Start Solr:
+
+```
+solrini/bin/solr start -c -m 8G
+```
+
+Note: Adjust the memory usage (i.e., `-m 8G`) as appropriate for your machine.
+
+Run the Solr bootstrap script to copy the Anserini JAR into Solr's classpath and upload the configsets to Solr's internal ZooKeeper:
+
+```
+pushd src/main/resources/solr && ./solr.sh ../../../../solrini localhost:9983 && popd
+```
+
+Solr should now be available at [http://localhost:8983/](http://localhost:8983/) for browsing.
+
+### Indexing document collections into SolrCloud from Anserini
+
+We'll index the [Washington Post collection](https://github.com/castorini/anserini/blob/master/docs/regressions-core18.md) as an example.
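+
+Before creating the collection, you can optionally confirm that Solr is running and that the `anserini` configset was uploaded. This is just a quick check using Solr's ConfigSets API; the port and configset name are the defaults used above:
+
+```
+curl "http://localhost:8983/solr/admin/configs?action=LIST"
+```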
+
+First, create the `core18` collection in Solr:
+
+```
+solrini/bin/solr create -n anserini -c core18
+```
+
+Run the Solr indexing command for `core18`:
+
+```
+sh target/appassembler/bin/IndexCollection -collection WashingtonPostCollection -generator WapoGenerator \
+ -threads 8 -input /path/to/WashingtonPost \
+ -solr -solr.index core18 -solr.zkUrl localhost:9983 \
+ -storePositions -storeDocvectors -storeTransformedDocs
+```
+
+Note: Make sure `/path/to/WashingtonPost` points to the actual location of the collection.
+
+Once indexing has completed, you should be able to query `core18` from the Solr [query interface](http://localhost:8983/solr/#/core18/query).
 
 ## neo4j
 
 Start a neo4j instance via Docker with the command:
 
 ```bash
-docker run -d --publish=7474:7474 --publish=7687:7687 \
+docker run -d --name neo4j --publish=7474:7474 --publish=7687:7687 \
     --volume=`pwd`/neo4j:/data \
     -e NEO4J_dbms_memory_pagecache_size=2G \
     -e NEO4J_dbms_memory_heap_initial__size=4G \
@@ -29,6 +114,8 @@ docker run -d --publish=7474:7474 --publish=7687:7687 \
 
 Note: You may wish to update the memory settings based on the amount of available memory on your machine.
 
+neo4j should be available shortly at [http://localhost:7474/](http://localhost:7474/) with the default username/password of `neo4j`/`neo4j`. You will be prompted to change the password; this new password is the one you will pass to the load script.
+
 To enable efficient inserts and queries, create the following indexes in neo4j:
 
 ```
 CREATE INDEX ON :Document(id)
@@ -45,32 +132,110 @@ CREATE INDEX ON :Relation(type)
 CREATE INDEX ON :Relation(type, confidence)
 ```
 
+## Running
+
+### Extraction
+
+For each document in the collection, we extract mentions of named entities, the relations between them, and links to entities in an external knowledge graph.
+
+Run `ExtractTriples`:
+
+```
+./bin/extract.sh
+```
+
+Note: Modify `extract.sh` based on your environment (e.g., available memory, number of executors, Solr settings, neo4j password, etc.); the available options are listed [here](src/main/scala/io/dstlr/package.scala).
+
+After the extraction is done, check that an output folder (called `triples/` by default) has been created and contains several Parquet files.
+
+If you want to inspect the Parquet files:
+
+- Download and build [parquet-tools](https://github.com/apache/parquet-mr/tree/master/parquet-tools) following its instructions. Note: On macOS, you can also install it with Homebrew: `brew install parquet-tools`.
+
+- View a Parquet file in JSON format:
+
+```
+parquet-tools cat --json [filename]
+```
+
+### Enrichment
+
+We augment the raw knowledge graph with facts from the external knowledge graph (Wikidata in our case).
+
+Run `EnrichTriples`:
+
+```
+./bin/enrich.sh
+```
+
+Note: Modify `enrich.sh` based on your environment.
+
+After the enrichment is done, check that an output folder (called `triples-enriched/` by default) has been created with the output Parquet files.
+
+### Load
+
+Load the raw and enriched knowledge graphs produced by the commands above into neo4j.
+
+Set `--input triples` in `load.sh` and run `LoadTriples`:
+
+```
+./bin/load.sh
+```
+
+Note: Modify `load.sh` based on your environment.
+
+Then set `--input triples-enriched` in `load.sh` and run `LoadTriples` again:
+
+```
+./bin/load.sh
+```
+
+Open [http://localhost:7474/](http://localhost:7474/) to view the loaded knowledge graph in neo4j.
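+
+As a quick sanity check on the load, you can run a simple count query in the neo4j browser at [http://localhost:7474/](http://localhost:7474/). This is a minimal example; `Document` is one of the node labels indexed above, and the count should be non-zero after a successful load:
+
+```
+MATCH (d:Document) RETURN count(d)
+```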
+
 ## Data Cleaning Queries
-Find CITY_OF_HEADQUARTERS relation between two mentions:
+The following queries can be run against the knowledge graph in neo4j to discover sub-graphs of interest.
+
+### Supporting Information
+
+This query finds sub-graphs where the value extracted from the document matches the ground truth from Wikidata.
+
 ```
 MATCH (d:Document)-->(s:Mention)-->(r:Relation {type: "ORG_CITY_OF_HEADQUARTERS"})-->(o:Mention)
 MATCH (s)-->(e:Entity)-->(f:Fact {relation: r.type})
+WHERE o.span = f.value
 RETURN d, s, r, o, e, f
-LIMIT 25
 ```
 
-Find CITY_OF_HEADQUARTERS relation between two mentions where the subject node doesn't have a linked entity:
+### Inconsistent Information
+
+This query finds sub-graphs where the value extracted from the document does not match the ground truth from Wikidata.
+
 ```
 MATCH (d:Document)-->(s:Mention)-->(r:Relation {type: "ORG_CITY_OF_HEADQUARTERS"})-->(o:Mention)
-OPTIONAL MATCH (s)-->(e:Entity)
-WHERE e IS NULL
-RETURN d, s, r, o, e
-LIMIT 25
+MATCH (s)-->(e:Entity)-->(f:Fact {relation: r.type})
+WHERE NOT(o.span = f.value)
+RETURN d, s, r, o, e, f
 ```
 
 ### Missing Information
-Find CITY_OF_HEADQUARTERS relation between two mentions where the linked entity doesn't have the relation we're looking for:
+
+This query finds sub-graphs where the value extracted from the document does not have a corresponding ground truth in Wikidata.
+
 ```
 MATCH (d:Document)-->(s:Mention)-->(r:Relation {type: "ORG_CITY_OF_HEADQUARTERS"})-->(o:Mention)
 MATCH (s)-->(e:Entity)
 OPTIONAL MATCH (e)-->(f:Fact {relation: r.type})
 WHERE f IS NULL
 RETURN d, s, r, o, e, f
-LIMIT 25
+```
+
+### Delete All Data
+
+This query deletes all nodes and relationships in the database.
+
+```
+MATCH (n) DETACH DELETE n
 ```
diff --git a/bin/enrich.sh b/bin/enrich.sh
index 60d5c3b..2068f92 100755
--- a/bin/enrich.sh
+++ b/bin/enrich.sh
@@ -2,6 +2,6 @@
 
 spark-submit --class io.dstlr.EnrichTriples \
   --num-executors 1 --executor-cores 1 \
-  --driver-memory 64G --executor-memory 64G \
-  --conf spark.executor.heartbeatInterval=60 \
-  target/scala-2.11/dstlr-assembly-0.1.jar --input triples --output triples-enriched --partitions 1
\ No newline at end of file
+  --driver-memory 8G --executor-memory 8G \
+  --conf spark.executor.heartbeatInterval=10000 \
+  target/scala-2.11/dstlr-assembly-0.1.jar --input triples --output triples-enriched --partitions 1
diff --git a/bin/extract.sh b/bin/extract.sh
index a5f3273..8099d88 100755
--- a/bin/extract.sh
+++ b/bin/extract.sh
@@ -3,7 +3,7 @@
 spark-submit --class io.dstlr.ExtractTriples \
   --num-executors 32 --executor-cores 8 \
   --driver-memory 64G --executor-memory 48G \
-  --conf spark.executor.heartbeatInterval=60 \
+  --conf spark.executor.heartbeatInterval=10000 \
   --conf spark.executorEnv.JAVA_HOME=/usr/lib/jvm/java-9-openjdk-amd64 \
   target/scala-2.11/dstlr-assembly-0.1.jar \
-  --solr.uri 192.168.1.111:9983 --solr.index core18 --query *:* --partitions 2048 --output triples-$RANDOM --doc-length-threshold 10000 --sent-length-threshold 256
\ No newline at end of file
+  --solr.uri localhost:9983 --solr.index core18 --query *:* --partitions 2048 --output triples --sent-length-threshold 256
diff --git a/bin/load.sh b/bin/load.sh
index 4ada382..4cc25d5 100755
--- a/bin/load.sh
+++ b/bin/load.sh
@@ -2,7 +2,7 @@
 
 spark-submit --class io.dstlr.LoadTriples \
   --num-executors 1 --executor-cores 1 \
-  --driver-memory 16G --executor-memory 16G \
-  --conf spark.executor.heartbeatInterval=60 \
+  --driver-memory 8G --executor-memory 8G \
+  --conf spark.executor.heartbeatInterval=10000 \
   target/scala-2.11/dstlr-assembly-0.1.jar \
-  --input triples-5000d-128s --neo4j.password password --neo4j.uri bolt://192.168.1.110:7687 --neo4j.batch.size 10000
\ No newline at end of file
+  --input triples --neo4j.password password --neo4j.uri bolt://localhost:7687 --neo4j.batch.size 10000