Update README.md

add line about fragment length
quzhouxiachuan · Apr 18, 2016 · b6a8ea6
1 parent 9373ad4
Showing 1 changed file with 1 addition and 1 deletion.
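As a quick check of the fragment-length rule added in this commit, the arithmetic can be sketched in shell. The variable names below are hypothetical and are not part of reduced_genome.sh; the values come from the README's own example (51 bp reads, a 6 bp barcode, and 20 bp of bait sequence including the RE).

```shell
# Hypothetical variable names; values taken from the README example.
read_length=51      # total read length in bp
barcode_length=6    # barcode at the 5' end (use 0 if there is no barcode)
primer_length=20    # bait/primer sequence, including the RE site

# Bases to trim from the 5' end (the value passed to bowtie2's -5 option).
trim_length=$((barcode_length + primer_length))

# Fragment length to set in reduced_genome.sh:
# read length - (barcode size + primer size including the RE sequence).
fragment_length=$((read_length - trim_length))

echo "trim (-5): ${trim_length}"
echo "fragment length: ${fragment_length}"
```

With these values the trim length is 26 and the fragment length is 25, which appears consistent with the `25` in the example index name `mm10_hindiii_flanking_sequences_25_unique`.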
diff --git a/README.md b/README.md
@@ -26,7 +26,7 @@ The input to 4C-ker are 4 column tab-delimited count files with <i>chr</i> <i>st
 
 **1. Creating a reduced genome and mapping 4C-Seq reads**
 
-4C-Seq reads are usually mapped to a reduced genome consisting of unique sequence fragments adjacent to the primary restriction enzyme sites in the genome. A script (reduced_genome.sh) has been provided to create a custom reduced genome (this requires oligoMatch (ucsc command line tools - kent) and fastaFromBed (bedtools)). Modify line 13,15,17 in reduced_genome.sh to reflect the correct size of fragment, primary enzyme and genome. In addition a FASTA(.fa) file for the primary enzyme and genome are required (files should be named according to what is provided in the .sh script - example: mm10.fa and hindiii.fa). The reduced genome can be used to map 4C-Seq single-end reads with <i>bowtie2</i>. First a bowtie2 index is created using <i>bowtie2-build</i>. With bowtie2, the -5 option can be used to trim the first ‘x’ bps that contain the barcode (if present) and the bait sequence including the RE. Below is an example to map 51bp long reads where the first 26 bps of the read contain a 6bp barcode + 20 bp of the sequence containing the bait sequence:
+4C-Seq reads are usually mapped to a reduced genome consisting of unique sequence fragments adjacent to the primary restriction enzyme sites in the genome. A script (reduced_genome.sh) has been provided to create a custom reduced genome (this requires oligoMatch (ucsc command line tools - kent) and fastaFromBed (bedtools)). Modify line 13,15,17 in reduced_genome.sh to reflect the correct size of fragment, primary enzyme and genome. The fragment length should be length of the read - (size of barcode+size of primer including the RE sequence). In addition a FASTA(.fa) file for the primary enzyme and genome are required (files should be named according to what is provided in the .sh script - example: mm10.fa and hindiii.fa). The reduced genome can be used to map 4C-Seq single-end reads with <i>bowtie2</i>. First a bowtie2 index is created using <i>bowtie2-build</i>. With bowtie2, the -5 option can be used to trim the first ‘x’ bps that contain the barcode (if present) and the bait sequence including the RE. Below is an example to map 51bp long reads where the first 26 bps of the read contain a 6bp barcode + 20 bp of the sequence containing the bait sequence:
 
 ```
 bowtie2-build mm10_hindiii_flanking_sequences_25_unique.fa mm10_hindiii_flanking_sequences_25_unique