idemp

To do: trim R1 reads to remove barcode sequence in cases where R1 reads through into the barcode because the template is short.

Barcode demultiplexing for Illumina I1, R1, and R2 fastq.gz files.

For typical Illumina runs only, where the barcode reads are saved in the I1_*.fastq.gz files and the first field of each sequence name is exactly the same across the I1, R1, and/or R2 fastq.gz files. The program works as follows:

compare sequence names in the I1, R1, and R2 read files;
read in the barcode sequences from the I1 file;
read in the barcode and sample id table;
calculate the minimum edit distance among the designed barcodes, nEd;
find an exact match for each barcode sequence read;
for reads without an exact match, calculate the minimum edit distance to the designed barcodes;
assign a barcode if the edit distance is smaller than a cutoff;
by default, cutoff = min(n, nEd), where n (default 1) can be set by the user.
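
The matching steps above can be sketched in C++ as follows. This is a minimal illustration and is not taken from idemp.cpp: the function names `editDistance`, `minDesignDistance`, and `assignBarcode` are hypothetical, the distance is assumed to be plain Levenshtein distance, and the cutoff is treated as inclusive (distance <= cutoff) so that the default n=1 admits one mismatch, which is an assumption. Leaving a read unassigned when two barcodes tie is also an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Levenshtein edit distance, computed with two rolling rows.
int editDistance(const std::string& a, const std::string& b) {
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i);
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int substitution = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, substitution});
        }
        std::swap(prev, cur);
    }
    return prev[b.size()];
}

// nEd: minimum edit distance over all pairs of designed barcodes.
int minDesignDistance(const std::vector<std::string>& designed) {
    assert(designed.size() >= 2);  // a pairwise minimum needs two barcodes
    int best = editDistance(designed[0], designed[1]);
    for (std::size_t i = 0; i < designed.size(); ++i)
        for (std::size_t j = i + 1; j < designed.size(); ++j)
            best = std::min(best, editDistance(designed[i], designed[j]));
    return best;
}

// Returns the index of the assigned barcode, or -1 if the read stays unassigned.
int assignBarcode(const std::string& read,
                  const std::vector<std::string>& designed, int n = 1) {
    // Exact matches are resolved first.
    for (std::size_t i = 0; i < designed.size(); ++i)
        if (designed[i] == read) return static_cast<int>(i);

    // Assumption: inclusive cutoff, cutoff = min(n, nEd).
    const int cutoff = std::min(n, minDesignDistance(designed));
    int bestIndex = -1;
    int bestDistance = cutoff + 1;
    bool tie = false;
    for (std::size_t i = 0; i < designed.size(); ++i) {
        int d = editDistance(read, designed[i]);
        if (d < bestDistance) {
            bestDistance = d;
            bestIndex = static_cast<int>(i);
            tie = false;
        } else if (d == bestDistance && d <= cutoff) {
            tie = true;  // two barcodes are equally close
        }
    }
    return tie ? -1 : bestIndex;
}
```

In a real run, nEd would be computed once after reading the barcode table rather than once per read; the sketch recomputes it only to keep each function self-contained.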

Functions to be added

Trim R1 reads to get rid of barcode ends.

Compile and test

git clone https://github.com/yhwu/idemp
cd idemp
make
make test

Usage

Usage:
   idemp -b code -I1 I1 -R1 R1 -R2 R2 -m n -o folder

Options:
   code    barcode file, each line contains barcode\tid
   I1      barcode fastq file, text or gzipped
   R1      read1 fastq file, text or gzipped
   R2      read2 fastq file, text or gzipped, optional
   n       allowed base mismatches, optional, default=1
   folder  output folder, optional, default=.

Output:
   folder/R1.id.fastq.gz   #reads assigned to ids
   folder/R2.id.fastq.gz   #reads assigned to ids
   folder/I1.id            #read name to id
   folder/I1.id.stat       #barcode base error stat
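
To illustrate the input conventions described above, here is a minimal C++ sketch of reading a `barcode\tid` table and of extracting the first field of a sequence name for comparison across the I1, R1, and R2 files. The function names are hypothetical and the code is not taken from idemp.cpp. It assumes the table has no header line, that malformed or blank lines can be skipped, and that the first field of a name ends at the first space or tab.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Parse "barcode<TAB>id" lines into a barcode -> sample id map.
std::map<std::string, std::string> readBarcodeTable(std::istream& in) {
    std::map<std::string, std::string> table;
    std::string line;
    while (std::getline(in, line)) {
        // Tolerate Windows line endings.
        if (!line.empty() && line.back() == '\r') line.pop_back();
        std::size_t tab = line.find('\t');
        // Assumption: lines without a tab, or with an empty barcode, are skipped.
        if (tab == std::string::npos || tab == 0) continue;
        table[line.substr(0, tab)] = line.substr(tab + 1);
    }
    return table;
}

// Convenience wrapper for parsing a table held in memory.
std::map<std::string, std::string> readBarcodeTableFromText(const std::string& text) {
    std::istringstream in(text);
    return readBarcodeTable(in);
}

// First whitespace-delimited field of a FASTQ name line; this is the part
// expected to be identical for the same read in the I1, R1, and R2 files.
std::string firstNameField(const std::string& name) {
    return name.substr(0, name.find_first_of(" \t"));
}
```

Two name lines belong to the same read when `firstNameField` returns the same string for both, regardless of what follows the first space.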
