Add option to specify sample and molecular barcode design #1

Open
martinaryee opened this issue Jan 14, 2015 · 1 comment

Comments

@martinaryee
Contributor

demultiplex.py should have an option to specify how to construct the sample and molecular barcodes. e.g.:

python demultiplex.py --sample_barcode i1:2-8,i2:2-8 --molecular_barcode i2:9-16 . . .

to specify a 14 bp sample barcode that consists of bases 2-8 of index read 1 and bases 2-8 of index read 2. The molecular barcode in this example is found in bases 9-16 of index read 2.
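For illustration, the two options could be declared with `argparse` roughly as below. This is only a sketch: the option names come from the example command above, but the `build_parser` helper, the help text, and the defaults are assumptions, not the actual demultiplex.py interface.

```python
import argparse


def build_parser():
    # Hypothetical parser; the real demultiplex.py may define its options differently.
    parser = argparse.ArgumentParser(prog="demultiplex.py")
    parser.add_argument(
        "--sample_barcode",
        default=None,
        help="Comma-separated read:start-end ranges, e.g. i1:2-8,i2:2-8",
    )
    parser.add_argument(
        "--molecular_barcode",
        default=None,
        help="Comma-separated read:start-end ranges, e.g. i2:9-16",
    )
    return parser


# Parse the example command line from this issue.
args = build_parser().parse_args(
    ["--sample_barcode", "i1:2-8,i2:2-8", "--molecular_barcode", "i2:9-16"]
)
print(args.sample_barcode)     # i1:2-8,i2:2-8
print(args.molecular_barcode)  # i2:9-16
```

The raw strings would still need to be split into per-read ranges before any bases can be extracted.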

@rchowe

rchowe commented Jan 14, 2015

I would approach this in the following way:

  1. Split the option up based on which file the range is acting on. So i1:2-8,i2:2-8 becomes the array ['i1:2-8', 'i2:2-8'].
  2. For each item in the array, find out which file it references and the range, either using .split or using a regular expression.
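import re

x = 'i1:2-8,i2:2-8'
parts = x.split(',')
for part in parts:
    # Each part looks like '<read>:<first base>-<last base>', e.g. 'i1:2-8'.
    m = re.match(r'(?P<filename>.+):(?P<lower_bound>\d+)-(?P<upper_bound>\d+)$', part)
    if m is None:
        raise ValueError('Invalid barcode range: %r' % part)
    filename = m.group('filename')
    # The regex groups are strings, so convert the bounds to integers.
    lower_bound = int(m.group('lower_bound'))
    upper_bound = int(m.group('upper_bound'))

    # Process the barcodes based on this.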

EDIT: I fixed a bug in the code.
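To make the "process the barcodes" step concrete, here is a minimal sketch of how the parsed ranges might be turned into a barcode string. It assumes the ranges are 1-based and inclusive (which matches the 14 bp figure in the issue) and that segments are concatenated in the order given. The function names `parse_barcode_spec` and `build_barcode`, the `reads` dictionary, and the example sequences are all made up for illustration and are not part of demultiplex.py.

```python
import re

# One segment looks like '<read>:<first base>-<last base>', e.g. 'i1:2-8'.
SEGMENT_RE = re.compile(r"(?P<read>[^:,]+):(?P<start>\d+)-(?P<end>\d+)")


def parse_barcode_spec(spec):
    """Turn 'i1:2-8,i2:2-8' into [('i1', 2, 8), ('i2', 2, 8)]."""
    segments = []
    for part in spec.split(","):
        m = SEGMENT_RE.fullmatch(part.strip())
        if m is None:
            raise ValueError(f"Invalid barcode segment: {part!r}")
        start, end = int(m.group("start")), int(m.group("end"))
        if start < 1 or end < start:
            raise ValueError(f"Invalid base range in segment: {part!r}")
        segments.append((m.group("read"), start, end))
    return segments


def build_barcode(segments, reads):
    """Concatenate the requested bases from each read.

    Assumes 1-based inclusive ranges, so bases start..end map to the
    Python slice [start - 1:end].
    """
    return "".join(reads[name][start - 1:end] for name, start, end in segments)


# Hypothetical index read sequences, keyed by the names used in the spec.
reads = {"i1": "ACGTACGTACGTACGT", "i2": "TTGGCCAATTGGCCAA"}

sample = build_barcode(parse_barcode_spec("i1:2-8,i2:2-8"), reads)
umi = build_barcode(parse_barcode_spec("i2:9-16"), reads)
print(sample, len(sample))  # CGTACGTTGGCCAA 14
print(umi, len(umi))        # TTGGCCAA 8
```

Keeping parsing separate from extraction means the spec only has to be validated once at startup, and the same segment list can then be applied to every read.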

llipkinMGH referenced this issue in MGHComputationalPathology/umi Dec 6, 2019