Data and methods for PopMLST, a high-resolution method to detect pathogen strain-level diversity in clinical samples.
The publication for this method is in press, and the citation will appear here shortly.
This software has been tested on the Ubuntu Linux 20.04 operating system. It may be possible to adapt it for other operating systems. We provide two different methods you can use for installation.
The first method uses Conda. We assume Bioconda is already configured and working, per the Bioconda instructions.
$ conda create -y -n popmlst python=3.8 bioconductor-dada2=1.22.0 vsearch=2.14.0 blast=2.12.0 pandas=1.3.4 biopython=1.79 cutadapt=3.5 pigz=2.6 colorama=0.4.4
$ conda activate popmlst
$ sudo apt update && sudo apt install libtre5
$ wget https://github.com/marade/PopMLST/raw/master/tre-python3.tar.gz && tar xzvf tre-python3.tar.gz && cd tre-python3/python3 && python3 setup.py install && cd ../../ && rm -rf tre-python*
The second method is a manual installation. The following software dependencies were used for the paper. Other versions may work but have not been tested:
- Python 3.8.12
- Python libraries:
  - Biopython 1.79
  - Pandas 1.3.4
  - colorama 0.4.4
  - tre 0.8.0 (see below)
- Cutadapt 3.5
- VSEARCH 2.14.0
- pigz 2.6
- R 4.1.1
- DADA2 1.22.0
The tre Python library hasn't been formally updated for Python 3.x, but community-contributed patches are available. For your convenience we provide a version of the tre library patched for Python 3.x. You can install it like this:
$ wget https://github.com/marade/PopMLST/raw/master/tre-python3.tar.gz
$ tar -xzvf tre-python3.tar.gz
$ cd tre-python3/python3
$ python3 setup.py install
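After either installation route, it can help to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of PopMLST: the PyPI distribution names, the executable names (including `blastn` for BLAST), and the helper functions are all assumptions made for illustration.

```python
from importlib import metadata
import shutil

# Versions listed in this README. The distribution names are assumed to
# match the names under which the libraries are published.
EXPECTED_PY = {"biopython": "1.79", "pandas": "1.3.4", "colorama": "0.4.4"}
# Executable names are assumptions; only their presence on PATH is checked.
EXPECTED_TOOLS = ["cutadapt", "vsearch", "pigz", "Rscript", "blastn"]


def installed_versions(names):
    """Return {name: version or None} for the given distribution names."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def compare_versions(expected, found):
    """List readable differences between expected and found versions."""
    problems = []
    for name, want in expected.items():
        have = found.get(name)
        if have is None:
            problems.append(f"{name}: not installed (tested with {want})")
        elif have != want:
            problems.append(f"{name}: found {have}, tested with {want}")
    return problems


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    for line in compare_versions(EXPECTED_PY, installed_versions(EXPECTED_PY)):
        print("WARNING:", line)
    for tool in missing_tools(EXPECTED_TOOLS):
        print("WARNING:", tool, "not found on PATH")
    try:
        import tre  # noqa: F401  (the patched library installed above)
    except ImportError:
        print("WARNING: the tre module could not be imported")
```

A version mismatch is only a warning here, since other versions may work but have not been tested.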
We assume you have generated your sequencing data in roughly the manner described in the paper, using Illumina paired-end sequencing. We provide some example files for testing below.
This pipeline assumes your paired-end Fastq files are named like so:
sampleX_1.fastq.gz sampleX_2.fastq.gz
sampleY_1.fastq.gz sampleY_2.fastq.gz
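Before running the pipeline, you may want to confirm that every sample has both read files under this naming scheme. The sketch below is an illustration only; `pair_fastq` and `pair_fastq_dir` are hypothetical helpers, not part of PopMLST.

```python
from pathlib import Path

# Suffixes for read 1 and read 2, as described above.
SUFFIXES = ("_1.fastq.gz", "_2.fastq.gz")


def pair_fastq(filenames):
    """Return ({sample: (read1, read2)}, [samples missing a mate])."""
    samples = {}
    for name in filenames:
        base = Path(name).name
        for index, suffix in enumerate(SUFFIXES):
            if base.endswith(suffix):
                sample = base[: -len(suffix)]
                samples.setdefault(sample, [None, None])[index] = name
    paired = {s: tuple(p) for s, p in samples.items() if None not in p}
    unpaired = sorted(s for s, p in samples.items() if None in p)
    return paired, unpaired


def pair_fastq_dir(directory):
    """Apply pair_fastq to every *.fastq.gz file in a directory."""
    files = sorted(Path(directory).glob("*.fastq.gz"))
    return pair_fastq(str(path) for path in files)
```

For example, `pair_fastq_dir("data/Pa")` would return the complete pairs found in the example data directory together with the names of any samples that are missing one of their two files. Files that do not end in either suffix are ignored.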
Below are instructions for two simple runs using example data for Pseudomonas aeruginosa and Staphylococcus aureus.
$ git clone https://github.com/marade/PopMLST.git
$ cd PopMLST
# run Pseudomonas data
$ python3 ProcessAmpliconData data/Pa PA-cutadapt.tab PA-results
$ Rscript DADA2-PA.R PA-results
$ python3 ParseDADA2Tabs ./ DADA2-PA out.tab D2-PA-combined.tab
$ python3 ParseDADA2Tab -f D2-PA-combined.tab PA-ref D2-PA-table.tab D2-PA-blast.tab
$ python3 FilterDADA2Tab D2-PA-table.filt.tab D2-PA-table.filt2.tab
$ python3 SortColNames D2-PA-table.filt2.tab D2-PA-table.filt.sorted.tab
# run Staph data
$ python3 ProcessAmpliconData data/Sa SA-cutadapt.tab SA-results
$ Rscript DADA2-SA.R SA-results
$ python3 ParseDADA2Tabs ./ DADA2-SA out.tab D2-SA-combined.tab
$ python3 ParseDADA2Tab -f D2-SA-combined.tab SA-ref D2-SA-table.tab D2-SA-blast.tab
$ python3 FilterDADA2Tab D2-SA-table.filt.tab D2-SA-table.filt2.tab
$ python3 SortColNames D2-SA-table.filt2.tab D2-SA-table.filt.sorted.tab
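The two runs above differ only in the organism tag (`PA` or `SA`) and the data directory. As a rough sketch, the six steps could be generated from those two values. `build_commands` and `run_pipeline` are hypothetical wrappers written for this illustration; the commands themselves are copied from the runs above, and the sketch assumes it is executed from the PopMLST directory.

```python
import shlex
import subprocess


def build_commands(tag, data_dir):
    """Return the six commands shown above for one tag, e.g. "PA" or "SA"."""
    return [
        ["python3", "ProcessAmpliconData", data_dir,
         f"{tag}-cutadapt.tab", f"{tag}-results"],
        ["Rscript", f"DADA2-{tag}.R", f"{tag}-results"],
        ["python3", "ParseDADA2Tabs", "./", f"DADA2-{tag}", "out.tab",
         f"D2-{tag}-combined.tab"],
        ["python3", "ParseDADA2Tab", "-f", f"D2-{tag}-combined.tab",
         f"{tag}-ref", f"D2-{tag}-table.tab", f"D2-{tag}-blast.tab"],
        ["python3", "FilterDADA2Tab", f"D2-{tag}-table.filt.tab",
         f"D2-{tag}-table.filt2.tab"],
        ["python3", "SortColNames", f"D2-{tag}-table.filt2.tab",
         f"D2-{tag}-table.filt.sorted.tab"],
    ]


def run_pipeline(tag, data_dir, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cmd in build_commands(tag, data_dir):
        print("$", shlex.join(cmd))
        if not dry_run:
            # check=True stops the run at the first failing step.
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline("PA", "data/Pa")` only prints the commands; pass `dry_run=False` to execute them in order.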
The resulting D2-PA-table.filt.sorted.tab file should look like this:
| | acs_4 | acs_6 | aro_5 | aro_5* | aro_75 | gua_11 | gua_16 | mut_12 | mut_3 | nuo_1 | nuo_4 | pps_23 | pps_6 | trp_1 | trp_3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
LES100sub | 0 | 11264 | 0 | 17 | 0 | 19523 | 0 | 0 | 8338 | 0 | 19069 | 21683 | 0 | 6311 | 0 |
PaCont2sub | 19879 | 1572 | 94 | 0 | 3402 | 1931 | 25718 | 16812 | 1274 | 34964 | 3880 | 395 | 4743 | 567 | 5840 |
ASV | GGCCCGTTGGCCAACGGCGCCACCACCATTCTGTTCGAGGGCGTACCGAACTACCCCGACGTGACCCGCGTGGCGAAGATCATCGACAAGCACAAGGTTAACATCCTCTACACCGCGCCGACCGCGATCCGCGCGATGATGGCCGAAGGCAAGGCGGCGGTGGCCGGTGCCGACGGTTCCAGCCTGCGTCTGCTCGGTTCGGTGGGCGAGCCGATCAACCCGGAAGCCTGGCAGTGGTACTACGAGACCGTCGGCCAGTCGCGCTGCCCGATCGTCGACACCTGGTGGCAGACCGAGACCGGCGCCTGCCTGATGACCCCGTTGCCGGGCGCCCATGCGATGAAGCCGGGCTCCGCGGCCAAGCCGTTCTTCGGCGTGGTCCCGGCGCTG | GGCCCGTTGGCCAACGGCGCCACCACCATTCTGTTCGAGGGCGTGCCGAACTACCCCGACGTGACCCGCGTGGCGAAAATCATCGACAAGCACAAGGTCAACATCCTCTACACCGCGCCGACCGCGATCCGCGCGATGATGGCCGAAGGCAAGGCGGCGGTGGCCGGTGCCGACGGTTCCAGCCTGCGTCTGCTCGGTTCGGTGGGCGAGCCGATCAACCCGGAAGCCTGGCAGTGGTACTACGAGACCGTCGGCCAGTCGCGCTGCCCGATCGTCGACACCTGGTGGCAGACCGAGACCGGCGCCTGCCTGATGACCCCGTTGCCGGGCGCCCATGCGATGAAGCCGGGCTCCGCGGCCAAGCCGTTCTTCGGCGTGGTCCCGGCGCTG | ATGTCACCGTGCCGTTCAAGGAAGAGGCCTATCGTCTGGTGGACGAGTTGAGCGAGCGGGCCACCCGGGCCGGGGCGGTGAACACCCTGATCCGCCTCGCCGACGGTCGCCTGCGCGGCGACAACACCGACGGCGCCGGCCTGCTGCGGGACCTGACGGCGAACGCCGGGGTCGAGCTGCGCGGCAAGCGGGTTCTCCTGCTCGGCGCCGGCGGTGCGGTGCGTGGGGTGCTCGAACCCTTCCTCGGCGAGTGCCCGGCGGAGTTGCTGATCGCCAACCGCACGGCGCGGAAGGCCGTGGACCTGGCCGAGCGGTTCGCCGACCTCGGCGCGGTGCACGGCTGCGGTTTCGCCGAGGTCGAAGGGCCTTTCGACCTGATCGTCAACGGCACCTCGGCCAGTCTTGCCGGCGACGTGCCGCCGCTGGCGCAGAGCGTGATCGAGCCCGGCCGTACCGTCTGCTACGACATGATGTATGCCAAGGAACCGACTGCCTTCA | ATGTCACCGTGCCGTTCAAGGAAGAGGCCTATCGTCTGGTGGACGAGTTGAGCGAGCGGGCCACCCGGGCCGGGGCGGTGAACACCCTGATCCGCCTCGCCGACGGTCGCCTGCGCGGCGACAACACCGACGGCGCCGGCCTGCTGCGGGACCTGACGGCGAACGCCGGGGTCGAGCTGCGCGGCAAGCGGGGTCTCCTGCTCGGCGCCGGCGGTGCGGTGCGTGGGGTGCTCGAACCCTTCCTCGGCGAGTGCCCGGCGGAGTTGCTGATCGCCAACCGCACGGCGCGGAAGGCCGTGGACCTGGCCGAGCGGTTCGCCGACCGCGGCGCGGTGCACGGCTGCGGTTTCGCCGAGGTCGAAGGGCCTTTCGACCTGATCGTCAACGGCACCTCGGCCAGTCTTGCCGGCGACGTGCCGCCGCTGGCGCAGAGCGTGATCGAGCCCGGCCGTACCGTCTGCTACGACATGATGTATGCCAAGGAACCGACTGCCTTCA | 
ATGTCACCGTGCCGTTCAAGGAAGAGGCCTATCGTCTGGTGGACGAATTGAGCGAGCGGGCCACCCGGGCCGGGGCGGTGAACACCCTGATCCGCCTGGCCGACGGTCGCCTGCGCGGCGACAACACCGACGGCGCGGGCTTGCTGCGGGACCTGACGGCGAACGCCGGGGTCGAGCTGCGCGGCAAGCGGGTTCTCCTGCTCGGCGCCGGCGGTGCGGTGCGCGGGGTGCTCGAACCCTTCCTCGGCGAGTGCCCGGCGGAGTTGCTGATCGCCAACCGCACGGCGCGGAAGGCCGTGGACCTGGCCGAGCGATTCGCCGATCTCGGCGCGGTGCGCGGCTGCGGTTTCGCCGAGGTCGAAGGGCCTTTCGACCTGGTCGTCAACGGCACCTCGGCCAGTCTTGCCGGCGACGTGCCGCCGCTGGCGCAGAGCGTGATCGAGCCCGGCCGTACCGTCTGCTACGACATGATGTATGCCAAGGAACCGACTGCCTTCA | CTGCTAGGCCTCTCCGGCGGCGTGGACTCCTCGGTGGTCGCCGCGCTGCTGCACAAGGCCATCGGCGACCAACTGACCTGCGTGTTCGTCGACAACGGCCTGCTGCGCCTGCACGAAGGCGACCAGGTGATGGCCATGTTCGCCGAGAACATGGGCGTGAAGGTGATCCGCGCCAACGCCGAGGACAAGTTCCTCGGCCGCCTGGCCGGCGTCGCCGACCCGGAAGAGAAGCGCAAGATCATCGGCCGCACCTTCATCGAAGTTTTCGACGAAGAAGCCACCAAGCTGCAGGACGTGAAGTTCCTCGCCCAGGGCACCATCTACCCCGACGTGATCGAGTCGGCCGGCGCCAAGACCGGCAAGGCCCACGTGA | CTGCTCGGCCTCTCCGGCGGCGTGGACTCCTCGGTGGTCGCCGCGCTGCTGCACAAGGCCATCGGCGACCAACTGACCTGCGTGTTCGTCGACAACGGCCTGCTGCGCCTGCACGAAGGCGACCAGGTGATGGCCATGTTCGCCGAGAACATGGGCGTGAAGGTGATCCGCGCCAACGCCGAGGACAAGTTCCTCGGCCGCCTGGCCGGCGTCGCCGATCCGGAAGAGAAGCGCAAGATCATCGGCCGCACCTTCATCGAAGTCTTCGACGAAGAAGCCACCAAGCTGCAGGACGTGAAGTTCCTCGCCCAGGGCACCATCTACCCCGACGTGATCGAGTCGGCCGGCGCCAAAACCGGCAAGGCCCACGTGA | CTGCAGGAAGTCATCAAGCGCCTGGCGCTGGCCCGTTTCGACGTGGCTTTCCACCTGCGCCACAACGGCAAGACCATCTTCGCCCTGCACGAGGCGCGAGACGAGCTGGCCCGCGCGCGCCGGGTCGGCGCGGTGTGCGGCCAGGCATTCCTCGAGCAGGCGCTGCCGATCGAGGTCGAGCGCAACGGCCTGCACCTGTGGGGCTGGGTCGGCTTGCCGACCTTCTCCCGCAGCCAGCCGGACCTGCAGTACTTCTATGTGAACGGGCGCATGGTGCGCGACAAGCTGGTCGCCCACGCGGTGCGCCAGGCTTATCGCGACGTGCTGTACAACGGCCGGCACCCGACCTTCGTGCTGTTCTTCGAAGTCGATCCGGCGGTGGTGGACGTCAACGTGCACCCGACCAAGCACGAAGTTCGCTTCCGTGACAGCCGGATGGTCC | 
CTGCAGGAGGTCATCAAGCGCCTGGCGCTGGCCCGCTTCGACGTGGCTTTCCACCTGCGCCACAACGGCAAGACCATCTTCGCCCTGCACGAGGCGCGAGACGAGCTGGCCCGCGCGCGCCGGGTCGGCGCGGTGTGCGGCCAGGCATTCCTCGAGCAGGCGCTGCCGATCGAGGTCGAGCGCAACGGCCTGCACCTGTGGGGCTGGGTCGGCTTGCCGACCTTCTCCCGCAGCCAGCCGGACCTGCAGTACTTCTATGTGAACGGGCGCATGGTGCGCGACAAGCTGGTCGCCCACGCGGTGCGCCAGGCTTATCGCGACGTGCTGTACAACGGCCGGCATCCGACCTTCGTGCTGTTCTTCGAAGTCGATCCGGCGGTGGTGGACGTCAACGTGCACCCGACCAAGCACGAAGTTCGCTTCCGTGACAGCCGGATGGTCC | ATGTTCCTCAACCTCGGCCCGAACCACCCGTCCGCCCACGGCGCGTTCCGCATCATCCTGCAACTGGACGGCGAGGAGATCATCGACTGCGTCCCGGAGATCGGCTACCACCACCGCGGCGCCGAGAAGATGGCCGAGCGCCAGTCCTGGCACAGTTTCATCCCCTACACCGACCGCATCGACTACCTCGGCGGGGTGATGAACAACCTGCCCTACGTACTCTCGGTGGAGAAGCTCGCCGGGATCAAGGTGCCGCAGCGGGTCGACGTGATCCGGATCATGATGGCGGAGTTCTTCCGTATCCTGAACCACCTGCTGTACCTGGGCACCTATATCCAGGACGTCGGCGCCATGACCCCGGTGTTC | ATGTTCCTCAACCTCGGCCCGAACCACCCGTCCGCCCACGGCGCGTTCCGCATCATCCTGCAACTGGACGGCGAGGAGATCATCGACTGCGTCCCGGAGATCGGCTACCACCACCGCGGCGCCGAGAAGATGGCCGAGCGCCAGTCCTGGCACAGTTTCATCCCCTACACCGACCGCATCGACTACCTCGGCGGGGTGATGAACAACCTGCCCTACGTACTCTCGGTGGAGAAGCTCGCCGGGATCAAGGTGCCCCAGCGGGTCGACGTGATCCGGATCATGATGGCGGAGTTCTTCCGTATCCTGAACCACCTGCTGTACCTGGGCACCTATATCCAGGACGTCGGCGCCATGACCCCGGTGTTC | CATCGTCCAGGCACGCCCGGAAACCGTGAAGAGCCGCGCCAGCGCCACGGTCATGGAGCGCTACCTGCTGAAAGAGAAGGGGACCGTCCTGGTGGAAGGGCGTGCCATCGGCCAGCGCATCGGTGCCGGTCCGGTCAAGGTGATCAACGACGTGTCGGAAATGGACAAGGTCCAACCGGGTGACGTCCTGGTCTCCGACATGACCGACCCGGACTGGGAGCCGGTGATGAAGCGCGCCAGCGCCATCGTCACCAACCGCGGCGGGCGTACCTGCCACGCGGCGATCATCGCTCGCGAACTGGGCATCCCGGCGGTGTTCGGTTGCGGCAACGCCACCCAGATCCTGCAGGATGGCCAGGGGGTGACCGTT | CATCGTCCAGGCACGCCCGGAAACCGTGAAGAGCCGCGCCAGCGCCACGGTCATGGAGCGCTACCTGCTGAAAGAGAAGGGGACCGTCCTGGTGGAAGGACGTGCCATCGGCCAGCGCATCGGTGCCGGTCCGGTCAAGGTGATCAACGACGTGTCGGAAATGGACAAGGTCCAACCGGGTGACGTCCTGGTCTCCGACATGACCGACCCGGACTGGGAGCCGGTGATGAAGCGCGCCAGCGCCATCGTCACCAACCGCGGCGGGCGTACCTGCCACGCGGCGATCATCGCTCGCGAACTGGGCATCCCGGCGGTGGTCGGTTGCGGCAACGCCACCCAGATCCTGCAGGATGGGCAGGGGGTGACCGTT | 
TGTCGTGGGCAGCTCGCCGGAGGTGCTGGTACGGGTCGAGGATGGCCTGGTGACGGTGCGCCCGATCGCCGGTACCCGTCCGCGCGGGATCAACGAAGAGGCCGACCTGGCGCTGGAGCAGGATCTGCTGTCGGACGCCAAGGAGATCGCCGAGCACCTGATGCTGATCGACCTGGGGCGCAACGACGTGGGGCGGGTGTCCGATATCGGCGCGGTGAAGGTCACCGAAAAAATGGTGATCGAACGTTACTCCAACGTCATGCACATCGTGTCCAACGTCACCGGGCAATTGCGCGAGGGGCTCAGCGCGATGGACGCGCTGCGGGCGATTCTGCCGGCGGGCACTCTATCCGGCGCGCCGAAGATCCGCGCCATGGAGATCATCGACGAGCTGGAGCCGGTCAAGCGTGGAGTCTACGGCGGCGCGGTCGGCTACCTGGCAT | TGTCGTGGGCAGCTCGCCGGAGGTGCTGGTACGGGTCGAGGATGGCCTGGTGACGGTGCGCCCGATCGCCGGTACCCGTCCGCGCGGGATCAACGAAGAGGCCGACCTGGCGCTGGAGCAGGATCTGCTGTCGGACGCCAAGGAGATCGCCGAGCACCTGATGCTGATCGACCTGGGGCGCAACGACGTGGGGCGGGTGTCCGACATCGGCGCGGTGAAGGTCACCGAAAAAATGGTGATCGAACGTTACTCCAACGTCATGCACATCGTGTCCAACGTCACCGGGCAATTGCGCGAGGGGCTCAGCGCGATGGACGCGCTGCGGGCGATCCTGCCGGCGGGTACGCTGTCCGGCGCGCCGAAGATCCGCGCCATGGAGATCATCGACGAGCTGGAGCCGGTCAAGCGTGGAGTCTACGGCGGCGCGGTCGGCTACCTGGCAT |
The resulting D2-SA-table.filt.sorted.tab file should look like this:
| | arcC_10 | arcC_13 | arcC_3 | aroE_13 | aroE_14 | aroE_3 | glpF_1 | glpF_8 | gmk_1 | gmk_6 | pta_10 | pta_12 | pta_4 | tpi_11 | tpi_3 | tpi_4 | yqiL_13 | yqiL_2 | yqiL_3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
2-8 | 0 | 10933 | 0 | 13163 | 0 | 0 | 11680 | 0 | 17466 | 0 | 0 | 9385 | 0 | 19384 | 0 | 0 | 13791 | 0 | 0 |
No-36-33-10-1A | 3790 | 0 | 921 | 0 | 6148 | 341 | 357 | 3278 | 1594 | 9787 | 5867 | 0 | 962 | 0 | 13174 | 2309 | 0 | 5441 | 1033 |
ASV | TTATTAATCCAACAAGCTAAATCGAACAGTGACACAACGCCGGCAATGCCATTGGATACTTGTGGTGCAATGTCACAGGGTATGATAGGCTATTGGTTGGAAACTGAAATCAATCGCATTTTAACTGAAATGAATAGTGATAGAACTGTAGGCACAATCGTAACACGTGTGGAAGTAGATAAAGATGATCCACGATTCAATAACCCAACCAAACCAATTGGTCCTTTTTATACGAAAGAAGAAGTTGAAGAATTACAAAAAGAACAGCCAGACTCAGTCTTTAAAGAAGATGCAGGACGTGGTTATAGAAAAGTAGTTGCGTCACCACTACCTCAATCTATACTAGAACACCAGTTAATTCGAACTTTAGCAGACGGTAAAAATATTGTCATTGCATGCGGTGGTGGCGGTATTCCAGTTATAAAAAAAGAAAATACCTATGAAGGTGTTGAAGCG | TTATTAATCCAACAAGCTAAATCGAACAGTGACACAACGCCGGCAATGCCATTGGATACTTGTGGTGCAATGTCACAGGGTATGATAGGCTATTGGTTGGAAACTGAAATCAATCGCATTTTAACTGAAATGAATAGTGATAGAACTGTAGGCACAATCGTTACACGTGTGGAAGTAGATAAAGATGATCCACGATTCAATACCCCAACCAAACCAATTGGTCCTTTTTATACGAAAGAAGAAGTTGAAGAATTACAAAAAGAACAGCCAGACTCAGTCTTTAAAGAAGATGCAGGACGTGGTTATAGAAAAGTAGTTGCGTCACCACTACCTCAATCTATACTAGAACACCAGTTAATTCGAACTTTAGCAGACGGTAAAAATATTGTCATTGCATGCGGTGGTGGCGGTATTCCAGTTATAAAAAAAGAAAATACCTATGAAGGTGTTGAAGCG | TTATTAATCCAACAAGCTAAATCGAACAGTGACACAACGCCGGCAATGCCATTGGATACTTGTGGTGCAATGTCACAGGGTATGATAGGCTATTGGTTGGAAACTGAAATCAATCGCATTTTAACTGAAATGAATAGTGATAGAACTGTAGGCACAATCGTTACACGTGTGGAAGTAGATAAAGATGATCCACGATTTGATAACCCAACTAAACCAATTGGTCCTTTTTATACGAAAGAAGAAGTTGAAGAATTACAAAAAGAACAGCCAGACTCAGTCTTTAAAGAAGATGCAGGACGTGGTTATAGAAAAGTAGTTGCGTCACCACTACCTCAATCTATACTAGAACACCAGTTAATTCGAACTTTAGCAGACGGTAAAAATATTGTCATTGCATGCGGTGGTGGCGGTATTCCAGTTATAAAAAAAGAAAATACCTATGAAGGTGTTGAAGCG | AATTTTAATTCTTTAGGATTAGCTGATACTTATGAAGCTTTAAATATTCCAATTGAAGATTTTCATTTAATTAAAGAAATTATTTCAAAAAAAGAATTAGATGGCTTTAATATCACAATTCCTCATAAAGAACGTATCATATCGTATTTAGATCATGTTGATGAACAAGCGATTAATGCAGGTGCAGTTAACACTGTTTTGATAAAAGATGGCAAGTGGATAGGGTATAATACAGATGGTATTGGTTATGTTAAAGGATTGCACAGCGTTTATCCAGATTTAGAAAATGCATACATTTTAATTTTGGGCGCAGGTGGTGCAAGTAAAGGCATTGCTTATGAATTAGCAAAATTTGTAAAGCCCAAATTAACTGTTGCGAATAGAACGATGGCTCGTTTTGAATCTTGGAATTTAAATATAAACCAAATTTCATTAGCAGATGCTGAAAAGTATTTA | 
AATTTTAATTCTTTGGGATTAGATGATACTTATGAAGCTTTAAATATTCCAATTGAAGATTTTCATTTAATTAAAGAAATTATTTCAAAAAAAGAATTAGATGGCTTTAATATCACAATTCCTCATAAAGAGCGTATCATACCGTATTTAGATCATGTTGATGAACAAGCGATTAATGCAGGTGCAGTTAATACTGTTTTGATAAAAGATGGCAAGTGGATAGGGTATAATACAGATGGTATTGGTTATGTAAAAGGATTGCACAGCGTTTATCCAGATTTAGAAAATGCATACATTTTAATTTTGGGAGCAGGTGGTGCAAGTAAAGGTATTGCTTATGAATTAGCAAAATTTGTAAAGCCCAAATTAACTGTTGCGAATAGAACGATGGCTCGTTTTGAATCTTGGAATTTAAATATAAACCAAATTTCATTGGCAGATGCTGAAAAGTATTTA | AATTTTAATTCTTTAGGATTAGATGATACTTATGAAGCTTTAAATATTCCAATTGAAGATTTTCATTTAATTAAAGAAATTATTTCGAAAAAAGAATTAGAAGGCTTTAATATCACAATTCCTCATAAAGAACGTATCATACCGTATTTAGATTATGTTGATGAACAAGCGATTAATGCAGGTGCAGTTAACACTGTTTTGATAAAAGATGGCAAGTGGATAGGGTATAATACAGATGGTATTGGTTATGTTAAAGGATTGCACAGCGTTTATCCAGATTTAGAAAATGCATACATTTTAATTTTGGGCGCAGGTGGTGCAAGTAAAGGTATTGCTTATGAATTAGCAAAATTTGTAAAGCCCAAATTAACTGTTGCGAATAGAACGATGGCTCGTTTTGAATCTTGGAATTTAAATATAAACCAAATTTCATTAGCAGATGCTGAAAAGTATTTA | GGTGCTGATTGGATTGTCATCACAGCTGGATGGGGATTAGCGGTTACAATGGGTGTGTTTGCTGTCGGTCAATTCTCAGGTGCACATTTAAACCCAGCGGTGTCTTTAGCTCTTGCATTAGACGGAAGTTTTGATTGGTCATTAGTTCCTGGTTATATTGTTGCTCAAATGTTAGGTGCAATTGTCGGAGCAACAATTGTATGGTTAATGTACTTGCCACATTGGAAAGCGACAGAAGAAGCTGGCGCGAAATTAGGTGTTTTCTCTACAGCACCGGCTATTAAGAATTACTTTGCCAACTTTTTAAGTGAGATTATCGGAACAATGGCATTAACTTTAGGTATTTTATTTATCGGTGTAAACAAAATTGCCGATGGTTTAAATCCTTTAATTGTCGGAGCATTAATTGTTGCAATCGGATTAAGTTTAGGCGGTGCTACTGGTTATGCAATCAACCCAGCACGT | GGTGCTGATTGGATTGTCATCACAGCTGGATGGGGATTAGCGGTTACAATGGGTGTATATGCTGTCGGTCAATTCTCAGGTGCACATTTAAACCCAGCGGTGTCTTTAGCTCTTGCATTAGACGGAAGTTTTGATTGGTCATTAGTTCCTGGTTATATTGTTGCTCAAATGTTAGGTGCAATTGTCGGAGCAACGATTGTATGGTTAATGTACTTGCCACATTGGAAAGCGACAGAAGAAGCTGGCGCGAAATTAGGTGTTTTCTCTACAGCACCGGCTATTAAGAATTACTTTGCCAACTTTTTAAGTGAGATTATCGGAACAATGGCATTAACTTTAGGTATTTTATTTATCGGTGTAAACAAAATTGCCGATGGTTTAAATCCTTTAATTGTCGGAGCATTAATTGTTGCAATTGGATTAAGTTTAGGCGGTGCTACTGGTTATGCAATCAACCCAGCACGT | 
CGAATATTTGAAGATCCAAGTACATCATATAAGTATTCTATTTCAATGACAACACGTCAAATGCGTGAAGGTGAAGTTGATGGCGTAGATTACTTTTTTAAAACTAGGGATGCGTTTGAAGCTTTAATCAAAGATGACCAATTTATAGAATATGCTGAATATGTAGGCAACTATTATGGTACACCAGTTCAATATGTTAAAGATACAATGGACGAAGGTCATGATGTATTTTTAGAAATTGAAGTAGAAGGTGCAAAGCAAGTTAGAAAGAAATTTCCAGATGCGCTATTTATTTTCTTAGCACCTCCAAGTTTAGAACACTTGAGAGAGCGATTAGTAGGTAGAGGAACAGAATCTGATGAGAAAATACAAAGTCGTATTAACGAAGCGCGTAAAGAAGTTGAAATGATGAATTTA | CGAATATTTGAAGATCCAAGTACATCATATAAGTATTCTATTTCAATGACAACACGTCAAATGCGTGAAGGTGAAGTTGATGGCGTAGATTACTTTTTTAAAACTAGGGATGCGTTTGAAGCTTTAATTAAAGATGACCAATTTATAGAATATGCTGAATATGTAGGCAACTATTATGGTACACCAGTTCAATATGTTAAAGATACAATGGACGAAGGTCATGATGTATTTTTAGAAATTGAAGTAGAAGGTGCAAAGCAAGTTAGAAAGAAATTTCCAGATGCGTTATTTATTTTCTTAGCACCTCCAAGTTTAGATCACTTGAGAGAGCGATTAGTAGGTAGAGGAACAGAATCTGATGAGAAAATACAAAGTCGTATTAACGAAGCACGTAAAGAAGTTGAAATGATGAATTTA | GCAACACAATTACAAGCAACAGATTATGTTACACCAATCGTGTTAGGTGATGAGACTAAGGTTCAATCTTTAGCGCAAAAACTTAATCTTGATATTTCTAATATTGAATTAATTAATCCTGCGACAAGTGAATTGAAAGCTGAATTAGTTCAATCATTTGTTGAACGACGTAAAGGTAAAGCGACTGAAGAACAAGCACAAGAATTATTAAACAATGTGAACTACTTCGGTACAATGCTTGTTTATGCTGGTAAAGCAGATGGTTTAGTTAGTGGTGCAGCACATTCAACAGGCGACACTGTGCGTCCAGCATTACAAATCATCAAAACGAAACCAGGTGTATCAAGAACATCAGGTATCTTCTTTATGATTAAAGGTGATGAACAATACATCTTTGGTGATTGTGCAATCAATCCAGAACTTGATTCACAAGGACTTGCAGAAATTGCAGTAGAAAGTGCAAAATCAGCATTA | GCAACACAATTACAAGCAACAGATTATGTTACACCAATCGTGTTAGGTGATGAGACTAAGGTTCAATCTTTAGCGCAAAAACTTGATCTTGATATTTCTAATATTGAATTAATTAATCCTGCGACAAGTGAATTGAAAGCTGAATTAGTTCAATCATTTGTTGAACGACGTAAAGGTAAAGCGACTGAAGAACAAGCACAAGAATTATTAAACAATGTGAACTACTTCGGTACAATGCTTGTTTATGCTGGTAAAGCAGATGGTTTAGTTAGTGGTGCAGCACATTCAACAGGAGACACTGTGCGTCCAGCTTTACAAATCATCAAAACGAAACCAGGTGTATCAAGAACATCAGGTATCTTCTTTATGATTAAAGGTGATGAACAATACATCTTTGGTGATTGTGCAATCAATCCAGAACTTGATTCACAAGGACTTGCAGAAATTGCAGTAGAAAGTGCAAAATCAGCATTA | 
GCAACACAATTACAAGCAACAGATTATGTTACACCAATCGTGTTAGGTGATGAGACTAAGGTTCAATCTTTAGCGCAAAAACTTGATCTTGATATTTCTAATATTGAATTAATTAATCCTGCGACAAGTGAATTGAAAGCTGAATTAGTTCAATCATTTGTTGAACGACGTAAAGGTAAAGCGACTGAAGAACAAGCACAAGAATTATTAAACAATGTGAACTACTTCGGTACAATGCTTGTTTATGCTGGTAAAGCAGATGGTTTAGTTAGTGGTGCAGCACATTCAACAGGCGACACTGTGCGTCCAGCTTTACAAATCATCAAAACGAAACCAGGTGTATCAAGAACATCAGGTATCTTCTTTATGATTAAAGGTGATGAACAATACATCTTTGGTGATTGTGCAATCAATCCAGAACTTGATTCACAAGGACTTGCAGAAATTGCAGTAGAAAGTGCAAAATCAGCATTA | CACGAAACAGATGAAGAAATTAACAAAAAAGCGCACGCTATTTTCAAATATGGAATGACTCCAATTATTTGTGTTGGTGAAACAGACGAAGAGCGTGAAAGTGGTAAAGCTAACGATGTTGTAGGTGAGCAAGTTAAGAAAGCTGTTGCAGGTTTATCTGAAGATCAACTTAAATCAGTTGTAATTGCTTATGAGCCAATCTGGGCAATCGGAACTGGTAAATCATCAACATCTGAAGATGCAAATGAAATGTGTGCATTTGTACGTCAAACTATTGCTGACTTATCAAGCAAAGAAGTATCAGAAGCAACTCGTATTCAATATGGTGGTAGTGTTAAACCTAACAACATTAAAGAATACATGGCACAAACTGATATTGATGGGGCATTAGTAGGTGGCGCA | CACGAAACAGATGAAGAAATTAACAAAAAAGCGCACGCTATTTTCAAACATGGAATGACTCCAATTATTTGTGTTGGTGAAACAGACGAAGAGCGTGAAAGTGGTAAAGCTAACGATGTTGTAGGTGAGCAAGTTAAGAAAGCTGTTGCAGGTTTATCTGAAGATCAACTTAAATCAGTTGTAATTGCTTATGAACCAATCTGGGCAATCGGAACTGGTAAATCATCAACATCTGAAGATGCGAATGAAATGTGTGCATTTGTACGTCAAACTATTGCTGACTTATCAAGCAAAGAAGTATCAGAAGCAACTCGTATTCAATATGGTGGTAGTGTTAAACCTAACAACATTAAAGAATACATGGCACAAACTGATATTGATGGGGCATTAGTAGGTGGCGCA | CACGAAACAGATGAAGAAATTAACAAAAAAGCGCACGCTATTTTCAAACATGGAATGACTCCAATTATATGTGTTGGTGAAACAGACGAAGAGCGTGAAAGTGGTAAAGCTAACGATGTTGTAGGTGAGCAAGTTAAGAAAGCTGTTGCAGGTTTATCTGAAGATCAACTTAAATCAGTTGTAATTGCTTATGAACCAATCTGGGCAATCGGAACTGGTAAATCATCAACATCTGAAGATGCAAATGAAATGTGTGCATTTGTACGTCAAACTATTGCTGACTTATCAAGCAAAGAAGTATCAGAAGCAACTCGTATTCAATATGGTGGTAGTGTTAAACCTAACAACATTAAAGAATACATGGCACAAACTGATATTGATGGGGCATTAGTAGGTGGCGCA | 
GCGTTTAAAGACGTGCCAGCCTATGATTTAGGTGCGACTTTAATAGAACATATTATTAAAGAGACGGGTTTGAATCCAAGTGAGATTGATGAAGTTATCATCGGTAACGTACTACAAGCAGGACAAGGACAAAATCCAGCACGAATTGCTGCTATGAAAGGTGGCTTGCCAGAAACAGTACCTGCATTTACAGTGAATAAAGTATGTGGTTCTGGGTTAAAGTCGATTCAATTAGCATATCAATCTATTGTGACTGGTGAAAATGACATCGTGCTAGCTGGCGGTATGGAGAATATGTCTCAGTCACCAATGCTTGTCAACAACAGTCGCTTTGGTTTTAAAATGGGACATCAATCAATGGTTGATAGCATGGTATATGATGGTTTAACAGATGTATTTAATCAATATCATATGGGTATTACTGCTGAAAATTTAGTAGAGCAATATGGTATTTCAAGAGAAGAACAAGATACATTTGCTGTAAACTCACAACAAAAAGCAGTACGTGCACAGCAA | GCGTTTAAAGACGTGCCAGCCTATGATTTAGGTGCGACTTTAATAGAACATATTATTAAAGAGACGGGTTTGAATCCAAGTGAGATTAATGAAGTCATCATCGGTAACGTACTACAAGCAGGACAAGGACAAAATCCAGCACGAATTGCTGCTATGAAAGGTGGCTTGCCAGAAACAGTACCTGCATTTACAGTGAATAAAGTATGTGGTTCTGGGTTAAAGTCGATTCAATTAGCATATCAATCTATTGTGACTGGTGAAAATGACATCGTGCTAGCTGGCGGTATGGAGAATATGTCTCAATCACCAATGCTTGTCAACAACAGTCGCTTTGGTTTTAAAATGGGACATCAATCAATGGTTGATAGCATGGTATATGATGGTTTAACAGATGTATTTAATCAATATCATATGGGTATTACTGCTGAAAATTTAGTAGAGCAATATGGTATTTCAAGAGAAGAACAAGATACATTTGCTGTAAACTCACAACAAAAAGCAGTACGTGCACAGCAA | GCGTTTAAAGACGTGCCAGCCTATGATTTAGGTGCGACTTTAATAGAACATATTATTAAAGAGACGGGTTTGAATCCAAGTGAGATTGATGAAGTTATCATCGGTAACGTACTACAAGCAGGACAAGGACAAAATCCAGCACGAATTGCTGCTATGAAAGGTGGCTTGCCAGAAACAGTACCTGCATTTACGGTGAATAAAGTATGTGGTTCTGGGTTAAAGTCGATTCAATTAGCATATCAATCTATTGTGACTGGTGAAAATGACATCGTGCTAGCTGGCGGTATGGAGAATATGTCTCAATCACCAATGCTTGTCAACAACAGTCGCTTTGGTTTTAAAATGGGACATCAATCAATGGTTGATAGCATGGTATATGATGGTTTAACAGATGTATTTAATCAATATCATATGGGTATTACTGCTGAAAATTTAGTAGAGCAATATGGTATTTCAAGAGAAGAACAAGATACATTTGCTGTAAACTCACAACAAAAAGCAGTACGTGCACAGCAA |
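The read counts in these tables can be turned into per-locus allele fractions to see at a glance which samples contain more than one allele at a locus. The sketch below is not part of PopMLST and rests on assumptions about the file layout: tab-delimited text, a header row whose first cell is empty, one row per sample, and a final `ASV` row holding the sequences. The locus name is taken to be the part of the column name before the first underscore.

```python
import csv
import io
from collections import defaultdict


def allele_fractions(handle):
    """Return {sample: {locus: {allele: fraction}}} for alleles with reads."""
    reader = csv.reader(handle, delimiter="\t")
    alleles = next(reader)[1:]          # header: empty cell, then allele names
    result = {}
    for row in reader:
        if not row or row[0] == "ASV":  # skip blank lines and the sequence row
            continue
        counts = defaultdict(dict)
        for allele, value in zip(alleles, row[1:]):
            if int(value) > 0:
                locus = allele.split("_")[0]
                counts[locus][allele] = int(value)
        result[row[0]] = {
            locus: {a: n / sum(hits.values()) for a, n in hits.items()}
            for locus, hits in counts.items()
        }
    return result


# Two columns taken from the Pseudomonas table above.
demo = (
    "\tacs_4\tacs_6\n"
    "LES100sub\t0\t11264\n"
    "PaCont2sub\t19879\t1572\n"
)
fractions = allele_fractions(io.StringIO(demo))
```

In this excerpt, `LES100sub` carries a single `acs` allele, while `PaCont2sub` carries two at unequal fractions. To read a real output file, open it with `open("D2-PA-table.filt.sorted.tab", newline="")` and pass the handle to `allele_fractions`.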
Please consult the paper for more details on this method.