Add cg/wgMLST allele calling pipeline #3

apetkau · 2023-08-21T13:48:04Z

1. Purpose

The cg/wgMLST allele calling pipeline will be used for calling alleles from genomic sequence data.

Note: This is an in-development description of this pipeline.

2. Input

2.1. Sequence data

The main input for this pipeline will be genomic sequence data, in the form of either reads or assemblies. The data will be provided to Nextflow via a samplesheet passed as --input samplesheet.csv. The samplesheet will be structured as follows:

sample	assembly	fastq_1	fastq_2
SampleA	/path/to/SampleA.fasta.gz
SampleB		/path/to/SampleB_1.fastq.gz	/path/to/SampleB_2.fastq.gz
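As a rough illustration of how such a samplesheet could be interpreted, the following Python sketch classifies each row as an assembly or as reads. It is not part of the pipeline. The function name read_samplesheet, the comma delimiter (inferred from the .csv file name), the rule that a row may not mix an assembly with reads, and the acceptance of single-end reads are all assumptions made for the example.

```python
import csv
import io


def read_samplesheet(handle, delimiter=","):
    """Classify each sample as 'assembly' or 'reads' (hypothetical helper)."""
    samples = {}
    for row in csv.DictReader(handle, delimiter=delimiter):
        # Missing trailing columns come back as None, so normalise to "".
        name = (row.get("sample") or "").strip()
        assembly = (row.get("assembly") or "").strip()
        fastq_1 = (row.get("fastq_1") or "").strip()
        fastq_2 = (row.get("fastq_2") or "").strip()
        if not name:
            raise ValueError("every row needs a sample name")
        if assembly and not (fastq_1 or fastq_2):
            samples[name] = {"type": "assembly", "files": [assembly]}
        elif fastq_1 and not assembly:
            # Assumption: fastq_2 is optional (single-end reads allowed).
            files = [fastq_1] + ([fastq_2] if fastq_2 else [])
            samples[name] = {"type": "reads", "files": files}
        else:
            # Assumption: a row must give an assembly or reads, not both.
            raise ValueError(f"{name}: provide either an assembly or reads")
    return samples


# The example rows from this section, written with commas.
EXAMPLE = """sample,assembly,fastq_1,fastq_2
SampleA,/path/to/SampleA.fasta.gz,,
SampleB,,/path/to/SampleB_1.fastq.gz,/path/to/SampleB_2.fastq.gz
"""

samples = read_samplesheet(io.StringIO(EXAMPLE))
for name, info in samples.items():
    print(name, info["type"], info["files"])
```

Passing delimiter="\t" would read a tab-separated layout instead, should the final format differ from the assumption made here.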

2.2. MLST scheme

An MLST scheme will be provided, using the following parameters:

--mlst_scheme_name: The name of the scheme.
--mlst_scheme_data: Path to the data for the scheme.

3. Steps

This pipeline will generate a (cg/wg)MLST profile for each sample from the input data.

4. Output

4.1. Tabular allele files

A table of all allele identifiers for every locus in the scheme will be provided.

sample	locus1	locus2	...
SampleA	5	10	...

4.2. JSON metadata

A JSON file, output.json, will be provided with all allele calls structured so that they can be loaded by other systems (e.g., IRIDA Next). This will look like:

{
    "SampleA": {
        "listeria_cgmlst": {
            "locus1": 5,
            "locus2": 10
        }
    },
    "SampleB": {
        "listeria_cgmlst": {
            "locus1": 1,
            "locus2": 10
        }
    }
}
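To show how the JSON structure relates to the tabular output in 4.1, here is a minimal Python sketch that flattens the nested sample/scheme/locus layout into one row per sample. It is illustrative only: the function name profiles_to_rows, the alphabetical ordering of loci and samples, and the empty string used for a missing call are assumptions, not decisions made in this proposal.

```python
import json


def profiles_to_rows(data, scheme):
    """Flatten {sample: {scheme: {locus: allele}}} into table rows (hypothetical helper)."""
    # Collect every locus seen for this scheme across all samples.
    loci = sorted({locus for calls in data.values() for locus in calls.get(scheme, {})})
    rows = [["sample"] + loci]
    for sample in sorted(data):
        calls = data[sample].get(scheme, {})
        # Assumption: a missing allele call is written as an empty cell.
        rows.append([sample] + [str(calls.get(locus, "")) for locus in loci])
    return rows


# The example content of output.json from this section.
EXAMPLE_JSON = """
{
    "SampleA": {"listeria_cgmlst": {"locus1": 5, "locus2": 10}},
    "SampleB": {"listeria_cgmlst": {"locus1": 1, "locus2": 10}}
}
"""

data = json.loads(EXAMPLE_JSON)
rows = profiles_to_rows(data, "listeria_cgmlst")
for row in rows:
    print("\t".join(row))
```

Keeping the scheme name as an explicit argument means the same file could hold calls for more than one scheme per sample, which the nested layout appears to allow.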


apetkau · 2023-08-21T15:39:13Z

Test implementation at https://github.com/apetkau/nf-core-mlstprofiler
