Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No alignments displayed #29

Open
snakesch opened this issue Dec 21, 2023 · 16 comments
Open

No alignments displayed #29

snakesch opened this issue Dec 21, 2023 · 16 comments

Comments

@snakesch
Copy link

snakesch commented Dec 21, 2023

Dear Jim,

Thank you for the amazing IGV notebook tool.

I recently wanted to visualize some alignments (in BAM format) using IGV notebook against the T2T genome from RefSeq (GCF_009914755.1). I tried using both relative paths (code provided below) and absolute paths but none of them was able to show alignments. Could you please advise if there is any incorrect usage with the following implementations? Many thanks in advance!

Note: Jupyter server was started under the directory /paedyl01/disk1/louisshe/. Working directory is /paedyl01/disk1/louisshe/work/common. The version of igv-notebook I am using us 0.5.2.

(Relative paths)

import igv_notebook

igv_notebook.init()

# Browse a local FASTA genome
b = igv_notebook.Browser(
    {
        "reference": {
            "id": "T2T-CHM13v2.0",
            "name": "T2T",
            "fastaURL": "ref/T2T/GCF_009914755.1/GCF_009914755.1_T2T-CHM13v2.0_genomic.fa",
            "indexURL": "ref/T2T/GCF_009914755.1/GCF_009914755.1_T2T-CHM13v2.0_genomic.fa.fai"
        },
        "locus": "NC_060928.1:193,432,000-193,554,000"
    }
)

# Load alignment data
b.load_track(
    {
        "name": "HG02723 PacBio HiFi",
        "url":  "out/D4Z4_HG02723_test/PacBio_HiFi/merged_mapped.sorted.bam",
        "indexURL": "out/D4Z4_HG02723_test/PacBio_HiFi/merged_mapped.sorted.bam.bai",
        "format": "bam",
        "type": "alignment"
    }
)

b.zoom_in()

(absolute paths)

import igv_notebook

igv_notebook.init()

# Browse a local FASTA genome
b = igv_notebook.Browser(
    {
        "reference": {
            "id": "T2T-CHM13v2.0",
            "name": "T2T",
            "fastaURL": "/paedyl01/disk1/louisshe/ref/T2T/GCF_009914755.1/GCF_009914755.1_T2T-CHM13v2.0_genomic.fa",
            "indexURL": "/paedyl01/disk1/louisshe/ref/T2T/GCF_009914755.1/GCF_009914755.1_T2T-CHM13v2.0_genomic.fa.fai"
        },
        "locus": "NC_060928.1:193,432,000-193,554,000"
    }
)

# Load alignment data
b.load_track(
    {
        "name": "HG02723 PacBio HiFi",
        "url":  "/paedyl01/disk1/louisshe/out/D4Z4_HG02723_test/PacBio_HiFi/merged_mapped.sorted.bam",
        "indexURL": "/paedyl01/disk1/louisshe/out/D4Z4_HG02723_test/PacBio_HiFi/merged_mapped.sorted.bam.bai",
        "format": "bam",
        "type": "alignment"
    }
)

b.zoom_in()
@jrobinso
Copy link
Contributor

It might be due to a sequence name mismatch. Compare the sequence names in our bam with those in the fasta. Or just add this to your "reference" definition and try again.

"aliasURL": "https://s3.amazonaws.com/igv.org.genomes/chm13v2.0/GCA_009914755.4.chromAlias.txt"

@snakesch
Copy link
Author

snakesch commented Dec 22, 2023

Hi Jim,

Thank you for the suggestion. The alignment file (merged_mapped.sorted.bam) was produced by mapping raw sequencing reads to the customised T2T genome, and therefore should have the same contig names.

I also tried adding aliasURL as suggested but I still got no alignments in the output IGV. It appears that the reference was not detected/contig were not recognised.

A snapshot of the IGV output:
image

On a side note, I noticed in the aliasURL that the contig names were like CP068277.2 but the ones I have in the customised reference are NC_060925.1. Is there another alias file I can use?

Thanks,
Louis

@jrobinso
Copy link
Contributor

What are the names in your fasta file? You can actually get these from the fasta index (fai), which is just a plain text file.

@jrobinso
Copy link
Contributor

Sorry maybe you answered that, so what are the names in your BAM file? You can get these with samtools

samtools view -H  <bam file>

@jrobinso
Copy link
Contributor

Ah, you might try this chrom alias file, which I just discovered at UCSC

https://hgdownload.soe.ucsc.edu/goldenPath/hs1/bigZips/hs1.chromAlias.txt

@snakesch
Copy link
Author

I tried using the UCSC alias file and creating one from fai index file. Both failed.

Here are the contig names I found from the BAM file:
image

Thanks,
Louis

@jrobinso
Copy link
Contributor

OK, well I'm out of ideas. If you are able to zip up and share a test case I will look into it. I can provide you a secure dropbox link. Email [email protected] if you would like to do this.

@snakesch
Copy link
Author

Many thanks! I just sent an email to the address. As all the data are downloaded from public domains, I think there is no concern to share the data.

@jrobinso
Copy link
Contributor

OK, you should get a link from dropbox to upload the files. If you don't let me know.

Also, could you share a notebook? Or just paste the notebook contents here.

@snakesch
Copy link
Author

Hi Jim,

Could you send the link to the address [email protected]? I will have the notebook shared with the FASTA and BAM.

Thanks,
Louis

@jrobinso
Copy link
Contributor

Yes, sent.

@jrobinso
Copy link
Contributor

jrobinso commented Jan 8, 2024

I think you have a path problem, the cell below worked for me from your test data. Unfortunately it looks like igv-notebook does not give a friendly message or any message, if the URL isn't found. I will work on that.

Where is this path relative to the location of the notebook?

out/D4Z4_HG02723_test/PacBio_HiFi/merged_mapped.sorted.bam

This worked for me, all files in the directory of the notebook as you sent to us. Note I increased the visibilityWindow to 100kb from the default 30kb as your initial locus is 62kb.

import igv_notebook

igv_notebook.init()

# Browse a local FASTA genome
b = igv_notebook.Browser(
    {
        "reference": {
            "id": "T2T-CHM13v2.0",
            "name": "T2T",
            "fastaURL": "GCF_009914755.1_T2T-CHM13v2.0_genomic.fa",
            "indexURL": "GCF_009914755.1_T2T-CHM13v2.0_genomic.fa.fai",
            "aliasURL": "https://hgdownload.soe.ucsc.edu/goldenPath/hs1/bigZips/hs1.chromAlias.txt"
        },
        "locus": "NC_060928.1:193,432,000-193,554,000"
    }
)

# Load alignment data
b.load_track(
    {
        "name": "HG02723 PacBio HiFi",
        "url":  "merged_mapped_chr4.bam",
        "indexURL": "merged_mapped_chr4.bam.bai",
        "format": "bam",
        "type": "alignment",
        "visibilityWindow": 100000
    }
)


b.zoom_in()

@jrobinso
Copy link
Contributor

jrobinso commented Jan 8, 2024

Also, BTW, please leave this issue open as the lack of an error message needs to be addressed.

@snakesch
Copy link
Author

snakesch commented Jan 8, 2024

Dear Jim,

Thank you for your kind help on the issue.

The issue was in fact related to paths. I might have mistaken "the directory relative to the notebook directory" for "the directory where the notebook server is hosted", and hence the issue. I also noticed that absolute paths do not appear to be supported on JupyterLab. Could you let me know if that's expected?

I will keep the issue open as suggested.

Thanks,
Louis

@jrobinso
Copy link
Contributor

jrobinso commented Jan 8, 2024

Absolute paths are not supported on JupyterLab. I find the best method for JupyterLab is to find the file in the Jupyter file widget, then right-click and use "copy path".

@snakesch
Copy link
Author

snakesch commented Jan 8, 2024

Thank you very much for the advice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants