Disease transcript does not match HGVS description #4424

Leif-glitch · 2021-11-10T09:32:05Z

Leif-glitch
Nov 10, 2021

Just looked at a variant in Scout and discovered a potential bug. Below are the disease-associated transcript for FOXI1 and the RefSeq transcript. According to these, the variant is a deep intronic substitution in the disease-associated transcript.

However, in the list of transcripts below it is a missense in the same transcript, and looking in UCSC and other dbs this is the correct version, not the above, which is a bit worrying.

The variant is on https://scout.scilifelab.se/cust003/21285/d0b7fc56c3626d80d1bbdafe1a178c2c

dnil · 2021-11-10T09:45:39Z

dnil
Nov 10, 2021
Maintainer

Right, by the looks of it, it is the same as usual? 😊 This is really a question for the MIP team, but briefly, the consequence annotation is done using VEP, primarily with ENSEMBL transcripts. The old mapping between RefSeq and ENSEMBL for hg19 isn't super good and often maps multiple RefSeq transcripts onto the same ENSEMBL one. This has largely been resolved by them for hg38 with the MANE annotations. Switching to hg38 will auto-solve most of these, and enable us to complain and correct the mappings for the remainder on the latest version.

Luckily it is a very common variant, and one that has ClinVar annotation to support your decision. But it is great that you are attentive - that vigilance will be needed at least until hg38 for this particular issue.

dnil · 2021-11-10T09:53:45Z

dnil
Nov 10, 2021
Maintainer

Ahem, maybe I was a little hasty placing this with the old ENSEMBL-RefSeq mapping.

In this particular case, you could actually also discuss this with the HGNC curators:

They are transitioning to MANE as the primary transcripts, and you can see that there they actually have your preferred transcript marked as MANE, but not as Primary; that label is held by the other one, which better matches the ENSEMBL transcript highlighted.

If it would be helpful, maybe we could actually do one thing here. There are not that many gene panels with annotated Disease associated transcripts, but when they exist, we could perhaps highlight them for you in the summary? They are kind of close, so I guess you might see it anyway. What do you think? Would it help or just be confusing with one more marking on that little table?

dnil · 2021-11-10T10:17:11Z

dnil
Nov 10, 2021
Maintainer

And on a side note, this perhaps also illustrates why it is difficult to maintain a local list of disease-causing transcripts rather than pooling resources with other labs: note that it is time to consider updating the transcript from NM_012188.4 to NM_012188.5, unless the new version does not conform to the experimental evidence? 😊
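
A check like the one hinted at above (is a panel transcript an older version of a newer RefSeq record?) could be sketched roughly as follows. This is only an illustration: `parse_accession` and `is_outdated` are hypothetical helpers, not Scout functions, and where the "current" accession would be looked up is left open.

```python
from typing import NamedTuple, Optional


class RefSeqAccession(NamedTuple):
    base: str                # e.g. "NM_012188"
    version: Optional[int]   # e.g. 4, or None when no ".N" suffix is given


def parse_accession(accession: str) -> RefSeqAccession:
    """Split an accession such as 'NM_012188.4' into base and version."""
    base, _, version = accession.strip().partition(".")
    return RefSeqAccession(base, int(version) if version else None)


def is_outdated(panel_accession: str, current_accession: str) -> bool:
    """True when both name the same transcript and the panel version is older.

    Unversioned accessions are never reported as outdated, since there is
    nothing to compare.
    """
    panel = parse_accession(panel_accession)
    current = parse_accession(current_accession)
    if panel.version is None or current.version is None:
        return False
    return panel.base == current.base and panel.version < current.version


# The pair mentioned in this comment; hypothetical usage.
print(is_outdated("NM_012188.4", "NM_012188.5"))
```

A newer version number only says that the record changed, not that the change matters for a given variant, so a flag like this would be a prompt for manual review rather than an automatic update.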

Leif-glitch · 2021-11-11T09:19:58Z

Leif-glitch
Nov 11, 2021
Author

Thanks for the answer! I was not aware of the issue of potential mismatching between RefSeq and ENSEMBL. It makes more sense now.

Here is an example from the same case -> https://scout.scilifelab.se/cust003/21278/08e5e659b2cfde6675aae922801e51a2
In this case it is a bit more worrying: if I didn't look up the complete list of transcripts at the bottom of the variant page and do some extra investigation, I would actually think that it is the p.Leu1431Phe substitution. This is not true; it is actually p.Leu1430Phe, an already published variant, which changes the interpretation somewhat.

I do not think highlighting them in the summary will help though. As you say, it will probably be more confusing. Another solution could be to take the information for the disease-associated transcript from the transcript table at the bottom of the variant page and add it to its own table, see below. What do you think?

dnil · 2021-11-11T12:07:45Z

dnil
Nov 11, 2021
Maintainer

I do like that in general, but unfortunately it would not work in this case. Maybe we could (also?) add a warning to it, noting that we don't have annotations for one or more of the "disease associated transcripts".

In a better world all would perhaps instead use the MANE Select version; NM_000352.6. It seems to have been out for quite some time, so perhaps it is simply not mapping well to hg19 - the release comment says the new version was created "from transcript and genomic sequence data to make the sequence consistent with the reference genome assembly" (https://www.ncbi.nlm.nih.gov/nuccore/NM_000352).

But as it stands, if we would show data from the transcript tab on Disease associated transcripts, we would not be able to pick it exactly here, since that one is (manually or historically?) given as NM_000352.4. It is not quite simple. Most of the time, it will be fine, but at some level one perhaps has to get to know one's genes. This appears to be a tricky one with several active isoforms, and a couple of them of exactly the same size though slightly different AA sequence. I'll trust you if you say NM_000352.5:p.Leu1430Phe is published, but one has to be very careful to check that the publication coordinates were not for NM_000352.4, NM_000352.6 or NM_001287174.2.

My best current advice would actually be that we focus attention on switching to hg38, and revisit this and similar discussions afterward, since at least some of the interlocking issues will be ironed out by that. Sound ok?

Leif-glitch · 2021-11-11T14:38:50Z

Leif-glitch
Nov 11, 2021
Author

Sounds good, just one last question. Could it be easier to do as suggested if the version of the transcript is removed from the disease transcripts for our panels (e.g. NM_000352 instead of NM_000352.5)? We should always go through the publications anyway to ensure we are not tricked by older transcript versions when we find these variants :)

Thanks for the answers and clarification! longing for hg38 here
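
For what it is worth, the version-less matching asked about above, combined with the warning idea from the previous comment, might look something like this sketch. Everything here is an assumption for illustration: the function names are made up, the two lists are hypothetical inputs, and this is not how Scout currently handles disease associated transcripts.

```python
from typing import Dict, Iterable, List


def strip_version(accession: str) -> str:
    """'NM_000352.5' -> 'NM_000352'; unversioned accessions pass through."""
    return accession.split(".", 1)[0]


def match_disease_transcripts(
    panel_transcripts: Iterable[str],
    annotated_transcripts: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each panel transcript to annotated accessions with the same base.

    An empty list means no annotation was found at all, which is the case
    where a warning would be shown.
    """
    by_base: Dict[str, List[str]] = {}
    for accession in annotated_transcripts:
        by_base.setdefault(strip_version(accession), []).append(accession)
    return {
        panel_id: by_base.get(strip_version(panel_id), [])
        for panel_id in panel_transcripts
    }


# Hypothetical inputs: a panel entry with an older version, and the
# accessions that happen to be annotated on the variant.
matches = match_disease_transcripts(
    ["NM_000352.4"], ["NM_000352.5", "NM_001287174.2"]
)
for panel_id, found in matches.items():
    if not found:
        print(f"warning: no annotation for disease associated transcript {panel_id}")
    elif panel_id not in found:
        print(f"note: {panel_id} matched on accession only: {', '.join(found)}")
```

Matching on the base accession alone finds the transcript even when versions differ, but it also hides the difference, so the sketch keeps a note for the version mismatch; whether the residue numbering agrees between versions would still need to be checked against the publication.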
