Historical GRCh38 refseq #51
Looks like this might be worth 40k new transcripts (it has 97k other than latest, but we already had 57k of those historical ones). Can't just get rid of the old GFFs, as a few transcripts are still only in them. I might put it at the very front, so that the ones in there are only used if nothing else is available.
Then after moving historical to after UTA
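For anyone following along, the ordering idea above can be sketched roughly like this. It is only an illustration, assuming sources are merged in order and later sources overwrite earlier ones; `merge_sources`, the source names, and the accessions are hypothetical placeholders, not the project's actual code or data.

```python
def merge_sources(sources):
    """Merge (name, {accession: data}) pairs in order.

    Assumption: a later source overwrites an earlier one, so a source
    placed first is only used when nothing else has that accession.
    """
    merged = {}
    for name, transcripts in sources:
        for accession, data in transcripts.items():
            # Record which source the surviving entry came from.
            merged[accession] = {"source": name, **data}
    return merged


# Hypothetical placeholder accessions, not real transcripts.
sources = [
    ("historical", {"TX_A.1": {"exons": 3}, "TX_B.1": {"exons": 5}}),
    ("latest", {"TX_B.1": {"exons": 5}}),
]
merged = merge_sources(sources)
```

Here `TX_A.1` survives from the historical source because nothing later provides it, while `TX_B.1` comes from latest.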
Looks like the exons were being read twice in the historical ones. Comparing vs latest:
This was due to me copy/pasting the GFF3 in refseq_transcripts_grch38.sh
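A quick way to spot this kind of double read is to look for repeated exon coordinates within a transcript. This is a minimal sketch, assuming exons are available as `(start, end)` tuples; the helper names and the example coordinates are made up for illustration.

```python
from collections import Counter


def duplicated_exons(exons):
    """Return the sorted (start, end) pairs that appear more than once."""
    counts = Counter(exons)
    return sorted(exon for exon, n in counts.items() if n > 1)


def dedupe_exons(exons):
    """Drop repeated exons while keeping first-seen order."""
    return list(dict.fromkeys(exons))


# Made-up coordinates: the same two exons read twice.
exons = [(0, 100), (200, 300), (0, 100), (200, 300)]
repeats = duplicated_exons(exons)
cleaned = dedupe_exons(exons)
```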
Around 80% are valid via the VG hgvs_ok code. Of the invalid ones, about 80% are something failing, e.g. NM_000016.3 has no alignment gaps, and its exon (end - start) lengths add up to 2423 while the sequence length is 2454. Need to compare vs latest to work out what's happening.
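The length comparison described above can be sketched as below. This is not the VG hgvs_ok code, just an illustration of the arithmetic, assuming half-open `(start, end)` exon coordinates and no alignment gaps; `exon_length_mismatch` is a hypothetical helper, and the single exon is a stand-in that only reproduces the 2423 total, not the real exon structure of NM_000016.3.

```python
def exon_length_mismatch(exons, sequence_length):
    """Return sequence_length minus the summed exon lengths.

    Assumes half-open (start, end) coordinates and no alignment gaps,
    so a non-zero result means the exons do not cover the sequence.
    """
    covered = sum(end - start for start, end in exons)
    return sequence_length - covered


# Stand-in exon list whose lengths sum to 2423 (the total from the comment).
missing = exon_length_mismatch([(0, 2423)], 2454)
print(missing)  # 31
```

So with the numbers from the comment, 31 bases of the sequence are not accounted for by the exons.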
https://ncbiinsights.ncbi.nlm.nih.gov/2023/06/29/access-to-historical-human-transcript-alignments/