Upgrade to version 5.1.0 #135

Merged Nov 14, 2023 (1 commit)

@susannasiebert commented Oct 16, 2023

This version introduces a few bugfixes and changes:

The Docker container is now based on the Python 3.11 base image instead of Ubuntu. This might introduce some subtle differences, although I don't believe any changes are necessary in how we call the tools.

Because these changes are somewhat extensive, I suggest running a test of immuno.wdl before merging.

@malachig

I performed a full test of immuno.wdl with this PR applied. The run produced the following example of the impact of the VAtools fix.

A variant was called in the gene LILRB2, and the transcript selected in pVACview (aggregate report) was ENST00000391749.4 (ENSG00000131042).

The kallisto gene_abundance.tsv shows:

gene_name	gene	abundance	counts	length
LILRB2	ENSG00000131042	1.22981410950827	37.4431712278154	1519.92061351928
LILRB2	ENSG00000274513	0.523648071417863	3.67733451548483	350.57535471207
LILRB2	ENSG00000275463	0.229592362639494	4.14209850196773	900.638976403266
LILRB2	ENSG00000276146	2.85862722432629	94.9987124328477	1659.00595957922
LILRB2	ENSG00000277751	0.170873793229455	1.42026247930074	414.935799292377

Before the VAtools fix, the gene expression value annotated into the VCF was 5.013, which is the sum of the abundances of all five gene IDs sharing the ambiguous name "LILRB2" rather than just the correct one. After the fix, the gene expression value was 1.230, which matches the correct gene and transcript.
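To make the arithmetic concrete, here is a minimal Python sketch that reproduces both numbers from the rows above. It is not the VAtools implementation; the function names `expression_by_name` and `expression_by_id` are hypothetical and only illustrate the difference between keying on the gene name and keying on the gene ID.

```python
import csv
import io
from collections import defaultdict

# Rows copied from the kallisto gene_abundance.tsv excerpt above.
GENE_ABUNDANCE_TSV = (
    "gene_name\tgene\tabundance\tcounts\tlength\n"
    "LILRB2\tENSG00000131042\t1.22981410950827\t37.4431712278154\t1519.92061351928\n"
    "LILRB2\tENSG00000274513\t0.523648071417863\t3.67733451548483\t350.57535471207\n"
    "LILRB2\tENSG00000275463\t0.229592362639494\t4.14209850196773\t900.638976403266\n"
    "LILRB2\tENSG00000276146\t2.85862722432629\t94.9987124328477\t1659.00595957922\n"
    "LILRB2\tENSG00000277751\t0.170873793229455\t1.42026247930074\t414.935799292377\n"
)


def load_rows(text):
    """Parse the tab-separated table into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def expression_by_name(rows):
    """Hypothetical 'before' behaviour: sum abundance over rows sharing a gene name."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["gene_name"]] += float(row["abundance"])
    return dict(totals)


def expression_by_id(rows):
    """Hypothetical 'after' behaviour: one abundance value per gene ID."""
    return {row["gene"]: float(row["abundance"]) for row in rows}


rows = load_rows(GENE_ABUNDANCE_TSV)
by_name = expression_by_name(rows)
by_id = expression_by_id(rows)

print(f"{by_name['LILRB2']:.3f}")         # 5.013
print(f"{by_id['ENSG00000131042']:.3f}")  # 1.230
```

Keying on the gene ID avoids the collision because the ID is unique per row, whereas the name "LILRB2" appears five times in this table.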

@malachig

Additional examples followed the same pattern. We should now also see gene expression values for genes that did not have an ID at all.

@malachig

I also reviewed the parsing of the ADF and ADR (for DNA) and RADF and RADR (for RNA) annotations for forward and reverse strand depth.

VCF record before and after applying this PR:

chr22	49885855	.	G	A	GT:AD:AF:DP:F1R2:F2R1:FAD:SB:MQ0:MQ0FRAC:RDP:RAF:RAD:GX:TX	0/0:103,0:0.0:103:41,0:40,0:83,0:58,45,0,0:.:.:.:.:.:.:.
chr22	49885855	.	G	A	GT:AD:AF:DP:F1R2:F2R1:FAD:SB:ADF:ADR:MQ0:MQ0FRAC:RDP:RAF:RAD:RADF:RADR:GX:TX	0/0:103,0:0.0:103:41,0:40,0:83,0:58,45,0,0:57,0:46,0:.:.:.:.:.:.:.:.:.

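The new annotations are present: ADF and ADR are populated (57,0 and 46,0), while RADF and RADR appear in FORMAT with missing values (.) for this sample.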
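A quick way to check this kind of before/after difference is to pair each FORMAT key with its sample value and compare the two records. The sketch below assumes the abbreviated records exactly as pasted above, where FORMAT and the sample are the last two tab-separated columns; the helper names `format_fields` and `depths` are hypothetical. It also checks that, for this record, the forward and reverse depths add up to AD.

```python
# Abbreviated records copied from the comment above; only the FORMAT and
# sample columns (the last two tab-separated fields) are used here.
BEFORE = (
    "chr22\t49885855\t.\tG\tA\t"
    "GT:AD:AF:DP:F1R2:F2R1:FAD:SB:MQ0:MQ0FRAC:RDP:RAF:RAD:GX:TX\t"
    "0/0:103,0:0.0:103:41,0:40,0:83,0:58,45,0,0:.:.:.:.:.:.:."
)
AFTER = (
    "chr22\t49885855\t.\tG\tA\t"
    "GT:AD:AF:DP:F1R2:F2R1:FAD:SB:ADF:ADR:MQ0:MQ0FRAC:RDP:RAF:RAD:RADF:RADR:GX:TX\t"
    "0/0:103,0:0.0:103:41,0:40,0:83,0:58,45,0,0:57,0:46,0:.:.:.:.:.:.:.:.:."
)


def format_fields(record):
    """Pair each FORMAT key with the sample value in the same position."""
    keys, values = record.rstrip("\n").split("\t")[-2:]
    keys = keys.split(":")
    values = values.split(":")
    if len(keys) != len(values):
        raise ValueError("FORMAT and sample columns differ in length")
    return dict(zip(keys, values))


def depths(value):
    """Turn a comma-separated per-allele depth string into a list of ints."""
    return [int(x) for x in value.split(",")]


before = format_fields(BEFORE)
after = format_fields(AFTER)

# Keys that exist only in the record produced with this PR applied.
added = [key for key in after if key not in before]
print(added)  # ['ADF', 'ADR', 'RADF', 'RADR']

# Forward plus reverse depth per allele, compared against AD.
strand_sum = [f + r for f, r in zip(depths(after["ADF"]), depths(after["ADR"]))]
print(strand_sum == depths(after["AD"]))  # True
```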

@malachig

This PR appears to be behaving as expected.

@malachig merged commit 80bf6f6 into wustl-oncology:main on Nov 14, 2023
1 check passed