diann2triqler generates incomplete input data #22

Open
tobiasko opened this issue Apr 17, 2023 · 2 comments

@tobiasko

Hi triqler developers,

First of all, thanks for adding an import function for DIA-NN main reports. I tried using it like this:

diann2triqler --file_list_file ~/Documents/anno.tsv --out_file ~/Documents/tmp/triqler_input.tsv ~/Downloads/2271000/WU287354/out-2023-03-28/diann-output.tsv
triqler.convert.diann version None
Copyright (c) 2018-2023 Matthew The, Patrick Truong. All rights reserved.
Written by:
- Matthew The ([email protected])
- Patrick Truong ([email protected])
in the School of Engineering Sciences in Chemistry, Biotechnology and Health
at the Royal Institute of Technology in Stockholm.
Issued command: diann.py --file_list_file /Users/tobiasko/Documents/anno.tsv --out_file /Users/tobiasko/Documents/tmp/triqler_input.tsv /Users/tobiasko/Downloads/2271000/WU287354/out-2023-03-28/diann-output.tsv

but the output is incomplete (the run and condition columns are empty):

head /Users/tobiasko/Documents/tmp/triqler_input.tsv
run	condition	charge	searchScore	intensity	peptide	proteins
		3	8.277991268693333	16392100.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108
		3	5.870511430361334	46752400.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108
		3	5.619801685261617	47872800.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108
		3	5.660269065244499	68029000.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108
		3	8.409200143913043	68997300.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108
		4	12.102047821511396	1022760.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108
		4	10.493966237675412	1983760.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108
		4	8.790321682860508	1721600.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108
		4	9.735068860911166	2366670.0	AAAAAAAAAPAAAATAPTTAATTAATAAQ	P37108

My annotation file looks like this:

head ~/Documents/anno.tsv
20230307_002_S468675_negControl_Control.mzML	a
20230307_003_S468676_Probe1_Group_1.mzML	b
20230307_004_S468677_Probe2_Group_2.mzML	c
20230307_005_S478625_Probe3_Group_3.mzML	d
20230307_006_S478626_Probe4_Group_4.mzML	e

The DIA-NN main report looks like this:

head /Users/tobiasko/Downloads/2271000/WU287354/out-2023-03-28/diann-output.tsv
File.Name	Run	Protein.Group	Protein.Ids	Protein.Names	Genes	PG.Quantity	PG.Normalised	PG.MaxLFQ	Genes.Quantity	Genes.Normalised	Genes.MaxLFQ	Genes.MaxLFQ.Unique	Modified.Sequence	Stripped.Sequence	Precursor.Id	Precursor.Charge	Q.Value	PEP	Global.Q.Value	Protein.Q.Value	PG.Q.Value	Global.PG.Q.Value	GG.Q.Value	Translated.Q.Value	Proteotypic	Precursor.Quantity	Precursor.Normalised	Precursor.Translated	Translated.Quality	Ms1.Translated	Quantity.Quality	RT	RT.Start	RT.Stop	iRT	Predicted.RT	Predicted.iRT	First.Protein.Description	Lib.Q.Value	Lib.PG.Q.Value	Ms1.Profile.Corr	Ms1.Area	Evidence	Spectrum.Similarity	Averagine	Mass.Evidence	CScore	Decoy.Evidence	Decoy.CScore	Fragment.Quant.Raw	Fragment.Quant.Corrected	Fragment.Correlations	MS2.Scan	IM	iIM	Predicted.IM	Predicted.iIM
/scratch/DIANN_A314/WU287354/20230307_002_S468675_negControl_Control.mzML	20230307_002_S468675_negControl_Control	P37108	P37108	SRP14_HUMAN	SRP14	1.63921e+07	1.10164e+07	1.44463e+07	1.63921e+07	1.10164e+07	1.44463e+07	1.44463e+07	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ3	3	0.000254047	0.00751629	0.0010626	0.000270343	0.000267666	0.000234907	0.000268384	0	1	1.63921e+07	1.10164e+07	1.21446e+07		6.5771e+07	0.979711	32.8786	32.7155	33.044	31.1751	32.9558	30.9749	Signal recognition particle 14 kDa protein	0.0001731	0.000210526	0.996264	8.87744e+07	6.5037	0.844827	1	0	0.99422	1.46965	0.0158835	6.60174e+06;6.98001e+06;6.2017e+06;5.55096e+06;5.53627e+06;4.2541e+06;3.15814e+06;1.05764e+06;786330;1.02731e+06;603819;553297;	6.60174e+06;6.98001e+06;6.2017e+06;5.55096e+06;5.53627e+06;4.2541e+06;3.15814e+06;1.05764e+06;786330;1.02731e+06;603819;553297;	0.977612;0.977969;0.973274;0.972621;0.983241;0.978374;0.966172;0.969758;0.985712;0.942823;0.930083;0.959734;	20634	0	0	0	0
/scratch/DIANN_A314/WU287354/20230307_003_S468676_Probe1_Group_1.mzML	20230307_003_S468676_Probe1_Group_1	P37108	P37108	SRP14_HUMAN	SRP14	4.67524e+07	1.7241e+07	1.92563e+07	4.67524e+07	1.7241e+07	1.92563e+07	1.92563e+07	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ3	3	0.00282143	0.196143	0.0010626	0.000221533	0.000253485	0.0002349070.000254259	0	1	4.67524e+07	1.7241e+07	2.72102e+07		1.22967e+08	0.991776	33.5899	33.4294	33.7503	31.1751	33.577	31.1972	Signal recognition particle 14 kDa protein	0.0001731	0.000210526	0.998894	2.11282e+08	6.85246	0.827158	1	0.862182	1.38556	0.00829299	1.94406e+07;1.91949e+07;1.80756e+07;1.60994e+07;1.52083e+07;1.21035e+07;8.76118e+06;2.71786e+06;2.27849e+06;2.14646e+06;2.08095e+06;1.78959e+06;	1.94406e+07;1.91949e+07;1.80756e+07;1.60994e+07;1.52083e+07;1.21035e+07;8.76118e+06;2.71786e+06;2.27849e+06;2.14646e+06;2.08095e+06;1.78959e+06;	0.992206;0.991059;0.992022;0.991335;0.991873;0.990964;0.992621;0.990091;0.991055;0.994416;0.985636;0.993594;	21229	0	0	0
/scratch/DIANN_A314/WU287354/20230307_004_S468677_Probe2_Group_2.mzML	20230307_004_S468677_Probe2_Group_2	P37108	P37108	SRP14_HUMAN	SRP14	4.78728e+07	7.45569e+07	6.67804e+07	4.78728e+07	7.45569e+07	6.67804e+07	6.67804e+07	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ3	3	0.00362536	0.0622554	0.0010626	0.000282566	0.000365497	0.0002349070.000366838	0	1	4.78728e+07	7.45569e+07	4.651e+07		2.06901e+08	0.988583	33.3337	33.169	33.4967	31.1751	33.5277	30.592	Signal recognition particle 14 kDa protein	0.0001731	0.000210526	0.994478	2.12963e+08	6.76631	0.82038	1	0	.952574	1.95552	0.0009926	1.91125e+07;2.01564e+07;1.85404e+07;1.64623e+07;1.59419e+07;1.28183e+07;8.78569e+06;2.86711e+06;2.19047e+06;2.03757e+06;2.09137e+06;1.94675e+06;	1.91125e+07;2.01564e+07;1.85404e+07;1.64623e+07;1.59419e+07;1.28183e+07;8.78569e+06;2.86711e+06;2.19047e+06;2.03757e+06;2.09137e+06;1.94675e+06;	0.988607;0.987123;0.988697;0.987569;0.988235;0.98898;0.986544;0.986101;0.984394;0.979939;0.969332;0.992309;	20809	0	0	0	
/scratch/DIANN_A314/WU287354/20230307_005_S478625_Probe3_Group_3.mzML	20230307_005_S478625_Probe3_Group_3	P37108	P37108	SRP14_HUMAN	SRP14	6.8029e+07	9.99422e+07	9.59945e+07	6.8029e+07	9.99422e+07	9.59945e+07	9.59945e+07	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ3	3	0.00348158	0.124682	0.0010626	0.000315259	0.000332889	0.000234907	0.00033389	0	1	6.8029e+07	9.99422e+07	7.12208e+07		3.43828e+08	0.99249	32.7995	32.6347	32.9646	31.1751	32.9432	30.7871	Signal recognition particle 14 kDa protein	0.0001731	0.000210526	0.997556	3.28419e+08	6.85753	0.851211	1	0	0.916983	.12423	0.00610422	2.74179e+07;2.87289e+07;2.66591e+07;2.41888e+07;2.22551e+07;1.8356e+07;1.29656e+07;4.26316e+06;3.57073e+06;3.03517e+06;3.13019e+06;2.62147e+06;	2.74179e+07;2.87289e+07;2.66591e+07;2.41888e+07;2.22551e+07;1.8356e+07;1.29656e+07;4.26316e+06;3.57073e+06;3.03517e+06;3.13019e+06;2.62147e+06;	0.992969;0.990732;0.991277;0.990546;0.991302;0.993214;0.991907;0.98782;0.985711;0.99099;0.990468;0.992394;	20389	0	0	0	0
/scratch/DIANN_A314/WU287354/20230307_006_S478626_Probe4_Group_4.mzML	20230307_006_S478626_Probe4_Group_4	P37108	P37108	SRP14_HUMAN	SRP14	6.89973e+07	1.30697e+08	1.05691e+08	6.89973e+07	1.30697e+08	1.05691e+08	1.05691e+08	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ3	3	0.000222808	0.00345861	0.0010626	0.00030303	0.000333667	0.0002349070.000334896	0	1	6.89973e+07	1.30697e+08	9.56483e+07		4.63718e+08	0.988966	33.6628	33.498	33.8296	31.1751	33.7499	30.9343	Signal recognition particle 14 kDa protein	0.0001731	0.000210526	0.994202	3.3451e+08	6.81443	0.839224	1	0.997559	0.392247	0.00268637	2.78639e+07;2.82421e+07;2.69313e+07;2.40967e+07;2.26158e+07;1.85177e+07;1.30993e+07;4.03366e+06;3.48445e+06;3.13427e+06;3.10429e+06;2.52782e+06;	2.78639e+07;2.82421e+07;2.69313e+07;2.40967e+07;2.26158e+07;1.85177e+07;1.30993e+07;4.03366e+06;3.48445e+06;3.13427e+06;3.10429e+06;2.52782e+06;	0.988595;0.988238;0.987579;0.988575;0.989639;0.988703;0.988102;0.988732;0.986705;0.985203;0.981841;0.983654;	20949	0	0	0
/scratch/DIANN_A314/WU287354/20230307_002_S468675_negControl_Control.mzML	20230307_002_S468675_negControl_Control	P37108	P37108	SRP14_HUMAN	SRP14	1.63921e+07	1.10164e+07	1.44463e+07	1.63921e+07	1.10164e+07	1.44463e+07	1.44463e+07	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ4	4	5.54814e-06	5.54814e-06	3.12452e-05	0.000270343	0.000267666	0.000234907	0.000268384	0	1	1.02276e+06	690802	761544		921691	0.992176	32.8639	32.7007	33.0286	30.9212	32.8601	30.9357	Signal recognition particle 14 kDa protein	2.72478e-06	0.000210526	0.839792	1.23784e+06	6.22773	0.770542	1	0	.999996	1.46478	0.184098	404970;340150;277644;192080;213806;117049;144243;37453.2;44792.4;31578;11767.2;0;	404970;340150;277644;192080;213806;117049;144243;37453.2;44792.4;31578;11767.2;0;	0.988866;0.994455;0.994211;0.988567;0.96821;0.973227;0.992425;0.948632;0.934873;0.906882;0.530497;0;	20624	0	0	0	0
/scratch/DIANN_A314/WU287354/20230307_003_S468676_Probe1_Group_1.mzML	20230307_003_S468676_Probe1_Group_1	P37108	P37108	SRP14_HUMAN	SRP14	4.67524e+07	1.7241e+07	1.92563e+07	4.67524e+07	1.7241e+07	1.92563e+07	1.92563e+07	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ4	4	2.77031e-05	3.68629e-05	3.12452e-05	0.000221533	0.000253485	0.0002349070.000254259	0	1	1.98376e+06	731558	1.15456e+06		2.11654e+06	0.983917	33.5752	33.415	33.7353	30.9212	33.4836	31.1572	Signal recognition particle 14 kDa protein	2.72478e-06	0.000210526	0.961882	3.63662e+06	5.62774	0.775991	1	0	.99997	1.6614	0.134911	889733;601818;492209;148091;244832;230450;162182;71635;202073;55980.6;57073.7;38331.1;	889733;601818;492209;148091;244832;230450;162182;71635;202073;55980.6;57073.7;38331.1;	0.982358;0.996716;0.971085;0.950065;0.941739;0.98919;0.966018;0.921609;0.845562;0.939;0.932851;0.838247;	21219	0	0	0	0
/scratch/DIANN_A314/WU287354/20230307_004_S468677_Probe2_Group_2.mzML	20230307_004_S468677_Probe2_Group_2	P37108	P37108	SRP14_HUMAN	SRP14	4.78728e+07	7.45569e+07	6.67804e+07	4.78728e+07	7.45569e+07	6.67804e+07	6.67804e+07	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ4	4	0.000152199	0.000182927	3.12452e-05	0.000282566	0.000365497	0.0002349070.000366838	0	1	1.7216e+06	2.64235e+06	1.64835e+06		4.93663e+06	0.981658	33.3189	33.1535	33.4815	30.9212	33.4417	30.5483	Signal recognition particle 14 kDa protein	2.72478e-06	0.000210526	0.990952	5.15601e+06	6.22496	0.865859	1	0.999854	1.56904	0.158621	651915;548736;520951;347621;283782;196978;131609;57940.3;51600.4;60054.5;35806.7;16675.5;	651915;548736;520951;347621;283782;196978;131609;57940.3;51600.4;60054.5;35806.7;16675.5;	0.982183;0.988121;0.974193;0.980292;0.978389;0.953172;0.965087;0.978532;0.957752;0.974809;0.925948;0.645829;	20799	0	0	0	0
/scratch/DIANN_A314/WU287354/20230307_005_S478625_Probe3_Group_3.mzML	20230307_005_S478625_Probe3_Group_3	P37108	P37108	SRP14_HUMAN	SRP14	6.8029e+07	9.99422e+07	9.59945e+07	6.8029e+07	9.99422e+07	9.59945e+07	9.59945e+07	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ	AAAAAAAAAPAAAATAPTTAATTAATAAQ4	4	5.91716e-05	9.19448e-05	3.12452e-05	0.000315259	0.000332889	0.000234907	0.00033389	0	1	2.36667e+06	3.49189e+06	2.48839e+06		5.57617e+06	0.992525	32.8391	32.6741	33.0046	30.9212	32.8492	30.8939	Signal recognition particle 14 kDa protein	2.72478e-06	0.000210526	0.975219	5.30342e+06	6.6932	0.784224	1	0	.999933	1.21275	0.0746513	917976;767804;680893;569621;476183;377316;239178;85433.4;78361.1;71348.6;77283.3;63650.4;	917976;767804;680893;569621;476183;377316;239178;85433.4;78361.1;71348.6;77283.3;63650.4;	0.993232;0.991838;0.992345;0.976425;0.981981;0.976568;0.98792;0.976306;0.961929;0.962201;0.956322;0.937639;	20414	0	0	0	0

Any idea what is going wrong here?

@MatthewThe
Contributor

It's probably because there are still file extensions in the mapping file. Can you try removing the .mzML extensions?

Nevertheless, we should fix the converter to strip off the extensions or give a better error message.
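
A converter-side fix along those lines could look roughly like the sketch below. It is not triqler's actual code: the helper names `run_name` and `map_runs_to_conditions` and the `KNOWN_EXTENSIONS` list are hypothetical, and it assumes the file list is matched against the bare run names in the report's `Run` column.

```python
from pathlib import PurePosixPath

# Assumed list of raw-file extensions to strip; adjust as needed.
KNOWN_EXTENSIONS = (".mzml", ".raw", ".d", ".wiff", ".dia")


def run_name(path_or_name):
    """Drop the directory part and a known file extension, if present."""
    name = PurePosixPath(path_or_name.strip()).name
    lower = name.lower()
    for ext in KNOWN_EXTENSIONS:
        if lower.endswith(ext):
            return name[: -len(ext)]
    return name


def map_runs_to_conditions(report_runs, file_list_rows):
    """Map each run in the report to its condition.

    report_runs: iterable of run names taken from the report.
    file_list_rows: iterable of (file name, condition) pairs from the file list.
    Raises ValueError instead of silently leaving run/condition empty.
    """
    conditions = {run_name(f): cond for f, cond in file_list_rows}
    missing = sorted({r for r in report_runs if run_name(r) not in conditions})
    if missing:
        raise ValueError(
            "No entry in the file list for run(s): " + ", ".join(missing)
        )
    return {r: conditions[run_name(r)] for r in report_runs}
```

With this kind of normalization, a file list entry such as `20230307_002_S468675_negControl_Control.mzML` would match the run `20230307_002_S468675_negControl_Control`, and a genuinely missing run would produce an explicit error rather than empty columns.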

@tobiasko
Author

👍 Removing the .mzML extensions fixed the problem.
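
For anyone hitting the same problem with a longer file list, the extensions can also be removed with a few lines of Python instead of by hand. This is only a sketch: the function name `strip_extension_column` is made up for illustration, and it assumes a tab-separated file list with the file name in the first column, as in the annotation file shown earlier.

```python
import csv
import io


def strip_extension_column(src, dst, extension=".mzML"):
    """Copy a tab-separated file list, removing `extension` from column 1."""
    reader = csv.reader(src, delimiter="\t")
    writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
    for row in reader:
        if row and row[0].endswith(extension):
            row[0] = row[0][: -len(extension)]
        writer.writerow(row)


# In-memory demonstration; with real files, pass open file handles instead.
src = io.StringIO("20230307_002_S468675_negControl_Control.mzML\ta\n")
dst = io.StringIO()
strip_extension_column(src, dst)
print(dst.getvalue(), end="")
```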
