Banner proteomics metadata discrepancies #14

avanlinden · 2021-10-26T18:53:53Z

@jgockley62 identified inconsistencies in CERAD scores for individuals from the Banner cohort between the original Banner LFQ proteomics traits file and the new Consensus project TMT proteomics traits for the same samples.

The original Banner study needs updated metadata files that meet our current metadata standards. The original Banner case IDs (individualIDs) have been corrupted and lost from the existing traits file and can be taken from the consensus project traits file. The CERAD discrepancies are due to a change in how CERAD was evaluated and a note should be added to both study descriptions.

create standardized metadata files for original Banner LFQ data
re-map Banner caseIDs using consensus file + Tom Beach's data sheet forwarded by Eric Dammer
update CERAD information
document changes to CERAD metadata in Banner study
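
A minimal sketch of the re-mapping step, assuming the mapping sheet can be exported to CSV with one row per LFQ file ID. The column names (`bannerID`, `fileID`, `CERAD_Mirra1991`, `CERAD_Beach2019`), the output field names, and all sample values below are hypothetical placeholders, not the contents of the real files.

```python
import csv
import io

# Hypothetical stand-ins for the real files; every value here is made up.
LFQ_TRAITS = """fileID,CERAD
b1_001_01,1
b1_002_02,3
"""
ID_MAP = """bannerID,fileID,CERAD_Mirra1991,CERAD_Beach2019
00-01,b1_001_01,1,2
00-02,b1_002_02,3,3
"""


def remap(lfq_csv, map_csv):
    """Attach Banner IDs and both CERAD versions to each LFQ file ID."""
    mapping = {row["fileID"]: row for row in csv.DictReader(io.StringIO(map_csv))}
    rows, unmatched = [], []
    for row in csv.DictReader(io.StringIO(lfq_csv)):
        match = mapping.get(row["fileID"])
        if match is None:
            # Keep track of file IDs the mapping sheet does not cover.
            unmatched.append(row["fileID"])
            continue
        rows.append({
            "individualID": match["bannerID"],
            "specimenID": row["fileID"],
            # Both score versions are kept side by side.
            "CERAD_Mirra1991": match["CERAD_Mirra1991"],
            "CERAD_Beach2019": match["CERAD_Beach2019"],
        })
    return rows, unmatched


rows, unmatched = remap(LFQ_TRAITS, ID_MAP)
for r in rows:
    print(r)
print("unmatched file IDs:", unmatched)
```

IDs are read as strings throughout so that dash-separated values are never reinterpreted as dates or numbers.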

avanlinden · 2021-10-26T18:55:10Z

Jake's original email to Eric Dammer:

Hey Eric,

I was digging around the Banner LFQ/TMT samples and I ran into a bit of a conundrum.
The CERAD scores for individuals are quite different between the former LFQ samples versus the new TMT samples.

The LFQ meta-data synID we have is syn9740295
And I'm using the new TMT from the consensus paper located here: syn25006658

I matched the samples from individuals with TMT and LFQ and noticed some discrepancies:

LFQ - syn9740295
table(comp_trial$CERAD)
 -1   0   1   2   3
 36   4  37   5  78

TMT - syn25006658
table(temp_trial$CERAD)
  0   1   2   3
 23  15  25  97

And compared also has some discordance beyond a simple adjustment
table( comp_trial$CERAD, temp_trial$CERAD)
      0   1   2   3
-1   22  13   0   1
 0    1   1   2   0
 1    0   1  17  19
 2    0   0   5   0
 3    0   0   1  77

Not too sure where the discordance comes from but I thought I'd try and track it down. I cc'd Mette and Abby on our DCC team as they have more info on the LFQ data side.

Best,
Jake
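
The counts in Jake's email can be checked for internal consistency. The sketch below re-enters the three tables by hand, assuming the cross-tabulation has LFQ scores as rows and TMT scores as columns (the order of the arguments in the `table()` call), and confirms that its row and column sums reproduce the two single-file tables. The variable names are illustrative.

```python
# Counts copied from the email; rows are LFQ CERAD, columns are TMT CERAD.
tmt_levels = [0, 1, 2, 3]
crosstab = {
    -1: [22, 13, 0, 1],
    0: [1, 1, 2, 0],
    1: [0, 1, 17, 19],
    2: [0, 0, 5, 0],
    3: [0, 0, 1, 77],
}
lfq_marginal = {-1: 36, 0: 4, 1: 37, 2: 5, 3: 78}
tmt_marginal = {0: 23, 1: 15, 2: 25, 3: 97}

# Row sums should reproduce the LFQ table, column sums the TMT table.
row_sums = {level: sum(counts) for level, counts in crosstab.items()}
col_sums = {
    level: sum(counts[i] for counts in crosstab.values())
    for i, level in enumerate(tmt_levels)
}

total = sum(row_sums.values())
# Samples whose score is identical in both files (the diagonal 0-0 ... 3-3).
concordant = sum(crosstab[level][i] for i, level in enumerate(tmt_levels))

print("row sums match LFQ table:", row_sums == lfq_marginal)
print("column sums match TMT table:", col_sums == tmt_marginal)
print("matched samples:", total, "identical scores:", concordant)
```

Under that reading the three tables agree with one another, and 84 of the 160 matched samples carry the same score in both files.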

avanlinden · 2021-10-26T18:57:38Z

Eric's reply:

This overlooked CERAD discrepancy is troubling and should be addressed without compromising reproducibility of the published LFQ consensus analyses. See explanation in the email just forwarded to you and Mette (cc: Jim and Erik). I recommend keeping both the Mirra 1991 based score and adding in the updated plaque density-based CERAD, independent of cognition in the traits for the LFQ 201 Banner cases.

I cannot see any Banner case IDs in the LFQ traits on the SynID you provided, but do see the 201 cases with their original batch_runNumber file ID. The Banner IDs which were 2 numbers separated by a dash likely corrupted into date formatted cells by excel and then discarded, had to be remapped to the file IDs so that the CERAD differences for the same Banner IDs are clear. Please rely on the censored traits for the same 201 case samples in the Nature Neurosci TMT Banner traits attached here, based on Tom Beach's February 2019 update of CERAD from the prior Mirra 1991 criteria-based scores. The green tab has the map of Banner ID to LFQ fileID to TMT batch.channel, along with both CERAD score versions (Mirra 1991 and Beach 2019).

Sincerely,

Eric

The files Eric attached contain some potential PHI, so I uploaded them to the Staging folder of the original Banner study here: https://www.synapse.org/#!Synapse:syn26403225.
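
As a rough illustration of the ID corruption Eric describes, the check below flags values that no longer look like "2 numbers separated by a dash" and instead resemble a spreadsheet-formatted date. Both regular expressions are assumptions: the real Banner ID format may be stricter, and the date strings a spreadsheet produces depend on locale and cell format. The function name and the sample values are hypothetical.

```python
import re

# Assumed ID shape, taken from the description "2 numbers separated by a dash".
BANNER_ID = re.compile(r"^\d+-\d+$")

# A few common date renderings; this list is illustrative, not exhaustive.
DATE_LIKE = re.compile(
    r"^(\d{1,2}-[A-Za-z]{3}(-\d{2,4})?"   # 12-Mar, 12-Mar-05
    r"|[A-Za-z]{3}-\d{2,4}"               # Mar-05
    r"|\d{1,4}[/.]\d{1,2}[/.]\d{1,4})$"   # 3/12/2005, 2005.03.12
)


def classify_id(value):
    """Return 'ok', 'date_like', 'missing', or 'other' for one ID cell."""
    value = value.strip()
    if not value:
        return "missing"
    if BANNER_ID.match(value):
        return "ok"
    if DATE_LIKE.match(value):
        return "date_like"
    return "other"


# Made-up example cells, not real Banner IDs.
for cell in ["03-12", "12-Mar", "3/12/2005", "", "b4_134_04"]:
    print(repr(cell), "->", classify_id(cell))
```

A check like this only detects the damage; a cell already converted to a date cannot be reliably turned back into the original ID, which is why the IDs need to come from the mapping file.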

avanlinden · 2021-10-26T18:58:19Z

Jake identified three missing sample IDs from Eric's attached files that are not in the original LFQ metadata: b4_134_04, b4_007_23, and b3_041_03.

Eric responded:

It looks like 9 case samples per TMT batch x 22 banner TMT batches = 198, which is short those 3 cases from the 201 originally purchased, received, and run for LFQ proteomics dating back to 2014.

Tom Beach's sheet in response to Erik's questions in the forwarded email should have the 3 corrected/updated CERAD scores, however.
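
Eric's arithmetic, plus a generic way to list which IDs are present in one file but absent from another, can be sketched as follows. The helper name `missing_ids` and the placeholder ID `b1_001_01` are hypothetical; the three other IDs are the ones named above, used here only to exercise the function.

```python
CASES_PER_TMT_BATCH = 9
TMT_BATCHES = 22
LFQ_CASES = 201

tmt_cases = CASES_PER_TMT_BATCH * TMT_BATCHES
shortfall = LFQ_CASES - tmt_cases
print("TMT cases:", tmt_cases, "shortfall vs LFQ:", shortfall)


def missing_ids(reference_ids, other_ids):
    """IDs that appear in reference_ids but not in other_ids, sorted."""
    return sorted(set(reference_ids) - set(other_ids))


# Illustrative input only: one placeholder ID plus the three IDs in question.
reference = ["b1_001_01", "b4_134_04", "b4_007_23", "b3_041_03"]
other = ["b1_001_01"]
print(missing_ids(reference, other))
```

Which file plays the role of `reference_ids` depends on the direction of the comparison being made.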

avanlinden · 2021-10-26T19:59:55Z

Further information from Eric on the CERAD score changes:

Jake,

I confirm the discrepancy in CERAD 0-3 (previously 0, A, B, or C and corresponding literal key) for a number of the same 201 case samples from Banner Sun Health between the LFQ and the TMT traits for prefrontal cortex proteomics. I think the explanation you need dates back to the below February 2019 email from Tom Beach at Banner in response to our request to guarantee accuracy of the scale, and adaptive renumbering he performed at that time, and that we later used for the TMT, but did not correct/update in the LFQ traits. See below.

In a direct reply to your RFI, I will attach the full trait comparison with Banner IDs mapped to both LFQ and TMT batch runNumber/channel and corresponding CERAD. The discrepancies should make sense given the below logic.

Sorry we did not go back and amend the traits for the LFQ at the time.

May I suggest Mette, and the clinician scientists (Jim and Erik, cc:) confer on how best to address the LFQ traits? For reproducibility, the 1991 scoring used for correlations with the LFQ data should probably be retained, but displayed alongside the updated CERAD scores consistent with quantitative plaque density.

Sincerely,

Eric

The files he attached (PDF explaining CERAD scores and mapping file) are in the Staging folder: https://www.synapse.org/#!Synapse:syn26403225.

avanlinden · 2021-10-26T20:04:40Z

I saved the forwarded 2019 email thread from Erik Johnson and Thomas Beach explaining the CERAD changes as a PDF (it is too long to paste here) and uploaded it here: https://www.synapse.org/#!Synapse:syn26403241.

avanlinden assigned avanlinden and amapeters Oct 26, 2021

avanlinden added the "curation" label (issue related to curation or cleaning of AD portal data) Nov 10, 2021
