Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Medline Record Drop - MeSH --> RxNorm #87

Open
ericaVoss opened this issue May 12, 2016 · 6 comments
Open

Medline Record Drop - MeSH --> RxNorm #87

ericaVoss opened this issue May 12, 2016 · 6 comments
Assignees

Comments

@ericaVoss
Copy link
Contributor

Related to #85.

One place #85 exists is in the 20160311 Vocabulary. Based on how some of the MeSH tagging has changed in PubMed we are missing some of the MeSH --> RxNorm maps.

I have an updated copy of the MeSH-->RxNorm map that will be coming out in the next release of the OMOP Vocabulary and I compared it to the 20160311 Vocab version. When trying to map to descriptorname_uis in the "adverse effects" portion of our Medline SQL, we were able to map the following:

image**

Here are the 64 missing ingredients we would get from the new mapping, definitely a few good ones in here:
abatacept,adalimumab,adapalene,aripiprazole,bevacizumab,bimatoprost,bortezomib,canagliflozin,capecitabine,celecoxib,certolizumab pegol,cetuximab,cobicistat,dabigatran,darbepoetin alfa,darunavir,dasatinib,denosumab,diflubenzuron,dimethyl fumarate,Dutasteride,emtricitabine,Epoetin Alfa,Eszopiclone,Etanercept,everolimus,ezetimibe,febuxostat,Filgrastim,fluticasone,Gliotoxin,infliximab,insulin detemir,Insulin Glargine,Interferon beta-1a,interferon beta-1b,Linagliptin,linezolid,liraglutide,loteprednol etabonate,lubiprostone,Malondialdehyde,natalizumab,nebivolol,olive oil,omalizumab,pemetrexed,pregabalin,ranibizumab,ranolazine,Rilpivirine,rituximab,rivaroxaban,rivastigmine,sevelamer,simeprevir,sofosbuvir,tadalafil,Tenofovir,trastuzumab,travoprost,ustekinumab,valsartan,varenicline

Not as good of a pick up as I would like, but I don't think we'd get all descriptorname_uis, some of them are completely random. Here are a few of my favorites:
Acacia,Acids,X-Rays,Wine,Walkers,Video Games,Shoes,Riot Control Agents, Chemical,Pacifiers

For this version of LAERTES I see two solutions to this:

  1. Rerun everything on the new Vocabulary coming out
  2. Just creating a patch for this version of LAERTES

** I'm running this off my local copy of MEDLINE so the numbers might not be identical to another copy.

@ericaVoss
Copy link
Contributor Author

Additionally, of the 64 missing drugs, 46 were in our original DRUG_UNIVERSE, that is 5% of the original ingredients.

@ericaVoss
Copy link
Contributor Author

Reviewed and the 64 ingredients do give us a bump, but not a huge bump. Here are about the number of records we could pick up based on the old copy of LAERTES:

image

WITH CTE_MISSING_ROWS AS (
    SELECT e.*, 
        c.CONCEPT_ID AS INGREDIENT_CONCEPT_ID, c.CONCEPT_NAME AS INGREDIENT_CONCEPT_NAME
    FROM DRUG_HOI_EVIDENCE e
        JOIN DRUG_HOI_RELATIONSHIP r
            ON r.ID = e.DRUG_HOI_RELATIONSHIP
            /*ROLL DOWN TO INGREDIENTS*/
        JOIN CONCEPT_ANCESTOR ca
            ON ca.DESCENDANT_CONCEPT_ID = r.DRUG
            AND ca.ANCESTOR_CONCEPT_ID IN (
                SELECT DISTINCT CONCEPT_ID
                FROM CONCEPT
                WHERE VOCABULARY_ID = 'RxNorm'
                AND CONCEPT_CLASS_ID = 'Ingredient'
                AND INVALID_REASON IS NULL
            )
        JOIN CONCEPT c
            ON c.CONCEPT_ID = ca.ANCESTOR_CONCEPT_ID
    WHERE c.CONCEPT_ID IN (
    /*Potential Missing Ingredients*/
    1186087,1119119,981774,757688,1397141,904525,1336825,43526465,1337620,1118084,912263,1315411,42874220,45775372,1304643,1756831,1358436,40222444,46287397,43526424,989482,1703069,1301125,757352,1151789,19011440,1526475,19017742,1304850,1149380,46276250,937368,1516976,1502905,722424,713196,40239216,1736887,40170911,967562,987366,46275968,735843,1314577,42900448,1110942,1304919,734354,19080982,1337107,40238930,1314273,40241331,733523,952004,44785086,44785094,1336926,19011093,1387104,904501,40161532,1308842,780442
    )
)
SELECT '01-ROW_COUNTS' AS STAT,
        EVIDENCE_TYPE AS STAT_TYPE,
        SUPPORTS, 
        COUNT(*) AS STAT_VALUE
FROM CTE_MISSING_ROWS
GROUP BY EVIDENCE_TYPE, SUPPORTS
UNION
SELECT '01-ROW_COUNTS' AS STAT,
        'Total' AS STAT_TYPE, 
        NULL AS SUPPORTS,
         COUNT(*) AS STAT_VALUE
FROM CTE_MISSING_ROWS
ORDER BY 1,2,3 DESC,4;

@cgreich
Copy link

cgreich commented May 17, 2016

So, wait. Where did you get these 64 from? Are they just random from what you were looking at?

@ericaVoss
Copy link
Contributor Author

@cgreich - I used my local copy of MedLine and basically compared if I used the Vocab vs. the lookup you provided me and I get these 64 ingredients.

Not that this would work anywhere other than my environment:

with drug_of_ade as (
    /*Look up drugs per our MEDLINE AVILLACH query*/
     select meshheading.pmid, meshheading.descriptorname, meshheading.descriptorname_ui 
     from medcit_meshheadinglist_meshheading meshheading inner join medcit_meshheadinglist_meshheading_qualifiername qualifier
     on meshheading.pmid = qualifier.pmid and meshheading.medcit_meshheadinglist_meshheading_order = qualifier.medcit_meshheadinglist_meshheading_order
     where qualifier.value = 'adverse effects'
), 
CTE_MESH_RXNORM AS (
    /*USING VOCAB*/
    SELECT A.CONCEPT_CODE source_concept_code, A.CONCEPT_NAME source_concept_name, B.CONCEPT_CODE concept_code, B.CONCEPT_NAME concept_name, B.CONCEPT_ID concept_id, B.CONCEPT_CLASS_ID concept_class_id
    FROM RNDUSRDHIT06.[VOCABULARY_V5.0_20160311].dbo.CONCEPT A, RNDUSRDHIT06.[VOCABULARY_V5.0_20160311].dbo.CONCEPT B, RNDUSRDHIT06.[VOCABULARY_V5.0_20160311].dbo.CONCEPT_RELATIONSHIP CR
    WHERE A.VOCABULARY_ID = 'MeSH' AND
          CR.CONCEPT_ID_2 = A.CONCEPT_ID AND
          B.CONCEPT_ID = CR.CONCEPT_ID_1 AND
          B.VOCABULARY_ID = 'RxNorm' AND
          CR.RELATIONSHIP_ID = 'Mapped from'
), 
CTE_NEW_MESH_RXNORM AS (
    /*USING VOCAB PATCH*/
    SELECT *
    FROM SCRATCH.dbo.EV_new_mesh
), 
CTE_COMPARE AS (
SELECT DISTINCT d.descriptorname, d.descriptorname_ui, 
        v.concept_code, v.concept_name, v.concept_id, v.concept_class_id,
        nv.concept_code NEW_CONCEPT_CODE, nv.concept_name NEW_CONCEPT_NAME, nv.concept_id AS NEW_CONCEPT_ID, nv.concept_class_id AS NEW_CONCEPT_CLASS_ID
    FROM drug_of_ade d
        LEFT OUTER JOIN CTE_MESH_RXNORM v
            ON v.source_concept_code = d.descriptorname_ui
        LEFT OUTER JOIN CTE_NEW_MESH_RXNORM nv
            ON nv.source_concept_code = d.descriptorname_ui
)
SELECT *
FROM CTE_COMPARE
WHERE CONCEPT_CODE IS NULL AND NEW_CONCEPT_CODE IS NOT NULL /*places where the VOCAB isn't making a map but the PATCH is*/

@cgreich
Copy link

cgreich commented May 23, 2016

Can you give me the list of all missing descriptorname_uis (including the pacifiers)? So I can figure out how we could catch them, maybe.

ericaVoss added a commit that referenced this issue Aug 10, 2016
Noticed that the Vocabulary has deprecated relationships now when using MeSH --> RxNorm.  Removed deprecated relationships.
ericaVoss added a commit that referenced this issue Aug 10, 2016
This should patch part of the issue where it couldn't map MeSH tags to RxNorms due to changes of tagging on the MEDLINE side (D codes to C codes).

The other 1/2 of the fix will come from Rich when rolls-back removal of PharmacoAction mapping piece implemented in the last release.
@ericaVoss
Copy link
Contributor Author

I need to follow the format of the previous file. Will need to update this at a later date.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants