Skip to content
Jonathan Payne edited this page Oct 13, 2015 · 3 revisions

Overview

The following is the still-developing process to import an updated OpenMRS v1.11 dictionary into OCL:

NOTE: Here are OpenMRS Concept Validation rules: https://wiki.openmrs.org/display/docs/Concept+Validation

  1. Create new CIEL source version in preparation for import using the OCL Web Interface
  • NOTE: This currently takes up to 3 days!!!
  1. Import latest release of CIEL OpenMRS v1.11 MySQL dump into my local MySQL. Andy Kanter provides this file via Dropbox as a zipped MySQL dump.
  2. Run OpenMRS Concept Dictionary Integrity Checks script -- this will verify that everything checks out with the MySQL database (e.g. no duplicate mappings) and that CIEL does not need to make any changes prior to importing into OCL
  • NOTE: This does not exist! I have several SQL scripts in the snippets though.
  1. Run the ocl_omrs verify_metadata management command to ensure that all required metadata in the OpenMRS instance exists in OCL. Use the output logs to add any missing metadata.
  • NOTE: This does not exist! Use check_sources() in ocl_omrs extract_db as starting point.
  1. Use ocl_omrs extract_db management command to generate JSON files for concepts and for mappings
  2. Use ocl_api import_concepts_to_source management command to import concepts into the new source version
  • NOTE: Estimate ~10-15 seconds per concept that is updated, which is 56000*12/60/60/24=7 days for CIEL. Time is much faster if there is no diff.
  1. Use ocl_api import_mappings_to_source management command to import mappings into the new source version.
  • NOTE: Estimate ~3 seconds per mapping that is updated, which is 160000*3/60/60/24=5 days for CIEL. There are generally very few updates, however, so in practice this takes <10 hours.
  • NOTE: It was possible to run this simultaneously with the concept import, while keeping track of mappings that were skipped because the concept does not exist yet (i.e. it is a new concept that has not yet been imported).
  1. Run the deactivation loop for concepts/mappings that were not included in the latest dictionary release.
  • NOTE: The deactivation loop was disabled during the last import, since the import was broken into batches of 5,000 concepts/mappings.
  • IDEA: Have import script optionally output file with IDs to deactivate and move deactivation into its own management script so that it can be verified before execution.
  1. Update the Solr index, e.g. /opt/deploy/ocl_api/ocl/manage.py update_index --verbosity=2 --workers=10 --remove.
  • NOTE: This takes ~24 hours using >=8 workers on a dedicated multi-core server.
  1. Generate the new export file using the API, e.g.: POST /orgs/CIEL/sources/CIEL/[:source-version-id]/export/
  • NOTE: Takes 3-4 hours, depending on server load
  1. Download the latest export file using the API: GET /orgs/CIEL/sources/CIEL/[:source-version-id]/export/
  2. Use the ocl_omrs validate_export management command to validate the export file against the original MySQL dump, e.g.: ocl_omrs validate_export --export=[export_filename] -v2 > validation_output.log
  • NOTE: Be sure to update the project settings to the correct MySQL database!
  1. Use the export validation output logs to address any gaps -- gaps came in 9 types:
  • Duplicate mapping in MySQL -- this showed up as a mapping missing in OCL.
    • SOLUTION: Delete the duplicate mapping in MySQL and restart the validation.
    • NOTE: In the future, this should be caught during #3 above.
  • Missing concept or mapping (incl. reference maps, Q/A, and concept sets) in OCL Export, meaning that MySQL has a record but the OCL Export does not. This usually occurs when the resource exists in OCL, but it was not included in the OCL Export, because its ID was missing from the source version list.
    • SOLUTION: Add the missing ID to the source version list directly in Mongo.
    • NOTE: Requires index update.
    • NOTE: If it turns out that the resource does not exist in OCL at all, that means there is a problem with the import script!
  • Missing concept or mapping in MySQL, meaning that a concept or mapping exists in OCL, but not in the latest CIEL MySQL release. This usually occurs when the resource was part of a previous CIEL MySQL release, but it was removed from the latest release. OpenMRS allows "retiring" of concepts, so they should always be included in the MySQL database even if retired. However, retired mappings are simply updated or removed entirely, meaning that we need to identify their absence and retire them in OCL.
    • SOLUTION: Retire the resources with the missing IDs using the API.
    • NOTE: Does not require index update, since it uses the API.
  1. If any changes were made directly to the source version concept/mapping arrays (13b), then return to step #9 to update the index, rebuild the export, and run the export validation. If changes were only made using the API (13c), then simply rebuild the export file and its ready to go. Otherwise, its good to go!

Example from CIEL v2015-08-24 Import

Overview

There were discrepancies in the number of concept and mappings and their states (e.g. active/retired) in OCL Export and in the MySQL Export. Upon further investigation, there were also discrepancies between which concepts/mappings existed in a source and which were members of the latest source version. Meaning, the import script was able to query and update the resource, but it did not appear in the OCL Export. Here is a summary of the statistics:

** CIEL OpenMRS v2015-08-24 Statistics **
Total Mappings: 160857 (all are active)
    Total QandA: 4742
    Total Concept Set: 1842
    Total RefMap: 154273 (excluding CIEL, which is not imported)
Total Concepts: 51082
    Active concepts: 49182
    Retired concepts: 1900

** OCL CIEL Mongo Source Version v2015-08-24 Statistics **
Total Mappings: 161553     # 696 more than CIEL OpenMRS
Total Concepts: 51083      # 1 more than CIEL OpenMRS -- already removed duplicate concept references

** OCL CIEL Solr Source Version v2015-08-24 Statistics **
Total Mappings: 161553      # This matches
    Q-AND-A: 4870
    CONCEPT-SET: 1842
Total Concepts: 51084       # This probably means that one concept ID is invalid
    Active: 49184           # Off by 2
    Retired: 1900           # Matches

** OCL CIEL v2015-08-24 Export Statistics **
Total Mappings: 161553
Total Concepts: 51084

Export Validation Log File Example

put an example in here

Solutions to the above export validation results

  • 2 sets of 2 concept versions for the same concept were members of the latest source version, meaning that 2 different versions of the same concepts were being exported.
    • SOLUTION: Remove the duplicate concept versions from the source version
# Remove the two concept versions that both point to the same concepts
db.sources_sourceversion.update(
    { "_id" : ObjectId("55fc750a525c311ef1a57484") },
    { $pull: { concepts: { $in: [ '5580b6f7525c312dca120f65', '5580b635525c312dca120bd3' ] } } }
)
  • 2 Q/A mapping(s) missing in OCL Export: These Q/A are not part of any source versions.
    • SOLUTION: Add to the v2015-08-24 mappings array.
# Add missing mappings to mappings array
db.sources_sourceversion.findOne( { mnemonic: 'v2015-08-24'}, { _id: 1, mnemonic: 1 } )
db.sources_sourceversion.findOne( { _id: ObjectId('55fc750a525c311ef1a57484')}, { _id: 1, mnemonic: 1 } )
db.sources_sourceversion.update( { _id: ObjectId('55fc750a525c311ef1a57484') }, { $push: { mappings: { $each: ['560b62d4525c3132942c57c8','560b62de525c3132942c57c9'] } } } )
  • 130 Q/A mapping(s) missing in MySQL: These Q/A existed in v2015-05-14 and were removed from v2015-08-24.

    • SOLUTION: Retire all 130 of these mappings. Solution was merged with #4 below.
  • 291 Reference Map(s) missing in OCL Export: These reference maps appear to have been created in OCL, but were not attached to any source versions.

    • SOLUTION: Find each of these mappings and add to v2015-08-24.
    • NOTE: These may need to be added to v2015-05-14 as well.
# Used snippets/missing_maps/export-missing-maps-json.py to create js file to run on the server
from_source = db.sources_source.findOne( { mnemonic:'CIEL'}, {_id:1,mnemonic:1} )
parent_source_version = db.sources_sourceversion.findOne( { mnemonic: 'v2015-08-24'}, { _id: 1, mnemonic: 1 } )
mapping_ids = []
# loop through these for each
    from_concept = db.concepts_concept.findOne( { mnemonic:'110105', parent_id:from_source._id.str } )
    to_source = db.sources_source.findOne( { mnemonic:'SNOMED-CT' }, {_id:1,mnemonic:1} )
    mapping = db.mappings_mapping.findOne( { from_concept_id:from_concept._id, to_concept_code:'21638000', map_type:'SAME-AS', to_source_id:to_source._id } )
# run this at the end
db.sources_sourceversion.update( { _id:parent_source_version._id }, { $push: { mappings: { $each: mapping_ids } } } )
  • 866 Reference Map(s) missing in MySQL: These reference maps existed in v2015-05-14 and were removed in v2015-08-24.
    • SOLUTION: Retire all 866 mappings:
      • Copied OCL mapping_id from #2 and #4 straight into python code found at snippets/retire_maps/
      • retire_maps.py is a small script that simply uses the API to retire a set of mappings based on the provided list of mapping_ids