-
Notifications
You must be signed in to change notification settings - Fork 8
Home
The following is the still-developing process to import an updated OpenMRS v1.11 dictionary into OCL:
NOTE: Here are OpenMRS Concept Validation rules: https://wiki.openmrs.org/display/docs/Concept+Validation
- Create new CIEL source version in preparation for import using the OCL Web Interface
- NOTE: This currently takes up to 3 days!!!
- Import latest release of CIEL OpenMRS v1.11 MySQL dump into my local MySQL. Andy Kanter provides this file via Dropbox as a zipped MySQL dump.
- Run OpenMRS Concept Dictionary Integrity Checks script -- this will verify that everything checks out with the MySQL database (e.g. no duplicate mappings) and that CIEL does not need to make any changes prior to importing into OCL
- NOTE: This does not exist! I have several SQL scripts in the snippets though.
- Run the
ocl_omrs verify_metadata
management command to ensure that all required metadata in the OpenMRS instance exists in OCL. Use the output logs to add any missing metadata.
- NOTE: This does not exist! Use
check_sources()
inocl_omrs extract_db
as starting point.
- Use
ocl_omrs extract_db
management command to generate JSON files for concepts and for mappings - Use
ocl_api import_concepts_to_source
management command to import concepts into the new source version
- NOTE: Estimate ~10-15 seconds per concept that is updated, which is 56000*12/60/60/24=7 days for CIEL. Time is much faster if there is no diff.
- Use
ocl_api import_mappings_to_source
management command to import mappings into the new source version.
- NOTE: Estimate ~3 seconds per mapping that is updated, which is 160000*3/60/60/24=5 days for CIEL. There are generally very few updates, however, so in practice this takes <10 hours.
- NOTE: It was possible to run this simultaneously with the concept import, while keeping track of mappings that were skipped because the concept does not exist yet (i.e. it is a new concept that has not yet been imported).
- Run the deactivation loop for concepts/mappings that were not included in the latest dictionary release.
- NOTE: The deactivation loop was disabled during the last import, since the import was broken into batches of 5,000 concepts/mappings.
- IDEA: Have import script optionally output file with IDs to deactivate and move deactivation into its own management script so that it can be verified before execution.
- Update the Solr index, e.g.
/opt/deploy/ocl_api/ocl/manage.py update_index --verbosity=2 --workers=10 --remove
.
- NOTE: This takes ~24 hours using >=8 workers on a dedicated multi-core server.
- Generate the new export file using the API, e.g.:
POST /orgs/CIEL/sources/CIEL/[:source-version-id]/export/
- NOTE: Takes 3-4 hours, depending on server load
- Download the latest export file using the API:
GET /orgs/CIEL/sources/CIEL/[:source-version-id]/export/
- Use the
ocl_omrs validate_export
management command to validate the export file against the original MySQL dump, e.g.:ocl_omrs validate_export --export=[export_filename] -v2 > validation_output.log
- NOTE: Be sure to update the project settings to the correct MySQL database!
- Use the export validation output logs to address any gaps -- gaps came in 9 types:
- Duplicate mapping in MySQL -- this showed up as a mapping missing in OCL.
- SOLUTION: Delete the duplicate mapping in MySQL and restart the validation.
- NOTE: In the future, this should be caught during #3 above.
- Missing concept or mapping (incl. reference maps, Q/A, and concept sets) in OCL Export, meaning that MySQL has a record but the OCL Export does not. This usually occurs when the resource exists in OCL, but it was not included in the OCL Export, because its ID was missing from the source version list.
- SOLUTION: Add the missing ID to the source version list directly in Mongo.
- NOTE: Requires index update.
- NOTE: If it turns out that the resource does not exist in OCL at all, that means there is a problem with the import script!
- Missing concept or mapping in MySQL, meaning that a concept or mapping exists in OCL, but not in the latest CIEL MySQL release. This usually occurs when the resource was part of a previous CIEL MySQL release, but it was removed from the latest release. OpenMRS allows "retiring" of concepts, so they should always be included in the MySQL database even if retired. However, retired mappings are simply updated or removed entirely, meaning that we need to identify their absence and retire them in OCL.
- SOLUTION: Retire the resources with the missing IDs using the API.
- NOTE: Does not require index update, since it uses the API.
- If any changes were made directly to the source version concept/mapping arrays (13b), then return to step #9 to update the index, rebuild the export, and run the export validation. If changes were only made using the API (13c), then simply rebuild the export file and its ready to go. Otherwise, its good to go!
There were discrepancies in the number of concept and mappings and their states (e.g. active/retired) in OCL Export and in the MySQL Export. Upon further investigation, there were also discrepancies between which concepts/mappings existed in a source and which were members of the latest source version. Meaning, the import script was able to query and update the resource, but it did not appear in the OCL Export. Here is a summary of the statistics:
** CIEL OpenMRS v2015-08-24 Statistics **
Total Mappings: 160857 (all are active)
Total QandA: 4742
Total Concept Set: 1842
Total RefMap: 154273 (excluding CIEL, which is not imported)
Total Concepts: 51082
Active concepts: 49182
Retired concepts: 1900
** OCL CIEL Mongo Source Version v2015-08-24 Statistics **
Total Mappings: 161553 # 696 more than CIEL OpenMRS
Total Concepts: 51083 # 1 more than CIEL OpenMRS -- already removed duplicate concept references
** OCL CIEL Solr Source Version v2015-08-24 Statistics **
Total Mappings: 161553 # This matches
Q-AND-A: 4870
CONCEPT-SET: 1842
Total Concepts: 51084 # This probably means that one concept ID is invalid
Active: 49184 # Off by 2
Retired: 1900 # Matches
** OCL CIEL v2015-08-24 Export Statistics **
Total Mappings: 161553
Total Concepts: 51084
put an example in here
- 2 sets of 2 concept versions for the same concept were members of the latest source version, meaning that 2 different versions of the same concepts were being exported.
- SOLUTION: Remove the duplicate concept versions from the source version
# Remove the two concept versions that both point to the same concepts
db.sources_sourceversion.update(
{ "_id" : ObjectId("55fc750a525c311ef1a57484") },
{ $pull: { concepts: { $in: [ '5580b6f7525c312dca120f65', '5580b635525c312dca120bd3' ] } } }
)
- 2 Q/A mapping(s) missing in OCL Export: These Q/A are not part of any source versions.
- SOLUTION: Add to the v2015-08-24 mappings array.
# Add missing mappings to mappings array
db.sources_sourceversion.findOne( { mnemonic: 'v2015-08-24'}, { _id: 1, mnemonic: 1 } )
db.sources_sourceversion.findOne( { _id: ObjectId('55fc750a525c311ef1a57484')}, { _id: 1, mnemonic: 1 } )
db.sources_sourceversion.update( { _id: ObjectId('55fc750a525c311ef1a57484') }, { $push: { mappings: { $each: ['560b62d4525c3132942c57c8','560b62de525c3132942c57c9'] } } } )
-
130 Q/A mapping(s) missing in MySQL: These Q/A existed in v2015-05-14 and were removed from v2015-08-24.
- SOLUTION: Retire all 130 of these mappings. Solution was merged with #4 below.
-
291 Reference Map(s) missing in OCL Export: These reference maps appear to have been created in OCL, but were not attached to any source versions.
- SOLUTION: Find each of these mappings and add to v2015-08-24.
- NOTE: These may need to be added to v2015-05-14 as well.
# Used snippets/missing_maps/export-missing-maps-json.py to create js file to run on the server
from_source = db.sources_source.findOne( { mnemonic:'CIEL'}, {_id:1,mnemonic:1} )
parent_source_version = db.sources_sourceversion.findOne( { mnemonic: 'v2015-08-24'}, { _id: 1, mnemonic: 1 } )
mapping_ids = []
# loop through these for each
from_concept = db.concepts_concept.findOne( { mnemonic:'110105', parent_id:from_source._id.str } )
to_source = db.sources_source.findOne( { mnemonic:'SNOMED-CT' }, {_id:1,mnemonic:1} )
mapping = db.mappings_mapping.findOne( { from_concept_id:from_concept._id, to_concept_code:'21638000', map_type:'SAME-AS', to_source_id:to_source._id } )
# run this at the end
db.sources_sourceversion.update( { _id:parent_source_version._id }, { $push: { mappings: { $each: mapping_ids } } } )
- 866 Reference Map(s) missing in MySQL: These reference maps existed in v2015-05-14 and were removed in v2015-08-24.
- SOLUTION: Retire all 866 mappings:
- Copied OCL mapping_id from #2 and #4 straight into python code found at snippets/retire_maps/
- retire_maps.py is a small script that simply uses the API to retire a set of mappings based on the provided list of mapping_ids
- SOLUTION: Retire all 866 mappings: