This documentation can be found here
Example: Sample CSV.
Non-required columns can be included in the CSV, but they will be ignored during processing.
CSV Column name | Data Type | Required | Description |
---|---|---|---|
multipassId | String | True | The user ID of the Enclave user on whose behalf the concept set container or version is being created (i.e. the user who owns the uploaded concept set or version). An Enclave user can find their ID by going to their account settings and copying the UserID on the top right. If you need someone else's User ID, use the Enclave Object Explorer to search for researchers and click on their name under "Results" on the right. Then, hover over the User ID field and click on the copy icon. |
parent_version_codeset_id | Integer | True (for update operations) | If you are creating a new version of a concept set, add the codeset ID of the parent concept set version. If creating a new concept set container, leave this empty. |
current_max_version | Double | True (for update operations) | If you are creating a new version of a concept set, set this to the maximum existing version number. For example, if the maximum existing version of the concept set is v5, set this to 5.0. |
concept_set_name | String | True | The name of the concept set container. |
concept_id | Integer | True | This is the concept_id column in the OMOP concept table. In the condition_occurrence table, it appears as condition_concept_id. Likewise for other domain tables. |
includeDescendants | Boolean | True | If this is set to TRUE, then this expression item will match the selected OMOP Concept and all of its descendants. |
isExcluded | Boolean | True | This column is automatically generated by the Standard Operating Procedure. If this is set to TRUE, then the concepts matched by this expression will be removed from the final expansion of this concept set version after all other expressions have been processed. This is useful, for instance, if you want to include the descendants of some concept except for certain concepts or subtrees. |
includeMapped | Boolean | True | This column is automatically generated by the Standard Operating Procedure. Do not use this unless you know what you're doing. If you want to include mapped concepts, you can set this to TRUE, but we recommend you test the expression in ATLAS or the Enclave Concept Set Editor first. |
action | String | True | This column was previously intended to allow concepts/expressions to be added, changed, or removed from existing versions. For now, though, the new version will be created from scratch and include only the expressions listed in the file. Please set value to "add/replace" if you are creating or updating a concept set. |
vocabulary_concept_code | String | True (if concept_id is empty) | Leave this column blank if concept_id column is not empty. vocabulary_id column should have a value if this column has a value. This is concept_code in OMOP concept table. In the condition_occurrence table, it appears as condition_source_value. Likewise for other domain tables. |
vocabulary_id | String | True (if concept_id is blank) | Leave this column blank if concept_id column is not empty. vocabulary_concept_code column should have a value if this column has a value. This is vocabulary_id in OMOP concept table. |
annotation | String | False | This column is not required by the Standard Operating Procedure. This is for any comments about the inclusion of this expression. |
domain_team | String | False | This column is not used by the Standard Operating Procedure. |
provenance | String | False | This column is not used by the Standard Operating Procedure. |
limitations | String | False | This column is not used by the Standard Operating Procedure. |
intention | String | False | This column is not used by the Standard Operating Procedure. |
intended_research_project | String | False | This column is not used by the Standard Operating Procedure. |
authority | String | False | This column is not used by the Standard Operating Procedure. |
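
As a rough illustration of the columns in the table above, the following sketch writes a one-row CSV using Python's standard `csv` module. The column names come from the table; the helper name `build_csv` and all row values except the `add/replace` action are placeholders chosen for this example, not values prescribed by the Standard Operating Procedure.

```python
import csv
import io

# Column names taken from the table above.
REQUIRED_COLUMNS = [
    "multipassId", "parent_version_codeset_id", "current_max_version",
    "concept_set_name", "concept_id", "includeDescendants", "isExcluded",
    "includeMapped", "action", "vocabulary_concept_code", "vocabulary_id",
]


def build_csv(rows):
    """Serialize a list of row dicts to CSV text, leaving missing cells blank.

    Hypothetical helper for illustration only.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=REQUIRED_COLUMNS, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# One expression item for a NEW container, so parent_version_codeset_id and
# current_max_version are left empty. All values below are placeholders.
example_rows = [
    {
        "multipassId": "<your-user-id>",
        "concept_set_name": "My example concept set",
        "concept_id": 4034962,
        "includeDescendants": "TRUE",
        "isExcluded": "FALSE",
        "includeMapped": "FALSE",
        "action": "add/replace",
    },
]

csv_text = build_csv(example_rows)
print(csv_text)
```

Columns that are omitted from a row dict are written as empty cells, which matches the instruction to leave the update-only columns empty when creating a new container.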
CSV Column name | Data Type | Required | Description |
---|---|---|---|
container_intention | String | True | The intention of the concept set. |
container_research_project | String | True | The name of the project, ideally the 'short name' (the part that appears in brackets before the longer name), e.g. the "RP-4A9E27" part in "[RP-4A9E27] DI&H - Data Quality". A list of research projects is available here: https://unite.nih.gov/workspace/compass/projects |
container_assigned_sme | String | True | The concept set's assigned subject matter expert. |
container_assigned_informatician | String | True | The concept set's assigned informatician. |
CSV Column name | Data Type | Required | Description |
---|---|---|---|
domain | String | False | An ignored field. Feel free to include if it helps for readability / data management. |
class_id | String | False | An ignored field. Feel free to include if it helps for readability / data management. |
This can be done in one of two ways: (a) by providing OMOP concept IDs in the concept_id field, or (b) by adding concepts directly from a source vocabulary, using the vocabulary_id and vocabulary_concept_code fields.
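
Before uploading, it may help to check that every row identifies a concept in one of these two ways. The following is a minimal sketch using only the standard library. It assumes the column names `concept_id`, `vocabulary_id`, and `vocabulary_concept_code` from the column table; the function name `find_row_problems` is hypothetical and is not part of `enclave_wrangler`.

```python
import csv
import io


def find_row_problems(csv_text):
    """Return (row_number, message) pairs for rows that identify no concept.

    A row is accepted if it has a concept_id, or if it has both a
    vocabulary_id and a vocabulary_concept_code.
    """
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 of the file is the header, so data rows start at 2.
    for row_number, row in enumerate(reader, start=2):
        concept_id = (row.get("concept_id") or "").strip()
        code = (row.get("vocabulary_concept_code") or "").strip()
        vocabulary = (row.get("vocabulary_id") or "").strip()
        if concept_id:
            continue  # way (a): OMOP concept ID provided
        if code and vocabulary:
            continue  # way (b): source vocabulary and code provided
        problems.append(
            (row_number, "needs concept_id, or vocabulary_id plus vocabulary_concept_code")
        )
    return problems


sample = (
    "concept_id,vocabulary_concept_code,vocabulary_id\n"
    "4034962,,\n"
    ",703136005,SNOMED\n"
    ",703136005,\n"
)
for row_number, message in find_row_problems(sample):
    print(f"row {row_number}: {message}")
```

The check is deliberately lenient: it does not reject rows that fill in both the concept ID and the vocabulary columns, since how the upload code treats such rows is not stated here.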
Given a CSV like the following... TODO
...run: TODO
Given a CSV like the following... TODO
...run: TODO
Given a CSV like the following...
| concept_set_name | parent_version_codeset_id | action | concept_id | includeDescendants | isExcluded | includeMapped | annotation | vocabulary_concept_code | vocabulary_id | FIELD11 | concept_name | domain | class_id |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | 794639872 | add/replace | 4034962 | FALSE | FALSE | FALSE |  | 237613005 |  |  | Hyperproinsulinemia |  |  |
|  | 794639872 | add/replace |  | FALSE | TRUE | FALSE |  | 703136005 | SNOMED |  | Diabetes mellitus in remission | Condition | Clinical Finding |
...updates can be uploaded using the following Python code:
from enclave_wrangler.dataset_upload import upload_new_cset_version_with_concepts_from_csv
path = 'path/to/csv' # replace with path to your CSV
upload_new_cset_version_with_concepts_from_csv(path)
There is also a unit test that demonstrates this functionality in tests/test_enclave_wrangler.py called TestEnclaveWrangler.test_upload().
- Add documentation here for how this can be run by users with Python skills
- Give upload_new_cset_version_with_concepts_from_csv features to allow:
  - Specifying a user auth token, so it can be run by people on their own behalf (who don't have the bulkimport user auth token)
  - Specifying whether version(s) should be finalized or left in draft state
Access your security authorization token for the Enclave API:
- Go to https://unite.nih.gov/workspace/slate/documents/dashboard
- Go to "account"
- Go to "settings"
- Go to "tokens"
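
Once you have a token, a common pattern is to keep it out of source code and read it from the environment. The sketch below assumes the token is sent as a bearer token in the `Authorization` header, which is how the public Foundry API documentation describes authentication. The environment variable name `ENCLAVE_API_TOKEN` and the function name `build_auth_headers` are made up for this example.

```python
import os


def build_auth_headers(env_var="ENCLAVE_API_TOKEN"):
    """Build HTTP headers for an API request from a token in the environment.

    env_var is a hypothetical variable name; use whatever name your setup uses.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable to your token")
    return {
        "Authorization": f"Bearer {token}",  # assumed bearer-token scheme
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the `headers` argument of an HTTP client call. Treat the token like a password and avoid committing it to version control.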
- Foundry (the software that runs the Enclave) API documentation root: https://www.palantir.com/docs/foundry/api/
- Foundry backend diagram/explanation: https://unite.nih.gov/workspace/documentation/product/foundry-backend/
- Enclave API documentation root: https://unite.nih.gov/workspace/documentation/developer/api
- Data types: https://unite.nih.gov/workspace/documentation/product/api-gateway/types
- Action types (endpoints that allow creates/updates/deletes): https://unite.nih.gov/docs/foundry/action-types/overview/ / https://unite.nih.gov/workspace/ontology/home/action-type
- List of action types: https://unite.nih.gov/docs/foundry/api/ontology-resources/action-types/list-action-types/
- Object types: https://unite.nih.gov/docs/foundry/object-link-types/object-types-overview/
- List of object types: https://www.palantir.com/docs/foundry/api/ontology-resources/object-types/list-object-types/
- Object search: https://www.palantir.com/docs/foundry/api/ontology-resources/objects/search/
- List objects: https://www.palantir.com/docs/foundry/api/ontology-resources/objects/list-objects/
- About the enclave: https://covid.cd2h.org/enclave
- Logging into the enclave: https://unite.nih.gov/workspace/slate/documents/dashboard
- Logic that runs to create dataset generation used by TermHub: https://unite.nih.gov/workspace/data-integration/code/repos/ri.stemma.main.repository.aea80f94-828b-4795-9603-c3228b153414/contents/refs%2Fheads%2Fmaster/
  - Expands expression items to concepts: https://unite.nih.gov/workspace/data-integration/code/repos/ri.stemma.main.repository.aea80f94-828b-4795-9603-c3228b153414/contents/refs%2Fheads%2Fmaster/transforms-python/src/myproject/datasets/concept_set_items_to_all_concept_ids.py
- Concept set browser: https://unite.nih.gov/workspace/module/view/latest/ri.workshop.main.module.5a6c64c0-e82b-4cf8-ba5b-645cd77a1dbf
- Security tokens: https://unite.nih.gov/workspace/documentation/product/foundry-backend/security-api
- Security settings to allow users to access endpoints specifically used by TermHub:
  - https://unite.nih.gov/workspace/ontology/action-type/create-new-draft-omop-concept-set-version/security
  - https://unite.nih.gov/workspace/ontology/action-type/finalize-draft-omop-concept-set-version/security
  - https://unite.nih.gov/workspace/ontology/action-type/add-selected-concepts-as-omop-version-expressions/security