This is a work in progress. We are using this wiki page to capture notes as we work on getting PIH concepts into OCL. Tracked in ocl_issues #45.
Note: The script runs inside a Docker container that it starts itself, so you will need Docker installed.
The JSON was bulk imported into the PIH source on OCL using the OCL app's bulk import feature.
We discovered some issues during the PIH import process, which are documented in OCL issues:
- Custom validation schema issues (#732)
- URL encoding issues for concept codes (#807)
- Unable to delete PIH dictionary in OCL because other sources/collections reference it (#661)
We also discovered that, since they manage dictionaries across multiple sites and many different servers, PIH does not expect concept IDs (concept.concept_id in the OpenMRS database) to be predictable, so they rely on mappings and UUIDs instead of concept IDs. They sometimes create a mapping (e.g., PIH:3) to serve as an organizational concept code, but they don't have these for all concepts. Discussion on how to assign official PIH codes for concepts within OCL took place in this OpenMRS Talk thread.
Once we got a close approximation of the PIH dictionary into a temporary source in OCL, we created a matching collection and then added references to all concepts & mappings in the source with a REST API call like this:
Once PIH began testing by downloading their dictionary back into OpenMRS, we quickly realized that the sort weights for answers & set members (which allow answers & members of a set to be sorted) were not being imported into OCL.
- The ocl_omrs conversion script was adapted to include sort weight for answers & set members (here).
- OCLOMRS-1067 was created to make sure the sort weights are populated when concepts & mappings are imported from OCL.
Recipe for importing the PIH dictionary into OCL
- You have Docker and git installed
- You have httpie installed (brew install httpie on Mac), or can translate the http commands below into curl commands or Postman calls
- You have two environment variables defined:
OCL_API_TOKEN containing your API token from your OCL account on staging
- You have downloaded the attachments and made the scripts executable:
1. Clone the OCL import script:
2. Copy the SQL dump file (either a .sql file or a zipped SQL file as .zip) into the local/ subfolder. Docker containers can't see folders outside of the local copy of ocl-omrs, so you must copy your SQL under this local folder and reference it through a relative path.
3. Run the
For PIH, since the base OpenMRS instance used to create the dictionary is OpenMRS 2.3+, the Python script needs to be modified to refer to the concept_numeric.allow_decimal column rather than concept_numeric.precise.
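One way to avoid hand-editing the script for each database is to choose the column from what the concept_numeric table actually contains. The following is only a minimal sketch of that idea: the helper name numeric_precision_column and the approach of passing in the table's column names are assumptions made for illustration, not part of the ocl_omrs script.

```python
def numeric_precision_column(columns):
    """Pick the concept_numeric column that records whether decimals are allowed.

    `columns` is an iterable of column names read from the concept_numeric
    table (for example from a SHOW COLUMNS query). Hypothetical helper,
    not ocl_omrs code.
    """
    names = {name.lower() for name in columns}
    if "allow_decimal" in names:  # schema used by the PIH database
        return "allow_decimal"
    if "precise" in names:        # older column name the script expects
        return "precise"
    raise ValueError("concept_numeric has neither allow_decimal nor precise")
```

With a check like this, the same script could handle both PIH's database and dumps that still use the older column name.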
Then execute the script as follows:
USE_GOLD_MAPPINGS=1 ./sql-to-json.sh local/pih-concepts-db-20211213.zip PIH PIH staging
This will unzip the sql file if needed, import the sql into a temporary db, check the target OCL environment to make sure all referenced sources already exist in that environment, and then write an export file (same name with a .json extension) into the local/ folder alongside the sql file.
USE_GOLD_MAPPINGS=1 sets an environment variable that makes the import script require a gold mapping (a same-as mapping to PIH with a numeric code) for every concept. If any concept has no gold mapping, or has more than one, the import will fail.
./sql-to-json.sh local/pih-concepts-db-20211213.zip runs the import script on the specified file.
PIH PIH specifies the organization (first "PIH") and the target source (second "PIH") to be used in the import file. If you wanted to create an import file for PIH's foobar source, then PIH PIH would be replaced with PIH foobar.
staging specifies the target environment (as of Jan 2022, this can be "staging" or "production"). This is used when ensuring the sources referenced within the import exist on the target environment.
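The gold mapping rule described above can be checked ahead of time with a few lines of Python. This is a minimal sketch under stated assumptions: the concept and mapping dictionaries (keys id, mappings, map_type, source, code) are a hypothetical in-memory shape chosen for illustration, and the import script's actual check may differ.

```python
def gold_mappings(concept, source="PIH"):
    """Return the mappings of a concept that count as gold mappings:
    SAME-AS mappings to `source` whose code is purely numeric."""
    return [
        m for m in concept.get("mappings", [])
        if m.get("map_type") == "SAME-AS"
        and m.get("source") == source
        and str(m.get("code", "")).isdigit()
    ]


def check_gold_mappings(concepts, source="PIH"):
    """Map each offending concept id to its number of gold mappings.

    A concept offends when it has zero or more than one gold mapping.
    An empty result means every concept has exactly one.
    """
    problems = {}
    for concept in concepts:
        count = len(gold_mappings(concept, source))
        if count != 1:
            problems[concept["id"]] = count
    return problems


# Illustrative data only (not real PIH concepts).
concepts = [
    {"id": "a", "mappings": [{"map_type": "SAME-AS", "source": "PIH", "code": "3"}]},
    {"id": "b", "mappings": [{"map_type": "SAME-AS", "source": "PIH", "code": "Weight"}]},
    {"id": "c", "mappings": [
        {"map_type": "SAME-AS", "source": "PIH", "code": "7"},
        {"map_type": "SAME-AS", "source": "PIH", "code": "8"},
    ]},
]
print(check_gold_mappings(concepts))  # {'b': 0, 'c': 2}
```

Running a check like this against the database before the export makes it easier to see which concepts would cause the import to fail.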
4. Delete the existing PIH content. You can import without deleting first (i.e., just update existing content in OCL, as we do with CIEL updates), but while things are still changing between PIH imports, I've been deleting the PIH dictionary (source & collection) on each iteration to ensure we don't leave extraneous resources in OCL (e.g., a concept or mapping that was removed between imports but never deleted from OCL).
I use httpie, but you can use curl or postman to do the same thing:
Verify that the PIH source has been deleted before proceeding. I usually give this 15-30 minutes to be sure it's gone before recreating it just to be safe.
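For anyone scripting this step rather than using httpie, the delete-then-verify loop might look roughly like the sketch below. Everything specific in it is an assumption to check against your own setup: the API host api.staging.openconceptlab.org, the resource paths /orgs/PIH/sources/PIH/ and /orgs/PIH/collections/PIH/, the "Token" authorization header, and the idea that a deleted source answers a GET with 404. The helper names are hypothetical.

```python
import time
import urllib.error
import urllib.request

API = "https://api.staging.openconceptlab.org"  # assumed staging API host


def delete_request(path, token):
    """Build (but do not send) an authenticated DELETE request."""
    return urllib.request.Request(
        API + path,
        method="DELETE",
        headers={"Authorization": "Token " + token},  # assumed auth scheme
    )


def http_status(url, token):
    """Return the HTTP status code of an authenticated GET on `url`."""
    req = urllib.request.Request(url, headers={"Authorization": "Token " + token})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code


def wait_until_gone(get_status, attempts=30, delay=60, sleep=time.sleep):
    """Poll `get_status()` until it returns 404.

    Returns True once a 404 is seen, False if it never appears within
    `attempts` polls spaced `delay` seconds apart.
    """
    for attempt in range(attempts):
        if get_status() == 404:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


# Usage sketch (performs real network calls, so it is left commented out):
#   import os
#   token = os.environ["OCL_API_TOKEN"]
#   urllib.request.urlopen(delete_request("/orgs/PIH/collections/PIH/", token))
#   urllib.request.urlopen(delete_request("/orgs/PIH/sources/PIH/", token))
#   gone = wait_until_gone(lambda: http_status(API + "/orgs/PIH/sources/PIH/", token))
```

The default of 30 polls one minute apart roughly matches the 15-30 minute wait mentioned above; adjust it to taste.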
6. Bulk import the PIH source by logging into app.staging.openconceptlab.org, and selecting the Bulk Importer app.
- Check both "Update existing" (checked by default) and "Parallel" options
- Upload the import file that was generated in step #3 (e.g.,
8. Add references for all concepts & mappings of the PIH source into the PIH collection using the pih-references.json files (attached to this page). This approach generates an individual reference for each concept & mapping. While it's possible to do this with two references (one to all concepts & one to all mappings), until the Dictionary Manager is refactored to use dynamic references or merged into OCL's term browser, you'll still want to work with one reference per concept/mapping.
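To illustrate what "one reference per concept/mapping" means in practice, the sketch below builds one expression per concept and per mapping. It is a rough illustration, not the contents of the attached files: the expression format (/orgs/.../sources/.../concepts/<id>/), the payload shape {"data": {"expressions": [...]}}, and the function names are assumptions, so treat the attached pih-references.json as authoritative. IDs are percent-encoded because concept codes with special characters have caused URL problems (see #807).

```python
import json
from urllib.parse import quote


def reference_expressions(owner, source, concept_ids, mapping_ids):
    """Build one reference expression per concept and per mapping."""
    base = f"/orgs/{owner}/sources/{source}"
    concepts = [f"{base}/concepts/{quote(str(cid), safe='')}/" for cid in concept_ids]
    mappings = [f"{base}/mappings/{quote(str(mid), safe='')}/" for mid in mapping_ids]
    return concepts + mappings


def references_payload(expressions):
    """Wrap the expressions in the (assumed) request body shape."""
    return {"data": {"expressions": list(expressions)}}


# Illustrative IDs only.
expressions = reference_expressions("PIH", "PIH", ["3", "a/b"], ["m1"])
print(json.dumps(references_payload(expressions), indent=2))
```

The two-reference alternative mentioned above would replace this whole list with one expression covering all concepts and one covering all mappings.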