Migrating Data to OpenMRS

Summary

This is a page to collect notes about migrating data from existing EMRs to OpenMRS. Please add your own experiences and ideas. When we have enough material, we can turn this into a more formal guide.

The Bad News

It isn't easy to migrate an existing EMR into OpenMRS. If you're looking to migrate to OpenMRS, you most likely have a custom database or Excel spreadsheet that is tailored to your clinic's practices. It's very difficult to make an all-purpose migation tool that can handle any custom implementation.  

There are no modules for migrating at this time. There are some ways to import a lot of data, as this question in the Answer Base describes. 

The Better News

You can write software scripts that use the OpenMRS API or web services

Migration Steps

A migration includes these steps, in any order. Most likely you will iterate through the steps several times as your OpenMRS implementation takes shape.

  • Plan where the data are going 
  • Export the data from the existing EMR
  • Normalize the data so all the needed values are available
  • Import the data with a script 

Understanding Where the Data Will Go

Background

Read the /wiki/spaces/RES/pages/26262844 to learn about Encounters, Concepts, and Observations, especially the /wiki/spaces/RES/pages/26290270 on implementation (EH202).

Examples

Let's walk through some examples. 

Imagine a very simple database with four tables: Outpatient Clinic, Patient Info, TB Lab Tests, and Diagnosis Codes. 

Patient Info, Outpatient Clinic, and TB Labs are electronic copies of paper forms. Wherever there's a blank on the form, that data is transcribed into a column in its table. 

Outpatient Clinic table

Outpatient Clinic is the information a clinician took when they treated a patient. Each row is from a patient visit. In OpenMRS, each row will become an Encounter. An Encounter stores all of the information about one clinical visit. 

Each Encounter will be linked to a Patient, a treatment provider (Person), and a Location. More about patients and providers below. A Location is a place where treatment was provided.  They're entered in Administration > Locations > Manage Locations. 

Each column in the table will become a Concept. A Concept is an idea that you're collecting data about. You'll need to make one Concept in the concept dictionary for each column. Concepts can refer to numbers (height, weight), text (complaint, remarks), yes/no (return visit recommended, is patient a smoker), dates (last vaccination date). To begin, make one Concept for each column.  

If you look at the create concept form in the concept dictionary, you'll notice the datatype dropdown. The ConceptDataType's purpose is to tell the database and API what kind of information this is so it can store it properly. There's no room for choice - you'll pick the type that matches what you're collecting. ConceptDataTypes are simpler than database datatypes. All numeric types are Numeric. All chars and varchars are Text. 

A Concept Class is a category. It's meant to help humans organize the concepts into sets. It can also be used in modules and scripts when you want to say "fetch all the concepts that are about ___". For example, in the HTML Form Entry module, you can show the user a dropdown that shows all the concepts of class Diagnosis. It's best to start with the predefined concept classes, but know that you can make your own as needed.

Each cell in the table corresponds to a blank on the paper form. It's the information the clinician gathered about an idea. These will become Obs, short for observation. An Obs is a bucket that holds several things: information (the value) about an idea (the Concept) that the Provider noted about the Patient. You'll make one Obs per cell. The Encounter holds all the Obs from a row, all the observations made during that treatment session. 

Sometimes we need to show that some Obs belong together. This is especially common when there are repeated blanks for the same information. A bacteriology lab form may state the kind of bacteria, an antibiotic, and the degree of resistance to that antibiotic. On paper, we show these belong together by putting them on the same row or drawing a box around them. In a database we might name them "bacteria 1, antibiotic 1, resistance 1, bacteria 2, antibiotic 2, resistance 2." We understand that they're related pieces of information. You'll have to choose how to implement this. The simple choice is to keep your old column names of "bacteria 1," "bacteria 2," etc. This is clunky but easy. The second choice is to make an Obs grouping, which is an Obs that holds other Obs. You'll make a new Concept for the grouping, like "Bacteria Profile." When you migrate, you'll create an Obs for Bacteria Profile to hold the group members, and assign the "bacteria," "antibiotic," and "resistance" Obs with the addGroupMember() method. That's more detail than needed at this stage. During planning, look for columns that belong together, and especially columns that could be confused if moved out of context. 

There's some additional information to preserve as Obs. It's good to label each encounter with the table it came from and the ID or row number so you can refer back to your old EMR.  Create a Concept for "Legacy Table" and "Legacy ID." The Encounter class makes it own internal ID numbers. Don't try to override them. You may wish to make a Concept and Obs for migration date. If you're importing from several EMRs ( e.g. you use the exact same EMR at 3 sites, and you're uniting them into one OpenMRS implementation), add a Concept and Obs for the site the record came from.

Outline of next sections:

Providers 

Bare minimum to create a Person is a gender 

If Outpatient table lists the provider, make one provider for each name

If not, create a "Staff" person - maybe one per ward (Outpatient Staff, ICU Staff)

Patient Info

Bare minimum to create a Patient is an identifier and a gender

If you're lucky, patients are listed in a separate table and have an ID assigned

Identifiers:  keep same scheme, or issue a new type?  If new type,  preserve the old identifier as a second patient identifier

If not lucky ...  ?   What is the plan for making sure you don't confuse two people with the same name? What are the tools for deduplicating? 

If gender unknown, use "U"

TB Lab Test

A lab is simply a  encounter. The provider is the lab technician.

Diagnosis Codes 

In this example, the diagnosis table was made to create a dropdown list for providers to choose from. It's a list of constants, not a list of observations. Make one Concept per row.  

Dropdown lists are a tipoff to a Question and Answer. Make a Concept for the question, and add these concepts as Answers. 

Misc

Real-world databases have more than 4 tables. Look for similarities across forms -  e.g. all Lab tables have the same four columns. Make this a header on the form

Best Practices

Preserve legacy info about what table and row/ID an encounter came from, what clinic if more than one EMR being migrated in

Don't use migration as the opportunity to make lots of additional changes - iterate and improve. Careful of scope creep. 

Back up often. 

Challenges and Difficult Choices

Missing or bogus data - fix or ignore? 

The data changes types - e.g. legacy EMR uses text fields for blood types. Blood types are a list of constants. On a new implementation we'd make Concepts and a coded Question/Answer. Do you change them now? Stay with text fields and make a dropdown in the form? Import existing as text, use Concepts going forward? What does that do to reporting, and does it matter?

Lots of columns in legacy table - many not used now, but there's data in some rows. Import all as-is, even empty columns. Make one form for displaying legacy data. Make new, simpler form reflecting what's currently collected.  

If your database is very normalized, you have to put the pieces together again before import. Ex:  the BloodTypeLab table uses a foreign key on the BloodTypeConstants table (A neg, B pos, etc) . When you run the import script, the obs needs to say "A neg", and not the ID number of the row in BloodTypeConstants. 


Resources

Data Migration

Spreadsheet Import Module

Spreadsheet Import Module Version 2