Spreadsheet Import Module

This module has been rewritten and the new documentation can be found here: Spreadsheet Import Module Version 2

The module, however, supports only the xls format which has a limit of 65k rows of data.

Overview

The Spreadsheet Import Module allows users allows users to import data via XLS files. It is related to a request submitted by Evan Waters (PIH Implementer in Malawi).

A second version has been released for OpenMRS 1.9, see the child page below for it.

Download

Release Notes

  • Version 0.2
    • Fixed not-null property references a null or transient value: org.openmrs.module.spreadsheetimport.SpreadsheetImportTemplateColumnColumn.columnImportFirst error after trying to create a template in some cases
  • Version 0.1
    • Initial release

Instructions

Overview

Spreadsheets are imported based on "Spreadsheet Import Templates." These templates specify the columns and pre-specified values that are present in the given spreadsheet.

Columns are the actual data elements from the spreadsheet (database columns) that are imported into the OpenMRS database.

Pre-specified values can be illustrated with a simple example. Let us assume that one of the columns in a spreadsheet is an Obs Value Numeric - an observed numeric value. According to the OpenMRS data model, in order to create a new row in the obs table, two columns which are linked to other tables must be specified. Specifically, a person for whom this is an observation, and a concept describing the exact type of observation, must be specified.

Note that if another column, such as "Person: Birthdate," is present in the template, the Spreadsheet Import Module will not require the specification of a person to create an observation. Rather, it will link the observation to the person created with the corresponding date of birth. If you also have, for example, a "Person Name: Given Name" column, the given name created will also reference the correct person.

Important Caveat: Currently, a patient is only created if any field references the patient table. For example, your spreadsheet template would need to contain a column that corresponds to the "Patient Identifier: Identifier" data in order to properly create a patient. Note that a person will be created if created tables reference the person table, such as a column that corresponds to "Person Name: Given Name." However, a patient will not be created. This is completely valid according to the OpenMRS data model. However, it is somewhat difficult to find this person using the OpenMRS user interface.

Step by Step Instructions

Before following any of the below steps, please:

  • Log in as the admin user
  • Click on the Administration link in the navigation bar at the top of the home page
  • Under the Spreadsheet Import Module heading, click on Spreadsheet Import Template List

Create a new spreadsheet import template:

  • Click on New under Spreadsheet Import Template List
  • Follow the steps under "Edit a spreadsheet import template" below

Select a spreadsheet import template to edit:

  • Click on the name of the template you would like to edit inside the Spreadsheet Import Template List
  • Follow the steps under "Edit a spreadsheet import template" below

Edit a spreadsheet import template:

  • Fill in the name and description of the template
  • For each column you would like to add:
    • Click "Add Column"
    • Fill in the column "Name." This should match the name in the header (first) row of your Excel spreadsheet.
    • Select the location in the OpenMRS database where the data from this column in the spreadsheet should be placed. The available options are in the "Data" dropdown, listed as "Table Name: Column Name" according to the OpenMRS data model.
    • If you would like the spreadsheetimport module to check for exact duplicates of this column and refuse to add the specified column if the value already exists, select "Disallow Duplicate Value."
    • "Dataset Index." This field can be clarified by a simple example. Let us assume that you would like to import a list of patients with several observed values. Each observed value corresponds to a different concept. In order to specify to the spreadsheetimport module that each observed value should correspond to a separate row in the observed value table, one must specify separate dataset indices. In other words, specify the dataset index "?1" for the first observed value, "2" for the second observed value, etc.
    • "Delete." If you would like to delete this column from the template, check this box. Note: in order for the column to actually be deleted, you must click the "Delete Selected Columns" button.
  • Once you are satisfied with the spreadsheet import template columns, click the "Next Step" button.
  • Here, you will have the possibility of entering "Pre-specified values." For a description of these, please see the Overview subsection of the Instructions above. If you do not see this section, then no pre-specified values need to be entered. For each pre-specified value:
    • "Table" specifies the table to which this pre-specified value corresponds (e.g., Concept).
    • "Value" allows you to select all possible values from a drop-down list. Please select one.
    • "Linked Columns" specifies the columns which require this pre-specified value. For example, if you specified an "Obs: Obs Value Numeric" column, the pre-specified value for the "Concept" table will display this "Obs: Obs Value Numeric" column as a "Linked Column." Specifically, you will see both the column name (as specified in the first row of the Excel spreadsheet being imported) and the "Data" in the OpenMRS data model to which this value will be written.
  • Once you are finished selecting all values for the pre-specified values, click "Save."

Import an Excel spreadsheet based on a spreadsheet template:

  • In the Spreadsheet Import Template List, select the Import link next to the name of the Spreadsheet Import Template you would like to use.
  • Click the Browse... button next to Spreadsheet to to upload: to select the Excel file you would like to upload. This file should be formatted in such a way that the first row consists of column names that exactly match those you entered in the Name field of the corresponding spreadsheet template.
  • Next to Sheet:, enter the name of the sheet that contains the data within the spreadsheet, or leave at the default "Sheet1."
  • "Rollback transaction." Only check this field if you would like to test reading the spreadsheet, and do not want the data actually imported.
  • Click Upload to begin the import process.

Delete a spreadsheet import template:

  • In the Spreadsheet Import Template List, check the checkbox next to the corresponding spreadsheet import template.
  • Click on "Delete."

Technical Notes

Implementation is at: 
http://tickets.openmrs.org/browse/TRUNK-1650

You will find the omod: 
http://tickets.openmrs.org/secure/attachment/34646/spreadsheetimport-0.1.omod

as well as a testdata.zip
http://tickets.openmrs.org/secure/attachment/34647/testdata.zip

To use the test data, go into the directory where you downloaded and type: 

unzip testdata.zip 

You will see four files created. 

testdata/testdata.sql - SQL file containing test spreadsheetimport templates 
testdata/testdata.xls - XLS file to be used with test spreadsheetimport templates 
testdata/import - shell script to import data from testdata.sql (assumes ~/.OpenMRS/openmrs-runtime.properties configured correctly) 
testdata/dump - shell script to export data into testdata.sql (assumes ~/.OpenMRS/openmrs-runtime.properties configured correctly) 

To quick start, load the pre-formulated spreadsheetimport template data: 

cd testdata 
./import 

(this has been tested on Ubuntu 10.04 amd64) 

Under Administration, you will see a Spreadsheet Import Module, with a Spreadsheet Import Tempate List link underneath. 

You can use this link to edit the module as well as to import data. Feel free to try both. 

Some notes: 

* The module has been designed to allow user to import into _any_ table.value inside the openmrs database 
* Currently duplicate value checking is handled in a simplistic fashion (each column can be marked as "Disallow Duplicate Values"; there could be potential problems for this, e.g., if multiple sites allow patients with the same Patient Identifier) 
* I have made the test data per Darius' ideas here: 
http://openmrs-mailing-list-archives.1560443.n2.nabble.com/Re-OpenMRS-Create-module-that-allows-users-to-import-data-via-XLS-files-td5434367.html#a5454907
Specifically, we import patient data along with two observations for the patient (both weights in this case). 

Small caveat: under Administration -> Manage Observations, I get a Javascript error when trying to view the Observations (on trunk). However, I tried adding new observations through the GUI and get the same exact values in the obs table, so I do not believe the problem lies in my code. Please excuse if this is indeed something I am missing. 

Potential changes/fixes: 
* messages.properties currently not used. I got sick of this after all my rewrites of the code. Once there is a good version that works well and everyone is happy with, it will prob be fairly easy to move messages here and/or translate. 
* current design is based on columns and "prespecified values" (e.g., concept). The current model uses dependencies on columns with other columns. In reality, this dependency really is on _tables_. I thought about this long and hard, but it seems to me, for an end user, the GUI is more straightforward if one thinks in terms of _columns_ within the spreadsheet, and not _tables_ in the OpenMRS database. Look forward to comments. 
* the current model is quite simple in terms of privileges. I _do not_ check privileges for individual tables. Rather, I use a Import Spreadsheet Import Template privilege. My take on this was that, probably, at each site, only a few users would be allowed to import spreadsheets. In any case, if I am mistaken, it is pretty easy to change to a table by table version. 

Other comments/questions: 
* I have looked for a good Hibernate tool for Eclipse. Hibernate synchronizer http://sourceforge.net/projects/hibernatesynch/ seems quite nice but is quite old and, although has the capability to produce hibernate XML files automatically! from a database, in practice this did not seem to get great results for me. I spent about 30-45 minutes trying to get Hibernate Tools http://www.hibernate.org/subprojects/tools.html to work, but: a) the 3.6.0 version on Eclipse fails to install for me (http://download.jboss.org/jbosstools/updates/development update site) gets stuck at Calculating Requirements and dependencies...; I even tried installing Eclipse 3.3 but Hibernate Tools did not seem to install cleanly on that). 
* All code is found at: 
http://svn.openmrs.org/openmrs-modules/spreadsheetimport/trunk/
along with test data. 

References

References