Converging on Custom Datatypes (Design)

Background

OpenMRS now has a few different mechanisms for adding custom datatypes to the system.  This wasn't planned; rather, it happened organically.  Our goal is to bring these mechanisms closer together in design and eventually combine them into a single approach that meets all needs.  First, we'll provide a brief history of how we got to this point.  Custom datatypes have been introduced in several parts of OpenMRS, including person attributes, complex observations, global properties, and visit attributes.

Person Attributes

When the person table was initially introduced into OpenMRS, all implementations were constrained to use the same attributes.  Then, a bright young Ugandan (Daniel Kayiwa) suggested that we introduce the same flexibility of an entity-attribute-value (EAV) approach into the person table as we had done with our observations in the obs table.  This approach allows implementations to virtually extend the person table to meet local needs.  For example, in Tanzania, they wanted to track "ten cells", which are a type of addressing used in Tanzania.  To accomplish this flexibility, we added a person_attribute table to contain implementation-specific data about persons and a person_attribute_type table to define the new "attributes" (virtual columns) for the person table.  Using this mechanism, an implementation in Tanzania could define a new person attribute for "ten cell" and use it to store data for each person in their system without having to alter the core data model.  Person attributes have been widely used & appreciated, since they allow OpenMRS to meet local needs through local extensions to the data model while allowing many implementations to continue sharing the core platform.

Complex Observations

In an unrelated effort, we had designed a method for handling complex observations.  One of the primary features of OpenMRS is to gather clinical observations about patients (pulse, weight, test results, answers to questions, etc.).  Sometimes these data can be more complex than a simple number or string: for example, a chest x-ray image, an electrocardiogram of the patient's heart rhythm, or a lab result for a blood culture that is delivered as a rich text document.  Complex observations were designed to accommodate these observational data, i.e., data that aren't primitive datatypes.

One of the benefits of migrating complex observations to a more generic "custom datatype" approach is that it allows OpenMRS to accommodate a larger set of datatypes for clinical data, which opens the door for OpenMRS to find common ground with OpenEHR archetypes.

Settings (Global Properties in 1.8 and earlier)

Internal settings and module configuration data are persisted within the database as "settings" (known as "global properties" in 1.8 and earlier).  To date, these properties have been stored as strings, leaving it up to the modules and services that store & retrieve the values to convert non-string values into string representations.  We have wanted a way to define various datatypes for settings (e.g., dates, numeric values, lists) and to standardize how a property's datatype, the UI widget used to manipulate it, its validation, etc. are defined.  Eventually, we'd like not only to have standard datatypes for settings, but also to allow modules to add new custom datatypes.

Visit Attributes, Location Attributes, Provider Attributes

For OpenMRS 1.9, we decided to add the same extensibility for the visit table as we had for the person table, since the metadata needed about patient visits can vary greatly from one implementation to another.  Given our positive experience with person attributes, we planned to introduce visit attributes from the very start along with the visit table.  However, having learned a bit from the use of person attributes, Darius designed visit attributes to be a bit more flexible, introducing an abstracted "handler" interface instead of relying solely on regular expression formatting. Due to this generalized design, we were able to add attributes to the location and provider objects with little extra effort.

Requirements for Custom Datatypes

There are several shared requirements for these various custom datatypes:

  • An abstracted handler interface
    • Validation
    • Choices
      • When applicable, the handler must provide a list of possible values.
      • When applicable, the handler should be able to return a list of possible values, given a search (query) string.
    • Return value with basic views (e.g., raw, uri)
  • A UI extension of handler interface
    • Rendering of data
    • User interface widget
  • The API should provide a String reference and an optional String display property for the handler to persist the value
    • For datatypes with large payloads (e.g., an image, video, document, etc.), the handler may provide a "key" to the value within an external data store.
    • For simple datatypes, the handler may embed the value within the reference.
  • The API should provide a String configuration property that the handler can use to manage configurations
  • The handler should be able to expose configuration settings 
    • Initially this can be presented as a textarea to the user when setting the handler and the value will be passed to the handler when setting it up.
    • Eventually, this can evolve into a map of properties or even into handlers being able to provide custom configuration pages.
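
The requirements above could be expressed as a single handler interface. The following is only a minimal sketch of that idea: the names CustomDatatypeHandler, CodedStringHandler, and every method on them are hypothetical and chosen for illustration; they are not the OpenMRS API. The example handler treats the configuration string as a comma-separated list of allowed values and embeds the value directly in the reference, as a simple datatype might.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: names and signatures are illustrative, not the OpenMRS API.
interface CustomDatatypeHandler<T> {
    // Receives the free-text configuration string stored with the attribute type.
    void setConfiguration(String configuration);

    // Throws IllegalArgumentException if the value is not acceptable.
    void validate(T value);

    // Converts a typed value into the String reference that gets persisted.
    String toReference(T value);

    // Converts a persisted reference back into a typed value.
    T fromReference(String reference);

    // Possible values, or an empty list when choices are not applicable.
    List<T> getChoices();

    // Possible values matching a search (query) string.
    List<T> searchChoices(String query);
}

// Example handler: a string restricted to the comma-separated values in the configuration.
class CodedStringHandler implements CustomDatatypeHandler<String> {
    private final List<String> allowed = new ArrayList<>();

    @Override
    public void setConfiguration(String configuration) {
        allowed.clear();
        if (configuration == null) {
            return;
        }
        for (String item : configuration.split(",")) {
            String trimmed = item.trim();
            if (!trimmed.isEmpty()) {
                allowed.add(trimmed);
            }
        }
    }

    @Override
    public void validate(String value) {
        if (value == null || !allowed.contains(value)) {
            throw new IllegalArgumentException("Not an allowed value: " + value);
        }
    }

    @Override
    public String toReference(String value) {
        validate(value);
        return value; // simple datatype: the value is embedded in the reference
    }

    @Override
    public String fromReference(String reference) {
        return reference;
    }

    @Override
    public List<String> getChoices() {
        return new ArrayList<>(allowed);
    }

    @Override
    public List<String> searchChoices(String query) {
        String needle = query.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String choice : allowed) {
            if (choice.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(choice);
            }
        }
        return matches;
    }
}
```

A handler for a large payload (an image, say) would keep the same interface but return a key into an external store from toReference instead of the value itself. UI concerns (rendering, widgets) are deliberately left out of this sketch, since they would belong in a separate UI extension of the handler.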

Design Considerations

  • For complex observations, we adopted a convention of displayable string + delimiter + reference key to ensure that a displayable version would be available even when values are retrieved in bulk and it's not feasible to call the handler to render every value.  If we treat absence of the delimiter as displayable == reference key and include a mechanism to escape the delimiter, then this approach could be used across all custom datatypes.
  • We'd like to maintain our separation of data & web layers, meaning that the handler interface may need to be separated into Handler and WebHandler interfaces (as Darius has done with visit attributes).  In the future, the majority of OpenMRS instances may not be using the web application, so we should strive to maintain as much non-web-dependent functionality within the API layer as possible.
  • To what extent can we use or learn from the earlier fieldGen work?
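
The "displayable string + delimiter + reference key" convention from the first consideration can be sketched as a small codec. This is an illustration under stated assumptions, not existing OpenMRS code: the class name DisplayableReference is hypothetical, and the choice of '|' as the delimiter and '\' as the escape character is assumed here only for the example. A stored value with no unescaped delimiter is treated as display == reference.

```java
// Hypothetical sketch of the display + delimiter + reference convention.
// The delimiter '|' and escape character '\' are assumptions for this example.
final class DisplayableReference {
    static final char DELIMITER = '|';
    static final char ESCAPE = '\\';

    final String display;
    final String reference;

    DisplayableReference(String display, String reference) {
        this.display = display;
        this.reference = reference;
    }

    // Prefixes every delimiter or escape character with the escape character.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c == DELIMITER || c == ESCAPE) {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Produces the single string that would be persisted.
    String encode() {
        if (display.equals(reference)) {
            return escape(reference); // no delimiter: display == reference
        }
        return escape(display) + DELIMITER + escape(reference);
    }

    // Splits on the first unescaped delimiter and removes escape characters.
    static DisplayableReference decode(String stored) {
        StringBuilder current = new StringBuilder();
        String displayPart = null;
        boolean escaped = false;
        for (char c : stored.toCharArray()) {
            if (escaped) {
                current.append(c);
                escaped = false;
            } else if (c == ESCAPE) {
                escaped = true;
            } else if (c == DELIMITER && displayPart == null) {
                displayPart = current.toString();
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        String referencePart = current.toString();
        return new DisplayableReference(
                displayPart == null ? referencePart : displayPart, referencePart);
    }
}
```

With this scheme, bulk retrieval only needs decode(stored).display to show something readable; the handler is called only when the full value behind the reference is needed.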

Rendering Values

Part of the job of the handler is to handle rendering. The intent was that an application (OpenMRS, module, etc.) would call the API asking for the value rendered for a specific "view" (where the view could be things like raw, thumbnail, link, html, json, etc.). View names would be somewhat standardized (e.g., some provided by core like raw, thumbnail, link, html but others invented by modules like flash, foobar, ...) and handlers would be able to report which views they supported. So, the API sends the view and value reference to the handler, and the handler takes care of rendering the result and returning it. This model allows the API to support views without having to bind itself to web-based views and leaves the rendering to the handler.
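
As a rough sketch of this model, a handler could report its supported views and render a value reference for a requested view name. The interface ViewRenderer, the class ImageReferenceRenderer, and the base URL are all hypothetical and exist only for this example; the view names raw, link, and html are taken from the list above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: a handler reports its views and renders a reference per view.
interface ViewRenderer {
    Set<String> getSupportedViews();

    // Returns the rendered value; throws UnsupportedOperationException for unknown views.
    String render(String view, String reference);
}

// Example: the reference is a key to an image held in an external store.
// The key is assumed to be URL-safe; no escaping is done in this sketch.
class ImageReferenceRenderer implements ViewRenderer {
    private final String baseUrl; // assumed location of the external store

    ImageReferenceRenderer(String baseUrl) {
        this.baseUrl = baseUrl;
    }

    @Override
    public Set<String> getSupportedViews() {
        return new LinkedHashSet<>(Arrays.asList("raw", "link", "html"));
    }

    @Override
    public String render(String view, String reference) {
        switch (view) {
            case "raw":
                return reference;
            case "link":
                return baseUrl + reference;
            case "html":
                return "<img src=\"" + baseUrl + reference + "\"/>";
            default:
                throw new UnsupportedOperationException("Unsupported view: " + view);
        }
    }
}
```

A caller would check getSupportedViews() before asking for a module-defined view such as flash, and fall back to raw when the view is not offered. Nothing in the interface depends on a web layer; the html view is just one string-producing view among others.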

For web support of rendering, it would be nice to have a web-aware extension of handlers, e.g., a method for providing supporting JavaScript (if a handler is being used, it could inject some JavaScript on the page once, even if there are 30 of its thumbnails being rendered on the page).

Notes

http://notes.openmrs.org/2011-06-01-Custom-Datatypes

http://notes.openmrs.org/Design-Forum-2011-09-14