Merge Concepts Module


Duplicate concepts may be introduced into a concept dictionary when new ways of modeling data are adopted, when multiple concept dictionaries are maintained, or when multiple users add concepts. The goal of this project is to make it possible to update all transactional and master data that references a concept identified as a duplicate so that it references the preferred concept instead.

Important Questions

Comments welcome!

How does the module handle observations and other references in a way that preserves data integrity?

When a concept is retired in favor of a new concept, its obs must be updated so that data is not invalidated. Data is preserved by retiring and recreating the obs. The now retired obs contains a message explaining the merge. The module also checks concepts' compatibility (see Concept Datatype Checking below) in order to prevent data loss through incompatible concepts being merged.
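The retire-and-recreate approach described above can be sketched as follows. This is a minimal, language-neutral model rather than the module's actual code: the `Obs` class, its fields, and the `merge_obs` function are hypothetical names chosen for illustration, and the wording of the retire message is an assumption.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class Obs:
    """Hypothetical, simplified stand-in for an observation row."""
    obs_id: int
    concept_id: int
    value: str
    retired: bool = False
    retire_reason: Optional[str] = None


def merge_obs(obs_list: List[Obs], duplicate_id: int, keep_id: int) -> List[Obs]:
    """Retire every active obs on the duplicate concept and recreate it
    on the concept to keep. Existing rows are never edited in place."""
    next_id = max((o.obs_id for o in obs_list), default=0) + 1
    result = []
    for o in obs_list:
        if o.concept_id == duplicate_id and not o.retired:
            # Assumed message format; the source only says a message explains the merge.
            reason = f"Merged concept {duplicate_id} into concept {keep_id}"
            result.append(replace(o, retired=True, retire_reason=reason))
            # The recreated obs carries the same value under the kept concept.
            result.append(Obs(next_id, keep_id, o.value))
            next_id += 1
        else:
            result.append(o)
    return result
```

Because the original obs is kept (retired) alongside its replacement, the history of what was recorded under the duplicate concept stays available after the merge.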

Can a merge be undone?

No. The user wanting to merge concepts should be confident the concepts are duplicates and familiar with the concept dictionary in general. Before the merge is executed, the user is provided with a preview of the data that will be affected by the merge and the options to return to the choose concepts page or continue.

What tables are impacted by the merge concepts module?

See: Tables Impacted by Merge Concept Module

Can a retired concept be merged?

Not currently. As long as the retired concept is not the "concept to keep," this could be considered for future iterations of the module.

How is the merge concepts module going to find other module tables referencing a concept?

The first version of the Merge Concepts Module (MCM) will not find module tables. For now, it will most likely publish an event to let other modules know about the update, and it would be up to those modules to handle it. In a future iteration, it could be a viable option to create an interface for other modules to implement, or to find such references via Hibernate or a text-matching search and notify the user of the findings. Even then, the module itself would not update the references.
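As a rough illustration of the event idea, the sketch below shows a publisher that notifies registered listeners when a merge happens, leaving each listener to update its own references. It is only a model under stated assumptions: `MergeEventPublisher`, `subscribe`, and `publish` are hypothetical names, and the actual event mechanism the module would use is not specified here.

```python
from typing import Callable, List

# A listener receives (duplicate_concept_id, kept_concept_id).
MergeListener = Callable[[int, int], None]


class MergeEventPublisher:
    """Hypothetical publisher: other modules register a callback and
    decide for themselves how to react to a merge."""

    def __init__(self) -> None:
        self._listeners: List[MergeListener] = []

    def subscribe(self, listener: MergeListener) -> None:
        self._listeners.append(listener)

    def publish(self, duplicate_id: int, keep_id: int) -> None:
        # The publisher does not touch any other module's data itself.
        for listener in self._listeners:
            listener(duplicate_id, keep_id)


# Example: a module that keeps its own list of concept ids.
module_refs = [10, 12, 12]


def update_module_refs(duplicate_id: int, keep_id: int) -> None:
    module_refs[:] = [keep_id if c == duplicate_id else c for c in module_refs]


publisher = MergeEventPublisher()
publisher.subscribe(update_module_refs)
publisher.publish(12, 11)
```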

What is special about concept answers?

If two concepts are used as answers to a question and are later determined to be duplicates, that question will have two of the same answer after the merge. For example, "What is patient's favorite color?" has answers Red (concept id = 10), Blue (concept id = 11), and Navy (concept id = 12). Before the merge, there is a formfield with a unique formfield id for each of the three answers, and there are three entries in the conceptAnswer table in the database. When concepts Blue and Navy are merged, the references to concept id 12 will be updated to 11, but there are still three formfields for the question "What is patient's favorite color?" and there are still three entries in the conceptAnswer table. This redundancy could also occur with drugs, programs, concept sets, person attribute types, and possibly others. The module will automatically delete duplicate concept answers because it is easy to determine whether two entries have matching question and answer concept ids. All other situations of this kind will be highlighted in the log so the user can handle them.
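The duplicate check for concept answers can be sketched as follows. This is an illustrative model, not the module's implementation: each concept answer is represented as a hypothetical `(question_concept_id, answer_concept_id)` pair, and the question id 1 in the example is made up.

```python
from typing import List, Tuple

# (question_concept_id, answer_concept_id)
ConceptAnswer = Tuple[int, int]


def merge_concept_answers(
    answers: List[ConceptAnswer], duplicate_id: int, keep_id: int
) -> List[ConceptAnswer]:
    """Repoint answers from the duplicate concept to the kept concept,
    then drop entries whose question and answer ids both match an
    entry already seen."""
    seen = set()
    merged = []
    for question_id, answer_id in answers:
        if answer_id == duplicate_id:
            answer_id = keep_id
        pair = (question_id, answer_id)
        if pair not in seen:
            seen.add(pair)
            merged.append(pair)
    return merged


# Favorite-color example: Red = 10, Blue = 11, Navy = 12, question id assumed to be 1.
answers = [(1, 10), (1, 11), (1, 12)]
print(merge_concept_answers(answers, duplicate_id=12, keep_id=11))
# prints [(1, 10), (1, 11)]
```

Formfields and the other reference types mentioned above are not covered by this check, which is why they are reported in the log instead.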

Google Summer of Code 2012 presentation materials


Here is the video that was made at the end of GSoC 2012 to demonstrate the module: 


Road Map

  1. Update API accessible concept references**
  2. Create a log of what has been updated and situations that may require further action by the user*

**Concept references:

Italics indicate some or all of the work has been done and now needs to be tested.

*Situations requiring further action by the user:

Requested features:

Technical Design

Concept Datatype Checking

The following error messages may be given for incompatible concepts to merge:

Release Notes

New Features

  • coming soon

Known Issues

Resources (CLOSED)