## Why bother with check digits?

The purpose of check digits is simple. Any time identifiers (typically numbers, with or without letters) are entered manually via keyboard, there will be errors. Inadvertent keystrokes or fatigue can cause digits to be rearranged, dropped, or inserted. Have you ever misdialed a phone number? It happens.

Check digits help to reduce the likelihood of errors by introducing a final digit that is calculated from the prior digits. Using the proper algorithm, the final digit can always be calculated. Therefore, when a number is entered into the system (manually or otherwise), the computer can instantly verify that the final digit matches the digit predicted by the check digit algorithm. If the two do not match, the number is refused. The end result is fewer data entry errors.


## What is the Luhn algorithm?

We use a variation of the Luhn algorithm. This algorithm, also known as the "modulus 10" or "mod 10" algorithm, is very common. For example, it's the algorithm used by credit card companies to generate the final digit of a credit card number.

Given an identifier, let's say "139", you travel right to left. Starting with the rightmost digit, every other digit is doubled and the remaining digits are taken unchanged. All resulting digits are summed, and the check digit is the amount necessary to take this sum up to a number divisible by ten.

Got it? All right, let's try the example.

Work right-to-left, using "139" and doubling every other digit.

9 x 2 = 18

3 = 3

1 x 2 = 2

Now sum all of the **digits** (note '18' is two digits, '1' and '8'). So the answer is '1 + 8 + 3 + 2 = 14' and the check digit is the amount needed to reach a number divisible by ten. For a sum of '14', the check digit is '6' since '20' is the next number divisible by ten.
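The worked example above can be sketched in a few lines of Python. This is only an illustration of the plain numeric Luhn calculation as described here; the function name `luhn_check_digit` is made up for this sketch and is not part of any OpenMRS code.

```python
def luhn_check_digit(identifier: str) -> int:
    """Return the Luhn (mod 10) check digit for a purely numeric identifier."""
    if any(char not in "0123456789" for char in identifier):
        raise ValueError("this sketch handles numeric identifiers only")

    total = 0
    # Walk right to left; index 0 is the rightmost digit.
    for index, char in enumerate(reversed(identifier)):
        digit = int(char)
        if index % 2 == 0:
            # Double the rightmost digit and every other digit after it,
            # then add the digits of the product (18 -> 1 + 8).
            doubled = digit * 2
            total += doubled // 10 + doubled % 10
        else:
            # The digits in between are taken unchanged.
            total += digit

    # Amount needed to reach the next multiple of ten (0 if already there).
    return (10 - total % 10) % 10


print(luhn_check_digit("139"))  # 6
```

The final `% 10` covers the case where the sum is already divisible by ten, so the check digit is 0 rather than 10.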

## Our variation on the Luhn algorithm

### Allowing for Letters

We have borrowed the variation on the Luhn algorithm used by Regenstrief Institute, Inc. In this variation, we allow for letters as well as numbers in the identifier (i.e., alphanumeric identifiers). This allows for an identifier like "139MT" that the original Luhn algorithm cannot handle (it's limited to numeric digits only).

Allowing letters (even when limited to capital letters) does not increase the accuracy of data entry. In fact, the potential for mistaking numbers and letters likely increases the chance of errors. In our case (Regenstrief with the AMPATH Medical Record System), we were forced to come up with a simple method for generating identifiers in disparate, disconnected locations without collisions (giving out the same number twice). Adding a 2-3 letter suffix to the identifier was our solution.

To handle alphanumeric digits (numbers **and** letters), we actually use the ASCII value (the computer's internal code) for each character and subtract 48 to derive the "digit" used in the Luhn algorithm. We subtract 48 because the characters "0" through "9" are assigned values 48 to 57 in the ASCII table. Subtracting 48 lets the characters "0" to "9" assume the values 0 to 9 we'd expect. The letters "A" through "Z" are values 65 to 90 in the ASCII table (and become values 17 to 42 in our algorithm after subtracting 48). To keep life simple, we convert identifiers to uppercase and remove any spaces before applying the algorithm.
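As a small illustration of the mapping just described, the following Python sketch normalizes an identifier and converts each character to its "digit" value. The helper names `normalize` and `char_value` are hypothetical and chosen only for this example.

```python
def normalize(identifier: str) -> str:
    """Remove spaces and convert to uppercase, as described above."""
    return identifier.replace(" ", "").upper()


def char_value(char: str) -> int:
    """Map a character to its Luhn 'digit': ASCII code minus 48."""
    return ord(char) - 48


# "0".."9" map to 0..9; "A".."Z" map to 17..42.
print([char_value(c) for c in normalize("139 mt")])  # [1, 3, 9, 29, 36]
```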

### Mod 25 and Mod 30

The idgen module supports additional algorithms, including Mod25 and Mod30 algorithms. These algorithms not only allow letters and numbers to be used throughout the identifier, but also allow the check "digit" to be a letter. Typically, letters that can easily be confused with numbers (B, I, O, Q, S, and Z) are omitted. In fact, the Mod25 algorithm omits both numbers and letters that look similar and can be confused with each other (0, 1, 2, 5, 8, B, I, O, Q, S, and Z); the Mod30 algorithm omits only the potentially confusing letters. The LuhnModNIdentifierValidator.java class contains the code that computes a check digit using "baseCharacters" as the set of possible characters for the identifier or check digit.
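The names Mod25 and Mod30 follow from simple counting: start with the 36 characters 0-9 and A-Z and remove the omitted ones. The Python sketch below derives the two character sets from the omission lists given above (eleven characters for Mod25, six letters for Mod30). It is an assumption that the result matches the "baseCharacters" strings actually used by the idgen module, which may differ in content or order; the names here are invented for illustration.

```python
import string

# All 36 candidate characters: digits followed by capital letters.
ALPHANUMERIC = string.digits + string.ascii_uppercase

# Omission lists taken from the paragraph above (assumed, not read from idgen).
MOD25_OMITTED = "01258BIOQSZ"  # look-alike digits and letters
MOD30_OMITTED = "BIOQSZ"       # look-alike letters only


def base_characters(omitted: str) -> str:
    """Return the allowed characters after removing the omitted ones."""
    return "".join(c for c in ALPHANUMERIC if c not in omitted)


mod25 = base_characters(MOD25_OMITTED)
mod30 = base_characters(MOD30_OMITTED)
print(len(mod25), mod25)  # 25 34679ACDEFGHJKLMNPRTUVWXY
print(len(mod30), mod30)  # 30 0123456789ACDEFGHJKLMNPRTUVWXY
```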

## Here's how we handle non-numeric characters

For the second-to-last (2nd from the right) character and every other (even-positioned) character moving to the left, we just add 'ASCII value - 48' to the running total. Non-numeric characters will contribute values greater than 10, but the digits of these values are **not** added together; rather, the value 'ASCII value - 48' (even if over 10) is added to the running total. For example, "M" is ASCII 77. Since '77 - 48 = 29', we add 29 to the running total, **not** '2 + 9 = 11'.

For the rightmost character and every other (odd-positioned) character moving to the left, we use the formula '2 x n - 9 x INT(n/5)' (where INT() rounds down to the nearest whole number) to calculate the character's contribution. If you use this formula on the numbers 0 to 9, you will see that it's the same as doubling the value and then adding the resulting digits together. For example, using 8: '2 x 8 = 16' and '1 + 6 = 7'. Using the formula: '2 x 8 - 9 x INT(8/5) = 16 - 9 x 1 = 16 - 9 = 7', which is identical to the Luhn algorithm. But using this formula allows us to handle non-numeric characters as well by simply plugging 'ASCII value - 48' into the formula. For example, "Z" is ASCII 90. '90 - 48 = 42' and '2 x 42 - 9 x INT(42/5) = 84 - 9 x 8 = 84 - 72 = 12'. So we add 12 (**not** '1 + 2 = 3') to the running total.
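The claim that the formula matches "double, then add the digits" for 0 to 9 is easy to check mechanically. Here is a short Python sketch; the function names are hypothetical, and Python's `//` integer division plays the role of INT() for the non-negative values used here.

```python
def doubled_contribution(n: int) -> int:
    """The formula from the text: 2 x n - 9 x INT(n/5)."""
    return 2 * n - 9 * (n // 5)


def digit_sum_of_double(d: int) -> int:
    """Classic Luhn step for a single digit: double it, then add the digits."""
    doubled = 2 * d
    return doubled // 10 + doubled % 10


# The two agree for every decimal digit.
for d in range(10):
    assert doubled_contribution(d) == digit_sum_of_double(d)

print(doubled_contribution(8))   # 7
print(doubled_contribution(42))  # 12  ("Z": 90 - 48 = 42)
```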

So, here's how we would use the Luhn algorithm for the identifier "139MT":

T (ASCII 84) -> 84 - 48 = 36 -> 2 x 36 - 9 x INT(36/5) = 72 - 9 x 7 = 72 - 63 = 9

M (ASCII 77) -> 77 - 48 = 29

9 x 2 = 18 -> 1 + 8 = 9, or using the formula: 2 x 9 - 9 x INT(9/5) = 18 - 9 x 1 = 18 - 9 = 9

3 = 3

1 x 2 = 2, or using the formula: 2 x 1 - 9 x INT(1/5) = 2 - 9 x 0 = 2

Summing the results we get '9 + 29 + 9 + 3 + 2 = 52'. The next number divisible by ten is 60. So, our check digit (the difference) is 8.
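Putting the pieces together, the steps above might be sketched in Python as follows. This is a minimal illustration written from the description on this page, not the Java, Python, or other implementations contributed by the authors credited below. The names `check_digit`, `is_valid`, and `VALID_CHARS` are assumptions for this example, as is the choice to accept only 0-9 and A-Z.

```python
# Assumed set of acceptable characters for this sketch.
VALID_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def check_digit(identifier: str) -> int:
    """Compute the modified mod 10 check digit for an alphanumeric identifier."""
    cleaned = identifier.replace(" ", "").upper()
    total = 0
    # Walk right to left; index 0 is the rightmost character.
    for index, char in enumerate(reversed(cleaned)):
        if char not in VALID_CHARS:
            raise ValueError(f"invalid character: {char!r}")
        value = ord(char) - 48  # "0".."9" -> 0..9, "A".."Z" -> 17..42
        if index % 2 == 0:
            # Rightmost and every other character: 2 x n - 9 x INT(n/5).
            total += 2 * value - 9 * (value // 5)
        else:
            # Characters in between contribute their value unchanged.
            total += value
    return (10 - total % 10) % 10


def is_valid(identifier_with_check_digit: str) -> bool:
    """Check that the final digit matches the computed check digit."""
    cleaned = identifier_with_check_digit.replace(" ", "").upper()
    if len(cleaned) < 2 or cleaned[-1] not in "0123456789":
        return False
    try:
        return check_digit(cleaned[:-1]) == int(cleaned[-1])
    except ValueError:
        return False


print(check_digit("139MT"))  # 8
print(is_valid("139MT8"))    # True
print(is_valid("193MT8"))    # False (two digits transposed)
```

For purely numeric identifiers this gives the same result as the standard Luhn algorithm, so "139" still yields a check digit of 6.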

## Java

**The modified mod10 algorithm implemented in Java**

## VBA

**The modified mod10 algorithm implemented in VBA**

Note:

This VBA algorithm should probably check each character and return an error if any invalid characters are found (as the Java example above does by throwing an exception).

## Groovy

**The modified mod10 algorithm implemented in Groovy**

## Python

**Implemented in Python, by Daniel Watson**

## Perl

**Implemented in Perl, by Steve Cayford**

## C#

**C# direct translation, by Yves Rochon**

## JavaScript

**Implemented in JavaScript, by Owais Hussain**

## Excel Formula

Input the number in cell "A1" and assign the formula below to cell "A2", which will give you the check digit.

**Implemented in MS Excel, by Owais Hussain**

## 11 Comments

## Steve Cayford

I wrote up a Perl version if you're interested...

## Michael Downey

Thank you, Steve! We'll add this to the page.

## Yves Rochon

C# direct translation

## Michael Downey

Thank you, Yves! We'll add this to the page.

## Ashvin N.

Hi Michael

We are trying to implement the Modulo 10 logic you suggested in SQL; however, ours differs a bit in the way the final digit is validated.

Please find the logic below.

I found the following piece of code written in Delphi (http://delphi.cjcsoft.net/viewthread.php?tid=48934). Could you help me with the logic used, as I have to implement it in SQL? Your help will be much appreciated.

Question: Recently, while completing my payments, I made a typo. Actually, I left out a digit (a zero in a sequence of 5) in the reference number I entered. But instead of my being warned, the number was accepted. The payment went the wrong way, causing me enough trouble that I decided to verify whether the software I'm using made a mistake or the wrong number was just valid by chance. I wrote my own checking procedure, presented here.

PS for the curious:

What did I take away from this case? I now use a digital reading pen to prevent this kind of (input) error!

Answer:

Thanks and Regards

Ashvin

## Burke Mamlin

This looks like it might be a useful algorithm Ashvin, especially because it can be implemented in SQL; however, it appears to be a completely different algorithm that comes up with different (incompatible) check digits compared to the Luhn Algorithm. Using the Luhn Algorithm, the check digit for 313947143000901 is 0, not 9 (from the example in your logic diagram).

In short, it appears that you have found another (not Luhn) modulo 10 algorithm for calculating a check digit. It would not be compatible with the check digits used by much of the OpenMRS community.

Cheers,

-Burke

## Ashvin N.

Thanks Burke, this is very much appreciated... I will check the feasibility of implementing this, or else will stick to the Luhn Algorithm (I will post my updates once finalized).

Thanks again for the quick response to my query

Ashvin

## Andrew Allen

T-SQL (Microsoft SQL Server) version:

## Roopam V.

Hi Michael

We are trying to implement the recursive Modulo 10 logic suggested by Ashvin N. in the comments above. Could you please help me with the logic used, as I have to implement it in Java?

Your help will be much appreciated.

Thanks,

./Roopam

## Justmade Yau

Here is the Pascal version:

Pascal Version

## Jonathan Cummins

I had to write a pl/sql version for something. I wrote it as a function but you could use it as a plain old procedure too if you wanted.