NOTE: Use OpenMRS version 1.6 or later. These versions have also been fixed: 1.5.1, 1.4 Beta 2 (aka 184.108.40.206), or 1.3.4. (See ticket:965, ticket:365, ticket:1539, and ticket:1943 for updates.)
Modify the database connection URL so that the driver knows UTF-8 characters are being passed.
Add this string to the connection.url property in your runtime properties file.
The connection will become something like this:
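As an illustration only: with MySQL Connector/J, the connection properties commonly used for this purpose are `useUnicode=true` and `characterEncoding=UTF-8`. The sketch below assumes those are the parameters meant here and shows how they attach to a JDBC URL. The host, port, and `autoReconnect` parameter are placeholders, and `withUtf8Params` is a hypothetical helper, not part of OpenMRS.

```java
final class ConnectionUrlSketch {

    // Assumed parameters (MySQL Connector/J); not taken from the original text.
    static final String UTF8_PARAMS = "useUnicode=true&characterEncoding=UTF-8";

    /**
     * Appends the UTF-8 parameters to a JDBC URL, using '?' or '&'
     * depending on whether the URL already has a query string.
     */
    static String withUtf8Params(String jdbcUrl) {
        if (jdbcUrl.contains("characterEncoding=")) {
            return jdbcUrl; // an encoding is already configured, leave it untouched
        }
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + UTF8_PARAMS;
    }

    public static void main(String[] args) {
        // Placeholder URL for a local database named "openmrs".
        System.out.println(withUtf8Params("jdbc:mysql://localhost:3306/openmrs?autoReconnect=true"));
    }
}
```

The resulting string is what would go into connection.url under these assumptions.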
From Tomcat UTF-8
Modify Tomcat's server.xml file and add the URIEncoding="UTF-8" attribute to the Connector tag.
The end result would be something like this:
<Connector port="8080" maxHttpHeaderSize="8192" ... URIEncoding="UTF-8" />
From MySQL: Configuring the Character Set
Make sure that your MySQL installation is set up to use UTF-8, which is not necessarily the default character set in a MySQL installation. To create the openmrs database so that its tables use a given international default character set and collation for data storage, use a CREATE DATABASE statement like this:
CREATE DATABASE openmrs DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;
Add the following lines to the "[mysqld]" and "[mysql]" sections of the MySQL configuration file, adding the sections if necessary. For Ubuntu, the file is located in /etc/mysql/my.cnf.
[mysqld]
character-set-server=utf8
collation-server=utf8_general_ci

[mysql]
default-character-set=utf8
These are the changes we had to make to our code:
Added the following line to OpenMRS web.xml:
<filter>
    <filter-name>charsetFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>
<filter-mapping>
    <filter-name>charsetFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>
Added the following as the first line in headerFull.jsp and headerMinimal.jsp:
<%@ page pageEncoding="UTF-8" contentType="text/html; charset=UTF-8" %>
To confirm the current character set, modify maintenance/systemInfo.jsp by adding these lines:
<tr>
    <td>Default Locale</td>
    <td><%= java.nio.charset.Charset.defaultCharset() %></td>
</tr>
Modify the AddPersonController.java: 23
From: return "?addName=" + name + ...
To: return "?addName=" + URLEncoder.encode(name, "UTF-8") + ...
Note that "name" must be explicitly encoded because the function returns the redirect URL to the client. Other occurrences of this pattern also need to be encoded where necessary.
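The effect of the change can be seen in a small standalone sketch. The class and method names here (`RedirectUrlSketch`, `addNameQuery`) are hypothetical and only mirror the controller's string concatenation; the sample name is made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class RedirectUrlSketch {

    /** Builds the query-string part of a redirect URL with the name encoded as UTF-8. */
    static String addNameQuery(String name) {
        try {
            // URLEncoder applies form encoding: spaces become '+', and non-ASCII
            // characters become percent-encoded UTF-8 bytes.
            return "?addName=" + URLEncoder.encode(name, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // prints ?addName=Jos%C3%A9+Mart%C3%ADnez
        System.out.println(addNameQuery("Jos\u00E9 Mart\u00EDnez"));
    }
}
```

Without the encoding step, the raw non-ASCII characters would be placed in the redirect URL as-is, and how they survive the round trip depends on the client and the container.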
You can now put Unicode characters directly into messages_XX.properties. You no longer have to type Unicode escape codes into strings (remember how these files used to be full of \u00E9, for example?). The key, however, is to make sure that your file editor saves the messages_XX.properties file as UTF-8. If you open messages_fr.properties in Eclipse and see garbage, change the default character encoding of your editor, or try a different application such as Notepad++.
The Java Properties class has two methods, load(InputStream) and store(OutputStream, String), that read and write ISO-8859-1 rather than UTF-8. They were used in a couple of places, most notably when loading message string files. There are now alternate methods in OpenmrsUtil that perform the same function but read and write using UTF-8.
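The OpenmrsUtil methods themselves are not reproduced here. As a minimal plain-JDK sketch of the same idea, assuming Java 8 or later, a properties stream can be decoded as UTF-8 by wrapping it in a reader before loading. `Utf8PropertiesSketch`, `loadUtf8`, and the sample key are hypothetical names for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class Utf8PropertiesSketch {

    /** Loads a .properties stream, decoding it as UTF-8 rather than ISO-8859-1. */
    static Properties loadUtf8(InputStream in) {
        Properties props = new Properties();
        // Properties.load(Reader) takes characters from the reader, so the
        // charset chosen for the InputStreamReader decides the decoding.
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // Simulates a UTF-8 encoded messages file held in memory.
        byte[] fileBytes = "general.cancel=Annul\u00E9\n".getBytes(StandardCharsets.UTF_8);
        Properties props = loadUtf8(new ByteArrayInputStream(fileBytes));
        System.out.println(props.getProperty("general.cancel"));
    }
}
```

Passing the same bytes to load(InputStream) would instead decode each UTF-8 byte as a separate ISO-8859-1 character, which is the garbling this change avoids.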
There is an interesting Java issue involving reading a UTF-8 file. Some applications (including Microsoft Notepad) add a special character called a byte order mark (BOM) to the beginning of a UTF-8 file, but Java's standard UTF-8 decoder does not strip it, so it shows up as an extra character (U+FEFF) at the start of the text.
A workaround that *appears* to fix the issue (I haven't done serious testing) is to wrap a file input stream in an org.apache.velocity.io.UnicodeInputStream instance, which is specifically designed to be BOM-aware. (There is also a BOMInputStream in Apache Commons IO, but only in a later version of the jar than the one included in OpenMRS 1.6.)
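For readers who want to see what BOM-aware reading involves, here is a minimal plain-JDK sketch. It is not the Velocity or Commons IO implementation, it handles only the UTF-8 BOM (bytes EF BB BF), and it assumes Java 8 or later. `BomSkipSketch`, `skipUtf8Bom`, and `decodeUtf8` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class BomSkipSketch {

    /** Returns a stream positioned after a leading UTF-8 BOM, if one is present. */
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int read = 0;
        // read() may return fewer bytes than requested, so loop until
        // three bytes have arrived or the stream ends.
        while (read < 3) {
            int n = pushback.read(head, read, 3 - read);
            if (n < 0) {
                break;
            }
            read += n;
        }
        boolean hasBom = read == 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF;
        if (!hasBom && read > 0) {
            pushback.unread(head, 0, read); // not a BOM: put the bytes back
        }
        return pushback;
    }

    /** Decodes raw bytes as UTF-8 text, dropping a leading BOM if present. */
    static String decodeUtf8(byte[] raw) {
        try (InputStream in = skipUtf8Bom(new ByteArrayInputStream(raw))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[256];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'k', '=', 'v'};
        System.out.println(decodeUtf8(withBom)); // prints k=v
    }
}
```

The same wrapper could be applied to a FileInputStream before the stream is handed to a UTF-8 reader, which is the role the library classes mentioned above play.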