How do I write Greek (or other non-ASCII/ISO 8859-1) characters to a database?

Joe Sam Shirah

From the standard JDBC perspective, there is no difference between ASCII/ISO 8859-1 characters and those above 255 (hex FF), because all Java characters are Unicode (unless you explicitly perform or request a different encoding). Implicit in that statement is the presumption that the datastore can handle characters outside the hex FF range or interprets different character sets appropriately. That means one of the following:

  • The OS, application and database use the same code page and character set. For example, a Greek version of NT with the DBMS set to the default OS encoding.
  • The DBMS has I18N support for Greek (or another language), regardless of OS encoding. This has been the most common case for production-quality databases, although support varies. Particular DBMSes may allow setting the encoding/code page/CCSID at the database, table or even column level. There is no standard for the level of support provided or for how the encoding is set. You have to check the DBMS documentation and set up the table properly.
  • The DBMS has I18N support in the form of Unicode capability. This would handle any Unicode characters and therefore any language defined in the Unicode standard. Again, setup is proprietary.
Note that a specific DBMS may provide some combination of these. Outside of those scenarios, you would, in general, have to handle the encoding yourself. For all intents and purposes, you would then have binary data in a character column. Aside from being error-prone, that approach loses virtually all benefits of a DBMS other than identity, structure, security and, presumably, reliability.
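
As a rough illustration of the two situations described above, the following Java sketch shows what the application side might look like. It is only a sketch under stated assumptions: the class name GreekTextSketch, the helper names, and the customer table with its name column are hypothetical, and whether the insert actually preserves the Greek text depends entirely on how the DBMS and column are set up. The encode/decode helpers illustrate the "handle encoding yourself" fallback, where the resulting bytes would presumably be stored with PreparedStatement.setBytes.

```java
import java.nio.charset.StandardCharsets;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class GreekTextSketch {

    // True if every char in the string fits in ISO 8859-1 (values 0..255).
    static boolean fitsLatin1(String s) {
        return s.chars().allMatch(c -> c <= 0xFF);
    }

    // Normal case: the column can hold the characters, so the JDBC code
    // is the same as for plain ASCII. Table and column names are
    // hypothetical; adjust them to your own schema.
    static int insertName(Connection conn, String name) throws SQLException {
        String sql = "INSERT INTO customer (name) VALUES (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, name); // the driver handles conversion to the DB encoding
            return ps.executeUpdate();
        }
    }

    // Fallback case: the application picks an encoding (UTF-8 here, as an
    // assumption) and must apply the same one on every read and write.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    static String decode(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Athena" in Greek letters, written with Unicode escapes so the
        // source file's own encoding does not matter.
        String greek = "\u0391\u03b8\u03ae\u03bd\u03b1";

        System.out.println(fitsLatin1("Athens")); // true
        System.out.println(fitsLatin1(greek));    // false

        byte[] raw = encode(greek);
        System.out.println(raw.length);                // 10 (two bytes per Greek letter)
        System.out.println(decode(raw).equals(greek)); // true
    }
}
```

The main method needs no database; insertName would be called with a Connection obtained from whatever driver you use. Note that in the fallback case the database sees only opaque bytes, so sorting, case-insensitive comparison and LIKE searches on that column would not behave as they do for real character data.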