OmegaT+ Locales

Overview

OmegaT+ relies on a locale standard in its operation. The most basic use of a locale in OmegaT+ is to determine how to localize the user interface and which user documentation to select. This use is common to many applications that are enabled for multilingual operation; where such applications differ, it is most likely in the operating system or programming language implementation of the locale. The use of a locale that is specific to OmegaT+ and translation work involves the setup of certain aspects of translation projects and file handling.

Locale

A locale in OmegaT+ encompasses the system locale of the computing environment, but also involves internal usage within the program specifically used in translation projects and the processing of documents. As such, it is not exclusively concerned with the text contents of documents; text contents are treated as a separate and integral part related to the locale. The use of a locale in OmegaT+ is based on the Java Internationalization framework and is specified in terms of a locale code.

Locale Codes

Locale codes (also known as locale names or locale IDs) are based on locales as implemented in the underlying Java Internationalization framework. A locale code is essentially the combination of a language code and a country code, which many people refer to collectively as just the language code. The language code must be lowercase, the country code must be uppercase, and the two are connected by an underscore, as in fr_CA. There is also an optional variant code, the use of which is not defined by any standard and is left up to the application vendor to implement. At present, OmegaT+ does not use a variant code in conjunction with the locale code.
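The composition rule above can be sketched with the standard java.util.Locale class. This is only an illustration, not OmegaT+ code: the class name LocaleCodeExample and the helper localeCode are hypothetical. The two-argument Locale constructor is documented to normalize the language to lowercase and the country to uppercase; it is marked deprecated in recent JDK releases, hence the suppressed warning.

```java
import java.util.Locale;

final class LocaleCodeExample {

    /**
     * Builds a locale code (language + "_" + COUNTRY) from its two parts.
     * The Locale constructor normalizes the letter case of both parts.
     */
    @SuppressWarnings("deprecation") // constructor is deprecated in newer JDKs, still available
    static String localeCode(String language, String country) {
        return new Locale(language, country).toString();
    }

    public static void main(String[] args) {
        System.out.println(localeCode("fr", "CA")); // fr_CA
        System.out.println(localeCode("FR", "ca")); // fr_CA (case is normalized)
        System.out.println(localeCode("de", ""));   // de (country code omitted)
    }
}
```

No variant code is passed here, which matches the current behavior described above.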

It should be noted that a locale code precedes what many refer to as a language code (itself a form of locale code). The two are similar, but they differ in letter case, in the connective used (hyphen or underscore), and in the type and number of possible extra tags (subtags). In OmegaT+, a language code can be used without concern. This document uses the term locale code, rather than language code, for these similar things because that is what OmegaT+ presents to the user and uses internally.

In practice, there is no real confusion in this matter and it is not necessary to be too concerned about locale code versus language code or any internal details related to locale code usage. Just keep in mind that there are some differences and confusion may arise when terminology is mixed.
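To make the difference in connective and letter case concrete, the following sketch converts between the hyphenated language code form and the underscore locale code form. It uses Locale.forLanguageTag and Locale.toLanguageTag from the Java standard library (Java 7 and later). The class and method names are hypothetical, and nothing here is taken from the OmegaT+ source.

```java
import java.util.Locale;

final class LanguageTagExample {

    /** Converts a hyphenated language code such as "pt-BR" to a locale code such as "pt_BR". */
    static String toLocaleCode(String languageCode) {
        return Locale.forLanguageTag(languageCode).toString();
    }

    /** Converts a locale code such as "pt_BR" to a hyphenated language code such as "pt-BR". */
    static String toLanguageCode(String localeCode) {
        // Assumes a simple language[_COUNTRY] code with no variant.
        return Locale.forLanguageTag(localeCode.replace('_', '-')).toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(toLocaleCode("pt-BR"));   // pt_BR
        System.out.println(toLocaleCode("PT-br"));   // pt_BR (input case does not matter)
        System.out.println(toLanguageCode("pt_BR")); // pt-BR
    }
}
```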

Compatibility

There is no compatibility issue for end users to be concerned with, since the same locale codes and language codes used in similar applications can be used in OmegaT+.

Related Standards

Language

Defined by the Internet Engineering Task Force (IETF) in RFC 4646, combined with RFC 4647 (RFC 4646 obsoletes RFC 3066), and adopted by the World Wide Web Consortium for use in HTML and XML documents. This standard goes well beyond what OmegaT+ can provide. There are a number of subtags that can be used, but in many instances the language code and region (country) code should be sufficient.

OmegaT+ is oriented more towards RFC 3066 (replaces RFC 1766) at this time. In general, this uses a two-character language code and a two-character country code connected by a hyphen. Letter case is generally not significant, but certain implementations specify a particular case for the combination of language code and country code.

Locale

Defined in the ANSI C Standard (ISO 9899) and the Java Internationalization framework. Both use a two-character language code and a two-character country code connected by an underscore. The language code is lowercase. The country code is uppercase. A variant code is also specified, in a different way in each, but its use is not defined. All codes are optional.

Language Codes

Defined by ISO 639 (obsolete). Part 1 of the standard (639-1) defines two-character language codes and part 2 (639-2) defines three-character codes.

Country Codes

Defined by ISO 3166. Part 1 of the standard (3166-1) defines two-character (alpha-2) and three-character (alpha-3) country codes, along with numeric codes. Part 2 (3166-2) defines codes for country subdivisions.
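As a rough illustration of how these language and country codes surface in Java, the sketch below checks a two-character code against the lists returned by Locale.getISOLanguages and Locale.getISOCountries, and looks up the three-character equivalents with getISO3Language and getISO3Country. The class and method names are hypothetical; whether OmegaT+ performs any such validation is not stated in this document.

```java
import java.util.Arrays;
import java.util.Locale;

final class IsoCodeExample {

    /** Returns true if the code is among the two-character language codes known to Java. */
    static boolean isKnownLanguage(String code) {
        return Arrays.asList(Locale.getISOLanguages()).contains(code);
    }

    /** Returns true if the code is among the two-character country codes known to Java. */
    static boolean isKnownCountry(String code) {
        return Arrays.asList(Locale.getISOCountries()).contains(code);
    }

    public static void main(String[] args) {
        Locale locale = Locale.forLanguageTag("fr-CA");
        System.out.println(isKnownLanguage(locale.getLanguage())); // true
        System.out.println(isKnownCountry(locale.getCountry()));   // true
        System.out.println(locale.getISO3Language());              // fra
        System.out.println(locale.getISO3Country());               // CAN
    }
}
```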

Variant Codes

Not defined by particular standards or codes. Implementation and application dependent, where used. Not presently used in OmegaT+. Possible use in future.

Other Codes

Certain implementations make use of extra codes in addition to the usual language and country codes, not to mention the variant. These other codes are not used in OmegaT+ and are not supported by Java. No plans at present to implement use of these.