UNICODE or ISO 10646

Thus UNICODE was invented. UNICODE is the joint work of a number of computer companies, standards organisations and bibliographic interests. It is currently maintained and developed by the UNICODE Consortium, an organisation based in California.

At much the same time as UNICODE was being developed, an ISO technical committee (ISO/IEC JTC1/SC2/WG2) was looking at the same problem and coming to much the same conclusions. The two solutions are now officially identical: UNICODE is now (and will remain) identical to the BMP (Basic Multilingual Plane) of ISO 10646. The BMP is the 16-bit portion of the code space defined by ISO 10646 (the 16-bit form is known as UCS-2). The standard also allows a 32-bit encoding (UCS-4), giving a code space well in excess of what is likely to be required.
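The relationship between the 16-bit and 32-bit forms can be sketched in a few lines of Python. This is only an illustration under stated assumptions: Python's "utf-16-be" and "utf-32-be" codecs stand in for the 16-bit and 32-bit forms (for BMP characters they produce the same values), and the helper names encoded_widths and in_bmp are hypothetical.

```python
# Sketch: every BMP character fits in one 16-bit unit, while the
# 32-bit form always spends four bytes per character.
# Assumption: "utf-16-be" and "utf-32-be" stand in for the 16-bit and
# 32-bit forms of ISO 10646; they agree with them for BMP characters.

def in_bmp(ch):
    """True if the character's code point lies in the 16-bit BMP range."""
    return ord(ch) <= 0xFFFF

def encoded_widths(ch):
    """Return (code point, bytes in 16-bit form, bytes in 32-bit form)."""
    return (ord(ch), len(ch.encode("utf-16-be")), len(ch.encode("utf-32-be")))

# Latin A, e with acute, Greek capital omega, a CJK ideograph.
samples = ["A", "\u00e9", "\u03a9", "\u4e2d"]

for ch in samples:
    cp, w16, w32 = encoded_widths(ch)
    print(f"U+{cp:04X} in BMP: {in_bmp(ch)}, 16-bit: {w16} bytes, 32-bit: {w32} bytes")
```

Each sample character occupies two bytes in the 16-bit form and four bytes in the 32-bit form, which is the trade-off between the two encodings described above.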

So for all practical purposes UNICODE and ISO 10646 are identical. In this paper I will continue to refer to both as UNICODE for convenience.

UNICODE is a character encoding system for all the world's characters. It is designed to allow computer systems to exchange text information unambiguously, because each character is encoded as a single code point.

That is all it is.
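A minimal sketch of the "one character, one code point" idea, using Python's standard unicodedata module. The function name describe is hypothetical and the sample string is chosen only for illustration.

```python
import unicodedata

def describe(text):
    """Return one (code point, character name) pair per character."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

# Latin A, e with acute, Greek capital omega (written as escapes
# so the example does not depend on the file's own encoding).
for code, name in describe("A\u00e9\u03a9"):
    print(code, name)
```

Whatever script a character comes from, it maps to exactly one number, and that number is all the encoding itself tells you about it.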

It is not any of the following:

a sorting sequence
a glyph definition
a string comparison mechanism
a text formatting control mechanism
a language definition
a character conversion mechanism.

Each of these functions is necessary for the complete handling of text documents, and standards exist for each of them. In many cases (as with character encoding) there are multiple standards, and these must be reconciled.
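To illustrate the first item in the list, the sketch below shows that ordering strings by code point is not the same as a dictionary ordering. The word list is made up, and rough_key is a hypothetical and deliberately crude stand-in for a real sorting standard (it only folds case and drops accents); it is not a proper collation algorithm.

```python
import unicodedata

words = ["apple", "Zebra", "\u00e9clair", "banana"]

# Plain sorting compares code points: capitals come before lower case,
# and accented letters come after the whole unaccented alphabet.
by_code_point = sorted(words)

def rough_key(word):
    """Crude, hypothetical sort key: drop accents, then fold case."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()

by_rough_key = sorted(words, key=rough_key)

print(by_code_point)
print(by_rough_key)
```

The first list puts "Zebra" first and the accented word last, while the second gives the order a reader of English would expect. The encoding supplies the numbers; the sorting rules have to come from somewhere else.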

As far as I know, there are no other universal 16-bit character encoding schemes, so UNICODE is the world's only choice.
