The place of the character set in a document


To clarify what character sets are for (and what they do not do), it is instructive to consider the layers of specification of a document stored within a computer.

Assume that the document is something like the word processor file for this paper:

Thus the character set comes right at the bottom of the tree. It is concerned only with capturing the content of the document's text. It is not concerned with the physical layout or typography of the document, nor with the document's actual semantics.

For completeness, it is also worth defining the following terms:

character
a single text element conveying sound, meaning, or both
character set
a fixed number of code points representing different characters
alphabet
a fixed number of characters necessary to write a language
typeface
a rendering of the shapes (or glyphs) for the characters of a character set
font
a minor visual variation of a typeface (such as size or bold)
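
The distinction between a character and its code point can be made concrete with a short sketch. The Python fragment below is only an illustration and is not part of the original paper: the sample word and the helper name in_ascii are assumptions chosen for the example, and Unicode, ASCII and Latin-1 simply stand in for "some character set". It shows that a character set assigns a number to each character and says nothing about typeface, font or layout.

```python
# Illustrative sketch: a character set maps characters to code points.
# The sample text and the helper name `in_ascii` are hypothetical.

text = "Caf\u00e9"  # the word "Café", with a precomposed e-acute

# Each character has a code point in the character set (here, Unicode).
code_points = [ord(ch) for ch in text]
print(code_points)  # [67, 97, 102, 233]


def in_ascii(ch: str) -> bool:
    """Return True if the character has a code point in 7-bit ASCII."""
    return ord(ch) < 128


# ASCII is a smaller character set: it has no code point for the e-acute.
print([in_ascii(ch) for ch in text])  # [True, True, True, False]

# Latin-1 (ISO 8859-1) does include it, at code point 233.
latin1_bytes = text.encode("latin-1")
print(list(latin1_bytes))  # [67, 97, 102, 233]
```

Nothing in these numbers records whether the word was set in bold or in a particular typeface; that information belongs to the higher layers of the document.
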
Having considered a little history and some definitions, we are now in a position to worry about the current state of affairs.