Chapter 3
Important Concepts for Character Coding Systems
Character coding systems are one of the fundamental elements of software and information
processing. Without proper handling of character codes, your software is far from being
internationalized. The author therefore begins this document with a discussion of character codes.
This chapter introduces basic concepts such as coded character set and encoding. These terms
are needed to read this document and other documents on internationalization and character
codes, including Unicode.
3.1 Basic Terminology
This chapter begins by defining a few very important terms.
As many people point out, terminology in this area is confusing, because the same words are used
in many different ways. The author does not want to add new terminology to an already confusing
ocean of terms. Therefore, the terminology of RFC 2130
(http://www.faqs.org/rfcs/rfc2130.html) is adopted in this document, with one exception:
the term 'character set'.
Character A character is an individual unit of which sentences and text consist. A character is
an abstract notion.
Glyph A glyph is a specific instance of a character. 'Character' and 'glyph' are paired terms.
Sometimes a character has multiple glyphs (for example, '$' may have one or two vertical bars;
Arabic characters have up to four glyphs each; some CJK ideograms have many glyphs).
Sometimes two or more characters construct one glyph (for example, ligature of