Chapter 4
Coded Character Sets And Encodings in the World
This chapter introduces the major coded character sets and encodings. Note that you do not need to
know the details of these character codes if you use LOCALE and wchar_t technology.
However, this knowledge will help you understand why the numbers of bytes, characters, and
columns should be counted separately, why strchr() and similar functions should not be used, why
you should use LOCALE and wchar_t technology instead of hard-coding the processing of existing
character codes, and so on.
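The distinction between bytes, characters, and columns can be sketched in C as follows. This is only a minimal illustration, not code from this document: the struct text_counts and the function count_text() are hypothetical names, and the sketch assumes a POSIX system where mbstowcs() accepts a NULL destination and wcswidth() is available. The caller is assumed to have called setlocale(LC_ALL, "") beforehand so that the string is interpreted in the user's encoding.

```c
#define _XOPEN_SOURCE 700   /* assumption: needed on some systems to declare wcswidth() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper type: the three counts are kept separately. */
struct text_counts {
    size_t bytes;    /* storage size of the multibyte string */
    size_t chars;    /* number of characters (wide characters) */
    int    columns;  /* display width, or -1 if a character is not printable */
};

/*
 * Hypothetical helper: fills *out for the multibyte string s, interpreted
 * in the encoding of the current locale.
 * Returns 0 on success, -1 if s is invalid in that encoding or memory runs out.
 */
int count_text(const char *s, struct text_counts *out)
{
    assert(s != NULL && out != NULL);

    /* With a NULL destination, mbstowcs() only counts characters (POSIX). */
    size_t n = mbstowcs(NULL, s, 0);
    if (n == (size_t)-1)
        return -1;

    wchar_t *w = malloc((n + 1) * sizeof *w);
    if (w == NULL)
        return -1;
    mbstowcs(w, s, n + 1);

    out->bytes   = strlen(s);       /* bytes, not characters */
    out->chars   = n;               /* characters, not columns */
    out->columns = wcswidth(w, n);  /* columns on a terminal */

    free(w);
    return 0;
}
```

For a pure ASCII string the three numbers coincide. In a multibyte locale they generally differ; for example, a CJK character typically takes more than one byte and is usually reported by wcswidth() as two columns wide.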
The variety of character sets and encodings also shows how people around the world have struggled
to handle their own languages on computers. CJK people in particular had to work out various
technologies to use large numbers of characters within ASCII-based computer systems.
If you are planning to develop text processing software that goes beyond the fields the LOCALE
technology covers, you will have to understand the following descriptions very well. These fields
include automatic detection of the encoding used for an input file (most Japanese-capable text
viewers, such as jless and lv, have this mechanism).
4.1 ASCII and ISO 646
ASCII is a CCS and an encoding at the same time. ASCII is a 7-bit code and contains 94 printable
characters, which are encoded in the range 0x21 - 0x7e.
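The range stated above can be checked with a few lines of C. This is a minimal sketch; the function names is_ascii_printable() and count_ascii_printable() are hypothetical and are not part of any standard library.

```c
/* Nonzero if c is one of the ASCII printable characters, 0x21 to 0x7e.
 * Space (0x20) and DEL (0x7f) fall outside this range. */
int is_ascii_printable(int c)
{
    return c >= 0x21 && c <= 0x7e;
}

/* Counts the printable characters among all 7-bit values 0x00 to 0x7f. */
int count_ascii_printable(void)
{
    int n = 0;
    for (int c = 0x00; c <= 0x7f; c++) {
        if (is_ascii_printable(c))
            n++;
    }
    return n;
}
```

Calling count_ascii_printable() returns 94, since 0x7e - 0x21 + 1 = 94.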
ISO 646 is the international standard version of ASCII. The following 12 characters of
0x23 (number),