Chapter 5. Characters in Each Country
stateful
a subset of the 7bit version of ISO 2022, in which ASCII, JIS X 0201-1976 Roman, JIS X 0208-1978,
and JIS X 0208-1983 are supported.
7bit, which means the most significant bit (MSB) of each byte is always zero.
used for e-mail and net news, and preferred for HTML.
Defined in RFC 1468.
EUC-JP (Japanese version of Extended UNIX Code)
stateless
an implementation of EUC in which G0, G1, G2, and G3 are ASCII, JIS X 0208, JIS X 0201
Kana, and JIS X 0212 respectively. Many implementations cannot handle
JIS X 0201 Kana and JIS X 0212.
8bit
the preferred encoding for UNIX. For example, almost all Japanese message catalogs for
gettext are written in EUC-JP.
Japanese characters are mapped into the range 0xa0 - 0xff. This matters to programmers because
they need not worry about a fake '\' or '/' byte (either of which can be treated in a special way
in various contexts) appearing inside Japanese text.
SHIFT-JIS (aka Microsoft Kanji Code)
stateless
NOT a subset of ISO 2022
8bit
JIS X 0201 Roman, JIS X 0201 Kana, and JIS X 0208 can be expressed, but JIS X 0212
cannot.
The standard encoding for Windows/Macintosh. This makes SHIFT-JIS the most popular
encoding in Japan. Though MS is considering a transition to Unicode, it is
doubtful that this can be done successfully.
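The properties listed for the three encodings can be checked with a short Python sketch. It assumes CPython's standard codecs named iso2022_jp, euc_jp, and shift_jis; the helper names and the sample character are illustrative choices, not part of the original text.

```python
# Illustrative sketch: compare how one kanji is encoded in the three
# Japanese encodings discussed above. Helper names are hypothetical.

def is_seven_bit(data: bytes) -> bool:
    """True if the MSB of every byte is zero."""
    return all(b < 0x80 for b in data)

def has_fake_separator(data: bytes) -> bool:
    """True if any byte equals ASCII backslash (0x5c) or slash (0x2f)."""
    return any(b in b"\\/" for b in data)

SAMPLE = "\u8868"  # the kanji meaning "table"; a well-known troublemaker in SHIFT-JIS

encoded = {
    name: SAMPLE.encode(name)
    for name in ("iso2022_jp", "euc_jp", "shift_jis")
}

for name, data in encoded.items():
    print(
        f"{name:10s} bytes={data.hex(' ')} "
        f"7bit={is_seven_bit(data)} fake_sep={has_fake_separator(data)}"
    )
```

In this sketch the ISO-2022-JP bytes stay 7bit and begin with an escape sequence (the stateful part), the EUC-JP bytes all have the MSB set, and the second SHIFT-JIS byte coincides with ASCII '\', which is the hazard that EUC-JP avoids.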
ISO-2022-JP is a subset of the 7bit version of ISO 2022, in which only G0 is used and G0 is assumed to
be invoked into GL. The character sets included in ISO-2022-JP are:
ASCII (ESC 0x28 0x42),
JIS X 0201-1976 Roman (ESC 0x28 0x4a),
JIS X 0208-1978 (old JIS) (ESC 0x24 0x40), and