|
ISO 10646 Questions and Answers
HKSCS Questions and Answers
ISO 10646 Questions and Answers
- What is a coding standard?
A coding standard is a mechanism to assign computer internal codes to
different characters so that computers can process and display these
characters.
|
- What are the coding standards for
Chinese characters being used in different regions?
Both the ISO 10646 and Big-5 coding standards are used in Hong Kong,
where the ISO 10646 standard is being actively promoted. The GB (Guo
Biao) coding standard is used in Mainland China. The CNS code is
Taiwan's coding standard, although Big-5 coding is also commonly used
there.
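To illustrate how one character receives different internal codes under different coding standards, the short sketch below uses Python's built-in codecs. The sample character and the codec names (`big5`, `gb2312`, `utf-8`) are illustrative choices and are not part of the original answer.

```python
# Encode one Chinese character under three coding standards.
# The sample character and the codec names are illustrative choices.
def encode_hex(ch: str, codec: str) -> str:
    """Return the bytes of `ch` under `codec` as an uppercase hex string."""
    return ch.encode(codec).hex().upper()

sample = "中"  # ISO 10646 code point U+4E2D

for codec in ("big5", "gb2312", "utf-8"):
    print(f"{codec:8s} {encode_hex(sample, codec)}")

print(f"ISO 10646 code point: U+{ord(sample):04X}")
```

The same character is stored as different byte sequences under each standard, which is why text must be read with the standard it was written in.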
|
- What are the benefits of unifying
coding standards?
With a unified coding standard, computers are capable of accurately
processing and displaying electronic information in different
languages. Users no longer need conversion tools to handle electronic
information encoded in different coding standards. Distortion of
information can be reduced during electronic communication, thus
facilitating the exchange of electronic information across geographical
areas.
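The distortion mentioned above can be reproduced with a small sketch: text written under one coding standard and read under another does not come back intact. The helper name `misread` and the sample text are hypothetical and chosen only for illustration.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode `text` with one codec, then decode the bytes with another."""
    raw = text.encode(written_as)
    # errors="replace" substitutes U+FFFD for bytes the reader cannot map
    return raw.decode(read_as, errors="replace")

original = "中文"

# Written in Big-5 but read as GB2312: the result differs from the original.
garbled = misread(original, "big5", "gb2312")
print(garbled == original)

# Written and read with the same (unified) encoding: the text survives.
print(misread(original, "utf-8", "utf-8") == original)
```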
|
- How does a unified coding standard
benefit the development of a common Chinese language interface?
With a unified coding standard, computers in different parts of the
world can display electronic information encoded in the same coding
standard. Computers in Mainland China, Hong Kong and Taiwan can become
capable of accurately displaying electronic information in traditional
Chinese, simplified Chinese and Chinese characters specific to Hong
Kong. Users no longer need to use different coding standards for the
different sets of Chinese characters, thus avoiding the problems in
electronic communication conducted in Chinese.
|
- What is the International
Organization for Standardization (ISO)?
The ISO is a non-governmental organization established in 1947 (http://www.iso.ch/).
It comprises members from more than 140 countries. Its mission is to
develop international standards that facilitate exchange in various
areas (e.g. trade, information and technology) among different parts of
the world.
|
- What is the ISO 10646 standard?
ISO 10646 is an international coding standard developed under the aegis
of the International Organization for Standardization (ISO). It encodes
the characters of the major languages of the world into a common
character set.
|
- When was the ISO 10646 standard
released?
The ISO released the first version of the ISO 10646 standard in 1993.
It was called ISO/IEC 10646-1:1993. In 2000, the ISO released ISO/IEC
10646-1:2000, which is an updated version of ISO/IEC 10646-1:1993.
ISO/IEC 10646-1:2000 contains 27,484 ideographic characters, consisting
of the 20,902 ideographic characters of ISO/IEC 10646-1:1993 plus 6,582
newly defined ideographic characters in Extension A.
In November 2001, the ISO released ISO/IEC 10646-2:2001 as a supplement
to ISO/IEC 10646-1:2000. ISO/IEC 10646-2:2001 contains 42,711 newly
defined ideographic characters in Extension B, bringing the total
number of ideographic characters in the ISO 10646 standard to more than
70,000. All the characters in the Kangxi Dictionary, Hanyu Dazidian and
Hanyu Dacidian are now included in the ISO 10646 standard.
In April 2004, the ISO published ISO/IEC 10646:2003, a single
publication that merges ISO/IEC 10646-1:2000 and ISO/IEC 10646-2:2001.
The ideographic characters in ISO/IEC 10646:2003 are therefore the same
as those in ISO/IEC 10646-1:2000 and ISO/IEC 10646-2:2001 combined. In
December 2008, the ISO published Extension C in ISO/IEC 10646:2003/Amd
5:2008. Extension C contains 4,149 additional ideographic characters.
In October 2009, the ISO published ISO/IEC 10646:2003/Amd 6:2009.
|
- What is the current development
status of the ISO 10646 standard?
Ideographic characters refer to those characters with appearance
related to the meaning of the characters, such as the Han characters.
Beyond the original set of ideographic characters, the inclusion of
further ideographic characters into the ISO 10646 standard has been
carried out in three phases, namely Extension A, Extension B and
Extension C. They were released as part of ISO/IEC 10646-1:2000,
ISO/IEC 10646-2:2001 and ISO/IEC 10646:2003/Amd 5:2008 respectively.
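The character counts quoted in this FAQ can be cross-checked against the code point ranges of the ideograph blocks. The ranges below are not stated in the FAQ; they are taken from the published code charts as the blocks were first defined (later amendments appended a few characters to some blocks), so treat them as an assumption of this sketch. The helper names are hypothetical.

```python
# Code point ranges (inclusive) of the ideograph blocks as first published.
# These ranges are an assumption of this sketch, not part of the FAQ.
BLOCKS = {
    "CJK Unified Ideographs": (0x4E00, 0x9FA5),
    "Extension A": (0x3400, 0x4DB5),
    "Extension B": (0x20000, 0x2A6D6),
    "Extension C": (0x2A700, 0x2B734),
}

def block_size(name: str) -> int:
    """Number of code points in the named block."""
    low, high = BLOCKS[name]
    return high - low + 1

def block_of(ch: str):
    """Return the block name containing `ch`, or None if it is in none of them."""
    code = ord(ch)
    for name, (low, high) in BLOCKS.items():
        if low <= code <= high:
            return name
    return None

for name in BLOCKS:
    print(f"{name:24s} {block_size(name):6d}")

total = sum(block_size(name) for name in BLOCKS)
print(f"{'Total':24s} {total:6d}")
print(block_of("中"))
```

The block sizes agree with the figures given in the answers above: 20,902 for the original set, 6,582 for Extension A, 42,711 for Extension B and 4,149 for Extension C.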
|
- What is an ideographic character?
The International Organization for Standardization categorizes
characters from different regions of the world by their
characteristics. Ideographic characters are characters whose appearance
is related to their meaning. Examples are the Han characters, used
mainly in East and South East Asian countries or territories such as
Mainland China, Hong Kong, Macao, Taiwan, Japan, South Korea, North
Korea, Vietnam and Singapore.
|
- What is the Ideographic Rapporteur
Group (IRG)?
The IRG is a working group under the International Organization for
Standardization. Its mission is to develop the repertoire of
ideographic characters in the ISO 10646 standard. The IRG has developed
the CJK Unified Ideographs Block, the Extension A Block, the Extension
B Block and the Extension C Block.
|
- Which countries are members of the
Ideographic Rapporteur Group?
IRG members include Mainland China, Hong Kong, Macao, Taipei Computer
Association, Singapore, Japan, South Korea, North Korea, Vietnam and
USA. Representatives from the Unicode Consortium also attend IRG
meetings to coordinate the synchronization between the ISO 10646
standard and Unicode.
|
- What is Unicode?
Unicode is a character coding system designed by the Unicode Consortium
to support the interchange, processing and display of the written texts
of many languages in the world. The Unicode Consortium comprises mainly
hardware and software vendors.
|
- What is the relationship between
Unicode and the ISO 10646 standard?
In 1991, the ISO and the Unicode Consortium decided to cooperate in
defining a universal coding standard for multilingual texts. Since
then, the two organizations have been working very closely to extend
the ISO 10646 standard and Unicode, and to keep them synchronized. The
ISO publishes the characters and code points of the ISO 10646 standard,
while the Unicode Consortium supplements them with implementation
algorithms and semantic information.
The ISO 10646 standard and the corresponding version of Unicode are
code-to-code identical. Unicode can be regarded as the implementation
version of the ISO 10646 standard. Therefore, products supporting
Unicode also support the ISO 10646 standard.
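As a small illustration of the "semantic information" that Unicode adds on top of the shared code points, the sketch below queries Python's standard `unicodedata` module, which ships the Unicode Character Database. The helper name `describe` and the sample character are illustrative choices.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Return the code point plus two Unicode properties of a character."""
    return {
        "code_point": f"U+{ord(ch):04X}",          # shared with ISO 10646
        "name": unicodedata.name(ch),              # character name
        "category": unicodedata.category(ch),      # general category
    }

info = describe("中")
for key, value in info.items():
    print(f"{key}: {value}")
```

The code point is the part common to both standards; properties such as the general category (`Lo`, "Letter, other", for Han characters) come from the Unicode side.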
|
- What is ISO 10646 Extension B and
what benefit does it bring?
Like the CJK Unified Ideographs block and Extension A, Extension B
contains Chinese characters whose appearance is related to their
meaning. ISO 10646 Extension B was released as part of ISO/IEC
10646-2:2001 in November 2001.
The inclusion of Extension B brings the total number of ideographic
characters in the ISO 10646 standard to more than 70,000, including all
the characters of the Kangxi Dictionary, Hanyu Dazidian and Hanyu
Dacidian. As a result, more Chinese characters are available for use
electronically.
|
- Is the new version of ISO 10646 and
HKSCS (collectively referred to as the "Standard") backward compatible
with its corresponding old version?
The new version of the "Standard" is backward compatible with its
corresponding old version. However, in respect of software
implementation, characters newly included in the new version of the
"Standard" may not be properly viewed or displayed on software
platforms that support only a previous version of the "Standard". In
addition, existing software applications that support only a previous
version of the "Standard" may not be able to handle the newly included
characters properly, including those HKSCS characters with code points
assigned by the ISO in the new version of the "Standard".
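One plausible reason for such problems, offered here as an assumption rather than something the FAQ states, is that Extension B characters have code points above U+FFFF. In UTF-16 they occupy two 16-bit code units (a surrogate pair), so software written on the assumption of one unit per character may mishandle them. The helper names below are hypothetical.

```python
def utf16_units(ch: str) -> int:
    """Number of 16-bit code units needed to store `ch` in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2

def is_supplementary(ch: str) -> bool:
    """True if `ch` lies beyond the Basic Multilingual Plane (above U+FFFF)."""
    return ord(ch) > 0xFFFF

bmp_char = "中"               # U+4E2D, in the original ideograph block
ext_b_char = "\U00020000"     # first code point of Extension B

for ch in (bmp_char, ext_b_char):
    print(f"U+{ord(ch):04X} supplementary={is_supplementary(ch)} "
          f"utf16_units={utf16_units(ch)}")
```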
When users encounter problems in handling Chinese characters in the
course of using GovHK Online Services, they may make reference to the FAQ section.
|

HKSCS Questions and Answers
- Why do we need to develop the Hong
Kong Supplementary Character Set (HKSCS)?
The aim of developing the HKSCS is to define a character set which
contains special Chinese characters that are commonly used in Hong Kong
and are required by the Government and the public in electronic
communication. The HKSCS supplements the character set of the
ISO/IEC 10646 standard.
One of the initiatives under the
Government's "Digital 21" Strategy for IT development is to develop an
open and common Chinese language interface in the HKSAR for users who
prefer to communicate electronically in Chinese. A pivotal element of
this initiative is the adoption of the ISO 10646 standard and the HKSCS
to address the problems arising from the existence of different coding
standards and insufficient characters in some Chinese character sets
used on computers.
Many popular operating systems,
database software, office automation (OA) suites, web browsers, e-mail
clients and input devices already support the ISO 10646 standard.
Hence, computer users could make use of HKSCS characters when they use
products that are compliant with the ISO 10646 standard.
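A minimal sketch of what this means in practice, using Python's built-in codecs: a character that belongs to the HKSCS rather than to plain Big-5 can still be handled through an ISO 10646 encoding such as UTF-8. The sample character 啲 (a Cantonese character, U+5572) and the helper name `can_encode` are illustrative choices, and Python's `big5hkscs` codec may not correspond to the latest HKSCS release.

```python
def can_encode(ch: str, codec: str) -> bool:
    """Return True if `ch` can be represented in the given codec."""
    try:
        ch.encode(codec)
    except UnicodeEncodeError:
        return False
    return True

sample = "啲"  # U+5572, assumed here to be an HKSCS character

for codec in ("big5", "big5hkscs", "utf-8"):
    print(f"{codec:10s} {can_encode(sample, codec)}")
```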
|
- When was the Hong Kong
Supplementary Character Set (HKSCS) developed?
The Government of the Hong Kong Special Administrative Region developed
the HKSCS in collaboration with the Chinese Language Interface Advisory
Committee (CLIAC) in September 1999. In December 2001, the Government
released Hong Kong Supplementary Character Set - 2001
(HKSCS-2001). In May 2005, the Government
released Hong Kong Supplementary Character Set - 2004
(HKSCS-2004). The latest version of the HKSCS is the
HKSCS-2008 that was published in December 2009.
|
- What contents are updated in the HKSCS?
The HKSCS-2008 includes 68 new characters. It expands and replaces the
HKSCS-2004. HKSCS-2008 is also technically aligned with ISO/IEC
10646:2003 and its Amendments 1 to 6, which include all characters of
the HKSCS-2008. The code allocations of such characters in the
HKSCS-2008 have been adjusted accordingly.
The HKSCS-2004 includes 123 new characters. It expands and replaces the
HKSCS-2001. HKSCS-2004 is also technically aligned with ISO/IEC
10646:2003 and its Amendment 1, released by the International
Organization for Standardization in April 2004 and November 2005
respectively. ISO/IEC 10646:2003 and its Amendment 1 include all
characters of the HKSCS-2004. The code allocations of such characters
in the HKSCS-2004 have been adjusted accordingly.
The HKSCS-2001 includes 116 new characters. It expands and replaces the
HKSCS released in 1999. HKSCS-2001 is also technically aligned with
ISO/IEC 10646-2:2001, released by the International Organization for
Standardization in November 2001. ISO/IEC 10646-2:2001 includes 1,622
additional characters from the HKSCS. The code allocations of such
characters in the HKSCS-2001 have been adjusted accordingly.
|
- What characters are included in the
HKSCS?
Breakdown of HKSCS-2008 characters
The HKSCS-2008 contains a total of 5,009 characters, most of which are
Chinese characters (4,568 characters or 91%). Many of these characters can
be found in major dictionaries (including Kangxi Dictionary, Hanyu
Dazidian, Hanyu Dacidian and Zhonghua Zihai). Since an individual Chinese
character can be used for different purposes, there is no detailed
classification of the characters in the HKSCS-2008. Nevertheless, these
characters can be broadly classified into the following categories:
Proper names, such as names of persons, places or companies
Characters used in the Cantonese dialect
Scientific terms
Radicals and shapes
These characters are mainly proposed by government departments (such as the
Companies Registry, the Department of Justice, the Hong Kong Police Force,
the Immigration Department, the Inland Revenue Department, the Judiciary,
and the Lands Department), academic bodies, educational institutions and
members of the public.
The remaining 441 characters are various symbols, such as components of
Chinese characters, Hanyu Pinyin symbols, International Phonetic Alphabet
symbols, Japanese Katakana and Hiragana, etc.
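The figures in this breakdown, together with the per-release additions quoted in the previous answer, can be checked with simple arithmetic. The back-calculated sizes of earlier releases assume that each release only added characters and removed none, which the FAQ does not state explicitly; the names used are hypothetical.

```python
TOTAL_2008 = 5009     # all characters in HKSCS-2008
CHINESE_2008 = 4568   # Chinese characters in HKSCS-2008

# New characters per release, newest first, as quoted in the FAQ.
ADDITIONS = [("HKSCS-2008", 68), ("HKSCS-2004", 123), ("HKSCS-2001", 116)]

def percent(part: int, whole: int) -> int:
    """Share of `part` in `whole`, rounded to a whole percent."""
    return round(100 * part / whole)

def earlier_sizes(latest_total: int, additions):
    """Subtract each release's additions to estimate the preceding size."""
    sizes = []
    running = latest_total
    for _release, added in additions:
        running -= added
        sizes.append(running)
    return sizes

symbols = TOTAL_2008 - CHINESE_2008
print(f"Symbols: {symbols}")
print(f"Chinese share: {percent(CHINESE_2008, TOTAL_2008)}%")
print(f"Estimated sizes before each release: {earlier_sizes(TOTAL_2008, ADDITIONS)}")
```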
|
- Can the general public and private
organizations apply for inclusion of characters in the Hong Kong
Supplementary Character Set (HKSCS)?
Yes. The general public, government departments and other private
organizations can submit applications for inclusion of characters in
the HKSCS to the Secretariat of the Chinese Language Interface Advisory
Committee. Information such as the glyph, pronunciation and meaning of
the characters should also be provided. Submitting other information,
such as the origin of the characters and examples of their usage, is
also highly recommended.
|
- What are the principles for inclusion
of characters in the Hong Kong Supplementary Character Set (HKSCS)?
The basic principles of inclusion of characters in the HKSCS are that
the origins of the characters should be known and the characters should
be required for use by government departments or the public. Since the
HKSCS is gradually becoming part of the ISO 10646 standard, the
relevant rules of the ISO 10646 standard are taken into account when
considering the inclusion of characters in the HKSCS.
|
- What are the
procedures for inclusion of characters in the Hong Kong
Supplementary Character Set (HKSCS)?
The procedures and principles for inclusion of characters in the HKSCS
can be found at the following web page:
http://www.ogcio.gov.hk/ccli/eng/hkscs/applicn.html.
The Chinese Language Interface Advisory Committee will consider
applications for inclusion of characters in the HKSCS.
|
- Have all characters in HKSCS been
included in the ISO 10646 coding standard?
ISO/IEC 10646:2003 and its Amendments 1 to 6 include all characters of
the HKSCS.
|
- What is the relationship between
the Government Common Character Set and the Hong Kong Supplementary
Character Set?
To facilitate electronic communication between government departments,
the Hong Kong Government developed in 1995 the Government Common
Character Set (GCCS) that contained characters specific to Hong Kong
for supplementing the Big-5 character set. In 1999, the Government of
the Hong Kong Special Administrative Region developed and published in
collaboration with the Chinese Language Interface Advisory Committee
(CLIAC) the Hong Kong Supplementary Character Set (HKSCS) which
replaced the GCCS. In December 2001, the Government released Hong Kong
Supplementary Character Set - 2001 (HKSCS-2001). In May 2005, the
Government released Hong Kong Supplementary Character Set - 2004
(HKSCS-2004). The latest version of the HKSCS is the HKSCS-2008 that
was published in December 2009.
|
- Why does the Hong Kong
Supplementary Character Set (HKSCS) include characters of the Cantonese
dialect or even foul language?
Characters of the Cantonese dialect included in the HKSCS came from the
Judiciary, the Hong Kong Police, the Department of Justice, the
Linguistic Society of Hong Kong and the Hong Kong Polytechnic
University. Some of these characters can be found in dictionaries of
the Cantonese dialect or in academic articles. One of the reasons for
including these characters in the HKSCS is to help the Judiciary record
legal proceedings and to help the Hong Kong Police and other law
enforcement agencies take statements. Besides, the study of the
Cantonese dialect is an academic subject. Many members of linguistic
societies, and teaching staff and students of tertiary institutions,
are conducting research in this area. If characters of the Cantonese
dialect were not included in the HKSCS, it would be somewhat difficult
to publish academic journals or articles on the Cantonese dialect.
|
- Why does the Hong Kong
Supplementary Character Set (HKSCS) contain characters not found in
dictionaries?
Some characters in the HKSCS cannot be found in dictionaries. Except for
a few characters of the Cantonese dialect, most of them come from databases
of the Immigration Department, the Companies Registry, the Inland
Revenue Department and the Lands Department. It has been confirmed that
these characters are still being used in names of persons, companies
and locations. These characters appear in various kinds of
certificates, contracts and legal documents. In view of operational
needs and legal requirements, the HKSCS includes these characters.
|
- Where can I obtain information
on the Hong Kong Supplementary Character Set (HKSCS)?
The latest information on the HKSCS can be found on the "Digital
21" website of the Government (http://www.ogcio.gov.hk/ccli/eng/structure/cli_main.html),
including the encoding schemes of the HKSCS for Big-5 and the ISO 10646
standard.
|
- Are the HKSCS documents released
previously still available?
The documents for the previous HKSCS releases are available on the
"Digital 21" website of the Government (http://www.ogcio.gov.hk/ccli/eng/hkscs/document.html)
for the reference of the public and IT suppliers.
|
- What will be the changes in
assigning code points to newly included HKSCS characters with effect
from 31 March 2008?
With effect from 31 March 2008, the Chinese Language Interface Advisory
Committee will assign only ISO 10646 code points to newly included
HKSCS characters. HKSCS characters that were assigned code points
before the effective date will not be affected, although progressive
migration from Big-5 coding to ISO 10646 coding is encouraged.
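The migration from Big-5 coding to ISO 10646 coding can be sketched as a simple transcoding step. This is a minimal illustration under stated assumptions: the legacy data is taken to be in the encoding that Python calls `big5hkscs`, and UTF-8 is chosen as the ISO 10646 encoding form. The function name and the sample text are hypothetical.

```python
def big5hkscs_to_utf8(raw: bytes) -> bytes:
    """Transcode Big-5 (with HKSCS) bytes into UTF-8 bytes."""
    # Decode to code points first, then re-encode in the target form.
    text = raw.decode("big5hkscs")
    return text.encode("utf-8")

# Simulate a legacy record stored in Big-5 coding.
legacy = "香港".encode("big5hkscs")

migrated = big5hkscs_to_utf8(legacy)
print(migrated.decode("utf-8"))
```

In a real migration, data containing bytes that the codec cannot map would raise `UnicodeDecodeError`, so such records would need to be reviewed individually.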
This is a further step to promote the wider adoption of the common
Chinese language interface in the community and to facilitate
electronic communication in Chinese. This is also in line with the
adoption of the ISO 10646 standard and the HKSCS as recommended by the
"Digital 21" Strategy for IT Development. Members of the public may
refer to the latest principles for the inclusion of
characters in the HKSCS (Chinese version only) at the
"Digital 21" Strategy website.
|