Office of the Government Chief Information Officer - Common Chinese Language Interface

 

Frequently Asked Questions


ISO 10646 Questions and Answers

HKSCS Questions and Answers



ISO 10646 Questions and Answers

  1. What is a coding standard?

    A coding standard is a scheme that assigns an internal computer code to each character so that computers can process and display these characters.
  1. What are the coding standards for Chinese characters being used in different regions?

    The ISO 10646 and Big-5 coding standards are both used in Hong Kong, and the ISO 10646 standard is being actively promoted. The GB (Guo Biao) coding standard is used in Mainland China. The CNS Code is Taiwan's coding standard, while Big-5 coding is also commonly used there.
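
    As a rough illustration of why these standards do not interoperate directly, the sketch below encodes one Chinese character under several codecs shipped with Python. The sample character and the codec list are chosen here for illustration only and are not part of the original FAQ; GB2312 stands in for the GB family of standards.

```python
# The same character is stored as different bytes under different coding standards.
CHARACTER = "中"  # U+4E2D, picked only as an example

# Codec names as spelled in Python's standard library.
CODECS = ["big5", "gb2312", "utf-8", "utf-16-be"]


def encode_all(ch: str) -> dict:
    """Return a mapping of codec name to the bytes that codec produces for ch."""
    return {codec: ch.encode(codec) for codec in CODECS}


encoded = encode_all(CHARACTER)
for codec, raw in encoded.items():
    print(f"{codec:10} {raw.hex().upper()}")
```

    Bytes written under one standard are unreadable under another unless a conversion step is applied, which is the problem a unified coding standard is meant to remove.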
  1. What are the benefits of unifying coding standards?

    With a unified coding standard, computers are capable of accurately processing and displaying electronic information in different languages. Users no longer need conversion tools to handle electronic information encoded in different coding standards. Distortion of information can be reduced during electronic communication, thus facilitating the exchange of electronic information across geographical areas.
  1. How does a unified coding standard benefit the development of a common Chinese language interface?

    With a unified coding standard, computers in different parts of the world can display electronic information encoded in the same coding standard. Computers in Mainland China, Hong Kong and Taiwan can accurately display electronic information in traditional Chinese, simplified Chinese and Chinese characters specific to Hong Kong. Users no longer need to use different coding standards for the different sets of Chinese characters, which avoids many of the problems that arise in electronic communication conducted in Chinese.
  1. What is the International Organization for Standardization (ISO)?

    The ISO is a non-governmental organization established in 1947 (http://www.iso.ch/). It comprises members from more than 140 countries. Its mission is to develop international standards that facilitate exchange in various areas (e.g. trade, information and technologies) among different parts of the world.
  1. What is the ISO 10646 standard?

    ISO 10646 is an international coding standard developed under the aegis of the International Organization for Standardization (ISO). It encodes the characters of the major languages of the world into a common character set.
  1. When was the ISO 10646 standard released?

    The ISO released the first version of the ISO 10646 standard in 1993. It was called ISO/IEC 10646-1:1993. In 2000, the ISO released ISO/IEC 10646-1:2000, an updated version of ISO/IEC 10646-1:1993. ISO/IEC 10646-1:2000 contains 27,484 ideographic characters: the 20,902 ideographic characters of ISO/IEC 10646-1:1993 plus 6,582 newly defined ideographic characters in Extension A.
    In November 2001, the ISO released ISO/IEC 10646-2:2001 as a supplement to ISO/IEC 10646-1:2000. ISO/IEC 10646-2:2001 contains 42,711 newly defined ideographic characters in Extension B, bringing the total number of ideographic characters in the ISO 10646 standard to more than 70,000. All the characters in the Kangxi Dictionary, Hanyu Dazidian and Hanyu Dacidian are now included in the ISO 10646 standard.
    In April 2004, the ISO published ISO/IEC 10646:2003, a single publication that merges ISO/IEC 10646-1:2000 and ISO/IEC 10646-2:2001. The ideographic characters in ISO/IEC 10646:2003 are therefore the same as those in ISO/IEC 10646-1:2000 and ISO/IEC 10646-2:2001 combined. In December 2008, the ISO published Extension C in ISO/IEC 10646:2003/Amd 5:2008. Extension C contains 4,149 additional ideographic characters. In October 2009, the ISO published ISO/IEC 10646:2003/Amd 6:2009.
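
    The character counts quoted above can be cross-checked against the code point ranges of the corresponding blocks. The ranges in this sketch are an assumption brought in from the published block charts for the original assignments (they are not stated in this FAQ) and ignore the small additions made by later amendments.

```python
# Code point ranges of the ideographic blocks as originally assigned.
# Assumption: ranges come from the standard's block charts, not from this FAQ.
IDEOGRAPH_RANGES = {
    "CJK Unified Ideographs": (0x4E00, 0x9FA5),   # ISO/IEC 10646-1:1993
    "Extension A": (0x3400, 0x4DB5),              # ISO/IEC 10646-1:2000
    "Extension B": (0x20000, 0x2A6D6),            # ISO/IEC 10646-2:2001
    "Extension C": (0x2A700, 0x2B734),            # ISO/IEC 10646:2003/Amd 5:2008
}


def block_size(first: int, last: int) -> int:
    """Number of code points in an inclusive range."""
    return last - first + 1


counts = {name: block_size(*rng) for name, rng in IDEOGRAPH_RANGES.items()}

running_total = 0
for name, size in counts.items():
    running_total += size
    print(f"{name:24} {size:>7,}  cumulative {running_total:>7,}")
```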
  1. What is the current development status of the ISO 10646 standard?

    Ideographic characters are characters whose appearance is related to their meaning, such as Han characters. Additional ideographic characters have been included in the ISO 10646 standard in three phases: Extension A, Extension B and Extension C. These were released as part of ISO/IEC 10646-1:2000, ISO/IEC 10646-2:2001 and ISO/IEC 10646:2003/Amd 5:2008 respectively.
  1. What is an ideographic character?

    The International Organization for Standardization categorizes characters from different regions of the world by their characteristics. Ideographic characters are characters whose appearance is related to their meaning. Han characters are one example; they are mainly used in East and South East Asian countries or territories such as Mainland China, Hong Kong, Macao, Taiwan, Japan, South Korea, North Korea, Vietnam and Singapore.
  1. What is the Ideographic Rapporteur Group (IRG)?

    The IRG is a working group under the International Organization for Standardization. Its mission is to develop the ideographic characters in the ISO 10646 standard. The IRG has developed the CJK Unified Ideographs Block, the Extension A Block, the Extension B Block and the Extension C Block.
  1. Which countries are members of the Ideographic Rapporteur Group?

    IRG members include Mainland China, Hong Kong, Macao, Taipei Computer Association, Singapore, Japan, South Korea, North Korea, Vietnam and USA. Representatives from the Unicode Consortium also attend IRG meetings for coordinating the synchronization between the ISO 10646 standard and Unicode.
  1. What is Unicode?

    Unicode is a character coding system designed by the Unicode Consortium to support the interchange, processing and display of the written texts of many languages in the world. The Unicode Consortium comprises mainly hardware and software vendors.
  1. What is the relationship between Unicode and the ISO 10646 standard?

    In 1991, the ISO and the Unicode Consortium decided to cooperate in defining a universal coding standard for multilingual texts. Since then, the two organizations have been working very closely to extend the ISO 10646 standard and Unicode, and to keep them synchronized. The ISO publishes the characters and code points of the ISO 10646 standard, while the Unicode Consortium supplements them with implementation algorithms and semantic information. The ISO 10646 standard and the corresponding version of Unicode are code-to-code identical. Unicode can be regarded as the implementation version of the ISO 10646 standard. Therefore, products supporting Unicode also support the ISO 10646 standard.
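
    Because the two standards are code-to-code identical, any Unicode-aware tool can be used to look up a character's ISO 10646 code point. The following minimal sketch uses Python's built-in `ord` and the standard `unicodedata` module; the helper name `describe` and the sample characters are illustrative only.

```python
import unicodedata


def describe(ch: str) -> tuple:
    """Return the code point in U+XXXX notation and the character name."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


for sample in ["A", "中"]:
    code_point, name = describe(sample)
    print(sample, code_point, name)
```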
  1. What is ISO 10646 Extension B and what benefit does it bring?

    Like the CJK Unified Ideographs block and Extension A, Extension B contains commonly used Chinese characters whose appearance is related to their meaning. The ISO 10646 Extension B was released as part of ISO/IEC 10646-2:2001 in November 2001.

    The inclusion of Extension B brings the total number of ideographic characters in the ISO 10646 standard to more than 70,000, including all the characters of the Kangxi Dictionary, Hanyu Dazidian and Hanyu Dacidian. As a result, more commonly used Chinese characters are available for use electronically.
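
    One practical consequence, offered here as background rather than as part of the original FAQ: Extension B characters are assigned code points from U+20000 upward, outside the 16-bit range used by the earlier ideographic blocks. Software that assumes one 16-bit unit per character may therefore mishandle them. The sketch below compares an earlier ideograph with the first Extension B code point; the helper names are hypothetical.

```python
def utf16_code_units(ch: str) -> list:
    """Split a character's UTF-16 (big-endian) encoding into 16-bit units."""
    raw = ch.encode("utf-16-be")
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]


def is_supplementary(ch: str) -> bool:
    """True when the code point lies beyond the 16-bit range (above U+FFFF)."""
    return ord(ch) > 0xFFFF


bmp_char = "\u4e2d"        # an ideograph from the 1993 repertoire
ext_b_char = "\U00020000"  # first code point of Extension B

for ch in (bmp_char, ext_b_char):
    units = " ".join(f"{u:04X}" for u in utf16_code_units(ch))
    print(f"U+{ord(ch):04X} supplementary={is_supplementary(ch)} "
          f"UTF-16 units: {units}  UTF-8 bytes: {len(ch.encode('utf-8'))}")
```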
  1. Are new versions of ISO 10646 and HKSCS (collectively referred to as the "Standard") backward compatible with their corresponding old versions?

    A new version of the "Standard" is backward compatible with its corresponding old version. However, in respect of software implementation, characters newly included in the new version of the "Standard" may not be properly viewed or displayed on software platforms that support only a previous version of the "Standard". In addition, existing software applications that support only a previous version of the "Standard" may not be able to handle newly included characters properly, including those HKSCS characters with code points assigned by the ISO in the new version of the "Standard".

    Users who encounter problems in handling Chinese characters while using GovHK Online Services may refer to the FAQ section.



HKSCS Questions and Answers

  1. Why do we need to develop the Hong Kong Supplementary Character Set (HKSCS)?

    The aim of developing the HKSCS is to define a character set which contains special Chinese characters that are commonly used in Hong Kong and are required by the Government and the public in electronic communication. The HKSCS supplements the character set of the ISO/IEC 10646 standard.

    One of the initiatives under the Government's "Digital 21" Strategy for IT development is to develop an open and common Chinese language interface in the HKSAR for users who prefer to communicate electronically in Chinese. A pivotal element of this initiative is the adoption of the ISO 10646 standard and the HKSCS to address the problems arising from the existence of different coding standards and insufficient characters in some Chinese character sets used on computers.

    Many popular operating systems, database software, office automation (OA) suites, web browsers, e-mail clients and input devices already support the ISO 10646 standard. Hence, computer users could make use of HKSCS characters when they use products that are compliant with the ISO 10646 standard.
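
    To see how this plays out in one ISO 10646 compliant environment, the sketch below asks Python's standard codecs whether a character can be represented in plain Big-5, in Big-5 with the HKSCS extension (`big5hkscs`) and in UTF-8. The second sample character is assumed, for illustration, to be a Cantonese character that needs HKSCS under Big-5; the script reports what the installed codecs say rather than asserting it, and the HKSCS edition covered by `big5hkscs` depends on the Python version.

```python
def can_encode(text: str, codec: str) -> bool:
    """Report whether every character of text has a code in the given codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# "中" is in plain Big-5; "咗" is assumed to require the HKSCS extension.
SAMPLES = ["中", "咗"]
CODECS = ["big5", "big5hkscs", "utf-8"]

for ch in SAMPLES:
    support = {codec: can_encode(ch, codec) for codec in CODECS}
    print(f"U+{ord(ch):04X}", support)
```

    Whatever the Big-5 results are on a given installation, both characters are representable in UTF-8, since every ISO 10646 code point has a UTF-8 form.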

  1. When was the Hong Kong Supplementary Character Set (HKSCS) developed?

    The Government of the Hong Kong Special Administrative Region developed the HKSCS in collaboration with the Chinese Language Interface Advisory Committee (CLIAC) in September 1999. In December 2001, the Government released Hong Kong Supplementary Character Set - 2001 (HKSCS-2001). In May 2005, the Government released Hong Kong Supplementary Character Set - 2004 (HKSCS-2004). The latest version of the HKSCS is HKSCS-2008, which was published in December 2009.
  1. What contents are updated in the HKSCS?

    HKSCS-2008 adds 68 new characters. It expands and replaces HKSCS-2004. HKSCS-2008 is also technically aligned with ISO/IEC 10646:2003 and its Amendments 1 to 6, which together include all characters of HKSCS-2008. The code allocations of these characters in HKSCS-2008 have been adjusted accordingly.

    HKSCS-2004 added 123 new characters. It expanded and replaced HKSCS-2001. HKSCS-2004 is also technically aligned with ISO/IEC 10646:2003 and its Amendment 1, released by the International Organization for Standardization in April 2004 and November 2005 respectively. ISO/IEC 10646:2003 and its Amendment 1 include all characters of HKSCS-2004. The code allocations of these characters in HKSCS-2004 were adjusted accordingly.

    HKSCS-2001 added 116 new characters. It expanded and replaced the HKSCS released in 1999. HKSCS-2001 is also technically aligned with ISO/IEC 10646-2:2001, released by the International Organization for Standardization in November 2001. ISO/IEC 10646-2:2001 includes 1,622 additional characters from the HKSCS. The code allocations of these characters in HKSCS-2001 were adjusted accordingly.
  1. What characters are included in the HKSCS?

    Breakdown of HKSCS-2008 characters

    HKSCS-2008 contains a total of 5,009 characters, most of which are Chinese characters (4,568 characters or 91%). Many of these characters can be found in major dictionaries (including Kangxi Dictionary, Hanyu Dazidian, Hanyu Dacidian and Zhonghua Zihai). Since an individual Chinese character can be used for different purposes, there is no detailed classification of the characters in HKSCS-2008. Nevertheless, these characters can be broadly classified into the following categories:

        Proper names, such as names of persons, places or companies

        Characters used in the Cantonese dialect

        Scientific terms

        Radicals and shapes

    These characters are mainly proposed by government departments (such as the Companies Registry, the Department of Justice, the Hong Kong Police Force, the Immigration Department, the Inland Revenue Department, the Judiciary, and the Lands Department), academic bodies, educational institutions and members of the public.

    The remaining 441 characters are various symbols, such as components of Chinese characters, Hanyu Pinyin symbols, International Phonetic Alphabet symbols, and Japanese Katakana and Hiragana.
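
    The figures in this breakdown, together with the per-version additions quoted in this FAQ, can be checked with simple arithmetic. The earlier totals computed below are derived values, not figures stated in the FAQ, and rest on the assumption that each version only added characters to its predecessor.

```python
TOTAL_2008 = 5009      # all characters in HKSCS-2008
CHINESE_2008 = 4568    # Chinese characters in HKSCS-2008

symbols_2008 = TOTAL_2008 - CHINESE_2008
chinese_share = round(100 * CHINESE_2008 / TOTAL_2008)
print(f"symbols: {symbols_2008}, Chinese share: {chinese_share}%")

# New characters per version, newest first, as quoted in this FAQ.
ADDED = [("HKSCS-2008", 68), ("HKSCS-2004", 123), ("HKSCS-2001", 116)]

# Walk backwards from the 2008 total (assumes no characters were removed).
derived_totals = {}
running = TOTAL_2008
for version, added in ADDED:
    derived_totals[version] = running
    running -= added
derived_totals["HKSCS (1999)"] = running

for version, total in derived_totals.items():
    print(f"{version:13} {total:,}")
```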

  1. Can the general public and private organizations apply for inclusion of characters in the Hong Kong Supplementary Character Set (HKSCS)?

    Yes. The general public, government departments and private organizations can submit applications for inclusion of characters in the HKSCS to the Secretariat of the Chinese Language Interface Advisory Committee. Information such as the glyph, pronunciation and meaning of each character should be provided. Submitting further information, such as the origin of the characters and examples of their usage, is also highly recommended.
  1. What are the principles for inclusion of characters in the Hong Kong Supplementary Character Set (HKSCS)?

    The basic principles of inclusion of characters in the HKSCS are that the origins of the characters should be known and the characters should be required for use by government departments or the public. Since the HKSCS is gradually becoming part of the ISO 10646 standard, the relevant rules of the ISO 10646 standard are taken into account when considering the inclusion of characters in the HKSCS.
  1. What are the procedures for inclusion of characters in the Hong Kong Supplementary Character Set (HKSCS)?

    The procedures and principles for inclusion of characters in the HKSCS can be found at the following web page:
    http://www.ogcio.gov.hk/ccli/eng/hkscs/applicn.html

    The Chinese Language Interface Advisory Committee will consider applications for inclusion of characters in the HKSCS.

  1. Have all characters in HKSCS been included in the ISO 10646 coding standard?

    The ISO/IEC 10646:2003 and its Amendments 1 to 6 include all characters of the HKSCS.
  1. What is the relationship between the Government Common Character Set and the Hong Kong Supplementary Character Set?

    To facilitate electronic communication between government departments, the Hong Kong Government developed in 1995 the Government Common Character Set (GCCS) that contained characters specific to Hong Kong for supplementing the Big-5 character set. In 1999, the Government of the Hong Kong Special Administrative Region developed and published in collaboration with the Chinese Language Interface Advisory Committee (CLIAC) the Hong Kong Supplementary Character Set (HKSCS) which replaced the GCCS. In December 2001, the Government released Hong Kong Supplementary Character Set - 2001 (HKSCS-2001). In May 2005, the Government released Hong Kong Supplementary Character Set - 2004 (HKSCS-2004). The latest version of the HKSCS is the HKSCS-2008 that was published in December 2009.
  1. Why does the Hong Kong Supplementary Character Set (HKSCS) include characters of the Cantonese dialect or even foul language?

    Characters of the Cantonese dialect included in the HKSCS came from the Judiciary, the Hong Kong Police, the Department of Justice, the Linguistic Society of Hong Kong and the Hong Kong Polytechnic University. Some of these characters can be found in dictionaries of the Cantonese dialect or in academic articles. One reason for including these characters in the HKSCS is to help the Judiciary record legal proceedings and to help the Hong Kong Police and other law enforcement agencies take statements. In addition, the study of the Cantonese dialect is an academic subject. Many members of linguistic societies, and teaching staff and students of tertiary institutions, are conducting research in this area. If characters of the Cantonese dialect were not included in the HKSCS, it would be more difficult to publish academic journals or articles on the Cantonese dialect.
  1. Why does the Hong Kong Supplementary Character Set (HKSCS) contain characters not found in dictionaries?

    Some characters in the HKSCS cannot be found in dictionaries. Except for a few characters of the Cantonese dialect, most of them come from the databases of the Immigration Department, the Companies Registry, the Inland Revenue Department and the Lands Department. It has been confirmed that these characters are still being used in names of persons, companies and locations. These characters appear in various kinds of certificates, contracts and legal documents. In view of operational needs and legal requirements, the HKSCS includes these characters.
  1. Where can I obtain information on the Hong Kong Supplementary Character Set (HKSCS)?

    The latest information on the HKSCS, including the encoding schemes of the HKSCS for Big-5 and the ISO 10646 standard, can be found on the "Digital 21" website of the Government (http://www.ogcio.gov.hk/ccli/eng/structure/cli_main.html).
  1. Are the HKSCS documents released previously still available?

    The documents for the previous HKSCS releases are available on the "Digital 21" website of the Government (http://www.ogcio.gov.hk/ccli/eng/hkscs/document.html) for the reference of the public and IT suppliers.
  1. What changes apply to the assignment of code points to newly included HKSCS characters with effect from 31 March 2008?

    With effect from 31 March 2008, the Chinese Language Interface Advisory Committee assigns only ISO 10646 code points to newly included HKSCS characters. HKSCS characters that were assigned code points before the effective date are not affected, although progressive migration from Big-5 coding to ISO 10646 coding is encouraged.

    This is a further step to promote the wider adoption of the common Chinese language interface in the community and to facilitate electronic communication in Chinese. It is also in line with the adoption of the ISO 10646 standard and the HKSCS as recommended by the "Digital 21" Strategy for IT Development. Members of the public may refer to the latest principles for the inclusion of characters in the HKSCS (PDF, Chinese version only) on the "Digital 21" Strategy website.
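
    For readers planning such a migration, the following is a minimal sketch of converting existing Big-5 (with HKSCS) data to an ISO 10646 encoding form. It assumes UTF-8 as the target and relies on Python's standard `big5hkscs` codec; the function name and sample text are hypothetical, and real data should be checked for characters the installed codec does not cover.

```python
def big5hkscs_to_utf8(data: bytes) -> bytes:
    """Decode Big-5/HKSCS bytes and re-encode the text as UTF-8.

    Raises UnicodeDecodeError if the input contains byte sequences
    that the big5hkscs codec cannot map.
    """
    return data.decode("big5hkscs").encode("utf-8")


# Sample legacy data, created here only so the example is self-contained.
legacy_bytes = "中文".encode("big5hkscs")

migrated_bytes = big5hkscs_to_utf8(legacy_bytes)
print(legacy_bytes.hex(), "->", migrated_bytes.hex())
```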


Last revision date: 04/01/2010