Transition from national standards to Unicode: multilingual support in operating systems and programming languages
Author(s) - Wu PeiChi
Publication year - 2000
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/(sici)1097-024x(200006)30:7<765::aid-spe317>3.0.co;2-n
Subject(s) - unicode , character encoding , computer science , ascii , character (mathematics) , string (physics) , byte , programming language , syntax , set (abstract data type) , natural language processing , mathematics , geometry , mathematical physics
Character sets are one of the basic issues for information interchange. Most current national standard character sets extend 7‐bit ASCII. These extensions conflict with each other and make the design of multilingual information systems complicated. Unicode or the Universal Character Set (UCS) is a character set that covers symbols in the major written languages. Text files and strings usually have no header to indicate which character set is in use, and they currently use one of the national standards by default. The transition from national standards to Unicode may take a longer time than expected. This paper presents the following methods to help the transition. (1) A text file format of fixed‐width characters: if the first character in a text file is a nonzero control code, the file is in UCS; otherwise, it is in the default national standard. The control code indicates which UCS subset or byte order is in use. (2) A tagged string storage: each string has a tag representing which character set or coding format is in use, e.g., the default national standard, 8‐bit subset of UCS‐2, UCS‐2, or UCS‐4. (3) A method for assigning the format of string literals: all string literals use the same syntax notation, and their storage format is the same as that of their source files. These methods can improve multilingual support without introducing much complexity. Copyright © 2000 John Wiley & Sons, Ltd.
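
As an illustration of method (2), the following is a minimal sketch of tagged string storage, not the paper's implementation. The names `Tag`, `TaggedString`, `from_text`, and `from_national` are hypothetical, Big5 is assumed as the default national standard purely for the example, and UCS-2 and UCS-4 are approximated with Python's big-endian `utf-16-be` and `utf-32-be` codecs (which match UCS-2 only for Basic Multilingual Plane characters).

```python
from dataclasses import dataclass
from enum import Enum

# Assumption for this sketch: the "default national standard" is Big5.
DEFAULT_NATIONAL_CODEC = "big5"


class Tag(Enum):
    """Hypothetical tags for the four storage formats named in the abstract."""
    NATIONAL = "national"    # default national standard
    UCS2_8BIT = "ucs2-8bit"  # 8-bit subset of UCS-2 (code points 0..255)
    UCS2 = "ucs2"            # 2 bytes per character
    UCS4 = "ucs4"            # 4 bytes per character


@dataclass(frozen=True)
class TaggedString:
    tag: Tag
    data: bytes

    def decode(self) -> str:
        """Interpret the stored bytes according to the tag."""
        if self.tag is Tag.NATIONAL:
            return self.data.decode(DEFAULT_NATIONAL_CODEC)
        if self.tag is Tag.UCS2_8BIT:
            # Code points 0..255 coincide with ISO 8859-1 (Latin-1).
            return self.data.decode("latin-1")
        if self.tag is Tag.UCS2:
            return self.data.decode("utf-16-be")
        return self.data.decode("utf-32-be")


def from_text(text: str) -> TaggedString:
    """Store text in the narrowest fixed-width UCS format that can hold it."""
    highest = max(map(ord, text), default=0)
    if highest <= 0xFF:
        return TaggedString(Tag.UCS2_8BIT, text.encode("latin-1"))
    if highest <= 0xFFFF:
        return TaggedString(Tag.UCS2, text.encode("utf-16-be"))
    return TaggedString(Tag.UCS4, text.encode("utf-32-be"))


def from_national(data: bytes) -> TaggedString:
    """Wrap legacy bytes unchanged, tagged as the default national standard."""
    return TaggedString(Tag.NATIONAL, data)
```

In this sketch, legacy data stays byte-for-byte untouched under the `NATIONAL` tag, while new text takes the narrowest UCS width that fits, so plain Latin text costs one byte per character even though it is tagged as UCS.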