ISO/IEC JTC1/SC2/WG2 N2370
2001-10-10

Universal Multiple Octet Coded Character Set
International Organization for Standardization
Organisation internationale de normalisation
Международная организация по стандартизации

Doc Type: Working Group Document
Title: Unicode Consortium Liaison Report
Source: The Unicode Consortium
Status: Liaison report
Action: For consideration by JTC1/SC2/WG2
Related: N2339, N2361, N2362, N2366, N2369, N2383

References

In the interest of continuing synchronization between the Unicode Standard and ISO/IEC 10646, there are a few areas where it would be useful to give implementers who wish to create interoperable implementations access to additional information. It would therefore be helpful if ISO/IEC 10646 contained references to the Unicode Standard and the Unicode Technical Reports as a source of more detailed descriptions of the scripts in 10646 and of more information on how to render and process them.

Given the emerging prominence of normalization in the context of W3C protocols and related specifications, the Unicode Consortium suggests that WG2 add a reference to Unicode Standard Annex #15, Unicode Normalization Forms, as a source of information for implementers who wish to create normalized data streams for implementation levels 2 and 3.

Policies

The impact of Normalization Form C is that sequences of characters that are considered canonically equivalent in the Unicode Standard are normalized to the same sequence. This has some practical effects on the coding of new characters, and in the interest of synchronization WG2 should make sure these are reflected in the Principles and Procedures. In addition, singleton canonical equivalences have the practical effect of removing the distinction between pairs of characters, and such pairs of characters should therefore be imaged with identical glyphs.

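As an illustration of the two effects described above, the following minimal sketch uses Python's standard `unicodedata` module. The helper name `to_nfc` and the sample characters are chosen here for illustration and do not come from this report; the module also reflects whatever Unicode version ships with the Python interpreter, not the version current when this report was written.

```python
import unicodedata


def to_nfc(text: str) -> str:
    """Return text in Normalization Form C (hypothetical helper name)."""
    return unicodedata.normalize("NFC", text)


# Two canonically equivalent spellings of the same letter.
precomposed = "\u00E9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT

# A singleton canonical equivalence: ANGSTROM SIGN maps to
# LATIN CAPITAL LETTER A WITH RING ABOVE and never survives NFC.
angstrom_sign = "\u212B"
a_with_ring = "\u00C5"

if __name__ == "__main__":
    # Both spellings normalize to the same sequence.
    print(to_nfc(decomposed) == to_nfc(precomposed))
    # The singleton is replaced, so the distinction between the pair is lost.
    print(to_nfc(angstrom_sign) == a_with_ring)
```

Because the singleton disappears under NFC, a font that drew the two members of such a pair differently would show a change in appearance whenever the text passed through a normalizing process.
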
The Unicode Consortium would like to be able to advise implementers of the following ranges for characters that can be ignored in most processing, mostly because they are formatting characters:

    2060..2069
    FFF0..FFF8
    Plane E

Since formatting characters are very different from other graphic characters, implementations that intend to be robust against additions to the Unicode Standard and 10646 need to be able to anticipate the ranges where such characters will be added in the future. Therefore the Consortium asks WG2 to adopt a matching policy and reflect it in the roadmap.

For reasons given in a separate technical paper, variation selectors need to be combining characters. Allowing variation selectors to act on combining characters has turned out to be an intractable problem. Therefore Unicode has established a policy that variation selectors cannot be used with combining marks, and asks WG2 to adopt the same policy.

Other information

There have been some questions regarding the proposed grapheme joiner. This character is not intended to affect ligation. The existing document provides enough information on everything else.

Roadmaps

The Unicode Consortium is proud to announce that the WG2 roadmaps are now hosted on its web site and will be maintained there by the roadmap ad-hoc committee. A separate document has been submitted with further details.

Publications

Unicode 3.1.1 has been released. Unicode experienced great difficulties in publishing code charts for the Han Extensions for 3.1. Such problems can be avoided in the future by requiring that all scripts go through the same production process.

The Unicode Standard, Version 3.2, is tentatively scheduled for Spring 2002 and will incorporate the new characters for amendment 1. Version 4.0 is anticipated for Spring 2003. It will be a complete reissue of the Unicode Standard in book form.

Summary of UTC actions

The following are brief summaries of recent Unicode Technical Committee (UTC) actions that are deemed of interest to WG2. In almost all instances these actions are reflected in separate, specific documents submitted to WG2; in those cases such documents override the summary information given below.

The UTC reviewed and accepted the changes to the repertoire for FPDAM1, with minor changes suggested for several character names.

The UTC supports one new script, Limbu, as a candidate for amendment 2 of Part 1.

The UTC supports new scripts (Aegean scripts, Ugaritic 103C0..103DF, Osmanya 10480..104A9, Shavian 10450..1047F) in Plane 1 as candidates for amendment 1 of Part 2.

The UTC would like to request a minor revision in a note to the text describing UTF-8 in 10646-1 to allow FFFE and FFFF. This would provide synchronization with the formal definition of UTF-8 as used in the Unicode Standard. This is part of an effort to align the definition of UTF-8 as used by Unicode with the definition of UTF-8 as used by the IETF and others.

The UTC approved several additional characters, which will be put forth in specific proposal documents or ballot comments. They include:

    two monogram and four digram characters at 2672..2677, and sixty-four hexagram characters at 4DC0..4DFF
    fifteen variation selector characters to be encoded at FE01..FE0F with the names VARIATION SELECTOR 2..VARIATION SELECTOR 16
    240 variation selector characters to be encoded at E0110..E01FF with the names VARIATION SELECTOR 17..VARIATION SELECTOR 256
    five phonetic characters:
        U+0221 LATIN SMALL LETTER T WITH CURL
        U+0234 LATIN SMALL LETTER D WITH CURL
        U+0235 LATIN SMALL LETTER N WITH CURL
        U+02AE LATIN SMALL LETTER TURNED H WITH FISHHOOK
        U+02AF LATIN SMALL LETTER TURNED H WITH FISHHOOK AND TAIL
    eight bracket characters:
        U+27E6 MATHEMATICAL LEFT WHITE SQUARE BRACKET
        U+27E7 MATHEMATICAL RIGHT WHITE SQUARE BRACKET
        U+27E8 MATHEMATICAL LEFT ANGLE BRACKET
        U+27E9 MATHEMATICAL RIGHT ANGLE BRACKET
        U+27EA MATHEMATICAL LEFT DOUBLE ANGLE BRACKET
        U+27EB MATHEMATICAL RIGHT DOUBLE ANGLE BRACKET
        U+FF5F FULLWIDTH LEFT WHITE PARENTHESIS
        U+FF60 FULLWIDTH RIGHT WHITE PARENTHESIS
    ARABIC CURRENCY SIGN RIAL at U+FDFC

Unicode has considered a number of characters to be deprecated, with the consequence that their use is strongly discouraged even though they formally remain in the standard. The list of these characters (0340, 0341, 206A..206F) will now be made available in machine-readable form. Other, less strongly discouraged characters are often annotated in the Unicode names list.

Certain characters need to be ignored by almost all general text processes, except for the specific processes for which they were designed (example: JOINER and NON-JOINER). Many display engines will force a zero-width glyph for these characters, as too many fonts simply display the missing glyph symbol. To ensure that software created today can handle future additions of such characters, the UTC designated the following ranges with appropriate code point properties in the Unicode Character Database:

    2060..206F
    FE00..FE0F
    FFF0..FFFC
    E0000..E0FFF

The UTC accepted a Proposed Draft Unicode Technical Report #26, Compatibility Encoding Scheme for UTF-16: 8-Bit.

The UTC has formally rejected the proposal to encode Klingon.

The UTC supports the development of a proposal for encoding Egyptian hieroglyphs consisting of the Gardiner subset, while leaving hieroglyphic extensions and markup issues for later study.

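The four ranges designated above for characters that general text processes should ignore can be checked with a simple table lookup. The following is a minimal sketch only: the range values are copied from this report, while the names `DESIGNATED_RANGES`, `in_designated_range` and `strip_designated` are hypothetical and are not taken from the Unicode Character Database or any library.

```python
# Ranges quoted from this report (hexadecimal, inclusive on both ends).
# The names used below are hypothetical, chosen for this sketch.
DESIGNATED_RANGES = [
    (0x2060, 0x206F),
    (0xFE00, 0xFE0F),
    (0xFFF0, 0xFFFC),
    (0xE0000, 0xE0FFF),
]


def in_designated_range(ch: str) -> bool:
    """True if the single character ch falls in one of the listed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in DESIGNATED_RANGES)


def strip_designated(text: str) -> str:
    """Drop every character in the listed ranges, as a general process might."""
    return "".join(ch for ch in text if not in_designated_range(ch))


if __name__ == "__main__":
    sample = "a\uFE0Fb\u2060c"
    print(strip_designated(sample))
```

Testing against whole ranges rather than against a list of currently assigned characters is what lets software written today keep working when further characters of this kind are added inside those ranges.
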
The UTC has become aware that there is no direct support for UTF-16 in the C and C++ programming languages, and has forwarded to the C and C++ committees a request for an unambiguous UTF-16 datatype and string literal support.

The UTC recommends adding a separate Help document to the Proposal Submission Form to assist the submitter in filling out the form.

The UTC welcomes dialogue with representatives of Cambodia to ensure that the Unicode Standard and 10646 meet the needs for text representation of Cambodian.
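
For the UTF-8 item in the summary of UTC actions, it may help to see what the byte sequences for FFFE and FFFF look like under the ordinary three-byte pattern. The sketch below relies on Python's built-in UTF-8 codec, which is an assumption about tooling and not something this report prescribes; the helper name `utf8_bytes` is hypothetical.

```python
def utf8_bytes(code_point: int) -> bytes:
    """Encode one code point with Python's built-in UTF-8 codec."""
    return chr(code_point).encode("utf-8")


def hex_dump(data: bytes) -> str:
    """Format bytes as space-separated uppercase hexadecimal pairs."""
    return " ".join(f"{b:02X}" for b in data)


if __name__ == "__main__":
    for cp in (0xFFFE, 0xFFFF):
        encoded = utf8_bytes(cp)
        # The encoding round-trips, so a decoder that accepts these
        # sequences recovers the original code point.
        assert encoded.decode("utf-8") == chr(cp)
        print(f"{cp:04X} -> {hex_dump(encoded)}")
```

Both values fit the same three-byte form as every other code point from 0800 to FFFF, so allowing them requires no special case in an encoder or decoder.
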