RFC 1843: HZ - A Data Format for Exchanging Files of Arbitrarily Mixed Chinese and ASCII characters. F. Lee. August 1995.

Network Working Group                                             F. Lee
Request for Comments: 1843                           Stanford University
Category: Informational                                      August 1995


               HZ - A Data Format for Exchanging Files of
             Arbitrarily Mixed Chinese and ASCII characters

Status of this Memo

   This memo provides information for the Internet community.  This memo
   does not specify an Internet standard of any kind.  Distribution of
   this memo is unlimited.

Abstract

   The content of this memo is identical to an article of the same title
   written by the author on September 4, 1989.  In this memo, GB stands
   for GB2312-80.  Note that the title is kept only for historical
   reasons.  HZ has been widely used for purposes other than "file
   exchange".

1. Introduction

   Most existing computer systems which can handle a text file of
   arbitrarily mixed Chinese and ASCII characters use 8-bit codes.  To
   exchange such text files through electronic mail on ASCII computer
   systems, it is necessary to encode them in a 7-bit format.  A generic
   binary to ASCII encoder is not sufficient, because there is currently
   no universal standard for such 8-bit codes.  For example, CCDOS and
   Macintosh's Chinese OS use different internal codes.  Fortunately,
   there is a PRC national standard, GuoBiao (GB), for the encoding of
   Chinese characters, and Chinese characters encoded in the above
   systems can be easily converted to GB by a simple formula.
   (* The ROC standard BIG-5 is outside the scope of this article.)

   HZ is a 7-bit data format proposed for arbitrarily mixed GB and ASCII
   text file exchange.  HZ is also intended for the design of terminal
   emulators that display and edit mixed Chinese and ASCII text files in
   real time.

2. Specification

   The format of HZ is described in the following.

   Without loss of generality, we assume that all Chinese characters
   (HanZi) have already been encoded in GB.  A GB (GB1 and GB2) code is
   a two byte code, where the first byte is in the range $21-$77
   (hexadecimal), and the second byte is in the range $21-$7E.

   A graphical ASCII character is a byte in the range $21-$7E.  A
   non-graphical ASCII character is a byte in the range $0-$20 or of the
   value $7F.

   Since the range of a graphical ASCII character overlaps that of a GB
   byte, a byte in the range $21-$7E is interpreted according to the
   mode it is in.  There are two modes, namely ASCII mode and GB mode.

   By convention, a non-graphical ASCII character should only appear in
   ASCII mode.

   The default mode is ASCII mode.

   In ASCII mode, a byte is interpreted as an ASCII character, unless a
   '~' is encountered.  The character '~' is an escape character.  By
   convention, it must be immediately followed ONLY by '~', '{' or '\n'
   (<LF>), with the following special meaning.

   o The escape sequence '~~' is interpreted as a '~'.
   o The escape-to-GB sequence '~{' switches the mode from ASCII to GB.
   o The escape sequence '~\n' is a line-continuation marker to be
     consumed with no output produced.

   In GB mode, characters are interpreted two bytes at a time as (pure)
   GB codes until the escape-from-GB code '~}' is read.  This code
   switches the mode from GB back to ASCII.  (Note that the
   escape-from-GB code '~}' ($7E7D) is outside the defined GB range.)

   The decoding process is clear from the above description.  The
   encoding process is straightforward.  Note that an (ASCII) '~' is
   always encoded as '~~'.  A sequence of GB codes is enclosed in '~{'
   and '~}'.
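
The decoding rules of Section 2 can be sketched as a two-state machine.
The following Python is only an illustration, not part of the memo: the
name hz_decode is hypothetical, and it assumes that each GB byte pair can
be turned into a character by setting the high bit of both bytes and
passing the result to Python's "gb2312" codec.  Sequences that Section 2
leaves undefined are rejected with ValueError, which is one possible
policy among several.

```python
def hz_decode(data: bytes) -> str:
    """Decode HZ bytes into text using the ASCII/GB modes of Section 2."""
    out = []
    gb_mode = False  # the default mode is ASCII
    i = 0
    while i < len(data):
        if not gb_mode and data[i] != 0x7E:
            # Ordinary ASCII byte: copied through unchanged.
            if data[i] > 0x7F:
                raise ValueError(f"non-ASCII byte at offset {i}")
            out.append(chr(data[i]))
            i += 1
            continue
        # Everything else is consumed two bytes at a time.
        pair = data[i:i + 2]
        if len(pair) < 2:
            raise ValueError("truncated input")
        if not gb_mode:
            if pair == b"~~":
                out.append("~")      # escaped tilde
            elif pair == b"~{":
                gb_mode = True       # escape-to-GB
            elif pair == b"~\n":
                pass                 # line continuation: no output
            else:
                raise ValueError(f"invalid escape at offset {i}")
        elif pair == b"~}":
            gb_mode = False          # escape-from-GB
        else:
            if not (0x21 <= pair[0] <= 0x77 and 0x21 <= pair[1] <= 0x7E):
                raise ValueError(f"byte pair outside GB range at offset {i}")
            # Assumption: high-bit-set GB bytes are what "gb2312" expects.
            out.append(bytes(b | 0x80 for b in pair).decode("gb2312"))
        i += 2
    return "".join(out)


print(hz_decode(b"HZ: ~{VPND~}~~"))
```

The loop never looks more than one byte past the current position and
keeps no notion of a line, so the same function handles input with or
without '~\n' markers.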

3. Remarks & Recommendations

   We choose to encode any ASCII character except '~' as it is, rather
   than as a two byte code, and we choose ASCII as the default mode for
   the following reasons.  The computer systems we use are ASCII based.
   A HZ file containing pure ASCII characters (i.e. no Chinese
   characters) except '~' is precisely a pure ASCII file.  In general,
   the English (ASCII) portion of a HZ file is directly readable.

   The escape character '~' is chosen not only because it is commonly
   used in the ASCII world, but also because '~' ($7E) is outside the
   defined range ($21-$77) of the first byte of a GB code.

   In ASCII mode, other potential escape sequences, i.e., two byte
   sequences beginning with '~' (other than '~~', '~{', '~\n') are
   currently invalid HZ sequences.  Hence, they can be used for future
   extension of HZ with total upward compatibility.

   The line-continuation marker '~\n' is useful if one wants to encode
   long lines in the original text into short lines in this data format
   without introducing extra newline characters in the decoding process.

   There is no limit on the length of a line.  In fact, the whole file
   could be one long line or even contain no newline characters.  Any
   DECODER of this HZ data format should not and has no need to operate
   on the concept of a line.

   It is easy to write encoders and decoders for HZ.  An encoder or
   decoder needs to look ahead at most one character in the input data
   stream.  Given the current mode, it is also possible and easy to
   decode a HZ data stream by scanning backward.  One implication is
   that "backspaces" can be handled correctly by a terminal emulator.

   To facilitate the effective use of programs supporting line/page
   skips such as "more" on UNIX with a terminal emulator understanding
   the HZ format, it is RECOMMENDED that the ENCODER (which outputs in
   HZ) sets a maximum line size of less than 80 characters.
   Since '\n' is an ASCII character, the syntax of HZ then automatically
   implies that GB codes appearing at the end of a line must be
   terminated with the escape-from-GB code '~}', and the
   line-continuation marker '~\n' should be inserted appropriately.  The
   price to be paid is that the encoded file size is slightly larger.

   It is important to understand the following distinction.  Note that
   the above recommendation does NOT change the HZ format.  It is simply
   an encoding "style" which follows the syntax of HZ.  Note that this
   "style" is not built into HZ.  It is an additional convention built
   "on top of" HZ.  Other applications may require different "styles",
   but the same basic HZ DECODER will always work.  The essence of HZ is
   to provide such a flexible basic data format for files of arbitrarily
   mixed Chinese and ASCII characters.

4. Examples

   To illustrate the "stylistic" issue of HZ encoding, we give the
   following three examples of encoded text, which should produce the
   same decoded output.  (The recommendation in the last section refers
   to Example 2.)

   Example 1:  (Suppose there is no line size limit.)
   This sentence is in ASCII.
   The next sentence is in GB.~{<:Ky2;S{#,NpJ)l6HK!#~}Bye.

   Example 2:  (Suppose the maximum line size is 42.)
   This sentence is in ASCII.
   The next sentence is in GB.~{<:Ky2;S{#,~}~
   ~{NpJ)l6HK!#~}Bye.

   Example 3:  (Suppose a new line is started whenever there is a mode
   switch.)
   This sentence is in ASCII.
   The next sentence is in GB.~
   ~{<:Ky2;S{#,NpJ)l6HK!#~}~
   Bye.

Acknowledgement

   Edmund Lai was the first one who brought my attention to this topic.
   Discussions with Ed, Tin-Fook Ngai, Yagui Wei and Ricky Yeung were
   very helpful in shaping the ideas in this article.  Thanks to
   Tin-Fook for his careful review of the draft and numerous interesting
   suggestions.

References

   [1] Fung Fung Lee, "HZ - A Data Format for Exchanging Files of
       Arbitrarily Mixed Chinese and ASCII Characters," September 4,
       1989.
       As part of //ftp.ifcss.org/software/unix/convert/HZ-2.0.tar.gz

Security Considerations

   Security issues are not addressed in this memo.

Author's Address

   Fung Fung Lee
   Computer Systems Laboratory
   Stanford University
   Stanford, CA 94309

   Phone: +1 415 723 1450
   EMail: lee@csl.stanford.edu
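
The encoding "style" recommended in Section 3 (a maximum line size, with
'~}' closing GB mode before every line end and '~\n' marking inserted
breaks) can be sketched as follows.  This is an illustration under stated
assumptions, not the author's encoder: the name hz_encode and the
max_line parameter are hypothetical, characters are mapped to GB codes by
clearing the high bit of Python's "gb2312" output, and the greedy
break policy is only one of many that follow the HZ syntax.

```python
from typing import Optional


def hz_encode(text: str, max_line: Optional[int] = None) -> bytes:
    """Encode text as HZ; optionally keep every output line <= max_line."""
    if max_line is not None and max_line < 7:
        raise ValueError("max_line must be at least 7")
    out = bytearray()
    gb_mode = False
    col = 0  # bytes already written on the current output line

    def fits(needed: int) -> bool:
        return max_line is None or col + needed <= max_line

    for ch in text:
        if ord(ch) < 0x80:
            if gb_mode:
                out += b"~}"             # leave GB mode before any ASCII
                col += 2
                gb_mode = False
            if ch == "\n":
                out += b"\n"             # a real newline of the text
                col = 0
                continue
            token = b"~~" if ch == "~" else ch.encode("ascii")
            if not fits(len(token) + 1):  # reserve one column for '~'
                out += b"~\n"
                col = 0
            out += token
            col += len(token)
        else:
            # Assumption: "gb2312" output with high bits cleared is the GB code.
            pair = bytes(b & 0x7F for b in ch.encode("gb2312"))
            if gb_mode and not fits(2 + 3):   # pair + '~}~'
                out += b"~}~\n"
                gb_mode = False
                col = 0
            if not gb_mode:
                if not fits(2 + 2 + 3):       # '~{' + pair + '~}~'
                    out += b"~\n"
                    col = 0
                out += b"~{"
                col += 2
                gb_mode = True
            out += pair
            col += 2
    if gb_mode:
        out += b"~}"
    return bytes(out)
```

Called without max_line, the function produces the single-run layout of
Example 1.  With max_line=42 the greedy policy breaks the GB run of
Example 2 at the same point as the memo does, since the second line
reaches exactly 42 bytes once '~}~' is appended.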