Dictionary, Encyclopedia and Thesaurus - The Free Dictionary

Unicode

Unicode (yo͞o`nĭkōd'), set of codes used to represent letters, numbers, control characters, and the like, designed for use internationally in computers. It has been expanded to include such items as scientific, mathematical, and technical symbols, and even musical notation. The Unicode standard defines codes for linguistic symbols used in every major language written today. It includes the Latin alphabet used for English, the Cyrillic alphabet used for Russian, the Greek, Hebrew, and Arabic alphabets, and other alphabets and alphabetlike writing systems used in countries across Europe, Africa, the Indian subcontinent, and Asia, such as Japanese kana, Korean hangeul, and Chinese bopomofo. A large part of the Unicode standard is devoted to thousands of unified character codes for Chinese, Japanese, and Korean ideographs. Adopted as an international standard in 1992, Unicode was originally a "double-byte" code, in which each character is a 16-digit binary number, that could represent up to 65,536 items. No longer limited to 16 bits, it can now represent about one million code positions using three encoding forms called Unicode Transformation Formats (UTF).
UTF-8, which consists of one-, two-, three-, and four-byte codes, is used extensively in World Wide Web applications; UTF-16, which consists of two- and four-byte codes, is used primarily for data storage and text processing; and UTF-32, which consists of four-byte codes, is used where character handling must be as efficient as possible. See also ASCII.

Unicode
International character-encoding system designed to support the electronic interchange, processing, and display of the written texts of the diverse languages of the modern and classical world. The Unicode Worldwide Character Standard includes letters, digits, diacritics, punctuation marks, and technical symbols for all the world's principal written languages, using a uniform encoding scheme. The first version of Unicode was introduced in 1991; the most recent version contains almost 50,000 characters. Numerous encoding systems (including ASCII) predate Unicode. With Unicode (unlike earlier systems), the unique number provided for each character remains the same on any system that supports Unicode.

Unicode
A character code that defines every character in most of the written languages in the world. Although Unicode is commonly thought to be only a two-byte coding system, a Unicode "code point" can occupy as little as one byte or as many as four bytes, depending on the encoding.
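
The byte counts described above can be checked with a short Python sketch. It relies only on the built-in `str.encode` codecs; the helper name `encoded_lengths` and the sample characters are illustrative choices, not part of the entry.

```python
# Sample characters: Latin "A", "e" with acute accent, the euro sign,
# and the musical G clef symbol (which lies outside the first 65,536 codes).
samples = ["A", "\u00e9", "\u20ac", "\U0001d11e"]


def encoded_lengths(ch):
    """Return (UTF-8, UTF-16, UTF-32) byte counts for one character.

    The "-le" codec variants are used so that Python does not prepend
    a byte order mark, which would inflate the counts.
    """
    return (
        len(ch.encode("utf-8")),
        len(ch.encode("utf-16-le")),
        len(ch.encode("utf-32-le")),
    )


for ch in samples:
    utf8, utf16, utf32 = encoded_lengths(ch)
    print(f"U+{ord(ch):04X}: UTF-8={utf8}  UTF-16={utf16}  UTF-32={utf32}")
```

In this sketch the UTF-8 length grows from one to four bytes across the samples, UTF-16 uses two bytes for all but the last character, and UTF-32 always uses four.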
The code point is a unique number for a character or some character aspect such as an accent mark or ligature. Unicode supports more than a million code points, which are written with a "U" followed by a plus sign and the number in hex; for example, the word "Hello" is written U+0048 U+0065 U+006C U+006C U+006F (see hex chart).

Character Encoding Schemes

There are several formats for storing Unicode code points. When combined with the byte order of the hardware (big endian or little endian), they are known officially as "character encoding schemes." They are also known by their UTF acronyms, which stand for "Unicode Transformation Format" or "Universal Character Set Transformation Format." See byte order.

UTF-8 is widely used because its first 128 code points are encoded as the same single bytes as ASCII, and although up to four bytes can be used, only one byte per character is required for unaccented English text. UTF-16 uses two or four bytes per code point, and UTF-32 uses a fixed four bytes. See DBCS.

| Unicode Coding Scheme | ISO 10646 Equivalent | Number of Bytes | Byte Order |
|---|---|---|---|
| UTF-8 | | 1-4 | not applicable (byte-oriented) |
| UTF-16 | (UCS-2) | 2 or 4 | BE or LE |
| UTF-16BE | (UCS-2) | 2 or 4 | BE |
| UTF-16LE | (UCS-2) | 2 or 4 | LE |
| UTF-32 | (UCS-4) | 4 | BE or LE |
| UTF-32BE | (UCS-4) | 4 | BE |
| UTF-32LE | (UCS-4) | 4 | LE |
| UTF-7 (pure ASCII; compatible with early 7-bit e-mail systems) | | variable | not applicable (byte-oriented) |

Byte order (see byte order): BE = big endian, LE = little endian.

Unicode
Computing: a character set for all languages
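
As a rough illustration of the U+ notation and of how byte order changes the stored bytes, here is a minimal Python sketch. The function names `code_points` and `byte_orders` are hypothetical helpers introduced for this example; the codec names are those of Python's standard library.

```python
def code_points(text):
    """Format each character of text in U+XXXX notation (hex, at least 4 digits)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


def byte_orders(ch):
    """Return the encoded bytes of one character, as hex strings,
    under several encoding schemes with explicit byte order."""
    schemes = ["utf-8", "utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"]
    return {name: ch.encode(name).hex() for name in schemes}


print(code_points("Hello"))

# The same code point U+0048 ("H") stored under different schemes:
for name, hex_bytes in byte_orders("H").items():
    print(f"{name:10} {hex_bytes}")
```

For "H", the big-endian and little-endian variants contain the same bytes in opposite order, while UTF-8 stores a single byte and so has no byte order to choose.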