About: Unicode



An Entity of Type: dbo:Company, within Data Space: el.dbpedia.org, associated with source document(s)

Unicode, formally the Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines 144,697 characters covering 159 modern and historic scripts, as well as symbols, emoji, and non-visual control and formatting codes.

Attributes	Values
rdf:type	Thing company
rdfs:label	Unicode (en)
rdfs:comment	Unicode, formally the Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines 144,697 characters covering 159 modern and historic scripts, as well as symbols, emoji, and non-visual control and formatting codes. (en)
rdfs:seeAlso	Unicode equivalence, Universal Character Set characters
sameAs	Unicode Unicode Unicode Unicode Unicode
dbp:wikiPageUsesTemplate	dbt:Anchor dbt:As_of dbt:Authority_control dbt:Better_source_needed dbt:Char dbt:Citation_needed dbt:Cite_book dbt:Clarify dbt:Cn dbt:Contains_special_characters dbt:Efn dbt:Em dbt:For dbt:IETF_RFC dbt:IPA-th dbt:ISBN dbt:Main dbt:Notelist dbt:Official_website dbt:Quote dbt:Refbegin dbt:Refend dbt:Reflist dbt:Sc2 dbt:See_also dbt:Short_description dbt:Sister_project_links dbt:Snd dbt:Tt dbt:Ubl dbt:Unichar dbt:Use_dmy_dates dbt:DMOZ dbt:Typo dbt:Character_encoding dbt:Infobox_character_encoding dbt:Unicode_navigation dbt:Middot dbt:Planes_(Unicode) dbt:Wiktth dbt:General_Category_(Unicode) dbt:Unicode_version
Subject	Character encoding, Digital typography, Unicode
gold:hypernym	List of Utah state symbols
prov:wasDerivedFrom	http://en.wikipedia.org/wiki/Unicode?oldid=1074338498&ns=0
Wikipage page ID	31742 (xsd:integer)
page length (characters) of wiki page	73115 (xsd:nonNegativeInteger)
Wikipage revision ID	1074338498 (xsd:integer)
has abstract	Unicode, formally the Unicode Standard, is an information technology standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. The standard, which is maintained by the Unicode Consortium, defines 144,697 characters covering 159 modern and historic scripts, as well as symbols, emoji, and non-visual control and formatting codes. The Unicode character repertoire is synchronized with ISO/IEC 10646, each being code-for-code identical with the other. The Unicode Standard, however, includes more than just the base code. Alongside the character encodings, the Consortium's official publication includes a wide variety of details about the scripts and how to display them: normalization rules, decomposition, collation, rendering, and bidirectional text display order for multilingual texts, and so on. The Standard also includes reference data files and visual charts to help developers and designers correctly implement the repertoire. Unicode's success at unifying character sets has led to its widespread and predominant use in the internationalization and localization of computer software. The standard has been implemented in many recent technologies, including modern operating systems, XML, and most modern programming languages. Unicode can be implemented by different character encodings. The Unicode standard defines Unicode Transformation Formats (UTF): UTF-8, UTF-16, and UTF-32, and several other encodings. The most commonly used encodings are UTF-8, UTF-16, and the obsolete UCS-2 (a precursor of UTF-16 without full support for Unicode); GB18030, while not an official Unicode standard, is standardized in China and implements Unicode fully. UTF-8, the dominant encoding on the World Wide Web (used in over 95% of websites as of 2020, and up to 100% for some languages) and on most Unix-like operating systems, uses one byte (8 bits) for the first 128 code points, and up to 4 bytes for other characters. 
The first 128 Unicode code points represent the ASCII characters, which means that any ASCII text is also a UTF-8 text. UCS-2 uses two bytes (16 bits) for each character but can only encode the first 65,536 code points, the so-called Basic Multilingual Plane (BMP). With 1,112,064 possible Unicode code points corresponding to characters on 17 planes, and with over 144,000 code points defined as of version 14.0, UCS-2 can represent fewer than half of all encoded Unicode characters. Therefore, UCS-2 is obsolete, though still used in software. UTF-16 extends UCS-2 by using the same 16-bit encoding as UCS-2 for the Basic Multilingual Plane, and a 4-byte encoding for the other planes. As long as it contains no code points in the reserved range U+D800–U+DFFF, a UCS-2 text is valid UTF-16 text. UTF-32 (also referred to as UCS-4) uses four bytes to encode any given code point, but not necessarily any given user-perceived character (loosely speaking, a grapheme), since a user-perceived character may be represented by a grapheme cluster (a sequence of multiple code points). As in UCS-2, the number of bytes per code point is fixed, facilitating code point indexing; but unlike UCS-2, UTF-32 is able to encode all Unicode code points. However, because each code point uses four bytes, UTF-32 takes significantly more space than other encodings, and is not widely used. Although UTF-32 has a fixed size for each code point, it is also variable-length with respect to user-perceived characters. Examples include the Devanagari kshi, which is encoded by 4 code points, and national flag emojis, which are composed of two code points. All combining character sequences are graphemes, but there are other sequences of code points that are as well. (en)
foaf:isPrimaryTopicOf	http://en.wikipedia.org/wiki/Unicode
is rdfs:seeAlso of	Radical (Chinese characters), Islamic honorifics, Pe (letter)
is gold:hypernym of	Specials (Unicode block), Latin-1 Supplement (Unicode block), Tamil All Character Encoding
is Wikipage redirect of	Uni-code, UniCode, Unicode.org, Unicode 1, Unicode 1.0, Unicode 1.0.0, Unicode 1.0.1, Unicode 1.1, Unicode 1.1.0, Unicode 1.1.5, Unicode 10, Unicode 10.0, Unicode 10.0.0, Unicode 11, Unicode 11.0, Unicode 11.0.0, Unicode 12, Unicode 12.0, Unicode 12.0.0, Unicode 12.1, Unicode 12.1.0, Unicode 13, Unicode 13.0, Unicode 13.0.0, Unicode 14, Unicode 14.0, Unicode 14.0.0, Unicode 15, Unicode 15.0, Unicode 15.0.0, Unicode 2
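
The encoding sizes described in the "has abstract" value above can be checked with a short Python sketch. The sample strings and the helper names encoded_sizes and utf16_surrogates are illustrative assumptions, not part of the source data; the codecs used (utf-8, utf-16-le, utf-32-le) are Python's standard ones, and the surrogate arithmetic is the usual UTF-16 rule for code points outside the Basic Multilingual Plane.

```python
# Illustrative sketch only: sample strings and helper names are assumptions.

def encoded_sizes(text):
    """Return the code point count and the byte length in each UTF."""
    return {
        "code_points": len(text),                   # Python str is indexed by code point
        "utf-8": len(text.encode("utf-8")),         # 1 to 4 bytes per code point
        "utf-16": len(text.encode("utf-16-le")),    # 2 bytes in the BMP, 4 bytes outside it
        "utf-32": len(text.encode("utf-32-le")),    # always 4 bytes per code point
    }


def utf16_surrogates(code_point):
    """Split a code point outside the BMP into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)      # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits of the offset
    return high, low


# 17 planes of 65,536 code points, minus the 2,048 reserved surrogates
# in U+D800-U+DFFF, gives the 1,112,064 figure quoted in the abstract.
POSSIBLE_CODE_POINTS = 17 * 0x10000 - (0xDFFF - 0xD800 + 1)

SAMPLES = {
    "ASCII letter A": "A",
    "Greek alpha (BMP)": "\u03b1",
    "Euro sign (BMP)": "\u20ac",
    "Emoji U+1F600 (outside BMP)": "\U0001F600",
    "Devanagari kshi (4 code points)": "\u0915\u094d\u0937\u093f",
    "Flag of Greece (2 code points)": "\U0001F1EC\U0001F1F7",
}

if __name__ == "__main__":
    for label, text in SAMPLES.items():
        print(label, encoded_sizes(text))
    print([hex(unit) for unit in utf16_surrogates(0x1F600)])
    print(POSSIBLE_CODE_POINTS)
```

The last two samples show the point made about UTF-32: each code point has a fixed width, yet one user-perceived character can still span several code points. Counting user-perceived characters requires grapheme cluster segmentation, which this sketch does not attempt.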



Data on this page belongs to its respective rights holders.
Virtuoso Faceted Browser Copyright © 2009-2025 OpenLink Software