Revision as of 15:44, 8 September 2010 by DePiep (minor edit; no edit summary) | Revision as of 20:36, 30 September 2010 by DePiep (edit summary: integrating notes into the table) | ||
Line 310: | Line 310: | ||
|- | |- | ||
| Zzzz || 999 || Code for ] || Unknown || || || all other code points | | Zzzz || 999 || Code for ] || Unknown || || || all other code points | ||
|- class="sortbottom" | |||
|}<div style="border: 1px solid grey;"> | |||
| colspan="7" | '''Notes''' | |||
;Notes | |||
:1.{{note|ISO_Unicode}} | :1.{{note|ISO_Unicode}}</br> | ||
:2.{{note|ISO_list}} (Alias names are informal) | :2.{{note|ISO_list}} (Alias names are informal)</br> | ||
:3.{{note|ISO_changes}} (including Aliases for Unicode)] {{As of|2010|07|23}} | :3.{{note|ISO_changes}} (including Aliases for Unicode)] {{As of|2010|07|23}}</br> | ||
:4.{{note|Asof_Unicode_version}}As of Unicode version 5.2 | :4.{{note|Asof_Unicode_version}}As of Unicode version 5.2</br> | ||
:5.{{note|Aliases_for_Unicode}}Unicode uses the Alias (Property Value Alias) as the script-name. These Alias names are part of Unicode and are published informatively next to ISO 15924 | :5.{{note|Aliases_for_Unicode}}Unicode uses the Alias (Property Value Alias) as the script-name. These Alias names are part of Unicode and are published informatively next to ISO 15924</br> | ||
|}<noinclude> | |||
{{Documentation}} | {{Documentation}} | ||
</noinclude> | </noinclude> |
Revision as of 20:36, 30 September 2010
Code | Nr | Name | Alias | Version | Characters | Remark |
---|---|---|---|---|---|---|
Arab | 160 | Arabic | Arabic | 1.0 | 1030 | |
Armi | 124 | Imperial Aramaic | Imperial_Aramaic | 5.2 | 31 | Ancient/historic |
Armn | 230 | Armenian | Armenian | 1.0 | 90 | |
Avst | 134 | Avestan | Avestan | 5.2 | 61 | Ancient/historic |
Bali | 360 | Balinese | Balinese | 5.0 | 121 | |
Bamu | 435 | Bamum | Bamum | 5.2 | 88 | |
Bass | 259 | Bassa Vah | not in Unicode | |||
Batk | 365 | Batak | Batak | not in Unicode | ||
Beng | 325 | Bengali | Bengali | 1.0 | 92 | |
Blis | 550 | Blissymbols | not in Unicode | |||
Bopo | 285 | Bopomofo | Bopomofo | 1.0 | 65 | |
Brah | 300 | Brahmi | Brahmi | not in Unicode | ||
Brai | 570 | Braille | Braille | 3.0 | 256 | |
Bugi | 367 | Buginese | Buginese | 4.1 | 30 | |
Buhd | 372 | Buhid | Buhid | 3.2 | 20 | |
Cakm | 349 | Chakma | not in Unicode | |||
Cans | 440 | Unified Canadian Aboriginal Syllabics | Canadian_Aboriginal | 3.0 | 710 | |
Cari | 201 | Carian | Carian | 5.1 | 49 | Ancient/historic |
Cham | 358 | Cham | Cham | 5.1 | 83 | |
Cher | 445 | Cherokee | Cherokee | 3.0 | 85 | |
Cirt | 291 | Cirth | not in Unicode | |||
Copt | 204 | Coptic | Coptic | 1.0 | 135 | (disunified from Greek in 4.1) Ancient/historic |
Cprt | 403 | Cypriot | Cypriot | 4.0 | 55 | Ancient/historic |
Cyrl | 220 | Cyrillic | Cyrillic | 1.0 | 404 | |
Cyrs | 221 | Cyrillic (Old Church Slavonic variant) | not in Unicode | |||
Deva | 315 | Devanagari (Nagari) | Devanagari | 1.0 | 140 | |
Dsrt | 250 | Deseret (Mormon) | Deseret | 3.1 | 80 | |
Dupl | 755 | Duployan shorthand | not in Unicode | |||
Egyd | 70 | Egyptian demotic | not in Unicode | |||
Egyh | 60 | Egyptian hieratic | not in Unicode | |||
Egyp | 50 | Egyptian hieroglyphs | Egyptian_Hieroglyphs | 5.2 | 1071 | Ancient/historic |
Elba | 226 | Elbasan | not in Unicode | |||
Ethi | 430 | Ethiopic (Ge'ez) | Ethiopic | 3.0 | 461 | |
Geok | 241 | Khutsuri (Asomtavruli and Nuskhuri) | not in Unicode | |||
Geor | 240 | Georgian (Mkhedruli) | Georgian | 1.0 | 120 | |
Glag | 225 | Glagolitic | Glagolitic | 4.1 | 94 | Ancient/historic |
Goth | 206 | Gothic | Gothic | 3.1 | 27 | Ancient/historic |
Gran | 343 | Grantha | not in Unicode | |||
Grek | 200 | Greek | Greek | 1.0 | 511 | |
Gujr | 320 | Gujarati | Gujarati | 1.0 | 83 | |
Guru | 310 | Gurmukhi | Gurmukhi | 1.0 | 79 | |
Hang | 286 | Hangul (Hangul, Hangeul) | Hangul | 1.0 | 11,737 | (Hangul syllables relocated in 2.0) |
Hani | 500 | Han (Hanzi, Kanji, Hanja) | Han | 1.0 | 75,738 | |
Hano | 371 | Hanunoo (Hanunóo) | Hanunoo | 3.2 | 21 | |
Hans | 501 | Han (Simplified variant) | not in Unicode | |||
Hant | 502 | Han (Traditional variant) | not in Unicode | |||
Hebr | 125 | Hebrew | Hebrew | 1.0 | 133 | |
Hira | 410 | Hiragana | Hiragana | 1.0 | 90 | |
Hmng | 450 | Pahawh Hmong | not in Unicode | |||
Hrkt | 412 | (alias for Hiragana + Katakana) | Katakana_Or_Hiragana | not in Unicode | ||
Hung | 176 | Old Hungarian | not in Unicode | |||
Inds | 610 | Indus (Harappan) | not in Unicode | |||
Ital | 210 | Old Italic (Etruscan, Oscan, etc.) | Old_Italic | 3.1 | 35 | Ancient/historic |
Java | 361 | Javanese | Javanese | 5.2 | 91 | |
Jpan | 413 | Japanese (alias for Han + Hiragana + Katakana) | not in Unicode | |||
Kali | 357 | Kayah Li | Kayah_Li | 5.1 | 48 | |
Kana | 411 | Katakana | Katakana | 1.0 | 299 | |
Khar | 305 | Kharoshthi | Kharoshthi | 4.1 | 65 | Ancient/historic |
Khmr | 355 | Khmer | Khmer | 3.0 | 146 | |
Knda | 345 | Kannada | Kannada | 1.0 | 84 | |
Kore | 287 | Korean (alias for Hangul + Han) | not in Unicode | |||
Kpel | 436 | Kpelle | not in Unicode | |||
Kthi | 317 | Kaithi | Kaithi | 5.2 | 66 | Ancient/historic |
Lana | 351 | Tai Tham (Lanna) | Tai_Tham | 5.2 | 127 | |
Laoo | 356 | Lao | Lao | 1.0 | 65 | |
Latf | 217 | Latin (Fraktur variant) | not in Unicode | |||
Latg | 216 | Latin (Gaelic variant) | not in Unicode | |||
Latn | 215 | Latin | Latin | 1.0 | 1244 | |
Lepc | 335 | Lepcha (Róng) | Lepcha | 5.1 | 74 | |
Limb | 336 | Limbu | Limbu | 4.0 | 66 | |
Lina | 400 | Linear A | not in Unicode | |||
Linb | 401 | Linear B | Linear_B | 4.0 | 211 | Ancient/historic |
Lisu | 399 | Lisu (Fraser) | Lisu | 5.2 | 48 | |
Loma | 437 | Loma | not in Unicode | |||
Lyci | 202 | Lycian | Lycian | 5.1 | 29 | Ancient/historic |
Lydi | 116 | Lydian | Lydian | 5.1 | 27 | Ancient/historic |
Mand | 140 | Mandaic, Mandaean | Mandaic | not in Unicode | ||
Mani | 139 | Manichaean | not in Unicode | |||
Maya | 90 | Mayan hieroglyphs | not in Unicode | |||
Mend | 438 | Mende | not in Unicode | |||
Merc | 101 | Meroitic Cursive | not in Unicode | |||
Mero | 100 | Meroitic Hieroglyphs | not in Unicode | |||
Mlym | 347 | Malayalam | Malayalam | 1.0 | 95 | |
Mong | 145 | Mongolian | Mongolian | 3.0 | 153 | Includes Clear, Manchu scripts |
Moon | 218 | Moon (Moon code, Moon script, Moon type) | not in Unicode | |||
Mtei | 337 | Meitei Mayek (Meithei, Meetei) | Meetei_Mayek | 5.2 | 56 | |
Mymr | 350 | Myanmar (Burmese) | Myanmar | 3.0 | 188 | |
Narb | 106 | Old North Arabian (Ancient North Arabian) | not in Unicode | |||
Nbat | 159 | Nabataean | not in Unicode | |||
Nkgb | 420 | Nakhi Geba ('Na-'Khi ²Ggo-¹baw, Naxi Geba) | not in Unicode | |||
Nkoo | 165 | N’Ko | Nko | 5.0 | 59 | |
Ogam | 212 | Ogham | Ogham | 3.0 | 29 | Ancient/historic |
Olck | 261 | Ol Chiki (Ol Cemet’, Ol, Santali) | Ol_Chiki | 5.1 | 48 | |
Orkh | 175 | Old Turkic, Orkhon Runic | Old_Turkic | 5.2 | 73 | Ancient/historic |
Orya | 327 | Oriya | Oriya | 1.0 | 84 | |
Osma | 260 | Osmanya | Osmanya | 4.0 | 40 | |
Palm | 126 | Palmyrene | not in Unicode | |||
Perm | 227 | Old Permic | not in Unicode | |||
Phag | 331 | Phags-pa | Phags_Pa | 5.0 | 56 | Ancient/historic |
Phli | 131 | Inscriptional Pahlavi | Inscriptional_Pahlavi | 5.2 | 27 | Ancient/historic |
Phlp | 132 | Psalter Pahlavi | not in Unicode | |||
Phlv | 133 | Book Pahlavi | not in Unicode | |||
Phnx | 115 | Phoenician | Phoenician | 5.0 | 29 | Ancient/historic |
Plrd | 282 | Miao (Pollard) | not in Unicode | |||
Prti | 130 | Inscriptional Parthian | Inscriptional_Parthian | 5.2 | 30 | Ancient/historic |
Qaaa | 900 | Reserved for private use (start) | not in Unicode | |||
Qaai | 908 | (Private use) | Inherited | 523 | In versions prior to 5.2 (from 5.2: 'Zinh') | |
Qabx | 949 | Reserved for private use (end) | not in Unicode | |||
Rjng | 363 | Rejang (Redjang, Kaganga) | Rejang | 5.1 | 37 | |
Roro | 620 | Rongorongo | not in Unicode | |||
Runr | 211 | Runic | Runic | 3.0 | 78 | Ancient/historic |
Samr | 123 | Samaritan | Samaritan | 5.2 | 61 | |
Sara | 292 | Sarati | not in Unicode | |||
Sarb | 105 | Old South Arabian | Old_South_Arabian | 5.2 | 32 | Ancient/historic |
Saur | 344 | Saurashtra | Saurashtra | 5.1 | 81 | |
Sgnw | 95 | SignWriting | not in Unicode | |||
Shaw | 281 | Shavian (Shaw) | Shavian | 4.0 | 48 | |
Sind | 318 | Sindhi | not in Unicode | |||
Sinh | 348 | Sinhala | Sinhala | 3.0 | 80 | |
Sund | 362 | Sundanese | Sundanese | 5.1 | 55 | |
Sylo | 316 | Syloti Nagri | Syloti_Nagri | 4.1 | 44 | |
Syrc | 135 | Syriac | Syriac | 3.0 | 77 | |
Syre | 138 | Syriac (Estrangelo variant) | not in Unicode | |||
Syrj | 137 | Syriac (Western variant) | not in Unicode | |||
Syrn | 136 | Syriac (Eastern variant) | not in Unicode | |||
Tagb | 373 | Tagbanwa | Tagbanwa | 3.2 | 18 | |
Tale | 353 | Tai Le | Tai_Le | 4.0 | 35 | |
Talu | 354 | New Tai Lue | New_Tai_Lue | 4.1 | 83 | |
Taml | 346 | Tamil | Tamil | 1.0 | 72 | |
Tavt | 359 | Tai Viet | Tai_Viet | 5.2 | 72 | |
Telu | 340 | Telugu | Telugu | 1.0 | 93 | |
Teng | 290 | Tengwar | not in Unicode | |||
Tfng | 120 | Tifinagh (Berber) | Tifinagh | 4.1 | 55 | |
Tglg | 370 | Tagalog (Baybayin, Alibata) | Tagalog | 3.2 | 20 | |
Thaa | 170 | Thaana | Thaana | 3.0 | 50 | |
Thai | 352 | Thai | Thai | 1.0 | 86 | |
Tibt | 330 | Tibetan | Tibetan | 1.0 | 201 | (removed in 1.1 and reintroduced in 2.0) |
Ugar | 40 | Ugaritic | Ugaritic | 4.0 | 31 | Ancient/historic |
Vaii | 470 | Vai | Vai | 5.1 | 300 | |
Visp | 280 | Visible Speech | not in Unicode | |||
Wara | 262 | Warang Citi (Varang Kshiti) | not in Unicode | |||
Xpeo | 30 | Old Persian | Old_Persian | 4.1 | 50 | Ancient/historic |
Xsux | 20 | Cuneiform, Sumero-Akkadian | Cuneiform | 5.0 | 982 | Ancient/historic |
Yiii | 460 | Yi | Yi | 3.0 | 1220 | |
Zinh | 994 | Code for inherited script | Inherited | In version 5.2 (prior versions: 'Qaai') | ||
Zmth | 995 | Mathematical notation | not a 'script' in Unicode | |||
Zsym | 996 | Symbols | not a 'script' in Unicode | |||
Zxxx | 997 | Code for unwritten documents | not in Unicode | |||
Zyyy | 998 | Code for undetermined script | Common | 5395 | ||
Zzzz | 999 | Code for uncoded script | Unknown | all other code points | ||
Notes
This documentation is shared between templates {{Unicode blocks}} and {{ISO 15924 script codes and related Unicode data}}.
Usage
The template can be used as usual. It is not a navigation box, so it can be placed anywhere in an article. The notes are contained within the template and will not appear in the article's main References section.
- Note: when resolving red links or wrong links, edit {{ISO 15924/wp-article}}. That is where the connection between an ISO code and a Wikipedia article is made.
ISO 15924 templates
ISO 15924 – Unicode – Wikidata – enwiki: Overview templates & properties | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Item | In template | /subs | Content | Example | Publisher | Usage | TPU | Note | |||
Code (ISO) | {{ISO 15924 code}} | /subp | ID | Arab | ISO 15924 | Everywhere | Alpha-4, enwiki central ISO script id list | ||||
Alias (Unicode) | {{ISO 15924 alias}} | /subp | ID | Arabic | Unicode | ||||||
Article (enwiki) | {{ISO 15924/wp-article}} | /subp | ID | ] | enwiki | ||||||
QID (wikidata) | {{ISO 15924/qid}} | /subp | ID | Q790681 | Wikidata | ||||||
Number; range 000–999 | {{ISO 15924 number}} | /subp | ID | 234 | ISO 15924 | rarely | ISO number not used as ID in enwiki; see Code | ||||
Scripts (sub)merged into main scripts | {{ISO 15924 alias/unicode-merged-into-script}} | /subp | Merged scripts | Latf → Latn | Unicode | Script descriptions, re U+ | In mainspace: 10× hardcoded (e.g.); 2× Qxxx depr | ||||
Name | {{ISO 15924 name}} | /subp | data | Deseret (Mormon) | ISO 15924 | ||||||
Unicode chapter | {{ISO 15924/unicode-chapter}} | /subp | data | Ch 18.1 | Unicode | pdf does not open at .n subchapter | |||||
Script example character | {{ISO 15924/script-example-character}} | /subp | data | ع | enwiki | User boxes | |||||
In Mainspace | |||||||||||
Overview | {{ISO 15924 script codes and related Unicode data}} | /subp | list | enwiki | ISO 15924 | Mainspace: ISO 15924, Script (Unicode), Unicode character property | |||||
Blocks ⇄ Scripts | {{Unicode blocks}} | /subp | list | enwiki | some script articles | Mainspace; related | |||||
graphs | {{ISO 15924/unicode-script-illustration}} | /subp | fonts&files | Mainspace, Scripts in Unicode | |||||||
Overviews | |||||||||||
Overview: templates | {{ISO 15924/overview-templates}} | /subp | list | Wikipedia | |||||||
WP-category | {{ISO 15924/wp-category}} | /subp | data | Category:Arabic script | enwiki | Not checked for mainspace | watered down concept for minor scripts | ||||
Also (doc, userbox, technical, ...) | |||||||||||
Documentation | {{ISO 15924 templates/doc}} | /subp | prime documentation | Latin script in Unicode (~) | Reused in multiple templates | ||||||
Redirect | {{R from ISO 15924 code}} | /subp | template | enwiki | Redirects | ||||||
userbox | {{User iso15924}} | /subp | Userboxes | ||||||||
Related Changes | {{Recent changes in Unicode}} | /subp | pages Unicode, ISO 15924 WP:RELC | Related Changes | enwiki | WikiProject | 900+700 P x T | ||||
Unicode versions | {{Unicode version}} | /subp | Version number | as of Unicode version 13.0 | enwiki | (new Sep2022) | |||||
Wikidata properties | |||||||||||
Directionality | script directionality (P1406) | P1406 | {{Infobox}}, ... | ||||||||
Unicode ranges | Unicode range (P5949) | P5949 | {{Infobox}}, ... | ||||||||
ISO English name | name (P2561) | P2561 | Crosscheck only | ||||||||
Modules | |||||||||||
Data module | module:Unicode data | /subp | § Functions overview | ||||||||
HTML named entities | module:Numcr2namecr | /subp | |||||||||
Template data
This is the TemplateData for this template used by TemplateWizard, VisualEditor and other tools. See a monthly parameter usage report for Template:ISO 15924 script codes and related Unicode data in articles based on its TemplateData.
TemplateData for ISO 15924 script codes and related Unicode data
No description.
Parameter | Description | Type | Status | |
---|---|---|---|---|
1 | 1 | no description | Unknown | optional |
Background: How this table is composed
Note that a script is not a language. A single script, such as the Latin alphabet, is used by many languages. Unicode deals only with scripts, not with the languages that use them. Still, there may be nuances within a script, such as the different use of accented letters in English and Polish.
Step 1: ISO defines a script
ISO defines and publishes a script in the ISO 15924 list. It defines the Alpha-4 code (Aaaa-Zzzz), the Numeric code (000-999), and the formal Name for each accepted script. Currently there are some 160 scripts defined in this list. Included are entries such as "Mathematical notation (Zmth)" and "Code for undetermined script (a.k.a. Common, Zyyy)". The list is formally maintained and published by ISO; in practice, the Unicode Consortium office maintains it and publishes it on the Unicode website. Technically, the list is the file iso15924.txt.
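As a rough illustration of the three fields ISO defines per script, the sketch below models one entry and checks the shapes described above: four letters with a leading capital, and a number in the range 000-999. The class name `ScriptEntry` and its validation rules are assumptions made for this example; they are not taken from the iso15924.txt file format.

```python
import re
from dataclasses import dataclass

# Alpha-4 codes run from Aaaa to Zzzz: one capital letter, then three lowercase.
ALPHA4 = re.compile(r"[A-Z][a-z]{3}")


@dataclass(frozen=True)
class ScriptEntry:
    """One ISO 15924 entry (hypothetical helper, for illustration only)."""
    code: str    # Alpha-4 code, e.g. "Zmth"
    number: int  # Numeric code, 0..999
    name: str    # formal ISO name

    def __post_init__(self):
        # Reject anything that does not match the shapes ISO 15924 describes.
        if not ALPHA4.fullmatch(self.code):
            raise ValueError(f"not an Alpha-4 code: {self.code!r}")
        if not 0 <= self.number <= 999:
            raise ValueError(f"numeric code out of range: {self.number}")

    @property
    def number_text(self) -> str:
        """Numeric code as a zero-padded three-digit string."""
        return f"{self.number:03d}"


# Two entries taken from the table above.
zmth = ScriptEntry("Zmth", 995, "Mathematical notation")
xsux = ScriptEntry("Xsux", 20, "Cuneiform, Sumero-Akkadian")
```

The zero-padded `number_text` reflects the 000-999 range; the table above prints the same numbers without leading zeros.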
Step 2: Unicode attaches an Alias name
Then Unicode (not ISO) maintains a list of Alias script names alongside the ISO-defined scripts, one for each script Unicode has encoded. The Alias name is an English name for that script.
So Unicode gives the ISO alpha-4 code a unique Alias name: Mymr:ISO Name=Myanmar (Burmese), Alias=Myanmar.
These Alias names are also present in the definition file iso15924.txt.
Step 3: Usage by Unicode
From that list, Unicode can translate any alpha-4 code into the Alias name of the script, and vice versa. Unicode does not use the formal ISO name.
A script name is used in the Unicode Name of a character: "U+05BF ֿ HEBREW POINT RAFE".
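The two-way translation between ISO code and Alias can be sketched as a pair of dictionaries. The rows are copied from the table above; the names `ROWS`, `alias_for` and `code_for` are hypothetical and chosen only for this example.

```python
# Rows copied from the table above: (ISO code, ISO name, Unicode Alias or None).
ROWS = [
    ("Arab", "Arabic", "Arabic"),
    ("Cans", "Unified Canadian Aboriginal Syllabics", "Canadian_Aboriginal"),
    ("Mymr", "Myanmar (Burmese)", "Myanmar"),
    ("Bass", "Bassa Vah", None),  # no Alias listed in the table
]

# Forward map: ISO code -> Alias (only for scripts that have an Alias).
code_to_alias = {code: alias for code, _name, alias in ROWS if alias is not None}
# Reverse map: Alias -> ISO code.
alias_to_code = {alias: code for code, alias in code_to_alias.items()}


def alias_for(code):
    """Return the Unicode Alias for an ISO code, or None if there is none."""
    return code_to_alias.get(code)


def code_for(alias):
    """Return the ISO code for a Unicode Alias, or None if unknown."""
    return alias_to_code.get(alias)
```

The reverse map assumes each Alias belongs to exactly one code. In the table above, the Alias "Inherited" is listed for both Qaai and Zinh (before and from version 5.2), so a complete implementation would have to decide which of the two to return.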
Per character
In the Unicode database, Unicode assigns one single appropriate alpha-4 code to every individual script character. So every letter, punctuation mark, digit and so on of a script gets that code. Characters used by multiple scripts, such as the period (.), have script code "Zyyy" (Common). The "script" codes for Mathematical notation (Zmth) and Symbols (Zsym) are not used by Unicode; symbols and mathematical characters generally have the script property "Common".
Then, in the file Scripts.txt, Unicode publishes the Alias script name per character (possibly by a range of characters). A part of that file looks like:
...
0591..05BD ; Hebrew # Mn HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
05BE ; Hebrew # Pd HEBREW PUNCTUATION MAQAF
05BF ; Hebrew # Mn HEBREW POINT RAFE
05C0 ; Hebrew # Po HEBREW PUNCTUATION PASEQ
05C1..05C2 ; Hebrew # Mn HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
05C3 ; Hebrew # Po HEBREW PUNCTUATION SOF PASUQ
...
This data file defines which scripts are present in Unicode and which script each code point belongs to.
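A minimal sketch of reading lines in this format, using only the Hebrew excerpt quoted above as input. The function names `parse_scripts` and `script_of` are hypothetical. Falling back to "Unknown" for unlisted code points follows the Zzzz row of the table ("all other code points").

```python
EXCERPT = """\
0591..05BD ; Hebrew # Mn HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
05BE ; Hebrew # Pd HEBREW PUNCTUATION MAQAF
05BF ; Hebrew # Mn HEBREW POINT RAFE
05C0 ; Hebrew # Po HEBREW PUNCTUATION PASEQ
05C1..05C2 ; Hebrew # Mn HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
05C3 ; Hebrew # Po HEBREW PUNCTUATION SOF PASUQ
"""


def parse_scripts(text):
    """Return a list of (first, last, alias) tuples from Scripts.txt-style text."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue  # blank or comment-only line
        points, alias = (field.strip() for field in data.split(";"))
        # A line holds either a single code point or a "first..last" range.
        first, _, last = points.partition("..")
        ranges.append((int(first, 16), int(last or first, 16), alias))
    return ranges


def script_of(code_point, ranges, default="Unknown"):
    """Look up the Alias for one code point; unlisted points get `default`."""
    for first, last, alias in ranges:
        if first <= code_point <= last:
            return alias
    return default


RANGES = parse_scripts(EXCERPT)
```

With this excerpt, `script_of(0x05BF, RANGES)` gives "Hebrew". Because the excerpt is only a fragment, code points outside it (such as the period, which is Common in the full file) fall through to the default here.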
In a block
Given a block (a range of code points), which scripts are present in that block? See {{Unicode blocks}}: that table is constructed by listing every script that is present in a block (once).
There is no fixed relation between a script name and a block name. Some scripts are contained in a single block, while others are spread across several blocks.
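The block question can be sketched as a range-overlap test. The data below is a small hand-picked subset of script ranges and block ranges, not the full Unicode data, and the names `SCRIPT_RANGES`, `BLOCKS`, `scripts_in_block` and `blocks_with_script` are hypothetical.

```python
# (first, last, alias) script ranges; a hand-picked subset for illustration.
SCRIPT_RANGES = [
    (0x0020, 0x0020, "Common"),  # SPACE
    (0x0041, 0x005A, "Latin"),   # A..Z
    (0x00C0, 0x00D6, "Latin"),   # À..Ö
    (0x0591, 0x05BD, "Hebrew"),  # first range of the Scripts.txt excerpt
]

# Block name -> (first, last) code point; also only a subset.
BLOCKS = {
    "Basic Latin": (0x0000, 0x007F),
    "Latin-1 Supplement": (0x0080, 0x00FF),
    "Hebrew": (0x0590, 0x05FF),
}


def scripts_in_block(block_name):
    """List each script that overlaps the block once, sorted by name."""
    lo, hi = BLOCKS[block_name]
    return sorted({alias for first, last, alias in SCRIPT_RANGES
                   if first <= hi and last >= lo})


def blocks_with_script(alias):
    """List the blocks (in table order) in which the script occurs."""
    return [name for name, (lo, hi) in BLOCKS.items()
            if any(a == alias and first <= hi and last >= lo
                   for first, last, a in SCRIPT_RANGES)]
```

Even in this subset, Basic Latin contains two scripts (Common and Latin), and Latin occurs in two blocks, which is why a script name cannot be derived from a block name.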
See also
- {{ISO 15924 script codes and related Unicode data/header}} (technical subtemplate)
- {{ISO 15924 script codes and related Unicode data/row}} (technical subtemplate)