
Template:ISO 15924 script codes and related Unicode data


This is an old revision of this page, as edited by OwenBlacker (talk | contribs) at 09:07, 3 May 2011 (Made collapsing only take affect when transcluded into a page; formatting: 45× whitespace (using Advisor.js)). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


ISO 15924 Scripts in Unicode

Code Nr Name Alias Version Characters Remark
Afak 439 Afaka not in Unicode
Arab 160 Arabic Arabic 1.0 1,051
Armi 124 Imperial Aramaic Imperial_Aramaic 5.2 31 Ancient/historic
Armn 230 Armenian Armenian 1.0 90
Avst 134 Avestan Avestan 5.2 61 Ancient/historic
Bali 360 Balinese Balinese 5.0 121
Bamu 435 Bamum Bamum 5.2 657
Bass 259 Bassa Vah not in Unicode
Batk 365 Batak Batak 6.0 56
Beng 325 Bengali Bengali 1.0 92
Blis 550 Blissymbols not in Unicode
Bopo 285 Bopomofo Bopomofo 1.0 70
Brah 300 Brahmi Brahmi 6.0 108
Brai 570 Braille Braille 3.0 256
Bugi 367 Buginese Buginese 4.1 30
Buhd 372 Buhid Buhid 3.2 20
Cakm 349 Chakma not in Unicode
Cans 440 Unified Canadian Aboriginal Syllabics Canadian_Aboriginal 3.0 710
Cari 201 Carian Carian 5.1 49 Ancient/historic
Cham 358 Cham Cham 5.1 83
Cher 445 Cherokee Cherokee 3.0 85
Cirt 291 Cirth not in Unicode
Copt 204 Coptic Coptic 1.0 135 (disunified from Greek in 4.1) Ancient/historic
Cprt 403 Cypriot Cypriot 4.0 55 Ancient/historic
Cyrl 220 Cyrillic Cyrillic 1.0 408
Cyrs 221 Cyrillic (Old Church Slavonic variant) not in Unicode
Deva 315 Devanagari (Nagari) Devanagari 1.0 150
Dsrt 250 Deseret (Mormon) Deseret 3.1 80
Dupl 755 Duployan shorthand not in Unicode
Egyd 70 Egyptian demotic not in Unicode
Egyh 60 Egyptian hieratic not in Unicode
Egyp 50 Egyptian hieroglyphs Egyptian_Hieroglyphs 5.2 1,071 Ancient/historic
Elba 226 Elbasan not in Unicode
Ethi 430 Ethiopic (Ge'ez) Ethiopic 3.0 495
Geok 241 Khutsuri (Asomtavruli and Nuskhuri) not in Unicode
Geor 240 Georgian (Mkhedruli) Georgian 1.0 120
Glag 225 Glagolitic Glagolitic 4.1 94 Ancient/historic
Goth 206 Gothic Gothic 3.1 27 Ancient/historic
Gran 343 Grantha not in Unicode
Grek 200 Greek Greek 1.0 511
Gujr 320 Gujarati Gujarati 1.0 83
Guru 310 Gurmukhi Gurmukhi 1.0 79
Hang 286 Hangul (Hangul, Hangeul) Hangul 1.0 11,739 (Hangul syllables relocated in 2.0)
Hani 500 Han (Hanzi, Kanji, Hanja) Han 1.0 75,960
Hano 371 Hanunoo (Hanunóo) Hanunoo 3.2 21
Hans 501 Han (Simplified variant) not in Unicode
Hant 502 Han (Traditional variant) not in Unicode
Hebr 125 Hebrew Hebrew 1.0 133
Hira 410 Hiragana Hiragana 1.0 91
Hmng 450 Pahawh Hmong not in Unicode
Hrkt 412 (alias for Hiragana + Katakana) Katakana_Or_Hiragana not in Unicode (see individual scripts)
Hung 176 Old Hungarian not in Unicode
Inds 610 Indus (Harappan) not in Unicode
Ital 210 Old Italic (Etruscan, Oscan, etc.) Old_Italic 3.1 35 Ancient/historic
Java 361 Javanese Javanese 5.2 91
Jpan 413 Japanese (alias for Han + Hiragana + Katakana) not in Unicode
Jurc 510 Jurchen not in Unicode
Kali 357 Kayah Li Kayah_Li 5.1 48
Kana 411 Katakana Katakana 1.0 300
Khar 305 Kharoshthi Kharoshthi 4.1 65 Ancient/historic
Khmr 355 Khmer Khmer 3.0 146
Knda 345 Kannada Kannada 1.0 86
Kore 287 Korean (alias for Hangul + Han) not in Unicode
Kpel 436 Kpelle not in Unicode
Kthi 317 Kaithi Kaithi 5.2 66 Ancient/historic
Lana 351 Tai Tham (Lanna) Tai_Tham 5.2 127
Laoo 356 Lao Lao 1.0 65
Latf 217 Latin (Fraktur variant) not in Unicode
Latg 216 Latin (Gaelic variant) not in Unicode
Latn 215 Latin Latin 1.0 1,267
Lepc 335 Lepcha (Róng) Lepcha 5.1 74
Limb 336 Limbu Limbu 4.0 66
Lina 400 Linear A not in Unicode
Linb 401 Linear B Linear_B 4.0 211 Ancient/historic
Lisu 399 Lisu (Fraser) Lisu 5.2 48
Loma 437 Loma not in Unicode
Lyci 202 Lycian Lycian 5.1 29 Ancient/historic
Lydi 116 Lydian Lydian 5.1 27 Ancient/historic
Mand 140 Mandaic, Mandaean Mandaic 6.0 29
Mani 139 Manichaean not in Unicode
Maya 90 Mayan hieroglyphs not in Unicode
Mend 438 Mende script not in Unicode
Merc 101 Meroitic Cursive not in Unicode
Mero 100 Meroitic Hieroglyphs not in Unicode
Mlym 347 Malayalam Malayalam 1.0 98
Mong 145 Mongolian Mongolian 3.0 153 Includes Clear, Manchu scripts
Moon 218 Moon (Moon code, Moon script, Moon type) not in Unicode
Mroo 199 Mro not in Unicode
Mtei 337 Meitei Mayek (Meithei, Meetei) Meetei_Mayek 5.2 56
Mymr 350 Myanmar (Burmese) Myanmar 3.0 188
Narb 106 Old North Arabian (Ancient North Arabian) not in Unicode
Nbat 159 Nabataean not in Unicode
Nkgb 420 Nakhi Geba ('Na-'Khi ²Ggo-¹baw, Naxi Geba) not in Unicode
Nkoo 165 N’Ko Nko 5.0 59
Nshu 499 Nüshu not in Unicode
Ogam 212 Ogham Ogham 3.0 29 Ancient/historic
Olck 261 Ol Chiki (Ol Cemet’, Ol, Santali) Ol_Chiki 5.1 48
Orkh 175 Old Turkic, Orkhon Runic Old_Turkic 5.2 73 Ancient/historic
Orya 327 Oriya Oriya 1.0 90
Osma 260 Osmanya Osmanya 4.0 40
Palm 126 Palmyrene not in Unicode
Perm 227 Old Permic not in Unicode
Phag 331 Phags-pa Phags_Pa 5.0 56 Ancient/historic
Phli 131 Inscriptional Pahlavi Inscriptional_Pahlavi 5.2 27 Ancient/historic
Phlp 132 Psalter Pahlavi not in Unicode
Phlv 133 Book Pahlavi not in Unicode
Phnx 115 Phoenician Phoenician 5.0 29 Ancient/historic
Plrd 282 Miao (Pollard) not in Unicode
Prti 130 Inscriptional Parthian Inscriptional_Parthian 5.2 30 Ancient/historic
Qaaa 900 Reserved for private use (start) not in Unicode
Qaai 908 (Private use) Inherited 523 In versions prior to 5.2 (from 5.2: 'Zinh')
Qabx 949 Reserved for private use (end) not in Unicode
Rjng 363 Rejang (Redjang, Kaganga) Rejang 5.1 37
Roro 620 Rongorongo not in Unicode
Runr 211 Runic Runic 3.0 78 Ancient/historic
Samr 123 Samaritan Samaritan 5.2 61
Sara 292 Sarati not in Unicode
Sarb 105 Old South Arabian Old_South_Arabian 5.2 32 Ancient/historic
Saur 344 Saurashtra Saurashtra 5.1 81
Sgnw 95 SignWriting not in Unicode
Shaw 281 Shavian (Shaw) Shavian 4.0 48
Shrd 319 Sharada not in Unicode
Sind 318 Sindhi, Khudawadi not in Unicode
Sinh 348 Sinhala Sinhala 3.0 80
Sora 398 Sora Sompeng not in Unicode
Sund 362 Sundanese Sundanese 5.1 55
Sylo 316 Syloti Nagri Syloti_Nagri 4.1 44
Syrc 135 Syriac Syriac 3.0 77
Syre 138 Syriac (Estrangelo variant) not in Unicode
Syrj 137 Syriac (Western variant) not in Unicode
Syrn 136 Syriac (Eastern variant) not in Unicode
Tagb 373 Tagbanwa Tagbanwa 3.2 18
Takr 321 Takri not in Unicode
Tale 353 Tai Le Tai_Le 4.0 35
Talu 354 New Tai Lue New_Tai_Lue 4.1 83
Taml 346 Tamil Tamil 1.0 72
Tang 520 Tangut not in Unicode
Tavt 359 Tai Viet Tai_Viet 5.2 72
Telu 340 Telugu Telugu 1.0 93
Teng 290 Tengwar not in Unicode
Tfng 120 Tifinagh (Berber) Tifinagh 4.1 57
Tglg 370 Tagalog (Baybayin, Alibata) Tagalog 3.2 20
Thaa 170 Thaana Thaana 3.0 50
Thai 352 Thai Thai 1.0 86
Tibt 330 Tibetan Tibetan 1.0 207 (removed in 1.1 and reintroduced in 2.0)
Ugar 40 Ugaritic Ugaritic 4.0 31 Ancient/historic
Vaii 470 Vai Vai 5.1 300
Visp 280 Visible Speech not in Unicode
Wara 262 Warang Citi (Varang Kshiti) not in Unicode
Wole 480 Woleai not in Unicode
Xpeo 30 Old Persian Old_Persian 4.1 50 Ancient/historic
Xsux 20 Cuneiform, Sumero-Akkadian Cuneiform 5.0 982 Ancient/historic
Yiii 460 Yi Yi 3.0 1,220
Zinh 994 Code for inherited script Inherited In version 5.2 (prior versions: 'Qaai')
Zmth 995 Mathematical notation not a 'script' in Unicode
Zsym 996 Symbols not a 'script' in Unicode
Zxxx 997 Code for unwritten documents not in Unicode
Zyyy 998 Code for undetermined script Common 6,379
Zzzz 999 Code for uncoded script Unknown all other code points
Notes
1. ISO 15924 publications (at Unicode.org site) As of 21 December 2010
2. ISO 15924 Normative text file (Alias names are informal)
3. ISO 15924 Changes (including Aliases for Unicode)
4. As of Unicode version 6.0
5. Unicode uses the Alias (Property Value Alias) as the script-name. These Alias names are part of Unicode and are published informatively next to ISO 15924
Template documentation

This documentation is shared between templates {{Unicode blocks}} and {{ISO 15924 script codes and related Unicode data}}.

Usage

The template can be used as usual. It is not a navigation box, so it can be placed anywhere in an article. The notes are contained within the template and will not appear in the article's main References section.

Note: when resolving red links or wrong links, edit {{ISO 15924/wp-article}}. That is where the connection between an ISO code and a Wikipedia article is made.

ISO 15924 templates

ISO 15924 – Unicode – Wikidata – enwiki: Overview templates & properties
Item In template /subs Content Example Publisher Usage TPU Note
Code (ISO) {{ISO 15924 code}} /subp ID Arab ISO 15924 Everywhere Alpha-4, enwiki central ISO script id list
Alias (Unicode) {{ISO 15924 alias}} /subp ID Arabic Unicode
Article (enwiki) {{ISO 15924/wp-article}} /subp ID ] enwiki
QID (wikidata) {{ISO 15924/qid}} /subp ID Q790681 Wikidata
Number; range 000–999 {{ISO 15924 number}} /subp ID 234 ISO 15924 rarely ISO number not used as ID in enwiki; see Code
Scripts (sub)merged into main scripts {{ISO 15924 alias/unicode-merged-into-script}} /subp Merged scripts Latf → Latn Unicode Script descriptions, re U+ In mainspace: 10× hardcoded (e.g.); 2× Qxxx depr
Name {{ISO 15924 name}} /subp data Deseret (Mormon) ISO 15924
Unicode chapter {{ISO 15924/unicode-chapter}} /subp data Ch 18.1 Unicode pdf does not open at .n subchapter
Script example character {{ISO 15924/script-example-character}} /subp data ع‎ enwiki User boxes
In Mainspace
Overview (ISO, U, enwiki) {{ISO 15924 script codes and related Unicode data}} /subp list enwiki ISO 15924 Mainspace: ISO 15924, Script (Unicode), Unicode character property
Blocks ⇄ Scripts {{Unicode blocks}} /subp list enwiki some script articles Mainspace; related
graphs {{ISO 15924/unicode-script-illustration}} /subp fonts&files Mainspace, Scripts in Unicode
Overviews
Overview: templates {{ISO 15924/overview-templates}} /subp list Wikipedia
WP-category {{ISO 15924/wp-category}} /subp data Category:Arabic script enwiki Not checked for mainspace watered down concept for minor scripts
Also (doc, userbox, technical, ...)
Documentation {{ISO 15924 templates/doc}} /subp prime documentation Latin script in Unicode (~) Reused in multiple templates
Redirect {{R from ISO 15924 code}} /subp template enwiki Redirects
userbox {{User iso15924}} /subp Userboxes
Related Changes {{Recent changes in Unicode}} /subp pages Unicode, ISO 15924 WP:RELC Related Changes enwiki WikiProject 900+700 P x T
Unicode versions {{Unicode version}} /subp Version number as of Unicode version 13.0 enwiki (new Sep2022)
Wikidata properties
Directionality script directionality (P1406) P1406 {{Infobox}}, ...
Unicode ranges Unicode range (P5949) P5949 {{Infobox}}, ...
ISO English name name (P2561) P2561 Crosscheck only
Modules
Data module module:Unicode data /subp § Functions overview
HTML named entities module:Numcr2namecr /subp
More templates


Template data

This is the TemplateData for this template used by TemplateWizard, VisualEditor and other tools. See a monthly parameter usage report for Template:ISO 15924 script codes and related Unicode data in articles based on its TemplateData.

TemplateData for ISO 15924 script codes and related Unicode data

No description.

Template parameters

Parameter Description Type Status
1 no description Unknown optional

Background: how this table is composed

Note that a script is not a language. A single script, such as the Latin alphabet, is used to write many languages. Unicode encodes scripts, not the languages that use them. Languages can still differ in how they use a script: Polish, for example, uses accented letters that English does not.

Step 1: ISO defines a script

ISO defines and publishes each script in the ISO 15924 list. For every accepted script the list gives the alpha-4 code (Aaaa-Zzzz), the numeric code (000-999), and the formal name. Currently some 160 scripts are defined in this list, including entries such as "Mathematical notation (Zmth)" and "Code for undetermined script (Zyyy, a.k.a. Common)". The list is formally maintained and published by ISO and, in practice, by the Unicode Consortium office, which publishes it on the Unicode website. Technically, the list is the file iso15924.txt.
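
The shape of the two codes described above can be checked mechanically. The following is a minimal sketch in Python, not part of ISO 15924 or of any Unicode tooling; the function names `is_alpha4` and `is_numeric_code` are made up for this illustration. It only checks the form of a code, not whether the code is actually registered in the list.

```python
import re

# Alpha-4 codes run from "Aaaa" to "Zzzz": one uppercase ASCII letter
# followed by three lowercase ASCII letters.
_ALPHA4 = re.compile(r"[A-Z][a-z]{3}")


def is_alpha4(code: str) -> bool:
    """Return True if `code` has the shape of an ISO 15924 alpha-4 code."""
    return _ALPHA4.fullmatch(code) is not None


def is_numeric_code(number: str) -> bool:
    """Return True if `number` is a three-digit string in the range 000-999."""
    return len(number) == 3 and number.isascii() and number.isdigit()
```

For example, `is_alpha4("Arab")` is true, while `is_alpha4("arab")` and `is_alpha4("Arabic")` are not.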

Step 2: Unicode attaches an Alias name

Unicode (not ISO) then maintains a list of Alias script names alongside the ISO-defined scripts, one for each script that Unicode has encoded. The Alias name is an English name for that script.

So each ISO alpha-4 code gets a unique Alias name from Unicode. For example, for Mymr the ISO name is "Myanmar (Burmese)" and the Alias is "Myanmar". These Alias names are also present in the definition file iso15924.txt.

Step 3: Usage by Unicode

From that list, Unicode can translate any alpha-4 code into the Alias name of the script, and vice versa. Unicode does not use the formal ISO name.
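
The two-way translation can be pictured as a pair of lookup tables. The sketch below is illustrative only: the four pairs are copied from the table at the top of this page, and the names `CODE_TO_ALIAS`, `ALIAS_TO_CODE`, `alias_for` and `code_for` are hypothetical, not part of any Unicode API.

```python
# A few (alpha-4 code, Alias) pairs copied from the table on this page.
CODE_TO_ALIAS = {
    "Arab": "Arabic",
    "Hebr": "Hebrew",
    "Mymr": "Myanmar",
    "Cans": "Canadian_Aboriginal",
}

# Within this sample every alias occurs once, so the mapping can be inverted.
ALIAS_TO_CODE = {alias: code for code, alias in CODE_TO_ALIAS.items()}


def alias_for(code: str) -> str:
    """Translate an alpha-4 code into its Alias name (KeyError if unknown)."""
    return CODE_TO_ALIAS[code]


def code_for(alias: str) -> str:
    """Translate an Alias name back into its alpha-4 code (KeyError if unknown)."""
    return ALIAS_TO_CODE[alias]
```

With this sample, `alias_for("Mymr")` gives `"Myanmar"` and `code_for("Hebrew")` gives `"Hebr"`. A complete table would need care with the historical pair Qaai and Zinh, which the table lists with the same alias, Inherited.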

A script name is used in the Unicode Name of a character: "U+05BF ֿ HEBREW POINT RAFE".
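
As a small illustration, Python's standard `unicodedata` module exposes the Unicode Name of a character. The helper name `character_name` is made up for this sketch. Note that the name is only a label: the standard library does not expose the Script property itself, so this does not replace a lookup in Scripts.txt.

```python
import unicodedata


def character_name(char: str) -> str:
    """Return the Unicode Name of a single character."""
    return unicodedata.name(char)


# U+05BF is the character quoted in the text above.
print(character_name("\u05BF"))  # HEBREW POINT RAFE
```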

Per character

In the Unicode database, Unicode assigns exactly one script value to every individual character. So every letter, punctuation mark, digit and so on of a script gets that script's code. Characters used by multiple scripts, such as the period (.), have script code "Zyyy" (Common). The codes for Mathematical notation (Zmth) and Symbols (Zsym) are not used by Unicode as script values; symbols and mathematical characters generally have the script property "Common", while "Unknown" (Zzzz) is reserved for unassigned, private-use, noncharacter, and surrogate code points.

Then, in the file Scripts.txt, Unicode publishes the Alias script name for each character or range of characters. Part of that file looks like this:

...
0591..05BD    ; Hebrew # Mn   HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
05BE          ; Hebrew # Pd       HEBREW PUNCTUATION MAQAF
05BF          ; Hebrew # Mn       HEBREW POINT RAFE
05C0          ; Hebrew # Po       HEBREW PUNCTUATION PASEQ
05C1..05C2    ; Hebrew # Mn    HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
05C3          ; Hebrew # Po       HEBREW PUNCTUATION SOF PASUQ
...

This data file defines which scripts are present in Unicode, and which script each code point belongs to.
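
Lines in this layout are straightforward to read programmatically. The following is a rough sketch that parses the six data lines quoted above and looks up the script of a code point. The names `parse_scripts` and `script_of` are hypothetical, and the parser assumes only the layout visible in the excerpt: a code point or `first..last` range, a semicolon, the Alias name, and a `#` comment.

```python
from typing import List, Optional, Tuple

# The six data lines quoted above, used here as sample input.
SAMPLE = """\
0591..05BD    ; Hebrew # Mn   HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
05BE          ; Hebrew # Pd       HEBREW PUNCTUATION MAQAF
05BF          ; Hebrew # Mn       HEBREW POINT RAFE
05C0          ; Hebrew # Po       HEBREW PUNCTUATION PASEQ
05C1..05C2    ; Hebrew # Mn    HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
05C3          ; Hebrew # Po       HEBREW PUNCTUATION SOF PASUQ
"""

ScriptRange = Tuple[int, int, str]  # (first code point, last code point, alias)


def parse_scripts(text: str) -> List[ScriptRange]:
    """Parse Scripts.txt-style lines into (first, last, alias) tuples."""
    ranges = []
    for line in text.splitlines():
        data = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not data:
            continue  # blank or comment-only line
        points, alias = (field.strip() for field in data.split(";"))
        first, _, last = points.partition("..")
        # A single code point has no "..", so it is its own range end.
        ranges.append((int(first, 16), int(last or first, 16), alias))
    return ranges


def script_of(code_point: int, ranges: List[ScriptRange]) -> Optional[str]:
    """Return the alias covering `code_point`, or None if the sample has none."""
    for first, last, alias in ranges:
        if first <= code_point <= last:
            return alias
    return None
```

With the sample above, `script_of(0x05BF, parse_scripts(SAMPLE))` returns `"Hebrew"`. Returning `None` for uncovered code points is a choice made for this sketch; a reader of the full file would more likely report "Unknown", as the table does for all other code points.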

In a block

Given a block (a range of code points), which scripts are present in it? See {{Unicode blocks}}: that table is constructed by listing, once per block, every script that occurs in the block.

There is no fixed relation between a script name and a block name. Some scripts are contained in a single block, but others are spread across several blocks.
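
The block question amounts to an overlap test between the block's range and the per-script ranges. The sketch below is an illustration under stated assumptions: `scripts_in_block` and `SAMPLE_RANGES` are made-up names, the Hebrew range is the six excerpt lines from Scripts.txt merged into one, and the Basic Latin ranges (U+0000..U+007F) are hand-merged for this example rather than copied verbatim from the data file.

```python
from typing import List, Tuple

ScriptRange = Tuple[int, int, str]  # (first code point, last code point, alias)

# Hand-merged sample ranges (an assumption of this sketch, see the text above).
SAMPLE_RANGES: List[ScriptRange] = [
    (0x0000, 0x0040, "Common"),
    (0x0041, 0x005A, "Latin"),   # A-Z
    (0x005B, 0x0060, "Common"),
    (0x0061, 0x007A, "Latin"),   # a-z
    (0x007B, 0x007F, "Common"),
    (0x0591, 0x05C3, "Hebrew"),  # the excerpt lines, merged
]


def scripts_in_block(
    block_first: int, block_last: int, ranges: List[ScriptRange]
) -> List[str]:
    """List each script that overlaps the block, once, in order of appearance."""
    found: List[str] = []
    for first, last, alias in ranges:
        overlaps = first <= block_last and last >= block_first
        if overlaps and alias not in found:
            found.append(alias)
    return found
```

For the Basic Latin block, `scripts_in_block(0x0000, 0x007F, SAMPLE_RANGES)` yields both Common and Latin, which shows that one block can hold more than one script. For the Hebrew block (U+0590..U+05FF) the sample yields only Hebrew.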

The above documentation is transcluded from Template:ISO 15924 script codes and related Unicode data/doc.