Template:ISO 15924 script codes and related Unicode data: Difference between revisions


Revision as of 15:44, 8 September 2010 by DePiep (minor edit, no edit summary)		Revision as of 20:36, 30 September 2010 by DePiep (edit summary: integrating notes into the table)
Line 310:		Line 310:
	\|-		\|-
	\| Zzzz \|\| 999 \|\| Code for ] \|\| Unknown \|\| \|\| \|\| all other code points		\| Zzzz \|\| 999 \|\| Code for ] \|\| Unknown \|\| \|\| \|\| all other code points
			\|- class="sortbottom"
	\|}<div style="border: 1px solid grey;">
			\| colspan="7" \| '''Notes'''
	;Notes
	:1.{{note\|ISO_Unicode}}		:1.{{note\|ISO_Unicode}}</br>
	:2.{{note\|ISO_list}} (Alias names are informal)		:2.{{note\|ISO_list}} (Alias names are informal)</br>
	:3.{{note\|ISO_changes}} (including Aliases for Unicode)] {{As of\|2010\|07\|23}}		:3.{{note\|ISO_changes}} (including Aliases for Unicode)] {{As of\|2010\|07\|23}}</br>
	:4.{{note\|Asof_Unicode_version}}As of Unicode version 5.2		:4.{{note\|Asof_Unicode_version}}As of Unicode version 5.2</br>
	:5.{{note\|Aliases_for_Unicode}}Unicode uses the Alias (Property Value Alias) as the script-name. These Alias names are part of Unicode and are published informatively next to ISO 15924		:5.{{note\|Aliases_for_Unicode}}Unicode uses the Alias (Property Value Alias) as the script-name. These Alias names are part of Unicode and are published informatively next to ISO 15924</br>
	~~</div>~~<noinclude>		\|}<noinclude>
	{{Documentation}}		{{Documentation}}
	</noinclude>		</noinclude>

Revision as of 20:36, 30 September 2010

**ISO 15924 Scripts** **and Unicode**

**ISO 15924** **Scripts in Unicode**
Code	Nr	Name	Alias	Version	Characters	Remark
Arab	160	Arabic	Arabic	1.0	1030
Armi	124	Imperial Aramaic	Imperial_Aramaic	5.2	31	Ancient/historic
Armn	230	Armenian	Armenian	1.0	90
Avst	134	Avestan	Avestan	5.2	61	Ancient/historic
Bali	360	Balinese	Balinese	5.0	121
Bamu	435	Bamum	Bamum	5.2	88
Bass	259	Bassa Vah				not in Unicode
Batk	365	Batak	Batak			not in Unicode
Beng	325	Bengali	Bengali	1.0	92
Blis	550	Blissymbols				not in Unicode
Bopo	285	Bopomofo	Bopomofo	1.0	65
Brah	300	Brahmi	Brahmi			not in Unicode
Brai	570	Braille	Braille	3.0	256
Bugi	367	Buginese	Buginese	4.1	30
Buhd	372	Buhid	Buhid	3.2	20
Cakm	349	Chakma				not in Unicode
Cans	440	Unified Canadian Aboriginal Syllabics	Canadian_Aboriginal	3.0	710
Cari	201	Carian	Carian	5.1	49	Ancient/historic
Cham	358	Cham	Cham	5.1	83
Cher	445	Cherokee	Cherokee	3.0	85
Cirt	291	Cirth				not in Unicode
Copt	204	Coptic	Coptic	1.0	135	(disunified from Greek in 4.1) Ancient/historic
Cprt	403	Cypriot	Cypriot	4.0	55	Ancient/historic
Cyrl	220	Cyrillic	Cyrillic	1.0	404
Cyrs	221	Cyrillic (Old Church Slavonic variant)				not in Unicode
Deva	315	Devanagari (Nagari)	Devanagari	1.0	140
Dsrt	250	Deseret (Mormon)	Deseret	3.1	80
Dupl	755	Duployan shorthand				not in Unicode
Egyd	70	Egyptian demotic				not in Unicode
Egyh	60	Egyptian hieratic				not in Unicode
Egyp	50	Egyptian hieroglyphs	Egyptian_Hieroglyphs	5.2	1071	Ancient/historic
Elba	226	Elbasan				not in Unicode
Ethi	430	Ethiopic (Ge'ez)	Ethiopic	3.0	461
Geok	241	Khutsuri (Asomtavruli and Nuskhuri)				not in Unicode
Geor	240	Georgian (Mkhedruli)	Georgian	1.0	120
Glag	225	Glagolitic	Glagolitic	4.1	94	Ancient/historic
Goth	206	Gothic	Gothic	3.1	27	Ancient/historic
Gran	343	Grantha				not in Unicode
Grek	200	Greek	Greek	1.0	511
Gujr	320	Gujarati	Gujarati	1.0	83
Guru	310	Gurmukhi	Gurmukhi	1.0	79
Hang	286	Hangul (Hangul, Hangeul)	Hangul	1.0	11,737	(Hangul syllables relocated in 2.0)
Hani	500	Han (Hanzi, Kanji, Hanja)	Han	1.0	75,738
Hano	371	Hanunoo (Hanunóo)	Hanunoo	3.2	21
Hans	501	Han (Simplified variant)				not in Unicode
Hant	502	Han (Traditional variant)				not in Unicode
Hebr	125	Hebrew	Hebrew	1.0	133
Hira	410	Hiragana	Hiragana	1.0	90
Hmng	450	Pahawh Hmong				not in Unicode
Hrkt	412	(alias for Hiragana + Katakana)	Katakana_Or_Hiragana			not in Unicode
Hung	176	Old Hungarian				not in Unicode
Inds	610	Indus (Harappan)				not in Unicode
Ital	210	Old Italic (Etruscan, Oscan, etc.)	Old_Italic	3.1	35	Ancient/historic
Java	361	Javanese	Javanese	5.2	91
Jpan	413	Japanese (alias for Han + Hiragana + Katakana)				not in Unicode
Kali	357	Kayah Li	Kayah_Li	5.1	48
Kana	411	Katakana	Katakana	1.0	299
Khar	305	Kharoshthi	Kharoshthi	4.1	65	Ancient/historic
Khmr	355	Khmer	Khmer	3.0	146
Knda	345	Kannada	Kannada	1.0	84
Kore	287	Korean (alias for Hangul + Han)				not in Unicode
Kpel	436	Kpelle				not in Unicode
Kthi	317	Kaithi	Kaithi	5.2	66	Ancient/historic
Lana	351	Tai Tham (Lanna)	Tai_Tham	5.2	127
Laoo	356	Lao	Lao	1.0	65
Latf	217	Latin (Fraktur variant)				not in Unicode
Latg	216	Latin (Gaelic variant)				not in Unicode
Latn	215	Latin	Latin	1.0	1244
Lepc	335	Lepcha (Róng)	Lepcha	5.1	74
Limb	336	Limbu	Limbu	4.0	66
Lina	400	Linear A				not in Unicode
Linb	401	Linear B	Linear_B	4.0	211	Ancient/historic
Lisu	399	Lisu (Fraser)	Lisu	5.2	48
Loma	437	Loma				not in Unicode
Lyci	202	Lycian	Lycian	5.1	29	Ancient/historic
Lydi	116	Lydian	Lydian	5.1	27	Ancient/historic
Mand	140	Mandaic, Mandaean	Mandaic			not in Unicode
Mani	139	Manichaean				not in Unicode
Maya	90	Mayan hieroglyphs				not in Unicode
Mend	438	Mende				not in Unicode
Merc	101	Meroitic Cursive				not in Unicode
Mero	100	Meroitic Hieroglyphs				not in Unicode
Mlym	347	Malayalam	Malayalam	1.0	95
Mong	145	Mongolian	Mongolian	3.0	153	Includes Clear, Manchu scripts
Moon	218	Moon (Moon code, Moon script, Moon type)				not in Unicode
Mtei	337	Meitei Mayek (Meithei, Meetei)	Meetei_Mayek	5.2	56
Mymr	350	Myanmar (Burmese)	Myanmar	3.0	188
Narb	106	Old North Arabian (Ancient North Arabian)				not in Unicode
Nbat	159	Nabataean				not in Unicode
Nkgb	420	Nakhi Geba ('Na-'Khi ²Ggo-¹baw, Naxi Geba)				not in Unicode
Nkoo	165	N’Ko	Nko	5.0	59
Ogam	212	Ogham	Ogham	3.0	29	Ancient/historic
Olck	261	Ol Chiki (Ol Cemet’, Ol, Santali)	Ol_Chiki	5.1	48
Orkh	175	Old Turkic, Orkhon Runic	Old_Turkic	5.2	73	Ancient/historic
Orya	327	Oriya	Oriya	1.0	84
Osma	260	Osmanya	Osmanya	4.0	40
Palm	126	Palmyrene				not in Unicode
Perm	227	Old Permic				not in Unicode
Phag	331	Phags-pa	Phags_Pa	5.0	56	Ancient/historic
Phli	131	Inscriptional Pahlavi	Inscriptional_Pahlavi	5.2	27	Ancient/historic
Phlp	132	Psalter Pahlavi				not in Unicode
Phlv	133	Book Pahlavi				not in Unicode
Phnx	115	Phoenician	Phoenician	5.0	29	Ancient/historic
Plrd	282	Miao (Pollard)				not in Unicode
Prti	130	Inscriptional Parthian	Inscriptional_Parthian	5.2	30	Ancient/historic
Qaaa	900	Reserved for private use (start)				not in Unicode
Qaai	908	(Private use)	Inherited		523	In versions prior to 5.2 (from 5.2: 'Zinh')
Qabx	949	Reserved for private use (end)				not in Unicode
Rjng	363	Rejang (Redjang, Kaganga)	Rejang	5.1	37
Roro	620	Rongorongo				not in Unicode
Runr	211	Runic	Runic	3.0	78	Ancient/historic
Samr	123	Samaritan	Samaritan	5.2	61
Sara	292	Sarati				not in Unicode
Sarb	105	Old South Arabian	Old_South_Arabian	5.2	32	Ancient/historic
Saur	344	Saurashtra	Saurashtra	5.1	81
Sgnw	95	SignWriting				not in Unicode
Shaw	281	Shavian (Shaw)	Shavian	4.0	48
Sind	318	Sindhi				not in Unicode
Sinh	348	Sinhala	Sinhala	3.0	80
Sund	362	Sundanese	Sundanese	5.1	55
Sylo	316	Syloti Nagri	Syloti_Nagri	4.1	44
Syrc	135	Syriac	Syriac	3.0	77
Syre	138	Syriac (Estrangelo variant)				not in Unicode
Syrj	137	Syriac (Western variant)				not in Unicode
Syrn	136	Syriac (Eastern variant)				not in Unicode
Tagb	373	Tagbanwa	Tagbanwa	3.2	18
Tale	353	Tai Le	Tai_Le	4.0	35
Talu	354	New Tai Lue	New_Tai_Lue	4.1	83
Taml	346	Tamil	Tamil	1.0	72
Tavt	359	Tai Viet	Tai_Viet	5.2	72
Telu	340	Telugu	Telugu	1.0	93
Teng	290	Tengwar				not in Unicode
Tfng	120	Tifinagh (Berber)	Tifinagh	4.1	55
Tglg	370	Tagalog (Baybayin, Alibata)	Tagalog	3.2	20
Thaa	170	Thaana	Thaana	3.0	50
Thai	352	Thai	Thai	1.0	86
Tibt	330	Tibetan	Tibetan	1.0	201	(removed in 1.1 and reintroduced in 2.0)
Ugar	40	Ugaritic	Ugaritic	4.0	31	Ancient/historic
Vaii	470	Vai	Vai	5.1	300
Visp	280	Visible Speech				not in Unicode
Wara	262	Warang Citi (Varang Kshiti)				not in Unicode
Xpeo	30	Old Persian	Old_Persian	4.1	50	Ancient/historic
Xsux	20	Cuneiform, Sumero-Akkadian	Cuneiform	5.0	982	Ancient/historic
Yiii	460	Yi	Yi	3.0	1220
Zinh	994	Code for inherited script	Inherited			In version 5.2 (prior versions: 'Qaai')
Zmth	995	Mathematical notation				not a 'script' in Unicode
Zsym	996	Symbols				not a 'script' in Unicode
Zxxx	997	Code for unwritten documents				not in Unicode
Zyyy	998	Code for undetermined script	Common		5395
Zzzz	999	Code for uncoded script	Unknown			all other code points
Notes
1. ISO 15924 publications (at Unicode.org site)
2. ISO 15924 Normative text file (Alias names are informal)
3. ISO 15924 Changes (including Aliases for Unicode), as of 23 July 2010
4. As of Unicode version 5.2
5. Unicode uses the Alias (Property Value Alias) as the script name. These Alias names are part of Unicode and are published informatively next to ISO 15924

Template documentation

This documentation is shared between templates {{Unicode blocks}} and {{ISO 15924 script codes and related Unicode data}}.

Usage

The template can be used like any other template. It is not a navigation box, so it can be placed anywhere in an article. The notes are contained within the template and will not appear in the article's main References section.

Note: when resolving red links or wrong links, edit {{ISO 15924/wp-article}}. That is where the connection between an ISO code and a Wikipedia article is made.

ISO 15924 templates

ISO 15924 – Unicode – Wikidata – enwiki: Overview templates & properties

Item

In template

/subs

Content

Example

Publisher

Usage

TPU

Note

Code (ISO)

/subp

Arab

ISO 15924

Everywhere

Alpha-4, enwiki central ISO script id list

Alias (Unicode)

/subp

Arabic

Unicode

Article (enwiki)

/subp

]

enwiki

QID (wikidata)

/subp

Q790681

Wikidata

Number; range 000–999

/subp

234

ISO 15924

rarely

ISO number not used as ID in enwiki; see Code

Scripts (sub)merged into main scripts

/subp

Merged scripts

Latf → Latn

Unicode

Script descriptions, re U+

In mainspace: 10× hardcoded (e.g.); 2× Qxxx depr

Name

/subp

data

Deseret (Mormon)

ISO 15924

Unicode chapter

/subp

data

Ch 18.1

Unicode

pdf does not open at .n subchapter

Script example
character

/subp

data

ع‎

enwiki

User boxes

In Mainspace

Overview

ISO
U
enwiki

/subp

list

enwiki

ISO 15924

Mainspace: ISO 15924, Script (Unicode), Unicode character property

Blocks ⇄ Scripts

/subp

list

enwiki

some script articles

Mainspace; related

graphs

/subp

fonts&files

Mainspace, Scripts in Unicode

Overviews

Overview: templates

/subp

list

Wikipedia

WP-category

/subp

data

Category:Arabic script

enwiki

Not checked for mainspace

watered down concept for minor scripts

Also (doc, userbox, technical, ...)

Documentation

/subp

prime documentation

Latin script in Unicode (~)

Reused in multiple templates

Redirect

/subp

template

enwiki

Redirects

userbox

/subp

Userboxes

Related Changes

/subp

pages Unicode, ISO 15924 WP:RELC

Related Changes

enwiki

WikiProject

900+700 P x T

Unicode versions

/subp

Version number

as of Unicode version 13.0

enwiki

(new Sep2022)

Wikidata properties

Directionality

script directionality (P1406)

P1406

{{Infobox}}, ...

Unicode ranges

Unicode range (P5949)

P5949

{{Infobox}}, ...

ISO English name

name (P2561)

P2561

Crosscheck only

Modules

Data module

module:Unicode data

/subp

§ Functions overview

HTML named entities

module:Numcr2namecr

/subp

More templates

All subpages of {{ISO_15924}}

{{lang}} connection (§ Indicating writing script)

Template data

This is the TemplateData for this template used by TemplateWizard, VisualEditor and other tools. See a monthly parameter usage report for Template:ISO 15924 script codes and related Unicode data in articles based on its TemplateData.

TemplateData for ISO 15924 script codes and related Unicode data

No description.

Template parameters
Parameter		Description	Type	Status
1	`1`	no description	Unknown	optional

Background: How is this table composed

Note that a script is not a language. A single script, such as the Latin alphabet, is used to write many languages. Unicode encodes scripts, not the languages that use them. Languages can still differ in how they use a script; for example, Polish uses accented Latin letters that English does not.

Step 1: ISO defines a script

ISO defines and publishes each script in the ISO 15924 list. For every accepted script, the list gives the Alpha-4 code (Aaaa-Zzzz), the numeric code (000-999), and the formal name. The list currently defines some 160 scripts, including entries such as "Mathematical notation (Zmth)" and "Code for undetermined script (a.k.a. Common, Zyyy)". Formally the list is maintained and published by ISO; in practice this is done by the Unicode Consortium office, and the list is published on the Unicode website. Technically, the list is the file iso15924.txt.
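The two identifier shapes described above can be checked mechanically. The following is a minimal sketch; the function names are hypothetical, and it assumes the numeric code is written in its zero-padded three-digit form (the table on this page shows the numbers without leading zeros).

```python
import re

# Alpha-4: one capital letter followed by three lowercase letters (Aaaa-Zzzz).
ALPHA4 = re.compile(r"[A-Z][a-z]{3}")
# Numeric: exactly three ASCII digits (000-999); zero-padding is assumed here.
NUMERIC = re.compile(r"[0-9]{3}")


def is_alpha4(code: str) -> bool:
    """Return True if `code` has the shape of an ISO 15924 Alpha-4 code."""
    return ALPHA4.fullmatch(code) is not None


def is_numeric_code(number: str) -> bool:
    """Return True if `number` has the shape of an ISO 15924 numeric code."""
    return NUMERIC.fullmatch(number) is not None
```

This only checks the shape of a code, not whether ISO has actually assigned it.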

Step 2: Unicode attaches an Alias name

Then, Unicode (not ISO) maintains a list of Alias script names right next to the ISO-defined scripts, for each script Unicode has encoded. The Alias name is an English name for that script.

So each ISO alpha-4 code that Unicode has encoded gets a unique Alias name from Unicode. For example, for Mymr the ISO name is "Myanmar (Burmese)" and the Alias is "Myanmar". These Alias names are also present in the definition file iso15924.txt.
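As an illustration of how a code, number, ISO name and Alias belong together, here is a small sketch that reads semicolon-separated records. The four-field layout `code;number;name;alias` is a simplified, hypothetical one chosen for this example and is not claimed to match the real iso15924.txt; the sample values are taken from the table on this page.

```python
from typing import List, NamedTuple, Optional


class ScriptRecord(NamedTuple):
    code: str             # ISO 15924 Alpha-4 code
    number: int           # ISO 15924 numeric code
    name: str             # formal ISO name
    alias: Optional[str]  # Unicode Alias, or None if the script has none


# Simplified, hypothetical layout: code;number;name;alias
SAMPLE = """\
# code;number;name;alias
Mymr;350;Myanmar (Burmese);Myanmar
Dsrt;250;Deseret (Mormon);Deseret
Blis;550;Blissymbols;
"""


def parse_records(text: str) -> List[ScriptRecord]:
    """Parse the simplified records, skipping blank lines and comments."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        code, number, name, alias = line.split(";")
        # An empty alias field means Unicode has not attached an Alias.
        records.append(ScriptRecord(code, int(number), name, alias or None))
    return records


records = parse_records(SAMPLE)
```

Here `records[0]` holds the Mymr entry with Alias "Myanmar", while the Blissymbols entry has no Alias because the script is not in Unicode.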

Step 3: Usage by Unicode

From that list, Unicode can translate any alpha-4 code into the Alias name of the script, and vice versa. Unicode does not use the formal ISO name.
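Because each Alias is unique, the translation works in both directions. A minimal sketch with a few pairs copied from the table on this page (the function names are hypothetical):

```python
# A handful of code/Alias pairs from the table; not the complete list.
CODE_TO_ALIAS = {
    "Arab": "Arabic",
    "Hebr": "Hebrew",
    "Mymr": "Myanmar",
    "Zyyy": "Common",
}

# The reverse mapping is well defined because every Alias occurs only once.
ALIAS_TO_CODE = {alias: code for code, alias in CODE_TO_ALIAS.items()}


def alias_for(code: str) -> str:
    """Translate an alpha-4 code into its Unicode Alias."""
    return CODE_TO_ALIAS[code]


def code_for(alias: str) -> str:
    """Translate a Unicode Alias back into its alpha-4 code."""
    return ALIAS_TO_CODE[alias]
```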

A script name is used in the Unicode Name of a character: "U+05BF ֿ HEBREW POINT RAFE".
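The character name can be looked up with Python's standard `unicodedata` module. The sketch below also takes the first word of the name; this is only a rough heuristic for illustration (the helper `first_name_word` is hypothetical) and is not the Script property, since many names do not start with a script name.

```python
import unicodedata


def first_name_word(char: str) -> str:
    """Return the first word of the character's Unicode name.

    Heuristic only: names such as "DIGIT ONE" do not start with a script.
    """
    return unicodedata.name(char).split()[0]


name = unicodedata.name("\u05BF")  # "HEBREW POINT RAFE"
```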

Per character

In the Unicode database, Unicode assigns exactly one script to every individual character, identified by its alpha-4 code. So every letter, punctuation mark, digit and so on of a script gets that code. Characters used by multiple scripts, such as the period (.), have script code "Zyyy" (Common). The ISO codes for Mathematical notation (Zmth) and Symbols (Zsym) are not used by Unicode; symbols and mathematical characters generally have the script property "Common" (Zyyy), while "Unknown" (Zzzz) is the value for code points not listed with any other script.

Then, in the file Scripts.txt, Unicode publishes the Alias script name per character (possibly by a range of characters). A part of that file looks like:

...
0591..05BD    ; Hebrew # Mn   HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
05BE          ; Hebrew # Pd       HEBREW PUNCTUATION MAQAF
05BF          ; Hebrew # Mn       HEBREW POINT RAFE
05C0          ; Hebrew # Po       HEBREW PUNCTUATION PASEQ
05C1..05C2    ; Hebrew # Mn    HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
05C3          ; Hebrew # Po       HEBREW PUNCTUATION SOF PASUQ
...

This datafile defines which scripts are present in Unicode, and what script is at a certain code point.
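The lines of the excerpt above can be read into (start, end, Alias) ranges and then queried per code point. This is a minimal sketch under the assumption that every data line has the shape shown in the excerpt; the function names are hypothetical, and it returns `None` for code points outside the excerpt rather than guessing a script.

```python
import re
from typing import List, Optional, Tuple

EXCERPT = """\
0591..05BD    ; Hebrew # Mn   HEBREW ACCENT ETNAHTA..HEBREW POINT METEG
05BE          ; Hebrew # Pd       HEBREW PUNCTUATION MAQAF
05BF          ; Hebrew # Mn       HEBREW POINT RAFE
05C0          ; Hebrew # Po       HEBREW PUNCTUATION PASEQ
05C1..05C2    ; Hebrew # Mn    HEBREW POINT SHIN DOT..HEBREW POINT SIN DOT
05C3          ; Hebrew # Po       HEBREW PUNCTUATION SOF PASUQ
"""

# A code point or a "first..last" range, then ";", then the Alias name.
LINE = re.compile(r"^([0-9A-F]+)(?:\.\.([0-9A-F]+))?\s*;\s*(\w+)")

Range = Tuple[int, int, str]


def parse_scripts(text: str) -> List[Range]:
    """Turn data lines into (start, end, alias) tuples; skip other lines."""
    ranges = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match is None:
            continue
        start = int(match.group(1), 16)
        end = int(match.group(2), 16) if match.group(2) else start
        ranges.append((start, end, match.group(3)))
    return ranges


def script_of(code_point: int, ranges: List[Range]) -> Optional[str]:
    """Return the Alias covering `code_point`, or None if no range matches."""
    for start, end, alias in ranges:
        if start <= code_point <= end:
            return alias
    return None


RANGES = parse_scripts(EXCERPT)
```

With this excerpt, `script_of(0x05BF, RANGES)` gives "Hebrew", matching the HEBREW POINT RAFE line.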

In a block

Given a block (a range of code points), which scripts are present in it? See {{Unicode blocks}}: that table is constructed by listing every script that is present in a block, once per block.

There is no fixed relation between a script name and a block name. Some scripts sit in a single block, while others are spread across several blocks.
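The question "which scripts are present in this block?" amounts to a range-overlap check. The sketch below uses a small hand-written set of script ranges (the Basic Latin area plus the first range of the Hebrew excerpt) instead of the full Scripts.txt; the function name is hypothetical.

```python
from typing import List, Tuple

# (start, end, alias) ranges; a small hand-picked subset for illustration.
SCRIPT_RANGES: List[Tuple[int, int, str]] = [
    (0x0000, 0x0040, "Common"),
    (0x0041, 0x005A, "Latin"),
    (0x005B, 0x0060, "Common"),
    (0x0061, 0x007A, "Latin"),
    (0x007B, 0x007F, "Common"),
    (0x0591, 0x05BD, "Hebrew"),
]


def scripts_in_block(block_start: int, block_end: int,
                     ranges: List[Tuple[int, int, str]]) -> List[str]:
    """List every script with at least one code point in the block, once."""
    found: List[str] = []
    for start, end, alias in ranges:
        overlaps = start <= block_end and end >= block_start
        if overlaps and alias not in found:
            found.append(alias)
    return found


basic_latin = scripts_in_block(0x0000, 0x007F, SCRIPT_RANGES)  # Basic Latin block
hebrew = scripts_in_block(0x0590, 0x05FF, SCRIPT_RANGES)       # Hebrew block
```

The Basic Latin block yields both "Common" and "Latin", which shows that a block name does not determine a single script.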

ISO 15924 templates
General	ISO 15924 Unicode `{{Infobox writing system}}`
ISO-defined	`{{ISO 15924 code}}` `{{ISO 15924 name}}` `{{ISO 15924 number}}`
Unicode	{{Unicode Alias script names}} ({{Unicode merged into-scripts}}) `{{ISO 15924 script codes and related Unicode data}}`
Wikipedia (enwiki)	`{{ISO 15924/wp-article}}` (label) `{{ISO 15924/script-example-character}}` `{{ISO 15924/wp-category}}`
Wikidata	`{{ISO 15924/qid}}`
Userboxes	Userboxes/Writing systems `{{User iso15924}}` `{{User iso15924/category-intro}}`
Technical	`{{R from ISO 15924 code}}`
Category:ISO 15924 templates All templates starting with ISO 15924 All templates starting with User iso15924 (Userbox)

Unicode templates
General	{{Unicode navigation}}
Inline	{{Unichar}} {{U+}} {{GB18030}} {{#invoke:Unicode convert}}
Character properties	Bidi Class General Category Punctuation marks in Unicode Hexadecimal digit Numeric Type Whitespace Alias names and abbreviations
Code points	Planes Unicode blocks Private Use Area
Scripts	{{ISO 15924 script codes and related Unicode data}}
CJK-specific	{{CJK ideographs in Unicode}} {{CJKV}} {{Unihan}} {{Hani}} {{Lang-zh}} {{Nihongo}} {{Korean}} {{Vi-nom}}
Wikipedia related	{{Contains special characters}} (with 'Uncommon Unicode' or more specifically) {{PUA}} (MOS:PUA)
Unicode blocks Unicode charts Unicode templates
