Optical character recognition

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

This is an old revision of this page, as edited by 199.72.115.66 (talk) at 18:03, 25 October 2007. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

This article contains special characters. Without proper rendering support, you may see question marks, boxes, or other symbols.

Optical character recognition, usually abbreviated to OCR, is the mechanical or electronic translation of images of handwritten or typewritten text (usually captured by a scanner) into machine-editable text.

OCR is aRecognition]].

Recognition of cursive text is an active area of research, with recognition rates even lower than those for hand-printed text. Higher recognition rates for general cursive script will likely not be possible without the use of contextual or grammatical information. For example, recognizing entire [...] the mid-1970s at MIT and other institutions. Successive efforts were made to localize and remove musical staff lines, leaving symbols to be recognized and parsed. The first proprietary music-scanning program, MIDISCAN, was released in 1991. Three proprietary products are currently available. At this time, OCR software does not recognize handwritten scores.
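Returning to the point above that cursive recognition is unlikely to improve much without contextual information, one simple form of context is a word list against which a recognizer's raw output is checked. The following is a minimal sketch under stated assumptions, not a method described in this article: the `LEXICON` contents, the `correct_with_lexicon` helper, and the `cutoff` value are all hypothetical, and the matching relies on Python's standard `difflib` module.

```python
import difflib

# Hypothetical word list; a real system would use a far larger lexicon.
LEXICON = ["recognition", "character", "cursive", "optical", "scanner"]


def correct_with_lexicon(raw_word, lexicon=LEXICON, cutoff=0.6):
    """Return the lexicon word most similar to raw_word.

    If no lexicon word reaches the similarity cutoff, the raw recognizer
    output is returned unchanged.
    """
    matches = difflib.get_close_matches(
        raw_word.lower(), lexicon, n=1, cutoff=cutoff
    )
    return matches[0] if matches else raw_word


if __name__ == "__main__":
    # "q" misread for "g", and "u" misread for "v".
    for raw in ("recoqnition", "cursiue", "xyz"):
        print(raw, "->", correct_with_lexicon(raw))
```

A word-level lookup like this can only repair errors that leave the word close to a known entry; it says nothing about grammatical context, which would need a language model of some kind.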

Magnetic ink character recognition

Magnetic ink character recognition is one area where the accuracy and speed of computer input of character information exceed those of humans, with error rates of roughly one read error for every 20,000 to 30,000 checks.
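To put the quoted figures in more familiar terms, the short sketch below converts "one read error per N checks" into a percentage. The function name `error_rate_percent` is introduced here purely for illustration.

```python
def error_rate_percent(checks_per_error):
    """Convert "one read error per N checks" into a percentage of checks."""
    return 100.0 / checks_per_error


if __name__ == "__main__":
    for checks in (20_000, 30_000):
        rate = error_rate_percent(checks)
        print(f"1 error per {checks:,} checks = {rate:.4f}%")
    # 1 error per 20,000 checks = 0.0050%
    # 1 error per 30,000 checks = 0.0033%
```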

Optical Character Recognition in Unicode

In Unicode, Optical Character Recognition symbol characters are placed in the hexadecimal range 0x2440–0x245F, as shown below (see also Unicode Symbols):

Hex     Symbol  Name
0x2440  ⑀       OCR Hook
0x2441  ⑁       OCR Chair
0x2442  ⑂       OCR Fork
0x2443  ⑃       OCR Inverted Fork
0x2444  ⑄       OCR Belt Buckle
0x2445  ⑅       OCR Bow Tie
0x2446  ⑆       OCR Branch Bank Identification
0x2447  ⑇       OCR Amount Of Check
0x2448  ⑈       OCR Customer Account Number
0x2449  ⑉       OCR Dash
0x244A  ⑊       OCR Double Backslash
0x244B          Classified
0x244C          Not Defined
0x244D          Not Defined
0x244E          Not Defined
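The range can also be inspected programmatically. The sketch below walks the code points 0x2440 to 0x245F and looks up each name with Python's standard `unicodedata` module. The helper name `ocr_block_entries` and the `<unassigned>` placeholder are assumptions made for this example. The names printed come from the Unicode database bundled with the interpreter, so their wording and assignment may differ from the table in this article.

```python
import unicodedata

# Bounds of the Optical Character Recognition range given in the text.
OCR_BLOCK_START = 0x2440
OCR_BLOCK_END = 0x245F


def ocr_block_entries():
    """Return (hex code, character, name) for every code point in the range.

    Code points with no assigned character get the placeholder name
    "<unassigned>" instead of raising ValueError.
    """
    entries = []
    for code_point in range(OCR_BLOCK_START, OCR_BLOCK_END + 1):
        char = chr(code_point)
        name = unicodedata.name(char, "<unassigned>")
        entries.append((f"0x{code_point:04X}", char, name))
    return entries


if __name__ == "__main__":
    for hex_code, char, name in ocr_block_entries():
        print(hex_code, char, name)
```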
