Chen–Ho encoding

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

Chen–Ho encoding is a memory-efficient alternate system of binary encoding for decimal digits.

The traditional system of binary encoding for decimal digits, known as binary-coded decimal (BCD), uses four bits to encode each digit. This wastes a significant share of the available code space, since four bits can represent 16 states but only 10 of them are used, even in packed BCD.

The encoding reduces the storage requirements of two decimal digits (100 states) from 8 to 7 bits, and those of three decimal digits (1000 states) from 12 to 10 bits, using only simple Boolean transformations and avoiding complex arithmetic operations such as a base conversion.

History

In what appears to have been a multiple discovery, some of the concepts behind what later became known as Chen–Ho encoding were independently developed by Theodore M. Hertz in 1969 and by Tien Chi Chen (陳天機) (1928–) in 1971.

Hertz of Rockwell filed a patent for his encoding in 1969, which was granted in 1971.

Chen first discussed his ideas with Irving Tze Ho (何宜慈) (1921–2003) in 1971. Chen and Ho were both working for IBM at the time, albeit in different locations. Chen also consulted with Frank Chin Tung to verify the results of his theories independently. IBM filed a patent in their name in 1973, which was granted in 1974. At least by 1973, Hertz's earlier work must have been known to them, as the patent cites his patent as prior art.

With input from Joseph D. Rutledge and John C. McPherson, the final version of the Chen–Ho encoding was circulated inside IBM in 1974 and published in 1975 in the journal Communications of the ACM. This version included several refinements, primarily related to the application of the encoding system. It constitutes a Huffman-like prefix code.

The encoding was referred to as Chen and Ho's scheme in 1975 and as Chen's encoding in 1982, and has been known as Chen–Ho encoding or the Chen–Ho algorithm since 2000. After filing a patent for it in 2001, Michael F. Cowlishaw published a further refinement of Chen–Ho encoding, known as densely packed decimal (DPD) encoding, in IEE Proceedings – Computers and Digital Techniques in 2002. DPD has subsequently been adopted as the decimal encoding used in the IEEE 754-2008 and ISO/IEC/IEEE 60559:2011 floating-point standards.

Application

Chen noted that the digits zero through seven were simply encoded using three binary digits of the corresponding octal group. He also postulated that one could use a flag to identify a different encoding for the digits eight and nine, which would be encoded using a single bit.

In practice, a series of Boolean transformations is applied to the stream of input bits, compressing BCD-encoded digits from 12 bits per three digits to 10 bits per three digits. The reverse transformations are used to decode the resulting coded stream back to BCD. Equivalent results can also be achieved with a look-up table.
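The Boolean transformations can be sketched in software with shifts and masks. The following is a minimal illustration, not the patented circuit: it assumes the bit layout of the final 1975 form tabulated later in this article, and the function names `encode_declet` and `decode_declet` are hypothetical.

```python
def encode_declet(d2: int, d1: int, d0: int) -> int:
    """Pack three decimal digits (0-9 each) into one 10-bit declet."""
    for d in (d2, d1, d0):
        if not 0 <= d <= 9:
            raise ValueError("each digit must be in 0..9")
    # The least significant BCD bit of each digit always lands in b6, b3, b0.
    c, f, i = d2 & 1, d1 & 1, d0 & 1
    # The two middle BCD bits of each digit; they are 00 for 8 and 9.
    ab, de, gh = (d2 >> 1) & 3, (d1 >> 1) & 3, (d0 >> 1) & 3
    big = (d2 >= 8, d1 >= 8, d0 >= 8)
    # (b9 b8 b7, b5 b4, b2 b1) for each combination of large digits.
    top, mid, low = {
        (False, False, False): (ab, de, gh),
        (True, False, False): (0b100, de, gh),
        (False, True, False): (0b101, ab, gh),
        (False, False, True): (0b110, de, ab),
        (False, True, True): (0b111, 0b00, ab),
        (True, False, True): (0b111, 0b01, de),
        (True, True, False): (0b111, 0b10, gh),
        (True, True, True): (0b111, 0b11, 0),  # don't-care bits written as 0
    }[big]
    return (top << 7) | (c << 6) | (mid << 4) | (f << 3) | (low << 1) | i


def _small(pair: int, lsb: int) -> int:
    """Rebuild a digit 0-7 from its two upper bits and its lowest bit."""
    return (pair << 1) | lsb


def decode_declet(code: int) -> tuple:
    """Unpack a 10-bit declet into three decimal digits (d2, d1, d0)."""
    if not 0 <= code < 1024:
        raise ValueError("a declet has exactly 10 bits")
    top = code >> 7
    c, f, i = (code >> 6) & 1, (code >> 3) & 1, code & 1
    mid, low = (code >> 4) & 3, (code >> 1) & 3
    if top < 4:  # b9 = 0: three small digits
        return _small(top, c), _small(mid, f), _small(low, i)
    if top == 4:
        return 8 | c, _small(mid, f), _small(low, i)
    if top == 5:
        return _small(mid, c), 8 | f, _small(low, i)
    if top == 6:
        return _small(low, c), _small(mid, f), 8 | i
    if mid == 0:
        return _small(low, c), 8 | f, 8 | i
    if mid == 1:
        return 8 | c, _small(low, f), 8 | i
    if mid == 2:
        return 8 | c, 8 | f, _small(low, i)
    return 8 | c, 8 | f, 8 | i  # b2 and b1 are ignored on read
```

Under these assumptions, digit triples made only of the digits 0 to 7 encode to their octal value, so `encode_declet(1, 2, 3)` equals `0o123`.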

Chen–Ho encoding is limited to encoding sets of three decimal digits into groups of 10 bits (so-called declets). Of the 1024 states possible with 10 bits, it leaves only 24 states unused (with don't-care bits typically set to 0 on write and ignored on read). With only 2.34% wastage, it stores 20% more digits per bit than BCD, which needs 12 bits rather than 10 for the same three digits.
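These figures follow directly from the sizes of the code spaces. A short check with exact fractions, using illustrative variable names that do not come from the source, might look like this:

```python
from fractions import Fraction

DIGITS_PER_GROUP = 3
DECLET_BITS = 10
BCD_BITS = 4 * DIGITS_PER_GROUP            # 12 bits for three BCD digits

total_states = 2 ** DECLET_BITS            # 1024 bit patterns
used_states = 10 ** DIGITS_PER_GROUP       # 1000 digit combinations
unused_states = total_states - used_states

# Share of the 10-bit code space that carries no information.
wastage = Fraction(unused_states, total_states)

# Extra digits stored per bit compared with 4-bit-per-digit BCD.
density_gain = Fraction(BCD_BITS, DECLET_BITS) - 1

# The same saving expressed as the fraction of bits removed.
bits_saved = 1 - Fraction(DECLET_BITS, BCD_BITS)

print(unused_states, float(wastage), float(density_gain), float(bits_saved))
```

The "20%" figure is the gain in digits per bit; measured as bits saved, the same change is one sixth, or about 16.7%.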

Both Hertz and Chen also proposed similar, but less efficient, encoding schemes to compress sets of two decimal digits (requiring 8 bits in BCD) into groups of 7 bits.

Larger sets of decimal digits could be divided into three- and two-digit groups.
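One possible grouping policy is sketched below. It is an assumption for illustration, not a rule stated in the source: it prefers three-digit groups, uses at most two two-digit groups to absorb the remainder, and falls back to plain 4-bit BCD for a lone digit. The names `split_into_groups` and `encoded_bits` are hypothetical.

```python
GROUP_BITS = {1: 4, 2: 7, 3: 10}  # bits needed per group size


def split_into_groups(digits: str) -> list:
    """Split a decimal string into 3- and 2-digit groups, left to right."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a non-empty string of ASCII digits")
    n = len(digits)
    if n == 1:
        return [digits]  # a single digit cannot be grouped
    # Remainder 0 needs no pairs, remainder 2 one pair, remainder 1 two pairs.
    pairs = {0: 0, 2: 1, 1: 2}[n % 3]
    triples = (n - 2 * pairs) // 3
    groups, pos = [], 0
    for size in [3] * triples + [2] * pairs:
        groups.append(digits[pos:pos + size])
        pos += size
    return groups


def encoded_bits(digits: str) -> int:
    """Total bits when every group is stored in its compressed form."""
    return sum(GROUP_BITS[len(g)] for g in split_into_groups(digits))
```

For example, `split_into_groups("1234567")` yields `['123', '45', '67']`, which occupies 10 + 7 + 7 = 24 bits instead of the 28 bits of BCD.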

The patents also discuss the possibility of adapting the scheme to digits encoded in decimal codes other than 8-4-2-1 BCD, such as Excess-3, Excess-6, Jump-at-2, Jump-at-8, Gray, Glixon, O'Brien type-I and Gray–Stibitz code. The same principles could also be applied to other bases.

In 1973, some form of Chen–Ho encoding appears to have been utilized in the address conversion hardware of the optional IBM 7070/7074 emulation feature for the IBM System/370 Model 165 and 370 Model 168 computers.

One prominent application uses a 128-bit register to store 33 decimal digits with a three-digit exponent, which is effectively no less than what could be achieved using binary encoding (whereas BCD encoding would need 144 bits to store the same number of digits).
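The bit budget behind this comparison can be reproduced as follows. This is only an arithmetic sketch: the split into a 33-digit coefficient and a 3-digit exponent is taken from the paragraph above, the variable names are illustrative, and the use of the remaining register bits is not described in the source.

```python
REGISTER_BITS = 128
COEFFICIENT_DIGITS = 33
EXPONENT_DIGITS = 3

# Chen-Ho: every group of three digits occupies one 10-bit declet.
chen_ho_bits = (COEFFICIENT_DIGITS // 3 + EXPONENT_DIGITS // 3) * 10

# BCD: four bits per digit.
bcd_bits = 4 * (COEFFICIENT_DIGITS + EXPONENT_DIGITS)

# Pure binary: bits needed for the largest value of each field.
binary_bits = ((10 ** COEFFICIENT_DIGITS - 1).bit_length()
               + (10 ** EXPONENT_DIGITS - 1).bit_length())

print(chen_ho_bits, bcd_bits, binary_bits)
print(chen_ho_bits <= REGISTER_BITS, bcd_bits <= REGISTER_BITS)
```

With these field sizes the declet form and the pure binary form need the same number of bits, while the BCD form does not fit into the register.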

Encodings for two decimal digits

Hertz encoding

Hertz decimal data encoding for a single heptad (1969 form)
| Code space (128 states) | Binary encoding b6 b5 b4 b3 b2 b1 b0 | Decimal digit d1 | Decimal digit d0 | Values encoded | Description | Occurrences (100 states) |
| 50% (64 states) | 0 a b c d e f | 0abc | 0def | (0–7) (0–7) | Two lower digits | 64% (64 states) |
| 12.5% (16 states) | 1 1 0 c d e f | 100c | 0def | (8–9) (0–7) | One lower digit, one higher digit | 16% (16 states) |
| 12.5% (16 states) | 1 0 1 f a b c | 0abc | 100f | (0–7) (8–9) | One lower digit, one higher digit | 16% (16 states) |
| 12.5% (16 states, 4 used) | 1 1 1 c x x f | 100c | 100f | (8–9) (8–9) | Two higher digits | 4% (4 states) |
| 12.5% (16 states, 0 used) | 1 0 0 x x x x | | | | | 0% (0 states) |
  • This encoding is not parity-preserving.

Early Chen–Ho encoding, method A

Decimal data encoding for a single heptad (early 1971 form, method A)
| Code space (128 states) | Binary encoding b6 b5 b4 b3 b2 b1 b0 | Decimal digit d1 | Decimal digit d0 | Values encoded | Description | Occurrences (100 states) |
| 50% (64 states) | 0 a b c d e f | 0abc | 0def | (0–7) (0–7) | Two lower digits | 64% (64 states) |
| 25% (32 states, 16 used) | 1 0 x(b) c d e f | 100c | 0def | (8–9) (0–7) | One lower digit, one higher digit | 16% (16 states) |
| 12.5% (16 states) | 1 1 0 f a b c | 0abc | 100f | (0–7) (8–9) | One lower digit, one higher digit | 16% (16 states) |
| 12.5% (16 states, 4 used) | 1 1 1 c x(a) x(b) f | 100c | 100f | (8–9) (8–9) | Two higher digits | 4% (4 states) |
  • This encoding is not parity-preserving.

Early Chen–Ho encoding, method B

Decimal data encoding for a single heptad (early 1971 form, method B)
| Code space (128 states) | Binary encoding b6 b5 b4 b3 b2 b1 b0 | Decimal digit d1 | Decimal digit d0 | Values encoded | Description | Occurrences (100 states) |
| 50% (64 states) | 0 a b c d e f | 0abc | 0def | (0–7) (0–7) | Two lower digits | 64% (64 states) |
| 12.5% (16 states) | 1 0 c 0 d e f | 100c | 0def | (8–9) (0–7) | One lower digit, one higher digit | 16% (16 states) |
| 12.5% (16 states, 4 used) | 1 0 c 1 x x f | 100c | 100f | (8–9) (8–9) | Two higher digits | 4% (4 states) |
| 12.5% (16 states) | 1 1 f 0 a b c | 0abc | 100f | (0–7) (8–9) | One lower digit, one higher digit | 16% (16 states) |
| 12.5% (16 states, 0 used) | 1 1 x 1 x x x | | | | | 0% (0 states) |
  • This encoding is not parity-preserving.

Patented and final Chen–Ho encoding

Decimal data encoding for a single heptad (patented 1973 form and final 1975 form)
| Code space (128 states) | Binary encoding b6 b5 b4 b3 b2 b1 b0 | Decimal digit d1 | Decimal digit d0 | Values encoded | Description | Occurrences (100 states) |
| 50% (64 states) | 0 a b c d e f | 0abc | 0def | (0–7) (0–7) | Two lower digits | 64% (64 states) |
| 25.0% (32 states, 16 used) | 1 0 x(b) c d e f | 100c | 0def | (8–9) (0–7) | One lower digit, one higher digit | 16% (16 states) |
| 12.5% (16 states) | 1 1 1 c a b f | 0abc | 100f | (0–7) (8–9) | One lower digit, one higher digit | 16% (16 states) |
| 12.5% (16 states, 4 used) | 1 1 0 c x(a) x(b) f | 100c | 100f | (8–9) (8–9) | Two higher digits | 4% (4 states) |
  • Assuming certain values for the don't-care bits (f.e. 0), this encoding is parity-preserving.
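The parity claim can be checked exhaustively. The sketch below transcribes the four rows of the table above into shifts and masks, writing every don't-care bit as 0; the function names `encode_heptad`, `decode_heptad` and `parity` are illustrative and do not appear in the patent.

```python
def encode_heptad(d1: int, d0: int) -> int:
    """Pack two decimal digits into 7 bits (patented 1973 / final 1975 form)."""
    if not (0 <= d1 <= 9 and 0 <= d0 <= 9):
        raise ValueError("each digit must be in 0..9")
    c, f = d1 & 1, d0 & 1
    ab = (d1 >> 1) & 3
    if d1 < 8 and d0 < 8:
        return (d1 << 3) | d0                         # 0 a b c d e f
    if d1 >= 8 and d0 < 8:
        return 0b1000000 | (c << 3) | d0              # 1 0 x c d e f, x = 0
    if d1 < 8:
        return 0b1110000 | (c << 3) | (ab << 1) | f   # 1 1 1 c a b f
    return 0b1100000 | (c << 3) | f                   # 1 1 0 c x x f, x = 0


def decode_heptad(code: int) -> tuple:
    """Unpack 7 bits into two decimal digits (d1, d0)."""
    if not 0 <= code < 128:
        raise ValueError("a heptad has exactly 7 bits")
    b3, b0 = (code >> 3) & 1, code & 1
    if code >> 6 == 0:
        return code >> 3, code & 7
    if (code >> 5) & 1 == 0:
        return 8 | b3, code & 7                       # b4 is ignored on read
    if (code >> 4) & 1:
        return (((code >> 1) & 3) << 1) | b3, 8 | b0
    return 8 | b3, 8 | b0                             # b2 and b1 are ignored


def parity(value: int) -> int:
    """Return 1 if the number of set bits is odd, else 0."""
    return bin(value).count("1") & 1


# Compare the parity of each 7-bit code with that of the 8-bit packed BCD pair.
parity_preserved = all(
    parity(encode_heptad(d1, d0)) == parity((d1 << 4) | d0)
    for d1 in range(10)
    for d0 in range(10)
)
print(parity_preserved)
```

Writing the don't-care bits as 1 instead of 0 would change the parity in the rows that contain a single don't-care bit, which is why the claim depends on the values chosen for them.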

Encodings for three decimal digits

Hertz encoding

Hertz decimal data encoding for a single declet (1969 form)
| Code space (1024 states) | Binary encoding b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 | Decimal digit d2 | Decimal digit d1 | Decimal digit d0 | Values encoded | Description | Occurrences (1000 states) |
| 50.0% (512 states) | 0 a b c d e f g h i | 0abc | 0def | 0ghi | (0–7) (0–7) (0–7) | Three lower digits | 51.2% (512 states) |
| 37.5% (384 states) | 1 0 0 c d e f g h i | 100c | 0def | 0ghi | (8–9) (0–7) (0–7) | Two lower digits, one higher digit | 38.4% (384 states) |
| | 1 0 1 f a b c g h i | 0abc | 100f | 0ghi | (0–7) (8–9) (0–7) | Two lower digits, one higher digit | |
| | 1 1 0 i a b c d e f | 0abc | 0def | 100i | (0–7) (0–7) (8–9) | Two lower digits, one higher digit | |
| 9.375% (96 states) | 1 1 1 f 0 0 i a b c | 0abc | 100f | 100i | (0–7) (8–9) (8–9) | One lower digit, two higher digits | 9.6% (96 states) |
| | 1 1 1 c 0 1 i d e f | 100c | 0def | 100i | (8–9) (0–7) (8–9) | One lower digit, two higher digits | |
| | 1 1 1 c 1 0 f g h i | 100c | 100f | 0ghi | (8–9) (8–9) (0–7) | One lower digit, two higher digits | |
| 3.125% (32 states, 8 used) | 1 1 1 c 1 1 f (0) (0) i | 100c | 100f | 100i | (8–9) (8–9) (8–9) | Three higher digits, bits b2 and b1 are don't care | 0.8% (8 states) |
  • This encoding is not parity-preserving.

Early Chen–Ho encoding

Decimal data encoding for a single declet (early 1971 form)
| Code space (1024 states) | Binary encoding b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 | Decimal digit d2 | Decimal digit d1 | Decimal digit d0 | Values encoded | Description | Occurrences (1000 states) |
| 50.0% (512 states) | 0 a b c d e f g h i | 0abc | 0def | 0ghi | (0–7) (0–7) (0–7) | Three lower digits | 51.2% (512 states) |
| 37.5% (384 states) | 1 0 0 c d e f g h i | 100c | 0def | 0ghi | (8–9) (0–7) (0–7) | Two lower digits, one higher digit | 38.4% (384 states) |
| | 1 0 1 f g h i a b c | 0abc | 100f | 0ghi | (0–7) (8–9) (0–7) | Two lower digits, one higher digit | |
| | 1 1 0 i a b c d e f | 0abc | 0def | 100i | (0–7) (0–7) (8–9) | Two lower digits, one higher digit | |
| 9.375% (96 states) | 1 1 1 0 0 f i a b c | 0abc | 100f | 100i | (0–7) (8–9) (8–9) | One lower digit, two higher digits | 9.6% (96 states) |
| | 1 1 1 0 1 i c d e f | 100c | 0def | 100i | (8–9) (0–7) (8–9) | One lower digit, two higher digits | |
| | 1 1 1 1 0 c f g h i | 100c | 100f | 0ghi | (8–9) (8–9) (0–7) | One lower digit, two higher digits | |
| 3.125% (32 states, 8 used) | 1 1 1 1 1 c f i (0) (0) | 100c | 100f | 100i | (8–9) (8–9) (8–9) | Three higher digits, bits b1 and b0 are don't care | 0.8% (8 states) |
  • This encoding is not parity-preserving.

Patented Chen–Ho encoding

Decimal data encoding for a single declet (patented 1973 form)
| Code space (1024 states) | Binary encoding b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 | Decimal digit d2 | Decimal digit d1 | Decimal digit d0 | Values encoded | Description | Occurrences (1000 states) |
| 50.0% (512 states) | 0 a b d e g h c f i | 0abc | 0def | 0ghi | (0–7) (0–7) (0–7) | Three lower digits | 51.2% (512 states) |
| 37.5% (384 states) | 1 0 0 d e g h c f i | 100c | 0def | 0ghi | (8–9) (0–7) (0–7) | Two lower digits, one higher digit | 38.4% (384 states) |
| | 1 0 1 a b g h c f i | 0abc | 100f | 0ghi | (0–7) (8–9) (0–7) | Two lower digits, one higher digit | |
| | 1 1 0 d e a b c f i | 0abc | 0def | 100i | (0–7) (0–7) (8–9) | Two lower digits, one higher digit | |
| 9.375% (96 states) | 1 1 1 1 0 a b c f i | 0abc | 100f | 100i | (0–7) (8–9) (8–9) | One lower digit, two higher digits | 9.6% (96 states) |
| | 1 1 1 0 1 d e c f i | 100c | 0def | 100i | (8–9) (0–7) (8–9) | One lower digit, two higher digits | |
| | 1 1 1 0 0 g h c f i | 100c | 100f | 0ghi | (8–9) (8–9) (0–7) | One lower digit, two higher digits | |
| 3.125% (32 states, 8 used) | 1 1 1 1 1 (0) (0) c f i | 100c | 100f | 100i | (8–9) (8–9) (8–9) | Three higher digits, bits b4 and b3 are don't care | 0.8% (8 states) |
  • This encoding is not parity-preserving.

Final Chen–Ho encoding

Chen-Ho decimal data encoding for a single declet (final 1975 form)
| Code space (1024 states) | Binary encoding b9 b8 b7 b6 b5 b4 b3 b2 b1 b0 | Decimal digit d2 | Decimal digit d1 | Decimal digit d0 | Values encoded | Description | Occurrences (1000 states) |
| 50.0% (512 states) | 0 a b c d e f g h i | 0abc | 0def | 0ghi | (0–7) (0–7) (0–7) | Three lower digits | 51.2% (512 states) |
| 37.5% (384 states) | 1 0 0 c d e f g h i | 100c | 0def | 0ghi | (8–9) (0–7) (0–7) | Two lower digits, one higher digit | 38.4% (384 states) |
| | 1 0 1 c a b f g h i | 0abc | 100f | 0ghi | (0–7) (8–9) (0–7) | Two lower digits, one higher digit | |
| | 1 1 0 c d e f a b i | 0abc | 0def | 100i | (0–7) (0–7) (8–9) | Two lower digits, one higher digit | |
| 9.375% (96 states) | 1 1 1 c 0 0 f a b i | 0abc | 100f | 100i | (0–7) (8–9) (8–9) | One lower digit, two higher digits | 9.6% (96 states) |
| | 1 1 1 c 0 1 f d e i | 100c | 0def | 100i | (8–9) (0–7) (8–9) | One lower digit, two higher digits | |
| | 1 1 1 c 1 0 f g h i | 100c | 100f | 0ghi | (8–9) (8–9) (0–7) | One lower digit, two higher digits | |
| 3.125% (32 states, 8 used) | 1 1 1 c 1 1 f (0) (0) i | 100c | 100f | 100i | (8–9) (8–9) (8–9) | Three higher digits, bits b2 and b1 are don't care | 0.8% (8 states) |
  • This encoding is not parity-preserving.
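The table can also be turned into look-up tables mechanically. The sketch below is one way to do so, assuming the rows are transcribed as pattern strings in which `x` marks a don't-care bit; the names `ROWS`, `DECODE` and `ENCODE` are illustrative. It also counts the redundant codes and searches for a digit triple whose parity changes.

```python
# (10-bit pattern b9..b0, 12-bit BCD template d2 d1 d0) for the 1975 form.
ROWS = [
    ("0abcdefghi", "0abc0def0ghi"),
    ("100cdefghi", "100c0def0ghi"),
    ("101cabfghi", "0abc100f0ghi"),
    ("110cdefabi", "0abc0def100i"),
    ("111c00fabi", "0abc100f100i"),
    ("111c01fdei", "100c0def100i"),
    ("111c10fghi", "100c100f0ghi"),
    ("111c11fxxi", "100c100f100i"),
]


def decode_with_rows(code: int) -> tuple:
    """Decode one declet by matching it against the table rows."""
    bits = format(code, "010b")
    for pattern, template in ROWS:
        # A row matches when all of its literal 0/1 positions agree.
        if all(p == b for p, b in zip(pattern, bits) if p in "01"):
            env = {p: b for p, b in zip(pattern, bits) if p not in "01x"}
            bcd = "".join(env.get(t, t) for t in template)
            return tuple(int(bcd[k:k + 4], 2) for k in (0, 4, 8))
    raise ValueError("no row matched")


# Decoding table for all 1024 codes, and an encoding table that keeps the
# lowest code for each digit triple (the one with don't-care bits set to 0).
DECODE = [decode_with_rows(code) for code in range(1024)]
ENCODE = {}
for code, digits in enumerate(DECODE):
    ENCODE.setdefault(digits, code)

redundant_codes = len(DECODE) - len(ENCODE)


def parity(value: int) -> int:
    """Return 1 if the number of set bits is odd, else 0."""
    return bin(value).count("1") & 1


# Digit triples whose declet parity differs from their 12-bit BCD parity.
parity_changes = [
    d for d, code in ENCODE.items()
    if parity(code) != parity((d[0] << 8) | (d[1] << 4) | d[2])
]
print(len(ENCODE), redundant_codes, len(parity_changes) > 0)
```

Because the rows are plain data, the same function could be pointed at the Hertz or the 1973 layout by swapping the pattern strings.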

Storage efficiency

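| Digits | States | BCD bits | Binary code space (states) | Binary encoding bits | 2-digit encoding bits | 3-digit encoding bits | Mixed encoding bits | Mixed vs. binary (bits) | Mixed vs. BCD (bits) |
| 1 | 10 | 4 | 16 | 4 | (7) | (10) | 4 | 0 | 0 |
| 2 | 100 | 8 | 128 | 7 | 7 | (10) | 7 | 0 | −1 |
| 3 | 1000 | 12 | 1024 | 10 | (14) | 10 | 10 | 0 | −2 |
| 4 | 10000 | 16 | 16384 | 14 | 14 | (20) | 14 | 0 | −2 |
| 5 | 100000 | 20 | 131072 | 17 | (21) | (20) | 17 | 0 | −3 |
| 6 | 1000000 | 24 | 1048576 | 20 | 21 | 20 | 20 | 0 | −4 |
| 7 | 10000000 | 28 | 16777216 | 24 | (28) | (30) | 24 | 0 | −4 |
| 8 | 100000000 | 32 | 134217728 | 27 | 28 | (30) | 27 | 0 | −5 |
| 9 | 1000000000 | 36 | 1073741824 | 30 | (35) | 30 | 30 | 0 | −6 |
| 10 | 10000000000 | 40 | 17179869184 | 34 | 35 | (40) | 34 | 0 | −6 |
| 11 | 100000000000 | 44 | 137438953472 | 37 | (42) | (40) | 37 | 0 | −7 |
| 12 | 1000000000000 | 48 | 1099511627776 | 40 | 42 | 40 | 40 | 0 | −8 |
| 13 | 10000000000000 | 52 | 17592186044416 | 44 | (49) | (50) | 44 | 0 | −8 |
| 14 | 100000000000000 | 56 | 140737488355328 | 47 | 49 | (50) | 47 | 0 | −9 |
| 15 | 1000000000000000 | 60 | 1125899906842624 | 50 | (56) | 50 | 50 | 0 | −10 |
| 16 | 10000000000000000 | 64 | 18014398509481984 | 54 | 56 | (60) | 54 | 0 | −10 |
| 17 | 100000000000000000 | 68 | 144115188075855872 | 57 | (63) | (60) | 57 | 0 | −11 |
| 18 | 1000000000000000000 | 72 | 1152921504606846976 | 60 | 63 | 60 | 60 | 0 | −12 |
| 19 | 10000000000000000000 | 76 | 18446744073709551616 | 64 | (70) | (70) | 64 | 0 | −12 |
| 20 | | 80 | | 67 | 70 | (70) | 67 | 0 | −13 |
| 21 | | 84 | | 70 | (77) | 70 | 70 | 0 | −14 |
| 22 | | 88 | | 74 | 77 | (80) | 74 | 0 | −14 |
| 23 | | 92 | | 77 | (84) | (80) | 77 | 0 | −15 |
| 24 | | 96 | | 80 | 84 | 80 | 80 | 0 | −16 |
| 25 | | 100 | | 84 | (91) | (90) | 84 | 0 | −16 |
| 26 | | 104 | | 87 | 91 | (90) | 87 | 0 | −17 |
| 27 | | 108 | | 90 | (98) | 90 | 90 | 0 | −18 |
| 28 | | 112 | | 94 | 98 | (100) | 94 | 0 | −18 |
| 29 | | 116 | | 97 | (105) | (100) | 97 | 0 | −19 |
| 30 | | 120 | | 100 | 105 | 100 | 100 | 0 | −20 |
| 31 | | 124 | | 103 | (112) | (110) | 104 | +1 | −20 |
| 32 | | 128 | | 107 | 112 | (110) | 107 | 0 | −21 |
| 33 | | 132 | | 110 | (119) | 110 | 110 | 0 | −22 |
| 34 | | 136 | | 113 | 119 | (120) | 114 | +1 | −22 |
| 35 | | 140 | | 117 | (126) | (120) | 117 | 0 | −23 |
| 36 | | 144 | | 120 | 126 | 120 | 120 | 0 | −24 |
| 37 | | 148 | | 123 | (133) | (130) | 124 | +1 | −24 |
| 38 | | 152 | | 127 | 133 | (130) | 127 | 0 | −25 |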
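The bit counts in the table can be regenerated with integer arithmetic. The sketch below rests on two assumptions about how the columns were derived: the mixed encoding uses one declet per three digits plus 7 bits for a leftover pair or 4 bits for a leftover single digit, and the pure 2-digit and 3-digit encodings round the digit count up to whole groups. The function names are illustrative.

```python
def bcd_bits(n: int) -> int:
    """Bits for n digits in 4-bit BCD."""
    return 4 * n


def binary_bits(n: int) -> int:
    """Bits needed to store any n-digit number in pure binary."""
    return (10 ** n - 1).bit_length()


def two_digit_bits(n: int) -> int:
    """Bits when using only 7-bit heptads (digit count rounded up)."""
    return 7 * -(-n // 2)


def three_digit_bits(n: int) -> int:
    """Bits when using only 10-bit declets (digit count rounded up)."""
    return 10 * -(-n // 3)


def mixed_bits(n: int) -> int:
    """Declets for full triples, then 7 or 4 bits for the remainder."""
    return 10 * (n // 3) + (0, 4, 7)[n % 3]


def table_row(n: int) -> tuple:
    """One table row: digits, BCD, binary, 2-digit, 3-digit, mixed, deltas."""
    mixed = mixed_bits(n)
    return (n, bcd_bits(n), binary_bits(n), two_digit_bits(n),
            three_digit_bits(n), mixed,
            mixed - binary_bits(n), mixed - bcd_bits(n))


for n in range(1, 39):
    print(*table_row(n))
```

Under these assumptions, the mixed encoding first falls one bit behind pure binary at 31 digits, matching the +1 entries in the table.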

Notes

  1. Some 4-bit decimal codes are particularly well suited as alternatives to the 8-4-2-1 BCD code: Jump-at-8 code uses the same values for the ordered states 0 to 7, whereas in the Gray BCD and Glixon codes the values for the states 0 to 7 are still from the same set, but ordered differently (which, however, is transparent for the Hertz, Chen–Ho or densely packed decimal (DPD) encodings, as they pass the bits through unaltered). In these four codes, the most significant bit can be used as a flag denoting "large" values. For the two "large" values, all bits but one remain static (the two middle bits are always zero for 8-4-2-1 and one for Jump-at-8 code, whilst for Gray BCD code one bit is set and the other cleared, whereas for Glixon code the two lower bits are always zero and one bit is inverted, so that the two "large" values are transparently swapped), requiring only minor adaptations in the encoding. Three other codes can be conveniently split into groups of eight and two states as well, containing values from two ranges of consecutive bit patterns. In the case of the Excess-6 BCD and Jump-at-2 codes, the most significant bit can still be used to distinguish between the two groups; however, compared to the Jump-at-8 code, the group of small values now contains only two states and the larger group contains the eight larger values. In the case of the O'Brien type-I and Gray–Stibitz code, the next-most significant bit can serve as a flag bit instead, with the remaining bits again forming two groups of consecutive values. Therefore, these differences remain transparent for the encoding.
