Endianness: Difference between revisions

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.
Revision as of 13:22, 8 March 2004 by 129.13.169.157 (no edit summary)
Revision as of 16:41, 17 March 2004 by AxelBoldt (edit summary: endianness in spoken and written language, changed misleading example, merged paragraph about CPUs with material above)

Revision as of 16:41, 17 March 2004


When an integer or any other data is represented with multiple bytes, the order of those bytes in memory, or the sequence in which they are transmitted over some medium, is a matter of convention. This is similar to the situation in written languages, where some are written left-to-right and others right-to-left. The convention is called endianness, and the two most common conventions are big-endian and little-endian. Endianness is also referred to as byte sex.

Endianness in computers

When some computers store a 32-bit integer value in memory, for example 0xA0B70708 (in hexadecimal notation), they store it as bytes in the following order: A0 B7 07 08. That is, the most significant byte (A0 in our example) is stored at the memory location with the lowest address, the next most significant byte (B7) is stored at the next memory location, and so on.

Architectures that follow this rule are called big-endian and include Motorola 68000, SPARC and IBM 370.

Other computers store 0xA0B70708 as 08 07 B7 A0, that is, least significant byte first. Architectures that follow this rule are called little-endian and include the MOS Technology 650x, Intel x86 and VAX.
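As a minimal sketch of the two layouts described above, Python's built-in int.to_bytes can produce both byte orders for the example value. The variable names are illustrative only; the position of a byte in the resulting sequence stands in for its memory address.

```python
value = 0xA0B70708

# Big-endian: the most significant byte comes first (lowest address).
big = value.to_bytes(4, byteorder="big")

# Little-endian: the least significant byte comes first.
little = value.to_bytes(4, byteorder="little")

print(big.hex(" "))     # a0 b7 07 08
print(little.hex(" "))  # 08 07 b7 a0

# Reading the bytes back with the matching order recovers the value.
assert int.from_bytes(big, "big") == value
assert int.from_bytes(little, "little") == value
```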

Some architectures can be configured either way; these include ARM, PowerPC, DEC Alpha and MIPS. The word bytesexual, said of hardware, denotes willingness to compute or pass data in either big-endian or little-endian format (depending, presumably, on a mode bit somewhere).

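Still other (generally older) architectures, called middle-endian, may have a more complicated ordering such that the bytes within a 16-bit unit are ordered differently from the 16-bit units within a 32-bit word. For instance, 0xA0B70708 might be stored as 07 08 A0 B7 (low 16-bit unit first, bytes within each unit most significant first). Middle-endian architectures include the PDP-11, which stores 32-bit values the other way around: high 16-bit unit first, with the bytes within each unit least significant first, giving B7 A0 08 07.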
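A mixed ordering can be sketched by splitting the value into two 16-bit units and choosing the order separately at each level. The helper name mixed_endian and its parameters are hypothetical, introduced only for this illustration.

```python
def mixed_endian(value, unit_order, byte_order):
    """Lay out a 32-bit value as two 16-bit units.

    unit_order: "big" puts the high 16-bit unit first, "little" the low one.
    byte_order: order of the two bytes inside each unit ("big" or "little").
    """
    high, low = (value >> 16) & 0xFFFF, value & 0xFFFF
    units = [high, low] if unit_order == "big" else [low, high]
    return b"".join(u.to_bytes(2, byte_order) for u in units)

value = 0xA0B70708
print(mixed_endian(value, "little", "big").hex(" "))  # 07 08 a0 b7
print(mixed_endian(value, "big", "little").hex(" "))  # b7 a0 08 07
```

The second layout (high unit first, bytes swapped within each unit) is the one usually attributed to 32-bit values on the PDP-11. Using the same order at both levels reduces to plain big-endian or little-endian.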

Endianness in communications

In general, the NUXI problem is the problem of transferring data between computers with differing byte order. For example, the string "UNIX", packed two characters per 16-bit word, might look like "NUXI" on a machine with a different "byte sex", because the two bytes within each word are read in the opposite order.
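The effect can be reproduced with a short sketch that swaps the two bytes inside every 16-bit word of a byte string. The function name swap_bytes_in_words is an assumption made for this example.

```python
def swap_bytes_in_words(data):
    """Swap the two bytes inside each 16-bit word of an even-length byte string."""
    if len(data) % 2:
        raise ValueError("length must be a multiple of 2")
    out = bytearray(data)
    # Bytes at even offsets receive their odd neighbours and vice versa.
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)

print(swap_bytes_in_words(b"UNIX"))  # b'NUXI'
```

Applying the swap twice returns the original string, which is why the same routine can serve for both directions of a transfer.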

The Internet Protocol defines a standard "big-endian" network byte order, where binary values are in general encoded into packets, and sent out over the network, most significant byte first. This occurs regardless of the native endianness of the host CPU.
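As a sketch of how software typically follows this convention, Python's struct module offers the "!" prefix, which packs values in network (big-endian) order whatever the host CPU uses. The variable names are illustrative.

```python
import struct

value = 0xA0B70708

# "!" selects network byte order; "I" is an unsigned 32-bit integer.
packet_field = struct.pack("!I", value)
print(packet_field.hex(" "))  # a0 b7 07 08

# The receiver unpacks with the same format to get the value back.
(decoded,) = struct.unpack("!I", packet_field)
assert decoded == value
```

Because the byte order is stated explicitly in the format string, the same code produces the same bytes on big-endian and little-endian hosts.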

Serial devices also have bit-endianness: the bits in a byte can be sent little-endian (least significant bit first) or big-endian (most significant bit first). This decision is made at the very bottom of the data link layer of the OSI model.
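The two bit orders can be illustrated by listing the bits of one byte in the sequence in which they would be sent. This is only a model of the ordering, not of any particular serial protocol; both function names are hypothetical.

```python
def bits_lsb_first(byte):
    """Bits of one byte in little-endian bit order (bit 0 sent first)."""
    return [(byte >> i) & 1 for i in range(8)]

def bits_msb_first(byte):
    """Bits of one byte in big-endian bit order (bit 7 sent first)."""
    return [(byte >> i) & 1 for i in range(7, -1, -1)]

print(bits_lsb_first(0xA0))  # [0, 0, 0, 0, 0, 1, 0, 1]
print(bits_msb_first(0xA0))  # [1, 0, 1, 0, 0, 0, 0, 0]
```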

Endianness, software, and portability

Endianness has implications for software portability. For example, when data stored in a binary format is read as a multi-byte integer and a bitmask is applied to it, the result depends on endianness: the same mask selects different bytes of the stored data on big-endian and little-endian machines.
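A small sketch makes the dependence concrete: the same four stored bytes are interpreted under both conventions and then masked with the same value. The names raw and mask are illustrative.

```python
raw = bytes([0xA0, 0xB7, 0x07, 0x08])  # bytes as they appear in the file
mask = 0x000000FF                      # selects the least significant byte

as_big = int.from_bytes(raw, "big")        # 0xA0B70708
as_little = int.from_bytes(raw, "little")  # 0x0807B7A0

print(hex(as_big & mask))     # 0x8
print(hex(as_little & mask))  # 0xa0
```

Portable code avoids the problem by decoding with an explicitly stated byte order, as above, rather than relying on the byte order of the host.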

When software writes binary data in a common file format, it must use the byte order that the format specifies. For example, the BMP bitmap format requires little-endian integers; if the data is written using big-endian integers, the file will be corrupt because it does not match the format.
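As a hedged sketch, the 14-byte BMP file header (signature, file size, two reserved fields, offset of the pixel data) can be packed with an explicit little-endian format string. The function name and the sizes passed to it are made up for illustration, and this is not a complete BMP writer.

```python
import struct

def bmp_file_header(file_size, pixel_offset):
    """Pack the 14-byte BMP file header with little-endian integers."""
    # "<" forces little-endian with no padding, regardless of the host CPU.
    # 2s = signature, I = file size, H H = reserved, I = pixel data offset.
    return struct.pack("<2sIHHI", b"BM", file_size, 0, 0, pixel_offset)

header = bmp_file_header(file_size=70, pixel_offset=54)
print(header.hex(" "))  # 42 4d 46 00 00 00 00 00 00 00 36 00 00 00
```

Replacing "<" with ">" would emit the file size as 00 00 00 46, which a BMP reader would interpret as a very different number.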

The OPENSTEP operating system has software that swaps the bytes of integers and other C datatypes in order to preserve the correct endianness, since software running on OPENSTEP for PA-RISC is intended to be portable to OPENSTEP running on Mach/i386.
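Byte swapping of this kind can be sketched with shifts and masks. The function below is a generic illustration with a hypothetical name; it is not OPENSTEP's actual API.

```python
def bswap32(x):
    """Reverse the byte order of a 32-bit unsigned integer."""
    x &= 0xFFFFFFFF
    return (((x & 0x000000FF) << 24) |
            ((x & 0x0000FF00) << 8) |
            ((x & 0x00FF0000) >> 8) |
            ((x & 0xFF000000) >> 24))

print(hex(bswap32(0xA0B70708)))  # 0x807b7a0
```

Swapping twice yields the original value, so one routine converts in both directions between big-endian and little-endian representations.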

Discussion, background

Big-endian numbers are easier to read when debugging a program, because a memory dump shows the bytes in the order in which the number is written, but are arguably less intuitive (because the high byte is at the smaller address); similarly, little-endian numbers are arguably more intuitive but harder to read when debugging. The choice of big-endian vs. little-endian for a CPU design has sparked many flame wars. Emphasizing the futility of this argument, the very terms big-endian and little-endian were taken from the Big-Endians and Little-Endians of Jonathan Swift's Gulliver's Travels, two peoples of Lilliput and Blefuscu in conflict over which end of an egg to crack.

See the Endian FAQ (external link, below), including the significant essay "On holy wars and a plea for peace" by Danny Cohen (1980).

The written system of Arabic numerals is used world-wide and is such that the most significant digits are always written to the left of the less significant ones. In languages that write text left-to-right, this system is therefore big-endian; in languages that write right-to-left, it is little-endian. The spoken numeral system in English is big-endian (with minor exceptions: we say "seventeen" instead of "ten-seven"). German uses a mixture of big- and little-endianness: 376 is pronounced as "Dreihundertsechsundsiebzig", i.e. "three hundred six and seventy".

External links


Parts of this article were originally based on material from FOLDOC, used with permission.