Byte

The byte [baɪt] is a unit of measurement in digital technology and computer science that stands for a sequence of bits, usually 8. Historically, a byte was the number of bits used to encode a single text character in a given computer system, and it is therefore the smallest addressable element in many computer architectures. To refer specifically to a group of exactly 8 bits, the term octet is used.

Definitions

What exactly a byte denotes is defined slightly differently depending on the application. The term may stand for:

A measure of an amount of data of 8 bits, with the unit symbol "B"; it does not depend on the order of the individual bits.
An ordered sequence (n-tuple) of 8 bits, whose formal ISO-compliant name is octet (1 byte = 8 bits). An octet is sometimes split into two halves (nibbles) of 4 bits each, and each nibble can be represented by one hexadecimal digit. An octet can thus be represented by two hexadecimal digits.
The smallest amount of data addressable via the address bus of a given technical system, for example:
Telex: 1 character = 5 bits
IBM 1401: 1 character = 6 bits
ASCII: 1 character = 7 bits
IBM PC: 1 character = 8 bits = 1 octet
Nixdorf 820: 1 character = 12 bits
UNIVAC 1100/2200 and OS2200 series: 1 character = 9 bits (ASCII code) or 6 bits (FIELDATA code)
PDP-10 family: 1 character = 1 to 36 bits, byte length freely selectable
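
The split of an octet into two nibbles and two hexadecimal digits, as described in the list above, can be illustrated with a short Python sketch. The function names are hypothetical and chosen only for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"


def split_nibbles(octet: int) -> tuple[int, int]:
    """Return (high nibble, low nibble) of an 8-bit value."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("an octet must be in the range 0..255")
    # The upper 4 bits are obtained by shifting, the lower 4 by masking.
    return octet >> 4, octet & 0x0F


def to_hex_digits(octet: int) -> str:
    """Represent an octet by exactly two hexadecimal digits, one per nibble."""
    high, low = split_nibbles(octet)
    return HEX_DIGITS[high] + HEX_DIGITS[low]


print(split_nibbles(0xA7))  # (10, 7)
print(to_hex_digits(0xA7))  # A7
```

Each nibble takes one of 16 values, which is why a single hexadecimal digit is enough for it.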

In most of today's computers, these definitions (smallest addressable unit, data type in programming languages, C data type) coincide and are then identical in size.

Because systems based on eight bits (or on power-of-two multiples of eight bits) are so widespread, the term byte is used to describe an eight-bit quantity, which in formal language (according to ISO standards) is correctly called an octet. As a unit of measurement for size specifications, German uses the term byte in the sense of 8 bits. A byte can be transmitted in parallel (all bits at once) or serially (all bits one after the other). Check bits are often added to ensure correctness. When larger amounts of data are transmitted, further transmission protocols are possible. Thus 32-bit computers often transfer 32 bits (four bytes) together in one step, even if only a single 8-bit tuple needs to be transferred. This simplifies the algorithms needed for computation and allows a smaller instruction set.
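
The text does not say which kind of check bit is meant. As a minimal sketch, assuming a single even-parity bit and a least-significant-bit-first serial order (both assumptions made only for this illustration), serial transmission of one byte could look like this in Python:

```python
def even_parity_bit(octet: int) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    return bin(octet & 0xFF).count("1") % 2


def serialize(octet: int) -> list[int]:
    """Emit the 8 data bits, least significant first, then the parity bit."""
    bits = [(octet >> i) & 1 for i in range(8)]
    return bits + [even_parity_bit(octet)]


def is_valid(frame: list[int]) -> bool:
    """A 9-bit frame is accepted if it contains an even number of 1 bits."""
    return len(frame) == 9 and sum(frame) % 2 == 0


frame = serialize(0x41)
print(frame)            # [1, 0, 0, 0, 0, 0, 1, 0, 0]
print(is_valid(frame))  # True

frame[3] ^= 1           # simulate a single-bit transmission error
print(is_valid(frame))  # False
```

A single parity bit detects any odd number of flipped bits but cannot locate or correct them.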

History of the term

Bit is a portmanteau of binary and digit, that is, a two-valued digit: zero or one. Its components can be traced back to the Latin words digitus (finger), fingers having been used for counting since antiquity (see Plautus: "computare digitis"), and Latin (more precisely New Latin) binarius (twofold), cf. Latin bis (twice).

The word byte is an artificial coinage derived from the English words bit (a small piece) and bite (a mouthful). It was used to denote an amount of memory or data sufficient to represent one character. The term was coined in 1956 by Werner Buchholz during an early design phase of an IBM computer. Originally it described a width of six bits and stood for the smallest directly addressable memory unit of the computer in question, enough to store letters and common special characters, for example in program source code or other texts. The spelling bite was changed to byte to avoid accidental confusion with bit.

In the 1960s the rapidly spreading ASCII character set was defined, which still made do with a width of seven bits. Later, character sets that extended ASCII by one bit came into use, such as code page 437, which could also represent the most common international diacritics. In these extended ASCII character sets, each character corresponds to exactly one byte of eight bits.

For a short time around 1970 there were 4-bit processors, whose 4-bit data words (also called nibbles) can be represented by a single hexadecimal digit. 8-bit processors were introduced shortly after the invention of the programming languages C and Pascal, that is, in the early 1970s, and were in use in home computers into the 1980s (and in embedded systems even today); their 8-bit data words (bytes) can be represented by exactly two hexadecimal digits. Since then, the width of hardware data words has doubled again and again, from 4 through 8, 16 and 32 up to 64 and 128 bits today.

To distinguish the original meaning as the smallest addressable unit of information from the meaning as an 8-bit tuple, the literature (depending on the field) correctly uses the term octet for the latter, in order to achieve a clear separation.

Practical Uses

In electronic data processing, the smallest possible storage unit is called a bit. A bit can have one of two possible states, usually referred to as "zero" and "one". Many programming languages use the data type "boolean" (or "Boolean" or "BOOLEAN") for a single bit. For technical reasons, however, a Boolean is usually actually stored in the form of a data word.

Eight such bits are combined into one unit, a data packet so to speak, and commonly called a byte. The official ISO-compliant designation, however, is octet: 1 octet = 1 byte = 8 bits. Many programming languages support a data type named "byte" (or "Byte" or "BYTE"). Depending on the definition, it may be treated as an integer, as a set of bits, as an element of a character set or, in type-unsafe programming languages, even as several of these data types at once, so that assignment compatibility no longer holds.
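
The different readings of one and the same byte can be shown with Python's built-in bytes type. This is only an illustrative sketch: the value 0x41 and the variable names are arbitrary choices, and other languages define their byte types differently.

```python
raw = bytes([0x41])  # a single octet

# Read as an integer in the range 0..255.
as_integer = raw[0]

# Read as a set of bits: the positions (0 = least significant) that are set.
set_bit_positions = sorted(i for i in range(8) if (raw[0] >> i) & 1)

# Read as an element of a character set, here ASCII.
as_character = raw.decode("ascii")

print(as_integer)              # 65
print(format(raw[0], "08b"))   # 01000001
print(set_bit_positions)       # [0, 6]
print(as_character)            # A
```

Python keeps these views separate (an int, a list, a str), whereas a type-unsafe language may allow all of them on the same variable.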

The byte is the standard unit for specifying storage capacities or amounts of data. This includes file sizes, the capacity of permanent storage media (hard drives, CDs, DVDs, Blu-ray discs, floppy disks, USB mass storage devices, and so on) and the capacity of many volatile memories (e.g. main memory). Transmission rates (for example, the maximum speed of an Internet connection), by contrast, are usually given in bits.

Meanings of decimal and binary prefixes for large numbers of bytes

SI prefixes based on powers of 10

For data storage with binary addressing, memory capacities of 2^n bytes result, that is, powers of two. Since there were no special unit prefixes for powers of two until 1996, it was common to use the SI prefixes in connection with storage capacities to denote powers of two (with a factor of 2^10 = 1024 instead of 1000). An example: 1 kilobyte (kB) = 1024 bytes, 1 megabyte (MB) = 1024 kilobytes = 1,048,576 bytes.

Occasionally mixed forms also occur, for example in the stated storage capacity of a 3.5-inch floppy disk: 1.44 MB = 1440 kB = 1440 × 1024 bytes.
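
The floppy-disk figure can be checked with a few lines of Python. The sketch below only restates the arithmetic from the text; the variable names are illustrative.

```python
# Capacity as stated in the text: 1440 "kB" with 1 kB taken as 1024 bytes.
floppy_bytes = 1440 * 1024

# The same capacity under three different readings of "mega":
decimal_mb = floppy_bytes / (1000 * 1000)  # strictly decimal
binary_mib = floppy_bytes / (1024 * 1024)  # strictly binary
mixed_mb = floppy_bytes / (1000 * 1024)    # the mixed form on the label

print(floppy_bytes)  # 1474560
print(decimal_mb)    # 1.47456
print(binary_mib)    # 1.40625
print(mixed_mb)      # 1.44
```

The advertised "1.44 MB" therefore matches neither the purely decimal nor the purely binary megabyte.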

IEC prefixes based on powers of 2

To avoid ambiguity, in 1996 the IEC proposed new unit prefixes that are to be used only in the binary sense. An example: 1 kibibyte (KiB) = 1024 bytes, 1 mebibyte (MiB) = 1024 × 1024 bytes = 1,048,576 bytes.

The International Bureau of Weights and Measures (BIPM), the body responsible for the SI prefixes, recommends this notation and expressly advises against using the SI prefixes in a binary sense. The designation of powers of two by binary prefixes according to the earlier IEC 60027-2 was carried over unchanged when that standard was replaced by the worldwide ISO standard IEC 80000-13:2008 (or DIN EN 80000-13:2009-01). The standard also recommends using the SI prefixes only in their decimal meaning, so that powers of two and powers of ten both have unambiguous names, such as: 1 kilobyte (kB) = 1000 bytes and 1 kibibyte (KiB) = 1024 bytes.

Many standardization organizations have endorsed this recommendation (see binary prefix).

Comparison

The following table gives an overview of the possible unit prefixes and their meanings:

The larger the decimal and binary prefixes, the more important the distinction becomes, because the nominal difference grows. Between kB and KiB it is only 2.4%, but between TB and TiB it is already about 10%.
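
How the gap between decimal and binary prefixes grows can be computed directly. The following Python sketch is an illustration only: the helper name relative_difference is hypothetical, and the list of prefixes is limited to the common ones.

```python
DECIMAL_SYMBOLS = ["kB", "MB", "GB", "TB", "PB", "EB"]
BINARY_SYMBOLS = ["KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def relative_difference(step: int) -> float:
    """Return how much larger 1024**step is than 1000**step, as a fraction."""
    return 1024**step / 1000**step - 1


for step, (dec, binary) in enumerate(zip(DECIMAL_SYMBOLS, BINARY_SYMBOLS), start=1):
    percent = relative_difference(step) * 100
    print(f"{binary} is {percent:.1f}% larger than {dec}")
```

The first line printed is the 2.4% for kB versus KiB quoted above, and the fourth is just under 10% for TB versus TiB.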

Capacity specifications for storage media

Manufacturers of storage media such as hard disks, blank DVDs and USB memory sticks use the decimal prefixes, as is customary for international units, to specify the storage capacity of their products. This gives rise to the problem, for example, that a blank DVD labeled "4.7 GB" is displayed with the nominally different value of "4.38 GB" by software that, contrary to the above-mentioned standard, uses the decimal prefixes to denote powers of two (such as Windows Explorer), although in both cases about 4.7 gigabytes (4.7 billion bytes) are meant. Likewise, a hard disk drive specified as "1 TB" is reported in such cases with the apparently much smaller capacity of about "931 GB" or "0.9 TB", although in all three cases about 1.0 terabyte (1,000,000,000,000 bytes) is meant. On the other hand, a blank CD marked "700 MB" actually holds 700 MiB, or about 734 MB.
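
The figures in this paragraph follow from dividing by powers of 1024 instead of powers of 1000. A minimal Python sketch of the arithmetic, with illustrative variable names:

```python
GIB = 2**30  # what such software labels "GB"
TIB = 2**40  # what such software labels "TB"
MIB = 2**20

dvd_bytes = 4_700_000_000      # labeled "4.7 GB"
hdd_bytes = 1_000_000_000_000  # labeled "1 TB"
cd_bytes = 700 * MIB           # labeled "700 MB", actually 700 MiB

print(round(dvd_bytes / GIB, 2))    # 4.38
print(round(hdd_bytes / GIB))       # 931
print(round(hdd_bytes / TIB, 1))    # 0.9
print(round(cd_bytes / 1_000_000))  # 734
```

In every case the number of bytes is unchanged; only the divisor behind the displayed unit differs.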

From version 10.6 onward, the operating system Mac OS X consistently uses decimal prefixes only in their decimal meaning. KDE follows the IEC standard and lets the user choose between binary and decimal values. For Linux distributions with other desktop environments, such as Ubuntu from version 11.04, there are clear guidelines on how applications should specify amounts of data; both kinds of values are found here, but the binary prefixes predominate.
