IEEE 754 revision

IEEE 754-2008, whose earlier working title was IEEE 754r, is a revision of the IEEE 754 floating-point standard adopted in 1985, a revision that had become necessary. The old standard was very successful and was adopted in numerous processors and programming languages. Discussion of the revision began in 2001; the standard was approved in June 2008 and adopted in August 2008.

The main objectives

The main objectives of the adopted standard were:

Merging IEEE 754 and IEEE 854
Reducing the number of implementation alternatives
Removing ambiguities from IEEE 754
An additional fused operation (fused multiply-add: FMA(a, b, c) = a · b + c)
Half and quadruple precision (formats for 16, 32, 64 and 128 bits)
The decimal formats considered necessary by the financial industry (IEEE 854)
More variable formats and interchange formats
Min and max, including specifications for the special cases ±0 and ±∞
Cosmetic changes (from now on, "denormalized" is called "subnormal")

The standard defines formats and methods for floating-point arithmetic and a minimum quality.
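The fused multiply-add named among the objectives differs from a separate multiplication and addition in that the result is rounded only once. The following Python sketch illustrates the difference for binary double precision. The helper name fma_exact is hypothetical; it emulates the fused operation with exact rational arithmetic instead of calling a hardware instruction, and it assumes finite inputs.

```python
from fractions import Fraction


def fma_exact(a: float, b: float, c: float) -> float:
    """Emulate a fused multiply-add for finite floats.

    The product and sum are formed exactly as rationals, and the
    result is rounded to a double only once, at the very end.
    """
    return float(Fraction(a) * Fraction(b) + Fraction(c))


a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

# Two roundings: the exact product 1 - 2**-54 is first rounded to 1.0.
separate = a * b + c

# One rounding: the exact value -2**-54 survives.
fused = fma_exact(a, b, c)

print(separate, fused)
```

With separate operations the small term is lost when the product is rounded, while the fused form keeps it.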

Formats

The floating-point formats comprise half (16-bit), single (32-bit), double (64-bit) and quadruple (128-bit) precision. The half format is also known as a minifloat. The basic formats are complemented by extended and extendable (new) formats. New interchange formats were also added.
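The effect of the different binary widths can be tried out with Python's struct module, which supports the half, single and double formats through the format characters e, f and d. This is a minimal sketch; the names FORMAT_CODES, round_to_format and storage_bits are made up for illustration, and quadruple precision is left out because struct has no format character for it.

```python
import struct

# struct format characters for three of the binary interchange formats
FORMAT_CODES = {"half": "e", "single": "f", "double": "d"}


def round_to_format(x: float, name: str) -> float:
    """Store x in the named format and read it back as a Python float."""
    code = "<" + FORMAT_CODES[name]
    return struct.unpack(code, struct.pack(code, x))[0]


def storage_bits(name: str) -> int:
    """Return the storage width of the named format in bits."""
    return 8 * struct.calcsize("<" + FORMAT_CODES[name])


for name in FORMAT_CODES:
    print(name, storage_bits(name), round_to_format(0.1, name))
```

The value 0.1 is not exactly representable in any binary format, so the printed values differ from format to format.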

Densely packed decimal formats (3 digits in 10 bits) are provided. The decimal formats are demanded mainly by the financial industry. Two opposing points of view collide here. One side emphasizes the memory, computation-time and cost advantages of a binary format, as well as its more uniform distribution of numbers. The other side argues that exact results (in most cases meaning results that agree with hand calculation) are possible only with decimal arithmetic, and that in times of ever faster processors and cheaper memory the disadvantages no longer carry weight. Some experts even go so far as to claim that binary arithmetic will hardly play a role in the future. A polemical quotation on the subject comes from the floating-point veteran William Kahan:

The decimal format as conceived (DFP) differs from "classical BCD formats" as follows:

The capacity of the bits is well exploited, because three decimal digits (1000 values) are stored in 10 bits (1000 of the 1024 possible values are used). In contrast to BCD numbers (16.95%), the waste is negligible. The last column of the table contains the information content in bits, which is only slightly smaller than the storage space (for d = 7 significand digits and an exponent range Emin to Emax, taking the sign bit into account: 1 + d · ln 10 / ln 2 + ln(Emax − Emin) / ln 2).
Processing decimal digits in groups of three matches the usual habit of grouping (23'223'456; 24 W, 24 kW, 24 MW).
The number 0 has the bit pattern "0000...0".
The numbers 0 to 9 have 0 in the 6 leading bits.
The numbers 10 to 99 have 0 in the 3 leading bits.
Odd numbers can be detected using a single bit.
The 24 unused bit patterns ddx11x111x, with dd = 01, 10 or 11, can be identified easily.
Numbers packed in such declets (densely packed) can no longer be sorted as binary values, in contrast to "classical BCD formats".
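The properties listed above can be checked with a small encoder and decoder for a single declet. The sketch below follows the commonly published densely packed decimal coding table (eight cases, selected by which of the three digits are 8 or 9); the function names dpd_encode and dpd_decode are chosen here for illustration and are not taken from the standard.

```python
def _digit_bits(d):
    """Return the three low-order bits of digit d, most significant first."""
    return (d >> 2) & 1, (d >> 1) & 1, d & 1


def dpd_encode(n):
    """Encode an integer 0..999 as a 10-bit declet."""
    if not 0 <= n <= 999:
        raise ValueError("n must be in 0..999")
    d2, d1, d0 = n // 100, (n // 10) % 10, n % 10
    a, b, c = _digit_bits(d2)
    d, e, f = _digit_bits(d1)
    g, h, i = _digit_bits(d0)
    # Key: which digits are "large" (8 or 9), ordered as (d2, d1, d0).
    rows = {
        (False, False, False): (a, b, c, d, e, f, 0, g, h, i),
        (False, False, True): (a, b, c, d, e, f, 1, 0, 0, i),
        (False, True, False): (a, b, c, g, h, f, 1, 0, 1, i),
        (True, False, False): (g, h, c, d, e, f, 1, 1, 0, i),
        (True, True, False): (g, h, c, 0, 0, f, 1, 1, 1, i),
        (True, False, True): (d, e, c, 0, 1, f, 1, 1, 1, i),
        (False, True, True): (a, b, c, 1, 0, f, 1, 1, 1, i),
        (True, True, True): (0, 0, c, 1, 1, f, 1, 1, 1, i),
    }
    value = 0
    for bit in rows[(d2 >= 8, d1 >= 8, d0 >= 8)]:
        value = (value << 1) | bit
    return value


def dpd_decode(declet):
    """Decode a 10-bit declet to an integer 0..999."""
    b9, b8, b7, b6, b5, b4, b3, b2, b1, b0 = (
        (declet >> k) & 1 for k in range(9, -1, -1)
    )
    hi, mid, lo = 4 * b9 + 2 * b8, 4 * b6 + 2 * b5, 4 * b2 + 2 * b1
    if b3 == 0:
        d2, d1, d0 = hi + b7, mid + b4, lo + b0
    elif (b2, b1) == (0, 0):
        d2, d1, d0 = hi + b7, mid + b4, 8 + b0
    elif (b2, b1) == (0, 1):
        d2, d1, d0 = hi + b7, 8 + b4, mid + b0
    elif (b2, b1) == (1, 0):
        d2, d1, d0 = 8 + b7, mid + b4, hi + b0
    elif (b6, b5) == (0, 0):
        d2, d1, d0 = 8 + b7, 8 + b4, hi + b0
    elif (b6, b5) == (0, 1):
        d2, d1, d0 = 8 + b7, hi + b4, 8 + b0
    elif (b6, b5) == (1, 0):
        d2, d1, d0 = hi + b7, 8 + b4, 8 + b0
    else:
        # The two leading bits are ignored here, so the 24 unused
        # patterns decode to the same values as their canonical forms.
        d2, d1, d0 = 8 + b7, 8 + b4, 8 + b0
    return 100 * d2 + 10 * d1 + d0


used = {dpd_encode(n) for n in range(1000)}
unused = set(range(1024)) - used
print(len(unused), format(dpd_encode(80), "010b"), format(dpd_encode(10), "010b"))
```

For example, 80 is encoded as a smaller bit pattern than 10, which is why declets cannot be compared as plain binary numbers.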

Signaling NaNs were proposed for deletion (3 February 2003) but were later reinstated in the proposal (21 February 2003).

(NZ: normalized number)

(20 in column 3 means that 6 decimal digits are stored in these bits (3 digits in 10 bits) and that the normalization digit is stored in extra bits.)

Rounding

One rounding mode is added to the four of the old IEEE 754, so that the following rounding modes are required:

Rounding up (toward +infinity)
Rounding down (toward −infinity)
Rounding toward zero (reducing the magnitude)
Rounding to the nearest value, with ties going to the nearest even number (round to nearest, ties to even)
Rounding to the nearest value, with ties going away from zero (round to nearest, ties away from zero; new in IEEE 754r, essentially just the classical hand-calculation rounding)
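The five rounding modes listed above have direct counterparts in Python's decimal module, which makes them easy to compare on a tie case. The following sketch rounds a value to an integer under each mode; the dictionary MODES and the function round_all are names chosen for this illustration.

```python
from decimal import (
    Decimal,
    ROUND_CEILING,
    ROUND_DOWN,
    ROUND_FLOOR,
    ROUND_HALF_EVEN,
    ROUND_HALF_UP,
)

# decimal module constants matching the five modes of the list
MODES = {
    "toward +infinity": ROUND_CEILING,
    "toward -infinity": ROUND_FLOOR,
    "toward zero": ROUND_DOWN,
    "nearest, ties to even": ROUND_HALF_EVEN,
    "nearest, ties away from zero": ROUND_HALF_UP,
}


def round_all(text):
    """Round the decimal string to an integer under every mode."""
    x = Decimal(text)
    return {
        name: x.quantize(Decimal("1"), rounding=mode)
        for name, mode in MODES.items()
    }


for value in ("2.5", "-2.5"):
    print(value, round_all(value))
```

For 2.5 the modes give 3, 2, 2, 2 and 3 in the order of the list; for −2.5 they give −2, −3, −2, −2 and −3.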

The IEEE 754 rounding (ties to even) was already proposed by Carl Friedrich Gauss; in long calculations it avoids a statistical bias toward larger numbers.

In the discussion of the new standard this insight was apparently discarded again, and the "hand-calculation rounding" (ties away from zero) was reintroduced.
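The statistical bias can be made visible with a deliberately extreme experiment in which every value is an exact tie. The sketch below sums the rounding errors for the values 0.5, 1.5, ..., 999.5 under both nearest modes; the function name accumulated_error and the choice of test values are assumptions made for this illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP


def accumulated_error(rounding, count=1000):
    """Sum (rounded - exact) over the ties 0.5, 1.5, ..., count - 0.5."""
    total = Decimal(0)
    for k in range(count):
        x = Decimal(k) + Decimal("0.5")
        total += x.quantize(Decimal("1"), rounding=rounding) - x
    return total


print(accumulated_error(ROUND_HALF_EVEN))  # errors of +0.5 and -0.5 cancel
print(accumulated_error(ROUND_HALF_UP))    # every error is +0.5
```

With ties to even the errors cancel in pairs, whereas ties away from zero accumulate an error of 500 over the 1000 positive values.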

Exceptions

Exceptions and exception handling can be specified.

New features are predicate functions (such as greater than or equal) and operators for maximum and minimum. The discussion here centres mainly on the results for the special values (Inf, NaN).

Decimal arithmetic

A decimal number can have several different bit-pattern representations (the representation is not unique). The set of representations of a number is called the cohort of that number. The representations (s, q, c) and (s, q+1, c/10) belong to the same cohort if c is divisible by 10. (Strictly speaking, in the old IEEE 754 standard the set {0, −0} is the cohort of the number 0. IEEE 754r, however, says explicitly that 0 and −0 belong to different cohorts.)
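The cohort of a nonzero number can be enumerated by repeatedly applying the rule (s, q, c) → (s, q+1, c/10) and its inverse. The following sketch does this for a format with p significand digits; the function names cohort and value are hypothetical, and the limits of the exponent range are ignored for simplicity.

```python
from fractions import Fraction


def cohort(s, q, c, p=7):
    """List all representations (s, q, c) of the same nonzero value.

    Assumes a coefficient of at most p digits and ignores the
    limits of the exponent range.
    """
    if not 0 < c < 10**p:
        raise ValueError("coefficient must be nonzero and fit in p digits")
    # Strip trailing zeros to reach the member with the largest exponent.
    while c % 10 == 0:
        c //= 10
        q += 1
    members = []
    # Append zeros again for as long as the coefficient fits in p digits.
    while c < 10**p:
        members.append((s, q, c))
        c *= 10
        q -= 1
    return members


def value(s, q, c):
    """Exact value (-1)^s * c * 10^q of a representation."""
    return (-1) ** s * c * Fraction(10) ** q


for member in cohort(0, -2, 120):
    print(member, value(*member))
```

For the value 1.2 with p = 7 this yields six representations, from (0, −1, 12) to (0, −6, 1200000).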

A number is represented in four bit fields S, G, F and J. S is 1 bit wide and contains the sign of the number. In all decimal formats, G holds in 5 bits two exponent bits and the leading significand digit. The rest of the exponent is in the w bits of field F. The remaining significand digits follow in field J, which consists of j declets, that is, 10j bits and 3j digits.

S | G0 G1 G2 G3 G4 | F2 F3 ... F[w+1] | J1 J2 ... Jj

G = 11110: r = v = (−1)^s · ∞

G < 11110: r = (s, E, b, c); v = (−1)^s · 10^(E − b) · c, with c = d0 d1 ... d[p−1]

G = 110xx | G = 1110x: d0 = 8 + G4 in {8, 9}; E = G2G3 in {0, 1, 2}

G = 0xxxx | G = 10xxx: d0 = G2G3G4 in {0, 1, 2, 3, 4, 5, 6, 7}; E = G0G1 in {0, 1, 2}
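The case distinction for the field G can be written as a short decoding function. In the sketch below G0 is taken to be the most significant of the five bits, and the pattern 11111 is reported as NaN; that case is not spelled out in the text above but follows the published encoding. The function name decode_combination is chosen for illustration.

```python
def decode_combination(g):
    """Decode the 5-bit field G (G0 is the most significant bit).

    Returns "infinity", "NaN", or a pair
    (two leading exponent bits, leading digit d0).
    """
    if not 0 <= g < 32:
        raise ValueError("G must be a 5-bit value")
    if g == 0b11110:
        return "infinity"
    if g == 0b11111:
        return "NaN"
    if g >> 3 == 0b11:
        # G = 110xx or 1110x: d0 = 8 + G4, exponent bits = G2G3
        return (g >> 1) & 0b11, 8 + (g & 1)
    # G = 0xxxx or 10xxx: d0 = G2G3G4, exponent bits = G0G1
    return g >> 3, g & 0b111


for g in (0b00000, 0b01111, 0b10111, 0b11000, 0b11101, 0b11110):
    print(format(g, "05b"), decode_combination(g))
```

The pair returned for finite numbers still has to be combined with field F and the declets in field J to obtain the full exponent and significand.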

Exponent: E consists of the w bits in field F plus two bits from field G. The two bits from field G can take only the three values {00, 01, 10}, so only 3/4 of the nominal exponent values are available. (Example: for d32, w = 6 and w + 2 = 8, which would nominally mean 256 different exponent values.) Since the exponent must never begin with 11, there are only 3/4 · 256 = 192 values. Taking the bias of 101 into account, the exponent range is 0 − 101 .. 191 − 101 = −101 .. 90. Since d32 has seven decimal digits in the significand, and the decimal point is assumed to follow the first digit (d0.d1...d6), 6 can be added to the exponent. This gives the values −95 .. 96 listed in the table above.
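The arithmetic of this paragraph can be reproduced for any decimal format once w, the number of significand digits p and the bias are known. The sketch below uses a hypothetical helper exponent_range; the decimal32 parameters are those given in the text, while the decimal64 values (w = 8, p = 16, bias 398) are added here only for comparison.

```python
def exponent_range(w, p, bias):
    """Return (number of exponent values, integer-coefficient exponent
    range, exponent range for the form d0.d1...d[p-1])."""
    count = 3 * 2**w              # two leading bits take only 00, 01, 10
    q_min = 0 - bias
    q_max = (count - 1) - bias
    # Moving the decimal point behind the first digit adds p - 1.
    return count, (q_min, q_max), (q_min + p - 1, q_max + p - 1)


print(exponent_range(w=6, p=7, bias=101))    # decimal32, as in the text
print(exponent_range(w=8, p=16, bias=398))   # decimal64, for comparison
```

For decimal32 this gives 192 exponent values, the range −101 .. 90 for an integer coefficient and −95 .. 96 for the form d0.d1...d6.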

J consists of the remaining 10j bits, or 3j decimal digits. Each group of three digits, with a value between 0 and 999, is Cowlishaw-encoded in 10 bits (bit patterns 0 ... 1023).

Alternatively, decimal numbers can be encoded in binary. As in the decimal encoding, 2 leading exponent bits and 4 leading significand bits are extracted from the 5-bit G field. After concatenation with the remaining significand bits from the J field, the entire significand is interpreted as a binary number. If such a significand is ≥ 10^p, it is regarded as a non-canonical representation of zero.
