Pentium FDIV bug

The FDIV bug is a hardware flaw in Intel's Pentium processor. The error became known in November 1994, one and a half years after the processor's launch, and causes floating-point divisions with certain operand values to return wrong results. No other error in a CPU design had ever caused so much turmoil and excitement among users and experts.

The term FDIV bug derives from the mnemonic of the x86 floating-point division instruction, FDIV. However, the flaw does not affect only the FDIV instruction, as one might suspect. Rather, all instructions that use the defective division unit are affected. Specifically, these are: FDIV, FDIVP, FDIVRP, FIDIV, FIDIVR, FPREM, FPREM1, FPTAN and FPATAN.

Discovery of the error

Prof. Thomas Ray Nicely of Lynchburg College is credited as the discoverer, and it is thanks to him that the error became known to the public at large. He discovered it while attempting an exact calculation of Brun's constant. After investigating the error for several months, he informed selected textbook authors and journalists on October 30, 1994, and thus started the ball rolling. Prof. Nicely summarized the circumstances and the timing of the discovery in a FAQ.

According to Intel, the company had discovered the error itself before it became publicly known. Some sources give June 1994 as the time of discovery, others as late as August 1994. In his FAQ, Prof. Nicely assumes that Intel discovered the error in May while working on the floating-point unit of the Pentium successor P6, the later Pentium Pro, which was then still in development. What can be said with certainty is that Intel had known about the error for some time. Otherwise it would be impossible to explain how Intel was able to deliver a corrected version in sample quantities as early as October.

Frequency and impact of the error

The impact of the error on the average user was controversial in the weeks after the discovery. In the end it depended on whether, and to what extent, the software in use relied on the affected instructions. It should not be forgotten that in 1994 standard software could by no means assume the presence of a floating-point unit. Many users at that time were still working with CPUs that had no such unit, which is why many common applications did not use the affected instructions.

What is certain is that, with random operands, the error occurs on average about once every nine billion FDIV operations. What this means in practice, however, was not so easy to determine.

Intel initially claimed that, statistically, a normal user would encounter the error only once every 27,000 years and that it was relevant only for prime number generation or other demanding calculations. The trade press in particular countered with other estimates. In its January 1995 issue, the German magazine c't determined an average frequency of one error every 60 hours for floating-point-intensive applications, but admitted at the same time "that the number of those actually affected is really not likely to be so great."

IBM also got involved. With considerable publicity, it stopped delivery of computers with Pentium CPUs and calculated that, statistically, the error could occur as often as once every six hours. IBM's reactions and assertions at that time were not without controversy: with its PowerPC CPU and its RS/6000 workstations, IBM was one of Intel's fiercest competitors in the high-end sector. Many therefore saw IBM's approach as more of a strategic marketing maneuver.
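The estimates above differ mainly in how many divisions a workload is assumed to perform. The following is a minimal sketch of that arithmetic, assuming only the figure of one error per nine billion divisions with random operands. The workload of 1,000 divisions per day is a hypothetical value chosen for illustration, and the computed rates are not claimed to be the assumptions Intel, c't or IBM actually used.

```python
# Average number of divisions per error for random operands (figure from the text).
DIVISIONS_PER_ERROR = 9_000_000_000


def years_between_errors(divisions_per_day):
    """Mean time between errors, in years, for a given daily division count."""
    return DIVISIONS_PER_ERROR / divisions_per_day / 365.25


def divisions_per_second_for(hours_between_errors):
    """Division rate that would be needed to see one error per given interval."""
    return DIVISIONS_PER_ERROR / (hours_between_errors * 3600)


if __name__ == "__main__":
    # Hypothetical light workload: 1,000 divisions per day.
    print(f"{years_between_errors(1000):,.0f} years between errors")
    # Rates implied by an error every 60 hours and every 6 hours.
    print(f"{divisions_per_second_for(60):,.0f} divisions/s for one error per 60 h")
    print(f"{divisions_per_second_for(6):,.0f} divisions/s for one error per 6 h")
```

A light workload thus lands in the range of tens of thousands of years, while an error every few hours requires sustained rates of tens or hundreds of thousands of divisions per second. The calculation also assumes random operands; software that repeatedly divides by the same critical values could fare much worse.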

In view of such dramatic portrayals, even Prof. Nicely personally came to Intel's defense. In his FAQ he took the view that a white paper circulated by Intel in the meantime, which contained a statistical analysis of the error, came significantly closer to reality than IBM's amateurish account.

Cause of the error

The CPU uses the SRT division method, which relies on a lookup table. In the faulty CPUs, five entries of this table were wrong: the cells contained the value 0 instead of the value 2. The error occurs so rarely because the cells are read with very unequal probability, and the faulty cells are reached only for certain combinations of numbers.
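The effect of a wrong table entry can be illustrated with a toy model. The sketch below implements a radix-4 digit-recurrence division with the digit set {-2, ..., 2}, the kind of recurrence that SRT division is built on. It is not the Pentium's actual circuit: the digit is chosen with exact arithmetic rather than read from a real lookup table, and `faulty_cell` is a hypothetical predicate that marks the inputs for which the "table" returns 0 instead of 2.

```python
from fractions import Fraction


def select_digit(r, d):
    """Return the quotient digit in {-2, ..., 2} closest to 4*r/d.

    Real hardware reads this digit from a lookup table indexed by a few
    leading bits of r and d; exact arithmetic stands in for that table here.
    """
    return max(-2, min(2, round(4 * r / d)))


def divide(x, d, steps=16, faulty_cell=None):
    """Approximate x/d for 1 <= x, d < 2 by radix-4 digit recurrence.

    faulty_cell is a hypothetical predicate (r, d) -> bool marking inputs
    for which the simulated table wrongly yields 0 instead of 2.
    """
    x, d = Fraction(x), Fraction(d)
    r = x / 4                      # scaled so that |r| <= (2/3) * d holds
    quotient = Fraction(0)
    for j in range(1, steps + 1):
        q = select_digit(r, d)
        if q == 2 and faulty_cell is not None and faulty_cell(r, d):
            q = 0                  # simulated bad table entry
        r = 4 * r - q * d          # next partial remainder
        quotient += Fraction(q, 4 ** j)
    return 4 * quotient            # undo the initial scaling


if __name__ == "__main__":
    x, d = Fraction(15, 8), Fraction(1)
    good = divide(x, d)
    # Hypothetical fault: the cell hit by the very first partial remainder.
    bad = divide(x, d, faulty_cell=lambda r, d: r == Fraction(15, 32))
    print(float(good), float(bad))
```

Once a digit 2 is replaced by 0, the partial remainder leaves the range from which the remaining digits can recover, so the quotient stays wrong. In this toy the fault strikes at the very first step, which greatly exaggerates the size of the error; a fault hit in a later step disturbs only the lower-order digits.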

Reactions to the error

After Intel had discovered the error, the company fixed it silently and, probably sometime in late summer or early autumn, began to shift production of the various Pentium variants gradually to the corrected versions. Nevertheless, affected CPUs were still being shipped until late 1994, for a long time without the knowledge of users.

Critics therefore accused Intel of first wanting to cover up the error and then to play it down. After the error became known, Intel went so far as to assert that it would never occur for most users. The above-mentioned "statistical occurrence every 27,000 years for normal end users" is said to have been put forward in this context. This assessment triggered indignant reactions from users and the trade press.

Intel first announced that it would replace CPUs only for users who could demonstrate that they were affected by the error. Many users then demanded that Intel replace all affected CPUs. The trade press had nothing good to say about this announcement either. As the pressure grew and the company faced serious damage to its image, Intel relented on December 20 and finally announced a comprehensive exchange program for all affected CPUs.

Finally, Intel also attracted plenty of ridicule for the error. Jokes such as "How many Intel employees does it take to change a light bulb? 1.9999983256" or "You mean 2.000000000 + 2.000000000 does not equal 3.999998456?" circulated widely at that time.

But Intel also drew lessons from the incident. Andy Grove apologized in the press for the annoyance caused by his stance. A dedicated service was set up for the exchange of defective CPUs. In total, Intel set aside $475 million for the incident, which represented more than half of its profit in the fourth quarter of 1994. In the end, about one million faulty processors were exchanged. In 1995 Intel began publishing all errors detected in its own CPUs. Previously, this information had been available only after signing a confidentiality agreement. From then on, anyone could learn about errors in Intel products from the so-called Specification Updates.

In the Specification Updates for the affected Pentium CPUs, the effect of the FDIV bug is described as: "Slight precision loss for floating-point divides on specific operand pairs". Depending on the type of CPU, the Pentium FDIV bug is listed in these Specification Updates as Erratum 20 (P5 Pentium) or Erratum 23 (P54C Pentium).

Affected Pentium versions

The error is found in all Pentium CPUs that were produced until the early fall of 1994. Many units produced later still have it. Because only Pentium CPUs up to and including 100 MHz were manufactured until the beginning of 1995, none of the faster variants are affected by the error.

But that does not mean that all Pentium CPUs with clock speeds of 100 MHz and less have the error. Because Intel exchanged CPUs under the exchange program exclusively for ones with the same clock speed, and also continued to sell bug-fixed versions of these CPUs, a large number of Pentium CPUs with clock frequencies between 60 and 100 MHz that do not have the error came into circulation. The presence of the FDIV bug therefore cannot be inferred from the clock frequency alone.

Until the autumn of 1994, however, the majority of Pentium production consisted of models of the first Pentium type, the P5, which existed exclusively with clock frequencies of 60 and 66 MHz. The more advanced P54C type, which was initially available only with 90 and 100 MHz, was by contrast expensive and comparatively rare. This is one reason why P5 models manufactured up to 1994 make up the majority of all CPUs affected by the FDIV bug.

Because Intel did not introduce the 75 MHz Pentium for Socket 5 (SPGA package) until October 10, 1994, when the bug-fixed B5 stepping of the P54C was already in production, only the 90 and 100 MHz types of the Socket 5 version, which had been sold earlier, are affected. In desktop PCs and servers there should therefore be no affected 75 MHz types. Among the Pentiums for mobile use, however, only the 75 MHz version is affected. This does not, of course, rule out the possibility that some manufacturers installed Socket 5 types with 90 and 100 MHz in notebooks.

The following Pentium versions are affected:

If you have a CPU that has been removed from its system, it is most easily identified by the so-called sSpec, a usually five-character code of letters and digits printed on the CPU package. The affected sSpecs are indicated in the table above.

Identifying a faulty CPU by its inscription is of course not possible while it is installed in a running system. However, there are still ways to determine whether the CPU is affected by the error. If /proc/cpuinfo on a Linux system contains an entry of the form fdiv_bug: yes, the CPU is affected. Likewise, on an affected CPU certain division operations return a wrong result in the Windows calculator.
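Both checks can be scripted. The following is a minimal sketch: the first function parses text in the /proc/cpuinfo format for the fdiv_bug entry described above, and the second evaluates a widely cited test division, 4195835 / 3145727. The operand pair is not taken from this article; it is the commonly quoted example, for which flawed Pentiums are reported to return 256 instead of 0. The function names are chosen for illustration.

```python
from pathlib import Path


def cpuinfo_reports_fdiv_bug(text):
    """Return True/False for an 'fdiv_bug' entry, or None if there is none."""
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "fdiv_bug":
            return value.strip() == "yes"
    return None


def linux_reports_fdiv_bug(path="/proc/cpuinfo"):
    """Check the running Linux system; None if the file or entry is missing."""
    cpuinfo = Path(path)
    if not cpuinfo.is_file():
        return None
    return cpuinfo_reports_fdiv_bug(cpuinfo.read_text())


def division_check(x=4195835.0, y=3145727.0):
    """Return x - (x / y) * y, which is 0.0 when the division is correct."""
    return x - (x / y) * y


if __name__ == "__main__":
    print("fdiv_bug entry:", linux_reports_fdiv_bug())
    print("division check:", division_check())
```

The division test is only meaningful if the division is actually executed by the CPU's floating-point divider; an emulated or software-corrected division would hide the flaw.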
