# Floating-point unit

A floating-point unit (FPU), also called a numeric processing unit (NPU), is a special processor that performs operations on floating-point numbers. Because FPUs, especially in older microcode-based systems, implemented in hardware not only relatively simple operations such as addition, subtraction, multiplication, division and square roots but also transcendental functions such as the exponential and various trigonometric functions, the unit is colloquially also called a mathematical (co)processor.

## General

An FPU can be an external chip in its own package (e.g. the Intel 80287) or be integrated as a dedicated area within the CPU (e.g. the Intel Pentium).

Early CISC processors usually lacked registers and instructions for handling floating-point numbers. Such calculations and mathematical functions were handled by software library routines running on the main processor, which was optimized for integer processing. To relieve the CPU of these computation-intensive tasks, there were initially arithmetic processors such as the AMD Am9511, which were addressed as peripheral devices. Later CISC CPUs such as the Intel x86 processors (up to the 486) and the Motorola 68k CPUs offered the option of retrofitting an additional coprocessor on the motherboard.
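To illustrate what such a software routine has to do, the following minimal sketch multiplies two single-precision values using only integer operations. It is not taken from any historical library: the function names are hypothetical, the IEEE 754 single-precision layout is assumed for convenience (many early libraries used other formats), and zeros, subnormals, infinities, NaNs and exponent overflow or underflow are deliberately ignored.

```python
import struct

def float_to_bits(x: float) -> int:
    """Return the IEEE 754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(bits: int) -> float:
    """Interpret a 32-bit integer as an IEEE 754 single-precision value."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]

def soft_mul(a_bits: int, b_bits: int) -> int:
    """Multiply two normal single-precision numbers with integer operations only.

    Assumption: both inputs and the result are normal numbers.
    """
    sign = (a_bits ^ b_bits) & 0x80000000
    exp_a = (a_bits >> 23) & 0xFF
    exp_b = (b_bits >> 23) & 0xFF
    # Restore the implicit leading 1 of each 24-bit significand.
    man_a = (a_bits & 0x7FFFFF) | 0x800000
    man_b = (b_bits & 0x7FFFFF) | 0x800000

    product = man_a * man_b            # up to 48 bits wide
    exponent = exp_a + exp_b - 127     # remove one copy of the bias

    # The product of two values in [1, 2) lies in [1, 4): normalize to [1, 2).
    if product & (1 << 47):
        shift = 24
        exponent += 1
    else:
        shift = 23

    mantissa = product >> shift
    remainder = product & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    # Round to nearest, ties to even.
    if remainder > half or (remainder == half and (mantissa & 1)):
        mantissa += 1
        if mantissa == 1 << 24:        # rounding carried into a new bit
            mantissa >>= 1
            exponent += 1

    return sign | (exponent << 23) | (mantissa & 0x7FFFFF)

if __name__ == "__main__":
    result = soft_mul(float_to_bits(1.5), float_to_bits(2.5))
    print(bits_to_float(result))
```

Even this stripped-down version needs field extraction, a wide integer multiply, normalization and rounding for a single multiplication, which suggests why offloading such work to dedicated hardware was attractive.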

Several factors ushered in the era of integrated FPUs:

- CPU-internal caches are incompatible with external FPUs. Intel's last stand-alone floating-point unit, the external i487 coprocessor, was therefore actually a complete, modified 80486DX CPU, which, unlike the 80486SX, had a built-in floating-point unit. The modification consisted of an additional control pin and an artificial lock that prevented the chip from operating on its own. When the coprocessor was plugged in, the SX CPU was disabled.
- Mathematical functions were increasingly used in "normal" applications, e.g. in font rendering.
- Logic gates became ever cheaper, while sockets and connectors remained comparatively expensive.

## Operation and design

The presence of an FPU provides a significant performance boost for floating-point-intensive calculations. Coprocessors also offered wider registers: even alongside 16-bit and 32-bit CPUs, the FPU often had registers 64, 80 or 128 bits wide. This allowed simple calculations to be performed with higher accuracy and covered a larger range of values.

Because an FPU is ultimately still a digital arithmetic unit internally, further, more sophisticated techniques are needed to achieve a real speed-up. Many models (e.g. the 8087) use hardware-optimized computational methods such as the CORDIC algorithm for trigonometric functions, which needs only additions and shifts and no long multiplication. A large speed-up is often also achieved through hard-wired lookup tables: values are not computed by repeated loop iterations, but are first approximated with the help of tables and then refined by interpolation to sufficient accuracy (an error in such a table was the cause of the Pentium FDIV bug). Furthermore, FPUs often organize their registers as a matrix and thereby accelerate vector calculations.
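The shift-and-add idea behind CORDIC can be sketched in a few lines. The following is a minimal fixed-point illustration in rotation mode, not the microcode of the 8087 or any other real FPU. The names `FRAC_BITS`, `ITERATIONS` and `cordic_sin_cos` are hypothetical, and the arctangent table and gain constant are precomputed here with the `math` module, whereas hardware would hold them as constants in ROM. The sketch assumes an input angle of roughly -1.74 to 1.74 radians, the convergence range of the basic iteration.

```python
import math

FRAC_BITS = 30                 # fixed-point format: 30 fractional bits (assumption)
ONE = 1 << FRAC_BITS
ITERATIONS = 30

# Table of atan(2^-i) in fixed point; a hardware unit would store this in ROM.
ANGLES = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# Each micro-rotation stretches the vector; starting from the inverse of the
# accumulated gain compensates for that in advance.
_gain = 1.0
for _i in range(ITERATIONS):
    _gain *= math.sqrt(1.0 + 2.0 ** (-2 * _i))
K_FIXED = round(ONE / _gain)

def cordic_sin_cos(theta: float) -> tuple[float, float]:
    """Approximate (sin(theta), cos(theta)) using only adds, subtracts and shifts."""
    x, y, z = K_FIXED, 0, round(theta * ONE)
    for i in range(ITERATIONS):
        if z >= 0:
            # Rotate counter-clockwise by atan(2^-i).
            x, y, z = x - (y >> i), y + (x >> i), z - ANGLES[i]
        else:
            # Rotate clockwise by atan(2^-i).
            x, y, z = x + (y >> i), y - (x >> i), z + ANGLES[i]
    return y / ONE, x / ONE    # convert back to floating point for display

if __name__ == "__main__":
    s, c = cordic_sin_cos(math.pi / 6)
    print(s, c, math.sin(math.pi / 6), math.cos(math.pi / 6))
```

Inside the loop, the only operations are integer addition, subtraction, comparison and right shifts; the divisions at the end merely convert the fixed-point result for printing. Angles outside the convergence range would first need to be reduced, which this sketch leaves out.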

Most FPUs provide operations for basic arithmetic (with higher accuracy than the CPU), for logarithms, roots and powers, and for trigonometric functions, as well as functions for matrix calculations.

The processing power of an FPU is usually measured with SPECfp, in contrast to SPECint, which is used for the integer performance of a CPU.