InfiniBand is a specification for a high-speed serial transmission technology. It resulted from the merger of two competing designs: Future I/O, from Compaq, IBM and Hewlett-Packard, and Next Generation I/O (NGIO), developed by Intel, Microsoft and Sun Microsystems. Shortly before the new name was chosen, InfiniBand was known as System I/O. Intel has since concentrated on developing PCI Express as a quasi-alternative, so the future development of InfiniBand is uncertain.
InfiniBand uses a bidirectional serial bus for low-cost, low-latency data transfer (less than 2 microseconds). Each channel has a theoretical data rate of up to 2.5 Gbit/s in each direction, or 5 Gbit/s in the DDR variant. Multiple channels can be bundled transparently within a single cable. Four channels (4×) are common, giving 10 or 20 Gbit/s. For links between switches there are also twelve-channel connections (12×) with 30 or 60 Gbit/s.
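The aggregate rates quoted above are the per-channel rate multiplied by the link width. The following Python sketch reproduces that arithmetic. The names `LANE_RATE_GBITS` and `link_rate` are illustrative and not part of any InfiniBand API, and "SDR" is used here only as a label for the base 2.5 Gbit/s rate.

```python
# Per-channel theoretical rates in Gbit/s per direction, as given in the text.
# "SDR" is only a label chosen here for the base rate.
LANE_RATE_GBITS = {"SDR": 2.5, "DDR": 5.0}


def link_rate(width: int, variant: str = "SDR") -> float:
    """Return the theoretical rate in Gbit/s per direction for a bundled link."""
    if width not in (1, 4, 12):
        raise ValueError("the text describes 1x, 4x and 12x links only")
    return LANE_RATE_GBITS[variant] * width


# Print the rate for every width/variant combination mentioned in the text.
for width in (1, 4, 12):
    for variant in ("SDR", "DDR"):
        print(f"{width}x {variant}: {link_rate(width, variant):g} Gbit/s")
```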
InfiniBand is normally carried over copper cables of the kind also used for 10 Gigabit Ethernet, which allows transmission distances of up to 15 meters. If longer distances must be bridged, fiber-optic media converters can be used; these map the InfiniBand channels onto individual fiber pairs. Optical ribbon cables with MPO connectors are used for this purpose.
The applications of InfiniBand range from bus systems to network connections. Like HyperTransport, however, it had difficulty establishing itself as a bus system, and it is therefore currently used mostly as a cluster interconnect technology. An exception is IBM's System z mainframes from the z10 onward, which have, for example, 24 InfiniBand host bus channels at 6 GB/s each. The great advantage of InfiniBand over conventional technologies such as TCP/IP over Ethernet lies in minimizing latency by moving the protocol stack into the network hardware.
The compute nodes are connected by InfiniBand cables and special switches; the network cards used are so-called HCAs (host channel adapters). Various connection modes are available, including RDMA Write, RDMA Read and simple send/receive operations.
To avoid time-consuming switches between operating system and user context, as occur for example with sockets, the memory areas intended for use are first registered with the card. This allows the card to translate virtual addresses to physical addresses itself. When data is sent, various control registers of the HCA are mapped into the memory of the process (doorbell mechanism), so the send operation is triggered without going through the operating system kernel; the HCA fetches the data from main memory by driving the DMA controller. Sending the data now present on the HCA, optionally reliably or unreliably, is handled by the card's protocol stack. For this purpose the card maintains a translation table, which is accessed using the indices returned to the user when a memory area is registered.
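The registration step and the index-based translation table can be pictured with a small toy model. The Python sketch below is only an illustration under simplifying assumptions: `ToyHCA`, `Region`, `register` and `translate` are hypothetical names, each region is treated as physically contiguous, and nothing here corresponds to a real driver or library interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    virt_base: int  # start of the registered virtual address range
    length: int     # size of the region in bytes
    phys_base: int  # assumed contiguous physical base (a simplification)


class ToyHCA:
    """Toy model of the card-side translation table (hypothetical, not a real API)."""

    def __init__(self) -> None:
        self._table: list[Region] = []

    def register(self, virt_base: int, length: int, phys_base: int) -> int:
        """Register a memory area and return the index handed back to the user."""
        self._table.append(Region(virt_base, length, phys_base))
        return len(self._table) - 1

    def translate(self, key: int, virt_addr: int, size: int) -> int:
        """Translate a virtual address to a physical one using the table index."""
        region = self._table[key]
        offset = virt_addr - region.virt_base
        if offset < 0 or offset + size > region.length:
            raise ValueError("access outside the registered region")
        return region.phys_base + offset


hca = ToyHCA()
key = hca.register(virt_base=0x7F00_0000, length=4096, phys_base=0x1_0000)
# The card resolves the address itself, without a kernel call per transfer.
print(hex(hca.translate(key, 0x7F00_0010, 64)))
```

The point of the model is that the expensive step (registration) happens once, after which each transfer needs only a table lookup by index.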
To further minimize latency, InfiniBand provides two connection modes that write data into the main memory of another node, or read it from there, without involving the operating system or the process on the remote side. These two operations are called RDMA Write and RDMA Read (remote DMA). In addition, InfiniBand provides two modes for implementing locking mechanisms: Atomic Compare & Swap and Atomic Fetch & Add. These can be used, for example, to implement semaphores; they are used in distributed database applications, among others.
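How a semaphore can be built from these two atomic operations is sketched below in Python. This is a local simulation, not InfiniBand code: `RemoteWord` is a hypothetical stand-in for a word in another node's memory, and a `threading.Lock` models the atomicity that the hardware would provide.

```python
import threading


class RemoteWord:
    """Stand-in for a word in a remote node's memory (hypothetical)."""

    def __init__(self, value: int = 0) -> None:
        self._value = value
        self._lock = threading.Lock()  # models the atomicity of the operation

    def compare_and_swap(self, expected: int, new: int) -> int:
        """Write `new` only if the word equals `expected`; return the old value."""
        with self._lock:
            old = self._value
            if old == expected:
                self._value = new
            return old

    def fetch_and_add(self, delta: int) -> int:
        """Add `delta` to the word and return the value it held before."""
        with self._lock:
            old = self._value
            self._value = old + delta
            return old


def try_acquire(sem: RemoteWord) -> bool:
    """Take one permit from a counting semaphore if any are left."""
    current = sem.fetch_and_add(0)  # adding zero just reads the count
    while current > 0:
        seen = sem.compare_and_swap(current, current - 1)
        if seen == current:
            return True   # our decrement was applied
        current = seen    # another party changed the count; retry
    return False


def release(sem: RemoteWord) -> None:
    """Return one permit to the semaphore."""
    sem.fetch_and_add(1)


sem = RemoteWord(2)                              # two permits available
results = [try_acquire(sem) for _ in range(3)]   # third attempt finds none left
release(sem)
print(results, try_acquire(sem))
```

Compare & Swap is used for the decrement so that the count never drops below zero, while Fetch & Add suffices for the release because incrementing is always safe.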