Blue Gene

Blue Gene is a project to design and build high-end computer technology. According to IBM, its goal is both to explore the limits of supercomputing: in computer architecture, in the software necessary for programming and controlling massively parallel systems, and in the use of computing power to attain a better understanding of biological processes such as protein folding. The latter goal also gave the project its name.

In December 1999, IBM announced a five-year program to build a massively parallel computer to help in the study of biomolecular phenomena such as protein folding. The target was to reach speeds in the petaFLOPS range.

It is a cooperative project between the U.S. Department of Energy (which also partly finances the project), industry (notably IBM), and academia. Five Blue Gene projects are in development, including Blue Gene/L, Blue Gene/P, and Blue Gene/Q (see Advanced Simulation and Computing Program).

Blue Gene/L was planned as the first architecture. The requirements called for a system with a peak performance of 360 TFLOPS on 65,536 nodes, to be completed in 2004/2005. The follow-up machines were to reach up to 1,000 TFLOPS (Blue Gene/P, 2006/2007) and 3,000 TFLOPS (Blue Gene/Q, 2007/2008), with sustained performance of 300 TFLOPS and 1,000 TFLOPS respectively.

The chief architects of the project at IBM included Monty Denneau and Alan Gara.

Blue Gene/L

Blue Gene/L is a family of highly scalable supercomputers. The project is funded by IBM in collaboration with the Lawrence Livermore National Laboratory.

The architecture consists of a basic module (a node, or computer chip) that can be replicated over and over without creating bottlenecks. Each BG/L node consists of an ASIC with associated DDR SDRAM memory. Each ASIC in turn contains two 0.7 GHz PowerPC 440 embedded processor cores, two "Double Hummer" FPUs, a cache subsystem, and a communications subsystem.
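From these figures, the design target can be sketched with a back-of-the-envelope calculation, assuming the publicly reported 4 FLOP per cycle per core (each "Double Hummer" FPU has two pipelines, each capable of a fused multiply-add; this figure is not stated above):

  2 cores × 0.7 GHz × 4 FLOP/cycle = 5.6 GFLOPS per node
  5.6 GFLOPS × 65,536 nodes ≈ 367 TFLOPS

which is consistent with the 360 TFLOPS requirement mentioned above.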

The twofold TFLOPS ratings (2.8 and 5.6 TFLOPS) found in various diagrams on the web stem from the fact that an ASIC with its two processors can be operated in two modes: either both processors are used for computation, or one handles computation while the other serves as a coprocessor for communication tasks. Communication between processors runs over a high-speed network with a 3D torus topology, as well as a hierarchical network for collective operations.

The torus network is accessed through memory-mapped network adapters in order to achieve very low latencies, similar to InfiniBand. For communication, a modified MPICH2 implementation was developed. The compute nodes run a specially programmed, very small POSIX kernel that does not support multitasking; the running program is therefore the only process on the system.
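As an illustration of how an application might address such a 3D torus through MPI, here is a minimal, generic MPICH-style sketch (not actual BG/L code; the 4×4×4 grid size and all identifiers are arbitrary choices for the example):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    /* Arrange the ranks in a 3D grid with wrap-around links in every
     * dimension, i.e. a 3D torus like the BG/L compute network.
     * The 4x4x4 size is an arbitrary choice for this sketch. */
    int dims[3]    = {4, 4, 4};
    int periods[3] = {1, 1, 1};   /* periodic => torus, not a mesh */
    MPI_Comm torus;
    MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 1, &torus);

    if (torus != MPI_COMM_NULL) { /* surplus ranks get no grid cell */
        int rank, coords[3], left, right;
        MPI_Comm_rank(torus, &rank);
        MPI_Cart_coords(torus, rank, 3, coords);

        /* Nearest neighbors along the x axis; on a torus every rank
         * has a neighbor in both directions, with no edge cases. */
        MPI_Cart_shift(torus, 0, 1, &left, &right);
        printf("rank %d at (%d,%d,%d): x-neighbors %d and %d\n",
               rank, coords[0], coords[1], coords[2], left, right);
        MPI_Comm_free(&torus);
    }

    MPI_Finalize();
    return 0;
}

Run with at least as many ranks as the grid has cells (e.g. mpiexec -n 64 ./torus); MPI_Cart_create is erroneous if the requested grid is larger than the number of available ranks.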

In the November 2004 edition of the TOP500 list, the still-under-construction BlueGene/L at the Lawrence Livermore National Laboratory took the top spot with 16 racks (16,384 nodes, corresponding to 32,768 processors). Since then it has gradually grown, reaching over 280 TFLOPS with 65,536 nodes on 27 October 2005, which earned it the lead in the TOP500 of 11/2005. Although this stage was originally declared final, the system was expanded once more in 2007 and has since delivered over 478 TFLOPS with 212,992 processors in 104 racks. In mid-2008 it was the fourth-fastest system in the world, but now ranks only 14th.

The architecture is also suited to other installations, such as Blue Gene Watson (BGW) at IBM's Thomas J. Watson Research Center (ranked 98th in the TOP500 of 6/2011), JUGENE (the Jülich Blue Gene at the Jülich Research Centre, ranked 12th in June 2011, still ahead of its predecessor), and six other entries in the top 100. These all fall under the designation eServer Blue Gene Solution.

Blue Gene/P

The Blue Gene/P series was first presented at ISC in Dresden in June 2007. Changes compared to BG/L include the use of PowerPC 450 cores clocked at 850 MHz, four of which are now contained in each node. Each compute card now carries only one such node instead of two, but a node card, the next larger unit, contains 32 instead of 16 compute cards.

Otherwise the units have remained the same, so a rack contains twice as many processors as in BG/L. With a per-processor performance increase of around 21%, parallel to the clock rate increase (at least in Linpack), each rack now delivers 14 instead of 5.6 TFLOPS (Rpeak in each case). Memory bandwidth grew at the same rate, the bandwidth of the torus network was more than doubled from 2.1 GB/s to 5.1 GB/s, and its latency was halved. According to the manufacturer, energy consumption rose by only 35%. For a 72-rack system intended to reach the petaFLOPS mark, this amounts to about 2.3 megawatts.
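These figures can be cross-checked with a little arithmetic (assuming, as on BG/L, 4 FLOP per cycle per core and 1,024 nodes per rack; both figures are assumptions not stated in this section):

  850 MHz × 4 FLOP/cycle × 4 cores = 13.6 GFLOPS per node
  13.6 GFLOPS × 1,024 nodes ≈ 13.9 TFLOPS per rack, i.e. roughly 14 TFLOPS
  13.9 TFLOPS × 72 racks ≈ 1,003 TFLOPS, just past the petaFLOPS mark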

One of the first deliveries went to the Jülich Research Centre, where it is operated under the name JUGENE; with 180 TFLOPS it ranked 11th in the TOP500 list of late 2008. In November 2008, seven Blue Gene/P systems were represented among the world's 100 fastest systems.

On 26 May 2009, an improved version of JUGENE (Jülich Blue Gene) was inaugurated, in which the number of processors was increased from 65,536 to 294,912, achieving a peak performance of 1 petaFLOPS. This computer is therefore currently (2012) one of the fastest in Europe and ranked 13th among the world's fastest supercomputers in the TOP500 list of November 2011.

Blue Gene/Q

The newest supercomputer design of the series, Blue Gene/Q, aimed to achieve 20 petaFLOPS in the timeframe up to 2011. It is designed as a further improvement and extension of the Blue Gene/L and /P architectures, with a higher clock speed and substantially improved energy efficiency. Blue Gene/Q has a comparable number of nodes, but 16 instead of four cores per node (the newly developed POWER A2 CPU).
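A rough plausibility check of the 20 petaFLOPS target (the 1.6 GHz clock, the 8 FLOP per cycle per core of the A2's four-wide FMA unit, and the 98,304 nodes of the Sequoia configuration are publicly reported figures, not stated in this article):

  1.6 GHz × 8 FLOP/cycle × 16 cores ≈ 204.8 GFLOPS per node
  204.8 GFLOPS × 98,304 nodes ≈ 20.1 PFLOPS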

The reference installation of a Blue Gene/Q system, called IBM Sequoia, took place at the Lawrence Livermore National Laboratory in 2011 as part of the "Advanced Simulation and Computing Program"; it serves nuclear simulations and advanced scientific research.

A Blue Gene/Q system called Mira was installed at Argonne National Laboratory in early 2012. It consists of approximately 50,000 compute nodes (16 cores per node) with 70 PB of disk space (470 GB/s I/O bandwidth) and is water-cooled.

Also in 2012, FERMI went into operation at the CINECA data center in Bologna, an installation with 10,240 PowerA2 sockets of 16 cores each.
