Nehalem (microarchitecture)

Nehalem is a microarchitecture developed by Intel. It is based partly on the Intel Core microarchitecture, which it superseded by 2010. Processors based on the Nehalem architecture are the first Intel processors with an integrated memory controller. The first version of the Nehalem architecture reached the market in November 2008 as a high-end desktop CPU (Bloomfield), sold as the Core i7 for socket 1366 (X58 chipset).

It was succeeded by the Intel Sandy Bridge microarchitecture in 2011.

The microarchitecture is named after Nehalem, a small coastal town in Oregon.

Technical

A main feature of this architecture is that the front-side bus (FSB), which connected the processor and the chipset in its predecessors, has given way to a point-to-point interconnect called QuickPath Interconnect (QPI), designed for high throughput and scalability. This is similar to AMD's transition to HyperTransport five years earlier. Another important innovation, likewise similar to AMD, is the integration of the memory controller into the processor. This direct connection lets the processor access memory with much lower latency. Together, these two measures remove the bottleneck that the FSB had imposed on Core processors. However, they also require a new platform.

Nehalem also implements simultaneous multithreading (SMT), which was already used in Pentium 4 processors under the name Hyper-Threading. The new implementation delivers noticeably more performance, partly because the four-issue superscalar design provides more resources per thread than the three-issue superscalar design of the Pentium 4. With SMT, a four-core processor can execute up to eight threads simultaneously. However, the benefit of a quad-core was questionable in desktop applications, because only specifically optimized software runs that many performance-relevant threads at the same time. Even "normal" quad-cores have so far suffered from being fully exploited by only a few programs, and because of their lower clock per core they are in some applications slower than dual-core models. In the server field, on the other hand, multiple processors, whether real or virtual, tend to be more useful, because many requests often have to be processed in parallel.
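The thread arithmetic in the paragraph above can be sketched as follows. This is a minimal illustration, not a performance model: the class `Cpu` and its method names are hypothetical, and the sketch assumes that a hardware thread is only useful if a runnable software thread exists for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    cores: int             # physical cores
    threads_per_core: int  # 2 with SMT enabled, 1 with SMT disabled

    @property
    def hardware_threads(self) -> int:
        """Number of threads the processor can execute simultaneously."""
        return self.cores * self.threads_per_core

    def busy_hardware_threads(self, runnable_software_threads: int) -> int:
        """Hardware threads that actually have a software thread to run."""
        return min(self.hardware_threads, runnable_software_threads)


quad_with_smt = Cpu(cores=4, threads_per_core=2)
print(quad_with_smt.hardware_threads)           # 8
print(quad_with_smt.busy_hardware_threads(2))   # 2 (typical desktop program)
print(quad_with_smt.busy_hardware_threads(16))  # 8 (server handling many requests)
```

A program with only two performance-relevant threads leaves six of the eight hardware threads idle, which is the desktop situation described above.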

Unlike its predecessors, the Nehalem architecture has a three-level cache hierarchy similar to that of the AMD Phenom: besides an exclusive L1 cache, each core also has its own 256 KiB L2 cache, while all cores share a common L3 cache of up to 8 MiB. This is effectively less than the up to 6 MiB per two cores most recently offered by the Core 2, but the benefit of such large caches is questionable; Core 2 versions with reduced cache often lose only minimal performance. The L3 cache is inclusive, i.e., it always contains all the data stored in the L1 and L2 caches. This simplifies the cache coherence protocol and reduces snooping traffic. In contrast to previous processors, the L1 and L2 caches no longer consist of ordinary 6T SRAM cells but of 8T SRAM cells, from which Intel hopes to gain savings in energy consumption.
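Why an inclusive L3 cache reduces snooping traffic can be shown with a toy model. The sketch below is an assumption-laden simplification and not a description of the actual hardware: the class `InclusiveHierarchy` is hypothetical, each core's L1 and L2 are merged into one unbounded set, and the L3 uses a simple oldest-first replacement policy.

```python
from collections import OrderedDict


class InclusiveHierarchy:
    """Toy model: private per-core caches plus one shared, inclusive L3."""

    def __init__(self, num_cores: int, l3_lines: int) -> None:
        self.private = [set() for _ in range(num_cores)]  # L1+L2 per core, merged
        self.l3 = OrderedDict()  # line address -> None, least recently used first
        self.l3_lines = l3_lines

    def load(self, core: int, addr: int) -> None:
        """A core reads a line: it enters the L3 and the core's private cache."""
        if addr in self.l3:
            self.l3.move_to_end(addr)          # mark as most recently used
        else:
            if len(self.l3) >= self.l3_lines:  # L3 full: evict the oldest line
                victim, _ = self.l3.popitem(last=False)
                for cache in self.private:     # back-invalidate to keep inclusion
                    cache.discard(victim)
            self.l3[addr] = None
        self.private[core].add(addr)

    def must_snoop_cores(self, addr: int) -> bool:
        """An L3 miss proves that no private cache holds the line."""
        return addr in self.l3

    def is_inclusive(self) -> bool:
        """Check that every privately cached line is also present in the L3."""
        return all(addr in self.l3 for cache in self.private for addr in cache)


h = InclusiveHierarchy(num_cores=4, l3_lines=2)
h.load(0, 0x100)
h.load(1, 0x200)
h.load(2, 0x300)                   # evicts 0x100 from the L3 and from core 0
print(h.must_snoop_cores(0x100))   # False: no core needs to be asked
print(h.must_snoop_cores(0x200))   # True: some core may hold the line
print(h.is_inclusive())            # True
```

Because the L3 is a superset of all private caches, a lookup that misses in the L3 can go straight to memory without querying the other cores. The price is the back-invalidation on eviction, which removes a line from the private caches even if a core was still using it.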

The Power Control Unit (PCU), a kind of co-processor for the processor's power management, and novel power-gating circuits optimize the energy balance. On the one hand, this is meant to minimize power consumption in every load situation; on the other hand, it makes possible the so-called turbo mode, in which the processor is automatically clocked slightly higher under lightly threaded load when its power budget permits. Specifically, if two physical cores are idle and the TDP is not exceeded, the cores in use are clocked higher by at least one multiplier step. If only one core is working, its clock frequency is raised even further. The inactive cores are clocked down.
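The turbo rule described above can be written down as a small decision function. This is only a sketch of the rule as stated in the text: the function name `turbo_multiplier`, the step sizes `ONE_STEP` and `SINGLE_CORE_STEPS`, and the base multiplier of 20 in the usage lines are illustrative assumptions, not values from a data sheet.

```python
ONE_STEP = 1           # assumed boost when at least two cores are idle
SINGLE_CORE_STEPS = 2  # assumed larger boost when only one core is working


def turbo_multiplier(base_multiplier: int, active_cores: int,
                     within_tdp: bool, total_cores: int = 4) -> int:
    """Return the clock multiplier for the active cores under the turbo rule."""
    if not within_tdp or active_cores <= 0:
        return base_multiplier              # no headroom, or nothing to boost
    if active_cores == 1:
        return base_multiplier + SINGLE_CORE_STEPS
    if total_cores - active_cores >= 2:     # at least two physical cores idle
        return base_multiplier + ONE_STEP
    return base_multiplier


print(turbo_multiplier(20, active_cores=4, within_tdp=True))   # 20
print(turbo_multiplier(20, active_cores=2, within_tdp=True))   # 21
print(turbo_multiplier(20, active_cores=1, within_tdp=True))   # 22
print(turbo_multiplier(20, active_cores=1, within_tdp=False))  # 20
```

In the real processor the decision is made by the PCU from measured power and temperature; the boolean `within_tdp` stands in for that check.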

Other enhancements include a further stage of the Streaming SIMD Extensions, SSE4.2, and the fact that all four cores are housed on a single die.

Westmere

Under the name "Westmere", Intel has been manufacturing the Nehalem microarchitecture shrunk to a 32 nm feature size since the end of 2009. The first chips of this kind are the dual-core processors called Clarkdale. In contrast to the quad-cores of the Nehalem architecture, these processors do not carry over the innovation of the integrated memory controller; instead, the controller is again placed on a separate chip. Communication with it does not go back to an FSB but runs over QPI, which, however, does not yield better memory access latencies than the FSB connection. The new Westmere dual-cores gain a speed advantage over the dual-cores of the Core architecture only through simultaneous multithreading (SMT). With SMT deactivated, performance per clock is therefore similar to that of the older Core architecture. With SMT, the new dual-cores with their four-issue superscalar design behave, in benchmarks with heavily multithreaded software, similarly to triple-core models with a three-issue superscalar core design.

In the first half of 2010, six-core and quad-core processors based on Gulftown were also introduced. These Westmere CPUs do not differ architecturally from the Bloomfield CPUs; only the manufacturing was switched to the 32 nm process. This makes additional cores and more cache possible within the same TDP limits. Furthermore, as with the dual-core Westmere CPUs, seven additional instructions were added, six of them devoted to AES encryption. The larger L3 cache also has slightly higher latency than its predecessor.
