Non-Uniform Memory Access

Non-Uniform Memory Access, or NUMA for short, is a computer memory architecture for multiprocessor systems in which each processor has its own local memory but also grants other processors direct access to it through a shared address space (distributed shared memory). Memory access times in such a system therefore depend on whether a memory address lies in the processor's own local memory or in the remote memory of another processor.

In contrast, there are:

  • Uniform Memory Access (UMA), in which there is one central memory whose access times are always the same.
  • Cache-only Memory Access (COMA), in which, as with NUMA, there is a single shared address space and each processor has its own local memory, but other processors do not write to the "remote" memory; instead (as with caches) they transfer the data into their own local memory and invalidate (delete) it at its previous location.

NUMA architectures are the next step in increasing the scalability of SMP architectures.

Cache coherent NUMA ( ccNUMA )

Almost all computer architectures use a small amount of very fast memory, called a cache, to exploit locality of reference in memory accesses. With NUMA, maintaining cache coherence across the distributed memory incurs additional overhead. As an example, suppose a processor fetches data from the memory of another processor, performs calculations on it, and writes the results into its own local cache. The cache of the processor from which the data originated (and perhaps other caches in the system as well) must then be synchronized.

Non-cache-coherent NUMA systems are indeed easier to develop and build, but they are difficult to program with the standard von Neumann programming model. All NUMA systems currently in use therefore have special hardware to ensure cache coherence and are accordingly referred to as cache-coherent NUMA (ccNUMA).

This is usually achieved through inter-processor communication between the cache controllers, which keeps memory contents consistent when the same memory location is stored in more than one cache. ccNUMA suffers from poor performance when multiple processors access the same memory location in rapid succession. An operating system with NUMA support therefore tries to minimize the frequency of such accesses by allocating processors and memory in a NUMA-friendly manner.

Current implementations of ccNUMA systems include multiprocessor systems based on the AMD Opteron and SGI systems with NUMAlink. Earlier ccNUMA systems were based on the Alpha EV7 processor from Digital Equipment Corporation (DEC) or on MIPS R1x000 processors, as in the SGI Origin series.

NUMA vs. Cluster Computing

In contrast to NUMA computers, each cluster node has its own address space. Furthermore, the communication latencies between cluster nodes are significantly higher than those between NUMA-coupled processors.

By adapting the operating system's paging of virtual memory, it is possible to implement cluster-wide address spaces and thus "NUMA in software". However, since the high latencies remain, this is rarely useful.