Cache coherence

In multiprocessor systems with several CPU caches, cache coherence ensures that the individual caches do not return different (inconsistent) information for the same memory address.

A temporary inconsistency between memory and the caches is permitted, provided it is detected and resolved by the time of a read access at the latest. Inconsistencies arise, for example, from the write-back method, which, unlike the write-through method, does not update main memory immediately when the cache is written. See also cache consistency.
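The difference between the two write policies can be illustrated with a minimal sketch. The `Cache` class, its `flush` method, and the dict-based memory are hypothetical names chosen for illustration; they do not model any real hardware interface.

```python
class Cache:
    """Toy cache in front of a dict-based main memory (illustrative only)."""

    def __init__(self, memory, write_back):
        self.memory = memory          # shared main memory: address -> value
        self.write_back = write_back  # True: write-back, False: write-through
        self.lines = {}               # cached copies: address -> value
        self.dirty = set()            # addresses modified but not yet in memory

    def write(self, addr, value):
        self.lines[addr] = value
        if self.write_back:
            self.dirty.add(addr)       # main memory is now stale for addr
        else:
            self.memory[addr] = value  # write-through: memory updated at once

    def flush(self):
        """Write all dirty lines back, resolving the inconsistency."""
        for addr in self.dirty:
            self.memory[addr] = self.lines[addr]
        self.dirty.clear()


memory = {0x10: 1}

wt = Cache(memory, write_back=False)
wt.write(0x10, 2)
after_write_through = memory[0x10]  # memory already holds the new value

wb = Cache(memory, write_back=True)
wb.write(0x10, 3)
before_flush = memory[0x10]         # memory still holds the old value
wb.flush()
after_flush = memory[0x10]          # inconsistency resolved by the write-back
```

Between `wb.write` and `wb.flush`, memory and cache disagree; a coherence protocol has to make sure no other processor reads the stale value during that window.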

Cache coherence protocols

A cache coherence protocol has the task of tracking the state of each cached memory block. Essentially, there are two technical foundations on which such protocols can be implemented:

  • Directory-based: A central list records the state of all cached blocks. It stores which processors currently hold a read-only copy of a block (state Shared) or which processor has exclusive write access to it (state Exclusive). The protocol governs the transitions between the states and the behavior on a read miss, a write miss, or a data write-back.
  • Snooping-based: Accesses to main memory usually go over a shared medium (e.g., a bus or switch). All attached cache controllers can observe this medium and identify read or write accesses to blocks that they have cached. The exact reaction of the controllers is defined in the protocol.
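The bookkeeping of the directory-based approach can be sketched roughly as follows. The `Directory` class, the additional `Uncached` state, and the method names are assumptions made for illustration; a real directory would also move data and exchange messages with the caches.

```python
class Directory:
    """Toy directory: per block, a state and the set of processors holding it."""

    def __init__(self):
        self.entries = {}  # block -> (state, set of processor ids)

    def _entry(self, block):
        return self.entries.get(block, ("Uncached", set()))

    def read_miss(self, block, cpu):
        _, holders = self._entry(block)
        # A previous exclusive owner is downgraded to a sharer
        # (a real system would first have it write its data back).
        self.entries[block] = ("Shared", holders | {cpu})

    def write_miss(self, block, cpu):
        _, holders = self._entry(block)
        invalidated = holders - {cpu}  # copies that must be invalidated
        self.entries[block] = ("Exclusive", {cpu})
        return invalidated

    def writeback(self, block, cpu):
        state, holders = self._entry(block)
        if state == "Exclusive" and holders == {cpu}:
            self.entries[block] = ("Uncached", set())


d = Directory()
d.read_miss(7, cpu=0)
d.read_miss(7, cpu=1)
shared_entry = d.entries[7]           # two read-only copies
invalidated = d.write_miss(7, cpu=2)  # processors 0 and 1 lose their copies
exclusive_entry = d.entries[7]
d.writeback(7, cpu=2)
final_entry = d.entries[7]
```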

Most commonly, in both directory-based and snooping-based designs, a write-back invalidation protocol (write-invalidate protocol) is used, such as the Modified Shared Invalid protocol (MSI) or its extensions MESI and MOESI. Alternatively, there are write-back update protocols (see bus snarfing), but these lead to increased bus traffic.
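As a rough sketch of how a snooping write-invalidate protocol behaves, the following models only the MSI state transitions, without any data transfer. The class names `Bus` and `MSICache` and the message names `BusRd` and `BusRdX` are labels chosen for this example, not part of any particular implementation.

```python
class Bus:
    """Shared medium: every message is seen by all other caches."""

    def __init__(self):
        self.caches = []
        self.messages = 0  # number of broadcasts, a crude measure of traffic

    def broadcast(self, sender, kind, addr):
        self.messages += 1
        for cache in self.caches:
            if cache is not sender:
                cache.snoop(kind, addr)


class MSICache:
    def __init__(self, bus):
        self.state = {}  # addr -> "M" or "S"; a missing entry means "I"
        self.bus = bus
        bus.caches.append(self)

    def get(self, addr):
        return self.state.get(addr, "I")

    def read(self, addr):
        if self.get(addr) == "I":        # read miss
            self.bus.broadcast(self, "BusRd", addr)
            self.state[addr] = "S"

    def write(self, addr):
        if self.get(addr) != "M":        # write miss or upgrade from S
            self.bus.broadcast(self, "BusRdX", addr)
            self.state[addr] = "M"

    def snoop(self, kind, addr):
        current = self.get(addr)
        if kind == "BusRd" and current == "M":
            self.state[addr] = "S"       # a real cache would supply the data
        elif kind == "BusRdX" and current != "I":
            self.state[addr] = "I"       # another cache wants to write


bus = Bus()
a, b = MSICache(bus), MSICache(bus)
a.read(0x40)
b.read(0x40)
after_reads = (a.get(0x40), b.get(0x40))
a.write(0x40)
after_write = (a.get(0x40), b.get(0x40))
b.read(0x40)
after_reread = (a.get(0x40), b.get(0x40))
```

Repeated writes by the same cache to a block it holds in state M cause no further bus messages in this sketch, whereas an update protocol would have to broadcast every write.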

The choice between directory-based and snooping-based protocols depends, among other things, on the number of participating processors (cache controllers). With 64 processors or more at the latest, directory-based protocols are usually used, because the bandwidth of the bus does not scale sufficiently. In smaller installations, the snooping-based approach performs somewhat better because there is no central instance.

In multiprocessor installations with distributed memory, a separate directory is usually kept for each memory, so that directory access does not become a bottleneck.
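One simple way to spread directory entries across memories is to assign each block a home node by interleaving block addresses. This is only one possible mapping, shown as a sketch; `BLOCK_SIZE`, `NUM_NODES`, and `home_node` are assumed names and values, not taken from a specific system.

```python
BLOCK_SIZE = 64  # bytes per cache block (assumed)
NUM_NODES = 4    # memory nodes, each with its own directory (assumed)


def home_node(addr):
    """Return the node whose directory tracks the block containing addr."""
    return (addr // BLOCK_SIZE) % NUM_NODES


# Consecutive blocks land on different directories, spreading the load.
homes = [home_node(a) for a in (0, 64, 128, 192, 256)]
```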
