The use of caches in multiprocessor systems presents a bit of a problem. Without special precautions, it is possible for two CPUs to have different values for the same data. Figure 2 below shows an example of this occurring; the ending state (circled in red) shows the problem: CPU 1 has an old value for X in its cache. (Note that memory also has an old value, which is fairly common; the caches, however, must always agree with one another.)
Instruction sequence:
1. CPU 0: load X from memory
2. CPU 1: load X from memory
3. CPU 0: X = X - 1 (decrement)
4. CPU 0: store X to cache
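This failure is easy to mimic in software. Below is a minimal C sketch (the thread bodies and variable names are my own illustration, not part of Figure 2) in which one thread plays CPU 0 and decrements a shared variable while another plays CPU 1 and re-reads it. The accesses are deliberately unsynchronized, so nothing in the program guarantees when, or in what order, the reader observes the writer's update; this is the software-visible analogue of CPU 1 holding a stale copy of X.

    #include <pthread.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Shared data, playing the role of X in Figure 2. Deliberately not
     * atomic and not lock-protected, to mirror the unsynchronized
     * accesses in the example; as written, this is a data race. */
    int x = 5;

    /* "CPU 0": load X, decrement it, store it back. */
    static void *writer(void *arg) {
        (void)arg;
        int tmp = x;    /* load X from memory    */
        tmp = tmp - 1;  /* X = X - 1 (decrement) */
        x = tmp;        /* store X back          */
        return NULL;
    }

    /* "CPU 1": re-read X a few times. Nothing here forces the writer's
     * update to propagate to this thread before any particular read. */
    static void *reader(void *arg) {
        (void)arg;
        for (int i = 0; i < 3; i++) {
            printf("reader sees x = %d\n", x);
            usleep(1000);
        }
        return NULL;
    }

    int main(void) {
        pthread_t r, w;
        pthread_create(&r, NULL, reader, NULL);
        pthread_create(&w, NULL, writer, NULL);
        pthread_join(r, NULL);
        pthread_join(w, NULL);
        return 0;
    }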
A simple way of thinking about cache coherency is that you want to make sure time is flowing in the right direction for all the processors. The problem illustrated by Figure 2 is called the cache coherency problem, and the heart of the issue is ensuring that each memory address contains the same piece of data from every cache's perspective. This is a slightly informal definition; the formal one consists of two properties (a short sketch after the list illustrates the second):
1. A write to an address in memory will eventually propagate throughout the system. This means that every processor will eventually know that changes have been made.
2. Multiple writes to an address in memory are serialized; every processor in the system sees the writes occur in the same order.
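As promised above, here is a small litmus-test sketch of the second property in C11 (the thread and variable names are my own). Two writers store different values to the same location, and two observers each read it twice. Write serialization forbids the combined outcome where observer A records 1 then 2 while observer B records 2 then 1, since that would mean the two observers saw the writes in opposite orders. C11 atomics guarantee this even at memory_order_relaxed, because every atomic object has a single modification order; hardware coherence provides the analogous guarantee per address.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    /* One shared location; coherence governs the order of writes to it. */
    atomic_int x = 0;

    static void *writer1(void *a) {
        (void)a;
        atomic_store_explicit(&x, 1, memory_order_relaxed);
        return NULL;
    }

    static void *writer2(void *a) {
        (void)a;
        atomic_store_explicit(&x, 2, memory_order_relaxed);
        return NULL;
    }

    /* Each observer samples x twice. The combined outcome where one
     * observer records (1 then 2) and the other records (2 then 1) is
     * forbidden: the observers would disagree on the write order. */
    static void *observer(void *out) {
        int *r = out;
        r[0] = atomic_load_explicit(&x, memory_order_relaxed);
        r[1] = atomic_load_explicit(&x, memory_order_relaxed);
        return NULL;
    }

    int main(void) {
        int ra[2], rb[2];
        pthread_t t[4];
        pthread_create(&t[0], NULL, writer1, NULL);
        pthread_create(&t[1], NULL, writer2, NULL);
        pthread_create(&t[2], NULL, observer, ra);
        pthread_create(&t[3], NULL, observer, rb);
        for (int i = 0; i < 4; i++)
            pthread_join(t[i], NULL);
        printf("A saw %d,%d; B saw %d,%d\n", ra[0], ra[1], rb[0], rb[1]);
        return 0;
    }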
A cache coherency protocol is just a set of rules that the system designer chooses in order to make sure that both of these properties are observed. All coherency protocols work with cache lines as the basic data structure, and assign each cache line a sharing state which is used to manage the communication between processors. There is a lot of room for variation amongst coherency protocols, but we will try to touch on all the major design differences.
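To make "a set of rules over sharing states" concrete, here is a sketch of the classic three-state MSI protocol as a transition function in C. MSI is a standard textbook protocol rather than anything specific to this article, and the function is illustrative only: it omits write-backs, replacements, and the acknowledgement traffic a real controller must handle.

    #include <stdio.h>

    /* Sharing states of the classic MSI protocol; real designs add
     * further states such as E (MESI) or O (MOESI). */
    typedef enum { INVALID, SHARED, MODIFIED } line_state;

    /* Events a cache controller reacts to: requests from its own CPU,
     * plus requests snooped from other caches over the interconnect. */
    typedef enum {
        LOCAL_READ,   /* this CPU loads from the line         */
        LOCAL_WRITE,  /* this CPU stores to the line          */
        REMOTE_READ,  /* another cache wants to read the line */
        REMOTE_WRITE  /* another cache wants to write it      */
    } line_event;

    static const char *name[] = { "INVALID", "SHARED", "MODIFIED" };

    /* One possible next-state function for a single cache line. */
    static line_state msi_next(line_state s, line_event e) {
        switch (s) {
        case INVALID:
            if (e == LOCAL_READ)  return SHARED;    /* fetch a clean copy       */
            if (e == LOCAL_WRITE) return MODIFIED;  /* fetch and claim the line */
            return INVALID;
        case SHARED:
            if (e == LOCAL_WRITE)  return MODIFIED; /* invalidate other copies  */
            if (e == REMOTE_WRITE) return INVALID;  /* our copy is now stale    */
            return SHARED;
        case MODIFIED:
            if (e == REMOTE_READ)  return SHARED;   /* supply data, demote      */
            if (e == REMOTE_WRITE) return INVALID;  /* supply data, invalidate  */
            return MODIFIED;
        }
        return INVALID;
    }

    int main(void) {
        /* Replay Figure 2 under MSI: when CPU 0 writes X, CPU 1's copy
         * is invalidated instead of silently going stale. */
        line_state cpu0 = msi_next(INVALID, LOCAL_READ);  /* CPU 0 loads X  */
        line_state cpu1 = msi_next(INVALID, LOCAL_READ);  /* CPU 1 loads X  */
        cpu0 = msi_next(cpu0, LOCAL_WRITE);   /* CPU 0 writes X         */
        cpu1 = msi_next(cpu1, REMOTE_WRITE);  /* CPU 1 snoops the write */
        printf("CPU 0: %s, CPU 1: %s\n", name[cpu0], name[cpu1]);
        return 0;
    }

The Figure 2 failure disappears because the rules force CPU 1's copy out of its cache the moment CPU 0 writes, rather than letting it silently go stale.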