MSI Cache Coherence Protocol

Introduction to Cache Coherence

In multiprocessor systems, each processor typically has its own cache to improve performance by storing frequently accessed data closer to the CPU. However, this creates a fundamental problem: when multiple processors cache the same memory location, how do we ensure that all processors see a consistent view of memory?

Cache Coherence is the discipline that ensures that changes in the values of shared operands are propagated throughout the system in a timely fashion. The goal is to maintain the illusion that there is only a single copy of each memory location, even though multiple cached copies may exist.

The MSI Protocol

The MSI (Modified-Shared-Invalid) protocol is one of the fundamental cache coherence protocols used in multiprocessor systems. It defines three possible states for each cache line:

Cache Line States

  1. Modified (M)

    • The cache line has been modified (written) by this processor
    • This processor has the only valid copy; main memory is stale
    • This processor has exclusive ownership and can read/write without bus transactions
    • Responsibility to write back to memory when evicted
  2. Shared (S)

    • The cache line is unmodified and potentially shared with other processors
    • Multiple processors may have copies in the Shared state
    • Main memory is up-to-date
    • Reads can be satisfied locally, but writes require bus transactions
  3. Invalid (I)

    • The cache line is invalid (not present or not valid)
    • Any access requires a bus transaction to obtain the data
    • Initial state of all cache lines

Processor Operations

  • PrRd (Processor Read): The processor wants to read from a memory location
  • PrWr (Processor Write): The processor wants to write to a memory location

Bus Transactions

  • BusRd: A read request is placed on the bus when a processor cache miss occurs
  • BusRdX: A read-exclusive request for a cache line that will be modified
  • BusUpgr: An upgrade request to gain exclusive access to a cache line currently in Shared state

State Transition Rules

The MSI protocol defines specific rules for how cache lines transition between states:

From Invalid (I) State:

  • PrRd → S: On a processor read miss, issue BusRd, obtain data, and move to Shared state
  • PrWr → M: On a processor write miss, issue BusRdX, obtain exclusive access, and move to Modified state

From Shared (S) State:

  • PrRd → S: Read hit, remain in Shared state
  • PrWr → M: Issue BusUpgr to invalidate other copies, then move to Modified state
  • BusRdX → I: Another processor issued BusRdX, invalidate local copy

From Modified (M) State:

  • PrRd → M: Read hit, remain in Modified state
  • PrWr → M: Write hit, remain in Modified state
  • BusRd → S: Another processor wants to read, share the data, write back to memory
  • BusRdX → I: Another processor wants exclusive access, write back and invalidate

Bus Snooping

The MSI protocol relies on bus snooping, where each cache controller monitors (snoops) the bus for transactions that might affect its cached data. When a cache controller sees a relevant bus transaction, it responds appropriately by changing the state of its cache lines.

Performance Implications

The MSI protocol involves several performance trade-offs:

Advantages:

  • Ensures data consistency across all processors
  • Relatively simple implementation
  • Good performance for read-shared data

Disadvantages:

  • Write operations on shared data can be expensive due to invalidations
  • False sharing can cause unnecessary invalidations
  • Bus bandwidth can become a bottleneck

Applications in Modern Systems

The MSI protocol serves as the foundation for more complex protocols used in modern multicore processors:

  • MESI: Adds an Exclusive state to reduce bus traffic
  • MOESI: Adds an Owned state for better sharing of modified data
  • Directory-based protocols: Scale beyond bus-based systems

Understanding the MSI protocol is crucial for:

  • Computer architecture design
  • Parallel programming optimization
  • Performance analysis of multicore applications
  • System-level debugging of concurrent programs