Cache Management Overhead

\(A,B,C)\text{number of sets } S = \frac{C}{A\cdot B}$

The lines of cache are given by: .

Cache payload = , bits

Management overhead:

  • Valid bits:
  • Tag bits: , where
  • Dirty bits:
  • Replacement bits: Approximately

Understanding Cache Misses

Each memory access can be defined as either a cache hit or cache miss

Why do Cache Misses Happen

  • Are cache misses happening because it is somehow unavoidable?
  • Are cache misses happening because of the limited capacity of the cache?
  • Are cache misses happening because of the limited associativity of the cache?

This leads to a cache miss classification model called the 3C model

The 3C Cache Miss Model

Our cache is defined as \_{1}=$(A,B,C;S)$.

Based on our cache defined above, let’s consider a fully-associative cache of the same block size and the same capacity: \_{3}=$\left( \frac{C}{B}, B, C;1 \right)$

Compulsory Miss

Since we start with all the data in and nothing in $, the first access to any memory block is necessarily a miss. This is called a compulsory miss.

  • This access would be a miss even in an infinite-capacity cache \_{2}=$(A,B,\infty)$
  • The number of compulsory misses is a tight lower bound on the minimum possible number of cache misses.

Capacity Miss

If a memory reference that is a non-compulsory miss in \{1}${3}$, then limited associativity can’t be the cause of the miss, and it must be because of limited capacity. This is called a capacity miss.

Conflict Miss

If a memory reference that is a non-compulsory miss in \{1}${3}$, then limited associativity is the cause of the miss. This is called a conflict miss.

Memory Performance Equation

Let be the total number of memory references and let be the number of these references that miss in $. Define to be the miss rate and to be the hit rate. We want to be large and to be small.

Let be the hit time and let be the additionally miss penalty (both in processor cycles)

  • Typically, .

Average memory access time (AMAT): is

  • e.g., if , AMAT cycles

Memory Effects Dominate Performance

Compare the magnitudes of the numbers in the memory performance equation with the corresponding magnitudes in the Pipeline Performance Equation.

The AMAT term would totally swamp out the load-use-penalty (lp), misprediction penalty (mp), and return penalty (rp) terms.

Furthermore, this is an average. An individual memory access takes between 1 to 101 cycles.

Dealing with this in hardware involves a complete rethinking of the processor implementation (Advanced Computer Architecture)

There are good programming practices to reduce the miss rate , we talk about this in Writing Cache-Friendly Code.