Cache Notes

  1. Cache
    • Cache is a small amount of fast memory combined with a large
      amount (the non-cache) of slower memory.
    • Access to memory is generally slower than the CPU, so cache is a
      scheme to help speed up the memory access time.
    • Locality principle - the observation that the memory references
      made in any short time interval tend to use only a small fraction
      of the total memory.
      • This forms the basis of caching systems.
      • If a given memory reference is to address A, it is likely
        that the next memory reference will be in the general vicinity of A.
      • The general idea of the locality principle is that when a
        word is referenced, it is brought from the large slow memory
        into the cache, so that the next time it is used, it can be accessed quickly.
    • If a word is read or written k times in a short interval, the computer
      will need 1 reference to slow memory and k-1 references to fast memory (cache).
      • The first reference to the word is made to main (slow) memory because
        at this point the word is not in the cache.
      • After the first reference to word, the word is placed in the cache
      • A reference to a word first checks the cache for the word.
        If the word is not in the cache, then main memory is checked.
      • The larger k is, the better the overall performance because all
        of the calls to the word (except the first on) are satisfied out of the cache
    • h, the hit ratio, is the fraction of all references that can be
      satisfied from the cache.
      • h = (k-1)/k
      • For example, if a word is read/written 10 times, main memory
        is accessed once, and cache is accessed 9 times. (the word is placed
        in cache after the first hit).
      • The hit ratio here is 9/10 or 90%. If k is 1000 times,
        h= 999/1000=99.9%.
      • The miss ration is 1-h.
    • mean access time = c + (1-h)m
      where c is the cache access time, and m is the main memory access time
    • As h approaches 1, most of the access is done in cache, and the
      mean access time approaces c, the cache access time.
    • As h approaches 0, the mean access time approaches c + m, first
      a time c to check the cache (unsuccessfully) and then a time m to
      do the memory reference.
  2. Associative cache
    • In associative cache:
      Number of blocks that are needed = size of memory/block size
    • In associative cache, each slot contains:
      slot size (in bits) = 1 bit (valid/invalid) +
      #bits needed for block addresses + #bits needed to store the actual data
    • The #bits needed to store the actual data is computed:
      (block size in bytes) * 8 bits/byte
    • Associative cache - consists of a number of slots or lines,
      each containing: valid bit, block number, value
    • When Valid = 1, the slot is in use in the cache
    • In Fig 28, there is a cache with 1024 slots.
    • In associative cache, the order of the entries is random. When
      the computer is reset, all the Valid bits are set to 0, no cache entries
      are valid.
    • Fig 28 - the memory is 2^24 bytes divided into 2^22 4 byte blocks
      • The cache needs 22 bits to store labels of up to 22 bit-length names.
      • The block size is 4 bytes per block, which is 32 bits per block.
      • The cache needs 1 bit (valid/invalid) + 22 bits (for the block number) +
        32 bits (for the block value) = 55 bits per line X 1024 lines
  3. Direct-mapped cache
    • In direct-mapped cache:
      Each slot contains:
      1 bit (valid/invalid) + # bits for tag + #bits for actual data
    • To compute the number of bits for the tag:
      #bits for address - log base 2 of the block size - log base 2 of number of slots
      • Subracting log base 2 of the block size is like dividing by
        the block size.
      • Subtracting log base 2 of the number of slots is like performing
        the operation: block number mod the number of slots
    • Fig 29: 2^24 addresses, 4 (2^2) byte blocks, 1024 (2^10)slots in the cache
    • Slot number = 2^24/2^2 mod 2^10 = 2^22 mod 2^10
      • This is the same as using the lower 10 bits.
      • The upper 12 bits are the tag