Each chipset hosts up to 8 cores, equipped with a 256 KB primary cache, as well as a secondary cache of up to 24 MB, an I/O processor, a memory controller, and a system controller.
The system leaves a copy of frequently accessed data in the primary cache for very fast retrieval.
* Primary controller transfers the data into the primary cache
* Primary controller computes a new CRC for the new parity and transfers the data to the primary cache
Level 1 cache (referred to as L1 or primary cache) is located on the same chip as the processor.
They find the average memory access time to the primary cache alone to go up from 1.1 to 5.7, mostly due to extra queueing delays at the contended banks, with an overall performance drop of 24%.
The penalty of a primary cache miss that hits in the secondary cache is at least 12 cycles, and the total penalty of a miss that goes all the way to memory is at least 75 cycles (plus any delays due to contention, which is modeled in detail).
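As a rough illustration of how these two minimum penalties combine, the sketch below computes a lower bound on average stall cycles per memory reference. The function name and the miss rates in the example are hypothetical and do not come from the text; only the 12-cycle and 75-cycle figures do, and contention delays are deliberately ignored.

```python
def average_miss_stall(l1_miss_rate, l2_miss_rate,
                       l2_hit_penalty=12, memory_penalty=75):
    """Lower bound on stall cycles per reference, ignoring contention.

    l1_miss_rate: fraction of references that miss in the primary cache.
    l2_miss_rate: fraction of primary misses that also miss in the
                  secondary cache (a local miss rate, assumed here).
    The default penalties are the minimum figures quoted in the text.
    """
    # Expected penalty of a single primary cache miss.
    per_miss = (1.0 - l2_miss_rate) * l2_hit_penalty + l2_miss_rate * memory_penalty
    return l1_miss_rate * per_miss


if __name__ == "__main__":
    # Hypothetical rates: 10% primary misses, 20% of those go to memory.
    print(average_miss_stall(0.10, 0.20))
```

Because both penalties are stated as "at least", any value produced this way is a floor rather than a prediction.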
(Note that this is overly simplistic, since the primary cache sizes are often limited more by access time than the amount of silicon area available.) For our baseline architecture, the additional storage necessary to support basic cooperative prefetching is 640 bytes at the level of the primary I-cache (128 bytes for the prefetch bits used by prefetch filtering, and 512 bytes for the prefetch buffer), and 8KB for the 2-bit saturating counters added to the L2 cache.
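The storage figures above can be checked with a few lines of arithmetic. This is only a sketch: the constant and function names are made up for illustration, and 8KB is assumed to mean 8 × 1024 bytes.

```python
PREFETCH_BITS_BYTES = 128        # prefetch bits used by prefetch filtering
PREFETCH_BUFFER_BYTES = 512      # prefetch buffer
L2_COUNTER_BYTES = 8 * 1024      # assumed: 8KB = 8 * 1024 bytes
COUNTER_WIDTH_BITS = 2           # 2-bit saturating counters


def icache_overhead_bytes():
    """Extra storage at the level of the primary I-cache."""
    return PREFETCH_BITS_BYTES + PREFETCH_BUFFER_BYTES


def l2_counter_count():
    """Number of 2-bit counters that fit in the L2 counter storage."""
    return (L2_COUNTER_BYTES * 8) // COUNTER_WIDTH_BITS


if __name__ == "__main__":
    print(icache_overhead_bytes())  # 640
    print(l2_counter_count())       # 32768
```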
When a primary cache miss occurs, one of several stream buffers is allocated to service the new reference stream.
To help mitigate this effect, a small history buffer is used to record the most recent primary cache misses.
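A toy model may help picture how stream buffers and a miss history buffer could interact. Everything in the sketch below is an assumption made for illustration: the class and method names, the LRU replacement of buffers, the unit-stride prefetch pattern, and in particular the use of the history buffer as an allocation filter (allocate only when the preceding line has also missed recently). The excerpts above do not specify any of these details.

```python
from collections import OrderedDict, deque


class StreamBuffers:
    """Toy model of stream buffers that service primary cache misses."""

    def __init__(self, num_buffers=4, depth=4, history_size=8):
        self.num_buffers = num_buffers
        self.depth = depth
        # buffer id -> FIFO of prefetched line addresses; dict order tracks LRU.
        self.buffers = OrderedDict()
        # Most recent primary cache miss addresses (line granularity).
        self.history = deque(maxlen=history_size)
        self._next_id = 0

    def on_primary_miss(self, line):
        """Handle one primary cache miss and report what the model did."""
        # Look for a buffer whose head entry matches the missing line.
        hit_id = next(
            (bid for bid, fifo in self.buffers.items() if fifo and fifo[0] == line),
            None,
        )
        if hit_id is not None:
            fifo = self.buffers[hit_id]
            fifo.popleft()
            # Keep the stream running by prefetching one more line.
            fifo.append((fifo[-1] if fifo else line) + 1)
            self.buffers.move_to_end(hit_id)
            outcome = "stream_hit"
        elif (line - 1) in self.history:
            # Assumed filter: allocate only after two consecutive lines miss.
            if len(self.buffers) >= self.num_buffers:
                self.buffers.popitem(last=False)  # evict least recently used
            self.buffers[self._next_id] = deque(
                range(line + 1, line + 1 + self.depth)
            )
            self._next_id += 1
            outcome = "allocated"
        else:
            outcome = "filtered"
        self.history.append(line)
        return outcome
```

With this filter, an isolated miss does not claim a buffer, while a second miss to the next sequential line does; a real design could use a different allocation policy.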
Assume that each element of the array a is 8 bytes, a cache line contains 32 bytes, the primary cache size is 8KB, and that memory feedback tells us that the load of a[j] suffered an 8.3% miss rate.
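The quantities that follow directly from these assumptions can be worked out as below. The function name and dictionary keys are hypothetical, 8KB is taken to mean 8 × 1024 bytes, and the "unit-stride" figure assumes a cold sequential sweep with one miss per cache line, which the text itself does not state.

```python
def cache_geometry(element_bytes=8, line_bytes=32, cache_bytes=8 * 1024):
    """Derive basic figures from the element, line, and cache sizes."""
    elements_per_line = line_bytes // element_bytes
    lines_in_cache = cache_bytes // line_bytes
    return {
        "elements_per_line": elements_per_line,
        "lines_in_cache": lines_in_cache,
        "elements_in_cache": elements_per_line * lines_in_cache,
        # One miss per line if every element is touched once, in order.
        "unit_stride_miss_rate": 1.0 / elements_per_line,
    }


if __name__ == "__main__":
    for name, value in cache_geometry().items():
        print(name, value)
```

Under these assumptions a line holds 4 elements and the cache holds 256 lines, which gives a reference point against which a measured miss rate such as the 8.3% above can be compared.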
Read Operations
* Hit in Primary Cache: 1 pclock
* Fill from Secondary Cache: 15 pclock
* Fill from Local Node: 29 pclock
* Fill from Dirty Remote, Remote Home: 132 pclock

Write Operations
* Owned by Secondary Cache: 4 pclock
* Owned by Local Node: 17 pclock
* Owned by Remote Node: 89 pclock
* Owned in Dirty Remote, Remote Home: 120 pclock