Simulation Parameters for the Baseline Architecture

Pipeline Parameters
  Fetch and Decode Width         8 aligned sequential instructions
  Issue and Graduate Width       4
  Functional Units               2 Integer, 2 FP, 2 Memory, 2 Branch
  Reorder Buffer Size            32
  Integer Multiply               12 cycles
  Integer Divide                 76 cycles
  All Other Integers             1 cycle
  FP Divide                      15 cycles
  FP Square Root                 20 cycles
  All Other FPs                  2 cycles
  Branch Prediction Scheme       2-bit Counters

Memory Parameters
  Line Size                      32B
  I-Cache                        32KB, 2-way set-associative, 4 banks
  Instruction Prefetch Buffer    16 entries
  D-Cache                        32KB, 2-way set-associative, 4 banks
  Victim Buffers                 8 entries each for data and instructions
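As a reading aid, the cache geometry implied by the memory parameters can be worked out in a few lines. This is only an illustrative sketch: the `CacheConfig` class and its field names are hypothetical and do not come from the simulator itself; only the numeric values are taken from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CacheConfig:
    """Hypothetical container for one cache's parameters (names are illustrative)."""
    size_bytes: int
    line_bytes: int
    ways: int
    banks: int

    @property
    def num_lines(self) -> int:
        # Total number of cache lines the cache can hold.
        return self.size_bytes // self.line_bytes

    @property
    def num_sets(self) -> int:
        # Lines are grouped into sets of `ways` lines each.
        return self.num_lines // self.ways


# Values from the table: 32KB, 32B lines, 2-way set-associative, 4 banks.
ICACHE = CacheConfig(size_bytes=32 * 1024, line_bytes=32, ways=2, banks=4)
DCACHE = CacheConfig(size_bytes=32 * 1024, line_bytes=32, ways=2, banks=4)

if __name__ == "__main__":
    for name, cache in (("I-Cache", ICACHE), ("D-Cache", DCACHE)):
        print(f"{name}: {cache.num_lines} lines, {cache.num_sets} sets")
```

With these values each cache holds 1024 lines organized as 512 two-way sets, so the 16-entry instruction prefetch buffer and the 8-entry victim buffers are small relative to the caches they supplement.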
To overcome these limitations, we rely on software rather than hardware to launch nonsequential instruction prefetches early enough.
With respect to pipelining, our instruction prefetches differ in two important ways from data prefetches: (i) the pipeline stage in which the prefetch address is known, and (ii) the computational resources consumed by the prefetches.