Assume that main memory accesses take 70 ns and that memory accesses are 36% of
Question
Assume that main memory accesses take 70 ns and that memory accesses are 36% of all instructions. The following table shows data for L1 caches attached to each of two processors, P1 and P2.

Processor   L1 Cache Size   L1 Miss Rate   L1 Hit Time
P1          2 KB            8.0%           0.66 ns
P2          4 KB            6.0%           0.90 ns

a) Assuming that the L1 hit time determines the cycle times for P1 and P2, what are their respective clock rates? (Hint: time for one instruction = CPI / clock rate)
b) What is the Average Memory Access Time (AMAT) for P1 and P2? Which processor is faster for memory accesses?
c) Assuming a base CPI of 1.0 without any memory stalls (for the rest of the instruction types in the program), what is the total average CPI for P1 and P2?

We will consider the addition of an L2 cache to P1 to presumably make up for its limited L1 cache capacity; on a miss, P1 will now first check the L2 cache, and only if that is a miss will it need a main memory access. Use the L1 cache capacities and hit times from the previous table when solving these problems. The L2 miss rate indicated is its local miss rate.

Processor   L2 Hit Time
P1          5.62 ns

d) What is the AMAT for P1 with the addition of an L2 cache?

Explanation / Answer
(a)
Clock rate = 1 / <cycle time>, and <cycle time> = <L1 hit time>.
P1: 1 / 0.66 ns = 1.52 GHz
P2: 1 / 0.90 ns = 1.11 GHz
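The reciprocal relationship in (a) can be checked with a few lines of Python. This is only a sketch: the helper name clock_rate_ghz is made up for illustration, and it assumes, as the question does, that the cycle time equals the L1 hit time.

```python
def clock_rate_ghz(cycle_time_ns: float) -> float:
    """Return the clock rate in GHz for a cycle time in ns (1/ns = GHz)."""
    return 1.0 / cycle_time_ns


# L1 hit times from the question, used here as the cycle times.
l1_hit_time_ns = {"P1": 0.66, "P2": 0.90}

for name, t_ns in l1_hit_time_ns.items():
    print(f"{name}: {clock_rate_ghz(t_ns):.2f} GHz")
```

Because a nanosecond is 10^-9 s, the reciprocal of a time in ns comes out directly in GHz, so no unit conversion is needed.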
(b)
P1: Main memory access takes 70 ns, which is 70 ns / 0.66 ns/cycle = 106.1, rounded up to 107 clock cycles.
AMAT = 1 + 8% * 107 = 9.56 cycles
9.56 cycles * 0.66 ns/cycle = 6.31 ns
P2: Main memory access takes 70 ns, which is 70 ns / 0.90 ns/cycle = 77.8, rounded up to 78 clock cycles.
AMAT = 1 + 6% * 78 = 5.68 cycles
5.68 cycles * 0.90 ns/cycle = 5.11 ns
P2 has the lower AMAT, so it is faster for memory accesses.
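The AMAT arithmetic in (b) can be reproduced with a short Python sketch. The function name amat and its parameters are hypothetical, and the sketch assumes the miss penalty is the 70 ns main memory latency rounded up to a whole number of cycles, as the answer does.

```python
import math

MAIN_MEMORY_NS = 70.0  # main memory access time from the question


def amat(l1_hit_ns: float, l1_miss_rate: float, mem_ns: float = MAIN_MEMORY_NS):
    """Return (AMAT in cycles, AMAT in ns) for a single-level cache.

    Assumes one cycle equals the L1 hit time and that the miss penalty
    is the memory latency rounded up to whole cycles.
    """
    miss_penalty_cycles = math.ceil(mem_ns / l1_hit_ns)
    amat_cycles = 1 + l1_miss_rate * miss_penalty_cycles
    return amat_cycles, amat_cycles * l1_hit_ns


for name, hit_ns, miss_rate in [("P1", 0.66, 0.08), ("P2", 0.90, 0.06)]:
    cycles, ns = amat(hit_ns, miss_rate)
    print(f"{name}: {cycles:.2f} cycles = {ns:.2f} ns")
```

Working in cycles first and converting to ns at the end mirrors the answer's steps; the rounding up of the miss penalty is what makes the result slightly larger than a pure-ns calculation would give.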
(c)
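CPI_stall = <CPI_ideal> + <average memory-stall cycles per instruction>
P1: 1.0 + 0.36 * 0.08 * 107 = 4.08 clock cycles per instruction
4.08 * 0.66 ns = 2.69 ns per instruction
P2: 1.0 + 0.36 * 0.06 * 78 = 2.68 clock cycles per instruction
2.68 * 0.90 ns = 2.42 ns per instruction (using the unrounded CPI of 2.6848)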
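The CPI calculation in (c) follows the same pattern and can be sketched in Python. The name total_cpi and its parameters are illustrative, not from the original; the sketch assumes, as the answer does, that only the 36% of instructions that access memory incur stalls and that the miss penalty is 70 ns rounded up to whole cycles.

```python
import math


def total_cpi(l1_hit_ns: float, l1_miss_rate: float,
              base_cpi: float = 1.0, mem_fraction: float = 0.36,
              mem_ns: float = 70.0) -> float:
    """Return base CPI plus average memory-stall cycles per instruction."""
    miss_penalty_cycles = math.ceil(mem_ns / l1_hit_ns)
    return base_cpi + mem_fraction * l1_miss_rate * miss_penalty_cycles


for name, hit_ns, miss_rate in [("P1", 0.66, 0.08), ("P2", 0.90, 0.06)]:
    cpi = total_cpi(hit_ns, miss_rate)
    # Time per instruction = CPI * cycle time (cycle time = L1 hit time).
    print(f"{name}: CPI = {cpi:.2f}, {cpi * hit_ns:.2f} ns per instruction")
```

Multiplying CPI by the cycle time gives the average time per instruction, which is the quantity to compare across processors with different clock rates.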
(d)
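AMAT L2 = Hit Rate of L2 * Hit Time of L2 + Miss Rate of L2 * Miss Time of L2
0.05 * 5.62 ns + 0.95 * 70 ns = 66.78 ns
That figure covers only the L2 level. For P1 as a whole, using the same cycle counts as in (b), with the L2 hit time of 5.62 ns / 0.66 ns/cycle rounded up to 9 cycles and paid on every L1 miss:
AMAT = 1 + 0.08 * (9 + 0.95 * 107) = 9.85 cycles
9.85 cycles * 0.66 ns/cycle = 6.50 ns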
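A two-level AMAT for P1 can be sketched in Python as well. The function name two_level_amat is hypothetical. The sketch assumes the 95% L2 local miss rate that the answer's arithmetic uses, the common convention that the L2 hit time is paid on every L1 miss, and latencies rounded up to whole cycles of 0.66 ns.

```python
import math


def two_level_amat(l1_hit_ns: float, l1_miss_rate: float,
                   l2_hit_ns: float, l2_local_miss_rate: float,
                   mem_ns: float = 70.0):
    """Return (AMAT in cycles, AMAT in ns) for an L1 + L2 hierarchy.

    Assumes one cycle equals the L1 hit time, and that both the L2 hit
    time and the memory latency are rounded up to whole cycles.
    """
    l2_hit_cycles = math.ceil(l2_hit_ns / l1_hit_ns)
    mem_cycles = math.ceil(mem_ns / l1_hit_ns)
    # Every L1 miss pays the L2 hit time; L2 misses also pay memory latency.
    l1_miss_penalty = l2_hit_cycles + l2_local_miss_rate * mem_cycles
    amat_cycles = 1 + l1_miss_rate * l1_miss_penalty
    return amat_cycles, amat_cycles * l1_hit_ns


cycles, ns = two_level_amat(0.66, 0.08, 5.62, 0.95)
print(f"P1 with L2: {cycles:.2f} cycles = {ns:.2f} ns")
```

With a local miss rate this high, almost every L1 miss pays both the L2 lookup and the main memory access, so the L2 cache adds latency rather than hiding it.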
(e)
CPI_stall = <CPI_ideal> + <average memory-stall cycles>
P1: 1.0+0.36*0.95*12=5.104 clock cycles
5.104*5.62=28.684 ns
(f)
P2 is faster