Question
Memory-Latency Wall [15 marks]
Killer micros mitigate memory latency by having a multilevel cache reduce the latency to a couple of cycles, and then use pipeline parallelism (from ILP) to mitigate the rest. This strategy would collapse if the cache left any more than a few cycles of latency. Why? Because killer micros have ridiculously little pipeline parallelism.
Let us calculate the average time 'tav' to complete a memory reference measured in processor cycles. This will show the remaining latency. Let 't_c' and 't_m' be the D-cache and DRAM access times, and let 'P' be the probability of a D-cache hit. The cache line is 1-word long. We have:
tav = P * t_c + (1 - P) * (t_m + t_c)   [in seconds]
    = t_c + (1 - P) * t_m               [in seconds]
    = 1 + (1 - P) * t_m                 [in cycles]
In line 3, we have assumed that 't_c' is always one cycle.
Assume a D-cache with a miss rate of 1%. Assume the DRAM latency decreases by a factor of 1.03 every year, and the processor clock cycle decreases by a factor of 1.75 every year. If a memory reference has a 200-cycle latency today, how many cycles will 'tav' be after 6 years? after 8?
Explanation / Answer
In general, latency (TL) is the clock cycle time (CCT) multiplied by the number of clock cycles (NCC), i.e.
TL = CCT * NCC
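The relation TL = CCT * NCC can be rearranged to NCC = TL / CCT, which is what converts a latency in seconds into processor cycles. A minimal Python sketch of that conversion follows. The function name latency_in_cycles and the 200 ns / 1 ns figures are illustrative assumptions, not values from the question; only the yearly factors 1.03 and 1.75 are taken from it.

```python
def latency_in_cycles(latency_s: float, cycle_time_s: float) -> float:
    """Rearranged TL = CCT * NCC: number of clock cycles NCC = TL / CCT."""
    return latency_s / cycle_time_s


# Illustrative (assumed) figures: a 200 ns access with a 1 ns clock cycle.
ncc_today = latency_in_cycles(200e-9, 1e-9)

# One year on, using the question's rates: the latency in seconds shrinks by
# a factor of 1.03 while the cycle time shrinks by a factor of 1.75.
ncc_next_year = latency_in_cycles(200e-9 / 1.03, 1e-9 / 1.75)

print(round(ncc_today, 2))      # cycles today
print(round(ncc_next_year, 2))  # cycles after one year
```

Because the cycle time shrinks faster than the DRAM latency, the same access costs more cycles each year, by a factor of 1.75 / 1.03 (about 1.70).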
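With the given data, the key point is that both rates apply to times measured in seconds: the DRAM access time t_m shrinks by a factor of 1.03 every year, and the processor clock cycle time shrinks by a factor of 1.75 every year. Neither factor is a negative number, and 1.75 is not the value of t_c.
Since the D-cache miss rate is 1%, i.e. 0.01, the hit probability is P = 1 - 0.01 = 0.99.
Since t_c is always one cycle, the formula in cycles is
tav = 1 + (1 - P) * t_m
Today t_m = 200 cycles, so tav = 1 + 0.01 * 200 = 3 cycles.
From NCC = TL / CCT, t_m measured in cycles grows by a factor of 1.75 / 1.03 = 1.699 (approx.) every year, so after n years t_m = 200 * (1.75 / 1.03)^n cycles.
After 6 years, 1.75^6 = 28.7229 and 1.03^6 = 1.19405, so t_m = 200 * 28.7229 / 1.19405 = 4811 cycles (approx.)
Therefore, required tav (after 6 years) = 1 + 0.01 * 4811 = 49.11 cycles
Similarly, after 8 years, 1.75^8 = 87.9639 and 1.03^8 = 1.26677, so t_m = 200 * 87.9639 / 1.26677 = 13888 cycles (approx.)
Therefore, required tav (after 8 years) = 1 + 0.01 * 13888 = 139.88 cycles
So even with a 99% hit rate, the average memory reference grows from 3 cycles today to about 49 cycles after 6 years and about 140 cycles after 8 years. That is far more than the few cycles of latency that pipeline parallelism can hide, which is why this strategy collapses.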
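The calculation the question asks for can be sketched in Python as follows, using the question's formula tav = 1 + (1 - P) * t_m with t_m expressed in cycles. The function names and default arguments are my own; the sketch assumes that the 200-cycle figure is the DRAM latency t_m today and that both yearly factors compound.

```python
def dram_latency_cycles(years, cycles_today=200.0, dram_factor=1.03, clock_factor=1.75):
    """DRAM latency in cycles after `years`, assuming both factors compound yearly."""
    return cycles_today * (clock_factor / dram_factor) ** years


def average_access_cycles(years, miss_rate=0.01):
    """tav = 1 + (1 - P) * t_m in cycles, with a one-cycle D-cache hit (t_c = 1)."""
    return 1.0 + miss_rate * dram_latency_cycles(years)


# Columns: years from now, t_m in cycles, tav in cycles.
for years in (0, 6, 8):
    print(years, round(dram_latency_cycles(years), 1), round(average_access_cycles(years), 2))
```

Under these assumptions the loop reports tav of about 3 cycles today, 49.11 cycles after 6 years, and 139.88 cycles after 8 years.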