Assume that each of the steps in instruction execution takes the amount of time

Question

Assume that each of the steps in instruction execution takes the amount of time specified in the
list below. (16 pts)
Fetch   300 ps
Decode 100 ps
Execute 200 ps
Memory 350 ps
Writeback 100 ps

Assume that the latency of pipeline registers is 50 ps.

a. Given the times for the datapath stages listed above, what is the clock frequency for the
non-pipelined and pipelined datapaths (assume pipelined datapath uses a common
clock whose frequency is determined by the slowest stage) (4 pts).

b. Assuming no hazards or stalls, what is the ideal pipeline speedup? (3 pts)

c. If we can extend the pipelined datapath to 10 stages, how would you allocate the
additional pipeline stages? Justify your answer. (5 pts)

d. In an ideal situation, calculate the achievable speedup of the 10-stage pipelined data-
path against the classic 5-stage pipelined data-path (assume constant clock time per
stage). (4 pts)

Explanation / Answer

The question has four parts; each is answered in turn below.


a)

For the non-pipelined datapath, the time per instruction is the sum of all the stage times:

300 + 100 + 200 + 350 + 100 = 1050 ps

The clock frequency is 1 / period = 1 / 1050 ps ≈ 952,380,000 Hz, or about 952 MHz.

For the pipelined datapath, the clock period is set by the slowest stage plus the pipeline register latency:

350 + 50 = 400 ps

The clock frequency is 1 / 400 ps = 2,500,000,000 Hz, or 2.5 GHz.
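
The arithmetic in part a can be reproduced with a short Python sketch. The names `stage_ps`, `REGISTER_PS`, and `frequency_hz` are illustrative and not part of the question; periods are in picoseconds, so the frequency in hertz is 1e12 divided by the period.

```python
# Stage latencies from the question, in picoseconds.
stage_ps = {"Fetch": 300, "Decode": 100, "Execute": 200, "Memory": 350, "Writeback": 100}
REGISTER_PS = 50  # pipeline register latency from the question


def frequency_hz(period_ps):
    """Convert a clock period in picoseconds to a frequency in hertz."""
    return 1e12 / period_ps


# Non-pipelined: one long cycle covering every stage.
non_pipelined_period = sum(stage_ps.values())
# Pipelined: the slowest stage plus the register latency sets the clock.
pipelined_period = max(stage_ps.values()) + REGISTER_PS

print(f"non-pipelined: {non_pipelined_period} ps, "
      f"{frequency_hz(non_pipelined_period) / 1e6:.1f} MHz")
print(f"pipelined: {pipelined_period} ps, "
      f"{frequency_hz(pipelined_period) / 1e9:.2f} GHz")
```

This should print a period of 1050 ps (about 952.4 MHz) for the non-pipelined case and 400 ps (2.50 GHz) for the pipelined case.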

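b)

The ideal speedup is the time per instruction on the non-pipelined datapath divided by the time per instruction on the pipelined datapath.

With perfectly balanced stages and no register overhead, each stage would take 1050 / 5 = 210 ps, so the ideal speedup equals the number of stages: 1050 / 210 = 5.

With the stage times given and the 50 ps register latency, the clock period is 400 ps, so the speedup without hazards or stalls is 1050 / 400 = 2.625.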
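
As a hedged check of part b: 1050 / 5 = 210 ps is the per-stage time of a perfectly balanced pipeline, and the speedup is the ratio of the two instruction times. The sketch below computes both the textbook ideal (balanced stages, no register overhead) and the figure implied by the stage times actually given. Variable names are illustrative only.

```python
stage_ps = [300, 100, 200, 350, 100]  # Fetch, Decode, Execute, Memory, Writeback
REGISTER_PS = 50                      # pipeline register latency

unpipelined_ps = sum(stage_ps)  # time per instruction without pipelining

# Textbook ideal: perfectly balanced stages and no register overhead.
balanced_stage_ps = unpipelined_ps / len(stage_ps)
ideal_speedup = unpipelined_ps / balanced_stage_ps  # equals the stage count

# With the given (unbalanced) stages and the register latency.
actual_cycle_ps = max(stage_ps) + REGISTER_PS
actual_speedup = unpipelined_ps / actual_cycle_ps

print(balanced_stage_ps, ideal_speedup)  # 210.0 5.0
print(actual_cycle_ps, actual_speedup)   # 400 2.625
```

Which number the grader expects depends on how "ideal" is read; the gap between 5 and 2.625 comes from the unbalanced Memory stage and the register overhead.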

c)

Ideally I would split the most time-consuming stages so that the clock period shrinks and throughput increases.

Going from 5 to 10 stages doubles the pipeline length and gives 5 extra stages.

Fetch: split into 3 stages of 100, 100, and 100 ps.

Execute: split into 2 stages of 100 and 100 ps.

Memory: split into 3 stages of 116, 116, and 118 ps.

So the pipeline will be:

Fetch 1 - 100 ps

Fetch 2 - 100 ps

Fetch 3 - 100 ps

Decode - 100 ps

Execute 1 - 100 ps

Execute 2 - 100 ps

Memory 1 - 116 ps

Memory 2 - 116 ps

Memory 3 - 118 ps

Writeback - 100 ps
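
One way to justify this allocation is a greedy rule: hand each extra stage to whichever original stage currently has the slowest sub-stage. The sketch below is only an illustration of that rule, and it assumes each stage can be split into equal pieces, which real logic rarely allows. The names `splits` and `TARGET_STAGES` are made up for the example.

```python
stage_ps = {"Fetch": 300, "Decode": 100, "Execute": 200, "Memory": 350, "Writeback": 100}
REGISTER_PS = 50    # pipeline register latency
TARGET_STAGES = 10  # total stages after extending the pipeline

# Start with every original stage unsplit.
splits = {name: 1 for name in stage_ps}

# Give each extra stage to the unit whose sub-stages are currently slowest.
for _ in range(TARGET_STAGES - len(stage_ps)):
    slowest = max(stage_ps, key=lambda name: stage_ps[name] / splits[name])
    splits[slowest] += 1

# Assumption: even splits, so Memory becomes 3 x 116.67 ps.
slowest_substage_ps = max(stage_ps[name] / splits[name] for name in stage_ps)
cycle_ps = slowest_substage_ps + REGISTER_PS

print(splits)
print(round(cycle_ps, 1))
```

The greedy rule arrives at the same split as above (Fetch 3, Decode 1, Execute 2, Memory 3, Writeback 1). With even splits the clock period is about 166.7 ps; with the whole-picosecond split of 116/116/118 ps it is 118 + 50 = 168 ps.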

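d)

With the same 1050 ps of work divided evenly, the ideal time per stage in the 10-stage pipeline is 1050 / 10 = 105 ps, compared with 1050 / 5 = 210 ps for the 5-stage pipeline.

So, ideally, the 10-stage pipeline is 210 / 105 = 2x faster than the 5-stage pipeline.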
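
The 2x figure assumes perfectly balanced stages. As a rough cross-check, the sketch below also compares clock periods using the stage allocation from part c and the 50 ps register latency; that second comparison is an extra illustration, not something the question asks for, and the variable names are invented for the example.

```python
REGISTER_PS = 50  # pipeline register latency

five_stage_ps = [300, 100, 200, 350, 100]
# Allocation from part c: Fetch x3, Decode, Execute x2, Memory x3, Writeback.
ten_stage_ps = [100, 100, 100, 100, 100, 100, 116, 116, 118, 100]

total_ps = sum(five_stage_ps)
assert sum(ten_stage_ps) == total_ps  # splitting must not change total work

# Ideal case: work divided evenly, no register overhead.
ideal_ratio = (total_ps / len(five_stage_ps)) / (total_ps / len(ten_stage_ps))

# With the actual slowest stage and the register latency in each design.
cycle5_ps = max(five_stage_ps) + REGISTER_PS
cycle10_ps = max(ten_stage_ps) + REGISTER_PS
clock_ratio = cycle5_ps / cycle10_ps

print(ideal_ratio)                                  # 2.0
print(cycle5_ps, cycle10_ps, round(clock_ratio, 2)) # 400 168 2.38
```

Under these assumptions the clock-period ratio (about 2.38) comes out above the ideal 2x, because the original 5-stage design was limited by its unbalanced 350 ps Memory stage.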