
A RISC processor is driven by a 2-GHz clock. Instructions are executed in five-s


Question

A RISC processor is driven by a 2-GHz clock. Instructions are executed in a five-stage pipeline. Instruction statistics in a large program are as follows:

Branch: 10%
Load: 40%
Store: 20%
Computational instructions: 30%

Please answer the following questions.

(a) Assume all memory access operations are cache hits. What is the ideal instruction throughput?

(b) Assume there are an instruction cache and a data cache. For both caches, it takes 4 cycles to complete the memory access if it is a cache miss. 80% of instruction fetches are cache hits; 20% are cache misses. For all Load instructions, the data to be accessed are in the data cache, but 30% of the Load instructions are followed by a dependent instruction, which will stall the pipeline for one cycle. 40% of the Store instructions store data into the data cache, while 60% of them store data into the main memory. What is the instruction throughput?

(c) Assume all memory accesses are cache hits. 40% of the branch instructions are unconditional, while 60% are conditional. 80% of the conditional branches are taken; 20% are not taken. The penalty for taking the branch is one cycle. What is the instruction throughput?

Explanation / Answer

(a) Scenario 1:

     Throughput is at its maximum in this scenario. Because every instruction fetch and every data access hits in the cache, no stall cycles are added, and the five-stage pipeline completes one instruction per cycle once it is full (CPI = 1). At 2 GHz the ideal throughput is therefore 2 × 10^9 instructions per second (2000 MIPS).
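To see why "one instruction per cycle" is the right ideal for a large program, recall that a stall-free k-stage pipeline needs k + N - 1 cycles to finish N instructions. The sketch below is only an illustration; the function names are made up for this example and the 2-GHz clock and five stages come from the question.

```python
def pipeline_cycles(num_instructions: int, stages: int = 5) -> int:
    """Cycles for a stall-free k-stage pipeline to finish N instructions."""
    # The first instruction needs `stages` cycles; each later one completes
    # one cycle after its predecessor.
    return stages + num_instructions - 1


def throughput_ips(num_instructions: int, clock_hz: float = 2e9, stages: int = 5) -> float:
    """Average instructions per second over the whole run."""
    return num_instructions * clock_hz / pipeline_cycles(num_instructions, stages)


if __name__ == "__main__":
    # The pipeline-fill overhead fades as the program gets longer.
    for n in (5, 1_000, 1_000_000):
        print(f"N={n}: {throughput_ips(n) / 1e6:.1f} MIPS")
```

For a long program the fill time of four extra cycles is negligible, so the average approaches the 2 × 10^9 instructions per second set by the clock.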

(b) Scenario 2:

     Throughput is the lowest of the three scenarios, because three sources of stalls add up. Assume that a miss turns a 1-cycle access into a 4-cycle access (3 stall cycles) and that a store to main memory behaves like a data-cache miss. Instruction-fetch misses then add 0.20 × 3 = 0.60 cycles per instruction, load-use stalls add 0.40 × 0.30 × 1 = 0.12, and stores to main memory add 0.20 × 0.60 × 3 = 0.36. The average CPI is 1 + 0.60 + 0.12 + 0.36 = 2.08, so throughput falls to 2 × 10^9 / 2.08 ≈ 0.96 × 10^9 instructions per second (about 962 MIPS), less than half of the ideal. If the 4 cycles are instead counted entirely as stall cycles, the CPI becomes 2.40 and throughput is about 833 MIPS.
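The stall accounting for this scenario can be checked with a short script. This is a sketch under stated assumptions, not the only valid reading of the question: it treats a store to main memory as a data-cache miss, and it takes the number of stall cycles per miss as a parameter (3 if the 4-cycle access replaces a 1-cycle hit, 4 if all four cycles are extra). The names are hypothetical.

```python
CLOCK_HZ = 2e9  # 2-GHz clock from the question

# Instruction mix from the question
MIX = {"branch": 0.10, "load": 0.40, "store": 0.20, "compute": 0.30}


def cpi_with_memory_stalls(miss_stall_cycles: int) -> float:
    """Average cycles per instruction for scenario (b)."""
    # Every instruction is fetched; 20% of fetches miss the instruction cache.
    fetch_stalls = 0.20 * miss_stall_cycles
    # 30% of loads are followed by a dependent instruction: 1 stall cycle each.
    load_use_stalls = MIX["load"] * 0.30 * 1
    # ASSUMPTION: the 60% of stores that go to main memory cost a miss each.
    store_stalls = MIX["store"] * 0.60 * miss_stall_cycles
    return 1.0 + fetch_stalls + load_use_stalls + store_stalls


if __name__ == "__main__":
    for stall in (3, 4):
        cpi = cpi_with_memory_stalls(stall)
        print(f"stall={stall}: CPI={cpi:.2f}, {CLOCK_HZ / cpi / 1e6:.1f} MIPS")
```

Under either reading of the miss cost, the instruction-fetch misses dominate, since they apply to every instruction rather than to one instruction class.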

(c) Scenario 3:

     Throughput is close to the ideal. Because every memory access hits in the cache, the only stalls come from taken branches, which cost one cycle each. Unconditional branches are always taken, so the taken fraction of all branches is 0.40 + 0.60 × 0.80 = 0.88 (assuming unconditional branches pay the same one-cycle penalty). Branches make up 10% of instructions, so CPI = 1 + 0.10 × 0.88 × 1 = 1.088 and throughput is 2 × 10^9 / 1.088 ≈ 1.84 × 10^9 instructions per second (about 1838 MIPS). Note that the high taken rate lowers throughput rather than raising it, since each taken branch adds a stall cycle; the loss stays small only because branches are a small share of the instruction mix.
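The branch penalty can be worked out numerically as follows. This is a minimal sketch with hypothetical names; whether unconditional branches pay the one-cycle taken penalty is an assumption, so it is exposed as a flag and both readings are printed.

```python
CLOCK_HZ = 2e9                  # 2-GHz clock from the question
BRANCH_FRACTION = 0.10          # branches are 10% of all instructions
UNCONDITIONAL = 0.40            # share of branches that are unconditional
CONDITIONAL = 0.60              # share of branches that are conditional
TAKEN_GIVEN_CONDITIONAL = 0.80  # share of conditional branches that are taken
TAKEN_PENALTY = 1               # stall cycles per taken branch


def branch_cpi(unconditional_pays_penalty: bool = True) -> float:
    """Average cycles per instruction for scenario (c)."""
    taken = CONDITIONAL * TAKEN_GIVEN_CONDITIONAL
    if unconditional_pays_penalty:
        # ASSUMPTION: an unconditional branch is always taken and stalls too.
        taken += UNCONDITIONAL
    return 1.0 + BRANCH_FRACTION * taken * TAKEN_PENALTY


if __name__ == "__main__":
    for flag in (True, False):
        cpi = branch_cpi(flag)
        print(f"unconditional pays penalty={flag}: "
              f"CPI={cpi:.3f}, {CLOCK_HZ / cpi / 1e6:.1f} MIPS")
```

Either way the CPI stays within about 9% of the ideal value of 1, which is why this scenario sits between the other two.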