
Question

1. Increasing cache associativity (a) reduces compulsory misses (b) reduces miss penalty (c) reduces hit time (d) reduces conflict misses (e) all of the above (f) none of the above.

2. The effectiveness of critical word first and early restart is greater when (a) cache block (line/row) sizes are small (b) miss penalties are small (c) the likelihood to access blocks near each other is small (d) all of the above (e) none of the above.

3. A measure for the throughput metric used to evaluate computer system performance is (a) CPI (b) MIPS (c) FLOPS (d) TPM (e) all of the above (f) none of the above.

4. If the clock cycle time of a single-cycle unpipelined processor is C, and the clock cycle time of an n-stage pipeline is C/n, then the ideal speedup of the pipelined processor is approximately (a) n/C (b) (C/n)/C (c) C/n (d) n (e) n-1.

5. In reality, pipelining cannot achieve an ideal speedup because of (a) data dependencies (b) hazards (c) faults (d) interrupts (e) all of the above (f) none of the above.

Explanation / Answer

Ans1. ==> (d) reduces conflict misses
Higher associativity gives a block more candidate locations within its set, so it reduces conflict misses and hence the miss rate.
It does not reduce compulsory misses (the first access to a block always misses) and does not reduce the miss penalty.
It also tends to increase, not reduce, hit time: the extra tag comparators and way multiplexing lengthen the cache access path, so a highly associative cache has a slower cycle time than a direct-mapped cache of the same size.
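To make the conflict-miss point concrete, here is a minimal sketch (not any real simulator; the cache geometry, block size, and address trace are invented for illustration). A trace that ping-pongs between two addresses mapping to the same set misses on every access in a direct-mapped cache, but fits comfortably in a 2-way set-associative cache of the same total size:

```python
from collections import OrderedDict

def count_misses(trace, num_sets, ways, block_bytes=64):
    # One LRU-ordered tag store per set.
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in trace:
        block = addr // block_bytes
        idx, tag = block % num_sets, block // num_sets
        s = sets[idx]
        if tag in s:
            s.move_to_end(tag)           # refresh LRU order on a hit
        else:
            misses += 1
            if len(s) == ways:
                s.popitem(last=False)    # evict the least recently used tag
            s[tag] = True
    return misses

# Two addresses 32 KiB apart map to the same set in both 32 KiB configs below.
trace = [0x0000, 0x8000] * 100
print(count_misses(trace, num_sets=512, ways=1))  # direct-mapped: 200 misses
print(count_misses(trace, num_sets=256, ways=2))  # 2-way: only 2 compulsory misses
```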

Ans2. ==> (c) the likelihood to access blocks near each other is small
Both techniques are based on the observation that the CPU normally needs just one word of the block at a time, and on impatience: don't wait for the full block to be loaded before sending the requested word and restarting the CPU.

The two specific strategies are:
Critical word first — Request the missed word first from memory and send it to the CPU as soon as it arrives;
let the CPU continue execution while the rest of the words in the block are filled in.
Critical-word-first fetch is also called wrapped fetch and requested word first.

Early restart — Fetch the words in normal order, but as soon as the requested word of the block arrives,
send it to the CPU and let the CPU continue execution.
Generally these techniques benefit only designs with large cache blocks, since with a small block there is almost no fill time left to hide (see the timing sketch after the quoted notes).

> Generally useful only for large blocks.
> Spatial locality is a problem: the CPU tends to want the next sequential
word, so it is not clear whether early restart buys anything.
> The benefits of this approach depend on the size of the
cache block and the likelihood of another access to the
portion of the block that has not yet been fetched.
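To see why the benefit grows with block size, here is a back-of-the-envelope timing sketch (the latency numbers are assumptions invented for illustration, not values from the question):

```python
FIRST_WORD = 100  # cycles until the first word arrives from memory (assumed)
PER_WORD = 10     # cycles per additional word transferred (assumed)

def cycles_until_requested_word(word_index, critical_word_first):
    if critical_word_first:
        # The missed word is requested first, so it always arrives first.
        return FIRST_WORD
    # Plain fetch with early restart: words arrive in order 0, 1, 2, ...,
    # so the CPU still waits through word_index extra transfers.
    return FIRST_WORD + word_index * PER_WORD

for words_per_block in (4, 16):
    worst = words_per_block - 1  # worst case: CPU wanted the block's last word
    plain = cycles_until_requested_word(worst, critical_word_first=False)
    cwf = cycles_until_requested_word(worst, critical_word_first=True)
    print(words_per_block, plain, cwf)
# 4-word block: 130 vs 100 cycles (saves 30); 16-word block: 250 vs 100 (saves 150).
```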

Ans3. ==> (d) TPM
Throughput is the number of transactions completed per unit of time.
CPI (cycles per instruction) is a function of the program, compiler, ISA, and micro-architecture; it measures execution efficiency, not work completed.
MIPS (millions of instructions per second) = instruction count / (execution time × 10^6) = clock rate / (CPI × 10^6); it counts instructions, not useful work.
FLOPS, or MFLOPS (millions of floating-point operations per second), = floating-point operations / (execution time × 10^6).
TPM (transactions per minute), like TPS (transactions per second), directly measures throughput.
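For contrast, a small sketch computing all four metrics for one workload (every number below is an invented assumption, used only to show what each metric measures):

```python
instructions = 4_000_000_000   # dynamic instruction count (assumed)
flops        = 1_500_000_000   # floating-point operations executed (assumed)
exec_time_s  = 2.0             # execution time in seconds (assumed)
clock_hz     = 2_000_000_000   # 2 GHz clock (assumed)
transactions = 12_000          # transactions completed in exec_time_s (assumed)

cpi    = clock_hz * exec_time_s / instructions   # cycles per instruction
mips   = instructions / (exec_time_s * 1e6)      # equals clock_hz / (cpi * 1e6)
mflops = flops / (exec_time_s * 1e6)
tpm    = transactions / (exec_time_s / 60.0)     # completed work per minute

print(f"CPI={cpi:.2f}  MIPS={mips:.0f}  MFLOPS={mflops:.0f}  TPM={tpm:.0f}")
# Only TPM counts completed units of work per unit time, which is what a
# throughput metric for whole-system performance measures.
```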

Ans4. ==> (d) n
The theoretical speedup offered by a pipeline can be determined
as follows:
• Let k be the total number of stages and tp be the time per stage.
• Each instruction represents a task, T, in the pipeline; let n be
the total number of tasks.
• The first task (instruction) requires k × tp time to complete in a
k-stage pipeline.
• The remaining (n - 1) tasks emerge from the pipeline one per
cycle, so the total time to complete them is (n - 1) × tp.
• Thus, to complete n tasks using a k-stage pipeline requires:
(k × tp) + (n - 1) × tp = (k + n - 1) × tp
If tn is the time to complete one task without a pipeline, then dividing the
unpipelined time for n tasks by the pipelined time gives:
speedup s = (n × tn) / ((k + n - 1) × tp)
Since an unpipelined task passes through all k stages back to back, tn = k × tp, so:
speedup s = (n × k × tp) / ((k + n - 1) × tp)
If we take the limit as n approaches infinity, (k + n - 1) approaches n,
which results in a theoretical speedup of:
speedup s = (n × k × tp) / (n × tp) = k
In the question's terms, the unpipelined clock cycle time is C and the n-stage
pipeline's clock cycle time is C/n, so the ideal speedup is:
speedup s = C / (C/n) = n
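A quick numeric check of the formula derived above, s = (n × k) / (k + n - 1), showing the speedup approaching the stage count k as the number of tasks n grows (k = 5 is just an example value):

```python
def speedup(n_tasks, k_stages):
    # s = n*k / (k + n - 1), from the derivation above (tp cancels out).
    return n_tasks * k_stages / (k_stages + n_tasks - 1)

for n in (1, 5, 100, 10_000):
    print(n, round(speedup(n, k_stages=5), 3))
# 1 -> 1.0, 5 -> 2.778, 100 -> 4.808, 10000 -> 4.998: tends to k = 5
```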

Ans5. ==> (b) hazards
Hazards prevent the next instruction in the instruction stream from executing during its designated clock cycle,
and thereby reduce the performance below the ideal speedup gained by pipelining. There are three classes:
1. Structural hazards, which arise from resource conflicts when the hardware cannot support all possible combinations of instructions in simultaneous overlapped execution.
2. Data hazards, which arise when an instruction depends on the result of an earlier instruction still in the pipeline.
3. Control hazards, which arise from branches and other instructions that change the program counter.
(Data dependencies, option (a), slow the pipeline only when they turn into data hazards, so (b) is the more complete answer.)
A cache miss, by contrast, stalls all the instructions in the pipeline, both before and after the instruction causing the miss.
Eliminating a hazard often requires that some instructions in the pipeline be allowed to proceed while others are delayed.
When an instruction is stalled, all instructions issued later than the stalled instruction are also stalled, while
instructions issued earlier than the stalled instruction must continue, since otherwise the hazard will never clear.
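As one concrete illustration of a data hazard, here is a toy stall counter for an in-order pipeline (the "5-stage, no forwarding" result latencies are assumptions made up for this sketch, not a definitive pipeline model):

```python
def count_stall_cycles(program):
    # program: list of (dest_reg, src_regs, is_load) tuples.
    ready = {}   # register -> earliest cycle its value can be read (toy model)
    cycle = 1    # cycle in which the current instruction reads its operands
    stalls = 0
    for dest, srcs, is_load in program:
        need = max((ready.get(r, 0) for r in srcs), default=0)
        wait = max(0, need - cycle)   # bubbles inserted before this instruction
        stalls += wait
        cycle += wait
        # Without forwarding, a dependent read must wait until after writeback;
        # a load's value is ready one stage later than an ALU result (assumed).
        ready[dest] = cycle + (3 if is_load else 2)
        cycle += 1
    return stalls

# lw r1, 0(r2) followed by add r3, r1, r4: the add depends on the loaded r1.
prog = [("r1", ["r2"], True), ("r3", ["r1", "r4"], False)]
print(count_stall_cycles(prog))  # 2 stall cycles under these assumptions
```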