Can someone explain this loop unrolling?
Question
Can someone explain this loop unrolling?
For the first four LF instructions, how do I know to use F6, F10, and F14, and how do I know to put -8, -16, and -24 in front of (R1)?
Please explain the unrolled loop line by line.
Loop: LF F0, 0(R1)
ADDF F4, F0, F2
SUB R1, R1, #8
BNEZ R1, Loop
SF 8(R1), F4
Unrolled loop:
Loop: LF F0, 0(R1)
LF F6, -8(R1)
LF F10, -16(R1)
LF F14, -24(R1)
ADDF F4, F0, F2
ADDF F8, F6, F2
ADDF F12, F10, F2
ADDF F16, F14, F2
SF 0(R1), F4
SF -8(R1), F8
SF -16(R1), F12
SUB R1, R1, #32
BNEZ R1, Loop
SF 8(R1), F16
Explanation / Answer
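Loop Unrolling:
Loop: LF F0, 0(R1) ;F0 = element 1, at address R1
LF F6, -8(R1) ;F6 = element 2, 8 bytes below element 1
LF F10, -16(R1) ;F10 = element 3, 16 bytes below element 1
LF F14, -24(R1) ;F14 = element 4, 24 bytes below element 1
ADDF F4, F0, F2 ;F4 = element 1 + scalar in F2
ADDF F8, F6, F2 ;F8 = element 2 + scalar in F2
ADDF F12, F10, F2 ;F12 = element 3 + scalar in F2
ADDF F16, F14, F2 ;F16 = element 4 + scalar in F2
SF 0(R1), F4 ;store result 1 over element 1
SF -8(R1), F8 ;store result 2 over element 2
SF -16(R1), F12 ;store result 3 over element 3
SUB R1, R1, #32 ;move the pointer down 4 elements (4 x 8 bytes)
BNEZ R1, Loop ;repeat while R1 != 0
SF 8(R1), F16 ;branch delay slot: store result 4; R1 is already 32 lower, so 8 - 32 = -24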
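If you unroll by simply copying the loop body four times, each copy keeps its own SUB R1, R1, #8. Those SUB instructions form a dependent chain with the LF and SF instructions: every load and store needs the value of R1 produced by the SUB before it. This forces the copies to execute in order and keeps the extra SUB instructions in the loop, which increases the instruction count. The compiler removes this dependence by symbolically computing the intermediate values of R1 (R1 - 8, R1 - 16, R1 - 24), folding them into the offsets of the LF and SF instructions, and replacing the four SUBs with a single decrement by 32. That is where the offsets -8, -16, and -24 come from.
The registers F6/F8, F10/F12, and F14/F16 are not special. They are just otherwise unused registers, one load/result pair per copy of the body, chosen so that the four copies do not overwrite each other's values and can be reordered freely.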
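To see where the offsets come from, it may help to look at a naive unrolling in which every copy of the body still has its own SUB. The listing below is only a sketch under a few assumptions that are not in the original post: each store is written directly after its add (the unscheduled form, so it uses 0(R1) instead of 8(R1)), and the branch delay slot is filled with a NOP to keep the listing simple.

```
Loop: LF   F0, 0(R1)      ; copy 1: load element at R1
      ADDF F4, F0, F2     ; add the scalar in F2
      SF   0(R1), F4      ; store the result back
      SUB  R1, R1, #8     ; R1 = old R1 - 8
      LF   F6, 0(R1)      ; copy 2: same address as -8(old R1)
      ADDF F8, F6, F2
      SF   0(R1), F8
      SUB  R1, R1, #8     ; R1 = old R1 - 16
      LF   F10, 0(R1)     ; copy 3: same address as -16(old R1)
      ADDF F12, F10, F2
      SF   0(R1), F12
      SUB  R1, R1, #8     ; R1 = old R1 - 24
      LF   F14, 0(R1)     ; copy 4: same address as -24(old R1)
      ADDF F16, F14, F2
      SF   0(R1), F16
      SUB  R1, R1, #8     ; R1 = old R1 - 32
      BNEZ R1, Loop       ; repeat while R1 != 0
      NOP                 ; delay slot left empty in this sketch (assumption)
```

Replacing each 0(R1) with its offset from the R1 value at the top of the loop (0, -8, -16, -24) makes the first three SUBs unnecessary, and the remaining one becomes SUB R1, R1, #32. Once the loads, adds, and stores no longer depend on the intermediate SUBs, they can be grouped as in the unrolled listing above, with the last store moved into the delay slot as SF 8(R1), F16.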