Academic Integrity: tutoring, explanations, and feedback — we don’t complete graded work or submit on a student’s behalf.

Can some one jusify this formula on image. I mean can you tell us how they get t

ID: 3745297 • Letter: C

Question

Can some one jusify this formula on image. I mean can you tell us how they get this formula?

External Merge Sort (Cont.) Cost analysis: 1 block per run leads to too many seeks during merge Instead use b, buffer blocks per run > read/write b, blocks at a time Can mergeLM/b1 runs in one pass Total number of merge passes required: 109.Mb0ufbM) | Block transfers for initial run creation as well as in each pass is 2b, for final pass, we don' t count write cost we ignore final write cost for all operations since the output of an operation may be sent to the parent operation without being written to disk Thus total number of block transfers for external sorting: - Seeks: next slide

Explanation / Answer

The main purpose of external merge sort is to use external memory while using the merge sort algorithm which maximizes the number of disk accesses and performance.

The explanation for the derivation of formula (Rough):

- Given: There are b blocks/files with M elements each, which are to be sorted externally.

- We'll take bb number of blocks for sorting in a single run (pass) of merge sort algorithm.

- For M blocks, the total number of merges in a single pass/run floor(M/bb) or ceil(M/bb). Take (M/bb)-1 for a simplified view (It merely depends on values of M and bb).

- The number of merge passes required will be log(merges in a single run)(Total number of runs required) where merges in a single run = (M/bb) - 1, and Total number of runs required is bb/M (Calculated as for 1 file it will take 1/M elements, similarly for br files it will take br/M).

- Block transfers for one run/pass will be twice of log(merges in a single run)(Total number of runs required). One for transferring blocks to external memory and one for taking it back.

- Total number of block transfers will therefore equal to 2*(log(merges in a single run)(Total number of runs required)) + 1. Plus one is because for either the case of the first or last block (depending upon from where we start merging blocks) we'll have to transfer only one block for that.

- Hence, the cost formula can be extended for br block size is given as above.