Question
Consider the condition join R1 ⋈R1.a=R2.b R2, given the following information about the relations to be joined. The cost metric is the number of IOs. The cost of writing the result would be the same regardless of the join method used, so we can ignore it. Given:
• R1 has 10,000 tuples, 10 tuples per block
• R2 has 2,000 tuples, 10 tuples per block
• There are 52 available memory buffers
Assume we use a sort-merge join, specifically the "Efficient Sort-Merge" algorithm covered in class, where the last sorting pass is merged with the join.
a. What is the cost of the join algorithm?
b. What is the minimum number of buffers needed for the cost to remain unchanged? That is, can we use fewer than 52 buffers and still achieve the cost calculated in part a?
Explanation / Answer
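Before the supporting lecture material, here is a numeric sketch of parts (a) and (b). It assumes the "efficient sort-merge" cost model from the notes below: one run-creation pass over each relation (read + write, 2 IOs per block) plus one combined merge/join pass (1 IO per block), i.e. 3*(b(R1) + b(R2)) IOs, valid as long as all initial runs can be merged in a single pass. I also assume one buffer is reserved for merge output (runs ≤ m - 1); this convention may differ from the class's exact rule, and the function names are mine.

```python
import math

def efficient_sort_merge_cost(b_r, b_s, m):
    """IO cost of the efficient sort-merge join (output cost ignored).

    Pass 1 reads and writes each relation once to create sorted runs
    of m blocks each; the final pass merges all runs of both relations
    while joining, so the total is 3*(b_r + b_s). This is only valid
    if all initial runs fit into a single merge pass.
    """
    runs = math.ceil(b_r / m) + math.ceil(b_s / m)
    assert runs <= m - 1, "initial runs do not fit into one merge pass"
    return 3 * (b_r + b_s)

def min_buffers_same_cost(b_r, b_s):
    """Smallest m that still allows the single merge pass (part b)."""
    m = 3  # at least one input buffer per relation plus one output buffer
    while math.ceil(b_r / m) + math.ceil(b_s / m) > m - 1:
        m += 1
    return m

b_r1 = 10_000 // 10   # R1: 10,000 tuples / 10 per block = 1,000 blocks
b_r2 = 2_000 // 10    # R2:  2,000 tuples / 10 per block =   200 blocks
print(efficient_sort_merge_cost(b_r1, b_r2, 52))  # part a: 3600
print(min_buffers_same_cost(b_r1, b_r2))          # part b: 36
```

With 52 buffers, R1 yields 20 runs and R2 yields 4, well under 51, so the cost is 3*(1,000 + 200) = 3,600 IOs. Under the reserve-one-output-buffer convention, any m ≥ 36 keeps the single merge pass (28 + 6 = 34 runs ≤ 35); if the class instead allows all m buffers for input runs, the minimum drops to 35.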
Sort-Merge Join
• Sort both relations on the join attribute(s), then merge the two sorted relations.
• Caution if duplicates exist:
  – The result size is still |R|*|S| in the worst case.
  – If there are r tuples with value x in the join attribute of R and s such tuples in S, we must output r*s tuples for x.
• So what is the worst case? Example (join attribute B):

  R:  A   B        S:  B   C
      A1  0            1   C1
      A2  1            1   C3
      A4  1            1   C5
      A3  2            2   C2
                       3   C4

  Matching A2 (B = 1) against the run of S-tuples with B = 1 yields the partial join:

      A   B   C
      A2  1   C1
      A2  1   C3
      A2  1   C5

  A4 also has B = 1, so the same run of S-tuples must be joined again, adding:

      A4  1   C1
      A4  1   C3
      A4  1   C5

Merge Phase

  r := first(R); s := first(S);
  WHILE NOT EOR(R) AND NOT EOR(S) DO
    IF r[B] < s[B] THEN r := next(R)
    ELSIF r[B] > s[B] THEN s := next(S)
    ELSE  /* r[B] = s[B] */
      b := r[B]; Buf := ∅;   /* buffer the run of S-tuples with key b */
      WHILE NOT EOR(S) AND s[B] = b DO
        Buf := Buf ∪ {s}; s := next(S);
      END DO;
      WHILE NOT EOR(R) AND r[B] = b DO
        FOR EACH e IN Buf DO OUTPUT(r, e); END DO;
        r := next(R);
      END DO;
  END DO;

(EOR = end of relation. The run of S-tuples sharing the current key is buffered so that every R-tuple with that key can be joined against it.)

Cost estimation
• Sorting R costs 2*b(R)*ceil(log_m(b(R)))
• Sorting S costs 2*b(S)*ceil(log_m(b(S)))
• The merge phase reads each relation once
• Total IO: b(R) + b(S) + 2*b(R)*ceil(log_m(b(R))) + 2*b(S)*ceil(log_m(b(S)))
• Improvement (the "efficient" sort-merge):
  – While sorting, do not perform the last write/read phase
  – Instead, open all sorted runs in parallel for merging
  – This saves 2*b(R) + 2*b(S) IOs
• Sometimes we can skip the sorting phase entirely:
  – If a sort is already performed further down in the plan tree
  – It must be a sort on the same attribute(s)
• Merge-sort join has no inner/outer relation.

Better than Blocked-Nested-Loop?
• Assume b(R) = 10,000, b(S) = 2,000, m = 500
  – BNL (with S as outer relation) costs 42,000
  – SM: 10,000 + 2,000 + 4*10,000 + 4*2,000 = 60,000
  – Improved SM: 36,000
• Assume b(R) = 1,000,000, b(S) = 1,000, m = 500
  – BNL costs 1,000 + 1,000,000*1,000/500 = 2,001,000
  – SM: 1,000,000 + 1,000 + 6*1,000,000 + 4*1,000 = 7,005,000
  – Improved SM: 5,003,000
• So when is SM better than BNL?
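The example numbers above can be reproduced with a short script. This is a sketch under the slides' assumptions: the BNL cost reads the outer relation once and scans the inner relation once per m-block chunk of the outer, and the number of sort passes, ceil(log_m(b)), is computed by simulating run merging (the helper names are mine, not from the course).

```python
import math

def passes(b, m):
    """Number of merge-sort passes over b blocks with m buffers,
    i.e. ceil(log_m(b)), computed by simulating the run merging."""
    p, runs = 0, b
    while runs > 1:
        runs = math.ceil(runs / m)
        p += 1
    return p

def bnl_cost(b_outer, b_inner, m):
    # read the outer once; scan the inner once per m-block chunk of the outer
    return b_outer + math.ceil(b_outer / m) * b_inner

def sm_cost(b_r, b_s, m):
    # b(R) + b(S) + 2*b(R)*ceil(log_m(b(R))) + 2*b(S)*ceil(log_m(b(S)))
    return b_r + b_s + 2 * b_r * passes(b_r, m) + 2 * b_s * passes(b_s, m)

def improved_sm_cost(b_r, b_s, m):
    # skipping the last write/read of the sort saves 2*(b(R) + b(S))
    return sm_cost(b_r, b_s, m) - 2 * (b_r + b_s)

print(bnl_cost(2_000, 10_000, 500))             # 42000
print(sm_cost(10_000, 2_000, 500))              # 60000
print(improved_sm_cost(10_000, 2_000, 500))     # 36000
print(bnl_cost(1_000, 1_000_000, 500))          # 2001000
print(sm_cost(1_000_000, 1_000, 500))           # 7005000
print(improved_sm_cost(1_000_000, 1_000, 500))  # 5003000
```

The simulation-based `passes` avoids floating-point `math.log`, whose rounding could flip a `ceil` at exact powers of m.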
• Consider the improved version:
  – Cost = 2*b(R)*ceil(log_m(b(R))) + 2*b(S)*ceil(log_m(b(S))) - b(R) - b(S)
  – Approximating ceil(log_m(b)) by log_m(b):
    Cost ≈ 2*b(R)*log_m(b(R)) + 2*b(S)*log_m(b(S)) - b(R) - b(S)
         = b(R)*(2*log_m(b(R)) - 1) + b(S)*(2*log_m(b(S)) - 1)
  – In most cases ceil(log_m(b)) = 2, so this amounts to roughly 3*(b(R) + b(S))

Comparison
• Assume relations of equal size b
  – SM: 2*b*(2*log_m(b) - 1)
  – BNL: b + b^2/m
• BNL > SM iff
  – b + b^2/m > 2*b*(2*log_m(b) - 1)
  – 1 + b/m > 4*log_m(b) - 2
  – b > 4*m*log_m(b) - 3*m
• Examples
  – b = 10,000, m = 100 (10,000 > 500):
    BNL: 10,000 + 1,000,000; SM: 6*10,000 = 60,000
  – b = 10,000, m = 5,000 (10,000 < 25,000):
    BNL: 10,000 + 20,000; SM: 6*10,000 = 60,000

Comparison 2
[Chart omitted: BNL Join vs. SM+Join cost for b(R) = 1,000,000, b(S) = 2,000, m between 100 and 90,000]
• BNL is very good if one relation is much smaller than the other and sufficient memory is available
• SM copes better with limited memory
• SM profits from additional memory in jumps (determined by the number of runs)

Comparison 3
[Chart omitted: BNL Join vs. SM+Join cost for b(R) = 1,000,000, b(S) = 50,000, m between 500 and 90,000]
• BNL is very sensitive to small memory sizes

Merge-Join and Main Memory
• There is no "m" in the formula of the merge phase; implicitly, it is hidden in the number of runs required.
• More memory does not decrease the number of blocks to read, but it can be used for sequential, asynchronous reads:
  – Always keep memory filled with m/2 blocks from R and m/2 blocks from S
  – Use asynchronous IO:
    1. Schedule a request for m/4 blocks from R and m/4 blocks from S
    2. Wait until they are loaded
    3. Schedule a request for the next m/4 blocks from R and the next m/4 blocks from S
    4. Do not wait; perform the merge on the first two chunks of m/4 blocks
    5. Wait until the previous request has finished (this waiting time was spent merging)
    6. Jump to step 3, using the m/4 chunks of memory in turn

Pseudocode

  function sortMerge(relation left, relation right, attribute a)
      var relation output
      var list left_sorted := sort(left, a)    // relation left sorted on attribute a
      var list right_sorted := sort(right, a)
      var attribute left_key, right_key
      var set left_subset, right_subset        // these sets are discarded unless the join predicate is satisfied
      advance(left_subset, left_sorted, left_key, a)
      advance(right_subset, right_sorted, right_key, a)
      while not empty(left_subset) and not empty(right_subset)
          if left_key = right_key              // join predicate satisfied
              add cartesian product of left_subset and right_subset to output
              advance(left_subset, left_sorted, left_key, a)
              advance(right_subset, right_sorted, right_key, a)
          else if left_key < right_key         // left key smaller: advance left
              advance(left_subset, left_sorted, left_key, a)
          else                                 // right key smaller: advance right
              advance(right_subset, right_sorted, right_key, a)
      return output

  // Remove tuples from sorted into subset until the sorted[1].a value changes
  function advance(subset out, sorted inout, key out, a in)
      key := sorted[1].a
      subset := emptySet
      while not empty(sorted) and sorted[1].a = key
          insert sorted[1] into subset
          remove sorted[1]
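For readers who want to run the algorithm, here is a Python rendering of the pseudocode above (a sketch; the identifiers are mine). `advance` collects the run of tuples sharing the current key, and when the keys on both sides match, the cartesian product of the two runs is emitted, which reproduces the duplicate-handling example from the notes.

```python
def advance(sorted_rows, i, key_of):
    """Collect the run of rows starting at index i that share one key.

    Returns (run, next_index, key); mirrors the pseudocode's advance,
    but uses an index instead of destructively removing rows.
    """
    key = key_of(sorted_rows[i])
    run = []
    while i < len(sorted_rows) and key_of(sorted_rows[i]) == key:
        run.append(sorted_rows[i])
        i += 1
    return run, i, key

def sort_merge_join(left, right, left_key, right_key):
    """Sort-merge join of two lists of tuples on the given key functions."""
    left_sorted = sorted(left, key=left_key)
    right_sorted = sorted(right, key=right_key)
    output, li, ri = [], 0, 0
    while li < len(left_sorted) and ri < len(right_sorted):
        l_run, l_next, l_key = advance(left_sorted, li, left_key)
        r_run, r_next, r_key = advance(right_sorted, ri, right_key)
        if l_key == r_key:        # join predicate satisfied
            output.extend((l, r) for l in l_run for r in r_run)
            li, ri = l_next, r_next
        elif l_key < r_key:       # left key smaller: advance left
            li = l_next
        else:                     # right key smaller: advance right
            ri = r_next
    return output

# The duplicate-heavy example from the notes: R(A, B) joined with S(B, C)
R = [("A1", 0), ("A2", 1), ("A4", 1), ("A3", 2)]
S = [(1, "C1"), (1, "C3"), (1, "C5"), (2, "C2"), (3, "C4")]
result = sort_merge_join(R, S, lambda r: r[1], lambda s: s[0])
print(len(result))  # 7: 2 R-tuples x 3 S-tuples for B = 1, plus A3/C2 for B = 2
```

Note that, unlike the block-based merge phase earlier, this in-memory version materializes both runs; a real system would buffer only the inner run and rescan it per matching outer tuple.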