The choice of join algorithm is based on the cost estimate. Let us use the following information to elaborate the same:
MARKS (enroll no, subject code, marks):10000 rows, 500 blocks
STUDENT (enroll no, name, dob): 2000 rows, 100 blocks
This is a variant of nested-loop join in which a complete block of outer loop is joined with the block of inner loop.
The algorithm for this may be written has:
for each block s of STUDENT
{
for each block m of MARKS
{
for each tuple si in s
{
for each tuple mi in m
{
Check if (si and mi) satisfy the join condition if they do output joined tuple to the result
};
};
};
};
Worst case of block accesses in this case = Number of Blocks of outer relation (STUDENT) ×Number of blocks of inner relation (MARKS) + Number of blocks of outer relation (STUDENT).
Best case: Blocks of STUDENT + Blocks of MARKS
Number of block transfers assuming worst case:
100*500 + 100 = 50,100 (much less than nested-loop join)
Number of block transfers assuming best case:
400 + 100 = 500 (same as with nested-loop join)
Improvements to Block Nested-Loop Algorithm
The following modifications improve the block Nested method:
Use M – 2 disk blocks as the blocking unit for the outer relation, where M = memory size in blocks.
Use one buffer block to buffer the inner relation.
Use one buffer block to buffer the output.
This method minimizes the number of iterations.
MARKS (enroll no, subject code, marks):10000 rows, 500 blocks
STUDENT (enroll no, name, dob): 2000 rows, 100 blocks
This is a variant of nested-loop join in which a complete block of outer loop is joined with the block of inner loop.
The algorithm for this may be written has:
for each block s of STUDENT
{
for each block m of MARKS
{
for each tuple si in s
{
for each tuple mi in m
{
Check if (si and mi) satisfy the join condition if they do output joined tuple to the result
};
};
};
};
Worst case of block accesses in this case = Number of Blocks of outer relation (STUDENT) ×Number of blocks of inner relation (MARKS) + Number of blocks of outer relation (STUDENT).
Best case: Blocks of STUDENT + Blocks of MARKS
Number of block transfers assuming worst case:
100*500 + 100 = 50,100 (much less than nested-loop join)
Number of block transfers assuming best case:
400 + 100 = 500 (same as with nested-loop join)
Improvements to Block Nested-Loop Algorithm
The following modifications improve the block Nested method:
Use M – 2 disk blocks as the blocking unit for the outer relation, where M = memory size in blocks.
Use one buffer block to buffer the inner relation.
Use one buffer block to buffer the output.
This method minimizes the number of iterations.
No comments:
Post a Comment