The choice of join algorithm is based on the cost estimate. Let us use the following information to elaborate the same:
MARKS (enroll no, subject code, marks):10000 rows, 500 blocks
STUDENT (enroll no, name, dob): 2000 rows, 100 blocks
Index scans can replace file scans if the join is an equi-join or natural join, and an index is available on the inner relation’s join attribute.
For each tuple si in the outer relation STUDENT, use the index to look up tuples in MARKS that satisfy the join condition with tuple si. In a worst case scenarios, the buffer has space for only one page of STUDENT, and, for each tuple in MARKS, then we should perform an index lookup on MARKS index.
Worst case: Block transfer of STUDENT+ number of records in STUDENT * cost of searching through index and retrieving all matching tuples for each tuple of STUDENT.
If a supporting index does not exist than it can be constructed as and when needed.
If indices are available on the join attributes of both STUDENT and MARKS, then use the relation with fewer tuples as the outer relation.
Example of Index Nested-Loop Join Costs
Compute the cost for STUDENT and MARKS join, with STUDENT as the outer relation. Suppose MARKS has a primary B+-tree index on enroll no, which contains 10 entries in each index node. Since MARKS has 10,000 tuples, the height of the tree is 4, and one more access is needed to the actual data.
The STUDENT has 2000 tuples. Thus, the cost of indexed nested loops join as:
100 + 2000 * 5 = 10,100 disk accesses
MARKS (enroll no, subject code, marks):10000 rows, 500 blocks
STUDENT (enroll no, name, dob): 2000 rows, 100 blocks
Index scans can replace file scans if the join is an equi-join or natural join, and an index is available on the inner relation’s join attribute.
For each tuple si in the outer relation STUDENT, use the index to look up tuples in MARKS that satisfy the join condition with tuple si. In a worst case scenarios, the buffer has space for only one page of STUDENT, and, for each tuple in MARKS, then we should perform an index lookup on MARKS index.
Worst case: Block transfer of STUDENT+ number of records in STUDENT * cost of searching through index and retrieving all matching tuples for each tuple of STUDENT.
If a supporting index does not exist than it can be constructed as and when needed.
If indices are available on the join attributes of both STUDENT and MARKS, then use the relation with fewer tuples as the outer relation.
Example of Index Nested-Loop Join Costs
Compute the cost for STUDENT and MARKS join, with STUDENT as the outer relation. Suppose MARKS has a primary B+-tree index on enroll no, which contains 10 entries in each index node. Since MARKS has 10,000 tuples, the height of the tree is 4, and one more access is needed to the actual data.
The STUDENT has 2000 tuples. Thus, the cost of indexed nested loops join as:
100 + 2000 * 5 = 10,100 disk accesses
No comments:
Post a Comment