The selection operation can be performed in a number of ways. Let us discuss the algorithms and the related cost of performing selection operations.
File scan: These are the search algorithms that locate and retrieve records that fulfil a selection condition in a file. The following are the two basic files scan algorithms for selection operation:
1) Linear search: This algorithm scans each file block and tests all records to see whether they satisfy the selection condition.
The cost of this algorithm (in terms of block transfer) is defined as:
Cost of searching records satisfying a condition = Number of blocks in a database = Nb.
Cost for searching a key attribute value = Average number of block transfer for locating the value
(on an average, half of the file needs to be traversed) so it is = Nb/2. Linear search can be applied regardless of selection condition or ordering of records in the file, or availability of indices.
2) Binary search: It is applicable when the selection is an equality comparison on the attribute on which file is ordered. Assume that the blocks of a relation are stored continuously then, cost can be estimated as:
Cost = Cost of locating the first tuple by a binary search on the blocks + sequence of other blocks that continue to satisfy the condition
= log2 (Nb) + (number average of tuples with the same value / Blocking factore Number of Tuples in a block of the relations)
These two values may be calculated from the statistics of the database.
Index scan: Search algorithms that use an index are restricted because the selection condition must be on the search-key of the index.
These two values may be calculated from the statistics of the database.
Index scan: Search algorithms that use an index are restricted because the selection condition must be on the search-key of the index
File scan: These are the search algorithms that locate and retrieve records that fulfil a selection condition in a file. The following are the two basic files scan algorithms for selection operation:
1) Linear search: This algorithm scans each file block and tests all records to see whether they satisfy the selection condition.
The cost of this algorithm (in terms of block transfer) is defined as:
Cost of searching records satisfying a condition = Number of blocks in a database = Nb.
Cost for searching a key attribute value = Average number of block transfer for locating the value
(on an average, half of the file needs to be traversed) so it is = Nb/2. Linear search can be applied regardless of selection condition or ordering of records in the file, or availability of indices.
2) Binary search: It is applicable when the selection is an equality comparison on the attribute on which file is ordered. Assume that the blocks of a relation are stored continuously then, cost can be estimated as:
Cost = Cost of locating the first tuple by a binary search on the blocks + sequence of other blocks that continue to satisfy the condition
= log2 (Nb) + (number average of tuples with the same value / Blocking factore Number of Tuples in a block of the relations)
These two values may be calculated from the statistics of the database.
Index scan: Search algorithms that use an index are restricted because the selection condition must be on the search-key of the index.
These two values may be calculated from the statistics of the database.
Index scan: Search algorithms that use an index are restricted because the selection condition must be on the search-key of the index
No comments:
Post a Comment