Tutorials and lab classes: Tutorial 4
Complete the following exercises on query optimisation.
Exercise 1
Consider a relation r over the attributes A, B, C. r has 5,000 tuples, with 5 tuples stored per page. The attributes A, B, C have the following characteristics:
•Attribute A is a candidate key
•Unclustered hash index on attribute A
•Clustered B+ tree index on attribute B of depth d
•Attribute B has 1,000 distinct values in r
•Attribute C has 500 distinct tuples and an unclustered 3-level B+ tree index
•Assume that the the tuples are stored in 2 leaf nodes, rather than just in one.
•Hint: The total cost of accessing the leaf nodes in the tree is given by the cost of searching the index for the address of the first leaf node + 1 (cost of retrieving the second page of the index).
Estimate the cost of computing the following queries:
(a)σA=valuet(r) using the index
(b)σB=value(r) using the index
(c)σC=value(r) using the index
(d)πAC (r)
Optional question:
(e) Can you guess what the cost of writing the output of πAC (r) is? Assume that all attributes contribute equally to the tuple size.
Exercise 2
Consider a relation s over the attributes A and B with the following characteristics:
•7,000 tuples with 70 tuples per page
•A hash index on attribute A
•The values that the attribute A takes in relation s are integers that are uniformly distributed in the range 1 – 200.
Estimate the cost of computing the following:
(a) Assuming that the aforesaid index on A is unclustered, estimate the number of disk accesses needed to compute the query σA=18(s).
Optional question:
(b) What would be the cost estimate if the index were clustered? Explain your reasoning.
Hint: The tuples might cluster around a page boundary.
Optional exercise
Estimate the cost of r ⋈ s using
(a) Sort-merge join
(b) Block nested loops
where r has 1,000 tuples, 20 tuples per page; s has 2,000 tuples, 4 tuples per page; and the main memory buffer for this operation is 22 pages long.