jump to navigation

Clustering Factor: A Consideration in Concatenated Index Leading Column Decision (Sweet Thing) February 15, 2008

Posted by Richard Foote in Clustering Factor, Concatenated Indexes, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
14 comments

Short but sweet today.

I last discussed how high cardinality columns shouldn’t necessarily be in the leading column of a concatenated index as  they don’t provide the performance benefit as sometimes claimed.

If all things are equal and the columns in the concatenated index are all likely to be referenced, a simple consideration that is often forgotten when deciding which column to have as the leading index column is the Clustering Factor of the corresponding columns.

As previously discussed, the Clustering Factor  determines how well aligned or ordered the index entries are in relation to the rows in the parent table. So if the rows are ordered within the table on a particular column or columns (such as a sequential ID column, a monotonically increasing date or time-stamp, etc), then an index on these columns is likely to have a very good Clustering Factor. Consequently less IOs will be required to retrieve all the required rows via the index as all the required rows will be housed in relatively few, well clustered data blocks.

It therefore makes sense to at least consider the Clustering Factor of the various columns in a concatenated index. Why ? Because if the leading column has a very good Clustering Factor, the concatenated index by definition must also have a very good Clustering Factor as all indexes are primarily ordered based on the leading indexed column. A concatenated index with a good Clustering Factor is going to be more efficient in retrieving rows from the table and more importantly, will be considered more desirably by the CBO when costing access path options.

Of course, the opposite is also true. By having a leading column with a poor Clustering Factor will mean the concatenated index will have a poor Clustering Factor, making it less efficient and less likely to be considered by the CBO.

As such, the Clustering Factor of each corresponding column in a concatenated index is at least worthy of some consideration when making the decision on how best to order the indexed columns.

This demo on Index Column Order and Clustering Factor  shows how the order of columns in a concatenated index has a big impact on the Clustering Factor of the resultant index.

UPDATE: However as Tom Kyte has stated in the comments, in virtually all cases, the Clustering Factor is not really a factor (yes, pun fully intended) as subsequently columns are generally going to impact the CF anyways or the selectivity of the index is such that the improved CF is not relevant anyways.

More relevant considerations regarding the ordering of columns in an index coming I promise 🙂