jump to navigation

The CBO and Indexes: Introduction Continues … June 15, 2009

Posted by Richard Foote in CBO, Oracle Indexes.
11 comments

OK, I previously briefly covered how the CBO calculates the basic cost of an index range scan. Yes, those cardinality/rows values in the execution plans are vitally important as they highlight whether or not the CBO has used the appropriate selectivity values in the index costing formula. And yes, the associated cost values are meaningful and potentially useful as they determine the actual costs associated with the execution plan in terms of the expected number of I/Os the CBO estimates will be required (when using the IO costing model and often the CPU costing model as well).

I’m just going to look at another example now using the same table setup as before, but this time running an SQL query that has 5 distinct values in an IN list predicate on our demo table (again, follow the link to see the query and formatted execution plan).

The first thing we notice in this example, is that Oracle has decided to use a FTS rather than use the index on the ID column. Considering we’re only after 5 values out of the possible 100 values, some may not see this as expected behaviour, especially considering the index has such a good Clustering Factor. Basically Oracle is deciding to access each and every block below the HWM of the table, retrieving all 100% of the rows in the table, only to ultimately discard 95% of them.

It certainly appears at first glance to be a more “costly” option than using the index to directly access just the 5% of rows we’re interested in …

The first thing to check is the estimated cardinality figures, to see if the CBO has miscalculated the expected number of rows it needs to retrieve. However, as the statistics have just been fully computed and that the ID column has perfectly even distributed values, we notice the cardinality figures are again spot on. The query returns 10,000 rows and indeed the rows estimate in the execution plan is exactly10,000 rows. The calculation is simply 0.01 (density of column) x 200,000 (rows) x 5 (values in select list) = 10,000.

Let’s now calculate the cost of using the index using our index costing formula, using the CEIL function this time ;)

index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

2 + 5 x ceil(0.01 x 602) + ceil(0.05 x 854) = 2 + 5 x 7 + 43 = 37 + 43 = 80.

So the cost of using an index range scan to retrieve 5% of the rows in our example comes to a total of 80.

If we look at the cost of the FTS in our explain plan in the above link, we notice the cost is just 65. 65 is less than 80, so the FTS wins.

So how did the CBO come to a cost of just 65 when it has to read all 659 blocks in the table ?

Well, the first missing piece of information is the value of the db_file_multiblock_read_count parameter because this governs how many blocks Oracle will attempt to read within a single logical multiblock I/O call. Remember, when performing a FTS, Oracle knows it has to read all the table related blocks below the HWM and so rather than reading one tiny little block at a time, it does so more efficiently by reading multiple blocks at a time. This is the fundamental advantage of the FTS over the index range scan which can only ever access the one block at a time.

SQL> show parameter db_file_multi

NAME                          TYPE    VALUE
----------------------------- ------- -----
db_file_multiblock_read_count integer    16

 

So the db_file_multiblock_read_count is 16.

The next thing to note is that it’s very unlikely that Oracle will actually read the full 16 blocks at a time as there are a number of factors that prevents this from occurring. Extent boundaries is one classic example (a multiblock read can not span across extent boundaries) but the more common issue is a block within the table already being stored in the buffer cache. Rather than storing the same block at the same consistent point twice in memory, Oracle breaks up the multiblock read and only reads up to the block that is already cached in the buffer cache. Therefore, for Oracle to actually read the entire table using the full 16 block multiblock I/Os, it would mean there are no cached blocks from the table currently in the buffer cache, an unlikely event.

Therefore, Oracle doesn’t use the full 16 value when determining the number of expected multiblock I/Os, but a modified “fudge” value which equates to approximately 10.4. for a MBRC of 16. Again, Jonathan Lewis in his excellent “Cost-Based Oracle Fundamentals” book discusses all this is some detail. 

Remember also that Oracle needs to access the segment header as part of a FTS as I explained is some detail in my “Indexes and Small Table” series. So that’s an additional single block I/O on top of the multiblock I/Os.

Therefore the cost of performing a FTS is:

segment header I/O + ceil(table blocks/fudged mbrc value) = 1 + ceil(659/10.4) = 1 + 64 = 65.

The 65 cost for the FTS does make sense when one understands a little how this value is derived by the CBO …

As the FTS can read big chunks of the table at a time whereas the index range scan can only read each necessary block one at a time, the FTS can indeed read the table and retrieve the required 5% of data in fewer LIOs and so has the lesser associated cost than the index.

Now there are a few issues with all of this. Firstly, is the db_file_multiblock_read_count actually a valid and correct setting as this directly impacts not only the actual size of the multiblock read operations but critically, the associated costs relating to FTS operations (and indeed Fast Full Index Scans as well) ?

Also, is it really correct and valid to assume the cost of a multiblock I/O to be the same and equal to the cost of a single block I/O ? Surely, the process of performing a single block I/O is likely to be “cheaper” than that of a multiblock I/O and yet the CBO treats both types of I/Os as having the same fundamental “cost”.

Also the CPU overheads of having to access each and every row in each and every block is likely going to be more significant than the CPU required to access just specific data from specific blocks when using an index.

Perhaps, the more “expensive” index range scan might actually be a better alternative than the FTS if these factors were taken into consideration ?

Now this may indeed be true, if these factors were correctly taken into consideration. However, this may also indeed be quite false and the FTS may really truly be the better and more efficient alternative and attempts to force the use of the index may be inappropriate and ultimately more expensive.

I’ll next discuss some really bad (although still very common) methods of making the CBO favour indexes, using generally inappropriate so-called “Silver Bullets” …

Follow

Get every new post delivered to your Inbox.

Join 1,852 other followers