jump to navigation

Index Monitoring and Index Statistics (The Great Gig In The Sky) September 16, 2008

Posted by Richard Foote in 11g, Concatenated Indexes, Extended Statistics, Index Monitoring, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.
trackback

I write this post whilst listening to Pink Floyd’s masterpiece “The Dark Side Of The Moon” while sadly lamenting the passing away of Richard Wright. RIP and thank you for all the great musical gifts you’ve given me over the years.

In my last post I highlighted an example of where Index Monitoring doesn’t show how indeed Oracle does indeed use an index when checking for the existence of Foreign Key values. Thought I might discuss yet another example of where Oracle does indeed use an index but it’s again not picked up by index monitoring, this time with a slight 11g flavour.

This specific example involves using the statistics associated with the indexes to provide the CBO with useful additional information, although the index itself is not used directly within the execution plan. Dropping the index means losing this information which could possibly result in a different, non optimal execution plans.

Prior to 11g, Oracle can have a hard time of accurately determining the correct selectively where there is a correlation between two (or more) columns in a table. By default, Oracle assumes the selectivity of two distinct columns to be the density of both columns multiplied together. So for example, if one column (say “A”) had 10 distinct values and the other column (say “B” also had 10 distinct values, Oracle assumes the selectivity of both columns combined to be 10 x 10 = 100 distinct values. A predicate such as:

WHERE A = 5 and B = 2

would assume 1% of data would be retrieved if both columns A and B both had 10 distinct possible values.

However, what if there’s a special relationship between the columns and the actual number of distinct combinations was somewhat different ? What if B always equals 2 when A equals 5, what if instead of the theoretical 100 different combinations there were only 10 combinations (or some such) because most of the other possible combination don’t actually exist …

This 9i and 10g demo shows how there is indeed only 10 distinct values for each of two different columns, however there is a direct relationship between these columns such that there is actually only 10 distinct combinations of both these columns (and not the 100 combinations which are possible and which Oracle assumes in it’s selectivity calculations).

Instead of the actual 10,000 rows (or 10% of all data) being selected, Oracle is incorrectly assuming only 1000 rows (or 1%) will be selected. This is a significant error by a order of magnitude which in many cases can result in a less efficient execution plan.

With 11g, Oracle can use the statistics associated with an index to give Oracle some vital extra information. Because if there’s an index based on the two columns, then the number of distinct key values recorded for the index can provide Oracle with a much more accurate estimation of the true selectivity based on the two columns. If a concatenated index based on the two columns only has say 10 distinct values, then Oracle can assume that a specific combination of the two columns is likely to also retrieve 1/10 of all the values and not the 1/100 that are theoretically possible.

This identical 11g demo to the one above shows by having an index on the two columns that have a correlation, Oracle is using the DISTINCT_KEYS statistic for the index to determine the correct selectivity and associated cardinality for the query.

However, the demo clearly shows index monitoring is still not showing the index as being “USED”. If you were to hence drop the index, the CBO loses potentially vital information and the cardinality estimates revert back to being the product of the two column densities as with pre 11g.

By dropping the index which appears to not be used, we can potentially impact other execution plans, even though they don’t directly use the index within the execution plan. The correct cardinality estimates of a table can for example potential drive the order in which the table is subsequently joined or the manner in which it’s joined.

This demo on the possible impact of dropping an “unused” index shows how an execution plan can change to be sub-optimal, even though neither execution actually directly uses the associated index. It’s not a particularly “clever” example, but it does illustrate the potential impact of dropping these so called unused indexes.

Of course, with 11g, we now have the capability of collecting extended statistics. We can potentially determine these same level of statistics by generating statistics on both columns combined. Oracle can determine the actual distinct combinations of columns that are somehow correlated and produce more accurate and detailed statistics with hence make the CBO determine more accurate and reliable cardinality estimates.

This final demo on extended statistics shows how we can recreate the more efficient execution plan and provide the CBO with more detailed extended statistics so that it can accurately determine the correct cardinality estimates without the need to recreate the “unused” index.

Extended statistics can be extremely useful in determining correct cardinality values for column combinations that exist however it still falls somewhat short when it attempts to estimate the expected cardinality for combinations that don’t actually exist, even with histograms.

But that’s a tale for another day.

Now it’s time for “Wish You Were Here” …

Comments»

1. Greg Rahn - September 16, 2008

Some of the 11g functionality to use an index that contains the predicate columns to calculate the cardinality as a group vs. individual columns has been back ported to 10.2.0.4. This is tracked by bug 5040753 and can be enabled by using
alter session/system set “_fix_control” = ‘5765456:7’;
Another bug to be mindful of is 6221403 which is related to extended statistics and out-of-range values.

Like

2. Richard Foote - September 22, 2008

Hi Greg

Thanks for the heads up, much appreciated.

Like

3. Jonathan Lewis - September 25, 2008

Richard,

There’s been a hint of this column dependency code in the optimizer since 8i in the “index sanity check” for joins – but it only applies for unique indexes.

I’ve been saying for several years “watch out for the day that Oracle starts doing this for non-unique indexes, because you’ll be even more worried about the side-effects of dropping, or changing, multi-column indexes” – thanks for letting me know that the day has arrived. (Interestingly – perhaps thankfully – the sanity check on joins is still restricted to unique indexes)

Regards
Jonathan Lewis

Opening Menu

Like

4. ET - September 26, 2008

Jonathan: I see u no longer list this blog in your listings. Something the savy professional should note yes.

Like

5. Jonathan Lewis - September 26, 2008

ET,

I keep my blog list short, and rotate a few entries on and off over time. I’ve recently added a couple of people, and Richard was (I think) one of the people that dropped off as a consequence.

Have you heard the one about the woman who gave her son two ties for his birthday, a red one and a blue one. The day after his birthday he decided to wear the red one, and his mother said: “so what’s wrong with the blue tie ?”

Regards
Jonathan Lewis

Like

6. Richard Foote - September 29, 2008

Hi Jonathan

Yes, I remember reading from you somewhere previously how Oracle can use an index as a sanity check. Certainly one now needs to be very careful of the possible effects of dropping so-called unused indexes with later releases.

BTW, your comment to ET wasn’t necessary, how you manage your site and links is entirely your business, “savy professionals” should know better than to make such comments 😉

Like

7. Richard Foote - September 29, 2008

Hi ET

The fact you visited this site, does that make you a “savy professional” or not ?

Like

8. Book Review: Oracle SQL Recipes: A Problem – Solution Approach « Charles Hooper's Oracle Notes - June 7, 2010

[…] be used by the cost-based optimizer even when monitoring reveals that the index is never used. [Reference/Test Case – Check the Comments Section Also] [Reference/Test Case – Check the Comments Section […]

Like


Leave a comment