jump to navigation

Index Monitoring and Index Statistics (The Great Gig In The Sky) September 16, 2008

Posted by Richard Foote in 11g, Concatenated Indexes, Extended Statistics, Index Monitoring, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.
8 comments

I write this post whilst listening to Pink Floyd’s masterpiece “The Dark Side Of The Moon” while sadly lamenting the passing away of Richard Wright. RIP and thank you for all the great musical gifts you’ve given me over the years.

In my last post I highlighted an example of where Index Monitoring doesn’t show how indeed Oracle does indeed use an index when checking for the existence of Foreign Key values. Thought I might discuss yet another example of where Oracle does indeed use an index but it’s again not picked up by index monitoring, this time with a slight 11g flavour.

This specific example involves using the statistics associated with the indexes to provide the CBO with useful additional information, although the index itself is not used directly within the execution plan. Dropping the index means losing this information which could possibly result in a different, non optimal execution plans.

Prior to 11g, Oracle can have a hard time of accurately determining the correct selectively where there is a correlation between two (or more) columns in a table. By default, Oracle assumes the selectivity of two distinct columns to be the density of both columns multiplied together. So for example, if one column (say “A”) had 10 distinct values and the other column (say “B” also had 10 distinct values, Oracle assumes the selectivity of both columns combined to be 10 x 10 = 100 distinct values. A predicate such as:

WHERE A = 5 and B = 2

would assume 1% of data would be retrieved if both columns A and B both had 10 distinct possible values.

However, what if there’s a special relationship between the columns and the actual number of distinct combinations was somewhat different ? What if B always equals 2 when A equals 5, what if instead of the theoretical 100 different combinations there were only 10 combinations (or some such) because most of the other possible combination don’t actually exist …

This 9i and 10g demo shows how there is indeed only 10 distinct values for each of two different columns, however there is a direct relationship between these columns such that there is actually only 10 distinct combinations of both these columns (and not the 100 combinations which are possible and which Oracle assumes in it’s selectivity calculations).

Instead of the actual 10,000 rows (or 10% of all data) being selected, Oracle is incorrectly assuming only 1000 rows (or 1%) will be selected. This is a significant error by a order of magnitude which in many cases can result in a less efficient execution plan.

With 11g, Oracle can use the statistics associated with an index to give Oracle some vital extra information. Because if there’s an index based on the two columns, then the number of distinct key values recorded for the index can provide Oracle with a much more accurate estimation of the true selectivity based on the two columns. If a concatenated index based on the two columns only has say 10 distinct values, then Oracle can assume that a specific combination of the two columns is likely to also retrieve 1/10 of all the values and not the 1/100 that are theoretically possible.

This identical 11g demo to the one above shows by having an index on the two columns that have a correlation, Oracle is using the DISTINCT_KEYS statistic for the index to determine the correct selectivity and associated cardinality for the query.

However, the demo clearly shows index monitoring is still not showing the index as being “USED”. If you were to hence drop the index, the CBO loses potentially vital information and the cardinality estimates revert back to being the product of the two column densities as with pre 11g.

By dropping the index which appears to not be used, we can potentially impact other execution plans, even though they don’t directly use the index within the execution plan. The correct cardinality estimates of a table can for example potential drive the order in which the table is subsequently joined or the manner in which it’s joined.

This demo on the possible impact of dropping an “unused” index shows how an execution plan can change to be sub-optimal, even though neither execution actually directly uses the associated index. It’s not a particularly “clever” example, but it does illustrate the potential impact of dropping these so called unused indexes.

Of course, with 11g, we now have the capability of collecting extended statistics. We can potentially determine these same level of statistics by generating statistics on both columns combined. Oracle can determine the actual distinct combinations of columns that are somehow correlated and produce more accurate and detailed statistics with hence make the CBO determine more accurate and reliable cardinality estimates.

This final demo on extended statistics shows how we can recreate the more efficient execution plan and provide the CBO with more detailed extended statistics so that it can accurately determine the correct cardinality estimates without the need to recreate the “unused” index.

Extended statistics can be extremely useful in determining correct cardinality values for column combinations that exist however it still falls somewhat short when it attempts to estimate the expected cardinality for combinations that don’t actually exist, even with histograms.

But that’s a tale for another day.

Now it’s time for “Wish You Were Here” …

Index Monitoring and Foreign Keys – Caution (Kid A) September 12, 2008

Posted by Richard Foote in 11g, Constraints, Foreign Keys, Index Monitoring.
11 comments

One of the questions asked of indexes is are they actually being used or are they sitting there looking pretty not being used, wasting valuable resources and storage.

One possible method of determining whether an index is being used is to set the MONITORING USAGE attribute of the index. This will place the index in the v$object_usage view and will flag the index as being used when indeed the Cost Based Optimizer decides to “use” the index in an execution plan.

For example:

SQL> create table daddy (id number constraint daddy_pk primary key, name varchar2(20));

Table created.

SQL> insert into daddy values (1, ‘BOWIE’);

1 row created.

SQL> insert into daddy values (2, ‘ZIGGY’);

1 row created.

SQL> insert into daddy values (3, ‘THIN WHITE DUKE’);

1 row created.

SQL> commit;

Commit complete.

SQL> alter index daddy_pk monitoring usage;

Index altered.

If we look at the v$object_usage view, we’ll now see an entry for this index, with the USED flag currently set to NO:

SQL> select index_name, monitoring, used from v$object_usage where index_name = ‘DADDY_PK’;

INDEX_NAME   MON USE
------------ --- ---
DADDY_PK     YES NO

If we however make the CBO “use” this index:

SQL> select * from daddy where id = 1;

        ID NAME
---------- --------------------
         1 BOWIE

--------------------------------------------
|Id| Operation                   | Name    |
--------------------------------------------
| 0| SELECT STATEMENT            |         |
| 1|  TABLE ACCESS BY INDEX ROWID| DADDY   |
|*2|   INDEX UNIQUE SCAN         | DADDY_PK|
--------------------------------------------

We now notice that the index is now marked as having been used:

SQL> select index_name, monitoring, used from v$object_usage where index_name = ‘DADDY_PK’;

INDEX_NAME   MON USE
------------ --- ---
DADDY_PK     YES YES

Although it doesn’t give any indication on just how often the index is actually used, if an index is still marked as unused for a period of time, it does provide some indication that the index has not been used during that time and is a candidate for being dropped.

Or does it ?

Unfortunately, the are some cases when an index is being used but monitoring the index doesn’t give any such indication. However dropping such indexes can prove disastrous …

For example, let’s create another table that has a Foreign Key which references this DADDY table:

SQL> create table kiddie (id number, name varchar2(20), fk number, constraint kiddie_fk foreign key(fk) references daddy(id));

Table created.

Let’s now populate it with a number of rows:

SQL> insert into kiddie select rownum, ‘MAJOR TOM’, 1 from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

Let’s now create an index on the Foreign Key column and collect some stats:

SQL> create index kiddie_fk_i on kiddie(fk);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’DADDY’, estimate_percent=>null, cascade=>true, method_opt=> ‘FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’KIDDIE’, estimate_percent=>null, cascade=>true, method_opt=> ‘FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

Let’s now monitor this index on the Foreign Key and see where it may or may not get used:

SQL> alter index kiddie_fk_i monitoring usage;

Index altered.

SQL> select index_name, monitoring, used from v$object_usage where index_name = ‘KIDDIE_FK_I’;

INDEX_NAME   MON USE
------------ --- ---
KIDDIE_FK_I  YES NO

OK, let’s now delete a row from the parent table and look at the statistics to see how many consistent reads were performed:

SQL> delete daddy where id = 2;

1 row deleted.

Statistics
---------------------------
          8 db block gets
         26 consistent gets
          1 physical reads
          1  rows processed

Now, behind the scenes, Oracle indeed used the index on the Foreign Key to determine whether or not the parent row could be safely deleted or whether there were child records that were still referencing the parent row. Note we used just 26 consistent reads.

Let’s see if the monitoring of the index has picked this up …

SQL> select index_name, monitoring, used from v$object_usage where index_name = ‘KIDDIE_FK_I’;

INDEX_NAME   MON USE
------------ --- ---
KIDDIE_FK_I  YES NO

Ooops, this is not good. Is this index really not being used, is it really perhaps safe to drop.

Let’s find out:

SQL> drop index kiddie_fk_i;

Index dropped.

Let’s now delete another parent row and see if we notice a difference in performance:

SQL> delete daddy where id = 3;

1 row deleted.

Statistics
----------------------------
          7  db block gets
       3194  consistent gets
       1599  physical reads
          1  rows processed

Ooops, the consistent gets have jumped up dramatically from 26 to a massive 3194 !! Without the index on the Foreign Key, the only way now for Oracle to check whether it’s OK to delete a parent row is to perform a Full Table Scan on the child table. Not only is this potentially very costly but there could be some nasty locking implications as well.

Index Monitoring only tells you if an index has been used or if it hasn’t been used by the CBO.

It doesn’t however tell you whether an index has actually been used or whether it’s safe to drop the index.

More examples of when it may not be safe to drop a so called unused index to come …

Follow

Get every new post delivered to your Inbox.

Join 1,715 other followers