jump to navigation

Descending Indexes Solution (Yellow Submarine) September 9, 2011

Posted by Richard Foote in Descending Indexes, Oracle Indexes, Quiz.
28 comments

Answers to the quiz on Descending Indexes and hopefully some useful dangers and tips on using them.

The answer to the first question is Yes, a “normal” Ascending Index can be used by the CBO to retrieve data in descending order and hence possibly avoid a sort. The reason being that leaf blocks in the index structure have effectively two pointers, one that points to the next leaf block in the index structure (except for the very last leaf block) and one that points to the previous block (except for the first leaf block). So the data in an index can be retrieved in either order.

The answer to the second question is Yes as well, a Descending Index can also be used to also retrieve data in either logical order as again all the leaf blocks have the two set of pointers.

That being the case, if an index has just the one column value, does it therefore make any difference which index one creates, ascending or descending ?

Hence my last question. The answer is maybe, as there are a number of fundamental differences in how each type of index is implemented.

Naturally, a little demo to illustrate :)

Let’s begin by creating a simple little table and a normal B-Tree index on an ID column, which has monotonically increasing values:

SQL> create table bowie (id number, name varchar2(30));

Table created.

SQL> create index bowie_id_i on bowie(id);

Index created.

SQL> insert into bowie select rownum, 'DAVID BOWIE' from dual connect by level <= 100000;

100000 rows created.

SQL> commit;

Commit complete.

Note the index is indeed a “Normal” B-Tree index and because the indexed values monotonically increase, all index leaf block splits are 90-10 splits resulting a perfectly compact, 100% utilised index structure:

SQL> select index_type from dba_indexes where index_name = 'BOWIE_ID_I';

INDEX_TYPE
---------------------------
NORMAL

SQL> analyze index bowie_id_i validate structure;

Index analyzed.

SQL> select lf_rows, lf_blks, pct_used from index_stats;

   LF_ROWS    LF_BLKS   PCT_USED
---------- ---------- ----------
    100000        199        100

Let’s now run a query to ensure the index is indeed used and that the sort can indeed be avoided. Note I’ve not actually collected any CBO statistics at this stage but I’m definitely using the CBO:

SQL> alter system set optimizer_mode='ALL_ROWS' scope=both;

System altered.

SQL> select * from bowie where id between 42 and 84 order by id desc;

43 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2771731789

-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |    43 |  1290 |     1   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BOWIE      |    43 |  1290 |     1   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN DESCENDING| BOWIE_ID_I |    43 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID">=42 AND "ID"<=84)
       filter("ID">=42 AND "ID"<=84)

Note
-----
   - dynamic sampling used for this statement
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
          0  redo size
       1092  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         43  rows processed

So the execution plan clearly shows the use of the index via an index range scan descending and that there are indeed no sort operations performed. There were no statistics gathered, so the CBO performed some dynamic sampling to determine a taste for the data.

Let’s now change the optimizer_mode to CHOOSE, a common default setting (especially pre 11g, this example is run on a 10.2.0.4 database) and re-run the query:

SQL> select * from bowie where id between 42 and 84 order by id desc;

43 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3062669298

---------------------------------------------------
| Id  | Operation                    | Name       |
---------------------------------------------------
|   0 | SELECT STATEMENT             |            |
|   1 |  SORT ORDER BY               |            |
|   2 |   TABLE ACCESS BY INDEX ROWID| BOWIE      |
|*  3 |    INDEX RANGE SCAN          | BOWIE_ID_I |
---------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("ID">=42 AND "ID"<=84)

Note
-----
   - rule based optimizer used (consider using cbo)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          3  consistent gets
          0  physical reads
          0  redo size
       1092  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         43  rows processed

No statistics on the table now means the Rule Based Optimizer kicks in and although a sort operation is performed (as there’s no descending scan), Oracle at least used the index.

OK, let’s now run the exact same sequence of events, but this time using a Descending Index.

SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number, name varchar2(30));

Table created.

SQL> create index bowie_id_i on bowie(id desc);

Index created.

SQL> insert into bowie select rownum, 'DAVID BOWIE' from dual connect by level <= 100000;

100000 rows created.

SQL> commit;

Commit complete.

So it’s the exact same table and set of data. Let’s now look at the type of index created:

SQL> select index_type from dba_indexes where index_name = 'BOWIE_ID_I';

INDEX_TYPE
---------------------------
FUNCTION-BASED NORMAL

OK, Difference Number 1. A Descending Index is no ordinary “Normal” index, but is implemented as a  “Function-Based Normal” index instead. This means there’ll be a new hidden virtual column created behind the scenes and that the Rule Based Optimizer is going to have an issue here as it can’t cope with Function-based Indexes.

Let’s look at some Index_Stats:

SQL> analyze index bowie_id_i validate structure;

Index analyzed.
SQL> select lf_rows, lf_blks, pct_used from index_stats;

   LF_ROWS    LF_BLKS   PCT_USED
---------- ---------- ----------
    100000        426         50

Difference Number 2: This index is approximately double the size of the previous index and only half as efficient with its storage. Why ? Because as the data is now inserted in reverse logical order, the last index leaf block no longer receives the largest current index value and so 90-10 splits are not performed. As only 50-50 splits are performed, the index structure is left with 50% empty blocks which can not be reused. Unfortunately, a possible candidate for periodic index rebuilds …

Let’s now re-run the query using the CBO:

SQL> select * from bowie where id between 42 and 84 order by id desc;

43 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3472402785

------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)|Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |   208 |  6240 |     1   (0)|00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE      |   208 |  6240 |     1   (0)|00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_ID_I |     1 |       |     1   (0)|00:00:01 |
------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(SYS_OP_DESCEND("ID")>=HEXTORAW('3EAAFF')  AND
              SYS_OP_DESCEND("ID")<=HEXTORAW('3ED4FF') )
       filter(SYS_OP_UNDESCEND(SYS_OP_DESCEND("ID"))>=42 AND
              SYS_OP_UNDESCEND(SYS_OP_DESCEND("ID"))<=84)

Note
-----
   - dynamic sampling used for this statement
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          9  consistent gets
          0  physical reads
          0  redo size
       1092  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         43  rows processed

Difference Number 3. Although the same execution plan with the same number of consistent gets is performed, the cardinality estimates are not as accurate and the SYS_OP_DESCEND and SYS_OP_UNDESCEND functions are used as access/filter conditions as they’re the functions implemented in the function-based index.

If we run the same query using the Rule Based Optimizer (remember, we “forgot” to collect statistics on the table):

SQL> alter system set optimizer_mode='CHOOSE' scope=both;

System altered.

SQL> select * from bowie where id between 42 and 84 order by id desc;

43 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2027917145

------------------------------------
| Id  | Operation          | Name  |
------------------------------------
|   0 | SELECT STATEMENT   |       |
|   1 |  SORT ORDER BY     |       |
|*  2 |   TABLE ACCESS FULL| BOWIE |
------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter("ID"<=84 AND "ID">=42)

Note
-----
   - rule based optimizer used (consider using cbo)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        309  consistent gets
          0  physical reads
          0  redo size
       1092  bytes sent via SQL*Net to client
        418  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
         43  rows processed

Difference Number 4. The Rule based Optimizer does not support Function-Based Indexes and so the index is now completely ignored. Oracle has no choice here but to perform the much more expensive Full Table Scan, when previously the ascending index was used.

A Descending Index can potentially be useful in a concatenated, multi-column index, in which the columns could be ordered in a combination of ascending/descending order that could in turn return the data in a required specific order, thereby negating the need for a potentially expensive sort operation.

However, with a single column index, one would need to question the need for making such an index descending …

Having fun :) Enjoy your weekend !!

Descending Indexes Quiz (Up On The Ladder) September 8, 2011

Posted by Richard Foote in Descending Indexes, Oracle Indexes, Quiz.
13 comments

OK, you won’t find the answer to these questions on my blog, so using my search facility won’t be of any help :)

Actually, it’s quite an easy one this, honest ;)

If you have a query such as:

SELECT * FROM bowie WHERE id BETWEEN 42 and 84 ORDER BY id DESC;

1) Can a default B-Tree index on the single ID column that stores data in Ascending order be used by the CBO to retrieve the data automatically in the required Descending order, without the need for a sort operation ?

2) Can a Descending B-Tree index on the ID column be used by the CBO to retrieve the data automatically in Ascending order, without the need for a sort operation ?

3) Depending on your answers above, what are the differences (if any) between the implementation of an Ascending and Descending index ?

Enjoy !!

Best Method To Select One Row From Small Table – Solution (Revolution 1) September 7, 2011

Posted by Richard Foote in Oracle Indexes, Small Indexes.
4 comments

OK, time for some answers, although of course regular readers of this blog will already know the answer :)

When selecting one row from the small table as in the quiz in my previous post, the correct order is as follows:

1) PK access of an Index Organized Table.

This option only requires just the 1 consistent get (as the IOT only consists of just 1 block) and this consistent get is a “cheaper” consistent get examination, which in turn only requires the 1 latch. So 1 consistent get and 1 latch. For more info on this, see: http://richardfoote.wordpress.com/2009/05/27/indexes-and-small-tables-part-vii-cluster-one/.

2) Use of Unique Index With a Heap Table.

This option only requires 2 consistent gets (one to read the index leaf block and one to read the table block). There can only be a maximum of one row when a single equality predicate is specified as the index is Unique, which means that 2 consistent gets is the maximum necessary (there may only be the 1 consistent get if there are either no rows to be returned or if all required columns can be found in the index). Additionally, because the index is Unique, both consistent gets are the cheaper, 1 latch consistent get examinations. So that’s 2 consistent gets and 2 latches. For more information, see: http://richardfoote.wordpress.com/2009/05/13/indexes-and-small-tables-part-v-its-no-game/.

3) Use of Non-Unique Index With a Heap Table

This option requires at most 3 consistent gets (one to read the leaf block, one to access the table and possibly one more to perform an additional fetch operation and checking the leaf block again in case of more rows, possibly necessary as the index is Non-Unique and there could be more than one row that matches an equality predicate). Unfortunately, as the index is Non-Unique, the consistent gets are the full-blown consistent gets which requires the buffer block to be pinned/unpinned via 2 latch calls. If the last value is specified, then the additional fetch may be unnecessary and if the index contains all the necessary columns, then just 1 consistent get would be necessary. But in the example provided, we’re looking at typically 3 consistent gets and 6 latches. For more information, see: http://richardfoote.wordpress.com/2009/05/05/indexes-and-small-tables-part-iv-treefingers/.

4) Use of a Full Table Scan

This option requires as a minimum 4 consistent gets (as Oracle needs to not only access the one block containing the 42 rows, but additionally the table segment header multiple times in order to determine the extent map, segment HWM etc.). These additional consistent gets are not generally an issue as these overheads are negligible for a typical FTS of a typical table. But in this example, with a tiny table, these consistent gets makes the difference. Note also that all 4 consistent gets require the block to be pinned/unpinned and so require 2 latch gets each. So that’s 4 consistent gets and 8 latches, minimum, even if all the rows can fit in 1 table block. On its own, no big deal, but if this small lookup table is accessed 10,000 times a minute, that’s potentially a lot of extra CPU and contention when performing a FTS over the above options. For more info, see: http://richardfoote.wordpress.com/2009/04/16/indexes-on-small-tables-part-i-one-of-the-few/ and the other articles on Indexes on Small Tables.

To those that got it right, well done. To those that didn’t, hopefully you’ve learnt something useful.

The moral of the story, no table is too small to potentially benefit from an index, even if the table is only one block in size :)

New question and discussion tomorrow.

Best Method To Select One Row From Small Table Quiz (Each Small Candle) September 5, 2011

Posted by Richard Foote in Oracle Indexes, Quiz, Small Indexes.
22 comments

Assume you have a tiny little table with just 42 rows (naturally) that all fit in one table block. Order the following options in order of “efficiency” (most efficient option first) when accessing just one of these rows:

1) Full Table Scan of Heap Table

2) PK access of an Index Organised Table

3) Index access of Heap Table via a Unique Index

4) Index access of Heap Table via a Non-Unique Index

If you think any of the options are the same, then you can order them as follows (example only):

1) Option 1

2) Option 2

2) Option 3

4) Option 4

Answer in the next few days …

UPDATE: just to clarify based on comments already made.

Yes, any index must visit the table as there are required columns within the table that are not stored in the index (this is implied by the above options). The table has only ever contained 42 rows and the are no additional table blocks below the table HWM (not that this really makes a difference to the answer). To keep it simple, the column being queried has a NOT NULL constraint (although it doesn’t really matter, except for when you want to put a PK constraint on it such as with the IOT option).

METHOD_OPT=> SIZE AUTO Quiz Solution (The Trickster) September 1, 2011

Posted by Richard Foote in CBO, Histograms, Oracle Indexes, Oracle Statistics.
16 comments

I was going to leave it for a few days but there have already been so many comments and discussions on all this, I thought I better write something up. In case anyone was wondering, yes I probably am driving my colleagues at work mad with my “Question of the Day” !!

Unfortunately, some might be disappointed at both Oracle and myself :)

Yes, I did kinda set things up to trick the unwary and yes, perhaps the answer isn’t what many are expecting.

The answer to my previous question of which column is going to have a histogram when using the METHOD_OPT  SIZE AUTO option is in fact Column 2. Well done to everyone who got it right.

Why ?

The simplest answer is because it’s the only column of the three that has 254 or less distinct values.

Here’s the key point. When using METHOD_OPT SIZE AUTO, every column with 254 or less distinct values that has been referenced within a predicate, will have a Frequency-based histogram. Each and every one of them, regardless of whether the data is actually skewed or not. So Column 2 with only 254 distinct values AND having previously been referenced in a predicate was guaranteed to have a histogram.

If a column has more than 254 distinct values, whether it then has a Height-Based histogram depends on how the data is skewed. If the data is perfectly evenly distributed, then it won’t have a histogram. Column 1, having sequenced based unique values will not meet the criteria and so not have a histogram.

Column 3 is interesting. Having inserted the outlier value, it now has 255 distinct values and so no longer qualifies for an automatic frequency based histogram. However, if all its values are evenly distributed, then it won’t qualify for a height based histogram either and Column 3 only has just the one outlier value, all other values are evenly distributed values. Unfortunately, Oracle doesn’t pick up on rare outlier values (even if you collect 100% statistics and it’s one of the low/high points of the column) and so will not generate a height-based histogram.

The only column that qualifies is Column 2.

A demo to illustrate. First, let’s create and populate our table:

SQL> create table bowie (id number, code1 number, code2 number);

Table created.

SQL> insert into bowie select rownum, mod(rownum,254), mod(rownum,254) from dual  connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

Notice I’m using a MOD function to generate a perfectly even distribution of data. I’ve noticed a few examples (such as that by Charles Hooper in the comments of the Quiz posting), in which the DBMS_RANDOM function is used. Note this will almost certainly generate data with enough natural skewness on a 1M table with 254 random values that when the outlier 255th value is introduced, it will qualify for a height-based histogram. Very easy way to test and find out. Simply generate the 1M data with 255 random values and I suggest a height-based histogram is created regardless.

OK, I’ll run some SQL to generate sufficient workload to qualify the columns for automatic histograms:

SQL> select * from bowie where id = 42;
SQL> select * from bowie where code1 = 42;
SQL> select * from bowie where code2 = 42;

BTW, the difference between the SIZE AUTO and SIZE SKEWONLY options, is that AUTO requires previous workload to suggest a histogram might be relevant, SKEWONLY does not. 

If we were to collect statistics at this stage, we would notice that the second and third columns both have a Frequency-Based histogram as both columns only have 254 distinct values and so automatically qualify:

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE', estimate_percent=> null, cascade=>true);

PL/SQL procedure successfully completed.

SQL> select column_name, histogram from dba_tab_columns where table_name = 'BOWIE';

COLUMN_NAME                    HISTOGRAM
------------------------------ ---------------
ID                             NONE
CODE1                          FREQUENCY
CODE2                          FREQUENCY

If we were to run a query using the third column, notice how the cardinality estimates aren’t too bad in this example:

SQL> select * from bowie where code2 > 600;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     1 |    13 |   660   (2)| 00:00:08 |
|*  1 |  TABLE ACCESS FULL| BOWIE |     1 |    13 |   660   (2)| 00:00:08 |
---------------------------------------------------------------------------

There are no rows that are greater than 600 and so an estimate of 1 isn’t too bad at all.

OK, let’s add in this one, tiny little row and collect fresh, <strong>100% accurate statistics</strong> (Note: the accurate statistics is very important as Niall’s examples has demonstrated):

&nbsp;

SQL> insert into bowie values (1000001, 42, 99999999);

1 row created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE', estimate_percent=> null, cascade=>true);

PL/SQL procedure successfully completed.

SQL> select column_name, histogram from dba_tab_columns where table_name = 'BOWIE';

COLUMN_NAME                    HISTOGRAM
------------------------------ ---------------
ID                             NONE
CODE1                          FREQUENCY
CODE2                          NONE

Note that the third column now has 255 distinct values and so no longer qualifies for the automatic Frequency-Based histogram. As most of its data is perfectly evenly distributed with just the one outlier value, the column doesn’t qualify for a Height-based histogram either and so now has no histogram at all.

Note as I collected 100% accurate statistics, Oracle is definitely aware of this outlier value:

SQL> select column_name, low_value, high_value from dba_tab_columns where table_name='BOWIE' and column_name='CODE2';

COLUMN_NAME  LOW_VALUE  HIGH_VALUE
------------ ---------- ------------
CODE2        80         C464646464

SQL> var high_num number
SQL> exec dbms_stats.convert_raw_value('C464646464',:high_num);

PL/SQL procedure successfully completed.

SQL> print high_num

  HIGH_NUM
----------
  99999999

But it’s not enough for Oracle to automatically generate a histogram. Which is a shame really, because now we can have all sorts of problems:

SQL> select * from bowie where code2 > 600;
Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |   999K|    12M|   660   (2)| 00:00:08 |
|*  1 |  TABLE ACCESS FULL| BOWIE |   999K|    12M|   660   (2)| 00:00:08 |
---------------------------------------------------------------------------

When previously it had the cardinality estimates spot on, now they’re terrible (expecting not 1 row but 999,000 rows !!) because without a histogram, Oracle is assuming even distribution between its low and high point values.

I’m not a great fan of either the SIZE AUTO or SIZE SKEWONLY options ;)

Hope you’re enjoying these little quizzes, I’ll have another one for you all soon.

METHOD_OPT => SIZE AUTO Quiz (Automatic For The People) August 31, 2011

Posted by Richard Foote in Method_Opt Size AUTO, Oracle Indexes, Oracle Statistics.
40 comments

OK, a nice, easy, simple one today. No tricks, honest ;)

You have a table with 3 columns and lets say 1M rows.

Column 1 is effectively unique, so it has 1M distinct values. Let’s say 1 – 1000000, not that it really matters.

Column 2 has 254 distinct values, all evenly distributed so it has approximately the same number of rows for each value. Let’s say the values are 0-253 again it doesn’t really matter.

Column 3 is identical to Column 2, it also has 254 distinct values, all evenly distributed as well such that it also has approximately the same number of rows for each value. Let’s say the range of values are the same, 0-253, again it doesn’t really matter.

You have various queries in the database in which all 3 columns are referenced somewhere in WHERE conditions (eg. WHERE Column1 = 42).

You then insert just one row that has the following values based on our example: VALUES (1000001, 42, 99999999).

The key points here is that for Column1, it’s just another unique value, just 1 greater than the previous maximum value. Nothing special.

For Column2, it’s just another of the existing 254 values that doesn’t really change the even distribution of the data. Nothing special.

However, for Column 3, it’s not only a new value that didn’t previously exist (and so there’s just the one row with this value in the data, whereas all the other values correspond to roughly 1/254 of the rows) but it’s also a value that is way way way outside the normal range of existing values (nothing comes close to having a value of 99999999).

OK, we have the scenario, hopefully you can see where I going with this.

You decide to collect fresh statistics with DBMS_STATS, you want them to be completely accurate so you use a 100% sample size (or compute with estimate_percent=>null). But because you want to get with the AUTO program and make life easier for yourself, you decide to let Oracle work out which columns might require and benefit from a histogram by using METHOD_OPT=>’ FOR ALL COLUMNS SIZE AUTO’.

Now finally comes the question. Of the three columns, only one column will have a histogram. Which one and why is it so ?

If you understand how Oracle collects statistics, the answer will hopefully be obvious ;)

MIN / MAX Quiz Answer (One Shot) August 31, 2011

Posted by Richard Foote in CBO, Index Full Scan (Min/Max), MAX, MIN, Oracle Indexes.
9 comments

Not only are my regular blog readers a good deal better looking than the average person, but they’re quite a bit smarter as well :)

As most people have correctly identified, the answer I was after to my previous Min/Max Quiz is that Option 1 is indeed the odd one out, as it’s the only option that can’t make use of the Index Full Scan (Min/Max) access path.

If you’re after either the minimum or the maximum of a column value and the column is indexed, the CBO can potentially use the Index Full Scan (Min/Max), which simply navigates to the first OR last leaf block in the index structure looking for the Min or Max value in question. Oracle can of course navigate to the first (left-most) or last (right-most) leaf blocks very easily by simply following the associated first/last pointers in the Root/Branch structure of the index. All things being equal and providing there haven’t been any subsequent deletes to empty out the index entries from these leaf blocks, Oracle can very quickly determine the minimum or maximum value of the column.

However, the Index Full Scan (Min/Max) can only visit one side of the index, not both. Therefore, if you want both the minimum and the maximum column value, an Index Full Scan (Min/Max) is not viable and the CBO is forced to look for other alternatives. It sounds like such a trivial thing to implement but that’s how it goes. I do remember way back when Oracle9i was released and the introduction of the Index Skip Scan I thought perhaps Oracle might also soon introduce an index skip scan version of Min/Max (as it basically just needs to “skip” all the index leaf blocks in the “middle” of the index via another lookup of the index), but it was not to be.

So for a query such as in Option 1, if the column IS NULL and does not have a NOT NULL constraint, then:

  
SQL> select min(id), max(id) from muse;

   MIN(ID)    MAX(ID)
---------- ----------
         1    1000000
Execution Plan
----------------------------------------------------------
Plan hash value: 421245806

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |     5 |  2125   (1)| 00:00:26 |
|   1 |  SORT AGGREGATE    |      |     1 |     5 |            |          |
|   2 |   TABLE ACCESS FULL| MUSE |  1000K|  4882K|  2125   (1)| 00:00:26 |
---------------------------------------------------------------------------

 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       7795  consistent gets
       7788  physical reads
          0  redo size
        470  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

  

Then an (expensive) Full Table Scan is likely the way to go. However, if the column has a NOT NULL constraint and the index is indeed smaller than the parent table, then:

  
SQL> alter table muse modify id not null;

Table altered.

SQL> select min(id), max(id) from muse;

   MIN(ID)    MAX(ID)
---------- ----------
         1    1000000
Execution Plan
----------------------------------------------------------
Plan hash value: 1592024618

-----------------------------------------------------------------------------------
| Id  | Operation             | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT      |           |     1 |     5 |   611   (1)| 00:00:08 |
|   1 |  SORT AGGREGATE       |           |     1 |     5 |            |          |
|   2 |   INDEX FAST FULL SCAN| MUSE_ID_I |  1000K|  4882K|   611   (1)| 00:00:08 |
-----------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2242  consistent gets
          0  physical reads
          0  redo size
        470  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

  

Then an Index Fast Full Scan becomes viable.

All the other options I’ve used to return the Min/Max of the column all incorporate two separate SELECT clauses and so all can potentially use an Index Full Scan (Min/Max) access path for each distinct clause.

Be it when using a UNION operation:

  
SQL> select min(id) as "MIN(ID)/MAX(ID)" from muse union all select max(id) from muse;

 MIN(ID)/MAX(ID)
---------------
              1
        1000000
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1370940131

-----------------------------------------------------------------------------------------
| Id  | Operation                   | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |           |     2 |    10 |     6  (50)| 00:00:01 |
|   1 |  UNION-ALL                  |           |       |       |            |          |
|   2 |   SORT AGGREGATE            |           |     1 |     5 |            |          |
|   3 |    INDEX FULL SCAN (MIN/MAX)| MUSE_ID_I |     1 |     5 |     3   (0)| 00:00:01 |
|   4 |   SORT AGGREGATE            |           |     1 |     5 |            |          |
|   5 |    INDEX FULL SCAN (MIN/MAX)| MUSE_ID_I |     1 |     5 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        456  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed

Although as pointed out in the comments, this does return 2 rows.

Or I could use Scalar Sub-Queries:

  
SQL> select (select min(id) from muse) "MIN(ID)", (select max(id) from muse) "MAX(ID)" from dual;

   MIN(ID)    MAX(ID)
---------- ----------
         1    1000000
Execution Plan
----------------------------------------------------------
Plan hash value: 2177063930

----------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |     1 |       |     2   (0)| 00:00:01 |
|   1 |  SORT AGGREGATE            |           |     1 |     5 |            |       |
|   2 |   INDEX FULL SCAN (MIN/MAX)| MUSE_ID_I |     1 |     5 |     3   (0)| 00:00:01 |
|   3 |  SORT AGGREGATE            |           |     1 |     5 |            |       |
|   4 |   INDEX FULL SCAN (MIN/MAX)| MUSE_ID_I |     1 |     5 |     3   (0)| 00:00:01 |
|   5 |  FAST DUAL                 |           |     1 |       |     2   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        468  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

  

Or indeed I could use a WITH clause:

  
SQL> with min_id as (select min(id) from muse), max_id as (select max(id) from muse) select * from min_id, max_id;

   MIN(ID)    MAX(ID)
---------- ----------
         1    1000000

 
Execution Plan
----------------------------------------------------------
Plan hash value: 3280440773

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)|Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    26 |     6   (0)|00:00:01 |
|   1 |  NESTED LOOPS                |           |     1 |    26 |     6   (0)|00:00:01 |
|   2 |   VIEW                       |           |     1 |    13 |     3   (0)|00:00:01 |
|   3 |    SORT AGGREGATE            |           |     1 |     5 |            |         |
|   4 |     INDEX FULL SCAN (MIN/MAX)| MUSE_ID_I |     1 |     5 |     3   (0)|00:00:01 |
|   5 |   VIEW                       |           |     1 |    13 |     3   (0)|00:00:01 |
|   6 |    SORT AGGREGATE            |           |     1 |     5 |            |         |
|   7 |     INDEX FULL SCAN (MIN/MAX)| MUSE_ID_I |     1 |     5 |     3   (0)|00:00:01 |
------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        470  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

They’re all subtly different but they all can make use of the Index Full Scan (Min/Max) scan for each separate SELECT clause and they all can perform the necessary resultant work with just 6 consistent gets in my example.

More quizzes to come …

MIN/MAX Quiz (Funtime) August 29, 2011

Posted by Richard Foote in MAX, MIN, Oracle Indexes.
8 comments

At my work I’ve recently introduced a little “Question of the Day” for my fellow DBAs, hopefully to pass on a few interesting little titbits of information and have a bit of fun along the way.

I was going to just write a blog article on the following topic of how Oracle deals with MIN and MAX queries, but decided to give you all the chance to have a bit of a think first via a little quiz. This was a question I asked recently, see what you think:

There are many ways one could write a query to determine the MIN and MAX of a column. That said, which of the following is the odd one out:

1) select min(id), max(id) from muse;

2) select min(id) as “MIN(ID)/MAX(ID)” from muse union all select max(id) from muse;

3) select (select min(id) from muse) “MIN(ID)”, (select max(id) from muse) “MAX(ID)” from dual;

4) with min_id as (select min(id) from muse), max_id as (select max(id) from muse) select * from min_id, max_id;

 

It’s a “big” table, there’s an index on the ID and the ID has a not null constraint (the constraint doesn’t actually matter to the answer of the question, but it can make a subtle difference).

Now, one could come up with lots of different possible answers (eg: option 4 is the only one that uses a WITH clause), however the answer I’m after is one that might be best associated with the common theme of this blog.

You don’t have to make a comment, you can just quietly ponder which option is somewhat different or maybe consider which one you may not want to use.

No more clues :)

More details in a few day’s time.

PS. Folk from my work and Jonathan Lewis may not answer as they already know the answer I’m after :)

BLEVEL 1 => BLEVEL 2 (Teenage Wildlife) August 23, 2011

Posted by Richard Foote in BLEVEL, CBO, Oracle Indexes.
4 comments

Jonathan Lewis recently wrote a really nice blog piece blevel=1 on the dangers of an index toggling between BLEVEL 1 and BLEVEL 2. I thought it would be useful to demonstrate this issue with a quick demo (Note: this example is on 11.2.0.1, with an 8K block size).
 
First, create a simple little table with 336,000 rows and an index on an ID number column:

  
SQL> create table major_tom (id number, code number, name varchar2(30));
 
Table created.
 
SQL> create index major_tom_i on major_tom(id);
 
Index created.
 
SQL> insert into major_tom select rownum, mod(rownum,100), 'GROUND CONTROL' from dual connect by level <=336000;
 
336000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'MAJOR_TOM', cascade=> true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> select blevel, leaf_blocks, clustering_factor from dba_indexes where index_name='MAJOR_TOM_I';
 
    BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
         1         671              1296

 
Note the index has 671 leaf blocks and a Blevel=1. This of course means the index basically consists of a Root Block, which in turn references all its 671 leaf blocks. Therefore to read a specific index entry, requires a read of the index root block followed by a read of the specific index leaf block. That’s 2 reads in total.
 
Let’s run a query to return one row (note the ID column is effectively unique although I’ve only created a non-unique index):
 

SQL> select * from major_tom where id = 42;
 

Execution Plan
----------------------------------------------------------
Plan hash value: 4155681103
 
-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    23 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| MAJOR_TOM   |     1 |    23 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | MAJOR_TOM_I |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID"=42)
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        531  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

 
 

Note: The cost of using the index is just 1 not 2 as perhaps expected. This is due to the CBO ignoring the Blevel in its calculations when the Blevel = 1.  As the index is relatively small, the CBO takes the approach that the root block is very likely already cached and so is not worth costing.
 
As the data is perfectly evenly distributed and effectively unique, the CBO has correctly estimated the number of returned rows as just 1. Therefore, the overall cost of the execution plan is just 2, 1 to read the leaf block and 1 to read the table block.
 
Notice that the number of consistent gets is 4. 1 to read the index root block, 1 for the index leaf block, 1 for the table block and as the index is non-unique, 1 for an additional fetch performed to check the index again that there are no further rows to be returned.
 
If we now create another table of 1M rows that will be used in a join operation:
 

 
SQL> create table ziggy (id number, code number, name varchar2(30));
 
Table created.
 
SQL> insert into ziggy select rownum, mod(rownum,10000), 'ZIGGY STARDUST' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'ZIGGY', cascade=> true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

If we now join these 2 tables and select a moderate number of rows:
 

 
SQL> select * from ziggy z, major_tom m where z.id = m.id and z.code in (42, 4242);
 
68 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 2011771477
 
--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |   200 |  9400 |  1372   (2)| 00:00:17 |
|   1 |  NESTED LOOPS                |             |       |       |            |          |
|   2 |   NESTED LOOPS               |             |   200 |  9400 |  1372   (2)| 00:00:17 |
|*  3 |    TABLE ACCESS FULL         | ZIGGY       |   200 |  4800 |  1105   (2)| 00:00:14 |
|*  4 |    INDEX RANGE SCAN          | MAJOR_TOM_I |     1 |       |     1   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| MAJOR_TOM   |     1 |    23 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("Z"."CODE"=42 OR "Z"."CODE"=4242)
   4 - access("Z"."ID"="M"."ID")
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       4175  consistent gets
       4024  physical reads
          0  redo size
       1950  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         68  rows processed

The CBO goes for a Nested Loop join, primarily because the inner table is only accessed a relatively small number of times AND because the cost of doing so via the index is so damn cheap.
 
However, if we add just a few more rows and collect fresh statistics …
 

 
SQL> insert into major_tom select rownum+336000, mod(rownum,100), 'GROUND CONTROL' from dual connect by level <=500;
 
500 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'MAJOR_TOM', cascade=> true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 

SQL> select blevel, leaf_blocks, clustering_factor from dba_indexes where index_name='MAJOR_TOM_I';
 
    BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
         2         672              1298

The index has now toggled over to become a Blevel 2 index. We only added a handful of rows resulting in just the one additional index leaf block, but 672 is just too many to be referenced within the one index root block in this example. The root block has split, two new index branches have been created that now reference the leaf blocks and the root block now only references the two new branch blocks.
 
Overall, the changes are quite minor but the ramifications can be quite dramatic …
 
If we now select one row again:
 

 
SQL> select * from major_tom where id = 42;
 

Execution Plan
----------------------------------------------------------
Plan hash value: 4155681103
 
-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    23 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| MAJOR_TOM   |     1 |    23 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | MAJOR_TOM_I |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID"=42)
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        531  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

The cost of the index has now jumped up by 2 from 1 to 3 with the overall costs up from 2 to 4, even though the index is practically the same size. As the Blevel is now 2, the CBO now includes the cost of the Blevel in its calculations. The cost associated with accessing the root block and a branch block all now count. Although overall the costs are still low, this increase actually represents a 100% increase in the use of the index for an equality search.
 
This increase can be significant if the index needs to be accessed multiple times. Let’s now re-run the join query:
 

 
SQL> select * from ziggy z, major_tom m where m.id = z.id and z.code in (42, 4242);
 
68 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 1928189744
 
--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |   200 |  9400 |  1485   (2)| 00:00:18 |
|*  1 |  HASH JOIN         |           |   200 |  9400 |  1485   (2)| 00:00:18 |
|*  2 |   TABLE ACCESS FULL| ZIGGY     |   200 |  4800 |  1105   (2)| 00:00:14 |
|   3 |   TABLE ACCESS FULL| MAJOR_TOM |   336K|  7558K|   378   (1)| 00:00:05 |
--------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("M"."ID"="Z"."ID")
   2 - filter("Z"."CODE"=42 OR "Z"."CODE"=4242)
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5366  consistent gets
       4024  physical reads
          0  redo size
       1964  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         68  rows processed

Although it’s retrieving exactly the same data, the execution plan has changed significantly. The Nested Loop join is no longer as appealing to the CBO as the cost of accessing the inner table via the index has now effectively doubled. The CBO has now gone for a Hash Join, accessing both tables via Full Tables Scans. The overall cost of the Nested Loop plan was 1372, but this has increased to over 1485, the cost of the now so-called more efficient Hash Join plan.
 
If you have indexes that are on the boundary of increasing from a blevel=1 to a blevel=2, execution plans can potentially change significantly based on the differences in how indexes get costed. This can be especially troublesome when such indexes get regularly rebuilt as they may toggle between Blevel 1 and 2 based on associated space savings and can sometimes result in unpredictable performance depending on when new statistics get collected.
 
I liken it to a child growing up from being a young kid to a teenager. It may only be a difference of a year or so but boy, can the differences be dramatic !!

Indexing A Column With Just One Distinct Value (All The Madmen) August 10, 2011

Posted by Richard Foote in CBO, Oracle Indexes.
11 comments

When one thinks about a column that might benefit from being indexed, one generally considers columns with lots of different values so that the selectivity of the column is such that relatively few rows get selected, making the index appealing to the cost based optimizer.
 
There are of course many exceptions to this generalisation and there are times when an index is the perfect and most efficient method when selecting 100% of all data, even if it involves a normal index range scan reading every row out of the table.
 
There are also times when a column that has very few distinct values should be indexed, even with a standard B-tree index.
 
In this example, I’ll index a column that only has just the 1 single value and the CBO will choose to use the index gladly.
 
Just going to create my standard little table and create an index on the CODE column:
 

 
SQL> create table bowie (id number, code number, name varchar2(50));
 
Table created.
 
SQL> create index bowie_code_i on bowie(code);
 
Index created.

 

 

I’ll now load the table with 1 million rows but all the CODE column will remain unpopulated and consist only of NULLS:
 

 
SQL> insert into bowie select rownum, null, 'Ziggy Stardust and the Spiders From Mars' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> commit;
 
Commit complete.

 

 

I’ll only populate one in every 10,000 rows that a CODE value of 42 (of course):
 

 
SQL> update bowie set code = 42 where mod(id,10000) = 0;
 
100 rows updated.
 
SQL> commit;
 
Commit complete.

 

 

Let’s collect accurate statistics, however note I’m not collecting histograms even though there are relatively few rows that have a CODE value of 42:
 

 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', cascade=> true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

 

 

Indeed, the column only has the one value (42) which occurs relatively infrequently in my data, most rows however remain as NULL:
 

 
SQL> select code, count(*) from bowie group by code;
 
      CODE   COUNT(*)
---------- ----------
               999900
        42        100

 

 

Note that the index statistics clearly shows it only has the 1 distinct value. Remember, NULLS are not indexed by default in B-Tree indexes and as a result the index is tiny with just the one leaf block:
 

 
SQL> select blevel, leaf_blocks, distinct_keys from dba_indexes where index_name='BOWIE_CODE_I';
 
    BLEVEL LEAF_BLOCKS DISTINCT_KEYS
---------- ----------- -------------
         0           1             1

 

 

Note also that the column statistics clearly highlight there’s just the one distinct value, however it also records that there are many rows (999900) that are NULL:

 
 

 
SQL> select column_name, num_distinct, num_nulls from dba_tab_columns where table_name = 'BOWIE' and column_name = 'CODE';
 
COLUMN_NAME  NUM_DISTINCT  NUM_NULLS
------------ ------------ ----------
CODE                    1     999900

 

 

Therefore, Oracle with accurate statistics and without requiring any histograms has all the information it needs to know when selecting rows that contain the one and only distinct value of our CODE column, that it will only actually be selecting 100 rows out of the 1 million rows in the table.

 
 

 
SQL> select * from bowie where code = 42;
 
100 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 1602289932
 
--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |   100 |  4700 |   101   (0)| 00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE        |   100 |  4700 |   101   (0)| 00:00:02 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_CODE_I |   100 |       |     1   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("CODE"=42)
 

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        102  consistent gets
          0  physical reads
          0  redo size
       1423  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

 

 

The CBO has the rows estimate/cardinality spot on (100) and has decided that the index is indeed the most efficient access path to select the 100 rows of interest.

With flag and bollean type columns, it might be worth some consideration simply storing the one value when one value is relatively rare, such that the resultant index can take advantage of not having to store any of the corresponding NULL values.
 
Even columns with just the one distinct value can potentially benefit from being indexed …

Bitmap Indexes & Minimize Records_Per_Block (Little Wonder) July 19, 2011

Posted by Richard Foote in Bitmap Indexes, MINIMIZE RECORDS_PER_BLOCK, Oracle Indexes.
Tags:
6 comments

As mentioned in my first post regarding the use of Bitmap Indexes to service Not Equal predicates, an index entry in a Bitmap Index consists of the indexed value and a pair of rowids that determine the range of potential rows spanned by the corresponding bitmap sequence. However, Oracle has no way of determining how many rows might actually be in each specific data block and so must make an assumption that all blocks might hold the maximum number of rows that could potentially fit within a block and assign a bitmap bit accordingly. If a row doesn’t actually exist, then it’s simply a “phantom” row and is assigned a 0 to signify that it doesn’t contain the value of the index entry.
 
This maximum number of possible rows that could potentially fit in a block is called the “Hakan Factor” and is determined at the creation of the table based on the definition of the table (such as number of columns, type of columns, whether they’re nullable, etc.) and of course the block size. The smaller the possible size of the row, the more rows that could fit in a block and the more bits that need to be assigned by the Bitmap Index to cover all possible rowids within the rowid range within a Bitmap Index entry. As an example within an 8K block, taking into consideration block overheads, the maximum number of rows within a block that Oracle can potentially estimate can be as many as 736 rows.
 
These additional 0s that get assigned to cater for rows that might not actually exist, although compressed to some degree, still takes up some space within the index. This additional space can be very significant if:
 
– The difference between the minimum possible size of a row and the actual average size of a row is large (or to put it another way, if the difference between the estimated number of rows per blocks and the actual number of rows per block is large)
– The effective clustering of the indexed data is poor within the table as this will limit the effective compression of the additional 0 bits
 
To highlight how all this can make a significant difference to the size of a Bitmap Index, a simple demo as usual to illustrate.
 
First, I’m going to create a table that has a number of nullable VARCHAR2(100) fields, so they might contain up to 100 characters or perhaps no value at all. The potential size of a row might be tiny or it might be quite large.

 
SQL> create table muse (id number, code number, name1 varchar2(100), name2 varchar2(100), name3 varchar2(100), name4 varchar2(100), name5 varchar2(100), name6 varchar2(100), name7 varchar2(100), name8 varchar2(100), name9 varchar2(100), name10 varchar2(100));
 
Table created.

 
OK, time to populate the table. A couple of key points with the data I’m going to use.
 
Firstly, the CODE column is going to have 100 distinct values but these values will be evenly distributed throughout the entire table. So the clustering associated with this column will be terrible, as bad as it gets.
 
Secondly, although all the VARCHAR2(100) columns might not contain much data (or indeed any at all), in actual fact they’re going to be almost fully populated with data. So although the potential average size of a row could have been quite tiny, in actual fact all the rows are quite large. Although we could potentially have fitted many rows within our (8K) block, in actual fact we’re only going to be able to fit just 7 rows per block. There isn’t actually a single block within our table that contains more than 7 rows.

 
SQL> insert into muse select rownum, mod(rownum,100), 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia','Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'MUSE', estimate_percent=>null, cascade=>true, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> select table_name, num_rows, blocks, avg_row_len from dba_tables where table_name='MUSE';
 
TABLE_NAME   NUM_ROWS     BLOCKS AVG_ROW_LEN
---------- ---------- ---------- -----------
MUSE          1000000     145549         998

 
 
Let’s now create a Bitmap Index on the CODE column. I’ll set the PCTFREE to 0 to build the smallest possible index structure:

 
SQL> create bitmap index muse_code_i on muse(code) pctfree 0;
 
Index created.
 
SQL> select index_name, leaf_blocks, avg_leaf_blocks_per_key, num_rows from dba_indexes where index_name = 'MUSE_CODE_I';
 
INDEX_NAME  LEAF_BLOCKS AVG_LEAF_BLOCKS_PER_KEY   NUM_ROWS
----------- ----------- ----------------------- ----------
MUSE_CODE_I         400                       4        800

 
 
So the Bitmap Index currently consists of 400 leaf blocks.
 
As we now know this table has rows that on average are considerably larger than the minimum possible row size, the Bitmap Index has had to cater for the possible existence of many rows that don’t actually exist. Additionally, as the clustering of the indexed data is very poor, the Bitmap Index will not be able to effectively compress these additional 0 bits as much as it might, as there are bits set to 1 littered all over the place that will hamper the effective compression capabilities of the index (I’ve discuss the impact of the Clustering Factor on the effectiveness of Bitmap Index compression previously).
 
Therefore, it might well be beneficial to more accurately determine the number of rows that really exist within a block. We can change the Hakan Factor by altering the table with the MINIMIZE RECORDS_PER_BLOCK clause. Effectively this results in Oracle performing a full table scan, checking for the number of rows per block (a quick check of the nrow count in the block header suffices) and keeping track of the block that currently contains the most number of rows. The highest value of the nrow count within the table becomes the new Hakan Factor.
 
Let’s give it a go:

 
SQL> alter table muse minimize records_per_block;
alter table muse minimize records_per_block
*
ERROR at line 1:
ORA-28602: statement not permitted on tables containing bitmap indexes

 
Unfortunately, this statement is not permitted if there are already any Bitmap indexes assigned to the table as they have already been based on the current Hakan Factor. All current Bitmap Indexes assigned to the table must first be dropped.

 
SQL> drop index muse_code_i;
 
Index dropped.
 
SQL> alter table muse minimize records_per_block;
 
Table altered.

 
 
OK, so now Oracle has a much more accurate picture of the actual number of rows that exist within a block in this table. The new Hakan Factor is based on the maximum number of rows that actually currently exist within a block in the table (just 7 in this specific example), which is significantly less than was defined previously. Oracle ensures the integrity of the new Hakan Factor from here on in by now limiting the number of rows that can be inserted into blocks within the table to this new value, even if in the future additional rows could potentially have fitted within a block. Once the Hakan Factor is reached, the block is taken off the freelist or marked as full in an ASSM segment.
 
Now any Bitmap Index on this table only has to cater for a relatively small number of rows per block, vastly reducing the number of bits that need to be considered and stored.
 
This can significantly reduce the overall size of associated bitmap indexes:
 

 
SQL> create bitmap index muse_code_i on muse(code) pctfree 0;
 
Index created.
 
SQL> select index_name, leaf_blocks, avg_leaf_blocks_per_key, num_rows from dba_indexes where index_name = 'MUSE_CODE_I';
 
INDEX_NAME  LEAF_BLOCKS AVG_LEAF_BLOCKS_PER_KEY   NUM_ROWS
----------- ----------- ----------------------- ----------
MUSE_CODE_I         150                       1        300

The new Bitmap Index is now only 150 leaf blocks in size, substantially smaller than the previous 400 leaf blocks.

Bitmap Indexes and Not Equal Part II (Sheep) July 7, 2011

Posted by Richard Foote in Bitmap Indexes, NOT Equal, Oracle Indexes.
11 comments

An excellent comment/question by mdinh made me realise my demos in Part I might be a little extreme in returning 0 rows and perhaps give the false impression that Not Equal conditions are only considered or applicable if no rows are returned. This is not the case and with the bitmap index now considered with Not Equal conditions, the choice of whether or not to actually use the index as usual comes down to the comparative costs associated with the available plans. 

So, I’ll expand on my demo a tab by introducing a new value for the FLAG column: 

SQL> update radiohead
  2  set flag = 1
  3  where rownum < 101;

 100 rows updated.

 SQL> commit;

 Commit complete.

 SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'RADIOHEAD',cascade=>true, estimate_percent=>null, method_opt=> 'FOR COLUMNS FLAG SIZE 5');

 PL/SQL procedure successfully completed.

 

 
OK, so now we have some 100 rows which have a value of FLAG which are not equal to 42, which are evenly distributed among all 5 CODE values. I’ve created a histogram however on the FLAG column as the 2 values (1 and 42) are not evenly distributed.

Let’s run the query now:

SQL> select * from radiohead where code = 1 and flag <> 42;
20 rows selected.

 
Execution Plan
----------------------------------------------------------
Plan hash value: 2786215024

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    20 |   300 |     46  (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    20 |   300 |     46  (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |            |          |
|   3 |    BITMAP MINUS              |                  |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |            |          |
|*  5 |     BITMAP INDEX SINGLE VALUE| RADIOHEAD_FLAG_I |       |       |            |          |
-------------------------------------------------------------------------------------------------

 
Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CODE"=1)
   5 - access("FLAG"=42)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         47  consistent gets
          0  physical reads
          0  redo size
        775  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

 

 
We notice a few key points. Firstly, as we have a histogram on the FLAG column and the data is perfectly evenly distributed among the CODE values, the CBO has got the estimated cardinality of 20 rows spot on. So all things being equal, we can have some confidence the CBO has done the right thing and selected the most efficient execution plan.

We also notice that the cost has now gone up considerably to 46 (up from 3) but it’s still significantly less than the cost of 761 associated with a Full Table Scan. Therefore, the CBO has still chosen the same execution plan with the two bitmap indexes returning the 20 rows, as it did when it returned none in the previous example.

 
In answer to another comment/question by SJ12345, regarding the use of unbounded predicates, if we now try the following:

SQL> select * from radiohead where code = 1 and flag > 42;

no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2939001425

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    10 |   150 |      6  (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    10 |   150 |      6  (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |            |          |
|   3 |    BITMAP AND                |                  |       |       |            |          |
|   4 |     BITMAP MERGE             |                  |       |       |            |          |
|*  5 |      BITMAP INDEX RANGE SCAN | RADIOHEAD_FLAG_I |       |       |            |          |
|*  6 |     BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |            |          |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("FLAG">42)
       filter("FLAG">42)
   6 - access("CODE"=1)

 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        435  bytes sent via SQL*Net to client
        384  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

Yep, cheap with Bitmap Indexes says the CBO. Note the difference here though is that the CBO uses a BITMAP MERGE to first get all possible rowid values of FLAG that are > 42 and then uses a BITMAP AND operation in combination with the CODE Bitmap index to get all rowids that match from both Bitmap indexes. However, as it evaluates the Bitmap Index on the FLAG index first and there are no index entries with a value > 42, it doesn’t have to actually worry about the CODE condition as no rows can possibly be returned. Therefore a very tiny 2 consistent gets are all that are necessary. 

The following will looking for anything < than 42, remembering we now have 20 rows that meet this condition:

SQL> select * from radiohead where code = 1 and flag < 42;

20 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2939001425

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    20 |   300 |      8  (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    20 |   300 |      8  (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |            |          |
|   3 |    BITMAP AND                |                  |       |       |            |          |
|   4 |     BITMAP MERGE             |                  |       |       |            |          |
|*  5 |      BITMAP INDEX RANGE SCAN | RADIOHEAD_FLAG_I |       |       |            |          |
|*  6 |     BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |            |          |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("FLAG"<42)
       filter("FLAG"<42)
   6 - access("CODE"=1)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        775  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

 
Yep, also cheap with Bitmap Indexes, using the same plan as the previous > than example but using more consistent gets as there are a number of rows that need to be accessed this time (although all in the same data block).

To now complete the picture:

SQL> select * from radiohead where code = 1 and (flag < 42 or flag > 42);

20 rows selected.

 
Execution Plan
----------------------------------------------------------
Plan hash value: 3720408756

--------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Rows  | Bytes | Cost(%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |    30 |   450 |   14   (0)| 00:00:01 |
|   1 |  CONCATENATION                |                  |       |       |           |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    10 |   150 |    6   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS|                  |       |       |           |          |
|   4 |     BITMAP AND                |                  |       |       |           |          |
|   5 |      BITMAP MERGE             |                  |       |       |           |          |
|*  6 |       BITMAP INDEX RANGE SCAN | RADIOHEAD_FLAG_I |       |       |           |          |
|*  7 |      BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |           |          |
|   8 |   TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    20 |   300 |    8   (0)| 00:00:01 |
|   9 |    BITMAP CONVERSION TO ROWIDS|                  |       |       |           |          |
|  10 |     BITMAP AND                |                  |       |       |           |          |
|  11 |      BITMAP MERGE             |                  |       |       |           |          |
|* 12 |       BITMAP INDEX RANGE SCAN | RADIOHEAD_FLAG_I |       |       |           |          |
|* 13 |      BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |           |          |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("FLAG">42)
       filter("FLAG">42)
   7 - access("CODE"=1)
  12 - access("FLAG"<42)
       filter(LNNVL("FLAG">42) AND "FLAG"<42)
  13 - access("CODE"=1)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        775  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

We now get a combination of both previous plans, concatenated together. Note in this case, it’s actually a cheaper and more efficient alternative to the first Not Equal example. Having got the setup for this demo, you can of course create the same demo yourselves and have a play and experiment. Me, I’m now off to watch Cadel Evans win the Tour De France :)

Bitmap Indexes and Not Equal (Holy Holy) July 5, 2011

Posted by Richard Foote in Bitmap Indexes, NOT Equal, Oracle Indexes.
14 comments

Way back, I previously discussed how the CBO will simply ignore any possible indexes when determining the best execution plan involving a NOT EQUAL(<>) condition, even if an index might in theory provide the most efficient access path. Oracle just assumes that the vast majority of rows are likely to be returned and so doesn’t even bother to cost and consider any potential indexes. The previous discussion was aimed specifically at B-Tree indexes, but as a comment at the time by Christian Antognini highlighted, things are a little different for Bitmap indexes. Thought it might be worth exploring this point a little further.

To start and to recap, I’ll begin by creating a simple little table, populated with 1,000,000 rows. It has 2 columns of interest for now, one called CODE which has just 5 distinct values and another called FLAG that only has the 1 distinct value (a value of ’42’ naturally):

SQL> create table radiohead (code number not null, type number not null, flag number not null, name varchar2(30));

 Table created.

 SQL> insert into radiohead select mod(rownum,5)+1, mod(rownum,20)+1, 42, 'ZIGGY' from dual connect by level <= 1000000;

 1000000 rows created.

 SQL> commit;

 Commit complete. 

 I’ll begin by creating standard B-Tree indexes on these columns:

SQL> create index radiohead_code_i on radiohead(code);

 Index created.

 SQL> create index radiohead_type_i on radiohead(type);

 Index created.

 SQL> create index radiohead_flag_i on radiohead(flag);

 Index created.

 SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'RADIOHEAD', cascade=> true, estimate_percent=>null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

 PL/SQL procedure successfully completed. 

If we run a query that returns all rows that don’t have a FLAG value of 42 (of which there are none):

SQL> select * from radiohead where flag <> 42;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 2516349655

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    15 |   762   (2)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| RADIOHEAD |     1 |    15 |   762   (2)| 00:00:10 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("FLAG"<>42)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       2770  consistent gets
          0  physical reads
          0  redo size
        435  bytes sent via SQL*Net to client
        384  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

 
Note that although the CBO has estimated it will likely only return just the 1 row, it has opted to go for a Full Table Scan. A 10053 trace would show that the index on the FLAG column wasn’t even considered by the CBO. The use of the Not Equal (<>) condition has totally negated the use of the available index.

If we look at a query now on the CODE column:

SQL> select * from radiohead where code = 1;

200000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2516349655

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |   200K|  2929K|   761   (2)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| RADIOHEAD |   200K|  2929K|   761   (2)| 00:00:10 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CODE"=1)

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2809  consistent gets
          0  physical reads
          0  redo size
    1602034  bytes sent via SQL*Net to client
        824  bytes received via SQL*Net from client
         41  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     200000  rows processed

 

As there are only 5 evenly distributed values, the CBO has got the cardinality estimate spot on and has decided that visiting the table 200,000 times via the index is just too expensive and that the Full Table Scan is the more efficient method. Fair enough.

If we now run a query that looks for all values of a specific CODE but only if the FLAG is not 42 (which again is going to return 0 rows):


SQL> alter session set events '10053 trace name context forever';

Session altered.

SQL> select * from radiohead where code = 1 and flag <> 42;

no rows selected

 
Execution Plan
----------------------------------------------------------
Plan hash value: 2516349655

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    15 |   762   (2)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| RADIOHEAD |     1 |    15 |   762   (2)| 00:00:10 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("FLAG"<>42 AND "CODE"=1)

 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2770  consistent gets
          0  physical reads
          0  redo size
        435  bytes sent via SQL*Net to client
        384  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

 
Again, the Full Table Scan is the way to go says the CBO. The index on the FLAG column is not considered and the index on the CODE column is just too expensive. A 10053 trace confirms this:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: RADIOHEAD  Alias: RADIOHEAD
    #Rows: 1000000  #Blks:  2764  AvgRowLen:  15.00
Index Stats::
  Index: RADIOHEAD_CODE_I  Col#: 1
    LVLS: 2  #LB: 1950  #DK: 5  LB/K: 390.00  DB/K: 2755.00  CLUF: 13775.00
  Index: RADIOHEAD_FLAG_I  Col#: 3
    LVLS: 2  #LB: 1950  #DK: 1  LB/K: 1950.00  DB/K: 2755.00  CLUF: 2755.00
  Index: RADIOHEAD_TYPE_I  Col#: 2
    LVLS: 2  #LB: 1950  #DK: 20  LB/K: 97.00  DB/K: 2755.00  CLUF: 55100.00
Access path analysis for RADIOHEAD
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for RADIOHEAD[RADIOHEAD]

  Table: RADIOHEAD  Alias: RADIOHEAD
    Card: Original: 1000000.000000  Rounded: 1  Computed: 0.20  Non Adjusted: 0.20
  Access Path: TableScan
    Cost:  762.05  Resp: 762.05  Degree: 0
      Cost_io: 750.00  Cost_cpu: 259683730
      Resp_io: 750.00  Resp_cpu: 259683730
  Access Path: index (AllEqRange)
    Index: RADIOHEAD_CODE_I
    resc_io: 3147.00  resc_cpu: 114411172
    ix_sel: 0.200000  ix_sel_with_filters: 0.200000 
    Cost: 3152.31  Resp: 3152.31  Degree: 1
  Best:: AccessPath: TableScan
         Cost: 762.05  Degree: 1  Resp: 762.05  Card: 0.20  Bytes: 0

***************************************

Note that the index on the FLAG column is not even mentioned within the possible execution plans and the index on the CODE column has a cost of 3152.31 which is way more than the Full Table Scan cost of 762. So the Full Table Scan is selected, even though no rows are returned and the CBO estimates that just 1 row is likely to be returned. OK, let’s now drop the B-Tree indexes and replace them with Bitmap indexes:


SQL> drop index radiohead_code_i;

Index dropped.

SQL> drop index radiohead_type_i;

Index dropped.

SQL> drop index radiohead_flag_i;

Index dropped.

SQL> create bitmap index radiohead_code_i on radiohead(code);

Index created.

SQL> create bitmap index radiohead_type_i on radiohead(type);

Index created.

SQL> create bitmap index radiohead_flag_i on radiohead(flag);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'RADIOHEAD', cascade=> true, estimate_percent=>null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

 
If we now run the same query again on the FLAG column:

SQL> select * from radiohead where flag <> 42;

 no rows selected

 
Execution Plan
----------------------------------------------------------
Plan hash value: 2516349655

 ------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     1 |    15 |   762   (2)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| RADIOHEAD |     1 |    15 |   762   (2)| 00:00:10 |
-------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
---------------------------------------------------

    1 - filter("FLAG"<>42)

 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2770  consistent gets
          0  physical reads
          0  redo size
        435  bytes sent via SQL*Net to client
        384  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

 

We notice that again the index is not chosen, even though the bitmap index stores references to those rowids where this condition is not true (a bitmap value of 0) and even though the CBO estimates only the 1 row is likely to be returned. To see why this is the case, let’s look at a partial bitmap index entry via a block dump of the bitmap index:

 
Block header dump:  0x01c01d1c
 Object id on Block? Y
 seg/obj: 0x13e38  csc: 0x00.1234e2a  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1c01d18 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C—    0  scn 0x0000.01234e2a
Leaf block dump
===============
header address 214311524=0xcc62264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 4
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0x28
kdxcofeo 960=0x3c0
kdxcoavs 920
kdxlespl 0
kdxlende 0
kdxlenxt 29367581=0x1c01d1d
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[4499] flag: ——, lock: 0, len=3537
col 0; len 2; (2):  c1 2b
col 1; len 6; (6):  01 40 2c 82 00 00
col 2; len 6; (6):  01 40 2c c4 00 7f
col 3; len 3516; (3516):
 cf ff ff ff ff ff ff ff ff cf ff ff ff ff ff ff ff ff cf ff ff ff ff ff ff
 ff ff cf ff ff ff ff ff ff ff ff cf ff ff ff ff ff ff ff ff cd ff ff ff ff
 ff 07 ff 29 ff ff ff ff ff ff ff ff cf ff ff ff ff ff ff ff ff cf ff ff ff
 ff ff ff ff ff cf ff ff ff ff ff ff ff ff cf ff ff ff ff ff ff ff ff cd ff
 ff ff ff ff 07 ff 29 ff ff ff ff ff ff ff ff cf ff ff ff ff ff ff ff ff cf
 …

We notice the bitmap index entry consists of the indexed value (c1 2b), a start rowid (01 40 2c 82 00 00), an end rowid (01 40 2c c4 00 7f) and a bitmap string for which a bit corresponds to every row within the rowid range, set to either 1 (for true) or 0 (for false). The 0s are compressed and represented by a value based on the actual number of compressed bits.

However, if the bitmap entry only has a start and end rowid range, how does it actually know the location of all the corresponding rows, as there could be differing number of rows for any of the given data blocks. How does it know just how many rows actually exist within the rowid range ?

The answer is that it can’t possibly know. Therefore, Oracle makes a very important assumption and based on the definition of the table, determines the maximum number of rows that could potentially fit within a block and assigns a bit for every possible rowid that could in theory exist within the specified rowid range (I’ll expand on this point in my next post).

If the rowid actually corresponds to an existing row, then the bit is set accordingly depending on the value of the indexed column for that row. If the rowid doesn’t exist (or doesn’t exist yet), then the corresponding bit is simply set to a 0. If there are a whole bunch of consecutive 0s for rows that don’t exist, they get compressed and the overheads are minimised.

However, the value of a bit set to 0 can therefore potentially mean one of two things. It could mean that the row exists but doesn’t have the value represented by the index entry or it could mean that the row simply doesn’t exist at all. There is no way for Oracle to tell the difference between these two scenarios.

If one is after rows for which the column has a specific value, then no problem, all the bits with a value of 1 must correspond to rows that really do exist and have the column value of interest. However, if one is after all rows for which the column value is not the one represented by a bitmap index entry (as in a <> condition), then referencing all the bits that have a 0 won’t be sufficient as they could potentially point at rows that don’t actually exist and accessing a table looking up rows that don’t exist will open up a can of worms.

Therefore, just like a B-Tree index, the CBO will not consider a Bitmap index for a query that exclusively contains a not equal or not in condition.

If we now look at the second query based on the CODE column:

SQL> select * from radiohead where code = 1;

200000 rows selected.

 
Execution Plan
----------------------------------------------------------
Plan hash value: 2516349655

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |   200K|  2929K|   761   (2)| 00:00:10 |
|*  1 |  TABLE ACCESS FULL| RADIOHEAD |   200K|  2929K|   761   (2)| 00:00:10 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CODE"=1)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       2809  consistent gets
          0  physical reads
          0  redo size
    1602034  bytes sent via SQL*Net to client
        824  bytes received via SQL*Net from client
         41  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     200000  rows processed

 

We notice that the CBO again chooses the Full Table Scan as again, the query is returning 20% of all rows and deems it too expensive to visit the table 200,000 times to retrieve the data via the index, even if the Bitmap index structure itself is relatively small and efficient. So again, no difference to the B-Tree index example.

However, if we run the third query based on both the CODE column and the <> condition on the FLAG column:

SQL> select * from radiohead where code = 1 and flag <> 42;

no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 1712231689

--------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Rows  | Bytes |  Cost(%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |     1 |    15 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID  | RADIOHEAD        |     1 |    15 |     3   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS |                  |       |       |            |          |
|   3 |    BITMAP MINUS               |                  |       |       |            |          |
|   4 |     BITMAP MINUS              |                  |       |       |            |          |
|*  5 |      BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |            |          |
|*  6 |      BITMAP INDEX SINGLE VALUE| RADIOHEAD_FLAG_I |       |       |            |          |
|*  7 |     BITMAP INDEX SINGLE VALUE | RADIOHEAD_FLAG_I |       |       |            |          |
--------------------------------------------------------------------------------------------------

 
Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("CODE"=1)
   6 - access("FLAG" IS NULL)
   7 - access("FLAG"=42)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         47  consistent gets
          0  physical reads
          0  redo size
        435  bytes sent via SQL*Net to client
        384  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

Now we see a distinct difference from the B-Tree index example as both Bitmap Indexes have been used rather than a Full Table Scan.

In conjunction with another index that returns rowids of interest that obviously exist, a bitmap index can be used successfully to determine a Not Equal condition. By logically subtracting all the matching rowids of one bitmap index (that contains rowids than aren’t of interest) from the other bitmap index (which contains rowids that are of interest), a list of actual rowids of interest can be determined to access the table. Note this can also potentially be performed by looking up the 0 bits, as corresponding rows do not have the indexed value and any matching rowids can be proven to exist by their appearance within the other Bitmap index.

As most of this processing only involves simple bit comparisons via accesses to relatively small, efficient Bitmap index structures, the relative overheads can be significantly reduced from that of the Full Table Scan (eg. in this example, consistent gets reduced from 2770 to just 47).

So a Not Equal/Not In can be serviced via a Bitmap Index, providing another index is also accessed that returns rowids of interest.

DEL_LF_ROWS Index Rebuild Criteria ? (Codex) May 22, 2011

Posted by Richard Foote in DEL_LF_ROWS, Indexing Myth, Oracle Indexes.
19 comments

In the same article and book where the erroneous claims regarding the BLKS_GETS_PER_ACCESS were made, we also find the following recommendation:
 
Another rebuild condition would be cases where deleted leaf nodes comprise more than 20% of the index nodes“.
 
This is a very common claim however no matter how often it might be stated, it’s fundamentally flawed. Note My Oracle Support previously made these claims as well, such as with Metalink note 122008.1, although this has now been totally revised with the recommendation withdrawn.
 
It’s a flawed index rebuild criteria for two very good reasons.

1) It assumes this deleted space is somehow “wasted” or “deadwood” and needs to be removed. However, in the majority of scenarios, it’s nothing more than free space which will be subsequently reused by later inserts into the index. Therefore, basing a rebuild criteria on the percentage of deleted space will likely include indexes that will not benefit from a rebuild.

2) It assumes the DEL_LF_ROWS index statistic somehow accounts for all the “deleted” space within an index. However this statistic can significantly under-estimate the potential “wasted” space within an index. Therefore basing a rebuild criteria on the percentage of  “currently marked” deleted space can possibly miss indexes that might actually benefit from a rebuild (or coalesce).
 
Any criteria which selects indexes that don’t need to be rebuilt but misses out on those that do, is as I say, fundamentally flawed. Basing a rebuild criteria on the percentage of currently marked deleted index entries suggests a lack of understanding of both how Oracle indexes work and the actual meaning behind DEL_LF_BLKS.
 
This of course has been discussed many times before, but a quick demo to illustrate. First, I’m going to create a table with 10 rows with a unique ID column.

SQL> create table bowie (id number, code number, name varchar2(30));
 
Table created.
 
SQL> insert into bowie select rownum, rownum, 'ZIGGY STARDUST' from dual connect by level <=10;
 
10 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> create index bowie_id_i on bowie(id);
 
Index created.

 

If we now delete four of these rows:


SQL> delete bowie where id in (2,4,6,8);
 
4 rows deleted.
 
SQL> commit;
 
Commit complete.
 
SQL> analyze index bowie_id_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_delete from index_stats;
 
LF_ROWS DEL_LF_ROWS PCT_DELETE
------- ----------- ----------
     10           4         40

 
Indeed, we currently have 4 index entries marked as deleted. If we look at a dump of this index block:
 

Block header dump:  0x01c0521b
 Object id on Block? Y
 seg/obj: 0x13039  csc: 0x00.e30e53  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1c05218 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0x0005.017.00004219  0x00c002b4.0a36.07  –U-    4  fsc 0x0038.00e30e79
Leaf block dump
===============
header address 220340836=0xd222264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 10
kdxcofbo 56=0x38
kdxcofeo 7916=0x1eec
kdxcoavs 7860
kdxlespl 0
kdxlende 4
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  01 c0 52 13 00 00
row#1[8012] flag: —D–, lock: 2, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 c0 52 13 00 01
row#2[8000] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 04
col 1; len 6; (6):  01 c0 52 13 00 02
row#3[7988] flag: —D–, lock: 2, len=12
col 0; len 2; (2):  c1 05
col 1; len 6; (6):  01 c0 52 13 00 03
row#4[7976] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 06
col 1; len 6; (6):  01 c0 52 13 00 04
row#5[7964] flag: —D–, lock: 2, len=12
col 0; len 2; (2):  c1 07
col 1; len 6; (6):  01 c0 52 13 00 05
row#6[7952] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 08
col 1; len 6; (6):  01 c0 52 13 00 06
row#7[7940] flag: —D–, lock: 2, len=12
col 0; len 2; (2):  c1 09
col 1; len 6; (6):  01 c0 52 13 00 07
row#8[7928] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 0a
col 1; len 6; (6):  01 c0 52 13 00 08
row#9[7916] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 0b
col 1; len 6; (6):  01 c0 52 13 00 09
—– end of leaf block dump —–
 

We note there are currently 4 rows with the delete flag set (eg. row#1[8012] flag: —D–). In order to make the transaction more efficient, Oracle doesn’t physically remove the deleted index entries but marks them as logically deleted. However, any subsequent DML within the index block will result in all the deleted entries being physically removed from the leaf block.

For example, here we insert a new index entry with a value of 42. Note this value didn’t exist previously and is outside all the current values within the index leaf block:

  
SQL> insert into bowie values (42, 1, 'MAJOR TOM');
 
1 row created.
 
SQL> commit;
 
Commit complete.
 
SQL> analyze index bowie_id_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_delete from index_stats;
 
   LF_ROWS DEL_LF_ROWS PCT_DELETE
---------- ----------- ----------
         7           0          0

 

Note that this has resulted in the removal of the previously logically deleted index entries. There are currently no deleted index entries within the leaf block. A block dump now clearly shows the deleted index entries are now all gone:

 
 
Block header dump:  0x01c0521b
 Object id on Block? Y
 seg/obj: 0x13039  csc: 0x00.e31029  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1c05218 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0x0005.012.00004213  0x00c002b4.0a36.31  –U-    1  fsc 0x0000.00e3102b
Leaf block dump
===============
header address 220340836=0xd222264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 7
kdxcofbo 50=0x32
kdxcofeo 7904=0x1ee0
kdxcoavs 7902
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  01 c0 52 13 00 00
row#1[8000] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 04
col 1; len 6; (6):  01 c0 52 13 00 02
row#2[7976] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 06
col 1; len 6; (6):  01 c0 52 13 00 04
row#3[7952] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 08
col 1; len 6; (6):  01 c0 52 13 00 06
row#4[7928] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 0a
col 1; len 6; (6):  01 c0 52 13 00 08
row#5[7916] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 0b
col 1; len 6; (6):  01 c0 52 13 00 09
row#6[7904] flag: ——, lock: 2, len=12
col 0; len 2; (2):  c1 2b
col 1; len 6; (6):  01 c0 52 15 00 00
—– end of leaf block dump —–
 

So deleted space in most scenarios is nothing more than free space that can be cleaned out and reused by subsequent DML operations.
 
Lets look at another example, this time on a larger table with 1M rows with two indexes. The first index is effectively unique while the second index has 100 evenly distributed values (and so 10000 occurrences of each value):
 

SQL> create table radiohead (id number, code number, name varchar2(30));
 
Table created.
 
SQL> insert into radiohead select rownum, mod(rownum, 100), 'OK COMPUTER' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> create index radiohead_code_i on radiohead(code);
 
Index created.
 
SQL> create index radiohead_id_i on radiohead(id);
 
Index created.

 

If we now delete 40% of the table, those with an ID up to 400000, this has 2 different effects on our indexes. On the ID index, it will result in the first 40% of leaf blocks containing nothing but deleted index entries. On the CODE index, it will result in all the leaf blocks containing approximately 40% of deleted index entries. Note this delete is being performed within a single very large logical transaction, not via lots of small discrete transactions, so that none the deleted entries are cleaned out. Note this also requires a sufficiently large buffer cache to cache all this data to prevent read operations (including the validate structure command itself) from subsequently cleaning out the deleted index entries as well via subsequent block cleanout operations.
 

SQL> delete radiohead where id between 1 and 400000;
 
400000 rows deleted.
 
SQL> commit;
 
Commit complete.
 
SQL> analyze index radiohead_code_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_deleted from index_stats;
 
 LF_ROWS DEL_LF_ROWS PCT_DELETED
-------- ----------- -----------
 1000000      400000          40
 
SQL> analyze index radiohead_id_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_deleted from index_stats;
 
 LF_ROWS DEL_LF_ROWS PCT_DELETED
-------- ----------- -----------
 1000000      400000          40

 

So both have a “large” percentage of deleted index entries, although the distribution of such deleted entries differs between the two indexes. A rebuild criteria simply based on the percentage of deleted entries would suggest both indexes need a rebuild. However, if we were to re-insert a similar amount of data again (note, even with a new range of values for the ID column):
 

SQL> insert into radiohead select rownum+1000000, mod(rownum,100), 'THE KING OF LIMBS' from dual connect by level <= 400000;
 
400000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> analyze index radiohead_code_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_deleted from index_stats;
 
 LF_ROWS DEL_LF_ROWS PCT_DELETED
-------- ----------- -----------
 1000002           2       .0002

SQL> analyze index radiohead_id_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_deleted from index_stats;
 
 LF_ROWS DEL_LF_ROWS PCT_DELETED
-------- ----------- -----------
 1007633        7633  .757517866
 

We notice that the vast majority of the deleted entries have automatically been cleaned out, without an index rebuild in sight. In the case of the ID index with 40% of index leaf blocks containing nothing but deleted index entries, even though a different range of values were inserted, the effectively “empty” leaf blocks were recycled within the index, removing all the associated deleted index entries in the process.
 
In the case of the CODE index with all the leaf blocks containing approximately 40% of deleted index entries, subsequent inserts into these leaf blocks cleaned out all the deleted index entries and reused the subsequently freed space.
  
Once you begin to understand how Oracle reuses deleted index space and how the free space related to previously deleted entries may not actually be reflected in the DEL_LF_ROWS statistic anyways, one begins to understand why basing an index rebuild criteria on the so-called ratio of deleted index entries is so flawed.

In fact, if you want to avoid some nonsensical  index rebuild criteria based on DEL_LF_ROWS, all you need to do is simply delete yet more rows, as so wonderfully here explained by Jonathan Lewis.

To illustrate, we create a table and index and populate it with 100000 rows:

SQL> create table pink_floyd (id number, code number, name varchar2(30));
 
Table created.
 
SQL> insert into pink_floyd select rownum, mod(rownum, 100), 'WISH YOU WERE HERE' from dual connect by level <= 100000;
 
100000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> create index pink_floyd_id_i on pink_floyd(id);
 
Index created.
 
SQL> analyze index pink_floyd_id_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_deleted from index_stats;
 
 LF_ROWS DEL_LF_ROWS PCT_DELETED
-------- ----------- -----------
  100000           0           0

 

So far, so good. Let’s now deleted 40% of the rows throughout the table:


SQL> delete pink_floyd where mod(id,100) between 20 and 29 or mod(id, 100) between 40 and 49 or mod(id, 100) between 60 and 69 or mod(id, 100) between 80 and 89;
 
40000 rows deleted.
 
SQL> commit;
 
Commit complete.
 
SQL> analyze index pink_floyd_id_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_deleted from index_stats;
 
 LF_ROWS DEL_LF_ROWS PCT_DELETED
-------- ----------- -----------
  100000       40000          40
 

Oh dear, we have an index with 40% of the rows deleted. However, rather than rebuild the index, let’s simply delete a few more rows …


SQL> delete pink_floyd where mod(id, 100) = 1;
 
1000 rows deleted.
 
SQL> commit;
 
Commit complete.
 
SQL> analyze index pink_floyd_id_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, del_lf_rows, del_lf_rows/lf_rows*100 pct_deleted from index_stats;
 
 LF_ROWS DEL_LF_ROWS PCT_DELETED
-------- ----------- -----------
   60000        1000  1.66666667

What do you know, now the ratio of deleted rows is just 1.67%, no need to rebuild it now !!

If you have a rebuild criteria which selects indexes that don’t actually need to be rebuilt and it misses out on those that potentially could benefit from a rebuild, while at the same time makes indexes look less in need of a rebuild the more you actually potentially fragment the index with further deletions, well as I said at the start, such a rebuild criteria is fundamentally flawed …

BLKS_GETS_PER_ACCESS Index Rebuild Criteria ? (Twisted Logic) April 20, 2011

Posted by Richard Foote in Indexing Myth, Oracle Indexes, Validate Structure.
17 comments

A recent question on the database OTN forum and a previous request by Charles Hooper that I cover some basic indexing concepts for newbie’s who might be confused by “dubious” information out there in internet land has prompted me to discuss the BLKS_GETS_PER_ACCESS metric, available in INDEX_STATS after an analyze validate structure operation.
 
The OTN thread had someone questioning why after freshly rebuilding an index, the index still had a particularly “high” BLKS_GETS_PER_ACCESS value of 110 and wouldn’t reduce down below a value of 5. They requested if someone could please explain why the BLKS_GETS_PER_ACCESS was not getting reduced, as they wanted it to be below 5.
 
The two obvious questions I had in return were why of earth would anyone want the BLKS_GETS_PER_ACCESS to be below 5 and why on earth would they think rebuilding the index would reduce the BLKS_GETS_PER_ACCESS from a value of 110 down to 5.
 
However a quick Google search confirmed my suspicions, subsequently confirmed by the OP. This “Identifying Which Indexes to Rebuild” article by Don Burleson states:
 
Gets per index access
 
The number of “gets” per access refers to the amount of logical I/O that is required to fetch a row with the index. As you may know, a logical “get” is not necessarily a physical I/O since much of the index may reside in the Oracle buffer cache. However, any SAP index with a number greater than 10 would probably benefit from an index rebuild.”
 
and
 
We might want to rebuild an index if the “block gets” per access is greater than five, since excessive “blocks gets” indicate a fragmented b-tree structure.”
 
The same claims are made on page 727 in “Oracle Tuning: The Definitive Reference” .
 
The problem with all this of course is that it’s a complete nonsense. The very idea of having a rebuild criteria based on BLKS_GETS_PER_ACCESS being greater than value “X” and that a subsequent index rebuild will reduce BLKS_GETS_PER_ACCESS down to less than value “X” shows a complete lack of understanding of what BLKS_GETS_PER_ACCESS actually represents.
 
BLKS_GETS_PER_ACCESS is basically the number of blocks required to get a randomly chosen row of interest via an index.
 
The Oracle Reference Manual describes BLKS_GETS_PER_ACCESS somewhat ambiguously as:

Expected number of consistent mode block reads per row, assuming that a randomly chosen row is accessed using the index. Used to calculate the number of consistent reads that will occur during an index scan.”
 
The key point here is that it’s a “randomly” chosen row when accessed via the specific index.
 
Now in the case of a Unique index, the number of blocks needed to access a random row can easily be determined as simply being the height of the index plus one additional block to access the associated table block. If the index is Unique, there can only be one row per index value which requires precisely one visit to the table.
 
In the following example, we create a table with 1M rows with two indexes. The index on the ID column is effectively unique as there are 1M distinct values while the index on the CODE column is Non-Unique with only 100 distinct values and so with 10000 occurrences of each indexed value.

  
SQL> create table bowie (id number, code number, name varchar2(30));
 
Table created.
 
SQL> insert into bowie select rownum, mod(rownum,100), 'ZIGGY STARDUST' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> create index bowie_id_i on bowie(id);
 
Index created.
 
SQL> create index bowie_code_i on bowie(code);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', cascade=>true, estimate_percent=>null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> select index_name, num_rows, distinct_keys from dba_indexes where table_name='BOWIE';
 
INDEX_NAME     NUM_ROWS DISTINCT_KEYS
------------ ---------- -------------
BOWIE_CODE_I    1000000           100
BOWIE_ID_I      1000000       1000000

 
 

If we now collect INDEX_STATS on the effectively Unique index:

  
SQL> analyze index bowie_id_i validate structure;
 
Index analyzed.
 
SQL> select height, lf_rows, distinct_keys, rows_per_key, blks_gets_per_access from index_stats;
 
HEIGHT    LF_ROWS DISTINCT_KEYS ROWS_PER_KEY BLKS_GETS_PER_ACCESS
------ ---------- ------------- ------------ --------------------
     3    1000000       1000000            1                    4
 

 
 

We can see that the BLKS_GETS_PER_ACCESS is 4 as expected. 3 due to the index height and additional 1 to access the specific row in the associated table block.
 
The formula Oracle uses here is basically:

HEIGHT + (LF_ROWS/DISTINCT_KEYS + 1)/2 = HEIGHT + (ROWS_PER_KEY + 1)/2 = 3 + (1 + 1)/2 = 3 + 2/2 = 3 + 1 = 4.
 
So basically to calculate the BLKS_GETS_PER_ACCESS, we simply take the ROWS_PER_KEY, add 1 to it, then divide the total by 2 and finally add the index height to get the final value. The reason for this exact formula makes more sense when we look at a Non-Unique index.
 
How do we cost the access to a specific row via a non-unique index when there could be multiple occurrences of the specific index value ?
If there are say 100 occurrences of an indexed value, if we want the first of these within the index, then we need to access the same blocks as per the unique index (index height plus 1). However, if we want the last of these 100 occurrences referenced within the index, then we might need to access index height + the 100 blocks until we reach the last occurrence of the indexed value. If we’re simply interested in an “average” row, then on average we might need to access 1/2 the ROWS_PER_KEY in addition to the index height.

So the formula is now basically HEIGHT + ROWS_PER_KEY/2. But as this is only an average guesstimate to begin with and so as we don’t ruin the perfect value we can derive for Unique Indexes, the formula is adjusted a tad by adding 1 to the ROWS_PER_KEY/2 figure so that the result makes sense and is accurate for Unique indexes as well.
 
Hence the final formula of HEIGHT + (ROWS_PER_KEY + 1)/2.
 
If we now look at the index on the CODE column, which has only 100 distinct values (and so 10000 occurrences per distinct value):

 

SQL> analyze index bowie_code_i validate structure;
 
Index analyzed.
 
SQL> select height, lf_rows, distinct_keys, rows_per_key, blks_gets_per_access from index_stats;
 
HEIGHT    LF_ROWS DISTINCT_KEYS ROWS_PER_KEY BLKS_GETS_PER_ACCESS
------ ---------- ------------- ------------ --------------------
     3    1000000           100        10000               5003.5

 
 

HEIGHT + (ROWS_PER_KEY + 1)/2 = 3 + (10000 + 1)/2 = 3 + 5000.5 = 5003.5.
 
We can see how this 5003.5 figure is actually derived.
 
The actual blocks needed to be accessed when using the index is of course also very highly dependant on the Clustering Factor (CF) of the index but the CF is not calculated as part of Index_stats. The BLKS_GETS_PER_KEY can therefore be viewed as being a guesstimate of the number of blocks required to read a specific indexed value, based on an “average” CF. As the maximum CF is LF_ROWS, an “average” CF can therefore be viewed as simply being 1/2 of LF_ROWS.
 
In the above example, an “average” CF would therefore be 1000000/2 = 500000.
 
To access all table blocks for a specific indexed value would therefore basically be:

500000/distinct keys = 500000/100 = 5000.

If we then add the index height = 5000 + 3 = 5003.
 
5003 is almost identical to 5003.5, the 0.5 difference simply attributed to the additional 1 that’s added in the previous formula.
 
So BLKS_GETS_PER_ACCESS can effectively be viewed as being the either the number of blocks required to access a specific row “on average” within the table or the number of blocks required to read all occurrences of a specific index value IF the index had an average CF.
 
Note that both definitions are nothing but wild guesstimates of the blocks that might actually need to be accessed as both are making very broad assumptions in that the CF is indeed “average”, that the accessed row of interest is indeed “somewhere near the middle” of the index range scan, that the data is indeed evenly distributed, etc. etc. The actual blocks that might need to be accessed when using the index could therefore be significantly different if these basic assumptions are not correct.
 
Now here come a few interesting points.
 
Firstly, note the formula used only takes into consideration the index height and the average expected accesses to the table. The possible accesses to additional index leaf blocks is not considered at all.
 
Why ?
 
Likely because the vast majority of accesses involving an index range scan actually involves the block reads associated with accessing the table, not reading the index itself. As the figure is only a very rough estimate to begin with, it’s somewhat pointless adding a little here or there for the trivial additional leaf blocks that might need to be accessed during a scan . So in order to keep things simple, only the index height is considered in the actual BLKS_GETS_PER_ACCESS costings.
 
Therefore, an index rebuild is likewise going to have a trivial impact on the derived BLKS_GETS_PER_ACCESS as only the index height is considered in its calculation. In the vast majority of cases, rebuilding an index will have absolutely no impact at all as most index heights are not impacted by an index rebuild. In extremely rare occasions, an index might reduce its height but then the final BLKS_GETS_PER_INDEX is only going to reduced by 1. The absolute maximum amount that an index rebuild can impact the BLKS_GETS_PER_ACCESS is just HEIGHT-1.
 
In the above example, the BLKS_GETS_PER_ACCESS is 5003.5, substantially greater than 5 or 10 or even 42. However, rebuilding the index:

  
SQL> alter index bowie_code_i rebuild;
 
Index altered.
 
SQL> analyze index bowie_code_i validate structure;
 
Index analyzed.
 
SQL> select height, lf_rows, distinct_keys, rows_per_key, blks_gets_per_access from index_stats;
 
HEIGHT    LF_ROWS DISTINCT_KEYS ROWS_PER_KEY BLKS_GETS_PER_ACCESS
------ ---------- ------------- ------------ --------------------
     3    1000000           100        10000               5003.5

 

  
makes precisely no difference at all.
 
So what do we do, just keep rebuilding this same index week after week after week as it continually meets the ill-considered index rebuild guidelines …
 
Note also that the actual value of BLKS_GETS_PER_ACCESS could be anything, as it’s primarily based on the number of rows per keys. For a very very large table on an index with relatively few distinct values, this figure could likewise be “very large”, even with the most perfectly compact, defragmented index.
 
Therefore, having an index rebuild criteria based if some nominal value of BLKS_GETS_PER_ACCESS “is excessive” (be it 5 or 10 or 42 or 5000 or whatever) is simply nonsensical as this value has practically nothing to do with the efficiency of an index but is directly proportional to the average number of rows per index key.

Additionally, suggesting an index rebuild will have a significant impact on BLKS_GETS_PER_ACCESS is likewise nonsensical as the only cost directly attributed to the index itself is the index height, which very rarely changes following an index rebuild. In fact, the BLKS_GETS_PER_ACCESS formula specifically negates the impact of any potential index rebuilds by implicitly excluding index block leafs in its calculations.
 
In short, basing an index rebuild criteria on the value of BLKS_GETS_PER_ACCESS is totally ludicrous. It’s actually similar to the silly myth that one should rebuild an index if the clustering factor is “too large”, where in actual fact there is no such threshold value and an index rebuild doesn’t impact the subsequent CF anyways.
 
I hate to think how many indexes have been repeatedly rebuilt unnecessarily based on a rebuild criteria where BLKS_GETS_PER_ACCESS is greater than value “X”, when such an index rebuild has made absolutely no difference to the original rebuild criteria :(

Follow

Get every new post delivered to your Inbox.

Join 1,862 other followers