jump to navigation

12c Index Like Table Statistics Collection (Wearing The Inside Out) July 9, 2014

Posted by Richard Foote in 12c, Automatic Table Statistics, Index statistics, Oracle Indexes.
2 comments

This change introduced in 12c has caught me out on a number of occasions.

If you were to create a new table:

SQL> create table thin_white_duke1 (id number, code number, name varchar2(30));

Table created.

And then populate it with a conventional insert:

SQL> insert into thin_white_duke1 select rownum, mod(rownum,10), 'DAVID BOWIE'  from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

We find there are no statistics associated with the table until we explicitly collect them:

SQL> select table_name, num_rows, blocks, avg_space, avg_row_len, last_analyzed from dba_tables where table_name='THIN_WHITE_DUKE1';

TABLE_NAME         NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN LAST_ANAL
---------------- ---------- ---------- ---------- ----------- ---------
THIN_WHITE_DUKE1

But if we were to now create an index on this table:

SQL> create index thin_white_duke1_code_i on thin_white_duke1(code);

Index created.

We find that we now do indeed have index statistics collected by default (since 9i days anyways):

SQL> select index_name, num_rows, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name='THIN_WHITE_DUKE1_CODE_I';

INDEX_NAME                NUM_ROWS     BLEVEL LEAF_BLOCKS DISTINCT_KEYS
----------------------- ---------- ---------- ----------- -------------
THIN_WHITE_DUKE1_CODE_I    1000000          2        1936            10

OK, if we now create another table and index:

SQL> create table thin_white_duke2 (id number, code number, name varchar2(30));

Table created.

SQL> create index thin_white_duke2_code_i on thin_white_duke2(code);

Index created.

And populate it with a bulk load parallel, append insert:

SQL> insert /*+ append */ into thin_white_duke2 select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <=1000000;

1000000 rows created.

 SQL> commit;

Commit complete.

If we now look at the table statistics:

SQL> select table_name, num_rows, blocks, avg_space, avg_row_len, last_analyzed
from dba_tables where table_name='THIN_WHITE_DUKE2';

TABLE_NAME         NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN LAST_ANAL
---------------- ---------- ---------- ---------- ----------- ---------
THIN_WHITE_DUKE2    1000000       3511          0          20 09-JUL-14

SQL> select column_name, num_distinct, density, num_buckets from dba_tab_columns
 where table_name='THIN_WHITE_DUKE2';

COLUMN_NAME NUM_DISTINCT    DENSITY NUM_BUCKETS
----------- ------------ ---------- -----------
NAME                   1          1           1
CODE                  10         .1           1
ID               1000000    .000001           1

SQL> select column_name, num_distinct, density, histogram, notes
from dba_tab_col_statistics where table_name='THIN_WHITE_DUKE2';

COLUMN_NAME NUM_DISTINCT    DENSITY HISTOGRAM  NOTES
----------- ------------ ---------- ---------- -------------
NAME                   1          1 NONE       STATS_ON_LOAD
CODE                  10         .1 NONE       STATS_ON_LOAD
ID               1000000    .000001 NONE       STATS_ON_LOAD

 

We notice that in 12c, they’re automatically populated and accurate. The NOTES column in dba_tab_col_statistics highlights that the statistics were collected via STATS_ON_LOAD.

If however we look at the current state of the index statistics:

SQL> select index_name, num_rows, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name='THIN_WHITE_DUKE2_CODE_I';

INDEX_NAME                NUM_ROWS     BLEVEL LEAF_BLOCKS DISTINCT_KEYS
----------------------- ---------- ---------- ----------- -------------
THIN_WHITE_DUKE2_CODE_I          0          0           0             0

We noticed they haven’t changed since the index was initially created. So by populating an empty table with an append insert, table statistics are now collected on the fly, but not the statistics on any existing indexes.

If we want to now collect just the index statistics and potentially any useful histograms on the table (as histograms are not collect during the bulk load operation), we can use the new GATHER_AUTO option with the dbms_stats.gather_table_stats procedure without having to re-gather base table statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=>user,
tabname=>'THIN_WHITE_DUKE2', options=>'GATHER AUTO');

PL/SQL procedure successfully completed.

SQL> select index_name, num_rows, blevel, leaf_blocks, distinct_keys
from dba_indexes where index_name='THIN_WHITE_DUKE2_CODE_I';

INDEX_NAME                NUM_ROWS     BLEVEL LEAF_BLOCKS DISTINCT_KEYS
----------------------- ---------- ---------- ----------- -------------
THIN_WHITE_DUKE2_CODE_I    1000000          2        1739            10

 The same automatic table statistics gathering occurs when we create a table via a sub-select clause:

SQL> create table thin_white_duke3
as select rownum id, mod(rownum,1000) code, 'DAVID BOWIE' name
from dual connect by level <= 1000000;

Table created.

SQL> select table_name, num_rows, blocks, avg_space, avg_row_len, last_analyzed
from dba_tables where table_name='THIN_WHITE_DUKE3';

TABLE_NAME         NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN LAST_ANAL
---------------- ---------- ---------- ---------- ----------- ---------
THIN_WHITE_DUKE3    1000000       3787          0          22 09-JUL-14

Of course, sometimes when creating test tables as I regularly do, one sometimes forgets these statistics are now automatically collected !!

Clustering Factor Calculation Improvement Part III (Too Much Rope) June 4, 2013

Posted by Richard Foote in 11g, CBO, Clustering Factor, Index statistics, TABLE_CACHED_BLOCKS.
9 comments

In my previous post, I discussed how for larger tables that really do have randomised data, not even setting the new TABLE_CACHED_BLOCKS to its 255 maximum value is going to make any real difference to the resultant Clustering Factor (CF) of an index. The 255 maximum value here protects us somewhat from abusing this capability and setting unrealistic CF values.

However, for smaller tables in particular, we do need to exercise some caution.

In the following example, we’re only creating a relatively small table and associated index with a CODE column that is randomly distributed throughout the table:

SQL> create table bowie (id number, code number, text varchar2(30));
Table created.

SQL> insert into bowie select rownum,  trunc(dbms_random.value(0, 100)), 'DAVID BOWIE'
from dual connect by level <= 70000;

70000 rows created.

SQL> commit;

Commit complete.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true,
     method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> create index bowie_code_i on bowie(code);

Index created.

If we look at the CF of this index:

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.index_name='BOWIE_CODE_I';

TABLE_NAME   INDEX_NAME       BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ ------------ ---------- ---------- -----------------
BOWIE        BOWIE_CODE_I        244      70000             22711

We notice that at 22,711, the CF it’s pretty average. The table though is quite small at only 244 blocks.

If we run a simple query:

SQL> select * from bowie where code between 42 and 44;
2050 rows selected.

Execution Plan
----------------------------------------------------------

Plan hash value: 1845943507
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |  2814 | 56280 |    65   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BOWIE |  2814 | 56280 |    65   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter('CODE'<=44 AND 'CODE'>=42)

Statistics
----------------------------------------------------------

0  recursive calls
1  db block gets
254  consistent gets
0  physical reads
0  redo size
25044  bytes sent via SQL*Net to client
519  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
2050  rows processed

The CBO goes for a Full Table Scan. This is not unexpected as we likely have to visit all 244 blocks anyways to fetch the required 2050 rows due to the CODE data being so randomly distributed throughout the table. In a naive attempt to improve things, we decide to improve the CF by setting the TABLE_CACHED_BLOCKS to the maximum 255 value:

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',
pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true,
method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.index_name='BOWIE_CODE_I';

TABLE_NAME   INDEX_NAME         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ -------------- ---------- ---------- -----------------
BOWIE        BOWIE_CODE_I          244      70000               240

Indeed, we have improved the CF, dramatically reducing it down to just 240 from the previous 22711. Even though the column value for the CODE really is randomly distributed throughout the table, the CF now suggests the data is perfectly clustered. This is because with a table with only 244 blocks, incrementing the CF if the current index entry references a table block more than 255 blocks ago is now impossible. The CF is now guaranteed to be “perfect” as each index entry can only reference one of the 244 table blocks and so is incremented only when each table block is referenced the first time.

This dramatic reduction in the CF will certainly make the index more attractive to the CBO. But is this really a good thing:

SQL> select * from bowie where code between 42 and 44;
2050 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1602289932
--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |  2814 | 56280 |    17   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE        |  2814 | 56280 |    17   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_CODE_I |  2814 |       |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access('CODE'>=42 AND 'CODE'<=44)

Statistics
----------------------------------------------------------

0  recursive calls
0  db block gets
681  consistent gets
0  physical reads
0  redo size
20895  bytes sent via SQL*Net to client
519  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
2050  rows processed

The CBO is now using the index, but is now performing significantly more consistent gets, 681 rather than the previous 254. This execution plan is actually much less efficient than the previous FTS execution plan. The CBO is getting this wrong now as the CF isn’t really anywhere near as good as it’s now being lead to believe.

Caution setting TABLE_CACHED_BLOCKS to a value that is anywhere close to the number of blocks in the table. This is one of the reasons for Oracle having a 1% of table blocks default value for this setting.

Clustering Factor Calculation Improvement Part II (Blocks On Blocks) May 14, 2013

Posted by Richard Foote in 11g, Clustering Factor, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.
6 comments

My previous post on the new TABLE_CACHED_BLOCKS statistics gathering preference certainly generated some interest :) My blog hits for the week have gone off the charts !!

One of the concerns raised by this new capability was that setting such a preference might result in really unrealistic and inaccurate Clustering Factor (CF) values, especially for those tables that truly have appalling CFs. Although there are certainly some dangers, Oracle has limited the possible “abuse” by ensuring TABLE_CACHED_BLOCKS can only be set to a maximum of 255. This means Oracle will only ignore a maximum of 255 table blocks that have recently been accessed during the CF calculation. For larger tables with truly randomised data patterns, not even the maximum 255 setting if utilised will make an appreciable difference to the final CF.

A couple of examples to demonstrate.

The first table is a relatively “large” table that has a DOB column that is effectively randomised throughout the table. There are approximately 20,000 different DOB values in a 2 million row table (so each DOB occurs approximately 100 times, give or take).

SQL> create table major_tom (id number, DOB date, text varchar2(30));

Table created.

SQL> insert into major_tom select rownum,  sysdate-trunc(dbms_random.value(0, 20000)), 'DAVID BOWIE' from dual connectby level <= 2000000;

2000000 rows created.

SQL> commit;

Commit complete.

Let’s now create an index on this DOB column and have a look at the CF:

SQL> create index major_tom_dob_i on major_tom(dob);

Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'MAJOR_TOM', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
2  FROM user_tables t, user_indexes i
3  WHERE t.table_name = i.table_name AND i.index_name='MAJOR_TOM_DOB_I';

TABLE_NAME   INDEX_NAME          BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ --------------- ---------- ---------- -----------------
MAJOR_TOM    MAJOR_TOM_DOB_I       9077    2000000           1988164

So at 1,988,164, the CF is terrible. This is as expected as the DOB values are all randomised throughout the table. The index is not being used as we had hope (naively) so let’s use the new TABLE_CACHED_BLOCKS preference to now improve the calculated CF by setting it to the maximum 255 setting and recalculate the index statistics:

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'MAJOR_TOM',
pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'MAJOR_TOM_DOB_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
2  FROM user_tables t, user_indexes i
3  WHERE t.table_name = i.table_name AND i.index_name='MAJOR_TOM_DOB_I';

TABLE_NAME   INDEX_NAME          BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ --------------- ---------- ---------- -----------------
MAJOR_TOM    MAJOR_TOM_DOB_I       9077    2000000           1941946

We notice that although the CF has improved marginally, at whopping 1,941,946 it’s still terrible and has made no real appreciable difference. Why ?

Well let’s do some basic maths here. There are 9077 blocks in the table and the next DOB referenced in the index can potentially be in any one of them. Therefore, the chances of the next DOB being in one of the 255 previously accessed table blocks is only 255/9077 x 100 = approximately 2.8%. So in only 2.8% of the time is the CF likely to not be incremented and so the CF is only likely to drop by around this 2.8% amount.

Let’s check. (1988164 – 1941946)/1988164 x 100  indeed does equal approximately 2.8%.

So statistically with such a poor CF on such a “large” table, to limit the CF calculation if any of the last 255 table blocks are referenced is only going to improve things by 2.8% on average. Effectively of no real use at all.

Another example now, but this time with a CODE column with just 100 distinct values that are randomly distributed throughout another reasonable “large” 2 million row table. For those mathematically challenged, that means each value occurs approximately 20,000 times, give or take:

SQL> create table ziggy (id number, code number, text varchar2(30));

Table created.

SQL> insert into ziggy select rownum,  trunc(dbms_random.value(0, 100)), 'DAVID
BOWIE' from dual connect by level <= 2000000;

2000000 rows created.

SQL> commit;

Commit complete.

SQL> create index ziggy_code_i on ziggy(code);

Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY', estimate_percent=> null, cascade=> true,
method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
2  FROM user_tables t, user_indexes i
3  WHERE t.table_name = i.table_name AND i.index_name='ZIGGY_CODE_I';

TABLE_NAME   INDEX_NAME          BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ --------------- ---------- ---------- -----------------
ZIGGY        ZIGGY_CODE_I          7048    2000000            662962

So at 662,962 it’s what I would describe as a “poor to average” CF. It’s not particularly great with there being just  7,048 table blocks but it’s still some distance from the 2,000,000 row value.

The index is not being used in SQL statements as we (naively) wish, so let’s try and improve things by lowering the index CF by setting the new TABLE_CACHED_BLOCKS preference to the maximum 255 setting:

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'ZIGGY',
pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'ZIGGY_CODE_I',
estimate_percent=>null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
2  FROM user_tables t, user_indexes i
3  WHERE t.table_name = i.table_name AND i.index_name='ZIGGY_CODE_I';

TABLE_NAME   INDEX_NAME          BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ --------------- ---------- ---------- -----------------
ZIGGY        ZIGGY_CODE_I          7048    2000000            662962

We notice to our great disappointment (well, not really) that the CF remains completely unchanged at 662,962 !! Why ?

Again, let’s do some basic maths and consider the data distribution.

The table has some 7048 blocks but each distinct CODE value has some 20,000 occurrences on average. Therefore, each value is going to be found 20000/7048 = roughly 2 to 3 times per block. As the index is in CODE order and for each CODE in rowid order, the CF is going to increment for each CODE value for each distinct block we visit. We will therefore only go back to a previously visited table block (except for the 2 to 3 visits to the current block) when the CODE value changes but this will take us all the way back to the first block which is always going to be some 7047 blocks away from the current one. As 7047 is much greater than the 255 the CF calculation will only cater for, the CF is going to remain unchanged from the default calculation as a result.

And this is all as it should be, as the fundamental CF is indeed poor for these scenarios and even going back the maximum 255 data blocks will not reduce appreciably the manner in which the CF is calculated.

Of course, if there was no limit, then a setting of TABLE_CACHED_BLOCKS  of say 7100 would enable the CF to be recalculated as being perfect in the above scenario, which would indeed be a concern. But 255 is the limit and so limits the potential “damaged” that can be done.

More on all this to come :)

Important !! Clustering Factor Calculation Improvement (Fix You) May 8, 2013

Posted by Richard Foote in 11g, ASSM, CBO, Clustering Factor, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.
43 comments

Believe me, this article is worth reading :)

I’m currently not allowed to discuss Oracle 12c Database goodies but I am allowed to discuss things perhaps initially intended for 12c that are currently available and already back-ported to 11g. This includes a wonderful improvement in the manageability of how the Clustering Factor (CF) of an index can now be calculated. Many thanks to Martin Decker for pointing this out to me.

As anyone who has attended my Index Seminars will know, the CF of an index is one of the most important statistics used by the Cost Based Optimizer (CBO) in determining the most efficient execution plan. As such, it has always been an issue for me that the manner in which the CF is calculated has been so flawed.

Basically, the CF is calculated by performing a Full Index Scan and looking at the rowid of each index entry. If the table block being referenced differs from that of the previous index entry, the CF is incremented. If the table block being referenced is the same as the previous index entry, the CF is not incremented. So the CF gives an indication of how well ordered the data in the table is in relation to the index entries (which are always sorted and stored in the order of the index entries). The better (lower) the CF, the more efficient it would be to use the index as less table blocks would need to be accessed to retrieve the necessary data via the index.

However, there’s a basic flaw here. The CF calculation doesn’t take into consideration the fact the referenced table block, although maybe different from the previous one index entry, might already have recently been accessed. As such, during an index scan, the table block being accessed is almost certainly still cached in the buffer cache from the previous access, thereby not reducing the effectiveness of the index in any appreciable manner. A classic example of this would be a table with a few freelists. Although the data being inserted is not ordered precisely within the same data blocks, the data might actually be very well clustered within only a few blocks of each other.

Picture a table with 100 rows being inserted by 2 sessions simultaneously, each inserting 50 rows based on an ordered sequence. With one freelist, the data is basically inserted in one block first and then once full a second table block. The data is therefore perfectly ordered/clustered and the CF will evaluate to a value of 2 on such an indexed column. But with 2 freelists, one session could insert data into one block while the other session inserts into a second block, with the ordered sequenced values being randomly distributed among the 2 blocks.  The CF could now potentially evaluate to a value of 100 as the rows are jumbled or “toggled” across the two blocks. This is a much much worse value (2 vs. 100) that can adversely impact the CBO calculations, although the efficiency of such an index is really almost identical as both table blocks are certain to be cached during an index scan regardless.

This is also a very common scenario with Automatic Segment Space Management (ASSM) tablespaces as I’ve discussed previously, which of course is now the default these days.

OK, let’s look at an example scenario. I’ll begin by creating a simple little table, an ordered sequence and a procedure that inserts 100,000 rows into the table:


SQL> create table bowie (id number, text varchar2(30));

Table created.

SQL> create sequence bowie_seq order;

Sequence created.

SQL> CREATE OR REPLACE PROCEDURE bowie_proc AS

2  BEGIN

3     FOR i IN 1..100000 LOOP

4         INSERT INTO bowie VALUES (bowie_seq.NEXTVAL, 'ZIGGY STARDUST');

5         COMMIT;

6     END LOOP;

7  END;

8  /

Procedure created.

We note the table lives in an ASSM tablespace:


SQL> select table_name, i.tablespace_name, segment_space_management

from dba_tables i, dba_tablespaces t   where i.tablespace_name = t.tablespace_name and table_name='BOWIE';

TABLE_NAME   TABLESPACE_NAME                SEGMEN

------------ ------------------------------ ------

BOWIE        USERS                          AUTO

We next have 3 different sessions that simultaneously run the procedure to load the table. Note that an ordered sequence is used which means the 3 sessions are randomly grabbing the next sequenced value to insert. The data though is basically being inserted in order of the ID column, it’s just that the data is being distributed across a few blocks as we go along the table, rather than strictly one block after the other.


SQL> exec bowie_proc

PL/SQL procedure successfully completed.

Let’s create an index on the ID (sequenced) column and collect fresh statistics:


SQL> create index bowie_id_i on bowie(id);

Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE',      estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor

2  FROM user_tables t, user_indexes i

3  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';

TABLE_NAME   INDEX_NAME       BLOCKS   NUM_ROWS CLUSTERING_FACTOR

------------ ------------ ---------- ---------- -----------------

BOWIE        BOWIE_ID_I         1126     300000            241465

We notice that although the data in the table in reality is actually quite well clustered/ordered on the ID column, the actual CF of the index is not reflecting this. At a massive 241,465 it’s an extremely high (bad) CF, much closer in value to rows in the table than the number of table blocks, as the CF calculation keeps flipping back and forth between differing blocks. With such a high CF, the CBO is therefore going to cost an index scan accordingly:


SQL> select * from bowie where id between 42 and 429;

388 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1845943507

---------------------------------------------------------------------------

| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |       |   389 |  7780 |   310   (1)| 00:00:04 |

|*  1 |  TABLE ACCESS FULL| BOWIE |   389 |  7780 |   310   (1)| 00:00:04 |

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("ID"<=429 AND "ID">=42)

Statistics

----------------------------------------------------------

0  recursive calls

1  db block gets

1093  consistent gets

0  physical reads

0  redo size

4084  bytes sent via SQL*Net to client

519  bytes received via SQL*Net from client

2  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

388  rows processed

Even though only approx. 0.13% of rows are being accessed and more importantly a similar low percentage of table blocks, the CBO has determined that a Full Table Scan (FTS) is the cheaper alternative. This is an all too familiar scenario, all down to the fact the CF is not accurately reflecting the true clustering of the data and subsequent efficiency of the index.

Finally, at long last, there’s now an official fix for this !!

Bug 13262857 Enh: provide some control over DBMS_STATS index clustering factor computation INDEX describes this scenario and currently has available patches that can be applied on both Exadata databases and Oracle versions 11.1.0.7, 11.2.0.2 and 11.2.0.3. The patches (eg. Patch ID 15830250) describe the fix as addressing “Index Clustering Factor Computation Is Pessimistic“. I couldn’t have described it better myself :)

Once applied (the following demo is on a patched 11.2.0.3 database), there is a new statistics collection preference that can be defined, called TABLE_CACHED_BLOCKS. This basically sets the number of table blocks we can assume would already be cached when performing an index scan and can be ignored when incrementing the CF during statistics gathering. The default is 1 (i.e. as performed presently) but can be set up to be a value between 1 and 255, meaning during the collection of index statistics, it will not increment the CF if the table block being referenced by the current index entry has already been referenced by any of the prior 255 index entries (if set to 255). It basically sets the appropriate parameter in the sys_op_countchg function used to calculate the CF value during statistic gathering to not increment the CF if the current table block has already been accessed “x” index entries previously.

The TABLE_CACHED_BLOCKS preference can be set by either the DBMS_STATS.SET_TABLE_PREFS, DBMS_STATS.SET_SCHEMA_PREFS or DBMS_STATS.SET_DATABASE_PREFS procedures.

So let’s now change the TABLE_CACHED_BLOCKS preference for this table and re-calculate the index statistics:


SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',

pname=>'TABLE_CACHED_BLOCKS', pvalue=>42);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor

2  FROM user_tables t, user_indexes i

3  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';

TABLE_NAME   INDEX_NAME       BLOCKS   NUM_ROWS CLUSTERING_FACTOR

------------ ------------ ---------- ---------- -----------------

BOWIE        BOWIE_ID_I         1126     300000              1035

We notice that the CF has now been significantly reduced (down from 241465 to just 1035), reflecting far more accurately the true clustering of the data when considering the actual effectiveness of using the index.

If we now run the same query as before:


SQL> select * from bowie where id between 42 and 429;

388 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 3472402785

------------------------------------------------------------------------------------------

| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)|Time     |

------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |            |   389 |  7780 |     4   (0)|00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE      |   389 |  7780 |     4   (0)|00:00:01 |

|*  2 |   INDEX RANGE SCAN          | BOWIE_ID_I |   389 |       |     2   (0)|00:00:01 |

------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("ID">=42 AND "ID"<=429)

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

6  consistent gets

0  physical reads

0  redo size

9882  bytes sent via SQL*Net to client

519  bytes received via SQL*Net from client

2  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

388  rows processed

We notice the index is now being selected by the CBO. At a cost of 4 (previously the cost was somewhat greater than the 310 cost of the FTS), this much more accurately reflects the true cost of using the index (notice only 6 consistent gets are performed).

Being able to now set the TABLE_CACHED_BLOCKS preference during statistics collection finally gives us a fully supported and easy method to collect more accurate CF statistics. This in turn can only lead to more informed and accurate decisions by the CBO and ultimately better performing applications. Although available right now via the back ported patches, this will no doubt all be fully documented once the 12c database is finally released.

I can’t recommend enough the use of this new capability :)

The CBO CPU Costing Model and Indexes – Another Introduction September 16, 2009

Posted by Richard Foote in CBO, Index statistics, Oracle Indexes, System Statistics.
11 comments

I’ve previously discussed some basic concepts and formulas regarding how the CBO derives index related costings via the I/O costing model. Time to look at system statistics and the CPU costing model with specific regard to indexes.
 
The first point I would make is that the CPU costing model has some significant improvements over the older I/O costing method and I would strongly recommend adopting the CPU costing model where possible. I’ll explain some of these improvements and advantages over the coming posts.
 
The I/O costing model basically looks at the cost of a specific execution plan in terms of the estimated number of physical I/Os. The less I/Os, the less costly and more efficient the execution plan and the faster the expected response times. There are however a number of short falls with this basic I/O costing strategy in that is doesn’t automatically differentiate between the costs associated with different types of I/Os (eg. between single block and multiblock reads), it doesn’t automatically determine a typical or average size of a multiblock I/O and it doesn’t cost and take into consideration the time and overheads associated with likely CPU resources.
 
The CPU costing model attempts to take into consideration these previous limitations. It automatically takes into consideration discrepancies between the time to complete an average single block I/O versus a multiblock I/O, automatically determines the average size of a multiblock I/Os so it can more accurately determine the likely number of multiblock I/Os in a FTS and automatically determines the expected CPU time for a specific task.
 
To use the CBO CPU costing model, one needs to collect system statistics so that CBO has this additional information, based on the actual system hardware characteristics (Note: since 10g, the hidden parameter _optimizer_cost_model defaults to ‘cpu’ and so is used by default). You do this with the dbms_stats.gather_system_stats procedure. You can collect “Noworkload” statistics in which Oracle basically randomly reads the database data files to determine base statistics such as the average I/O seek time, the average I/O transfer speed and the CPU speed. However,  I would rather recommend the collection of “Workload” stats which are based on the actual workload characteristics of your hardware, based on the real load on your system during the time in which system statistics are gathered (in which case Noworkload statistics are simply ignored).
 
You can gather Workload system statistics by either running:
 
dbms_stats.gather_system_stats(‘START’) to start the system stats collection process followed by dbms_stats.gather_system_stats(‘STOP’) to stop the collection process over a typical, workload period, or
 
dbms_stats.gather_system_stats(‘INTERVAL’, interval=> 120) to say collect system workload stats over a 120 minute period.
 
To view the collected system statistics, query SYS.AUX_STATS$.
 
 
SQL> SELECT pname, pval1 FROM SYS.AUX_STATS$
           WHERE pname IN (‘SREADTIM’, ‘MREADTIM’, ‘MBRC’, ‘CPUSPEED’);
 

PNAME             PVAL1
------------ ----------
SREADTIM              5
MREADTIM             10
CPUSPEED           1745
MBRC                 10

 
The four systems statistics that I’ll focus on for now are:
 
SREADTIM – time in milliseconds for a single block I/O
MREADTIM- time in milliseconds for a multiblock I/O
CPUSPEED - million of CPU cycles per second
MBRC – average number of blocks actually read during multiblock read operations
 
In the above figures, just note therefore that a multiblock read on average takes approximately double the time of that of a single block read and that on average, 10 blocks are read during a multiblock read operation. This provides the CBO with vital information regarding how to now cost and compare potential execution plans.
 
 
The CBO CPU costing model basically looks at the total time required to complete an execution plan by summing:
 
total time to complete all single block I/O activity +
total time to complete all multiblock I/O activity +
total time to complete all the CPU activity

 
This can basically be calculated by:
 
sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second
 
In theory, this should provide the total response time to service an execution plan. However, to keep the actual “cost” figures calculated by the CBO consistent with the I/O costing model, the CBO divides this total time by the average time for a single block I/O, such that the full formula becomes:
 
(sum of all the single block I/Os x average wait time for a single block I/O + 
 sum of all the multiblock I/Os x average wait time for a multiblock I/O +
 sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O
 
The final “cost” figure, even with the CPU costing model, is therefore still expressed in units of single block I/Os. This is an important point …
 
So how does the CBO determine the value of the various figures within this formula ? Well as we’ll see, the CBO get’s the required information both from the system statistics and from the costing formulas previously discussed with the I/O costing model.

However, for index related access paths, there’s some good news regarding being able to simplify matters somewhat.
 
The first bit of good news is that from the perspective of an index access path, there are no multiblock I/Os (except for a Fast Full Index Scan) and so the CPU costing formula can be simplified for indexes to remove the multiblock read component and be just:
 
(sum of all the single block I/Os x average wait time for a single block I/O +
 sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O
 
Secondly, if the CPU component is relatively trivial, it may not be sufficient enough to count towards the final cost. As smaller index scans are likely to consume little CPU, it means the CPU component can also generally be ignored. This reduces the formula for such index scans to just:
 
(sum of all the single block I/Os x average wait time for a single block I/O)
/
average wait time for a single block I/O
 
However, the average wait time for a single block I/O now becomes redundant in this simplified equation, reducing the cost to now be just:
 
sum of all the single block I/Os
 
Well the next bit of good news for those that have followed my previous blog entries with regard to the CBO and Indexes is that the previous formulas regarding the I/O costing model are still applicable when determining the sum of all expected I/Os. The sum of all the single block I/Os associated with an index scan is still basically:
 
sum of all the single block I/Os = index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)
 
In other words, for smaller index scan execution plans, the cost calculated by CBO using the CPU costing model is the same as with the I/O costing model. So no, I wasn’t wasting everyone’s time discussing the various formulas using the older I/O costing model :)
 
If we run the same demo as I ran previously in my initial post regarding the CBO and Indexes where the total cost of the index access plan was 18, but this time using the system statistics listed above:
 
 SQL> alter session set “_optimizer_cost_model” = cpu;
 
Session altered.
 
SQL> SELECT * FROM bowie_stuff2 WHERE id = 420;
 
2000 rows selected.
 

Execution Plan
———————————————————-
Plan hash value: 134336835
 
——————————————————————————–
|Id|Operation                   |Name          |Rows|Bytes|Cost (%CPU)|Time    |
——————————————————————————–
| 0|SELECT STATEMENT            |              |2000|36000|   18   (0)|00:00:01|
| 1| TABLE ACCESS BY INDEX ROWID|BOWIE_STUFF2  |2000|36000|   18   (0)|00:00:01|
|*2|  INDEX RANGE SCAN          |BOWIE_STUFF2_I|2000|     |    9   (0)|00:00:01|
——————————————————————————–
 

We notice that the cost remains exactly the same at 9 for the index range scan component and exactly the same at 18 for the total execution plan when comparing the cost of using the IO costing model vs. the CPU costing model. Introducing system statistics hasn’t changed things for this particular index related execution plan.
 
And this is a very common observation. As indexes use single block I/Os, as the CBO cost remains as a unit of single block I/Os and as CPU consumption for an index scan is often trivial, the resultant costs for index access paths often remain unchanged with the CPU costing model. 

Previously, we looked at how changing parameters such as the optimizer_index_cost_adj impacts the costings of index related execution plans to create a level playing field between index and FTS execution plans.
 
The key point to make with regard to system statistics and the CPU costing model is that in general, the system statistics and the associated formula will automatically ensure a level playing field. However, unlike the optimizer parameters, it will do so by typically adjusting the associated costs of the FTS (rather than the index accesses) as the true costs and wait times associated with multiblock FTS are calculated, but are divided by and expressed in units of single block reads.

So rather than decreasing the associated costs of an index access path, system statistics and the CPU costing model will typically create a level playing by automatically increasing the associated costs of a FTS as appropriate.

To be discussed further …

Index Monitoring and Index Statistics (The Great Gig In The Sky) September 16, 2008

Posted by Richard Foote in 11g, Concatenated Indexes, Extended Statistics, Index Monitoring, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.
8 comments

I write this post whilst listening to Pink Floyd’s masterpiece “The Dark Side Of The Moon” while sadly lamenting the passing away of Richard Wright. RIP and thank you for all the great musical gifts you’ve given me over the years.

In my last post I highlighted an example of where Index Monitoring doesn’t show how indeed Oracle does indeed use an index when checking for the existence of Foreign Key values. Thought I might discuss yet another example of where Oracle does indeed use an index but it’s again not picked up by index monitoring, this time with a slight 11g flavour.

This specific example involves using the statistics associated with the indexes to provide the CBO with useful additional information, although the index itself is not used directly within the execution plan. Dropping the index means losing this information which could possibly result in a different, non optimal execution plans.

Prior to 11g, Oracle can have a hard time of accurately determining the correct selectively where there is a correlation between two (or more) columns in a table. By default, Oracle assumes the selectivity of two distinct columns to be the density of both columns multiplied together. So for example, if one column (say “A”) had 10 distinct values and the other column (say “B” also had 10 distinct values, Oracle assumes the selectivity of both columns combined to be 10 x 10 = 100 distinct values. A predicate such as:

WHERE A = 5 and B = 2

would assume 1% of data would be retrieved if both columns A and B both had 10 distinct possible values.

However, what if there’s a special relationship between the columns and the actual number of distinct combinations was somewhat different ? What if B always equals 2 when A equals 5, what if instead of the theoretical 100 different combinations there were only 10 combinations (or some such) because most of the other possible combination don’t actually exist …

This 9i and 10g demo shows how there is indeed only 10 distinct values for each of two different columns, however there is a direct relationship between these columns such that there is actually only 10 distinct combinations of both these columns (and not the 100 combinations which are possible and which Oracle assumes in it’s selectivity calculations).

Instead of the actual 10,000 rows (or 10% of all data) being selected, Oracle is incorrectly assuming only 1000 rows (or 1%) will be selected. This is a significant error by a order of magnitude which in many cases can result in a less efficient execution plan.

With 11g, Oracle can use the statistics associated with an index to give Oracle some vital extra information. Because if there’s an index based on the two columns, then the number of distinct key values recorded for the index can provide Oracle with a much more accurate estimation of the true selectivity based on the two columns. If a concatenated index based on the two columns only has say 10 distinct values, then Oracle can assume that a specific combination of the two columns is likely to also retrieve 1/10 of all the values and not the 1/100 that are theoretically possible.

This identical 11g demo to the one above shows by having an index on the two columns that have a correlation, Oracle is using the DISTINCT_KEYS statistic for the index to determine the correct selectivity and associated cardinality for the query.

However, the demo clearly shows index monitoring is still not showing the index as being “USED”. If you were to hence drop the index, the CBO loses potentially vital information and the cardinality estimates revert back to being the product of the two column densities as with pre 11g.

By dropping the index which appears to not be used, we can potentially impact other execution plans, even though they don’t directly use the index within the execution plan. The correct cardinality estimates of a table can for example potential drive the order in which the table is subsequently joined or the manner in which it’s joined.

This demo on the possible impact of dropping an “unused” index shows how an execution plan can change to be sub-optimal, even though neither execution actually directly uses the associated index. It’s not a particularly “clever” example, but it does illustrate the potential impact of dropping these so called unused indexes.

Of course, with 11g, we now have the capability of collecting extended statistics. We can potentially determine these same level of statistics by generating statistics on both columns combined. Oracle can determine the actual distinct combinations of columns that are somehow correlated and produce more accurate and detailed statistics with hence make the CBO determine more accurate and reliable cardinality estimates.

This final demo on extended statistics shows how we can recreate the more efficient execution plan and provide the CBO with more detailed extended statistics so that it can accurately determine the correct cardinality estimates without the need to recreate the “unused” index.

Extended statistics can be extremely useful in determining correct cardinality values for column combinations that exist however it still falls somewhat short when it attempts to estimate the expected cardinality for combinations that don’t actually exist, even with histograms.

But that’s a tale for another day.

Now it’s time for “Wish You Were Here” …

Empty Leaf Blocks and Statistics (Sense Of Doubt) July 8, 2008

Posted by Richard Foote in Index Access Path, Index Block Splits, Index Delete Operations, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes, Performance Tuning.
14 comments

I’ve recently been discussing how empty index blocks or those blocks that contain nothing but deleted index entries are placed on the index freelist and can potentially be recycled during subsequent index block split operations.

A point that’s not so well known about such empty index blocks is how Oracle considers them when calculating index related statistics and the possible implications this may have on the CBO.

Let’s set the scene with an example I’ve used previously where we load a table/index with 10000 entries and then subsequently delete the vast majority of them.

SQL> create table rich as select rownum id, ‘Bowie’ text from dual connect by level <= 10000;
 
Table created.
 
SQL> create index rich_i on rich(id);
 
Index created.

OK, so we now have an index with 10000 entries. Let’s just check to see how many leaf blocks we currently have:

SQL> analyze index rich_i validate structure;

Index analyzed.

SQL> select lf_rows, lf_blks, del_lf_rows from index_stats;

   LF_ROWS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
     10000         21           0

So we currently have 10000 LF_ROWS and 21 LK_BLKS with no deleted index rows at this stage.

Let’s now deleted the vast majority of rows from the table and hence index row entries from the index:
SQL> delete rich where id <= 9990;
 
9990 rows deleted.
 
SQL> commit;
 
Commit complete.

OK, so now we have an index with the vast majority of the index entries having been deleted and with all but one index leaf block effectively empty.

Let’s start by looking at how the ANALYZE INDEX … VALIDATE STRUCTURE deals with empty leaf blocks and index entries:

SQL> analyze index rich_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, lf_blks, del_lf_rows from index_stats;

   LF_ROWS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
     10000         21        9990

The first thing we notice is that the LF_ROWS statistics still has a value of 10000. It still counts index entries, even if they’ve been deleted.

We also notice that the LF_BLKS value is 21 so those leaf blocks that are effectively empty are still counted as well.

Let’s now collect statistics using DBMS_STATS as currently recommended by Oracle:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> ‘RICH’, cascade => true, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1′);

If we now look at the index statistics:

SQL> select index_name, num_rows, leaf_blocks from dba_indexes where index_name = ‘RICH_I';

INDEX_NAME NUM_ROWS LEAF_BLOCKS
---------- -------- -----------
RICH_I           10           1

We notice a couple of important differences. Firstly, the NUM_ROWS value is 10, highlighting that only non-deleted index entries are counted. We also notice that the number of LEAF_BLOCKS is only 1, highlighting that only those index leaf blocks that contain non-deleted index entries are counted. Although there are 20 other leaf blocks within the index structure, these are not counted and considered by the CBO when statistics are calculated using DBMS_STATS.

If we run the following simple little query that effectively selects all remaining rows from the table, we notice the following execution plan:

SQL> select * from rich where id between 1 and 10000;

        ID TEXT
---------- -----
      9991 Bowie
      9992 Bowie
      9993 Bowie
      9994 Bowie
      9995 Bowie
      9996 Bowie
      9997 Bowie
      9998 Bowie
      9999 Bowie
     10000 Bowie

Execution Plan
--------------------------------------------
|Id | Operation                   | Name   |
--------------------------------------------
| 0 | SELECT STATEMENT            |        |
| 1 |  TABLE ACCESS BY INDEX ROWID| RICH   |
|*2 |   INDEX RANGE SCAN          | RICH_I |
--------------------------------------------

The index is actually used to select all the remaining 10 rows, in part because the index related costs are so low.

Let’s see what would happens if we were to use the old, ANALYZE command to calculate the index statistics:

SQL> analyze index rich_i compute statistics;

Index analyzed.

First, let’s see if the index statistics are any different …

select index_name, num_rows, leaf_blocks from dba_indexes where index_name = ‘RICH_I';

INDEX_NAME NUM_ROWS LEAF_BLOCKS
---------- -------- -----------
RICH_I           10          21

OK, a big big difference here. Where previously, DBMS_STATS didn’t include the empty leaf blocks in it’s statistics, we now notice that using the ANALYZE command does include such empty leaf blocks. The LEAF_BLOCKS value is now 21, not 1 as it was previously. Note though that the number of NUM_ROWS is still 10, so it still doesn’t count the deleted index entries themselves, just the empty leaf blocks.

But leaf blocks is one of the key statistics used by the CBO when calculating the cost of using an index related access path. Could this all make a difference in how our previous query is costed by the CBO ?

SQL> select * from rich where id between 1 and 10000;

        ID TEXT
---------- -----
      9991 Bowie
      9992 Bowie
      9993 Bowie
      9994 Bowie
      9995 Bowie
      9996 Bowie
      9997 Bowie
      9998 Bowie
      9999 Bowie
     10000 Bowie

10 rows selected.

Execution Plan
----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|*  1 |  TABLE ACCESS FULL| RICH |
----------------------------------

Oh yes indeed. Now the CBO has decided to use a Full Table Scan, in large part because of the additional calculated costs associated with using the index.

Note these tests work the same on all supported versions of Oracle.

So empty leaf blocks can still have a large impact on not only how a query may perform but indeed on how the CBO calculates the associated costs, depending on how the statistics are generated.

Yes, there are differences between the ANALYZE command and DBMS_STATS. This is one of the more subtle differences …

So When Does An Oracle B-Tree Index Increase In Height ? (Almost Grown) April 3, 2008

Posted by Richard Foote in Index Height, Index statistics, Oracle General, Oracle Indexes, Oracle Myths.
59 comments

So when does an Oracle B-Tree index actually increase in height ?

I’ve basically been asked this same question a number of times over the past few days with regard to the discussions on indexes and different block sized tablespaces, so I thought it might be worth quickly sharing the answer to a wider audience.

Imagine a new, empty table and a corresponding new, empty index. At this stage, the index structure basically consists of one, empty block. The index has a BLEVEL of 0 (from DBA_INDEXES) and a HEIGHT of 1 (from INDEX_STATS), yes it can be confusing ;) This block is basically the Root block of the index as it’s the first (and currently only) block to be accessed during an index scan, but at this stage is used to also store the actual index entries as well (and so can kinda be viewed as being a Leaf block as well).

We now start to insert rows into the table and thus row entries into the index. These index entries basically consist of the indexed column(s) and its corresponding ROWID, and are sorted based on the indexed column values.

Eventually, this single index block will fill; Oracle simply can’t add any more index entries into it. Now comes the fun bit.

When Oracle wants to insert a new index entry but it can’t as this Root index block is full, Oracle will allocate two new index blocks. If the new index entry is the maximum value currently to be indexed, Oracle will move all the index entries from the full block and put it into one of the new index blocks and place the new index entry into the other block. This is known as a 90-10 index block split.

If the new index entry isn’t the maximum value, Oracle will place the lower 1/2 valued index entries into one new block and the other 1/2 into the other new block. This is known as a 50-50 index block split.

These two new blocks are now the new leaf blocks in the index structure.

The contents of the previously single filled block is now totally replaced with pointers to the two new blocks. This block therefore remains the Root block in the index structure. These pointers basically consist of the Relative Block Address (RBA) to the new index blocks and a value which represents the lowest indexed value found in the specific referenced leaf block. These indexed values in the Root block are now used by Oracle as the method by which it can navigate the index structure to find the specific index leaf block containing a required indexed entry.

The index has just increased in height and now has a BLEVEL of 1 and a HEIGHT of 2.

As we continue to add more rows into the table, we add more index entries into our 2 leaf blocks. Eventually they will fill again and will again perform either a 90-10 or 50-50 block split depending on the new index value to be inserted. With a non Root block split, only one additional index block is allocated and the index entries are distributed between the full and new index block. Each time a leaf block splits in a BLEVEL 1 index, a new entry is also added into the Root block to point to the new Leaf block. 

Once we have enough Leaf blocks, the Root block will again eventually fill. At this point, Oracle will again allocate two new blocks and distribute the contents of the Root block into these two new blocks, again 90-10 or 50-50 depending on the new indexed value to be inserted. The contents of the Root block is now totally replaced with pointers to these 2 new “Branch” blocks which of course in turn now contain the pointers to the Leaf blocks.

The index has again increased in height and we now have an index with a BLEVEL of 2 and a HEIGHT of 3.

As the leaf blocks continue fill and split, a new entry is added to the corresponding Branch block each time. When these Branch blocks fill and split, a new entry is added to the Root block. When the Root block eventually fills, it will again allocate 2 new blocks and so the index grows in height again.

So basically, an index increases in height whenever the index Root block splits and the two new allocated blocks result in a new level within the index structure. Note the index Root block remains the same throughout the entire life of the index, no matter the index height.

Note also a Root block split is the only time an index increases in height. Therefore, the number of levels between the Root block and any/all of the Leaf blocks is always and must always be the same. Hence, an Oracle B-Tree index is always structurally height balanced, always.

Store Indexes In A Larger Block Tablespace: The Multiblock Read Myth Part II (The Fly) March 20, 2008

Posted by Richard Foote in Index Block Size, Index statistics, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Tablespace Management.
67 comments

Thought I might begin by mentioning a lovely little story by Billy Verreynne in this OTN Forum Thread.

Basically a scientist is doing research on the behaviour of flies. He notes when he opens the jar lid of a trapped fly and claps his hands, the fly takes off and flies away. One day, he decides to pull the wings off the fly. When he claps his hands, the fly just sits there. No matter how loud he claps, other than a slight rocking motion, the fly just doesn’t budge.

Excitedly, he writes in his journal his latest discovery. When you pull the wings off a fly, it goes stone deaf !!

This is soooo funny because this sort of thing happens all the time in the Oracle world (and elsewhere). I see it time after time after time, we’ve all done it.

Hey, I just compressed a segment into one extent and performance improved. Conclusion, storing a segment in one extent improves performance.

Hey, I just separated my indexes from my tables and performance improved. Conclusion, separating indexes from tables improves performance.

Hey, I just moved my indexes into a larger block sized tablespace and my Index Fast Full Scan performance improved. Conclusion, a larger index block size improves Index Fast Full Scan performance.

Lots of flies with no wings. Lots of people thinking flies go deaf when they don’t have wings. An action appears to cause an effect, but does it really …

As Billy himself suggests, the fundamental reason why so many people think “flies with no wings go deaf” is that many don’t understand what’s actually going on. Many simply don’t understand the basic workings, the fundamental processes and mechanisms involved in how Oracle functions or how Oracle performs a specific operation. Tuning by observation, tuning by making change “A” without understanding all the implications of such a change and subsequently making suppositions on the results of such a change, ultimately means we have lots of people thinking flies without wings are deaf.

As I discussed in Part 1, Oracle performs exactly the same sized I/O during a multiblock read, regardless of the block size of the segment. Exactly the same. Without understanding this simple fact, one could very easily come to a wrong conclusion regarding the ramification of block sizes on multiblock reads. Without understanding flies don’t have ears or sound sensors in their wings, one could very well come to a wrong conclusion regarding the ramifications of removing the wings from a fly.

If we perform an Index Fast Full Scan and performance improves, it can’t be because associated multiblock I/Os are more efficient. A fly doesn’t have sound sensors in its wings. There must be another explanation. Conversely, in my example in Part I of this discussion, performance went worse with a larger index block size (as it did in Greg Rahn’s example on this OTN Forum Thread), but again not as a result of multiblock read performance.

So how could the performance of an Index Fast Full Scan change (for better or worse), if one simply rebuilds the index within a larger block size tablespace ? Well there are of course many possible reasons, with the two more obvious explanations being the following:

1) Most randomly inserted indexes have a PCT_USED value ranging between 70-75% as these indexes perform random 50-50 block splits that are subsequently in differing stages of being filled. By rebuilding an index (say back to a default PCTFREE of 10%) one might increase index compactness by say 15% and hence decrease the overall index size (note reduced block overheads may also reduce the index a little as well, depending on index size and differences in block sizes). The Fast Full Index Scan is the access path that potentially benefits most by defragmenting an index as the associated costs are proportional to the overall size of the index. Reducing the size of an index could therefore impact subsequent performance. However, rebuilding the index in the current block size would likely achieve a similar result (plus block overheads), compacting the index and resulting in potentially better performance (although once the index blocks begin to split again, the index would eventually return back to its previous state). Therefore, it’s not the bigger block size but the resultant defragmentation of the index that’s improved matters. The fly isn’t deaf, it just needs its wings to fly …

2) By storing an index in a larger block tablespace, the index must physically be stored on a different database file. This file could be on a faster disk, improving performance, this file could be on a faster part of the disk, improving performance, this file could be on a disk with far less disk contention, improving performance, etc. etc. It’s not the larger block size that’s improved (or worsened) performance, it’s the new physical characteristics of where the index is now stored. If one were to rebuild the index with the current block size and use the same physical characteristics of the larger block index, subsequent performance would likewise increase (or decrease). The fly isn’t deaf, it just needs its wings to fly …

There are many other possible reasons, the system was less busy when using the larger block index, more of the index was physically cached when using the larger block index, etc. etc.

Of course, an Index Fast Full Scan is rarely a scalability issue anyways. Do we really want our applications to perform hundreds of large, concurrent Index Fast Full Scans ? Tuning the application to avoid these overheads should be the focus rather than moving indexes into a larger block tablespace in the vain hope it will improve things dramatically. But that’s the topic of another discussion …

Can the performance of an Index Fast Full scan change after moving the index to a larger block size tablespace. Absolutely. However, it doesn’t necessarily mean such a change in performance is a direct result of the index having a larger block size and that multiblock read performance has improved.

It doesn’t mean moving indexes to larger block size tablespaces suddenly makes Oracle go deaf …

Bzzzzzzzzzzzzzzzzzz

UPDATE: I’ve added this simple little demo that illustrates how performance improves when an index is rebuilt in a larger block tablespace. This will of course suggest to some folk that the larger block tablespace improved the performance but what actually improved things was rebuilding the fragmented index to be a more efficient structure. The larger block tablespace was not the fix, rebuilding the index was the important factor. In fact, by rebuilding the index in the original smaller block tablespace, not only do we also improve performance, but things are further improved as we reduce CPU overheads incurred by the larger block tablespace and as a result elapsed times are further improved.

Store Indexes In A Larger Block Tablespace: The Multiblock Read Myth (Karma Police) March 18, 2008

Posted by Richard Foote in Index Block Size, Index statistics, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Tablespace Management.
8 comments

One of the great myths surrounding the use of differing block sizes is that storing indexes in larger block sizes somehow dramatically improves the performance of index related multiblock reads.

Oracle performs index multiblock reads when performing an Index Fast Full Scan, when it basically treats the index structure as a skinny version of the table. It reads the entire index structure, multiple blocks at a time, “throwing away” any non leaf blocks as it stumbles across them.

The theory goes that by storing indexes in larger size blocks, we would obviously have fewer index related blocks. If we need to read the entire index, as we do with an Index Fast Full Scan, surely it must be more efficient if we have fewer, larger index blocks.

The evidence looks convincing. Here’s a link to an extract from a book by Robin Schumacher where he clearly shows a dramatic “improvement” by using an index tablespace with double the block size. In one query using an 8K block index, the consistent reads during an Index Fast Full Scan was 421. However, when the same index was recreated as a 16K block index, the same query only used 211 consistent gets, 1/2 of what it was previously.

Conclusive “proof” that the 16K block index improved performance wouldn’t you say ?

Well actually, it’s only conclusive proof that the number of consistent gets has dropped, whether it actually improves “performance” is another thing entirely.

There are a couple of little “details” that many don’t quite appreciate. The devil is always in the details …

The first point to note is that when Oracle performs a multiblock read, it uses the value in the db_file_multiblock_read_count parameter to determine how many blocks to read per multiblock read (with system statistics, Oracle itself can determine how best to set this value).

So if the db_file_multiblock_read_count value were set to say 16, Oracle will attempt to read as many as 16 blocks at a time during a multiblock read operation.

Note this value is based on the default block size of the database. So if the default block size is 8K and the db_file_multiblock_read_count is 16, Oracle will try and read 16 x 8K blocks at a time during a multiblock read operation.

However, if there’s a non-default block sized segment (say 16K), Oracle will adjust the number of blocks that are actually read during a multiblock read operation so that the maximum size of the overall multiblock read is identical to that of the default block size.

So if the db_file_multiblock_read_count is 16 and the default block size is 8K, a multiblock read of an object in a 16K tablespace will only read 8 blocks at a time (and not 16). A multiblock read of an object in a 2K tablespace will read 64 blocks at a time.

The actual size of a multiblock read therefore is identical regardless of the block size of an object within a database.

An easy way to highlight this is to simply trace a session and see the specific size of corresponding multiblock read operations.

A sample from a trace on an 8K block index (with the db_file_multiblock_read_count set to 16), performing an Index Fast Full Scan looks like this:

WAIT #1: nam=’db file scattered read’ ela= 1487 file#=8 block#=1050 blocks=16 obj#=78294 tim=615409554677
WAIT #1: nam=’db file scattered read’ ela= 1377 file#=8 block#=1066 blocks=16 obj#=78294 tim=615409557777
WAIT #1: nam=’db file scattered read’ ela= 1143 file#=8 block#=1082 blocks=16 obj#=78294 tim=615409561563

Note that Oracle is reading 16 x 8K blocks (128K) per multiblock read operation.

However, when the index is recreated in a 16K block size tablespace, the Fast Full Index Scan looks like this:

WAIT #1: nam=’db file scattered read’ ela= 1413 file#=6 block#=14 blocks=8 obj#=78296 tim=626802128684
WAIT #1: nam=’db file scattered read’ ela= 1447 file#=6 block#=22 blocks=8 obj#=78296 tim=626802131649
WAIT #1: nam=’db file scattered read’ ela= 2014 file#=6 block#=30 blocks=8 obj#=78296 tim=626802135222

Note that Oracle is now only reading 8 x 16K blocks (128K) per multiblock operation.

Both indexes are effectively doing exactly the same work, both are effectively reading up to 128K of data per multiblock read …

It’s like paying someone $50 per 1/2 hour of work and then deciding to make things more “efficient” by paying them $100 per hour of work instead. In the end, you’re still just paying them $800 for an 8 hour day’s work regardless …

Note a larger block size will have less associated block overheads with there being less actual blocks so the overall size of an index may reduce a little, depending on index size and differences in block sizes. Therefore any possible improvements will only be restricted to the potential savings in the overall index size. With many databases having default block sizes of 8k and a maximum block size restricted to 16k, these savings may be minimum or non-existent.

This demo on the impact of different block sizes on multiblock read operations shows how Oracle actually performs the same sized reads when performing multiblock reads from differing block sized tablespaces, with the performance of the index in the larger block size tablespace being somewhat worse in this specific example.

With Oracle effectively performing identical work behind the scenes, the performance between different block size tablespaces is likely to be similar. You’re still paying $800 a day regardless …

Although it’s often claimed that multiblock reads is one of the key areas where larger index block sizes are beneficial, a claim based generally on the simplistic fact the number of consistent reads is reduced, the reality of the situation is somewhat different …

Introduction To Reverse Key Indexes: Part IV (Cluster One) January 21, 2008

Posted by Richard Foote in Clustering Factor, Index statistics, Oracle Indexes, Oracle Myths, Reverse Key Indexes.
5 comments

There’s a myth that suggests if the Clustering Factor (CF) of an index is greater than a certain ratio when compared to the number of rows in the table, the CF is poor and an index rebuild would be beneficial.

The slight problem with this advice is that the CF actually measures how well aligned the order of the column values are in the table as compared to the order of the index entries in the index. Generally, a table in which the column values are ordered in a similar manner to the index will have a CF closer to the number of blocks in the table. A table in which the column values are ordered in a random manner when compared to the index will have a CF closer to the number of rows in the table.

An index rebuild doesn’t change the ordering of the index row entries and an index rebuild has no impact on the table so therefore the comparative ordering of both remains unchanged. Therefore the CF of an index will be identical after the rebuild as it was before.

Well actually, there is one slight exception to this rule. Reverse Key Indexes.

Generally, rows with monotonically increasing column values are physically inserted in the order of the monotonically increasing columns. This may not be the case however with tables in ASSM tablespaces or tables with multiple freelists or freelist groups as concurrent inserts will be directed to differing blocks. In these cases we may actually have data that is quite well clustered but may have quite poor CF values due to the manner in which the CF is calculated.

Assuming a Non-ASSM, single freelist/freelist group table, the CF of monotonically increasing indexed values would ordinarily be quite good. As a non-reverse index must also have its values in the monotonically column order, the CF of the index is likely to be nice and low.

However, if you were to rebuild the index as a Reverse Key Index, the index values get reversed and “randomly” redistributed within the index structure, totally changing the order of the index entries within the index. As a result, the index values are no longer aligned with those of the table and the CF is likely to now be quite appalling.

Rebuilding an index generally has no impact on the CF as the index row values retain the same logically order. Rebuilding an index to be reverse (or visa-versa) is the exception to the rule as it will physically (and logically) change the index row order.

A Reverse Key Index is likely therefore to have a much worse CF than it’s non-reverse equivalent.

A Reverse Key Index will be ignored for range predicates (as already discussed in Part I) so a poor CF may not have an impact. However, as also discussed, index range scans are still viable in some scenarios so an increased CF may impact execution plans detrimentally.

See this demo on how a Reverse Index Rebuild turns a “perfect” CF into a shocker.

DBMS_STATS METHOD_OPT default behaviour changed in 10g. Be careful … January 4, 2008

Posted by Richard Foote in Index statistics, Oracle Cost Based Optimizer, Oracle General, Performance Tuning, Richard's Musings.
76 comments

A question on the OTN forum has prompted me to quickly knock up a demo on the possible dangers of the default behaviour in 10g with regard to the METHOD_OPT option in DBMS_STATS.

When collecting statistics with DBMS_STATS in 9i, the default value of METHOD_OPT was ‘FOR ALL COLUMNS SIZE 1′. This basically says to Oracle please only collect basic column statistics (min, max, distinct values etc.), do not collect histograms on these columns. For columns that are evenly distributed and for columns that are not referenced in SQL statements, this is perfectly adequate. If a column was unevenly distributed and detrimentally impacted the CBO’s costings of an execution plan, then one could generate histograms for those particular columns separately.

However, this default behaviour changed in 10g and IMHO this change is possibly the most significant and problematic difference when migrating to 10g.

The new default value of METHOD_OPT with 10g is ‘FOR ALL COLUMNS SIZE AUTO’. This basically means that Oracle will automatically decide for us which columns need histograms and which columns don’t based on what it considers to be the distribution of values within a column and based on the “workload” associated with the table (basically are there any SQL statements running in the database referencing columns which might need histograms for those statements to be costed correctly).

This sounds like an ideal scenario, just let Oracle work it out for us.

However, the problem is that Oracle in many cases doesn’t do a particularly good job at determining when it should generate a histogram and when it shouldn’t. In fact, the likelihood is that Oracle will actually generate many many many unnecessary histograms while still missing out on some columns that should have them.

In environments with few tables and with few users executing few distinct SQL statements, the impact of some unnecessary histograms may be minimal. However in environments with many tables and columns (potentially many thousands) with many users executing many different SQL statements, the ramifications of potentially suddenly having thousands of additional histograms can be disastrous.

Note also that by having a histogram, Oracle changes the manner in which the DENSITY statistic for a column is calculated (as stored in DBA_TAB_COLUMNS). This is often used by Oracle to determine the selectivity of predicates so the impact of suddenly having additional unnecessary histograms can be wider and more significant than one might initially imagine.

Of course, the impact on the shared_pool and the row_cache and it’s associated latches in particular can be extremely ugly indeed if suddenly Oracle had to deal with thousands of new histograms when parsing statements.

This silly little demo, “Dangers of default METHOD_OPT behaviour in 10g“,  creates a simple little table with three columns. The first column has an outlier value and as previously discussed here, a histogram might be required to correctly cost range scans. The second column is perfectly distributed, it has 10 distinct values with 100,000 occurrences of each. The third column is also perfectly distributed but it’s a special example in that it has only 1 distinct value.

As you can see by the results of the demo, Oracle has got it wrong one way or the other in varying degrees in all three examples. It hasn’t created a histogram when it was needed and created histograms when they weren’t needed, impacting the Density column statistics as a result.

My advice. Just be very careful when using the default method_opt ‘FOR ALL COLUMNS SIZE AUTO’ behaviour in 10g.

Index Internals – Rebuilding The Truth December 11, 2007

Posted by Richard Foote in Index Coalesce, Index Height, Index Internals, Index Rebuild, Index Shrink, Index statistics, Oracle Indexes, Oracle Myths, Oracle Opinion, Richard's Musings.
7 comments

The issue of when to rebuild indexes crops up again and again and again. Unfortunately, so do the same incorrect, myth-filled uninspired responses which considering how important and “key” (no pun intended) indexes are to database design and performance generally, is very puzzling indeed.

In the coming days I’ll post why I believe these index related myths simply refuse to go away …

This presentation was originally written several years ago but is still as relevant today as it’s always been.

Recently updated version: Index Internals – Rebuilding The Truth

Follow

Get every new post delivered to your Inbox.

Join 1,883 other followers