
Myth: Bitmap Indexes With High Distinct Columns (Supermassive Black Hole) March 3, 2010

Posted by Richard Foote in Bitmap Indexes, CBO, Clustering Factor, Oracle Indexes, Oracle Myths.
34 comments

As discussed in my previous post, it’s a silly myth to suggest a bitmap index should only be used for so-called “low cardinality” columns, else the resultant index would be “huge”. I thought it might be worth a quick refresher with a simple little demonstration of why such claims are nonsense. There is in fact no limit to the number of distinct values beyond which a column stops being a candidate for a bitmap index.

I’ve already shown how a bitmap index on a column with 10,000 distinct values could potentially be smaller than an index with just 4 distinct values, even with the same data type and size. The number of distinct values in a column is but one small consideration; the number of rows in the table and the average ratio of repeated values are just as important. The other important consideration that can significantly impact the size of a bitmap index is the distribution and clustering of the data within the table, as we shall see.
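For anyone who hasn’t read the previous post, the demo can be recreated along the following lines. Note this is only a hedged sketch as the exact script lives in the previous post; the expressions below are illustrative assumptions that simply produce 10,000 distinct CODE values, each repeated 100 times and perfectly clustered within the table:

-- illustrative setup only: 1M rows, CODE = 10,000 distinct values, 100 rows each, clustered
SQL> create table big_bowie as select rownum id, ceil(rownum/100) code, 'DAVID BOWIE' name from dual connect by level <= 1000000;

Table created.

SQL> create bitmap index big_bowie_code_bitmap_i on big_bowie(code);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BIG_BOWIE', estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.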

Using the same demo as the previous post, here’s a reminder of the size of our bitmap index on a column with 10,000 distinct values in a 1 million row table.
 

SQL> select index_name, index_type, distinct_keys, blevel, leaf_blocks from dba_indexes where index_name = 'BIG_BOWIE_CODE_BITMAP_I';
 
INDEX_NAME                INDEX_TYPE DISTINCT_KEYS  BLEVEL LEAF_BLOCKS
------------------------- ---------- ------------- ------- -----------
BIG_BOWIE_CODE_BITMAP_I   BITMAP             10000       1          52
 

  

 
Let’s now see if the CBO will actually use this bitmap index on its own: 

  

SQL> select * from big_bowie where code = 42;

100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4280385727

--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |   100 |  7300 |    22   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIG_BOWIE               |   100 |  7300 |    22   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                         |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | BIG_BOWIE_CODE_BITMAP_I |       |       |            |          |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("CODE"=42)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1854  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

 
 

Absolutely it does. As there are on average just 100 rows per distinct value, that's a small enough selectivity for the CBO to use the bitmap index on its own. Note the query has used just 6 consistent gets to return the 100 rows of data.
 
Let's now drop the bitmap index and see how a regular B-Tree index would compare and perform:
 

  
SQL> drop index big_bowie_code_bitmap_i;
 
Index dropped.
 
SQL> create index big_bowie_code_i on big_bowie(code);
 
Index created.
 
SQL> select index_name, index_type, distinct_keys, blevel, leaf_blocks, clustering_factor as cf from dba_indexes where index_name = 'BIG_BOWIE_CODE_I';
 
INDEX_NAME           INDEX_TYPE DISTINCT_KEYS  BLEVEL LEAF_BLOCKS         CF
-------------------- ---------- ------------- ------- ----------- ----------
BIG_BOWIE_CODE_I     NORMAL             10000       2        2090      10895

  

The first thing we notice is that the B-Tree index is significantly larger than the equivalent bitmap index: 2090 leaf blocks, whereas the bitmap index had just 52 leaf blocks. So it’s not the bitmap index that’s so-called “huge” here on a column with 10,000 distinct values but the corresponding B-Tree index. Notice also that the Clustering Factor of the index is quite good at 10,895 in a 1 million row table.

If we now run the same query as before: 

 
SQL> select * from big_bowie where code = 42;
 
100 rows selected.
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1856845120
 
------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                  |   100 |  7300 |     5 (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BIG_BOWIE        |   100 |  7300 |     5 (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BIG_BOWIE_CODE_I |   100 |       |     3 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("CODE"=42)
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
       1854  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
 
 

 

The query also uses the index but as the B-Tree index is somewhat larger, the overall number of consistent gets has increased from 6 up to 8. So not only is the bitmap index substantially smaller despite having 10,000 distinct values, but it’s also more efficient to use as a result.

A key reason why the bitmap index is so small and compact is that the effective Clustering Factor of the indexed column is excellent. The data was inserted into the table in CODE column order and so all the rows with the same CODE value are grouped, ordered and clustered together within the table. Within the bitmap index, this means there are few, large groupings of zeros (0) that can be compressed extremely efficiently.

For example, for the first CODE value of 1, the bitmap sequence would look something like: 

111111 .... 1110000000000000000000000000000000........000000 

with the 100 values of 1 (true) grouped together, followed only by a series of effectively 999,900 zeros (0 representing false). The 999,900 zeros can be compressed down to almost nothing at all. Note there could actually be somewhat more false (0) values than actual rows in the table but that’s a discussion for another day.

The next value of 2 would have a bitmap sequence something like: 

00000000....0001111111...11111100000000000000000000...0000 

with 100 zeros followed by 100 1s followed by effectively 999,800 zeros, with the two groupings of zeros again compressed down to almost nothing at all.

And so on ... 
Let's now create a different table that contains the identical data but  this time with the CODE values effectively randomized throughout the table. The Clustering Factor of the CODE column in this case will be terrible: 

 
 
SQL> create table big_bowie_2 as select * from big_bowie order by mod(id,100);
 
Table created.
 
SQL> create bitmap index big_bowie_2_code_bm_i on big_bowie_2(code);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BIG_BOWIE_2', estimate_percent=>null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> select index_name, index_type, distinct_keys, blevel, leaf_blocks from dba_indexes where index_name = 'BIG_BOWIE_2_CODE_BM_I';
 
INDEX_NAME             INDEX_TYPE DISTINCT_KEYS  BLEVEL LEAF_BLOCKS
---------------------- ---------- ------------- ------- -----------
BIG_BOWIE_2_CODE_BM_I  BITMAP             10000       1         520
 

 

The first thing we notice is that the bitmap index is now substantially larger than it was previously. It’s gone from just 52 leaf blocks all the way up to 520 leaf blocks, a whole 10 times larger than before. The reason is all to do with the clustering of the data, as values for CODE are now littered all over the table and are no longer grouped together.

The bitmap sequence for the first CODE value of 1 would now look something like: 

00000000000100000000000000000..0000010000000000000000...0000010000000000001000000000...00000100000.... 

with the 1s now littered all over the place. This means it's far less efficient to compress the groups of zeros as there are now substantially more such groupings than before. The index is now 10 times larger as a result. 

If we now run the same query as before: 

  

SQL> select * from big_bowie_2 where code = 42;
 
100 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 1324437154
 
------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                       |   100 |  7300 |  22     (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIG_BOWIE_2           |   100 |  7300 |  22     (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                       |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | BIG_BOWIE_2_CODE_BM_I |       |       |            |          |
------------------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("CODE"=42)
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        103  consistent gets
          0  physical reads
          0  redo size
       1854  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
 

 

Although the CBO is still using the bitmap index on its own, the performance of the query has deteriorated substantially, with the number of consistent gets increasing from 6 all the way up to 103.

So the bitmap index is now nowhere near as efficient. A bitmap index isn't that great with large numbers of distinct values if the associated clustering is poor and so the B-Tree index is the way to go after all, right ? 

Well just wait a minute. If the clustering is poor for a bitmap index, the clustering will likewise be poor for a corresponding B-Tree index as well. Most of these consistent gets are due to reading the data out of the table, not from reading the index itself: with the 100 required rows now scattered across roughly 100 different table blocks, the table accesses alone account for almost all of the 103 consistent gets. So the performance of using an equivalent B-Tree index is also likely to be impacted.

Let's compare the difference with a B-Tree index: 

   
SQL> drop index big_bowie_2_code_bm_i;
 
Index dropped.
 
SQL> create index big_bowie_2_code_i on big_bowie_2(code);
 
Index created.
 
SQL> select index_name, index_type, distinct_keys, blevel, leaf_blocks, clustering_factor as cf from dba_indexes where index_name = 'BIG_BOWIE_2_CODE_I';
 
INDEX_NAME             INDEX_TYPE DISTINCT_KEYS  BLEVEL LEAF_BLOCKS         CF
---------------------- ---------- ------------- ------- ----------- ----------
BIG_BOWIE_2_CODE_I     NORMAL             10000       2        2090     999922
 
 

 

The first thing to note here is that the B-Tree index is still 2090 leaf blocks in size. Even the now far less efficient bitmap index, at 520 leaf blocks, is still only approximately 25% the size of the B-Tree index. So the 10,000 distinct value bitmap index, even when it’s as inefficient as possible, still can’t be described as “huge” here as it’s still only a 1/4 the size of the corresponding B-Tree index. With a Clustering Factor of 999,922 on a 1 million row table, it doesn’t really get much worse than that, and yet the bitmap index on a column with 10,000 distinct values is still way ahead of the B-Tree index.

Let's see how the query performs now: 

 
SQL> select * from big_bowie_2 where code = 42;
 
100 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 2550134594
 
--------------------------------------------------------------------------------------------------
| Id  | Operation                   | Name               | Rows  | Bytes |  Cost(%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                    |   100 |  7300 |   103   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BIG_BOWIE_2        |   100 |  7300 |   103   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BIG_BOWIE_2_CODE_I |   100 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("CODE"=42)
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        105  consistent gets
          0  physical reads
          0  redo size
       1854  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

As we can see, the performance of the B-Tree index has likewise deteriorated with such a bad Clustering Factor. At 105 consistent gets, it’s actually worse than the corresponding bitmap index, which only needed 103 consistent gets.

With a Bitmap index that is as inefficient as it can be, on a column that has 10,000 distinct values in a table of only 1 million rows, the Bitmap index still outperforms the corresponding B-Tree index. 

It's a complete myth and utter nonsense that a bitmap index is only suitable for low cardinality columns and would become "HUGE" and be 100s of times slower if created on so-called high cardinality column 

Let me repeat: There is no limit to the number of distinct values in a column for it to be considered for a Bitmap index.

How Does An Execution Plan Suddenly Change When The Statistics (And Everything Else) Remains The Same ? (In Limbo) February 16, 2010

Posted by Richard Foote in CBO, Index Access Path, Oracle Cost Based Optimizer, Oracle Indexes, Oracle Myths.
111 comments

I’ve slipped this post in as there have been a number of discussions recently on how execution plans have changed while nothing else appears to have changed in the database. How can an execution plan suddenly change when no one has made any changes to the database ?
 
By no changes, it means that there have been no alterations to any segments, no new indexes have been added, no changes associated with bind peeking (indeed, there may not even be any bind variables), no parameter changes, no new patches or upgrades, no new outlines or profiles, no new system stats and, perhaps most prevalent of all, no changes to any CBO statistics.
 
The DBA hasn’t touched a thing and yet suddenly, for no apparent reason, execution plans suddenly change and (say) an inappropriate index is suddenly used and causes performance degradation.
 
How can this be possible ?
 
There are two key points I want to emphasise.
 
The first is that there’s a common misperception that if no new statistics are gathered (and assuming nothing else is altered in the database), that execution plans must always remain the same. That by not collecting statistics, one somehow can ensure and guarantee the database will simply perform in the same manner and generate the same execution plans.
 
This is fundamentally not true. In fact, quite the opposite can be true. One might need to collect fresh statistics to make sure vital execution plans don’t change. It’s the act of not refreshing statistics that can cause execution plans to suddenly change.
 
The second point is that when one goes through all the things that might have changed in the database, two important aspects are often overlooked.
 
The first thing that does usually change within most databases is the actual data within the database. Those damn users log on and keep adding new data and modifying data all the time. It might not be a database change as such but the fact the data changes within a database is a critical change that can directly influence CBO behaviour. When pointing the finger at what might have caused an execution plan to change, many simply ignore the fact the data is constantly changing in the background.
 
The other aspect that always changes is time. People have tried but it’s very difficult to stop time. When things worked well, it was at a different point in time than now when things have suddenly gone wrong.
 
So some things do change that are not in direct control of the DBA.
 
But if we don’t collect fresh statistics, even though the data might have changed, won’t those data changes be effectively invisible to the CBO? If the statistics don’t reflect any of the data changes and the CBO doesn’t think the data has changed, doesn’t that mean it can’t suddenly change how it determines an execution plan?
 
Not true. It’s quite possible that precisely because the statistics haven’t changed, the CBO is forced into making changes in how it costs and determines an execution plan.
 
A very simple example follows, a classic case of why not refreshing statistics has caused the CBO to suddenly change an execution plan for no apparent reason.
 
I’ll begin by creating a simple little table and populate it with approximately 5 years worth of data.

 
SQL> create table muse (id number, muse_date date, name varchar2(10));
 
Table created.
 
SQL> declare
  2  v_count  number;
  3  begin
  4  v_count:=0;
  5  for i in 1..1830 loop
  6     for j in 1..1000 loop
  7     v_count:= v_count+1;
  8     insert into muse values (v_count, sysdate-i, 'MUSE');
  9     end loop;
 10  end loop;
 11  commit;
 12  end;
 13  /
 
PL/SQL procedure successfully completed.
 
SQL> create index muse_i on muse(muse_date);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'MUSE', cascade=>true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

So the procedure basically populates the table, setting the MUSE_DATE column with approximately 5 years worth of data, with 1000 rows for each day so the data is evenly distributed across those 5 years.

Note that I’ve collected statistics on the table and index and they’re fully up to date.

The following query is a typical query in our application, where we’re only interested in looking at the previous year’s worth of data. It simply selects all data that is only a year old. This is a query that’s run all the time and only looks at a “moving window” of data, that being just those rows that were inserted up to a year ago.


 
SQL> select * from muse where muse_date > sysdate - 365;
 
364000 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2738706195
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   363K|  6390K|  1330  (11)| 00:00:07 |
|*  1 |  TABLE ACCESS FULL| MUSE |   363K|  6390K|  1330  (11)| 00:00:07 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("MUSE_DATE">SYSDATE@!-365)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       5992  consistent gets
       5912  physical reads
          0  redo size
    3638996  bytes sent via SQL*Net to client
       1188  bytes received via SQL*Net from client
         74  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     364000  rows processed
 

Notice how the CBO has decided to use a Full Table Scan (FTS) as a year is quite a chunk of the table and is more effectively accessed in this manner. Notice also how the CBO has got the cardinality spot on and has correctly predicted the number of rows to be returned. If the CBO gets the selectivity and so the cardinality of the query correct, we have some confidence that it has indeed come up with the most efficient execution plan. Indeed, the users are perfectly happy with the performance of the query, the DBAs are happy and because we don’t really want to risk the CBO suddenly changing things, we decide to not collect statistics on this table any more.

However, more data is pumped into the table each and every day by the end-users.

The following procedure will add another year’s worth of data into the table to simulate how the table will be populated in a year’s time …

SQL> declare
  2  v_count  number;
  3  begin
  4  v_count:=1830000;
  5  for i in 1..365 loop
  6     for j in 1..1000 loop
  7     v_count:= v_count+1;
  8     insert into muse values (v_count, sysdate+i, 'MUSE');
  9     end loop;
 10  end loop;
 11  commit;
 12  end;
 13  /
 
PL/SQL procedure successfully completed.

Note that we have NOT collected any new statistics.

OK, let’s now fast track ourselves one year into the future and run the same query again. Note in a year’s time, we will be 365 days past the current sysdate. So we’ll mimic running the identical query by simply adding 365 days to the sysdate and again querying for the latest year’s worth of data:


 
SQL> select * from muse where muse_date > (sysdate+365) - 365;
 
365000 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 1682432684
 
--------------------------------------------------------------------------------------
| Id  | Operation                   | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |        |   944 | 16992 |     9   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| MUSE   |   944 | 16992 |     9   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | MUSE_I |   944 |       |     5   (0)| 00:00:01 |
--------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("MUSE_DATE">SYSDATE@!+365-365)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       4005  consistent gets
       1301  physical reads
     134192  redo size
    4024147  bytes sent via SQL*Net to client
       1188  bytes received via SQL*Net from client
         74  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     365000  rows processed
 

We notice that the execution plan has now changed !!

It’s now suddenly starting to use an index where previously it was using a FTS. Notice also that the CBO has got the cardinality estimate way wrong, predicting only 944 rows will be returned. Instead of estimating it will get a year’s worth of data, the CBO is estimating only approximately 1 day’s worth of data, or the selectivity based on one distinct value. If the CBO gets this so terribly wrong, there’s a good chance it has also got the execution plan terribly wrong as well.

The query is effectively the same query that would be run in a year’s time, the statistics have not been changed and yet the execution plan has indeed changed. The CBO suddenly using this index may be a terrible thing, resulting in a really inefficient execution plan and a massive increase in LIOs.

Why has the plan changed when the statistics have not ?

The key issue here is that the CBO thinks the maximum date in the table is from a year ago, when the statistics were last calculated. However, the query is attempting to select data that is beyond the range of values known to the CBO. How can it know the estimated cardinality of the query, the number of rows expected to be returned, when we’re only interested in rows that are beyond its maximum known range of data?

The answer is that it can’t. Not accurately anyway.

The CBO has to guess and, depending on the version of Oracle and the type of query being parsed, it can guess in a number of different ways. Because the query is effectively after data that is greater than the maximum known value, the CBO is basically “guessing” there will only be a day’s worth of data greater than its maximum known value, not the full year’s worth that’s really in the table. The CBO having to guess is not a good thing, especially when it’s more than likely to get the guess hopelessly wrong.
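One way to see exactly what the CBO has to work with here is to look at the column statistics directly. A sketch only; note that HIGH_VALUE is stored as RAW, so decoding it for a DATE column takes a little extra work with dbms_stats.convert_raw_value, not shown here:

-- sketch: what does the CBO currently believe about MUSE_DATE ?
SQL> select column_name, num_distinct, high_value, last_analyzed from dba_tab_col_statistics where table_name = 'MUSE' and column_name = 'MUSE_DATE';

If LAST_ANALYZED is a year old, then so is the maximum MUSE_DATE value known to the CBO.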

Note this change will have occurred suddenly, one day into the future, when the CBO starts to consider there are so few rows worth returning that the index suddenly becomes the best and cheapest option.

How do we fix this inefficient execution plan ?

Rather than having the CBO guess how many rows might be returned, let it actually know. Simply collect fresh statistics and let the CBO know that we actually have a full year’s worth of data since the statistics were previously collected:

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'MUSE', cascade=>true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.

 

If we run the same query again now …


 
SQL> select * from muse where muse_date > (sysdate+365) - 365;
 
365000 rows selected.
 
 
Execution Plan
----------------------------------------------------------
Plan hash value: 2738706195
 
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |   364K|  6413K|  1652  (14)| 00:00:09 |
|*  1 |  TABLE ACCESS FULL| MUSE |   364K|  6413K|  1652  (14)| 00:00:09 |
--------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("MUSE_DATE">SYSDATE@!+365-365)
 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       7205  consistent gets
       6709  physical reads
          0  redo size
    4024147  bytes sent via SQL*Net to client
       1188  bytes received via SQL*Net from client
         74  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     365000  rows processed

 

We now notice the CBO has got the cardinality spot on again and chooses to use the efficient FTS.

So yes, an execution plan can change even if we don’t make any changes to the database, including not collecting fresh statistics. If you think that by not collecting statistics things will simply remain the same, then one day, when you least expect it, things might suddenly go terribly wrong.

Solving such issues can be extremely difficult if you try to do so by looking at what might have changed, especially if you don’t know what you’re looking for …

The CPU Costing Model: A Few Thoughts Part V (Reality) January 13, 2010

Posted by Richard Foote in CBO, Multiblock Reads, Oracle Cost Based Optimizer, Richard's Musings, System Statistics.
5 comments

There’s plenty more I could talk about regarding the CBO CPU costing model and system statistics but I’ll make this my final little comment on this subject for now.
 
As previously discussed, the CPU costing model basically takes the time it takes to perform all the necessary I/O related activities, adds the time it takes to perform all the necessary CPU related activities, and so arrives at the overall time to complete a task. The CBO then takes this total and divides it by the average time to perform a single block I/O so that it expresses the overall costs in units of single block I/Os.
 
There are two advantages with expressing CBO costs in this manner.
 
Firstly, it makes the move from the old I/O costing model a little easier in that the “units” of cost under both CBO costing models is very similar.
 
With the I/O costing model, the unit of cost was also basically the number of I/Os. It’s just that the CBO made no (automatic) distinction between the I/O costs associated with single and multiblock reads. The cost was simply the expected total number of I/Os for a given execution plan, with single block and multiblock I/Os being considered the same (unless the optimizer_index_cost_adj parameter kicked in).
 
With the CPU costing model, the costs are expressed specifically in units of single block I/Os. However, the CBO automatically takes into consideration and differentiates the relative costs associated with multiblock I/Os (and CPU operations) and incorporates them into the final cost.

The other nice advantage is that one can use the actual cost values as an indication of how long an operation or execution plan is likely to take. The overall execution times of the plan are divided by the average time of a single block I/O when using the CPU costing formula. Therefore by multiplying these cost values out again by the average time of a single block I/O (SREADTIM system statistic), one can have an indicative idea of the overall expected execution time.
 
The overall execution time as estimated by the CBO using the CPU costing model is therefore basically: cost of execution plan x SREADTIM system statistic.
 
Using my previous example with the FTS where the overall cost of the execution plan was 70, and the SREADTIM system statistic was 5:
 
the overall execution time as estimated by the CBO is approximately 70 x 5 = 350 ms.
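As a sketch of how one might apply this in practice (illustrative only; it assumes the statement of interest is still cached and simply multiplies the cost of the top plan line by the SREADTIM system statistic):

-- indicative only: top-line plan cost x SREADTIM gives a rough elapsed time in ms
SQL> select p.sql_id, p.cost * s.pval1 as est_elapsed_ms from v$sql_plan p, sys.aux_stats$ s where s.pname = 'SREADTIM' and p.id = 0 and p.sql_id = '&sql_id';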
 
Now this of course is only an indicative value. As all system related statistics are simply averages, there could obviously be discrepancies with how long specific I/Os take to actually perform, the size and number of specific multiblock read operations, etc. There may also be caching characteristics of objects that may influence the actual number of physical reads and associated wait times, it doesn’t take into consideration time taken to actually return data to the “client”, etc. etc. etc.
 
However, it provides one with a rough “ballpark figure”. If the actual execution times in the above example were (say) 20 seconds, then it’s a strong indication that the CBO may have got it wrong, that it may have calculated the wrong cost and maybe as a result chosen the wrong execution plan. Somewhere, something such as the segment statistics, the system statistics or the optimizer parameters may be inaccurate and is causing the CBO to get its costings incorrect.
 
The CBO cost value doesn’t compare well to reality and so is perhaps worthy of further investigation.
 
The cost values associated with the CPU costing model are not some random, ambiguous, mysterious numbers, but values that can often be derived and which can be most useful in determining and resolving problematic SQL statements and execution plans.

The CPU Costing Model: A Few Thoughts Part IV (Map of the Problematique) January 7, 2010

Posted by Richard Foote in CBO, Oracle Cost Based Optimizer, System Statistics.
10 comments

It’s called the CPU Costing model because among other things, it includes the time associated with performing CPU operations.
 
The CPU Costing model formula once again:
 
(sum of all the single block I/Os x average wait time for a single block I/O +
 sum of all the multiblock I/Os x average wait time for a multiblock I/O +
 sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O
 

So the portion detailing the sum of all required CPU cycles divided by the CPU cycles per second can potentially contribute a significant proportion of the overall costs.
 
When I previously discussed the costs associated with the CPU model when using an index vs. a FTS, the FTS example I used had an overall cost of 70 but I calculated that the I/O component costs were only 67. Therefore the cost directly related to CPU operations in the FTS example was 3.
 
However, these CPU specific costs may vary from database to database, even though the FTS might be identical, as might the required CPU cycles. A variable in all this is the CPU cycles per second system statistic (CPUSPEED) associated with a particular database environment.
 
Obviously, the faster the CPUs, the quicker it can perform the necessary CPU operations associated with the FTS (or any operation for that matter). Conversely, the slower the CPUs, the longer it will take to complete the necessary CPU related operations. The CPU costing model formula automatically takes all this into consideration.
 
In the previous example, the CPUSPEED system statistic was 1745.
 
Let’s now run the identical FTS but this time with a faster and a slower CPU and see how this might adjust the overall related costs when using the CPU costing model.
 
One can simulate a “faster” CPU by simply adjusting the CPUSPEED system statistic. Let’s make the CPUs appear 10 times faster:
 

SQL> exec dbms_stats.set_system_stats(pname=>'cpuspeed', pvalue=>17450);
 
PL/SQL procedure successfully completed.
 

OK, let’s now see how this impacts the cost of the FTS:
 
SQL> SELECT * FROM bowie_stuff WHERE id = 420;
 
1000 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 910563088

-------------------------------------------------------------------
|Id|Operation         |Name       |Rows|Bytes|Cost (%CPU)|Time    |
-------------------------------------------------------------------
| 0|SELECT STATEMENT  |           |1000|18000|   67   (0)|00:00:01|
|*1| TABLE ACCESS FULL|BOWIE_STUFF|1000|18000|   67   (0)|00:00:01|
-------------------------------------------------------------------
 

We notice that the overall cost has reduced from 70 down to 67. The cost of 3 that was previously attributed to just the CPU related activities has disappeared and the costs are now just the 67 in relation to the I/O component.
 
The CPU is now so fast that it can effectively perform all the necessary operations in a negligible amount of time. An even faster CPU will not further improve the costs associated with this FTS as the costs now only include the I/O related components.

The (%CPU) value of (0) gives us this information if you didn’t follow how I derived the I/O cost of 67 in my previous post.

If we go the other way and now make the CPU about 1/10 the speed of the original example:
 
 
SQL> exec dbms_stats.set_system_stats(pname=>'cpuspeed', pvalue=>175);
 
PL/SQL procedure successfully completed.
 
SQL> SELECT * FROM bowie_stuff WHERE id = 420;
 
1000 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 910563088

-------------------------------------------------------------------
|Id|Operation         |Name       |Rows|Bytes|Cost (%CPU)|Time    |
-------------------------------------------------------------------
| 0|SELECT STATEMENT  |           |1000|18000|   93  (28)|00:00:01|
|*1| TABLE ACCESS FULL|BOWIE_STUFF|1000|18000|   93  (28)|00:00:01|
-------------------------------------------------------------------
 

We now notice the overall costs have jumped up considerably, from 70 up to 93.
 
The costs associated directly with CPU activities have now increased from 3 up to 26. The CPU component is in the ballpark of 10 times as expensive/significant when you take into account rounding errors (the original value of 3 was rounded accordingly). Remember also that these figures are expressed in units of the time it takes to perform a single block I/O.
 
The CPUs are now so slow that it takes a considerable amount of time to complete all the required CPU operations.
 
Note that the (%CPU) value is now a significant (28%) portion of the overall costs, as derived from the following formula:

round(cpu related cost/total cost x 100) = round(26/93 x 100) = 28.
 
So having a faster (or possibly slower) CPU when performing a hardware upgrade/change can result in potentially different execution plan costings (and as such different execution plans) when using the CPU CBO costing model.
 
It’s called the CPU costing model for a reason and as one would indeed hope, the speed of said CPU(s) can directly impact the associated costs and decisions made by the CBO.

The CPU Costing Model: A Few Thoughts Part III (Bang Bang) December 21, 2009

Posted by Richard Foote in CBO, Multiblock Reads, System Statistics.
add a comment

One of the advantages of system statistics and the CPU costing model is in how the CBO deals with multiblock reads and the impact of the db_file_multiblock_read_count parameter.

When performing a full table (or fast full index) scan, the CBO needs to determine just how many multiblock read operations are likely to be performed so the associated operation can be costed correctly. The CBO simply takes the number of blocks to be read (e.g. for a table, BLOCKS in dba_tables), and divides this by the “effective” multiblock read count.

With the I/O costing model, the effective multiblock read count value is derived directly from the db_file_multiblock_read_count parameter. However, the CBO doesn’t use the actual db_file_multiblock_read_count value as the CBO knows that most multiblock read operations are unlikely to read the maximum possible number of  blocks.

This is due to a number of factors. For example, a multiblock read operation can’t span across extent boundaries. The more common reason however is that Oracle will not read a block into the buffer cache that is already cached, and so will split a multiblock read operation at the point where a block is already in cache. Therefore, if the db_file_multiblock_read_count is set to (say) 16 blocks, many of the actual multiblock read operations may read something less than the maximum 16 blocks if some of the blocks in the table are already cached. So if within the next 16 possible blocks to be read, the 6th block is already in cache, the multiblock read will only consist of the 5 uncached blocks and the next multiblock read will commence from the 7th block.

The CBO takes this into consideration and therefore doesn’t simply divide the blocks in the table by the full db_file_multiblock_read_count value. It uses a “fudged” or “adjusted” multiblock read value in its calculations. Jonathan Lewis discusses this very well in his Cost-Based Oracle Fundamentals book and lists the adjusted multiblock read values for the following examples:

db_file_multiblock_read_count of 8 is adjusted to 6.59
db_file_multiblock_read_count of 16 is adjusted to 10.40
db_file_multiblock_read_count of 32 is adjusted to 16.39
db_file_multiblock_read_count of 64 is adjusted to 25.84 

Now these adjusted values are nothing more than globally implemented “guesses”, adjusted with the assumption that the larger the multiblock read operation, the more likely a block will actually be cached and the less likely the maximum possible multiblock read is going to be actually performed.  Not only are these adjusted values simply guesses, but they’re derived directly from the db_file_multiblock_read_count parameter. By increasing this parameter, you directly impact the actual costs associated with FTS as the value by which to divide the number of blocks in the segment is impacted. Adjusting this parameter needs to be very carefully considered as it not only changes the number of blocks that might be read per multiblock read but the associated CBO costs as well. Simply increasing the db_file_multiblock_read_count blindly might lead Oracle to start favouring FTS operations inappropriately, as the increase might not actually result in fewer, more efficient multiblock reads but it will result in lower FTS costs.
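To put some rough numbers on this (a worked example with an assumed table size, not from the original demos): for a 10,000 block table under the I/O costing model, the FTS cost is approximately ceil(10000/6.59) = 1518 with a db_file_multiblock_read_count of 8, but only ceil(10000/25.84) = 387 with a value of 64. Simply changing the parameter has cut the FTS cost to roughly a quarter, regardless of whether the actual multiblock reads performed get any larger.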

System Statistics and the CPU costing model simplifies and addresses some of these issues.

The first point is that the CBO no longer uses a blind guess based on the size of the db_file_multiblock_read_count parameter, but rather an “educated” guess based on the actual average size of multiblock read operations in the specific database environment. The MBRC system statistic is the actual average size of a multiblock read operation during the period in which system statistics are collected.

Like any average or single generic figure, it will not always be right but at least it’s based on the actual average multiblock read size in the specific database environment. You can’t really ask more from an average figure than to use the actual average figure in your environment. 

However, the other important point is that simply increasing the value of the db_file_multiblock_read_count parameter is no longer quite so dangerous, as it will not now directly impact the costs of multiblock read operations if increasing the parameter has in fact little or no actual effect. Only if the average MBRC actually changes and is reflected in updated system statistics will the subsequent CBO costs of a FTS be adjusted.

Therefore, you can potentially apply one of 2 strategies when setting the db_file_multiblock_read_count parameter with the CPU costing model:

 1) Simply set it to the highest supported value in your environment and get the “biggest bang for your buck” while the actual average MBRC is automatically captured by the system statistics and used for costing purposes by the CBO, or

2) Simply leave the parameter unset and let Oracle automatically determine and set the parameter to the largest physical read size supported by your environment (recommended)

Either way, you allow the multiblock reads to be as large and efficient as possible, you allow the actual average multiblock read size to be as large and efficient as possible, but the CBO will only use the actual average multiblock read that is achieved in your database environment when determining the cost of FTS operations. This means multiblock reads will be as large and as efficient as possible but will likely be costed appropriately by the CBO.

Not only are the I/O costs relating to multiblock reads likely to be more accurate in general as a result of using system statistics, but just as importantly, they also take into consideration the actual CPU related costs of such operations and automatically incorporate these into the final costs as well.

To be discussed next.

So in summary, when using the CPU costing model, don’t set the db_file_multiblock_read_count parameter; let Oracle determine the maximum optimal size for you and ensure the system statistics are accurate and reflect the actual average MBRC size in your database environment.
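For reference, workload system statistics can be captured with the dbms_stats package (a sketch; the START/STOP pair should bracket a representative workload window so the captured MBRC, SREADTIM and MREADTIM values reflect reality):

SQL> exec dbms_stats.gather_system_stats('START');

-- let a representative workload run for a period of time ...

SQL> exec dbms_stats.gather_system_stats('STOP');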

The CPU Costing Model – A Few Thoughts Part II December 14, 2009

Posted by Richard Foote in CBO, OPTIMIZER_INDEX_COST_ADJ, Oracle Cost Based Optimizer, Oracle Indexes, System Statistics.
8 comments

As previously discussed, the formula used by the CBO using the CPU costing model is basically:
 
(sum of all the single block I/Os x average wait time for a single block I/O +
 sum of all the multiblock I/Os x average wait time for a multiblock I/O +
 sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O
 

When determining the multiblock I/O costs associated with a FTS, the CBO basically:
 
 - determines the number of multiblock operations (blocks in dba_tables / mbrc system statistic)
 
 - multiplies this out by the average wait time of a multiblock I/O (mreadtim system statistic) to determine the total wait time for all expected multiblock read operations
 
 - finally divides this total wait time of all multiblock read operations by the average wait time for a single block I/O (sreadtim system statistic) to express the final cost in units of single block I/Os
 
 
Remember these average wait times associated with both single and multiblock I/Os are actual wait times for these events as experienced in the specific database environment and captured during the collection of system statistics.
 
Therefore, the formula automatically takes into consideration and incorporates into the calculations any discrepancies and differences in wait times between a single and a multiblock I/O.
 
For example, if a multiblock I/O actually takes (say) 10ms to perform on average, while a single block I/O only takes (say) 5ms to perform on average, then the formula will automatically make the costs of performing multiblock reads to be twice as expensive as the costs associated with performing the single block reads as performed by index scans.
 
These discrepancies in costs, and trying to create a level playing field when comparing the multiblock I/O costs associated with a FTS vs. the single block I/O costs associated with an index scan, is precisely what the optimizer_index_cost_adj parameter was designed to address.
 
Rather than treat both types of I/Os as being the same, which is the default behaviour with the I/O costing model, the optimizer_index_cost_adj parameter is designed to adjust the single block read costs to ensure that they are indeed costed as being (say) 1/2 the cost of a typical multiblock I/O.
 
However, when using the CPU costing model, the optimizer_index_cost_adj parameter is effectively redundant as the necessary adjustments are already incorporated into the final costs. The total time required to perform a multiblock read operation is divided by the time it takes on average to perform a single block read operation. Using the optimizer_index_cost_adj parameter, although supported and permissible, will likely result in the final CBO costs being adjusted inappropriately, as the index related single block I/O costs will “double-dip” and potentially reduce twice: once as a result of the system statistic differences between sreadtim and mreadtim, and again as a result of the optimizer_index_cost_adj parameter.
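For example (figures assumed purely for illustration): with a SREADTIM of 5ms and a MREADTIM of 10ms, the formula already costs each multiblock I/O as the equivalent of 2 single block I/Os. If the optimizer_index_cost_adj parameter is then also set to 50, the index related single block I/O costs are halved on top of that, making index access appear roughly 4 times cheaper relative to multiblock reads rather than the intended 2 times.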
 
The system stats are much preferred provided they’re accurate and kept reasonably up to date, because one doesn’t need to “manually” change any associated database parameter.
 
Not only are the comparative differences between sreadtim and mreadtim maintained, but so are other useful system statistics such as the mbrc statistic to be discussed next.
 
So in summary, when using the CPU costing model, do not set the optimizer_index_cost_adj parameter at all. Leave it alone, collect representative system statistics and let the system statistics look after the comparative costs between single and multiblock I/Os for you automatically.

The CPU Costing Model: A Few Thoughts Part I December 8, 2009

Posted by Richard Foote in CBO, Oracle Cost Based Optimizer, System Statistics.
4 comments

In the coming days, I’ll post a series of little entries highlighting a specific point in relation to the use of system statistics and the CPU cost model. In my previous post, we looked at how the cost of a FTS is calculated using the CPU costing model and how this generally results in an increase in the associated FTS cost over the I/O costing model.

The table in my demo though had an index with an appalling clustering factor and even though the cost of the FTS increased substantially from 33 to 70, this cost was still significantly less than the large cost associated with using such an inefficient index. So in that specific example, the change of FTS costs as introduced with the CPU costing model made no difference to the final execution plan.
 
The key point I want to emphasise with this post is that by increasing FTS costs, as is common with the CPU costing model over the I/O costing model, we can of course potentially end up with entirely different execution plans, especially if a candidate index has a reasonable clustering factor. Substantially increasing the associated costs of a FTS can be very significant, as the difference in costs between a FTS and an index can be much narrower for well clustered indexes.
 
In this previous I/O Costing Model example using the BOWIE_STUFF2 table, the index had an excellent clustering factor. However the query resulted in a FTS as the cost of 65 was just a little less than using an associated index:

SQL> select * from bowie_stuff2 where id in (20, 30, 40, 50, 60);
10000 rows selected.

 
Execution Plan
----------------------------------------------------------
Plan hash value: 573616353

------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost  |
------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              | 10000 |   175K|    65 |
|*  1 |  TABLE ACCESS FULL| BOWIE_STUFF2 | 10000 |   175K|    65 |
------------------------------------------------------------------

Remember, this was “addressed” and the CBO started using the index, by manually adjusting the optimizer_index_cost_adj parameter from its default value to a value of 75 as explained in this previous post on the effects of the optimizer_index_cost_adj parameter.

However, with system stats and the use of the CPU costing model, the extra FTS cost can have a direct impact on the resultant execution plan. Running the same query again, but this time without changing any optimizer parameters and using the same system stats as in my last post on the CPU Costing Model:

PNAME    PVAL1
-------- -----
SREADTIM     5
MREADTIM    10
CPUSPEED  1745
MBRC        10

SQL> select * from bowie_stuff2 where id in (20, 30, 40, 50, 60);
 
10000 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 2964430066

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                | 10000 |   175K|    69   (2)| 00:00:01 |
|   1 |  INLIST ITERATOR             |                |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| BOWIE_STUFF2   | 10000 |   175K|    69   (2)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | BOWIE_STUFF2_I | 10000 |       |    25   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
 

We notice that the CBO has now chosen the index automatically, without having to make any changes to the optimizer_index_cost_adj parameter at all.
 
Previously, the FTS costs were 65. However, the current costs for a FTS are at least:
 
(ceil(blocks/mbrc) x mreadtim) / sreadtim = (ceil(659/10) x 10) / 5 = 132.
 
132 is already way greater than the 69 cost associated with using the above index and the 132 cost doesn’t even take into consideration any additional costs related to CPU usage.
 
So in general, using the CPU costing model will likely increase the associated costs of FTS, making indexes automatically more “attractive” to the CBO as a result. This change alone in how the FTS in particular is costed using the CPU costing model can have a major impact in execution plans chosen by the CBO. This is but one of the key reasons why things can change so dramatically when moving from 9i to 10g where the CPU costing model is the default.

The CBO CPU Costing Model: Indexes vs. Full Table Scans November 25, 2009

Posted by Richard Foote in CBO, Full Table Scans, Oracle Indexes, System Statistics.
5 comments

As previously promised, I thought I might look at how the CBO goes about costing a Full Table Scan (FTS) with system statistics and the CPU costing model, so we can understand why the CBO may have chosen one option over the other.
 
WARNING: You might need to grab a calculator to help you along :)

To illustrate, I’m simply going to use the original BOWIE_STUFF table and index setup I created in my earlier Introduction to the CBO. I’ll however recreate the demo here again from scratch to refresh your memory:
 
I first create a table that has 100,000 rows, with an indexed “ID” column that has 100 distinct, evenly distributed values. For those mathematically challenged, this means each distinct value will return 1000 rows.
 

SQL> CREATE TABLE bowie_stuff AS SELECT (mod(rownum,100)+1)*10 id, 'Ziggy Stardust' name FROM dual CONNECT BY LEVEL <= 100000;
 
Table created.
 
SQL> CREATE INDEX bowie_stuff_i ON bowie_stuff(id);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BOWIE_STUFF', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> select blocks from dba_tables where table_name='BOWIE_STUFF';
  
BLOCKS
------
   329

Note the table has 329 blocks. It’s a figure I’ll refer to a number of times throughout.
 

SQL> SELECT index_name, blevel, leaf_blocks, clustering_factor FROM user_indexes WHERE index_name = 'BOWIE_STUFF_I';
  

INDEX_NAME        BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
------------- ---------- ----------- -----------------
BOWIE_STUFF_I          1         207             32900

   
Note also that the index has a blevel of 1, 207 leaf blocks and a rather poor Clustering Factor (CF) of 32900, not close at all to the number of blocks in the table. As we’ll see, the CF is so bad that the CBO will choose a FTS over the index.
 

SQL> show parameter db_file_multi
   
NAME                          VALUE
----------------------------- -----
db_file_multiblock_read_count    16

 
   
Note the db_file_multiblock_read_count is manually set to 16. Relevant when calculating the cost of a FTS with the I/O costing model. Less so with CPU costing in use as we’ll see.

Finally, if we look at the system statistics in place:
 

SQL> select pname, pval1 from sys.aux_stats$ where pname in ('SREADTIM', 'MREADTIM', 'MBRC', 'CPUSPEED');

PNAME    PVAL1
-------- -----
SREADTIM     5
MREADTIM    10
CPUSPEED  1745
MBRC        10

All of these values will be relevant when calculating the cost of a FTS with the CPU costing model.
 
OK, we now have all the information we need to determine how the CBO will treat both index and FTS activities on this table.
 
Let’s start by refreshing ourselves with how the I/O based CBO model will deal with such a scenario.
 

SQL> alter session set "_optimizer_cost_model" = io;
 
Session altered.

 
OK, let’s just run a simple query that selects data for a specific ID. Remember, there are 100 evenly distributed distinct IDs so this query will return 1% of the data (1000 rows):

SQL> set autotrace traceonly
SQL> SELECT * FROM bowie_stuff WHERE id = 420;
 
  
-----------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost  |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |  1000 | 18000 |       |
|*  1 |  TABLE ACCESS FULL| BOWIE_STUFF |  1000 | 18000 |       |
-----------------------------------------------------------------

 
 

Note: the CBO has decided to use a FTS to select the 1% of rows as it has the lowest associated cost.
 
As previously discussed, the cost of using the index is approximately:
 
index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)
 
 = 1 + (207 x 0.01) + (32900 x 0.01) = 1 + 3 + 329 = 333        

Note: the 1 for the blevel can be dropped by the CBO, bringing the cost down to 332, to be discussed another time.
 
As previously discussed, the FTS cost is approximately:
 
segment header I/O + ceil(table blocks/fudged mbrc value) 

Note: for a db_file_multiblock_read_count of 16, the adjusted, “fudged” value used by the CBO is approximately 10.4.

Therefore, for the above example, the FTS cost is calculated as:
 
= 1 + ceil(329/10.4) = 1 + 32 = 33
 
33 is significantly less than 333 so the FTS easily wins out.
 

So how do things change when using System Statistics and the CPU costing model ? How does the CBO calculate the cost of the FTS with the above system statistics in place ?
 
As I’ve previously discussed, the significant change with the CPU costing model is not so much how it impacts the cost of index accesses but that of the FTS.
 
Let’s run the same SQL, but this time using the CPU costing model:
 

SQL> alter session set "_optimizer_cost_model" = cpu;
 
Session altered.
 
SQL> SELECT * FROM bowie_stuff WHERE id = 420;
 
1000 rows selected.
 

--------------------------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time    |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |  1000 | 18000 |     70(5)  | 00:00:01|
|*  1 |  TABLE ACCESS FULL| BOWIE_STUFF |  1000 | 18000 |     70(5)  | 00:00:01|
--------------------------------------------------------------------------------

Note: the CBO is still picking the FTS as the CF is truly awful. However, the cost of the FTS has increased significantly from 33 to 70, although it’s nowhere near the approximate 333 cost of using the index.
 
So why has the FTS cost increased and how is this new cost calculated ?
 
As previously discussed, the CPU costing formula is basically:
 
(sum of all the single block I/Os x average wait time for a single block I/O +
 sum of all the multiblock I/Os x average wait time for a multiblock I/O +
 sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O
 

If we first focus on the single block I/O portion of the formula, the only single block read considered by the CBO during a FTS is the one associated with reading the segment header. Note that the average wait time for a single block read is the SREADTIM system statistic.
 
If there’s just the one single block I/O, the single block read portion of the formula effectively equates to (1 x sreadtim) / sreadtim, which just equals 1. So 1 is basically added to the cost with regard to reading the segment header as it is with the I/O costing model.
 
OK, let’s next look at the portion of the formula with regard to multiblock I/Os.
 
The sum of all the multiblock I/Os is calculated in a similar manner as it was with the I/O costing model. It’s simply the number of blocks in the table below the HWM (329 in our example) but this time divided by the MBRC system statistic. Note however the MBRC statistic isn’t some fudged, somewhat arbitrarily set figure based on the db_file_multiblock_read_count parameter, but the actual average size of multiblock I/Os in the specific database environment. Note also that the average wait time for a multiblock read is the MREADTIM system statistic.
 
So the total wait time for all multiblock reads in the above example is:
 
sum of all the multiblock I/Os x average wait time for a multiblock I/O = (BLOCKS/MBRC) x MREADTIM = ceil(329/10) x 10 = 330.
 
This value is then divided by the average wait time for a single block read (the SREADTIM system statistic) to give the overall cost of multiblock reads, but expressed in units of single block I/Os.
 
The total cost for multiblock I/Os is therefore:

 ((BLOCKS/MBRC) x MREADTIM)/ SREADTIM = 330/5 = 66.
 
So the total cost associated with all I/Os is the 1 for reading the segment header plus 66 for all the multiblock reads = 67.
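
As a quick sanity check, this I/O-only figure can be reproduced directly from the dictionary and system statistics. A sketch, assuming select access on sys.aux_stats$ (the CPU component is deliberately excluded):

SQL> SELECT 1 + (ceil(t.blocks/s.mbrc) * s.mreadtim)/s.sreadtim AS fts_io_cost
     FROM (SELECT blocks FROM user_tables WHERE table_name = 'BOWIE_STUFF') t,
          (SELECT max(decode(pname,'MBRC',pval1))     mbrc,
                  max(decode(pname,'MREADTIM',pval1)) mreadtim,
                  max(decode(pname,'SREADTIM',pval1)) sreadtim
             FROM sys.aux_stats$) s;

FTS_IO_COST
-----------
         67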
 
However, the cost of the FTS is 70, not 67. Where does the additional cost of 3 come from ?
 
Well, that’s the CPU portion of the formula. The CBO has determined that the FTS operation will require ‘x’ number of CPU cycles and this value is then divided by the CPUSPEED to determine how long this CPU activity will take.
 
This CPU elapsed figure is then again divided by the average wait of a single block read (SREADTIM) to also put the CPU costs in units of single block reads. In this example, the total CPU related costs amount to 3.
 
Oracle gives us an indication of what the CPU component is in the overall cost within the execution plan via the %CPU value (which is 5 in the above execution plan). The (%CPU) value is the ceil of the overall percentage of CPU costs as calculated by the following formula:
 
%CPU = ceil(CPU related costs/overall costs x 100)
 
So in our example, %CPU = ceil(3/70 x 100) = ceil(4.29) = 5% (as indeed displayed in the above execution plan).
 

Again, all the costs associated with a FTS with the CPU costing model can be derived and make some kinda sense. Provided all the necessary inputs are actually correct and valid, the CBO will indeed correctly decide to use a FTS over an index when it’s the less expensive option.
 
I’ll next expand on these points and explain why understanding how these costs are derived can be extremely useful.

You can now put your calculators away :)

The CBO CPU Costing Model and Indexes – Another Introduction September 16, 2009

Posted by Richard Foote in CBO, Index statistics, Oracle Indexes, System Statistics.
11 comments

I’ve previously discussed some basic concepts and formulas regarding how the CBO derives index related costings via the I/O costing model. Time to look at system statistics and the CPU costing model with specific regard to indexes.
 
The first point I would make is that the CPU costing model has some significant improvements over the older I/O costing method and I would strongly recommend adopting the CPU costing model where possible. I’ll explain some of these improvements and advantages over the coming posts.
 
The I/O costing model basically looks at the cost of a specific execution plan in terms of the estimated number of physical I/Os. The fewer the I/Os, the less costly and more efficient the execution plan and the faster the expected response times. There are however a number of shortfalls with this basic I/O costing strategy in that it doesn’t automatically differentiate between the costs associated with different types of I/Os (eg. between single block and multiblock reads), it doesn’t automatically determine a typical or average size of a multiblock I/O, and it doesn’t cost and take into consideration the time and overheads associated with likely CPU resources.
 
The CPU costing model attempts to address these limitations. It automatically takes into consideration discrepancies between the time to complete an average single block I/O versus a multiblock I/O, automatically determines the average size of multiblock I/Os so it can more accurately determine the likely number of multiblock I/Os in a FTS, and automatically determines the expected CPU time for a specific task.
 
To use the CBO CPU costing model, one needs to collect system statistics so that the CBO has this additional information, based on the actual system hardware characteristics (note: since 10g, the hidden parameter _optimizer_cost_model defaults to ‘cpu’ and so is used by default). You do this with the dbms_stats.gather_system_stats procedure. You can collect “Noworkload” statistics, in which Oracle basically randomly reads the database data files to determine base statistics such as the average I/O seek time, the average I/O transfer speed and the CPU speed. However, I would instead recommend the collection of “Workload” stats, which are based on the real load on your system during the time in which system statistics are gathered (in which case Noworkload statistics are simply ignored).
 
You can gather Workload system statistics by either running:
 
dbms_stats.gather_system_stats('START') to start the system stats collection process, followed by dbms_stats.gather_system_stats('STOP') to stop the collection process after a typical workload period, or

dbms_stats.gather_system_stats('INTERVAL', interval => 120) to collect system workload stats over (say) a 120 minute period (see the sketch below).
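
As a rough sketch, the two approaches look something like this from SQL*Plus (the 120 minute interval is just an example figure):

SQL> exec dbms_stats.gather_system_stats('START');

PL/SQL procedure successfully completed.

(... a representative workload period elapses ...)

SQL> exec dbms_stats.gather_system_stats('STOP');

PL/SQL procedure successfully completed.

or, in one step:

SQL> exec dbms_stats.gather_system_stats('INTERVAL', interval => 120);

PL/SQL procedure successfully completed.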
 
To view the collected system statistics, query SYS.AUX_STATS$.
 
 
SQL> SELECT pname, pval1 FROM SYS.AUX_STATS$ WHERE pname IN ('SREADTIM', 'MREADTIM', 'MBRC', 'CPUSPEED');
 

PNAME             PVAL1
------------ ----------
SREADTIM              5
MREADTIM             10
CPUSPEED           1745
MBRC                 10

 
The four systems statistics that I’ll focus on for now are:
 
SREADTIM – time in milliseconds for a single block I/O
MREADTIM – time in milliseconds for a multiblock I/O
CPUSPEED – millions of CPU cycles per second
MBRC – average number of blocks actually read during multiblock read operations
 
In the above figures, just note therefore that a multiblock read on average takes approximately double the time of a single block read and that on average, 10 blocks are read during a multiblock read operation. This provides the CBO with vital information regarding how to now cost and compare potential execution plans.
 
 
The CBO CPU costing model basically looks at the total time required to complete an execution plan by summing:
 
total time to complete all single block I/O activity +
total time to complete all multiblock I/O activity +
total time to complete all the CPU activity

 
This can basically be calculated by:
 
sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second
 
In theory, this should provide the total response time to service an execution plan. However, to keep the actual “cost” figures calculated by the CBO consistent with the I/O costing model, the CBO divides this total time by the average time for a single block I/O, such that the full formula becomes:
 
(sum of all the single block I/Os x average wait time for a single block I/O + 
 sum of all the multiblock I/Os x average wait time for a multiblock I/O +
 sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O
 
The final “cost” figure, even with the CPU costing model, is therefore still expressed in units of single block I/Os. This is an important point …
 
So how does the CBO determine the value of the various figures within this formula ? Well, as we’ll see, the CBO gets the required information both from the system statistics and from the costing formulas previously discussed with the I/O costing model.

However, for index related access paths, there’s some good news regarding being able to simplify matters somewhat.
 
The first bit of good news is that from the perspective of an index access path, there are no multiblock I/Os (except for a Fast Full Index Scan) and so the CPU costing formula can be simplified for indexes to remove the multiblock read component and be just:
 
(sum of all the single block I/Os x average wait time for a single block I/O +
 sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O
 
Secondly, if the CPU component is relatively trivial, it may not be significant enough to count towards the final cost. As smaller index scans are likely to consume little CPU, it means the CPU component can also generally be ignored. This reduces the formula for such index scans to just:
 
(sum of all the single block I/Os x average wait time for a single block I/O)
/
average wait time for a single block I/O
 
However, the average wait time for a single block I/O becomes redundant in this simplified equation, reducing the cost to just:
 
sum of all the single block I/Os
 
Well the next bit of good news for those that have followed my previous blog entries with regard to the CBO and Indexes is that the previous formulas regarding the I/O costing model are still applicable when determining the sum of all expected I/Os. The sum of all the single block I/Os associated with an index scan is still basically:
 
sum of all the single block I/Os = index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)
 
In other words, for smaller index scan execution plans, the cost calculated by CBO using the CPU costing model is the same as with the I/O costing model. So no, I wasn’t wasting everyone’s time discussing the various formulas using the older I/O costing model :)
 
If we run the same demo as I ran previously in my initial post regarding the CBO and Indexes where the total cost of the index access plan was 18, but this time using the system statistics listed above:
 
SQL> alter session set "_optimizer_cost_model" = cpu;
 
Session altered.
 
SQL> SELECT * FROM bowie_stuff2 WHERE id = 420;
 
2000 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 134336835

--------------------------------------------------------------------------------
|Id|Operation                   |Name          |Rows|Bytes|Cost (%CPU)|Time    |
--------------------------------------------------------------------------------
| 0|SELECT STATEMENT            |              |2000|36000|   18   (0)|00:00:01|
| 1| TABLE ACCESS BY INDEX ROWID|BOWIE_STUFF2  |2000|36000|   18   (0)|00:00:01|
|*2|  INDEX RANGE SCAN          |BOWIE_STUFF2_I|2000|     |    9   (0)|00:00:01|
--------------------------------------------------------------------------------
 

We notice that the cost remains exactly the same at 9 for the index range scan component and exactly the same at 18 for the total execution plan when comparing the cost of using the IO costing model vs. the CPU costing model. Introducing system statistics hasn’t changed things for this particular index related execution plan.
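
Indeed, a quick sketch plugging the BOWIE_STUFF2_I statistics straight into the simplified formula reproduces both figures:

SQL> SELECT blevel + ceil(0.01*leaf_blocks) AS range_scan_cost, blevel + ceil(0.01*leaf_blocks) + ceil(0.01*clustering_factor) AS total_cost FROM user_indexes WHERE index_name = 'BOWIE_STUFF2_I';

RANGE_SCAN_COST TOTAL_COST
--------------- ----------
              9         18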
 
And this is a very common observation. As indexes use single block I/Os, as the CBO cost remains as a unit of single block I/Os and as CPU consumption for an index scan is often trivial, the resultant costs for index access paths often remain unchanged with the CPU costing model. 

Previously, we looked at how changing parameters such as the optimizer_index_cost_adj impacts the costings of index related execution plans to create a level playing field between index and FTS execution plans.
 
The key point to make with regard to system statistics and the CPU costing model is that in general, the system statistics and the associated formula will automatically ensure a level playing field. However, unlike the optimizer parameters, it will do so by typically adjusting the associated costs of the FTS (rather than the index accesses) as the true costs and wait times associated with multiblock FTS are calculated, but are divided by and expressed in units of single block reads.

So rather than decreasing the associated costs of an index access path, system statistics and the CPU costing model will typically create a level playing field by automatically increasing the associated costs of a FTS as appropriate.

To be discussed further …

The CBO and Indexes: OPTIMIZER_INDEX_COST_ADJ Part I July 8, 2009

Posted by Richard Foote in CBO, Index Access Path, OPTIMIZER_INDEX_COST_ADJ, Oracle Cost Based Optimizer, Oracle Indexes.
17 comments

In the previous entry regarding The CBO and Indexes, we saw how, for a query that selected 5 distinct values in an IN list (representing 5% of the data), the CBO decided to use a FTS because the costs of doing so were less than that of using a corresponding index. These costs (using the I/O costing model) represented the number of expected I/Os and the FTS was basically going to perform fewer I/Os than the index. Fewer I/Os, less cost, and so the FTS was selected as the preferred access path.
 
However, by default, the CBO when determining these costs via the I/O costing model makes two very important assumptions which may not necessarily be true.
 
Assumption one is that all I/Os are likely to be “physical I/Os” which all need to be costed and taken into account.
 
Assumption two is that all I/Os are costed equally, even though the size of a multiblock I/O performed typically during a FTS is larger and so potentially more costly than a single block I/O usually associated with an index access.
 
Today, I’m only going to focus on this second assumption. 

Now, when performing and processing data from a multiblock I/O as performed during a FTS operation, it’s typical for such operations to be more resource intensive than a single block I/O as performed during an index range scan, as the associated overheads are likely to be greater: having to read more actual data off the disk, having to transfer more data into the SGA, having to process more data in each associated block, etc.
 
Therefore, not all I/Os are equal. However, by default the CBO ignores all these possible differences and costs all I/Os associated with a FTS (multiblock) and an index (single block) as being equivalent or the same.
 
Now, this hardly seems fair or indeed accurate and desirable when determining the true cost differences between an index and a FTS. Shouldn’t the fact that a single block I/O is likely to be less resource intensive and take less elapsed time to process be taken into consideration when determining these relative costs ? 

Enter the optimizer_index_cost_adj parameter.
 
The purpose of this parameter is simply to “adjust” the corresponding costs associated with an index to (hopefully) more accurately reflect the relative I/O costs between using an index and a FTS. If for example a single block I/O only takes 1/2 the time and resources to perform compared to a multiblock I/O, shouldn’t these associated I/O cost differences be reflected when determining whether or not to use an index and perhaps reduce the index related costs by 1/2 as a result ?
 
This parameter has a very simple impact on how the CBO costs the use of an index based access path. It takes the value of the optimizer_index_cost_adj as a percentage and adjusts the cost of an index related range scan access path to only be the corresponding percentage of the total index cost. By default, it has a value of 100 meaning that a single block I/O is 100% when compared to that of a multiblock I/O which in turn means that the index related I/O costs are treated the same as that of a multiblock FTS I/O. A default value of 100 therefore has no effect on the overall cost of using an index related access path.
 
However, if the optimizer_index_cost_adj only has a value of (say) 25, it means that all single block I/Os are considered only 25% as costly as a multiblock I/O and so index related range scan costs are adjusted to be only 25% of the total index access path cost.
 
Going back to the previous demo where the FTS was selected, I calculated the cost of using the index when retrieving the 5% of data to be:

index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

2 + 5 x ceil(0.01 x 605) + ceil(0.05 x 854) = 2 + 5 x 7 + 43 = 37 + 43 = 80.
 
The cost of using a FTS was calculated as being only 65. A cost of 65 for the FTS is less than a cost of 80 for the index and so the FTS was selected.

This time, the linked demo sets the optimizer_index_cost_adj = 25 before running the exact same query again.
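
For reference, the relevant setting is simply:

SQL> alter session set optimizer_index_cost_adj = 25;

Session altered.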

We notice a couple of key differences. The first obvious difference is that the plan has changed and the CBO has now decided to use the index. The second difference is the associated cost relating to the use of the index. Previously, it was calculated as being 80 but now it only has a cost of 20. The maths is pretty simple: with an optimizer_index_cost_adj = 25, we need only multiply the previous total by 0.25:

(2 + 5 x ceil(0.01 x 605) + ceil(0.05 x 854)) x 0.25 = (2 + 5 x 7 + 43) x 0.25 = 80 x 0.25 = 20.

Note also that just the index range scan cost component was previously 2 + 5 x ceil(0.01 x 605) = 37, but is now also adjusted to 37 x 0.25, which rounds to 9.

Basically by setting the optimizer_index_cost_adj = 25, we have effectively reduced the overall cost of using the index based execution path from 80 down to just 20, 25% of the previous total index cost.
 
The cost of the FTS remains unchanged at 65. The index access path at just 20 is now less than the FTS alternative and so the index is now chosen by the CBO.

Yes, all these numbers and costs make sense when one understands how the CBO performs its calculations and the effect of setting the optimizer_index_cost_adj parameter to a non-default value.

The optimizer_index_cost_adj parameter can therefore obviously have a very significant impact on the behaviour and subsequent performance of the database, as the CBO will reduce (or maybe increase) the actual costs of index related access paths by the percentage denoted in the optimizer_index_cost_adj parameter. It can potentially dramatically increase (or decrease) the likelihood of an index access path being chosen over a FTS.
 
There are typically 3 very different ways in which this parameter is set, which I’ll list in increasing order of preference.
 
1) Set it arbitrarily to a very low figure such that indexes reign supreme as their associated costs get adjusted to such a low figure by the CBO that a FTS access path has little chance of being chosen (for example, here’s a suggestion to set it to a magical value of 12). Generally a very bad thing to do in any database …
 
2) Set it to a value that the DBA determines is an approximate percentage of the costs associated with a single block I/O when compared to a multiblock I/O. An improvement over option 1), but I still prefer the next option 3) …
 
3) Leave it at the default value of 100 such that it has no impact and the CBO does not use it to adjust the cost of an index access path.

 

I’ll explain in Part II a sensible approach in setting the optimizer_index_cost_adj parameter and why option 3 is the preferred option with any currently supported version of Oracle.

The CBO and Indexes: Introduction Continues … June 15, 2009

Posted by Richard Foote in CBO, Oracle Indexes.
11 comments

OK, I previously briefly covered how the CBO calculates the basic cost of an index range scan. Yes, those cardinality/rows values in the execution plans are vitally important as they highlight whether or not the CBO has used the appropriate selectivity values in the index costing formula. And yes, the associated cost values are meaningful and potentially useful as they determine the actual costs associated with the execution plan in terms of the expected number of I/Os the CBO estimates will be required (when using the IO costing model and often the CPU costing model as well).

I’m just going to look at another example now using the same table setup as before, but this time running an SQL query that has 5 distinct values in an IN list predicate on our demo table (again, follow the link to see the query and formatted execution plan).

The first thing we notice in this example, is that Oracle has decided to use a FTS rather than use the index on the ID column. Considering we’re only after 5 values out of the possible 100 values, some may not see this as expected behaviour, especially considering the index has such a good Clustering Factor. Basically Oracle is deciding to access each and every block below the HWM of the table, retrieving all 100% of the rows in the table, only to ultimately discard 95% of them.

It certainly appears at first glance to be a more “costly” option than using the index to directly access just the 5% of rows we’re interested in …

The first thing to check is the estimated cardinality figures, to see if the CBO has miscalculated the expected number of rows it needs to retrieve. However, as the statistics have just been fully computed and the ID column has perfectly evenly distributed values, we notice the cardinality figures are again spot on. The query returns 10,000 rows and indeed the rows estimate in the execution plan is exactly 10,000 rows. The calculation is simply 0.01 (density of the column) x 200,000 (rows) x 5 (values in the IN list) = 10,000.

Let’s now calculate the cost of using the index using our index costing formula, using the CEIL function this time ;)

index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

2 + 5 x ceil(0.01 x 605) + ceil(0.05 x 854) = 2 + 5 x 7 + 43 = 37 + 43 = 80.

So the cost of using an index range scan to retrieve 5% of the rows in our example comes to a total of 80.

If we look at the cost of the FTS in our explain plan in the above link, we notice the cost is just 65. 65 is less than 80, so the FTS wins.

So how did the CBO come to a cost of just 65 when it has to read all 659 blocks in the table ?

Well, the first missing piece of information is the value of the db_file_multiblock_read_count parameter because this governs how many blocks Oracle will attempt to read within a single multiblock I/O call. Remember, when performing a FTS, Oracle knows it has to read all the table related blocks below the HWM and so rather than reading one tiny little block at a time, it does so more efficiently by reading multiple blocks at a time. This is the fundamental advantage of the FTS over the index range scan, which can only ever access one block at a time.

SQL> show parameter db_file_multi

NAME                          TYPE    VALUE
----------------------------- ------- -----
db_file_multiblock_read_count integer    16

 

So the db_file_multiblock_read_count is 16.

The next thing to note is that it’s very unlikely that Oracle will actually read the full 16 blocks at a time as there are a number of factors that prevent this from occurring. Extent boundaries are one classic example (a multiblock read cannot span extent boundaries) but the more common issue is a block within the table already being stored in the buffer cache. Rather than storing the same block at the same consistent point twice in memory, Oracle breaks up the multiblock read and only reads up to the block that is already cached in the buffer cache. Therefore, for Oracle to actually read the entire table using the full 16 block multiblock I/Os, there would have to be no cached blocks from the table currently in the buffer cache, an unlikely event.

Therefore, Oracle doesn’t use the full 16 value when determining the number of expected multiblock I/Os, but a modified “fudge” value which equates to approximately 10.4 for a MBRC of 16. Again, Jonathan Lewis in his excellent “Cost-Based Oracle Fundamentals” book discusses all this in some detail.

Remember also that Oracle needs to access the segment header as part of a FTS as I explained in some detail in my “Indexes and Small Table” series. So that’s an additional single block I/O on top of the multiblock I/Os.

Therefore the cost of performing a FTS is:

segment header I/O + ceil(table blocks/fudged mbrc value) = 1 + ceil(659/10.4) = 1 + 64 = 65.
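
Or, putting the calculator aside, a quick sketch against the dictionary:

SQL> SELECT 1 + ceil(blocks/10.4) AS fts_cost FROM user_tables WHERE table_name = 'BOWIE_STUFF2';

  FTS_COST
----------
        65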

The 65 cost for the FTS does make sense when one understands a little how this value is derived by the CBO …

As the FTS can read big chunks of the table at a time whereas the index range scan can only read each necessary block one at a time, the FTS can indeed read the table and retrieve the required 5% of data in fewer I/Os and so has a lesser associated cost than the index.

Now there are a few issues with all of this. Firstly, is the db_file_multiblock_read_count actually a valid and correct setting as this directly impacts not only the actual size of the multiblock read operations but critically, the associated costs relating to FTS operations (and indeed Fast Full Index Scans as well) ?

Also, is it really correct and valid to assume the cost of a multiblock I/O to be the same and equal to the cost of a single block I/O ? Surely, the process of performing a single block I/O is likely to be “cheaper” than that of a multiblock I/O and yet the CBO treats both types of I/Os as having the same fundamental “cost”.

Also the CPU overheads of having to access each and every row in each and every block is likely going to be more significant than the CPU required to access just specific data from specific blocks when using an index.

Perhaps, the more “expensive” index range scan might actually be a better alternative than the FTS if these factors were taken into consideration ?

Now this may indeed be true, if these factors were correctly taken into consideration. However, this may also indeed be quite false and the FTS may really truly be the better and more efficient alternative and attempts to force the use of the index may be inappropriate and ultimately more expensive.

I’ll next discuss some really bad (although still very common) methods of making the CBO favour indexes, using generally inappropriate so-called “Silver Bullets” …

The CBO and Indexes: An Introduction (Absolute Beginners) June 9, 2009

Posted by Richard Foote in CBO, Oracle Indexes.
17 comments

One of the more common questions I get asked and one of the most common questions asked in the various Oracle related forums is the general question of why doesn’t the CBO choose to use a particular index. There are various reasons why an index might be ignored but generally and rather simplistically (for now), the answer is often simply because the CBO considers the cost of using the index to be more expensive than other alternatives.

As one would expect with the CBO, the cost is a rather important consideration and yet many don’t understand what the costs associated with an execution plan actually means and represents. Some people think the costs associated with an execution plan are basically meaningless and are only used by the CBO in “mysterious ways”. Some people even go so far as to suggest that the CBO sucks and recommend all sorts of inappropriate ways to force Oracle to use an index.

It all comes back to simply not understanding CBO fundamentals and how Oracle costs the basic use of an index.

In reality of course, the CBO doesn’t “suck” with 10g; in fact it’s actually extremely clever and resourceful when it comes to determining appropriate, efficient execution plans. If the CBO is armed with “accurate enough” statistics, it will generally do a remarkably good job. The costs and the associated cardinality of the various steps within an execution plan provide valuable information on why the CBO has made its decisions.

So I thought I might discuss the CBO a little and start with a very basic introduction into how the CBO costs an index to get this message across.

The basic formula for costing an index based range scan is:

basic index range scan cost = index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

Note there are various slight variations and interpretations of this formula from database version to database version and for differing specific scenarios, which can vary the resultant costs marginally. All the following examples are from a windows based 10.2.0.3 database.

So from an index perspective, the index blevel, the number of leaf blocks and the clustering factor are all statistics that directly influence the cost of using the index. However, just as important are the associated selectivities of accessing both the index and the table. These statistics are often based on the associated column statistics (but not always) and if Oracle gets these selectivities wrong, then the CBO will generate wrong costings and all bets are off regarding the appropriateness of the resultant execution plan.

OK, let’s create a nice, simple little scenario which will be the basis of future demos.

SQL> CREATE TABLE bowie_stuff AS SELECT (mod(rownum,100)+1)*10 id, 'Ziggy Stardust' name FROM dual CONNECT BY LEVEL <= 100000;

Table created.

SQL> CREATE INDEX bowie_stuff_i ON bowie_stuff(id);

Index created.

We’ll use the BOWIE_STUFF table in future demos, but for now I’m after a relatively large index with an excellent clustering factor. So I’ll create a second table, ordered on the ID column, and double its size …

SQL> CREATE TABLE bowie_stuff2 AS SELECT * FROM bowie_stuff ORDER BY id;

Table created.

SQL> INSERT INTO bowie_stuff2 SELECT * FROM bowie_stuff2;

100000 rows created.

SQL> commit;

Commit complete.

SQL> CREATE INDEX bowie_stuff2_i ON bowie_stuff2(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BOWIE_STUFF2', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

If we now look at some of the key statistics, we note the following:

SQL> SELECT table_name, blocks, num_rows FROM user_tables WHERE table_name = 'BOWIE_STUFF2'; 
TABLE_NAME   BLOCKS NUM_ROWS
------------ ------ --------
BOWIE_STUFF2    659   200000

The table has 200,000 rows and consists of 659 data blocks.

SQL> SELECT column_name, num_distinct, density, num_nulls FROM user_tab_col_statistics WHERE table_name = 'BOWIE_STUFF2' and column_name = 'ID';
COLUMN_NAME NUM_DISTINCT DENSITY NUM_NULLS
----------- ------------ ------- ---------
ID                   100     .01         0

The indexed ID column has 100 distinct values, which are perfectly evenly distributed.  The density of the ID is an “accurate” 0.01 with a selectivity of 1%, so a selection of one distinct value will return 200,000 x 0.01 = 2,000 rows. Nice easy numbers …
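
A quick count against any one ID value confirms this:

SQL> SELECT COUNT(*) FROM bowie_stuff2 WHERE id = 420;

  COUNT(*)
----------
      2000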

SQL> SELECT index_name, blevel, leaf_blocks, num_rows, distinct_keys, clustering_factor FROM user_indexes WHERE index_name = 'BOWIE_STUFF2_I';

INDEX_NAME      BLEVEL LEAF_BLOCKS NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
-------------- ------- ----------- -------- ------------- -----------------
BOWIE_STUFF2_I       2         605   200000           100               854

The index has a blevel of 2, it has 605 leaf blocks and a rather good clustering factor of just 854 (note the rows in the table were basically inserted in ID order).

OK, we now have all the information we need.

Note to start with, I’ll only use the “older” IO costing model by setting the following parameter:

SQL> alter session set "_optimizer_cost_model" = io;

Session altered.

As we’ll see in future posts, the use of the CPU costing model actually has minimal effect on many index related access costings but we’ll use the IO costing model as a starting point.

This first equality SQL demo shows the costings in relation to a simple, single value equality predicate (I’ve linked to a PDF to preserve formatting).

OK, 2 important pieces of information to note. Firstly, the CBO has got the expected cardinality (2000 rows) spot on.

As the stats are 100% accurate and the values are perfectly evenly distributed, this is to be expected. If Oracle gets the cardinality correct, we can be reasonably confident that the CBO has made a reasonable decision here.

As previously mentioned, there are 100 distinct values of which we want just the one. Oracle has calculated the selectivity of the query to be 1/100 = 1% or a value of 0.01 (as defined in the density column statistic). The expected number of rows is therefore 200,000 x 0.01 = 2000. Like I said, spot on.

Secondly, the CBO has calculated the cost of the index range scan to be 9 and the overall cost of the query to be 18. How has the CBO derived these costings ? Well as it’s a simple, single column index and associated query predicate, we can plug in the 0.01 selectivity into the above formula as follows:

index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

2 + ceil(0.01 x 605) + ceil(0.01 x 854) = 2 + 7 + 9 = 9 + 9 = 18

So we can see that the costs associated with accessing just the index:

blevel + ceil(index selectivity x leaf blocks) = 2 + ceil(0.01 x 605) = 2 + 7 = 9

indeed come to a total of 9

and the overall cost of the execution plan indeed does come to 18.

The CBO costings all do actually make sense and add up as expected.

In this second IN list demo, we’re just going to expand on things a touch by selecting 3 distinct values in an IN list. However, the actual principles and cost calculations are basically the same.

This time we’re after 3 possible values as listed in the IN clause. So that’s 3 out of the 100 possible values, 3/100 = 3% or 0.03. The required cardinality is therefore 200,000 rows x 0.03 = 6000 which again Oracle has got spot on. Usually a good sign that the CBO has made a reasonable decision.
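
The specific IN list is in the linked demo, but with this data any 3 distinct ID values behave identically. For example (using illustrative values, not necessarily those in the demo):

SQL> SELECT COUNT(*) FROM bowie_stuff2 WHERE id IN (420, 430, 440);

  COUNT(*)
----------
      6000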

This time the costs have come to 23 for the index range scan part of the execution plan and 49 overall. Again using our basic formula:

index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

2 + 3 x ceil(0.01 x 605) + ceil(0.03 x 854) = 2 + 3 x 7 + 26 = 2 + 21 + 26 = 23 + 26 = 49.

So the cost associated with the index is indeed 23 and the overall cost of the query is indeed 49. Again, all these CBO costings make sense. These costings aren’t meaningless, internal numbers but values that give us a useful insight into how and why the CBO has come to a specific cost and so to a specific execution plan.

Note BTW, one of those subtle differences in how the formula is implemented in 10g (from say 9i) is that the selectivity is calculated for the index by performing the CEIL function on one value first and then multiplying the result by the number of expected values in the IN list:

2 + 3 x  ceil(0.01 x 605) = 23

rather than determining and using the total index selectivity within the CEIL function:

2 +  ceil(0.03 x 605) = 21

The net result being such costs are going to be just that little bit higher in 10g than they would have been in 9i.
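
A quick query shows the two variants side by side:

SQL> SELECT 2 + 3*ceil(0.01*605) AS cost_10g, 2 + ceil(0.03*605) AS cost_9i FROM dual;

  COST_10G    COST_9I
---------- ----------
        23         21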

So indeed, these things can all change a little from release to release but the basic principle is still the same. Feed the CBO accurate enough statistics and it’ll likely do the right thing. If it doesn’t, understanding the CBO and how it costs various execution plans will help to determine what the issues might be. Huge clues are provided by the costings and cardinality estimates in the associated query execution plans.

I’ll expand on all this is future posts but for more information on this subject, I highly recommend the writings of Wolfgang Breitling (and his excellent paper “A Look Under The Hood Of The CBO: The 10053 Event” in particular) and the writings of Jonathan Lewis (and his excellent book “Cost-Based Oracle Fundamentals” in particular).

Like I said, this is only meant as an introduction, more to come …
