
Upcoming Index Internals and Best Practices Seminar Dates July 27, 2009

Posted by Richard Foote in Oracle Indexes.

I have a number of upcoming dates for my "Index Internals and Best Practices" 2 Day Seminar.

A couple of dates in Asia:

Singapore 31 August – 1 September 2009

Malaysia 3-4 September 2009

 

And a couple of dates in Europe:

Switzerland 2-3 November 2009

France 5-6 November 2009

 

If you enjoy the contents of this blog, you'll likely enjoy my seminar, where I spend a couple of pretty full-on days going into index internals: learning all about the various index structures and options supported by Oracle (B-Tree, Bitmap, Partitioned, Function-Based, Linguistic, Invisible, Index Organized Tables, IOT Secondary, Fake, Reverse Key, etc.), discovering how indexes should really be supported and maintained, how they actually get used and costed by the CBO, which type of index and which index option should be used where and when, how to collect statistics and what the statistics all mean and can be used for, plus lots lots more, including heaps of tricks and tips for getting the most out of your indexes and improving the performance of your applications and database environments generally.

There'll even be a bit of time to ask all those indexing related questions you've always wanted to ask 🙂

More details can be found on my Index Internals and Best Practices Seminar page.

Hopefully, I’ll get the chance to meet some more of you at one of these events.

The CBO and Indexes: OPTIMIZER_INDEX_COST_ADJ Part II July 22, 2009

Posted by Richard Foote in OPTIMIZER_INDEX_COST_ADJ, Oracle Cost Based Optimizer, Oracle Indexes.

In OPTIMIZER_INDEX_COST_ADJ Part I, I discussed how this parameter can significantly impact the overall cost of an index related execution path by "adjusting" the costs by the percentage denoted in the parameter. An OPTIMIZER_INDEX_COST_ADJ value of 25, for example, will result in the cost of an index access path being reduced to just 25% of the calculated total.

Reducing the effective costs of using an index will obviously increase the likelihood of an index being chosen by the CBO over a Full Table Scan (FTS). This may be a good thing when an appropriate index is not being used by the CBO. However, this may also be a very bad thing if it results in an inappropriate index being chosen by the CBO over a more efficient FTS …
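If you want to check what the parameter is currently set to before experimenting with it, a simple show parameter or dictionary query will tell you. A quick sketch (the value used in the ALTER SESSION below is purely illustrative):

SQL> show parameter optimizer_index_cost_adj

-- Or query it directly (the default value is 100, i.e. no adjustment)
SQL> select name, value, isdefault
  2  from v$parameter
  3  where name = 'optimizer_index_cost_adj';

-- The parameter can be changed at either the session or the system level
SQL> alter session set optimizer_index_cost_adj = 25;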

There are typically 3 different methods by which this parameter is set.

Method One: Set it to a really low value such that indexes are typically preferred over a FTS.

This is such a common piece of advice. I wish I were given a dollar every time I read someone suggest setting (or further reducing) the OPTIMIZER_INDEX_COST_ADJ to a really low value in order for a specific piece of SQL to start using an index. There's a classic example right here, where someone suggests setting this parameter to 12 in order to get the CBO to use an index.

Why 12 ?

Perhaps because they once had a SQL statement that used a FTS instead of an index and setting the OPTIMIZER_INDEX_COST_ADJ to 12 made the CBO use the index and so setting it to 12 obviously works, right ? Perhaps because they were born on the 12th and 12 has always been a lucky number ? Who knows ?

However, simply setting the OPTIMIZER_INDEX_COST_ADJ to a low value for no specific reason is generally a very dangerous and ill-advised thing to do.

Remember this key point. The purpose of the OPTIMIZER_INDEX_COST_ADJ is not to ensure indexes are favoured over a FTS. As I discussed in Part I, its purpose is to more accurately reflect any discrepancy between the actual cost of a single block I/O associated with an index access path and the actual cost of a multiblock I/O associated with a FTS. The OPTIMIZER_INDEX_COST_ADJ should therefore reflect any such discrepancy as accurately as possible, so that the comparative costs are true to the real corresponding costs.

The parameter should attempt to set an even playing field between the use of an index and a FTS, not simply adjust things in favour of indexes generally.

Setting this parameter to an arbitrarily low number may indeed make a specific SQL statement suddenly use an index when previously it used a FTS. Perhaps this SQL statement might indeed benefit from now using the index.

However, this parameter is a global parameter in the sense that it impacts all SQL statements for the entire system or session. Although it might indeed make the CBO use indexes appropriately in places, if set incorrectly it might also make the CBO start to use indexes inappropriately. Perhaps a FTS is actually the most efficient and least costly option for other SQL statements, but by making this parameter so low as to favour indexes so heavily, it might suddenly make the CBO choose indexes that are less efficient and much more costly than the alternative FTS.

There are of course many, many occasions when a FTS is actually the desired access method, as it's simply the cheaper and less costly alternative. In fact, an index that's selected inappropriately, an index that's used instead of a cheaper FTS, can actually be far more damaging to overall database performance than an inappropriate FTS.

Consider this. In the worst case scenario, the maximum number of logical (or physical) I/Os that a FTS will generate is the number of blocks in the table. The number of actual I/Os is likely to be far fewer than the number of blocks in the table, as a FTS will generally perform multiblock I/Os and read all the table blocks in larger "chunks" at a time. However, even with a poorly configured db_file_multiblock_read_count set to effectively 1, the other great advantage of a FTS is that Oracle only needs to access a specific block once. So reading all the table blocks once, one at a time, is effectively as bad as a FTS can get.

In a 10,000,000 row table housed in say 100,000 table blocks, the maximum number of LIOs is effectively 100,000 for a FTS.

However, the worst case scenario for an index range scan can potentially be much, much worse. In a poorly clustered index, it's actually possible to have to read not only each and every leaf block in the index structure (plus a few branch blocks to get to the first index leaf block of interest) but, in addition, to also perform as many logical (or physical) I/Os as there are rows in the table.

In a 10,000,000 row table housed in 100,000 table blocks, with say 25,000 leaf blocks in a corresponding index, the maximum number of LIOs could potentially be as high as 10,025,003 for an index range scan, in the order of 100 times worse than the corresponding worst case scenario with the FTS. Remember of course that a FTS is likely to have far fewer actual LIOs than its worst case scenario, assuming the db_file_multiblock_read_count is set to something reasonable. So the index range scan could quite conceivably be in the order of 1000 times worse than the corresponding FTS.

An index scan, you see, may need to visit a specific table block many times, because the indexed data may be randomly distributed throughout the table (in other words, the index has a very bad clustering factor) and so the scan only accesses individual rows within a table block at differing times during the index range scan. As each index entry is read, the corresponding rowids keep referencing different table blocks from those recently accessed. If a specific table block contains say 100 rows, an index range scan may need to access this same block as many as 100 different times during the index range scan operation.

A very simple example to illustrate this point.

Here, I create a table with 10,000,000 rows, with an ID column that has values that are randomly distributed throughout the table:

SQL> create table bowie (id number, text varchar2(80));
 
Table created.
 
SQL> insert into bowie select ceil(dbms_random.value(0, 10000)), 'This is just a piece of text designed to make a row a reasonable size' from dual connect by level <= 10000000;
 
10000000 rows created.
 
SQL> commit;
 
Commit complete.

 

I now create an index on this ID column and collect 100% accurate statistics:

SQL> create index bowie_id on bowie(id);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', estimate_percent=>null, cascade=>true, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

 

I now run a select statement that selects just 10% of the rows in the table. Many would expect the index to be used by the CBO, as 90% of all rows are of no interest:

SQL> select * from bowie where id between 1 and 1000;
 
1000873 rows selected.
 
Elapsed: 00:01:49.75
 
Execution Plan
———————————————————-
Plan hash value: 1845943507

—————————————————————————
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
—————————————————————————
|   0 | SELECT STATEMENT  |       |  1000K|    69M| 22068   (2)| 00:01:51 |
|*  1 |  TABLE ACCESS FULL| BOWIE |  1000K|    69M| 22068   (2)| 00:01:51 |
—————————————————————————

Predicate Information (identified by operation id):
—————————————————
 
   1 - filter("ID"<=1000 AND "ID">=1)
 

Statistics
———————————————————-
          1  recursive calls
          0  db block gets
     108986  consistent gets
     108672  physical reads
          0  redo size
    8920558  bytes sent via SQL*Net to client
       2596  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000873  rows processed

 

However, we note the CBO has actually selected a FTS and the query is a little slow at nearly 1 minute and 50 seconds.

Hummmm, I know, let’s change the OPTIMIZER_INDEX_COST_ADJ parameter to a really low figure and see if the CBO now decides “to do the right thing” and use the index.

I'm going to pick the value of, uuuummmmm let me think, oh I know, let's use a value of 2 because I scored 2 goals yesterday in football and 2 has always been somewhat lucky for me. That should do the trick; let's make the index appear to be only 2% of its original cost and really give this index a good chance of being selected by the CBO:

SQL> alter session set optimizer_index_cost_adj = 2;
 
Session altered.
 
SQL> select * from bowie where id between 1 and 1000;
 
1000873 rows selected.
 
Elapsed: 02:16:44.42

Execution Plan
———————————————————-
Plan hash value: 4117205494
 
—————————————————————————————-
| Id  | Operation                   | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
—————————————————————————————-
|   0 | SELECT STATEMENT            |          |  1000K|    69M| 19970   (1)| 00:01:40 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE    |  1000K|    69M| 19970   (1)| 00:01:40 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_ID |  1000K|       |    42   (0)| 00:00:01 |
—————————————————————————————-
 

Predicate Information (identified by operation id):
—————————————————
 
   2 - access("ID">=1 AND "ID"<=1000)
 

Statistics
———————————————————-
          1  recursive calls
          0  db block gets
     998652  consistent gets
     916134  physical reads

          0  redo size
    5034208  bytes sent via SQL*Net to client
       2596  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000873  rows processed

 

Well the good news is that the CBO is now using the index.

However, the bad news is that the elapsed time has gone from 1 min 50 seconds to a massive 2 hours 16 minutes and 44 seconds.

Changing the OPTIMIZER_INDEX_COST_ADJ to an arbitrarily “low” value has suddenly caused some critical SQL statements to now perform appallingly.

And remember, this is for a query that was only returning 10% of the data …

Why are things so bad now ?

Because the ID values are randomly distributed throughout the table, the index on the ID column has, as a result, a terrible clustering factor. Therefore, as the index range scan is performed, Oracle is forced to keep visiting different table blocks with each new index key, potentially returning to the same table block time and time again. The number of logical I/Os is going to be massive and has increased significantly from that of the FTS, but the CBO is effectively only costing 2% of all these I/Os in its calculations because of the OPTIMIZER_INDEX_COST_ADJ value.

It was giving the index an unfair advantage and severely underestimating the true costs associated with using the index vs. the FTS.
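If you want to see for yourself just how poorly clustered such an index is, the data dictionary will tell you. A quick sketch, using the BOWIE table and BOWIE_ID index from the demo above:

SQL> select i.index_name, i.clustering_factor, t.blocks, t.num_rows
  2  from user_indexes i, user_tables t
  3  where i.table_name = t.table_name
  4  and i.index_name = 'BOWIE_ID';

-- A clustering factor close to the number of table blocks suggests a nicely
-- clustered index; a clustering factor approaching the number of table rows
-- (as is the case here) suggests a very poorly clustered index indeed.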

Worse, as an index is now being used, this massive number of block visits is likely to cause additional performance issues in the buffer cache, due to the more favourable manner in which blocks accessed via an index are cached compared to blocks accessed via a FTS.

I might also add that in this specific example, the CBO was using the CPU costing model (which I'll discuss later). So the OPTIMIZER_INDEX_COST_ADJ parameter still has an impact even when the CBO in theory should already have a good idea of the comparative costs of a single block vs. a multiblock I/O.
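As an aside, if you're unsure whether the CPU costing model is in play on your own system, the gathered (or default) system statistics can be inspected directly. A quick sketch (access to the SYS-owned aux_stats$ table is assumed):

SQL> select pname, pval1
  2  from sys.aux_stats$
  3  where sname = 'SYSSTATS_MAIN';

-- If SREADTIM (single block read time) and MREADTIM (multiblock read time)
-- are populated, workload system statistics have been gathered and the CBO
-- is using the CPU costing model with those values.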

So simply setting the OPTIMIZER_INDEX_COST_ADJ parameter to an arbitrarily low value is a very dangerous tuning technique due to the risk of inappropriate index access paths being selected by the CBO.

I'll next discuss two better techniques for setting the OPTIMIZER_INDEX_COST_ADJ parameter to a more appropriate value.

Index Quizzes With Jonathan Lewis (Talk Show Host) July 21, 2009

Posted by Richard Foote in Oracle Indexes.

I’ve recently looked at some really atrocious Oracle related articles that are out there to confuse the newbies and uninitiated. Scary stuff indeed !!

However, there’s of course plenty of excellent Oracle related material being written all the time, with none better than the writings of Jonathan Lewis on his blog.

Jonathan recently started a series of postings relating to Oracle B-Tree indexes that are going to be well worth a read. He’s going to use a series of little quizzes to explore and get people thinking about some of the more unusual scenarios regarding the behavior of Oracle B-Tree indexes.

I’ve promised to keep quiet regarding some of the specific details to be covered so as not to spoil the show (while researching further some of the issues myself), but if you don’t already, I would highly recommend popping over and having a read. You’ll be sure to learn something both interesting and useful.

I’ll likely blog about some of the specifics myself in due course.

Why A Segment Really Has To Be At Least 2 Blocks In Size July 20, 2009

Posted by Richard Foote in Extent Management, Humour, Tablespace Management.

I just couldn’t resist this one …

The latest Don Burleson news piece, dated July 17, 2009 and called "Tablespace size initial extent size" (although the referenced article itself is dated June 18, 2009), attempts to answer the question of why there is a mismatch between the uniform extent size specified for a tablespace and the actual size of the extents in that tablespace.

The question uses the following basic example:

CREATE SMALLFILE
TABLESPACE "TEST2"
NOLOGGING
DATAFILE '/u02/oradata/ORCLBD/TEST2.dbf'
   SIZE 5M
   EXTENT MANAGEMENT LOCAL
   UNIFORM SIZE 3K
   SEGMENT SPACE MANAGEMENT MANUAL;

 

The uniform extent size is only a tiny 3K, this in an 8K block database. The questioner suspected an initial extent would therefore be at least 8K in size, but is puzzled as to why the initial extent is actually 16K.

The reason given by Don Burleson is nothing short of bizarre. The suggestion is that because "the default MINEXTENTS is 2" and because the uniform size of the tablespace is less than the database block size, "the MINEXTENTS takes precedence and you see 16K, 8K * 2, for your initial tablespace size".

It also mentions that “The initial size of a tablespace is also governed by the MINEXTENTS.  In this case, you did not specify in in your create tablespace syntax, so the default in dba_tablespaces takes effect“.

There are of course a number of things wrong with this explanation.

Firstly, the default value of MINEXTENTS is of course not 2 but 1. This has been the default value of MINEXTENTS for as long as I can remember (with the exception of rollback segments, which are a special case).

In fact, with regard to LMTs, Oracle doesn't even allow a non-default MINEXTENTS value (or any other storage clause value for that matter) when defining the tablespace, as it simply doesn't make sense to do so. It's therefore not specified in the create tablespace syntax because it's simply not allowed with a LMT !!
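If you're in any doubt, simply try it. Attempting to tack a default storage clause (MINEXTENTS included) onto a uniform locally managed tablespace just gets rejected. A quick sketch (the TEST3 name and datafile path are simply made up for the illustration, and the exact error text varies by version):

SQL> CREATE SMALLFILE TABLESPACE TEST3
  2  DATAFILE 'c:/temp/TEST3.dbf' SIZE 5M
  3  EXTENT MANAGEMENT LOCAL
  4  UNIFORM SIZE 3K
  5  SEGMENT SPACE MANAGEMENT MANUAL
  6  DEFAULT STORAGE (MINEXTENTS 2);

-- Oracle simply refuses the statement, complaining that a default storage
-- clause is not compatible with the allocation policy of a locally managed
-- tablespace.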

Secondly, if one simply attempted to create an object in such a tablespace, one would clearly see that there is indeed only one extent created by default. And as the question itself suggests, this one and only extent indeed has a size of 16K (assuming it's in an 8K block tablespace).

First create the tablespace:

SQL> CREATE SMALLFILE
  2  TABLESPACE "TEST2"
  3  NOLOGGING
  4  DATAFILE 'c:/temp/TEST2.dbf'
  5     SIZE 5M
  6     EXTENT MANAGEMENT LOCAL
  7     UNIFORM SIZE 3K
  8     SEGMENT SPACE MANAGEMENT MANUAL
  9  /

Tablespace created.

 

Next, create a table in the tablespace …

SQL> CREATE TABLE BABY (id number) TABLESPACE TEST2;

Table created.

 

Let's look at what we've created …

SQL> select segment_name, tablespace_name, extent_id, bytes, blocks from dba_extents where segment_name = 'BABY';

SEGMENT_N EXTENT_ID BYTES BLOCKS
--------- --------- ----- ------
BABY              0 16384      2 

SQL> select segment_name, tablespace_name, min_extents, bytes, blocks from dba_segments where segment_name = 'BABY';

SEGMENT_N MIN_EXTENTS BYTES BLOCKS
--------- ----------- ----- ------
BABY                1 16384      2 

 

We of course notice that the table indeed only has 1 extent. We of course notice that the default value of MINEXTENTS is indeed just 1, not 2.

And we of course notice that the extent is made up of 2 blocks for a combined total of 16K as suggested by the question.

In keeping with an index theme, we notice exactly the same characteristics if we create an index in this tablespace:

SQL> create index baby_i on baby(id) tablespace test2;

Index created. 

SQL> select segment_name, tablespace_name, extent_id, bytes, blocks from dba_extents where segment_name = 'BABY_I';

SEGMENT_N EXTENT_ID BYTES BLOCKS
--------- --------- ----- ------
BABY_I            0 16384      2 

SQL> select segment_name, tablespace_name, min_extents, bytes, blocks from dba_segments where segment_name = 'BABY_I';

SEGMENT_N MIN_EXTENTS BYTES BLOCKS
--------- ----------- ----- ------
BABY_I              1 16384      2 

 

Same thing. Only one extent with an initial size of 2 blocks or 16K in total (when created in an 8K tablespace).

A simple test will confirm all this. And this is the beauty of simple tests: we can use them to validate Oracle behaviour. We don't need a large, commercial, 1000+ concurrent user, 100TB database to confirm all this. A very basic test on a PC at home is all you need …

Therefore the whole suggestion that the initial extent size of 16K has anything to do with MINEXTENTS is just plain wrong, wrong, wrong.

When I used to teach all this for Oracle University, I would be disappointed if a student on an introductory DBA Admin course didn't know by lunchtime on the first day that a segment consists of one extent by default.

So what’s the real explanation ?

Well, any regular reader of my blog will of course know the answer …

Every segment must have a segment header, where segment information such as the extent map, freelists (for non-ASSM segments), the HWM, etc. is stored. However, it makes no sense for a segment to consist of just a segment header; it needs at least one other block in which to store actual data relevant to the segment. Therefore a segment must consist of at least 2 blocks (note that for ASSM segments, we need additional blocks for the segment bitmap information to be stored as well).

It therefore makes no sense for a segment to consist of just 1 block; it must be at least 2 blocks in size as an absolute minimum.

Therefore, it makes no sense to create a 1 extent segment where the extent is only 1 block in size. If a segment must consist of at least 2 blocks in a non-ASSM tablespace, then the associated extents must also consist of at least 2 blocks.
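You can even see exactly where the segment header of the little BABY table lives, as dba_segments records its location directly. A quick sketch:

SQL> select segment_name, header_file, header_block, blocks
  2  from dba_segments
  3  where segment_name = 'BABY';

-- The segment header is one of the 2 blocks in the segment's single extent;
-- the other block is there to hold actual table data.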

So in summary:

1) The default MINEXTENTS is 1, not 2 (with the exception of a rollback segment)

2) The reason why an extent must consist of at least 2 blocks is that a segment must consist of at least 2 blocks: the segment header plus at least one additional block

3) Don’t believe everything you read or hear, even if it’s on the “news”

 

Back to more interesting topics soon I promise 🙂

 

UPDATE: 23 July 2009. The article has been updated, with the erroneous discussion of MINEXTENTS defaulting to 2 being the issue replaced by the details I've listed here and the fact that an extent must be a minimum of 2 blocks for a non-ASSM LMT. There are still a number of errors left in the article; however, I'll leave those for Don to sort out. Here's a tip though: perhaps save so-called "Oracle news" articles for those questions that are actually answered correctly 😉

How To Really Create A Function-Based Index (A Day In The Life) July 12, 2009

Posted by Richard Foote in Function Based Indexes, Humour, Oracle Indexes.

“I heard the news today, oh boy” (John Lennon 1967).

It was with some amusement that someone pointed out Don Burleson’s latest 11 July 2009 “Oracle News” piece : “How to index on a SQL CASE Statement” (the article it links to is dated 29 December 2008).

It must obviously be a very very slow news day as the ability to create a function-based index has been around for a long time, the SQL CASE “statement” (expression actually) being around since at least Oracle8i days. So it’s not exactly “new” news.

However, what’s particularly amusing in this little “news” piece, is the SQL statement used to demonstrate how to create an index on a SQL CASE statement:

create index
case_index as
(case SOURCE_TRAN
when 'PO' then PO_ID
when 'VOUCHER' then voucher_id
ELSE journal_id
end = '0000000001'
END);

No, this is not some new weird piece of SQL. The problem of course is that this most definitely is NOT how to create a function-based index with a CASE statement. It's fundamentally and syntactically wrong, not just in one place, but in several places.

If you were to just TEST this SQL, you would get the following error:

SQL> create table case_tab (source_tran varchar2(5), po_id varchar2(5), voucher_id varchar2(5), journal_id varchar2(5));

Table created.

SQL> create index case_index as (case source_tran when 'PO' then po_id when 'VOUCHER' then voucher_id else journal_id end = '00001' end);
create index case_index as (case source_tran when 'PO' then po_id when 'VOUCHER' then voucher_id else journal_id end = '00001' end)
                        *
ERROR at line 1:
ORA-00969: missing ON keyword

 

You see, to create an index, you need at the very least to specify the table that is being indexed …

Additionally, the AS keyword is incorrect, and the CASE expression itself is wrong, specifying an additional, erroneous END. There's a lot wrong here.

If you really want to create such an index, then you need the SQL syntax to at least look something like this:

SQL> create index case_tab_idx on case_tab (case source_tran when 'PO' then po_id when 'VOUCHER' then voucher_id else journal_id end);

Index created.

 

The “News” article then suggests you need to collect statistics on the index:

EXEC DBMS_STATS.gather_index_stats('OWNER', 'CASE_INDEX');

However, this advice is again not quite correct. Remember, this “News” piece is only a day or two old and Oracle has been automatically collecting statistics on indexes as they’re created (or rebuilt) since Oracle 10g. Providing the table already has statistics, Oracle will automatically collect statistics on the index as it’s being created. There is therefore no need to collect index statistics again after the index has been created.
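You can easily verify this for yourself. Create an index on a table that already has statistics and the index statistics are immediately there, without any explicit dbms_stats call. A quick sketch (MY_TAB_IDX is just a hypothetical index name):

SQL> select index_name, num_rows, leaf_blocks, distinct_keys, last_analyzed
  2  from user_indexes
  3  where index_name = 'MY_TAB_IDX';

-- LAST_ANALYZED is already populated and the index statistics are in place,
-- even though dbms_stats has not been run against the newly created index.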

But but but, what does not have statistics, and what really should be collected after you create any new function-based index, are the statistics on the hidden virtual column that is created on the parent table as part of creating the index. This is vitally important, else the function-based index may not be used by the CBO as expected. For example:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'CASE_TAB', estimate_percent=>null, cascade=>true, method_opt=> 'FOR ALL HIDDEN COLUMNS SIZE 1');

I discuss this issue in some detail in the post "Function-Based Indexes And Missing Statistics".
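If you're curious, the hidden virtual column itself is easy enough to spot. A quick sketch using the CASE_TAB table from above (the system-generated column name will differ on your system):

SQL> select column_name, data_default, hidden_column
  2  from user_tab_cols
  3  where table_name = 'CASE_TAB';

-- The function-based index creates a hidden, system-named column (something
-- like SYS_NC00005$) whose data_default is the CASE expression itself. It's
-- this column that the FOR ALL HIDDEN COLUMNS clause collects statistics on.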

So some advice in summary:

1) If you're going to create a function-based index, get the syntax correct. It helps, it really does.

2) Ensure you collect the associated statistics on the table virtual column that is created automatically behind the scenes.

3) Don’t believe everything you read or hear, even if it’s on the “news” …  🙂

 

UPDATE: 13 July 2009: The “News” article has now been amended to expand the discussion on statistics and reference the dbms_stats.gather_table_stats procedure for collecting hidden column statistics on the function-based index virtual columns as I discussed. The article is still dated 29 December 2008.

I also notice the newly added example of gathering hidden column statistics, used in both the news article and in the referenced "important notes", is simply a cut 'n' paste from this very blog post. I can easily tell because:

  1. It uses the same table_name as I made up in my example (CASE_TAB)
  2. It has the same error as in my original version with the missing quote when defining the table_name 😉

The CBO and Indexes: OPTIMIZER_INDEX_COST_ADJ Part I July 8, 2009

Posted by Richard Foote in CBO, Index Access Path, OPTIMIZER_INDEX_COST_ADJ, Oracle Cost Based Optimizer, Oracle Indexes.

In the previous entry regarding The CBO and Indexes, we saw how, for a query that selected 5 distinct values in an IN list (representing 5% of the data), the CBO decided to use a FTS because the costs of doing so were less than those of using a corresponding index. These costs (using the I/O costing model) represented the number of expected I/Os, and the FTS was basically going to perform fewer I/Os than the index. Fewer I/Os, less cost, and so the FTS was selected as the preferred access path.
 
However, by default, the CBO when determining these costs via the I/O costing model makes two very important assumptions which may not necessarily be true.
 
Assumption one is that all I/Os are likely to be “physical I/Os” which all need to be costed and taken into account.
 
Assumption two is that all I/Os are costed equally, even though the size of a multiblock I/O performed typically during a FTS is larger and so potentially more costly than a single block I/O usually associated with an index access.
 
Today, I’m only going to focus on this second assumption. 

Now, when performing and processing data from a multiblock I/O during a FTS operation, it's typical for such an operation to be more resource intensive than a single block I/O performed during an index range scan, as the associated overheads are likely to be greater: more actual data has to be read off disk, more data has to be transferred into the SGA, and more blocks' worth of data have to be processed, etc.
 
Therefore, not all I/Os are equal. However, by default the CBO ignores all these possible differences and costs all I/Os associated with a FTS (multiblock) and an index (single block) as being equivalent.
 
Now, this hardly seems fair or indeed accurate and desirable when determining the true cost differences between an index and a FTS. Shouldn’t the fact that a single block I/O is likely to be less resource intensive and take less elapsed time to process be taken into consideration when determining these relative costs ? 

Enter the optimizer_index_cost_adj parameter.
 
The purpose of this parameter is simply to “adjust” the corresponding costs associated with an index to (hopefully) more accurately reflect the relative I/O costs between using an index and a FTS. If for example a single block I/O only takes 1/2 the time and resources to perform compared to a multiblock I/O, shouldn’t these associated I/O cost differences be reflected when determining whether or not to use an index and perhaps reduce the index related costs by 1/2 as a result ?
 
This parameter has a very simple impact on how the CBO costs the use of an index based access path. The CBO takes the value of the optimizer_index_cost_adj as a percentage and adjusts the cost of an index related range scan access path to be just that percentage of the total index cost. By default, the parameter has a value of 100, meaning a single block I/O is costed at 100% of a multiblock I/O, which in turn means the index related I/O costs are treated the same as the multiblock FTS I/Os. A default value of 100 therefore has no effect on the overall cost of using an index related access path.
 
However, if the optimizer_index_cost_adj has a value of (say) 25, it means that a single block I/O is treated as only 25% as costly as a multiblock I/O, and so index related range scan costs are adjusted to be only 25% of the total index access path cost.
 
Going back to the previous demo where the FTS was selected, I calculated the cost of using the index when retrieving the 5% of data to be:

index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

2 + 5 x ceil(0.01 x 602) + ceil(0.05 x 854) = 2 + 5 x 7 + 43 = 37 + 43 = 80.
 
The cost of using a FTS was calculated as being only 65. A cost of 65 for the FTS is less than a cost of 80 for the index and so the FTS was selected.
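Note that all the inputs to this formula come straight out of the data dictionary. A quick sketch of where to look ('MY_TABLE' is just a placeholder table name):

SQL> select index_name, blevel, leaf_blocks, clustering_factor, distinct_keys
  2  from user_indexes
  3  where table_name = 'MY_TABLE';

-- blevel, leaf_blocks and clustering_factor plug straight into the formula
-- above, while the index and table selectivities are derived from the column
-- statistics (e.g. num_distinct and density in user_tab_col_statistics).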

This time, the linked demo sets the optimizer_index_cost_adj = 25 before running the exact same query again.

We notice a couple of key differences. The first obvious difference is that the plan has changed and that the CBO has now decided to use the index. The second difference is the associated cost relating to the use of the index. Previously, it was calculated as being 80, but now it has a cost of only 20. The maths is pretty simple: with an optimizer_index_cost_adj = 25, we need only multiply the previous total by 0.25:

(2 + 5 x ceil(0.01 x 602) + ceil(0.05 x 854)) x 0.25 = (2 + 5 x 7 + 43) x 0.25 = 80 x 0.25 = 20.

Note also that the index range scan cost component alone was previously 2 + 5 x ceil(0.01 x 602) = 37, but is now also adjusted to 37 x 0.25, which rounds to 9.

Basically by setting the optimizer_index_cost_adj = 25, we have effectively reduced the overall cost of using the index based execution path down from 80 to just 20, to just 25% of the previous total index cost.
 
The cost of the FTS remains unchanged at 65. The index access path at just 20 is now less than the FTS alternative and so the index is now chosen by the CBO.

Yes, all these numbers and costs make sense when one understands how the CBO performs its calculations and the effect of setting the optimizer_index_cost_adj parameter to a non-default value.

The optimizer_index_cost_adj parameter can therefore obviously have a very significant impact on the behaviour and subsequent performance of the database, as the CBO will reduce (or maybe increase) the costs of index related access paths to the percentage denoted in the parameter. It can potentially dramatically increase (or decrease) the likelihood of an index access path being chosen over a FTS.
 
There are typically 3 very different ways in which this parameter is set, which I’ll list in increasing order of preference.
 
1) Set it arbitrarily to a very low figure such that indexes reign supreme, as their associated costs get adjusted by the CBO to such a low figure that a FTS access path has little chance of being chosen (for example, here's a suggestion to set it to a magical value of 12). Generally a very bad thing to do in any database …
 
2) Set it to a value that the DBA determines is an approximate percentage of the cost of a single block I/O when compared to that of a multiblock I/O (one common way of estimating such a value is sketched just after this list). An improvement on option 1), but I still prefer the next option 3) …
 
3) Leave it at the default value of 100, such that it has no impact and the CBO does not use it to adjust the cost of an index access path.
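For those interested in option 2), the sort of figure people typically plug in comes from comparing the average wait times of the single block and multiblock read wait events. A quick sketch of the idea only (an approximation, not a recommendation):

SQL> select event, average_wait
  2  from v$system_event
  3  where event in ('db file sequential read', 'db file scattered read');

-- A common rule of thumb: optimizer_index_cost_adj is set to roughly
-- (average 'db file sequential read' wait / average 'db file scattered read'
-- wait) x 100, i.e. the relative cost of a single block vs a multiblock I/O.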

 

I'll explain in Part II a sensible approach to setting the optimizer_index_cost_adj parameter and why option 3 is the preferred option with any currently supported version of Oracle.