jump to navigation

Indexes and NOT Equal (Not Now John) August 13, 2008

Posted by Richard Foote in Index Access Path, NOT Equal, Oracle Cost Based Optimizer, Oracle Indexes, Performance Tuning.
26 comments

The Cost Based Optimizer (CBO) is a rather complex piece of code that has to deal with countless different possible scenarios when trying to determine what the most optimal execution plan might be. It’s also a vitally important piece of code because not only do the decisions need to be reasonably accurate so that it doesn’t generate inefficient execution plans but it needs to make these decisions in a reasonably efficient manner else it wastes resources and most importantly wastes time while it performs its calculations.

So there’s a trade-off between ensuring the CBO makes reasonable decisions while ensuring it makes its decisions in a timely and resource efficient manner. Database performance could be directly impacted if these trade-offs are not managed effectively.

Therefore, there are all sorts of short cuts and assumptions that are coded into the CBO to make its life a little easier. However, these short cuts can sometimes be problematic if they’re not recognised and handled appropriately.

One of these little short cuts worth noting is how the CBO deals with NOT EQUAL (and NOT IN) conditions …

Typically when we have a condition where we just say NOT EQUAL, we’re basically suggesting we’re interested in the vast majority of possible values with the exception of the value specified in the NOT EQUAL condition. We want most values but not if it’s this particular value.

For example, a condition where we state something such as:

WHERE TEXT <> ‘BOWIE’

means we want all the other possible values of TEXT, just not those with the specific value of ‘BOWIE’. In other words, we’re typically interested in the vast majority of possible values when we specify a NOT EQUAL condition.

However, we all know that typically, Oracle will not use an index if generally a relatively “high” percentage of rows are to be selected. It would generally be more efficient and less costly to simply perform a Full Table Scan if most rows are going to be returned anyways.

Therefore the CBO simply ignores indexes when costing a NOT EQUAL condition. Why bother going to all the overhead of calculating the cost of using an index to retrieve the vast majority of rows when a Full Table Scan is going to be the cheaper alternative in the vast majority of such cases. So the CBO doesn’t even bother trying and ignores all indexes that could potentially be used to retrieve the rows based on the NOT EQUAL condition.

But what if the data isn’t evenly distributed and the NOT EQUAL condition actually retrieves only a relatively small proportion of the rows. What if most rows actually have the value specified in the NOT EQUAL condition and the remaining rows constitute a relatively small proportion of the remaining rows ?

When the CBO ignores indexes, it ignores indexes in all cases. Even if 99.99% of rows match the value in the NOT EQUAL condition and there’s only a handful of remaining rows to actually be retrieved, the code path in the CBO is still followed and indexes are ignored regardless. The reason possibly being such queries could be re-written to avoid the use of the NOT EQUAL condition and so its use is still suggesting a large selectivity.

The refusal of the CBO to consider an index with a NOT EQUAL condition can easily be illustrated.

First, let’s create a table and populate a TEXT column with the same value, ‘BOWIE':

SQL> create table bowie as select rownum id, ‘BOWIE’ text from dual connect by level <= 1000000;

Table created.

Let’s make the TEXT column NOT NULL so the CBO knows all rows have a value for this column:

SQL> alter table bowie modify text not null;

Table altered.

Let’s now add a new row, one that has a different value for the TEXT column:

SQL> insert into bowie values (1000001, ‘ZIGGY’);

1 row created.

Commit complete.

So all rows have a TEXT value of ‘BOWIE’, except for just the one row which has a value of ‘ZIGGY’.

OK, let’s now create an index on this column:

SQL> create index bowie_i on bowie(text);

Index created.

Let’s now collect some statistics on this table, including a histogram on the TEXT column so that the CBO knows the data is not even distributed and that the vast number of values of TEXT are ‘BOWIE':

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> ‘BOWIE’, cascade=> true, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> ‘BOWIE’, cascade=> true, estimate_percent=> null, method_opt=> ‘FOR COLUMNS TEXT SIZE 5′);

PL/SQL procedure successfully completed.

So only one row has a value that is NOT a ‘BOWIE’ which means an index to retrieve this one and only row would be an efficient and appropriate execution path, right ?

Well, let’s see what the CBO decides to do. First, let’s set a 10053 trace so we can see how the CBO has costed it’s possible options.

SQL> alter session set events ‘10053 trace name context forever';

Session altered.

Let’s now execute this simple and innocent looking statement:

SQL> select * from bowie where text <> ‘BOWIE';

        ID TEXT
---------- -----
   1000001 ZIGGY

---------------------------------
| Id| Operation         | Name  | 
---------------------------------
|  0| SELECT STATEMENT  |       |
|* 1|  TABLE ACCESS FULL| BOWIE | 
---------------------------------

We note that Oracle has decided to not use the index but use a FTS instead.  If we look at the relevant parts of the 10053 trace, we note that the CBO did not even cost or consider using the index. The index was basically ignored and not considered at all:

***************************************
BASE STATISTICAL INFORMATION
***********************
Table Stats::
  Table: BOWIE  Alias: BOWIE
    #Rows: 1000001  #Blks:  2214  AvgRowLen:  10.00
Index Stats::
  Index: BOWIE_I  Col#: 2
    LVLS: 2  #LB: 2370  #DK: 2  LB/K: 1185.00  DB/K: 1105.00  CLUF: 2210.00
Access path analysis for BOWIE
***************************************
SINGLE TABLE ACCESS PATH
  Single Table Cardinality Estimation for BOWIE[BOWIE]
  Column (#2):
    NewDensity:0.000000, OldDensity:0.000000 BktCnt:1000001, PopBktCnt:1000000, PopValCnt:1, NDV:2
  Table: BOWIE  Alias: BOWIE
    Card: Original: 1000001.000000  Rounded: 1  Computed: 1.00  Non Adjusted: 1.00
  Access Path: TableScan
    Cost:  620.67  Resp: 620.67  Degree: 0
      Cost_io: 601.00  Cost_cpu: 435767288
      Resp_io: 601.00  Resp_cpu: 435767288
  Best:: AccessPath: TableScan
         Cost: 620.67  Degree: 1  Resp: 620.67  Card: 1.00  Bytes: 0

You can try to hint the query but the CBO will still ignore any RANGE SCAN operation because the CBO can’t know what all other possible potential values that are not ‘BOWIE’ might be (remembering the statistics may not necessarily be accurate). It can perform a FULL INDEX SCAN but this means reading all the leaf nodes that contain all the unwanted ‘BOWIE’ values and so it still an inefficient option:

SQL> select /*+ index (bowie bowie_i) */ * from bowie where text <> ‘BOWIE';

-----------------------------------
| Id| Operation                   |
-----------------------------------
|  0| SELECT STATEMENT            |
|  1|  TABLE ACCESS BY INDEX ROWID|
|* 2|   INDEX FULL SCAN           |
-----------------------------------

The INDEX RANGE SCAN is simply not an option …

What is an option of course is to simply rewrite the query. One can just write the query in the “positive” sense and the index is now considered and used:

SQL> select * from bowie where text = ‘ZIGGY';

-----------------------------------
| Id| Operation                   |
-----------------------------------
|  0| SELECT STATEMENT            |
|  1|  TABLE ACCESS BY INDEX ROWID|
|* 2|   INDEX RANGE SCAN          |
-----------------------------------

Or, if there a many different distinct values that are not ‘BOWIE’ but which in total still constitute a relatively small percentage of the total rows, then it could be re-written as follows which can make use of the index in an effective manner by concatenating two separate index range scans:

SQL> select * from bowie where text < ‘BOWIE’ or text > ‘BOWIE';

        ID TEXT
---------- -----
   1000001 ZIGGY
------------------------------------
| Id| Operation                    |
------------------------------------
|  0| SELECT STATEMENT             |
|  1|  CONCATENATION               |
|  2|   TABLE ACCESS BY INDEX ROWID|
|* 3|    INDEX RANGE SCAN          |
|  4|   TABLE ACCESS BY INDEX ROWID|
|* 5|    INDEX RANGE SCAN          |
------------------------------------

Note this same issue applies to NOT IN conditions.

Be very careful when using NOT EQUAL conditions and be mindful of the impact they may have with your indexes.

Non-Unique Indexes and Direct-Path Inserts (What In The World) August 6, 2008

Posted by Richard Foote in Direct-Path Inserts, Non-Unique Indexes, Oracle Indexes, Performance Tuning.
6 comments

The OTN Database Forum has had some really good threads lately and something that came up was the question of indexes and Direct-Inserts which I thought might be worth a mention here.

I’ve previously discussed the differences between Unique and Non-Unique Indexes and my general preference for using Unique Indexes where possible and appropriate:

http://richardfoote.wordpress.com/2007/12/18/differences-between-unique-and-non-unique-indexes-part-i/

http://richardfoote.wordpress.com/2007/12/21/differences-between-unique-and-non-unique-indexes-part-ii/

http://richardfoote.wordpress.com/2007/12/30/differences-between-unique-and-non-unique-indexes-part-iii/

A Direct-Path Insert is a special mechanism used by Oracle to more quickly and efficiently insert data into a table. Rather than utilising the conventional method of using the table freelists or ASSM bitmaps to find an available free block that’s subsequently loaded into the buffer cache and processed, Oracle instead builds the necessary blocks in session memory and writes them directly to disk, “appending” them above the table High Water Mark (HWM) and hence by-passing the buffer cache entirely.

When inserting a large amount of data, a Direct-Path Insert can hence be substantially faster and has the added flexibility of being a NOLOGGING operation if required.

However, if you have a Primary Key (or Unique Key) policed via a Non-Unique index, then Oracle will automatically disable Direct-Path inserts behind the scene. This restriction has only been lifted since 11g.

A simple demo.

First, using a 10.2.0.3 database, create a table and populate it with a few rows:

SQL> CREATE TABLE ZIGGY (id NUMBER, text VARCHAR2(20));

Table created.

SQL> INSERT INTO ziggy SELECT rownum , ‘Ziggy’ FROM dual CONNECT BY LEVEL <= 10000;

10000 rows created.

SQL> COMMIT;

Commit complete.

Next, let’s add a PK that’s policed by a Non-Unique Index:

SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) USING INDEX(CREATE INDEX ziggy_pk ON ziggy(id));

Table altered.

Now, let’s see how many blocks below the HWM we have allocated to the table:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’ZIGGY’, estimate_percent=>null, cascade=> true, method_opt=>’FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> SELECT blocks, empty_blocks FROM dba_tables WHERE table_name = ‘ZIGGY';

BLOCKS EMPTY_BLOCKS
------ ------------
    25            0

Let’s see how many physical direct write operations we have currently performed:

SQL> SELECT n.name, s.value FROM v$mystat s, v$statname n WHERE s.statistic# = n.statistic# AND n.name = ‘physical writes direct';

NAME                   VALUE
---------------------- -----
physical writes direct    21

Let’s now attempt to perform a Direct-Path Insert operation:

SQL> INSERT /*+ APPEND */ INTO ziggy SELECT 10001, ‘NEW’ FROM dual;

1 row created.

SQL> COMMIT;

Commit complete.

But have we actually performed any physical writes direct operations ?

SQL> SELECT n.name, s.value FROM v$mystat s, v$statname n WHERE s.statistic# = n.statistic# AND n.name = ‘physical writes direct';

NAME                   VALUE
---------------------- -----
physical writes direct    21

NO !! The number of such operations has not changed. Has the HWM been incremented ?

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’ZIGGY’, estimate_percent=>null, cascade=> true, method_opt=>’FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> SELECT blocks, empty_blocks FROM dba_tables WHERE table_name = ‘ZIGGY';

BLOCKS EMPTY_BLOCKS
------ ------------
    25            0 

NO !! The number of blocks has also remained the same. Clearly then, the Direct-Path Insert has not actually worked and Oracle has simply performed a conventional insert operation instead.

Let’s see if the situation changes if we replace the index policing the PK with a Unique Index instead …

SQL> ALTER TABLE ziggy DROP PRIMARY KEY;

Table altered.

SQL> DROP INDEX ZIGGY_PK;

Index dropped.

SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id);

Table altered.

OK, let’s see the current number of physical writes direct operations:

SQL> SELECT n.name, s.value FROM v$mystat s, v$statname n WHERE s.statistic# = n.statistic# AND n.name = ‘physical writes direct';

NAME                   VALUE
---------------------- -----
physical writes direct    41 

Let’s try another Direct-Path Insert …

SQL> INSERT /*+ APPEND */ INTO ziggy SELECT 10002, ‘NEW’ FROM dual;

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SELECT n.name, s.value FROM v$mystat s, v$statname n WHERE s.statistic# = n.statistic# AND n.name = ‘physical writes direct';

NAME                   VALUE
---------------------- -----
physical writes direct    42 

And the number of physical writes direct has now increased. What about the HWM ?

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’ZIGGY’, estimate_percent=>null, cascade=> true, method_opt=>’FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> SELECT blocks, empty_blocks FROM dba_tables WHERE table_name = ‘ZIGGY';

BLOCKS EMPTY_BLOCKS
------ ------------
    26            0 

Yep, that’s gone up by one as well. Let’s just repeat the process to be sure …

SQL> SELECT n.name, s.value FROM v$mystat s, v$statname n WHERE s.statistic# = n.statistic# AND n.name = ‘physical writes direct';

NAME                   VALUE
---------------------- -----
physical writes direct    42 

SQL> INSERT /*+ APPEND */ INTO ziggy SELECT 10003, ‘NEW’ FROM dual;

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SELECT n.name, s.value FROM v$mystat s, v$statname n WHERE s.statistic# = n.statistic# AND n.name = ‘physical writes direct';

NAME                   VALUE
---------------------- -----
physical writes direct    43 

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’ZIGGY’, estimate_percent=>null, cascade=> true, method_opt=>’FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> SELECT blocks, empty_blocks FROM dba_tables WHERE table_name = ‘ZIGGY';

BLOCKS EMPTY_BLOCKS
------ ------------
    27            0 

Yes indeed, Direct-Path Inserts are definitely now occurring now we have a Unique Index policing our PK constraint.

Same thing now with a Non-Unique Index policing our PK again but this time on an 11g database …

SQL> CREATE TABLE ZIGGY (id NUMBER, text VARCHAR2(20));

Table created.

SQL> INSERT INTO ziggy SELECT rownum , ‘Ziggy’ FROM dual CONNECT BY LEVEL <= 10000;

10000 rows created.

SQL> COMMIT;

Commit complete.

SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) USING INDEX(CREATE INDEX ziggy_pk ON ziggy(id));

Table altered.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’ZIGGY’, estimate_percent=>null, cascade=> true, method_opt=>’FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> select blocks, empty_blocks from dba_tables where table_name = ‘ZIGGY';

BLOCKS EMPTY_BLOCKS
------ ------------
    25            0 

SQL> SELECT n.name, s.value FROM v$mystat s, v$statname n WHERE s.statistic# = n.statistic# AND n.name = ‘physical writes direct';

NAME                   VALUE
---------------------- -----
physical writes direct    23 

SQL> INSERT /*+ APPEND */ INTO ziggy SELECT 10001, ‘NEW’ FROM dual;

1 row created.

SQL> COMMIT;

Commit complete.

SQL> SELECT n.name, s.value FROM v$mystat s, v$statname n WHERE s.statistic# = n.statistic# AND n.name = ‘physical writes direct';

NAME                   VALUE
---------------------- -----
physical writes direct    24 

Now, even though we have a Non-Unique index, the number of physical writes direct operations has indeed gone up by one.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’ZIGGY’, estimate_percent=>null, cascade=> true, method_opt=>’FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> select blocks, empty_blocks from dba_tables where table_name = ‘ZIGGY';

BLOCKS EMPTY_BLOCKS
------ ------------
    26            0 

And indeed, the HWM has increased as well showing that indeed a Direct-Path Insert works as expected in 11g even if our PK constraint is policed by a Non-Unique Index. 

Yet another example of a difference between a Unique and a Non-Unique index which might be worth some consideration.

Empty Leaf Blocks and Statistics (Sense Of Doubt) July 8, 2008

Posted by Richard Foote in Index Access Path, Index Block Splits, Index Delete Operations, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes, Performance Tuning.
14 comments

I’ve recently been discussing how empty index blocks or those blocks that contain nothing but deleted index entries are placed on the index freelist and can potentially be recycled during subsequent index block split operations.

A point that’s not so well known about such empty index blocks is how Oracle considers them when calculating index related statistics and the possible implications this may have on the CBO.

Let’s set the scene with an example I’ve used previously where we load a table/index with 10000 entries and then subsequently delete the vast majority of them.

SQL> create table rich as select rownum id, ‘Bowie’ text from dual connect by level <= 10000;
 
Table created.
 
SQL> create index rich_i on rich(id);
 
Index created.

OK, so we now have an index with 10000 entries. Let’s just check to see how many leaf blocks we currently have:

SQL> analyze index rich_i validate structure;

Index analyzed.

SQL> select lf_rows, lf_blks, del_lf_rows from index_stats;

   LF_ROWS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
     10000         21           0

So we currently have 10000 LF_ROWS and 21 LK_BLKS with no deleted index rows at this stage.

Let’s now deleted the vast majority of rows from the table and hence index row entries from the index:
SQL> delete rich where id <= 9990;
 
9990 rows deleted.
 
SQL> commit;
 
Commit complete.

OK, so now we have an index with the vast majority of the index entries having been deleted and with all but one index leaf block effectively empty.

Let’s start by looking at how the ANALYZE INDEX … VALIDATE STRUCTURE deals with empty leaf blocks and index entries:

SQL> analyze index rich_i validate structure;
 
Index analyzed.
 
SQL> select lf_rows, lf_blks, del_lf_rows from index_stats;

   LF_ROWS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
     10000         21        9990

The first thing we notice is that the LF_ROWS statistics still has a value of 10000. It still counts index entries, even if they’ve been deleted.

We also notice that the LF_BLKS value is 21 so those leaf blocks that are effectively empty are still counted as well.

Let’s now collect statistics using DBMS_STATS as currently recommended by Oracle:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> ‘RICH’, cascade => true, estimate_percent=> null, method_opt=> ‘FOR ALL COLUMNS SIZE 1′);

If we now look at the index statistics:

SQL> select index_name, num_rows, leaf_blocks from dba_indexes where index_name = ‘RICH_I';

INDEX_NAME NUM_ROWS LEAF_BLOCKS
---------- -------- -----------
RICH_I           10           1

We notice a couple of important differences. Firstly, the NUM_ROWS value is 10, highlighting that only non-deleted index entries are counted. We also notice that the number of LEAF_BLOCKS is only 1, highlighting that only those index leaf blocks that contain non-deleted index entries are counted. Although there are 20 other leaf blocks within the index structure, these are not counted and considered by the CBO when statistics are calculated using DBMS_STATS.

If we run the following simple little query that effectively selects all remaining rows from the table, we notice the following execution plan:

SQL> select * from rich where id between 1 and 10000;

        ID TEXT
---------- -----
      9991 Bowie
      9992 Bowie
      9993 Bowie
      9994 Bowie
      9995 Bowie
      9996 Bowie
      9997 Bowie
      9998 Bowie
      9999 Bowie
     10000 Bowie

Execution Plan
--------------------------------------------
|Id | Operation                   | Name   |
--------------------------------------------
| 0 | SELECT STATEMENT            |        |
| 1 |  TABLE ACCESS BY INDEX ROWID| RICH   |
|*2 |   INDEX RANGE SCAN          | RICH_I |
--------------------------------------------

The index is actually used to select all the remaining 10 rows, in part because the index related costs are so low.

Let’s see what would happens if we were to use the old, ANALYZE command to calculate the index statistics:

SQL> analyze index rich_i compute statistics;

Index analyzed.

First, let’s see if the index statistics are any different …

select index_name, num_rows, leaf_blocks from dba_indexes where index_name = ‘RICH_I';

INDEX_NAME NUM_ROWS LEAF_BLOCKS
---------- -------- -----------
RICH_I           10          21

OK, a big big difference here. Where previously, DBMS_STATS didn’t include the empty leaf blocks in it’s statistics, we now notice that using the ANALYZE command does include such empty leaf blocks. The LEAF_BLOCKS value is now 21, not 1 as it was previously. Note though that the number of NUM_ROWS is still 10, so it still doesn’t count the deleted index entries themselves, just the empty leaf blocks.

But leaf blocks is one of the key statistics used by the CBO when calculating the cost of using an index related access path. Could this all make a difference in how our previous query is costed by the CBO ?

SQL> select * from rich where id between 1 and 10000;

        ID TEXT
---------- -----
      9991 Bowie
      9992 Bowie
      9993 Bowie
      9994 Bowie
      9995 Bowie
      9996 Bowie
      9997 Bowie
      9998 Bowie
      9999 Bowie
     10000 Bowie

10 rows selected.

Execution Plan
----------------------------------
| Id  | Operation         | Name |
----------------------------------
|   0 | SELECT STATEMENT  |      |
|*  1 |  TABLE ACCESS FULL| RICH |
----------------------------------

Oh yes indeed. Now the CBO has decided to use a Full Table Scan, in large part because of the additional calculated costs associated with using the index.

Note these tests work the same on all supported versions of Oracle.

So empty leaf blocks can still have a large impact on not only how a query may perform but indeed on how the CBO calculates the associated costs, depending on how the statistics are generated.

Yes, there are differences between the ANALYZE command and DBMS_STATS. This is one of the more subtle differences …

3 Steps To Performance Tuning (Working Class Hero) July 3, 2008

Posted by Richard Foote in Oracle Opinion, Performance Tuning, Richard's Musings.
8 comments

Last night, I answered a question on the Database OTN forum regarding Database Re-Org and Performance Tuning. I thought it might be worthwhile sharing my response here as it’s something I feel quite strongly about.

Basically my response to the question of what basic steps one should follow when performing performance tuning was:

1) Identify an actual problem that needs addressing, one that’s problematic to the business, not one that only exists in some statistic or in one’s imagination

2) Determine what’s actually causing the problem as identified in Step 1.

3) Address the specific issue as identified in Step 2.

It all sounds rather obvious but it’s amazing how many don’t follow these 3 basic steps and attempt to jump straight to Step 3.

Unless you perform Step 1, you can’t accurately perform Step 2 which means you’ll only be guessing when performing Step 3.

The secret to performance tuning is not to guess …

I’ve lost count of the number of times I see people guessing at what a problem might be and hence get it all wrong …

The number of times people waste time and resources on problems that aren’t really problems and hence make no measurable difference …

The number of times people throw hardware at a problem without fully considering whether additional hardware will actually resolve the problem and hence waste money and resources for no measurable benefit …

The number of times people jump straight to applying a solution to a problem that they haven’t properly or correctly diagnosed and hence don’t actually solve the issue …

The number of times people attempt to resolve a problem by focusing on the symptoms rather than the root cause, only to fail dismally …

The number of times people are lucky and fix a problem by guesswork and by fumbling around in the dark without understanding why it fixed the problem, only to attempt the same thing again at another time and for it to fail dismally …

Like I said, the secret to performance tuning is not to guess.

John Lennon was once quoted as saying his secret to writing music was to:

1) Say what you want to say

2) Make it rhyme

3) Put a back beat to it

Three basic, fundamentally important steps. He would have made a good DBA :)

Extended SQL Tracing Presentation (Your Possible Pasts) May 30, 2008

Posted by Richard Foote in Oracle General, Oracle Myths, Oracle Opinion, Performance Tuning.
3 comments

I’ve recently dug up an old presentation I did for the local Oracle User Group a number of years ago:

Yet Another Presentation On Extended Tracing

It’s slightly dated but has some useful general information on the subject of diagnosing performance issues and Extended SQL Tracing in Oracle that some may find useful, so I thought it might be worth posting it here. I stripped out most of the formatting so that the resultant file will be smaller to download.

Extended SQL Tracing has been absolutely invaluable over the years in diagnosing and trouble-shooting performance related issues. I had the very good fortune of attending a Hotsos Diagnosing Oracle Performance course in Sydney around 2003 (at the same time when Guy Sebastian won the first Australian Idol, ooops, probably shouldn’t have admitted remembering that) with Cary Millsap and Gary Goodman. It was a really fantastic training course which focused on the whole subject of “Method R” and using Extended SQL Tracing to diagnose performance issues. I also remember Steve Adams attending as well and thinking to myself, this guy really  knows his Oracle …

If you haven’t already, I would strongly recommend checking out the Optimizing Oracle Performance book by Cary Millsap with Jeff Holt, an excellent read.

And Guess what. Using Extended SQL Tracing actually works !!

Larger Block Index Tablespace and Small Index Scans – Performance Improvement ? (Let Down) March 31, 2008

Posted by Richard Foote in Index Block Size, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Tablespace Management.
12 comments

Thought it might be worth looking at the impact on the performance of Unique and small index range scans as typical in OLTP environments, when an index is rebuilt in a larger block tablespace.

Warning, this discussion will again primarily be an exercise in simple mathematics. However, a few little details to start the ball rolling.

I’ve already discussed how in many scenarios, by increasing the index block size, the height of the index can remain unchanged. One can’t simply assume a larger block index results in an index with a lesser height. However, as we shall see, even when indexes do reduce in height, the so-called performance benefits can be somewhat “exaggerated”.

Note also in OLTP environments, the vast majority of index related access paths are either unique scans or small index scans. We’re only interested in a specific customer, in a specific order, in the bank balances of specific accounts, with specific customers buying specific products with specific credit cards, etc. Remember also that say a block size of 8K can generally store many hundreds of index entries per index leaf block and that even a humble little 2K leaf block can often store a three figure number of index entries …

So let’s say in our first example we currently have an 8K index with a height of 3 which we rebuild in a 16K tablespace and the index height remains the same. What are the comparative costs of performing a simple index look-up.

In the 8K index, we need to read 3 x 8K (root, branch and leaf block) to read the index entry of interest for a total of 24K. However, in the new 16K index, we now need to read 3 x 16K (again, a root, branch and index leaf block) to read the index entry of interest, for a total of 48K.

That’s double the potential physical I/O if the blocks are not currently cached, that’s double the memory that needs to be used in the buffer cache to store the index blocks and that’s potentially more CPU we need to use in order to pin and process the larger index blocks.

Hummm, these larger blocks aren’t too impressive so far …

But what if we actually reduce the height after moving an index to a larger block tablespace ?

In the 8K index, we still have our 3 x 8K = 24K. If we rebuild in an 16K block tablespace and reduce it’s height by 1, we now only need to read 2 x 16K = 32K. That’s 3 consistent reads at 24K vs. 2 consistent reads at 32K. Yes, we reduce the number of consistent reads which is a good thing but we still have a larger index footprint which again means possibly more data off disk, more memory and more CPU resources to process the index blocks.

Let’s hope our 2K block index with a height of 2 reduces its height somewhat if we decide to rebuild it in 32K blocks because that’s 2 x 2K = 4K vs 2 x 32K = 64K otherwise. But even if we did, although we reduce the consistent reads, which is a good thing, that’s still 2 x 2K = 4K vs 1 x 32K = 32K.

But we can potentially reduce the height of a 2K index if rebuild in a much larger block size (in specific cases) by more than 1 level, right ?

Yes, a 4 level 2K index, 4 x 2K = 8K, might very well now only require 2 levels in a 32K index, 2 x 32K = 64K. But that’s still 64K of index data we need to access vs. 8K in the smaller block. Consistent reads reduce but our footprint for small index range scans is still larger, sometimes by a considerably amount depending on the change of block size.

You begin to see the issue …

It’s a bit like someone saying they’re going to improve the skyline somewhat and only build appartment buildings that have fewer numbers of floors than previously. But if the floors they build are 2 times or 4 times or even 16 times higher than the previous floors and they only reduce the number of floors by a moderate amount, is the building really lower ?

Have we really reduced the height of the building ?

Yes, the lift only has to stop at fewer floors which can be more “efficient” but it takes longer each time as it goes from one floor to the next. So is it really that much quicker to get to the top, or more specifically in our discussion, from the top to the bottom of the building ?

Surely though, things must run faster, performance must actually improve, things must be more efficient by having larger sized index blocks else why bother ? Why indeed …

This demo on the Performance Impact Of Small Index Scans In A Larger Index Block Size shows how performance may actually worsen, not improve if we move indexes into a larger block size. It creates an index on a well clustered column (so the index is at it’s most efficient and potentially impacts performance the most), first with an 8K block size and then with a 16K block size. The size of the index was carefully chosen so that the larger index block size did indeed reduce the height of the index (although the range of values when this was possible was actually quite limited with these block sizes). A simple PL/SQL procedure then performs a massive number of single row look-ups and the CPU and elapsed times are monitored. The results show when the associated blocks are either cached or not cached, the larger block index incurs larger CPU related costs and results in overall slower response times, despite the fact it actually has a lower height than the smaller block index.

It’s not precisely the same as a large scale environment as PL/SQL has subtle little differences and efficiencies when compared to multiple, separate transactions. Also, it’s not the specific results that are important here but the overall general approach. You can pick any index you want to  investigate, you can pick whatever block sizes you may be interested in, you can monitor and benchmark by checking out specific session details or by tracing the specific sessions, you can determine what the comparative costs and response times may be and you can determine what may cause any performance differences by digging deeper into the session statistics or trace files.

No matter what index or block size you select, you will likely come to the same conclusion. The overall performance benefits of your OLTP transactions when you move indexes to a larger block tablespace will likely be “disappointing”.

If a real estate developer comes knocking on your door with a promise that the empty block around the corner will have an apartment building with only a few floors and it shouldn’t ruin the view too much, you may just want to ask them quietly exactly how many floors they’re planning to build and exactly how high each floor will be.

Else the resulting view may turn out more disappointing than you’re led to believe ;)

Store Indexes In A Larger Block Tablespace: The Multiblock Read Myth Part II (The Fly) March 20, 2008

Posted by Richard Foote in Index Block Size, Index statistics, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Tablespace Management.
67 comments

Thought I might begin by mentioning a lovely little story by Billy Verreynne in this OTN Forum Thread.

Basically a scientist is doing research on the behaviour of flies. He notes when he opens the jar lid of a trapped fly and claps his hands, the fly takes off and flies away. One day, he decides to pull the wings off the fly. When he claps his hands, the fly just sits there. No matter how loud he claps, other than a slight rocking motion, the fly just doesn’t budge.

Excitedly, he writes in his journal his latest discovery. When you pull the wings off a fly, it goes stone deaf !!

This is soooo funny because this sort of thing happens all the time in the Oracle world (and elsewhere). I see it time after time after time, we’ve all done it.

Hey, I just compressed a segment into one extent and performance improved. Conclusion, storing a segment in one extent improves performance.

Hey, I just separated my indexes from my tables and performance improved. Conclusion, separating indexes from tables improves performance.

Hey, I just moved my indexes into a larger block sized tablespace and my Index Fast Full Scan performance improved. Conclusion, a larger index block size improves Index Fast Full Scan performance.

Lots of flies with no wings. Lots of people thinking flies go deaf when they don’t have wings. An action appears to cause an effect, but does it really …

As Billy himself suggests, the fundamental reason why so many people think “flies with no wings go deaf” is that many don’t understand what’s actually going on. Many simply don’t understand the basic workings, the fundamental processes and mechanisms involved in how Oracle functions or how Oracle performs a specific operation. Tuning by observation, tuning by making change “A” without understanding all the implications of such a change and subsequently making suppositions on the results of such a change, ultimately means we have lots of people thinking flies without wings are deaf.

As I discussed in Part 1, Oracle performs exactly the same sized I/O during a multiblock read, regardless of the block size of the segment. Exactly the same. Without understanding this simple fact, one could very easily come to a wrong conclusion regarding the ramification of block sizes on multiblock reads. Without understanding flies don’t have ears or sound sensors in their wings, one could very well come to a wrong conclusion regarding the ramifications of removing the wings from a fly.

If we perform an Index Fast Full Scan and performance improves, it can’t be because associated multiblock I/Os are more efficient. A fly doesn’t have sound sensors in its wings. There must be another explanation. Conversely, in my example in Part I of this discussion, performance went worse with a larger index block size (as it did in Greg Rahn’s example on this OTN Forum Thread), but again not as a result of multiblock read performance.

So how could the performance of an Index Fast Full Scan change (for better or worse), if one simply rebuilds the index within a larger block size tablespace ? Well there are of course many possible reasons, with the two more obvious explanations being the following:

1) Most randomly inserted indexes have a PCT_USED value ranging between 70-75% as these indexes perform random 50-50 block splits that are subsequently in differing stages of being filled. By rebuilding an index (say back to a default PCTFREE of 10%) one might increase index compactness by say 15% and hence decrease the overall index size (note reduced block overheads may also reduce the index a little as well, depending on index size and differences in block sizes). The Fast Full Index Scan is the access path that potentially benefits most by defragmenting an index as the associated costs are proportional to the overall size of the index. Reducing the size of an index could therefore impact subsequent performance. However, rebuilding the index in the current block size would likely achieve a similar result (plus block overheads), compacting the index and resulting in potentially better performance (although once the index blocks begin to split again, the index would eventually return back to its previous state). Therefore, it’s not the bigger block size but the resultant defragmentation of the index that’s improved matters. The fly isn’t deaf, it just needs its wings to fly …

2) By storing an index in a larger block tablespace, the index must physically be stored on a different database file. This file could be on a faster disk, improving performance, this file could be on a faster part of the disk, improving performance, this file could be on a disk with far less disk contention, improving performance, etc. etc. It’s not the larger block size that’s improved (or worsened) performance, it’s the new physical characteristics of where the index is now stored. If one were to rebuild the index with the current block size and use the same physical characteristics of the larger block index, subsequent performance would likewise increase (or decrease). The fly isn’t deaf, it just needs its wings to fly …

There are many other possible reasons, the system was less busy when using the larger block index, more of the index was physically cached when using the larger block index, etc. etc.

Of course, an Index Fast Full Scan is rarely a scalability issue anyways. Do we really want our applications to perform hundreds of large, concurrent Index Fast Full Scans ? Tuning the application to avoid these overheads should be the focus rather than moving indexes into a larger block tablespace in the vain hope it will improve things dramatically. But that’s the topic of another discussion …

Can the performance of an Index Fast Full scan change after moving the index to a larger block size tablespace. Absolutely. However, it doesn’t necessarily mean such a change in performance is a direct result of the index having a larger block size and that multiblock read performance has improved.

It doesn’t mean moving indexes to larger block size tablespaces suddenly makes Oracle go deaf …

Bzzzzzzzzzzzzzzzzzz

UPDATE: I’ve added this simple little demo that illustrates how performance improves when an index is rebuilt in a larger block tablespace. This will of course suggest to some folk that the larger block tablespace improved the performance but what actually improved things was rebuilding the fragmented index to be a more efficient structure. The larger block tablespace was not the fix, rebuilding the index was the important factor. In fact, by rebuilding the index in the original smaller block tablespace, not only do we also improve performance, but things are further improved as we reduce CPU overheads incurred by the larger block tablespace and as a result elapsed times are further improved.

Store Indexes In A Larger Block Tablespace: The Multiblock Read Myth (Karma Police) March 18, 2008

Posted by Richard Foote in Index Block Size, Index statistics, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Tablespace Management.
8 comments

One of the great myths surrounding the use of differing block sizes is that storing indexes in larger block sizes somehow dramatically improves the performance of index related multiblock reads.

Oracle performs index multiblock reads when performing an Index Fast Full Scan, when it basically treats the index structure as a skinny version of the table. It reads the entire index structure, multiple blocks at a time, “throwing away” any non leaf blocks as it stumbles across them.

The theory goes that by storing indexes in larger size blocks, we would obviously have fewer index related blocks. If we need to read the entire index, as we do with an Index Fast Full Scan, surely it must be more efficient if we have fewer, larger index blocks.

The evidence looks convincing. Here’s a link to an extract from a book by Robin Schumacher where he clearly shows a dramatic “improvement” by using an index tablespace with double the block size. In one query using an 8K block index, the consistent reads during an Index Fast Full Scan was 421. However, when the same index was recreated as a 16K block index, the same query only used 211 consistent gets, 1/2 of what it was previously.

Conclusive “proof” that the 16K block index improved performance wouldn’t you say ?

Well actually, it’s only conclusive proof that the number of consistent gets has dropped, whether it actually improves “performance” is another thing entirely.

There are a couple of little “details” that many don’t quite appreciate. The devil is always in the details …

The first point to note is that when Oracle performs a multiblock read, it uses the value in the db_file_multiblock_read_count parameter to determine how many blocks to read per multiblock read (with system statistics, Oracle itself can determine how best to set this value).

So if the db_file_multiblock_read_count value were set to say 16, Oracle will attempt to read as many as 16 blocks at a time during a multiblock read operation.

Note this value is based on the default block size of the database. So if the default block size is 8K and the db_file_multiblock_read_count is 16, Oracle will try and read 16 x 8K blocks at a time during a multiblock read operation.

However, if there’s a non-default block sized segment (say 16K), Oracle will adjust the number of blocks that are actually read during a multiblock read operation so that the maximum size of the overall multiblock read is identical to that of the default block size.

So if the db_file_multiblock_read_count is 16 and the default block size is 8K, a multiblock read of an object in a 16K tablespace will only read 8 blocks at a time (and not 16). A multiblock read of an object in a 2K tablespace will read 64 blocks at a time.

The actual size of a multiblock read therefore is identical regardless of the block size of an object within a database.

An easy way to highlight this is to simply trace a session and see the specific size of corresponding multiblock read operations.

A sample from a trace on an 8K block index (with the db_file_multiblock_read_count set to 16), performing an Index Fast Full Scan looks like this:

WAIT #1: nam=’db file scattered read’ ela= 1487 file#=8 block#=1050 blocks=16 obj#=78294 tim=615409554677
WAIT #1: nam=’db file scattered read’ ela= 1377 file#=8 block#=1066 blocks=16 obj#=78294 tim=615409557777
WAIT #1: nam=’db file scattered read’ ela= 1143 file#=8 block#=1082 blocks=16 obj#=78294 tim=615409561563

Note that Oracle is reading 16 x 8K blocks (128K) per multiblock read operation.

However, when the index is recreated in a 16K block size tablespace, the Fast Full Index Scan looks like this:

WAIT #1: nam=’db file scattered read’ ela= 1413 file#=6 block#=14 blocks=8 obj#=78296 tim=626802128684
WAIT #1: nam=’db file scattered read’ ela= 1447 file#=6 block#=22 blocks=8 obj#=78296 tim=626802131649
WAIT #1: nam=’db file scattered read’ ela= 2014 file#=6 block#=30 blocks=8 obj#=78296 tim=626802135222

Note that Oracle is now only reading 8 x 16K blocks (128K) per multiblock operation.

Both indexes are effectively doing exactly the same work, both are effectively reading up to 128K of data per multiblock read …

It’s like paying someone $50 per 1/2 hour of work and then deciding to make things more “efficient” by paying them $100 per hour of work instead. In the end, you’re still just paying them $800 for an 8 hour day’s work regardless …

Note a larger block size will have less associated block overheads with there being less actual blocks so the overall size of an index may reduce a little, depending on index size and differences in block sizes. Therefore any possible improvements will only be restricted to the potential savings in the overall index size. With many databases having default block sizes of 8k and a maximum block size restricted to 16k, these savings may be minimum or non-existent.

This demo on the impact of different block sizes on multiblock read operations shows how Oracle actually performs the same sized reads when performing multiblock reads from differing block sized tablespaces, with the performance of the index in the larger block size tablespace being somewhat worse in this specific example.

With Oracle effectively performing identical work behind the scenes, the performance between different block size tablespaces is likely to be similar. You’re still paying $800 a day regardless …

Although it’s often claimed that multiblock reads is one of the key areas where larger index block sizes are beneficial, a claim based generally on the simplistic fact the number of consistent reads is reduced, the reality of the situation is somewhat different …

Store Indexes In a Larger Block Tablespace: Some Thoughts (Big Brother) March 16, 2008

Posted by Richard Foote in Index Block Size, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Richard's Musings, Tablespace Management.
17 comments

A suggestion that seems to pop up on a routine basis on various forums and discussion boards is that we should be storing our Indexes in a larger block-size tablespace. For example, if our database block size is set to 8K, we should be creating separate (say) 16K block tablespaces specifically for our indexes. Doing so will improve performance as the index will have a flatter, more efficient structure. Multiblock reads will also be more efficient (or so the theory goes) as we would be reading fewer index blocks during such scans.

Oracle introduced the concept of having different tablespaces in a database with different block sizes back in 9i Release 1 in order to make transportable tablespaces between databases with differing block sizes possible. However, there’s nothing preventing one creating a new tablespace with a non-default block size and assigning objects to these tablespaces.

In principle, storing indexes (in particular) in a larger block size sounds like a really good idea doesn’t it ?

I’ll be discussing the pros and cons of this approach in future postings but just some initial thoughts to get everyone thinking about it:

  • All tablespaces with a non-default block size requires a separate, non-default block size buffer cache to be manually configured
  • Non-Default buffer caches are not automatically sized as part of Oracle’s automatic memory management and must be manually tuned and sized, potentially increasing administrative overheads
  • Non-Default buffer caches do not have an associated KEEP or RECYCLED pool and so all objects with the same non-default block size must reside in the same buffer cache
  • The possibility of unnecessarily caching blocks from an infrequently accessed object and wasting memory is therefore likely to increase
  • The possibility of unnecessarily aging out blocks from a more frequently accessed object is also likely to increase, thus increasing I/O related overheads
  • Although the height of an index may reduce if stored in a larger block size, in many cases it may not actually change at all
  • In those cases when the height of an index is actually reduced, the actual performance benefit of such a height reduction is often overstated
  • The reduction of index leaf blocks (a much more telling possible advantage) is only beneficial to very specific types of queries
  • Larger blocks often have the disadvantage of greater contention, which can lead to performance related issues
  • Indexes with larger block sizes have a significantly greater I/O and memory related footprint in relation to most OLTP related index scans
  • Index related multiblock reads on larger block sized segments actually have no real benefit when compared to multiblock reads on smaller block sized segments 
  • Most databases out there in the “real world” only use default block size tablespaces so the risks associated with finding bugs, CBO anomalies, etc. increase once non-default block sizes are introduced

Although in specific scenarios with specific applications, there may be some potential benefits of using non-default blocks sizes, in general, the disadvantages of using non-default block sizes usually out weigh these potential benefits.

As we shall see …

Index Skip Scan – Does Index Column Order Matter Any More ? (Warning Sign) March 10, 2008

Posted by Richard Foote in Index Access Path, Index Skip Scan, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
33 comments

I’ve already written a few posts regarding concatenated indexes and things to consider (and not consider) when deciding the column order of a concatenated (composite) index.

A comment I see from time to time is that the whole question of column order within an index is now somewhat redundant anyways as Oracle since 9i has introduced the Index Skip Scan access path. Prior to 9i, if the leading column of an index wasn’t specified in a predicate, the index was effectively ignored by the CBO. However, if the leading column isn’t referenced now, Oracle can use the index anyways via an Index Skip Scan access path.

So the column order in a concatenated index doesn’t matter that much now, right ?

Well, not quite ….

An Index Skip Scan can only actually be used and considered by the CBO in very specific scenarios and is often an indicator there’s either a missing index or an exisiting index has the columns in the wrong order.

If the leading column of an index is missing, it basically means the values in subsequently referenced columns in the index can potentially appear anywhere within the index structure as the index entries are sorted primarily on the leading indexed column. So if we have column A with 100,000 distinct values and column B with 100,000 distinct values and an index based on (A,B), all index entries are sorted primarily on column A and within a specific value of column A, sorted by column B. Therefore if we attempt a search on just Column B = 42, these values could potentially appear anywhere within the index structure and so the index can not generally be effectively used.

However, what if the leading column actually contained very few distinct values ? Yes, the subsequent column(s) values could appear anywhere within the index structure BUT if these subsequent columns have relatively high cardinality, once we’ve referenced the required index entries for a specific occurrence of a leading column value, we can ignore all subsequent index row entries with the same leading column value. If the leading column has few distinct values, this means we can potentially “skip” several/many leaf blocks until the leading column value changes again, where we can again “probe” the index looking for the subsequent indexed column values of interest.

So if we have a leading column with few distinct values, we may be able to use the index “relatively” efficiently by probing the index as many times as we have distinct leading column values.

On the other hand, if the leading column has relatively high cardinality then an Index Skip Scan is not a viable option. An index can generally store many hundreds of index entries per index leaf block, depending on the block size and the average size of the index row entries of course. So if the leading column were to change just once on average within the subsequent index leaf block, Oracle would be forced to scan the next index leaf block anyways as it may contain the required index row entry.

For Oracle to be able to “skip” an index leaf block, the leaf block must contain nothing but the same leading column value as last changed in a preceding leaf block. Hopefully, there are many leaf blocks where the leading column value doesn’t change and so hopefully there are many leaf blocks that can potentially be “skipped”.

Therefore, the cardinality of the leading column is crucial for an Index Skip Scan to be viable. In the example above where we had 100,000 distinct values for columns A and B, unless the table is massive, it’s unlikely an Index Skip Scan will be viable regardless of which column is the first column in the index. However, if column B only had 10 distinct values, then an index based on (B,A) may very well be able to use an Index Skip Scan whereas an index on (A,B) would not.

Note though that an Index Skip Scan must probe the index at least as many times as there are distinct values of the leading column. This will not be as efficient as an index that only requires the one index probe. Therefore although a query with a predicate based on A=42 could use an Index Skip Scan  with an index on (B,A) assuming column B had few distinct values, an index on (A,B)  or (A) would be more efficient as it would only require the one index probe.

However, if the performance of index (B,A) were “good enough” and/or a search on just A=42 was uncommon, then the index on (B,A) may be quite adequate and an index on (A,B) may be unnecessary. The index on (B,A) would also be able to handle queries based on columns A and B and queries based on just column B (providing the CBO determined the selectivity acceptable, which it might for unevenly distributed rare values of column B).

See this Index Skip Scan demo to see when it all may prove useful.

No, an Index Skip Scan doesn’t mean we don’t need to consider the column order of an index. If anything, it’s something else that needs to be considered and along with index compression, is another reason why low cardinality leading index columns have advantages.

Index Compression Part IV (Packt Like Sardines In a Crushd Tin Box) February 29, 2008

Posted by Richard Foote in Index Compression, Index Internals, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
17 comments

A great question by Brian Tkatch has convinced me there needs to be at least a Part IV in this particular mini-series. The basic question is what size does the column need to be for it to be effectively compressed ?

If a column is only just the one byte, then it can’t really be compressed as the prefix related overheads would likely make the exercise pointless, right ?

Well not quite …

True a larger column that is heavily repeated will yield a better return from compression than a smaller column, but even a single byte column could be effectively compressed, IF there are sufficient repeated values per index leaf block.

Previously, I illustrated a logical representation of how Oracle compresses a block. The following is more of a physical representation of a compressed block:

Prefix 0: David Bowie

0:0

  : ROWID

1: 0

  : ROWID

2: 0

  : ROWID

3: 0

  : ROWID

4: 0

  : ROWID

Prefix 1: David Jones

 

 

5: 1

 

  : ROWID

6: 1

  : ROWID

7: 1

  : ROWID

Remember, only the leading columns in an index can be compressed and index row entries must be sorted by these leading columns. Therefore each prefix entry is referenced by one or more logically consecutive index row entries and each index row entry can only have the one index prefix. The corresponding prefix for a specific index row entry is therefore simply the prefix entry that precedes it within the leaf block.

If we look at sample prefix entries:

prefix row#0[8030] flag: -P—-, lock: 0, len=6
col 0; len 3; (3):  c2 08 43
prc 1
prefix row#1[8015] flag: -P—-, lock: 0, len=6
col 0; len 3; (3):  c2 08 44
prc 1

Note that the prefix row#0 starts at address 8030 while the next prefix entry row#1 starts at address 8015. That’s a total of 15 byes and yet the length of the prefix entry is just 6. Where’s the other 9 bytes ?

Well note the prc value for prefix row#0 is 1 meaning this prefix entry is only referenced by the one index row, as is the next prefix entry as well.  The leading column in this particular example is actually unique meaning each and every prefix entry only has one corresponding index row.

If we look at the corresponding row entry, guess what it’s length will be …

row#0[8021] flag: ——, lock: 0, len=9
col 0; len 6; (6):  04 81 aa 8b 00 69
psno 0
row#1[8006] flag: ——, lock: 0, len=9
col 0; len 6; (6):  04 81 aa 8b 00 6a
psno 1

If you guessed 9, well done. Note that it’s starting address is 8021, which is the starting address of the previous entry 3015 + 6 bytes for the prefix entry which equals 8021 (as Oracle actually fills index blocks from the “bottom up”). Note that index entry row#0 has a psno of 0 meaning it’s corresponding prefix entry references prefix row#0.

Note that the index row length of 9 is made up of 2 byes for locks and flags info, 6 byes for the rowid and 1 byte for the rowid length (as this index is a non-unique index, albeit with unique index values).

Therefore the psno value actually uses 0 bytes as its value can be purely derived from it’s position within the leaf block in relation to its parent prefix value.

Let’s now look at an index row entry with a leading column that is heavily repeated but only one byte in size.

row#0[8025] flag: ——, lock: 0, len=11
col 0; len 1; (1):  41
col 1; len 6; (6):  04 81 12 8a 02 67
row#1[8014] flag: ——, lock: 0, len=11
col 0; len 1; (1):  41
col 1; len 6; (6):  04 81 12 8a 02 68
row#2[8003] flag: ——, lock: 0, len=11
col 0; len 1; (1):  41
col 1; len 6; (6):  04 81 12 8a 02 69

Note the same leading column (hex 41) is repeated again and again within the leaf block and uses a total of 2 bytes of storage, 1 byte for the value and 1 byte for the column length. That means for each and every index row entry, this column uses 2 bytes. Note the length of the index row entries is currently 11 bytes.

Let’s look at the same index compressed.

prefix row#0[8032] flag: -P—-, lock: 0, len=4
col 0; len 1; (1):  41
prc 726
row#0[8023] flag: ——, lock: 0, len=9
col 0; len 6; (6):  04 81 12 8b 00 42
psno 0
row#1[8014] flag: ——, lock: 0, len=9
col 0; len 6; (6):  04 81 12 8b 00 43
psno 0
row#2[8005] flag: ——, lock: 0, len=9
col 0; len 6; (6):  04 81 12 8b 00 44
psno 0

First thing to note is that there is only the one prefix row entry for this particular leaf block and that all 727 index row entries all share the same prefix row entry.

The length of the prefix entry is 4, 2 bytes for these flags and locking info, 1 byte for the prefix column value and 1 bye for the prefix column length. That’s just 4, tiny little  byes.

Note that each index row entry is now only 9 bytes reduced from 11 as we no longer need to store the 2 bytes for the repeated column entry. That means for a total overhead of 4 byes, we are able to save 2 bytes in each and every index row entry.

Not bad at all and as such represents a useful saving. In this particular example, the index reduced from 163 leaf blocks down to 138 leaf blocks, just by compressing a single byte column as the column has very low cardinality.

Note however if this leading column were unique for every row, we would increase the storage of this column from 2 bytes for every index row entry up to 4 bytes for every index row entry as every index entry would have it’s own associated prefix entry. That’s why compression can potentially increase the size of an index if the compressed column(s) has too high a cardinality due to the additional 2 byte overhead required to store all prefix row entries.

The following Index Compression Part IV demo shows all this in a little more detail.

There’s more to index (and indeed table) compression than simply compressing an index (or table) for the sake of it …

BTW, the spelling mistakes in the post title are fully intentional ;)

Oracle Diagnostic Tools – Some Random Thoughts (Hammer To Fall) February 26, 2008

Posted by Richard Foote in Oracle Cost Based Optimizer, Oracle General, Performance Tuning, Richard's Musings.
37 comments

There’s been a bit of discussion lately and some interesting opinions aired, such as those by Daniel Fink , Doug Burns, and Alex Gorgachev regarding the whole issue of the usefulness or otherwise of various Oracle diagnostic tools (such as Statspack, AWR, ADDM, Extended Tracing, etc).

Are they really useful in diagnosing problems or do they simply provide a whole bunch of numbers that most folk can’t readily decipher ?

Boy, I could go on and on with this one …

IMHO, there’s no doubt that many (most even ?) people use many of these “diagnostic tools” incorrectly and use their view and perception of the data as some kind of justification for what really amounts to nothing more than a guess when attempting to determine the root cause and solution of a performance issue.

Look at all that time spent performing sequential reads, boy you need faster disks. Buffer Busy Waits are in your top 5 wait events, obviously need to add more freelists somewhere. Your session’s running slow eh, well statspack is showing enqueue waits times are really high, it’s probably that. If not, it might have something to do with those log file sync times (whatever they are).

I have no doubt about Mogens Norgaard’s assertion (as referred to by Daniel Fink) that if you take 2 people of the same skill in separate rooms and arm them with the same Statspack report, they would both come up with different suggestions on what needs to be done. I’ve lost count of the number of threads in forums that start with someone complaining about “performance”, posting a statspack report and ending up with 10 different people making 10 different suggestions.

I’ll digress somewhat, but this is a story I use to try and explain to my kids what it is I actually do.

Mum says she’s heading off to the local shops to get some milk for our breakfast cereal. The shops are close, a few minutes drive away and I expect she’ll be back in 10 minutes or so. 2 hours later, she finally gets home, drops off the milk and leaves again in a huff before we can ask what happened. Kids, my job is to find out what took her so long.

It’s at this point the kids say that we don’t normally have cereal, but toast anyways and if I’m angry at mum for taking so long I should have gone and got the milk myself, but I try to get the discussion back on track …

I know there was plenty of fuel in the car, the weather was fine and clear and that the traffic generally wasn’t too busy at the time. There’s also usually plenty of milk at the shop. Now I’ve been listening to the radio and have a pretty good idea of the overall traffic and weather conditions. There was a truck that overturned nearby, resulting in some pretty major traffic delays, causing lots of folk to run late.

So I “guess” it must have been the truck that likely caused the delay, right ?

The kids suggest it was probably someone she met and she just got caught up having a long chat. Then again, maybe the car broke down or maybe she didn’t have enough money and had to go to town to get to the bank or maybe …

The point of course is that we simply don’t know. Any diagnosis based on the data at hand would be nothing but a guess. Yes there maybe some historical additional reference we could use (like she sometimes gets caught up with someone for a chat) but it would still be a guess to assume it was the issue this time.

There maybe some “global” events taking place that could be a factor (such as the overturned truck, or really bad weather) but again, there’s nothing to directly suggest it’s the actual issue. 

Yet how often do people when trying to determine the cause of specific performance issues turn to global, generalised, averaged to death statistics, charts and reports to diagnose the problem ? How often do people “hope” that because the weather looks bad, that it’s a likely cause of the problem. Or because the amount of traffic is more than a certain threshold or slower than a certain ratio that maybe that’s the cause of the issue.

How often do people make these (sometimes educated) guesses, get it wrong, apply a solution that makes no difference, take another stab, wrong again, yet another guess and yet another guess until eventually either they hit the right diagnosis and solution or the problem goes away (only to come back at some other point in time) …

A key issue is that many of these fancy diagnosis tools, reports and lists of statistics are used inappropriately. It’s a potential diagnostic tool but often the wrong tool for the problem at hand. A really really sharp and shiny and fancy looking saw when the problem is a “protruding nail” that needs to be hit with a hammer.

Another problem is that there’s generally no process or methodology associated with looking at these reports and charts and database wide statistics. As such they’re open to interpretation and differing views. Most databases can be improved somehow in all sorts of different ways but what is the precise reason for a specific (or general) database performance issue. Everyone is slightly sick or imperfect in some way or another (with the possible exception of David Bowie), but what exactly is it that’s killing us …

However the key problem is that often, very often, most of these diagnostic tools, reports and flashing screens don’t actually provide the necessary information to make an educated diagnosis. The answer to the problem is simply not to be found in statspack, or in the v$ views or on the radio or in the weather report. 10 people can look at a statspack report and 10 different solutions can improve the database in 10 different ways but none of them may necessarily solve the specific database performance issue that’s causing stress to the business.

The actual issue is buried and hidden and drowned out in a sea of numbers, averages, statistics and people who are all predominately able to get their milk in a timely manner.

Not that a “saw” is a totally useless tool. Jonathan Lewis recently referenced a rather nice article by Connie Green on how a saw can be used effectively for slicing through some issues. The traffic report can be a useful source of information.

However, the Oracle marketing machine has certainly been promoting many of these “shiny saws” to the point where many see them as being the tools of choice, no matter the performance issue at hand. Oracle says they provide great, “useful” information so they must be really really useful in solving my performance issues. The answer to my problems has got to be in here somewhere right, if I just know where to look ?

The problem is not necessarily with the diagnostic tools themselves but in the manner in which they’re often used and attempted to be applied to specific issues. A saw is not particularly useful at driving in a nail. You need a hammer for that …

Back to my little story.

Kids, imagine if we had a camera on mum and saw exactly what she was doing for the entire time she was away getting the milk. We could actually see where all the time was spent, see exactly what happened in the 2 hours she was away. We can account for every moment and see exactly what happened during the 1 hour and 50 minutes of “lost” time.

Then we could see that in fact, the car worked fine, she took another route to the local shops bypassing the truck, got the milk at the counter straightaway. However, when she got back to the car she had a problem unlocking the car door as the key was quite bent and got to the point where she just couldn’t open the door.

In fact, out of the 2 hours, 1 hour and 50 minutes was spent frustratingly trying to open the car door.

So it was a “locking” problem all along ;)

No guesses. No assumptions. No ifs and maybes. We know exactly the root cause of the problem.

Therefore no wasted effort and time filling the car up with petrol, no need to drive the longer way around to miss that interchange, no need to demand she stop chatting at the shops (thank goodness), none of which if applied would have actually resolved the issue, none of which would have prevented the same problem from reoccurring again next time …

And that of course is the information Extended Tracing can provide. IMHO, if only this hammer were used more often, if this tool was considered more often to knock in that “protruding nail”, if people posted an extended trace file more frequently, then it would be a big step in the right general direction in correctly diagnosing problems.

Is a 10046 event, DBMS_SUPPORT, DBMS_MONITOR, etc. perfect ? No, of course not. Although there are constant improvements with most releases, it can be difficult to setup in some environments and configurations, it can be difficult to interpret and piece together, it can tell you what the issue might be without telling why, it may only tell you that the problem isn’t Oracle related (although that in itself can be useful), it requires the issue to generally be repeatable, it has overheads, etc. etc.

However, in most scenarios, when applied appropriately, it can provide the necessary information to diagnose the exact cause of performance issues. In most scenarios, it can take the guess work out of the equation and save time by driving one directly to the correct diagnosis, first time.

I’ll add this point in as well. Most people working on other software solutions trying to resolve performance issues, would faint with disbelief at the level of instrumentation available in Oracle. No it’s not perfect, but boy, things could be a lot lot worse.

My general recommendation is this. When you want to determine and diagnose the cause of specific or general database performance issues, consider extended tracing as the first tool within the toolkit. You want to know why the milk took so long, ensure you have a camera available and record what happens.

If you want to be on the lookout for low hanging fruit, if you want to have a global view of the general road conditions, of the weather, of the fuel left in the tank, and proactively see what areas in a database may benefit from some attention, then look at using “saws” such as Statspack, ADDM, etc.

IMHO, if the Oracle diagnostic tools were used more appropriately, if more people read Connie Green’s article, if more people investigated and applied the use of extended tracing, then I’m sure the perception of their usefulness would increase as well.

Meanwhile, I’m going to use a hammer to see if I can get this damn key straightened out …

BTW, the kids think I’m some kind of private investigator who follows and monitors people all day long so I guess I need to try again in explaining what it is I actually do …

Index Compression Part III (2+2=5) February 22, 2008

Posted by Richard Foote in Index Compression, Index Internals, Oracle General, Oracle Indexes, Performance Tuning.
1 comment so far

As previously discussed, compression is only beneficial when the compressed columns have repeated values within an index leaf block. If a compressed column has repeated values, Oracle will basically store the one copy of the compressed column (or columns) as a prefix entry within a leaf block and each index row entry will then be associated with its corresponding prefix entry.

That being the case, it makes no sense in compressing a single column Unique index. Each index entry must be unique, there can’t be any repeatable, duplicate column values. Therefore compression will be not only totally ineffective, but would actually result in an index structure that increases rather than decreases in size due to the extra overheads associated with having a prefix index entry for each and every index row entry.

It also makes no sense in compressing every column in a concatenated, multi-column Unique index. For exactly the same reasons. Each compressed index column combination must be unique and would result in a prefix entry for each and every index row entry. Compression would be worse than useless and result in an increased index structure.

However, Oracle does its best to protect ourselves from ourselves.

For a start, it does not allow one to create a compressed index on a single column Unique index. Attempt to do so and Oracle generates an “ORA-25193: cannot use COMPRESS option for a single column key”. Hey, even the error message is nice and meaningful.

If the Unique index has multiple columns, the default prefix length value (number of compressed columns) is the number of indexed columns minus one, not all columns as it is for a Non-Unique index. See, Oracle is doing its best here to prevent a useless attempt at index compression.

If you specify a prefix length value equal or greater than the number of columns in a Unique index, Oracle generates an “ORA-25194: invalid COMPRESS prefix length value”. These are not restrictions but designed to stop the creation of inefficient compressed indexes.

Note however that Non-Unique indexes can be used to police primary Key (PK) and Unique Key (UK) constraints. I’ve discussed all this previously. The constraints might be Unique, the data might be unique but the index is Non-Unique and so these “protections” fly out the window. There is nothing stopping one creating a single column compressed Non-Unique index to police a PK or UK constraint. There’s nothing preventing you from creating a concatenated, Non-Unique index with all columns compressed, from policing a PK or UK constraint. In fact, you can even create a fully compressed Non-Unique index at the same time as the PK or UK constraint with Oracle’s extended constraint and index creation syntax.

Nothing stopping you, except perhaps the realisation that it would be a very bad and futile thing to implement as the resultant index will be guaranteed to be less efficient than an equivalent nocompress index.

Yes, it might make sense to compress just the leading columns of a Unique index or a Non-Unique index that’s policing a PK or UK constraint, if such leading columns have sufficient repeating values to make compression effective. But it would simply not make sense to compress all columns from such indexes.

See this demo Index Compression Part III on how compression works and doesn’t work for Unique Indexes.

In Part IV, we’ll look at the issue of what specific columns might benefit from compression and look a little closer at how storage is actually saved.

Whoever said there can’t possibly be enough things to discuss on a Blog that focuses mainly on Oracle indexes ;)

Index Compression Part II (Down Is The New Up) February 20, 2008

Posted by Richard Foote in Index Compression, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
12 comments

Compression can be very beneficial when the compressed columns of an index have many repeated values within a leaf block as these repeated values are only stored once and subsequently referred to within the index row entries themselves.

However, when the leading column is very selective or the compressed columns are very selective and have very few or possibly no repeating values in an index, then we have a problem.

In these scenarios, Oracle will create a massive prefix table, housing most if not all the individual column values anyways, as the prefix table stores all unique combinations of compressed column values within a leaf block.

The index row entries will point or refer to a prefix entry which is hardly shared (if at all) by other index row entries.

Compression in these cases is not only pointless but potentially worse than pointless as we now not only have to store a similar amount of index column data anyways, but additional compression related overheads such as those associated with the prefix table entries.

Net result can be no effective compression or indeed leaf blocks that can no longer store as many index values due to these associated compression overheads, resulting in indexes increasing rather than decreasing in size as a consequence.

For concatenated, multi-column indexes, the order of the indexed columns can make a huge difference to the compression of an index. If the leading column or columns (depending on how many columns are compressed) have many repeated column value combinations, not a problem as compression will be effective.

However if the leading column is very selective, then compression by definition will be ineffective, as there must be many distinct column values in the prefix table as a result which are less likely to shared by the index row entries.

For compression to be effective, the prefix table should have considerably fewer entries than index row entries in most leaf blocks. If there are approximately (say) 200 index entries per no-compressed leaf block, you want the number of distinct column combinations within the leaf block to be substantially less than 200.

For index compression to be feasible, ensure low cardinality columns are the leading columns in a concatenated index, else compression may be futile or worse.

See this demo on Index Compression Part II to see how compressing an index can be either effective or plain terrible and how the order of columns in a concatenated index can make all the difference to the effectiveness of index compression.

Next I’ll cover index compression with Unique Indexes …

Index Compression Part I (Low) February 17, 2008

Posted by Richard Foote in Index Compression, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
25 comments

Index compression is perhaps one of the most under used and neglected index options available. It has the potential to substantially reduce the overall size of Non-Unique indexes and multi-column Unique indexes, in some scenarios dramatically so. A smaller index, especially if it stays permanently smaller without any subsequent expensive maintenance operations, it always a nice thing. Not only will it potentially save storage, but if the resultant index contains fewer leaf blocks, that’s potentially fewer LIOs and from the Cost Based Optimizer’s point of view, potentially a cheaper execution plan option.

All possible without a single index rebuild in sight …

However like most things Oracle, index compression also has the potential to be misused and to cause rather than solve problems. So it really helps to understand how index compression works and how it’s actually implemented before we rush into anything.

The first point to understand is that index compression is implemented at the index block level. Basically, Oracle stores each distinct combination of compressed column values found within a specific index leaf block in a “Prefix” table within the leaf block and assigns each combination a unique prefix number.

The more numbers of distinct compressed column values, the more entries in the prefix table and the larger the prefixed related data. The fewer numbers of distinct compressed column values, the fewer entries in the prefix table and the smaller the prefix related data. Generally, the fewer entries in the prefix table, the better the compression.

Oracle now no longer stores these compressed columns within the actual index row entries. These compressed columns are substituted and referenced with the details stored in the prefix table entry, which are shared by all index row entries with the same corresponding prefix value.

Only leading columns (which in a Non-Unique Index can potentially be all the columns in an index, in a Unique Index can potentially be all but the last column in the index) can be compressed. Therefore, the prefix table is in the same logical order as they would appear in the index row entries. The values of the prefix values always appear within the index row entries in a sequential manner, noting that (hopefully) several row entries share the same prefix number.

Let’s say we currently have a nocompress Non-Unique index on a NAME column with the following entries:

0: David Bowie

  : ROWID

1: David Bowie

  : ROWID

2: David Bowie

  : ROWID

3: David Bowie

  : ROWID

4: David Bowie

  : ROWID

5: David Jones

  : ROWID

6: David Jones

  : ROWID

7: David Jones

  : ROWID

After the index is compressed, the leaf block entries would look logically something like this:

Prefix

0: David Bowie

1: David Jones

0:0

  : ROWID

1: 0

  : ROWID

2: 0

  : ROWID

3: 0

  : ROWID

4: 0

  : ROWID

5: 1

  : ROWID

6: 1

  : ROWID

7: 1

  : ROWID

Importantly, each distinct combination of compressed column values is now stored just the once within an individual index leaf block. In the example above, we previously stored “David Bowie” 5 times. In the compressed index leaf block, we now only store “David Bowie” once, with the prefix value now being referenced by each index entry instead.

The net result being we can now (hopefully) store many more index entries per leaf block meaning we don’t need as many leaf blocks in our index.

To compress an index, simply specify the COMPRESS option:

CREATE INDEX bowie_table_i ON bowie_table(col1, col2) COMPRESS 2;

The number after the COMPRESS keyword denotes how many columns to compress. The default is all columns in a Non-Unique index and all columns except the last column in a Unique index.

This demo, Index Compression Part I shows how an appropriately compressed index can substantially reduce the overall size of an index. It also shows a couple of index leaf block dumps to highlight how index compression is implemented.

In Part II, I’ll show you how you can really stuff things up by implementing index compression totally inappropriately …

Follow

Get every new post delivered to your Inbox.

Join 1,915 other followers