jump to navigation

BLEVEL 1 => BLEVEL 2 (Teenage Wildlife) August 23, 2011

Posted by Richard Foote in BLEVEL, CBO, Oracle Indexes.
trackback

Jonathan Lewis recently wrote a really nice blog piece blevel=1 on the dangers of an index toggling between BLEVEL 1 and BLEVEL 2. I thought it would be useful to demonstrate this issue with a quick demo (Note: this example is on 11.2.0.1, with an 8K block size).
 
First, create a simple little table with 336,000 rows and an index on an ID number column:

  
SQL> create table major_tom (id number, code number, name varchar2(30));
 
Table created.
 
SQL> create index major_tom_i on major_tom(id);
 
Index created.
 
SQL> insert into major_tom select rownum, mod(rownum,100), 'GROUND CONTROL' from dual connect by level <=336000;
 
336000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'MAJOR_TOM', cascade=> true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> select blevel, leaf_blocks, clustering_factor from dba_indexes where index_name='MAJOR_TOM_I';
 
    BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
         1         671              1296

 
Note the index has 671 leaf blocks and a Blevel=1. This of course means the index basically consists of a Root Block, which in turn references all its 671 leaf blocks. Therefore to read a specific index entry, requires a read of the index root block followed by a read of the specific index leaf block. That’s 2 reads in total.
 
Let’s run a query to return one row (note the ID column is effectively unique although I’ve only created a non-unique index):
 

SQL> select * from major_tom where id = 42;
 

Execution Plan
----------------------------------------------------------
Plan hash value: 4155681103
 
-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    23 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| MAJOR_TOM   |     1 |    23 |     2   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | MAJOR_TOM_I |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID"=42)
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        531  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

 
 

Note: The cost of using the index is just 1 not 2 as perhaps expected. This is due to the CBO ignoring the Blevel in its calculations when the Blevel = 1.  As the index is relatively small, the CBO takes the approach that the root block is very likely already cached and so is not worth costing.
 
As the data is perfectly evenly distributed and effectively unique, the CBO has correctly estimated the number of returned rows as just 1. Therefore, the overall cost of the execution plan is just 2, 1 to read the leaf block and 1 to read the table block.
 
Notice that the number of consistent gets is 4. 1 to read the index root block, 1 for the index leaf block, 1 for the table block and as the index is non-unique, 1 for an additional fetch performed to check the index again that there are no further rows to be returned.
 
If we now create another table of 1M rows that will be used in a join operation:
 

 
SQL> create table ziggy (id number, code number, name varchar2(30));
 
Table created.
 
SQL> insert into ziggy select rownum, mod(rownum,10000), 'ZIGGY STARDUST' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'ZIGGY', cascade=> true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

If we now join these 2 tables and select a moderate number of rows:
 

 
SQL> select * from ziggy z, major_tom m where z.id = m.id and z.code in (42, 4242);
 
68 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 2011771477
 
--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |             |   200 |  9400 |  1372   (2)| 00:00:17 |
|   1 |  NESTED LOOPS                |             |       |       |            |          |
|   2 |   NESTED LOOPS               |             |   200 |  9400 |  1372   (2)| 00:00:17 |
|*  3 |    TABLE ACCESS FULL         | ZIGGY       |   200 |  4800 |  1105   (2)| 00:00:14 |
|*  4 |    INDEX RANGE SCAN          | MAJOR_TOM_I |     1 |       |     1   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| MAJOR_TOM   |     1 |    23 |     2   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - filter("Z"."CODE"=42 OR "Z"."CODE"=4242)
   4 - access("Z"."ID"="M"."ID")
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       4175  consistent gets
       4024  physical reads
          0  redo size
       1950  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         68  rows processed

The CBO goes for a Nested Loop join, primarily because the inner table is only accessed a relatively small number of times AND because the cost of doing so via the index is so damn cheap.
 
However, if we add just a few more rows and collect fresh statistics …
 

 
SQL> insert into major_tom select rownum+336000, mod(rownum,100), 'GROUND CONTROL' from dual connect by level <=500;
 
500 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'MAJOR_TOM', cascade=> true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 

SQL> select blevel, leaf_blocks, clustering_factor from dba_indexes where index_name='MAJOR_TOM_I';
 
    BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
         2         672              1298

The index has now toggled over to become a Blevel 2 index. We only added a handful of rows resulting in just the one additional index leaf block, but 672 is just too many to be referenced within the one index root block in this example. The root block has split, two new index branches have been created that now reference the leaf blocks and the root block now only references the two new branch blocks.
 
Overall, the changes are quite minor but the ramifications can be quite dramatic …
 
If we now select one row again:
 

 
SQL> select * from major_tom where id = 42;
 

Execution Plan
----------------------------------------------------------
Plan hash value: 4155681103
 
-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     1 |    23 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| MAJOR_TOM   |     1 |    23 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | MAJOR_TOM_I |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID"=42)
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
        531  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

The cost of the index has now jumped up by 2 from 1 to 3 with the overall costs up from 2 to 4, even though the index is practically the same size. As the Blevel is now 2, the CBO now includes the cost of the Blevel in its calculations. The cost associated with accessing the root block and a branch block all now count. Although overall the costs are still low, this increase actually represents a 100% increase in the use of the index for an equality search.
 
This increase can be significant if the index needs to be accessed multiple times. Let’s now re-run the join query:
 

 
SQL> select * from ziggy z, major_tom m where m.id = z.id and z.code in (42, 4242);
 
68 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 1928189744
 
--------------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |   200 |  9400 |  1485   (2)| 00:00:18 |
|*  1 |  HASH JOIN         |           |   200 |  9400 |  1485   (2)| 00:00:18 |
|*  2 |   TABLE ACCESS FULL| ZIGGY     |   200 |  4800 |  1105   (2)| 00:00:14 |
|   3 |   TABLE ACCESS FULL| MAJOR_TOM |   336K|  7558K|   378   (1)| 00:00:05 |
--------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("M"."ID"="Z"."ID")
   2 - filter("Z"."CODE"=42 OR "Z"."CODE"=4242)
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       5366  consistent gets
       4024  physical reads
          0  redo size
       1964  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         68  rows processed

Although it’s retrieving exactly the same data, the execution plan has changed significantly. The Nested Loop join is no longer as appealing to the CBO as the cost of accessing the inner table via the index has now effectively doubled. The CBO has now gone for a Hash Join, accessing both tables via Full Tables Scans. The overall cost of the Nested Loop plan was 1372, but this has increased to over 1485, the cost of the now so-called more efficient Hash Join plan.
 
If you have indexes that are on the boundary of increasing from a blevel=1 to a blevel=2, execution plans can potentially change significantly based on the differences in how indexes get costed. This can be especially troublesome when such indexes get regularly rebuilt as they may toggle between Blevel 1 and 2 based on associated space savings and can sometimes result in unpredictable performance depending on when new statistics get collected.
 
I liken it to a child growing up from being a young kid to a teenager. It may only be a difference of a year or so but boy, can the differences be dramatic !!

Comments»

1. Yogesh Tiwari - August 28, 2011

Interesting insight. I wonder how we know if an index on boundary of blevel…?

-Yogi

Like

Richard Foote - August 29, 2011

Hi Yogesh

You can get some idea based on the average key length and the number of leaf blocks. A dump of the root block and the amount of free space will also give you an idea of what proportion of the index there is to go (for example, 1/2 the space in root block is free means the index needs to roughly double).

Like

2. Anand - September 2, 2011

Hi Richard,

Interesting stuff !!! 🙂

After 500 rows inserted and stats gathered —

SQL> select blevel, leaf_blocks, clustering_factor from dba_indexes where index_name=’MAJOR_TOM_I’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 672 1298

The plan showed hash join as stated by you. Then i did an index rebuild

SQL> alter index MAJOR_TOM_I rebuild;

Index altered.

SQL> select blevel, leaf_blocks, clustering_factor from dba_indexes where index_name=’MAJOR_TOM_I’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 749 1298

The LEAF_BLOCKS increased. Why so?

Regards,
Anand

Like

Richard Foote - September 3, 2011

Hi Anand

People sometimes don’t realise that by rebuilding an index, you can actually make it bigger and less efficient.

In this case, you had a monotonically increasing value which meant the index was performing 90-10 splits and hence had 100% usage.

However, once you performed the rebuild, the default pctfree was no doubt 10% and so you’ve now suddenly introduced 10% free space in your leaf blocks and decreased usage to just 90% (and so increased the overall size of the index by 10% or so).

Like


Leave a comment