jump to navigation

Oracle 19c Automatic Indexing: Common Index Creation Trap (Rat Trap) June 30, 2020

Posted by Richard Foote in 19c, 19c New Features, ASSM, Automatic Indexing, CBO, Clustering Factor, Data Clustering, Oracle Indexes, TABLE_CACHED_BLOCKS.
1 comment so far

When I go to a customer site to resolve performance issues, one of the most common issues I encounter is in relation to inefficient SQL. And one of the most common causes for inefficient SQL I encounter is because of deficiencies the default manner by which the index Clustering Factor is calculated.

When it comes to both Automatic Indexes and in relation to the Oracle Autonomous Database Cloud Services, the “flawed” default manner by which the index Clustering Factor is calculated still applies. So we need to exercise some caution when Auto Indexes are created and the impact their default statistics can have on the performance of subsequent SQL statements.

To illustrate with a simple example, I’ll first create a table with the key column being the ID column which will be effectively unique. The table will be populated via a basic procedure that just inserts 1M rows. The procedure uses an ORDER sequence, such that the ID values are generated in a monotonically increasing manner:

SQL> create table bowie_assm (id number, code number, name varchar2(42));

Table created.

SQL> create sequence bowie_assm_seq order;

Sequence created.

Procedure created.

SQL> create or replace procedure pop_bowie_assm as
2  begin
3    for i in 1..1000000 loop
4      insert into bowie_assm values (bowie_assm_seq.nextval, mod(i,1000), 'DAVID BOWIE');
5      commit;
6    end loop;
7  end;
8  /

Procedure created.

 

However crucially, the procedure is executed by 3 different session concurrently, to simulate a multi user environment inserting into a table…

 

SQL> exec pop_bowie_assm

PL/SQL procedure successfully completed.

 

We’ll now collect statistics on the table:

 

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_ASSM');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, blocks from user_tables where table_name='BOWIE_ASSM';

TABLE_NAME        NUM_ROWS     BLOCKS
--------------- ---------- ----------
BOWIE_ASSM         3000000      12137

 

So the table has 3M rows and is 12137 blocks in size.

If we run an SQL a few times where we select only the one ID value:

 

SQL> select * from bowie_assm where id = 42;

Execution Plan
-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |     1 |    22 |  1934   (6)| 00:00:01 |
|   1 |  PX COORDINATOR              |            |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000   |     1 |    22 |  1934   (6)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |            |     1 |    22 |  1934   (6)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| BOWIE_ASSM |     1 |    22 |  1934   (6)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - storage("ID"=42)
    filter("ID"=42)

Statistics
----------------------------------------------------------
         6  recursive calls
         0  db block gets
     12138  consistent gets
         0  physical reads
         0  redo size
       707  bytes sent via SQL*Net to client
       588  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         1  rows processed

 

The execution plan shows a Full Table Scan (FTS) is invoked, the only choice the CBO has without an index on the ID column. Clearly an index on the ID column would make the plan substantially more efficient with just 1 row selected from a 3M row table. Hopefully, Automatic Indexing will come to our rescue, so let’s check out the subsequent Automatic Indexing Report:

 

REPORT

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                              : 1
Indexes created (visible / invisible)         : 1 (1 / 0)
Space used (visible / invisible)              : 58.72 MB (58.72 MB / 0 B)
Indexes dropped                               : 0
SQL statements verified                       : 2
SQL statements improved (improvement factor)  : 1 (1.2x)
SQL plan baselines created                    : 0
Overall improvement factor                    : 1.1x

-------------------------------------------------------------------------------
INDEX DETAILS
-------------------------------------------------------------------------------

The following indexes were created:
-------------------------------------------------------------------------
| Owner | Table      | Index                | Key | Type   | Properties |
-------------------------------------------------------------------------
| BOWIE | BOWIE_ASSM | SYS_AI_2w1pss6qbdz6z | ID  | B-TREE | NONE       |
-------------------------------------------------------------------------

So yes indeed, an Automatic Index (SYS_AI_2w1pss6qbdz6z) was created on the ID column.

If we look at the default Clustering Factor of this index:

 

SQL> select index_name, auto, constraint_index, visibility, status, clustering_factor from user_indexes where table_name='BOWIE_ASSM';

INDEX_NAME           AUT CON VISIBILIT STATUS   CLUSTERING_FACTOR
-------------------- --- --- --------- -------- -----------------
SYS_AI_2w1pss6qbdz6z YES NO  VISIBLE   VALID              2504869

 

We notice the Clustering Factor is relatively high at 2504869, much higher than the 12137 number of blocks in the table.

But if the ID column in the table has been loaded via a monotonically increasing sequence, doesn’t that mean the ID values have been inserted in approximately in ID order? If so, doesn’t that mean the ID column should have a “good” Clustering Factor” as the order of the rows in the table matches the order of the indexed values in the ID index?

Clearly not.

The reason being that the table is stored in the default Automatic Segment Space Management (ASSM) tablespace type, which is designed to avoid contention by concurrent inserts from different sessions. Therefore each of the 3 sessions inserting into the table are each assigned to different table blocks, resulting in the rows not being precisely inserted in ID order. It’s very close to ID order, the the ID values clustered within a few blocks from each other, but not precisely stored in ID order.

However, by default, the Clustering Factor is calculated by reading each index entry and determining if it references a ROWID that accesses a table block different from the PREVIOUS index entry. If it does differ, it increments the Clustering Factor, if it doesn’t differ and accesses the same table block as the previous index entry, the Clustering Factor is NOT incremented.

So in theory, we could have 100 rows that reside in just 2 different table blocks, but if the odd IDs live in one block and the even IDs live in the other block, meaning that each ID is stored in a different table block to the previous, the Clustering Factor would have a value of 100 for these 100 rows, even though they only occupy 2 table blocks. The Clustering Factor is therefore much higher than in reality it should be as ultimately only 2 different table blocks are accessed within a negligible time from each other.

This is the “flaw” with how the default Clustering Factor is calculated. By noting if a table block access differs only from the previous table block accessed, it leaves the Clustering Factor calculation susceptible to exaggerated high values when the data really is relatively well clustered within the table.

If we run the same SQL as previously which only selects one ID value:

 

SQL> select * from bowie_assm where id = 42;

Execution Plan
--------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |     1 |    22 |     4   (0)| 00:00:01 |
|   1 |  PX COORDINATOR                       |                      |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10001             |     1 |    22 |     4   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_ASSM           |     1 |    22 |     4   (0)| 00:00:01 |
|   4 |     BUFFER SORT                       |                      |       |       |            |          |
|   5 |      PX RECEIVE                       |                      |     1 |       |     3   (0)| 00:00:01 |
|   6 |       PX SEND HASH (BLOCK ADDRESS)    | :TQ10000             |     1 |       |     3   (0)| 00:00:01 |
|   7 |        PX SELECTOR                    |                      |       |       |            |          |
|*  8 |           INDEX RANGE SCAN            | SYS_AI_2w1pss6qbdz6z |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - access("ID"=42)

Statistics
----------------------------------------------------------
        12  recursive calls
         0  db block gets
         4  consistent gets
         0  physical reads
         0  redo size
       707  bytes sent via SQL*Net to client
       588  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         2  sorts (memory)
         0  sorts (disk)
         1  rows processed

 

The CBO now uses the new Automatic Index as with just one row, the index is clearly more efficient regardless of the Clustering Factor value.

However, if we now run a query that selects a range of ID values, in this example between 42 and 4242 which represents only a relatively low 0.14% of the table:

 

SQL> select * from bowie_assm where id between 42 and 4242;

4201 rows selected.

Execution Plan
-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |  4202 | 92444 |  1934   (6)| 00:00:01 |
|   1 |  PX COORDINATOR              |            |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000   |  4202 | 92444 |  1934   (6)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |            |  4202 | 92444 |  1934   (6)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| BOWIE_ASSM |  4202 | 92444 |  1934   (6)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - storage("ID"<=4242 AND "ID">=42)
    filter("ID"<=4242 AND "ID">=42)

Statistics
----------------------------------------------------------
         8  recursive calls
         4  db block gets
     12138  consistent gets
         0  physical reads
         0  redo size
     54767  bytes sent via SQL*Net to client
       588  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
      4201  rows processed

 

The CBO decides to use a Full Table Scan as it deems the index with the massive Clustering Factor to be too expensive, with it having to visit differing blocks for the majority of the estimated 4202 rows (note at 4201 actual rows returned, this estimate by the CBO is practically spot on).

If we force the use of the index via an appropriate hint:

 

SQL> select /*+ index (bowie_assm) */ * from bowie_assm where id between 42 and 4242;

4201 rows selected.

Execution Plan
--------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |  4202 | 92444 |  3530   (1)| 00:00:01 |
|   1 |  PX COORDINATOR                       |                      |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10001             |  4202 | 92444 |  3530   (1)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_ASSM           |  4202 | 92444 |  3530   (1)| 00:00:01 |
|   4 |     BUFFER SORT                       |                      |       |       |            |          |
|   5 |      PX RECEIVE                       |                      |  4202 |       |    12   (0)| 00:00:01 |
|   6 |       PX SEND HASH (BLOCK ADDRESS)    | :TQ10000             |  4202 |       |    12   (0)| 00:00:01 |
|   7 |        PX SELECTOR                    |                      |       |       |            |          |
|*  8 |         INDEX RANGE SCAN              | SYS_AI_2w1pss6qbdz6z |  4202 |       |    12   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - access("ID">=42 AND "ID"<=4242)

Statistics
----------------------------------------------------------
        12  recursive calls
         0  db block gets
        26  consistent gets
         0  physical reads
         0  redo size
     54767  bytes sent via SQL*Net to client
       588  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         2  sorts (memory)
         0  sorts (disk)
      4201  rows processed

 

Note at an estimated cost of 3530, this is greater than the 1934 cost of the FTS which explains why the CBO decides the FTS is best. However, if we look at the number of Consistent Gets, it’s only 26, meaning the CBO is actually getting these costs way wrong.

Why?

Because of the grossly inflated Clustering Factor.

As I’ve discussed previously, Oracle 12.1 introduced a new TABLE_CACHED_BLOCKS preference. Rather than the default value of 1, we can set this to any value up to 255. When calculating the Clustering Factor during statistics collection, it will NOT increment the Clustering Factor if the index visits a table block again that was one of the last “x” distinct table blocks visited. So by setting TABLE_CACHED_BLOCKS to (say) 42, if the index visits a block that was one of the last 42 distinct table blocks previously visited, don’t now increment the Clustering Factor. This can therefore generate a much more “accurate” Clustering Factor which can be significantly smaller than previously. This in turn makes the index much more efficient to the CBO because it then estimates far fewer table blocks need be accessed during a range scan.

So let’s change the TABLE_CACHED_BLOCKS value for this table to 42 (don’t increment now the Clustering Factor value when collecting statistics if we visit again any of the last 42 differently accessed table blocks) and recollect the segment statistics:

 

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE_ASSM', pname=>'TABLE_CACHED_BLOCKS', pvalue=>42);

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_ASSM', cascade=>true);

PL/SQL procedure successfully completed.

 

If we now examine the new Clustering Factor value:

 

SQL> select index_name, auto, constraint_index, visibility, status, clustering_factor from user_indexes

where table_name='BOWIE_ASSM';

INDEX_NAME           AUT CON VISIBILIT STATUS   CLUSTERING_FACTOR
-------------------- --- --- --------- -------- -----------------
SYS_AI_2w1pss6qbdz6z YES NO  VISIBLE   VALID                11608

 

We can see that at just 11608 it’s substantially less than the previous 2504869.

If we now rerun the previous range scan SQL without the hint:

 

SQL> select * from bowie_assm where id between 42 and 4242;

4201 rows selected.

Execution Plan
--------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |  4202 | 92444 |    30   (4)| 00:00:01 |
|   1 |  PX COORDINATOR                       |                      |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10001             |  4202 | 92444 |    30   (4)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_ASSM           |  4202 | 92444 |    30   (4)| 00:00:01 |
|   4 |     BUFFER SORT                       |                      |       |       |            |          |
|   5 |      PX RECEIVE                       |                      |  4202 |       |    12   (0)| 00:00:01 |
|   6 |       PX SEND HASH (BLOCK ADDRESS)    | :TQ10000             |  4202 |       |    12   (0)| 00:00:01 |
|   7 |        PX SELECTOR                    |                      |       |       |            |          |
|*  8 |           INDEX RANGE SCAN            | SYS_AI_2w1pss6qbdz6z |  4202 |       |    12   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - access("ID">=42 AND "ID"<=4242)

Statistics
----------------------------------------------------------
        12  recursive calls
         0  db block gets
        26  consistent gets
         0  physical reads
         0  redo size
     54767  bytes sent via SQL*Net to client
       588  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         2  sorts (memory)
         0  sorts (disk)
      4201  rows processed

 

We can see the CBO now automatically uses the new Automatic Index. At a new cost of just 30, it’s substantially less than the previous index cost of 3530 and now much less than the 1934 for the FTS and so why the index is now automatically chosen by the CBO.

When Automatic Indexes are created, it’s usually a good idea to check on the Clustering Factor and because default ASSM tablespaces have a tendency to significantly escalate the values of index Clustering Factors, to look at recalculating them with an non-default setting of the TABLE_CACHED_BLOCKS statistics collection preference.

Of course, not only is this a good idea for Automatic Indexes, but for manually created indexes as well.

Although no doubt Autonomous Database Cloud services will look at these issues in the future, such self-tuning capabilities are not currently available. You will need to go in there and make these changes as necessary to fix the root issues with such inefficient SQL statements…

Oracle 19c Automatic Indexing: A More Complex Example (How Does The Grass Grow) June 16, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, CBO.
3 comments

In this post I’m going to put together in a slightly more complex SQL example a number of the concepts I’ve covered thus far in relation to the current implementation of Oracle Automatic Indexing.

I’ll begin by creating three tables, a larger TABLE1 and two smaller TABLE2 and TABLE3 lookup tables. Each table is created with only a Primary Key, Unique index and currently have no secondary indexes (this demo was run in an ATP Autonomous Cloud environment hence the odd parallel execution plans):

SQL> create table table1 (id number not null, code1 number not null, grade1 number not null, name1 varchar2(42));

Table created.

SQL> insert into table1 select rownum, mod(rownum, 1000)+1, mod(rownum, 200000)+1, 'David Bowie '|| rownum
from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create unique index table1_id_i on table1(id);

Index created.

SQL> alter table table1 add primary key(id);

Table altered.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'TABLE1');

PL/SQL procedure successfully completed.


SQL> create table table2 (id number not null, stuff number not null, name2 varchar2(42));

Table created.

SQL> insert into table2 select rownum, mod(rownum, 100)+1, 'Ziggy Stardust ' || rownum from dual connect by level <=10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> create unique index table2_id_i on table2(id);

Index created.

SQL> alter table table2 add primary key(id);

Table altered.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'TABLE2');

PL/SQL procedure successfully completed.


SQL> create table table3 (id number not null, code3 number not null, name3 varchar2(42));

Table created.

SQL> insert into table3 select rownum, mod(rownum, 500)+1, 'Thin White Duke ' || rownum from dual connect by level <=1000;

1000 rows created.

SQL> commit;

Commit complete.

SQL> create unique index table3_id_i on table3(id);

Index created.

SQL> alter table table3 add primary key(id);

Table altered.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'TABLE3');

PL/SQL procedure successfully completed.

 

I’ll next run the following “complex” query a number of times:

 

SQL> select code1, grade1, name1, name2 from table1, table2
where table1.code1=table2.id and
table1.grade1 in (select table3.id from table3 where table3.code3=42);

10 rows selected.

Execution Plan
------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name        | Rows  | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                 |             |    10 |   600 |     857  (7)| 00:00:01 |
|   1 |  PX COORDINATOR                  |             |       |       |             |          |
|   2 |   PX SEND QC (RANDOM)            | :TQ10000    |    10 |   600 |     857  (7)| 00:00:01 |
|   3 |    NESTED LOOPS                  |             |    10 |   600 |     857  (7)| 00:00:01 |
|   4 |     NESTED LOOPS                 |             |    10 |   600 |     857  (7)| 00:00:01 |
|*  5 |      HASH JOIN                   |             |    10 |   360 |     852  (7)| 00:00:01 |
|   6 |       JOIN FILTER CREATE         | :BF0000     |     2 |    16 |       2  (0)| 00:00:01 |
|*  7 |        TABLE ACCESS STORAGE FULL | TABLE3      |     2 |    16 |       2  (0)| 00:00:01 |
|   8 |       JOIN FILTER USE            | :BF0000     |  1000K|    26M|     833  (5)| 00:00:01 |
|   9 |        PX BLOCK ITERATOR         |             |  1000K|    26M|     833  (5)| 00:00:01 |
|* 10 |   TABLE ACCESS STORAGE FULL      | TABLE1      |  1000K|    26M|     833  (5)| 00:00:01 |
|* 11 |      INDEX UNIQUE SCAN           | TABLE2_ID_I |     1 |       |       0  (0)| 00:00:01 |
|  12 |     TABLE ACCESS BY INDEX ROWID  | TABLE2      |     1 |    24 |       1  (0)| 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

      5 - access("TABLE1"."GRADE1"="TABLE3"."ID")
      7 - storage("TABLE3"."CODE3"=42)
          filter("TABLE3"."CODE3"=42)
     10 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"TABLE1"."GRADE1"))
          filter(SYS_OP_BLOOM_FILTER(:BF0000,"TABLE1"."GRADE1"))
     11 - access("TABLE1"."CODE1"="TABLE2"."ID")

Statistics
----------------------------------------------------------
         6  recursive calls
         0  db block gets
      5777  consistent gets
         0  physical reads
         0  redo size
      1224  bytes sent via SQL*Net to client
       588  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
        10  rows processed

 

We note there are three predicates listed in which a column access could potentially benefit from an index being created:

table1.code1=table2.id (note that table2.id already has a unique index defined)

table1.grade1 in (Select…)

table3.code3=42

 

Let’s have a look at the corresponding Automatic Indexing report to see what the Oracle Automatic Indexing process made of this example:

 

REPORT

--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------

Activity start                  : 26-JUN-2019 08:56:08
Activity end                    : 26-JUN-2019 08:56:53
Executions completed            : 1
Executions interrupted          : 0
Executions with fatal error     : 0

-------------------------------------------------------------------------------
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------

Index candidates                                : 3
Indexes created (visible / invisible)           : 2 (2 / 0)
Space used (visible / invisible)                : 18.94 MB (18.94 MB / 0 B)
Indexes dropped                                 : 0
SQL statements verified                         : 2
SQL statements improved (improvement factor)    : 1 (192.7x)
SQL plan baselines created                      : 0
Overall improvement factor                      : 170.2x

-------------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------

Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

 

We note there are 3 index candidates that were considered, BUT only 2 new indexes were actually created. Overall, the created indexes resulted in an estimated 192.7x improvement in the above SQL performance.

If we look further on in the report and at the actual indexes created:

 

INDEX DETAILS

-------------------------------------------------------------------------------
The following indexes were created:
------------------------------------------------------------------------
| Owner | Table  | Index                | Key    | Type   | Properties |
------------------------------------------------------------------------
| BOWIE | TABLE1 | SYS_AI_0p5nn18dt9bn1 | GRADE1 | B-TREE | NONE       |
| BOWIE | TABLE3 | SYS_AI_b4ntgxt6pdbfh | CODE3  | B-TREE | NONE       |
------------------------------------------------------------------------

VERIFICATION DETAILS

-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------

Parsing Schema Name  : BOWIE
SQL ID               : 21v8yqs9jrnzm
SQL Text             : select code1, grade1, name1, name2 from table1, table2
                       where table1.code1=table2.id and table1.grade1 in (select table3.id from table3 where table3.code3=42)
Improvement Factor   : 192.7x

Execution Statistics:
-----------------------------

                                Original Plan                 Auto Index Plan
                                ----------------------------  ----------------------------

Elapsed Time (s):               299396                        2170
CPU Time (s):                   289510                        1036
Buffer Gets:                    28912                         43
Optimizer Cost:                 857                           27
Disk Reads:                     0                             4
Direct Writes:                  0                             0
Rows Processed:                 50                            10
Executions:                     5                             1

 

We note the report states the two new indexes are created on the TABLE1.GRADE1 and TABLE3.CODE3 columns.

 

If look down further in the report and compare the before and after execution plans in the PLANS SECTION:

 

PLANS SECTION

- Original

-----------------------------
Plan Hash Value  : 1597150496
------------------------------------------------------------------------------------------------
| Id | Operation                          | Name        | Rows    | Bytes    | Cost | Time     |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                   |             |         |          |  857 |          |
|  1 |  PX COORDINATOR                    |             |         |          |      |          |
|  2 |    PX SEND QC (RANDOM)             | :TQ10000    |      10 |      600 |  857 | 00:00:01 |
|  3 |     NESTED LOOPS                   |             |      10 |      600 |  857 | 00:00:01 |
|  4 |      NESTED LOOPS                  |             |      10 |      600 |  857 | 00:00:01 |
|  5 |       HASH JOIN                    |             |      10 |      360 |  852 | 00:00:01 |
|  6 |        JOIN FILTER CREATE          | :BF0000     |       2 |       16 |    2 | 00:00:01 |
|  7 |         TABLE ACCESS STORAGE FULL  | TABLE3      |       2 |       16 |    2 | 00:00:01 |
|  8 |        JOIN FILTER USE             | :BF0000     | 1000000 | 28000000 |  833 | 00:00:01 |
|  9 |         PX BLOCK ITERATOR          |             | 1000000 | 28000000 |  833 | 00:00:01 |
| 10 |  TABLE ACCESS STORAGE FULL         | TABLE1      | 1000000 | 28000000 |  833 | 00:00:01 |
| 11 |       INDEX UNIQUE SCAN            | TABLE2_ID_I |       1 |          |    0 |          |
| 12 |      TABLE ACCESS BY INDEX ROWID   | TABLE2      |       1 |       24 |    1 | 00:00:01 |
------------------------------------------------------------------------------------------------

- With Auto Indexes

-----------------------------
Plan Hash Value  : 2014256176
----------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                 | Rows | Bytes | Cost | Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                      |   10 |   600 |   27 | 00:00:01 |
|   1 |   NESTED LOOPS                           |                      |   10 |   600 |   27 | 00:00:01 |
|   2 |    NESTED LOOPS                          |                      |   10 |   600 |   27 | 00:00:01 |
|   3 |     NESTED LOOPS                         |                      |   10 |   360 |   17 | 00:00:01 |
|   4 |      TABLE ACCESS BY INDEX ROWID BATCHED | TABLE3               |    2 |    16 |    3 | 00:00:01 |
| * 5 |       INDEX RANGE SCAN                   | SYS_AI_b4ntgxt6pdbfh |    2 |       |    1 | 00:00:01 |
|   6 |      TABLE ACCESS BY INDEX ROWID BATCHED | TABLE1               |    5 |   140 |    7 | 00:00:01 |
| * 7 |       INDEX RANGE SCAN                   | SYS_AI_0p5nn18dt9bn1 |    5 |       |    2 | 00:00:01 |
| * 8 |     INDEX UNIQUE SCAN                    | TABLE2_ID_I          |    1 |       |    0 |          |
|   9 |    TABLE ACCESS BY INDEX ROWID           | TABLE2               |    1 |    24 |    1 | 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------

* 5 - access("TABLE3"."CODE3"=42)
* 7 - access("TABLE1"."GRADE1"="TABLE3"."ID")
* 8 - access("TABLE1"."CODE1"="TABLE2"."ID")

 

We can see that the new plan uses both the new automatic indexes and has a new improved cost of 27 (down from 857).

 

If we look at all the indexes that have been created on these tables:

 

SQL> select table_name, index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name like 'TABLE%';

TABLE_NAME   INDEX_NAME             AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------ ---------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
TABLE1       TABLE1_ID_I            NO  NO  VISIBLE   DISABLED      VALID       1000000        2088              5202
TABLE1       SYS_AI_85dcbcnkgj0w0   YES NO  INVISIBLE DISABLED      UNUSABLE    1000000        2171            415525
TABLE1       SYS_AI_0p5nn18dt9bn1   YES NO  VISIBLE   DISABLED      VALID       1000000        2221           1000000
TABLE2       TABLE2_ID_I            NO  NO  VISIBLE   DISABLED      VALID         10000          20                44
TABLE3       TABLE3_ID_I            NO  NO  VISIBLE   DISABLED      VALID          1000           2                 5
TABLE3       SYS_AI_b4ntgxt6pdbfh   YES NO  VISIBLE   DISABLED      VALID          1000           3               999

SQL> select table_name, index_name, column_name, column_position from user_ind_columns
where table_name like 'TABLE%' order by table_name, index_name, column_position;

TABLE_NAME   INDEX_NAME             COLUMN_NAME     COLUMN_POSITION
------------ ---------------------- --------------- ---------------
TABLE1       SYS_AI_0p5nn18dt9bn1   GRADE1                        1
TABLE1       SYS_AI_85dcbcnkgj0w0   CODE1                         1
TABLE1       TABLE1_ID_I            ID                            1
TABLE2       TABLE2_ID_I            ID                            1
TABLE3       SYS_AI_b4ntgxt6pdbfh   CODE3                         1
TABLE3       TABLE3_ID_I            ID                            1

 

We note that both indexes listed as created in the Auto Indexing report on the TABLE1.GRADE1 (and TABLE3.CODE3 columns are both listed as being VISIBLE and VALID. Both of these indexes have been proven to improve the performance of the SQL statement by reducing the number of logical reads.

 

However, there is also an Auto Indexed defined on the other potential indexed column table1.code1, called “SYS_AI_85dcbcnkgj0w0” that has been left in an INVISIBLE, UNUSABLE state. This index has been shown to be ineffective in improving SQL performance and has been converted back to an UNUSABLE state.

If we run the query again and look at the resultant execution plan:

 

SQL> select code1, grade1, name1, name2
from table1, table2
where table1.code1=table2.id and
table1.grade1 in (select table3.id from table3 where table3.code3=24);

10 rows selected.

Execution Plan
-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                      |    10 |   600 |    15   (0)| 00:00:01 |
|   1 |  PX COORDINATOR                          |                      |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)                    | :TQ10000             |    10 |   600 |    15   (0)| 00:00:01 |
|   3 |    NESTED LOOPS                          |                      |    10 |   600 |    15   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                         |                      |    10 |   600 |    15   (0)| 00:00:01 |
|   5 |      NESTED LOOPS                        |                      |    10 |   360 |    10   (0)| 00:00:01 |
|   6 |       PX BLOCK ITERATOR                  |                      |       |       |            |          |
|*  7 |        TABLE ACCESS STORAGE FULL         | TABLE3               |     2 |    16 |     2   (0)| 00:00:01 | 
|   8 |       TABLE ACCESS BY INDEX ROWID BATCHED| TABLE1               |     5 |   140 |     4   (0)| 00:00:01 |
|*  9 |        INDEX RANGE SCAN                  | SYS_AI_0p5nn18dt9bn1 |     5 |       |     1   (0)| 00:00:01 |
|* 10 |      INDEX UNIQUE SCAN                   | TABLE2_ID_I          |     1 |       |     0   (0)| 00:00:01 |
|  11 |     TABLE ACCESS BY INDEX ROWID          | TABLE2               |     1 |    24 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   7 - storage("TABLE3"."CODE3"=24)
       filter("TABLE3"."CODE3"=24)
   9 - access("TABLE1"."GRADE1"="TABLE3"."ID")
  10 - access("TABLE1"."CODE1"="TABLE2"."ID")

Statistics
----------------------------------------------------------
        10  recursive calls
         4  db block gets
        59  consistent gets
         0  physical reads
         0  redo size
      1122  bytes sent via SQL*Net to client
       588  bytes received via SQL*Net from client
         2  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
        10  rows processed

 

We note the new execution plan now only uses one of newly created Automatic Indexes (SYS_AI_0p5nn18dt9bn1) and at just 59 consistent gets, is significantly more efficient than it was previously where it required 5777 consistent gets.

However, the other VISIBLE automatic index (SYS_AI_b4ntgxt6pdbfh on the BOWIE3 table) is NOT actually used in the new plan. Why?

Because as discussed previously, it’s entirely possible for the Auto Indexing process to consider a new plan with Auto Index to be more efficient because it uses less consistent gets BUT the CBO not use the plan because the plan has a higher cost. The plan generated by the Auto Index process has a cost of 27 and uses 43 buffer gets but the CBO uses the plan without the second Auto Index because it has a reduced cost of only 15, even though it uses 59 consistent gets.

So the Auto Indexing process can create any number of possible indexes for a particular query and may independently ultimately determine different states for the various candidate indexes and so only create and keep the necessary indexes to sufficiently improve an SQL.

We now have an SQL statement that automatic runs much more efficiently without human intervention thanks to these automatically created indexes…