jump to navigation

Oracle 19c Automatic Indexing: Mixing Manual and Automatic Indexes Part I (I Can’t Read) April 21, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, CBO, Clustering Factor, Mixing Auto and Manual Indexes, Oracle Indexes.
1 comment so far

tin machine album

In previous articles, I discussed how Automatic Indexing has the capability to add columns or reorder the column list of previously created Automatic Indexes. However, how does Automatic Indexing handle these types of scenarios with regard to existing manually created indexes?

To investigate, let’s create a table identical to the table I created in my previous blog post where Automatic Indexing created an index that was ultimately not used by the CBO because although Automatic Indexing finds the new index more efficient, the CBO costs it as being too expensive and ignores it.

SQL> create table major_tom5 (id number, code1 number, code2 number, code3 number, name varchar2(42));

Table created.

SQL> insert into major_tom5 select rownum, mod(rownum, 1000)+1, ceil(dbms_random.value(0, 100)), ceil(dbms_random.value(0, 10)),  'David Bowie' from dual connect by level = 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'MAJOR_TOM5');

PL/SQL procedure successfully completed.

However, in this demo, I’m going to first create a manual index, but with the column list in CODE3, CODE2 order. This is the opposite order in which a default Automatic Index would be created (CODE2, CODE3 order) as this is the order of the columns in the table definition:

SQL> create index major_tom5_code3_code2_i on major_tom5(code3, code2);

Index created.

SQL&gt; select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor <span style="color:var(--color-text);">from user_indexes where table_name='MAJOR_TOM5';</span>

INDEX_NAME                AUT CON VISIBILIT COMPRESSION   STATUS   NUM_ROWS   LEAF_BLOCKS CLUSTERING_FACTOR
------------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
MAJOR_TOM5_CODE3_CODE2_I  NO  NO  VISIBLE   DISABLED      VALID      10000000       24181           8974538

The resultant index has a terrible Clustering Factor of 8974538 on a 10M row table.

If we run the following query with filtering predicates on these 2 indexed columns:

SQL> select * from major_tom5 where code3=4 and code2=42;

10051 rows selected.

Execution Plan
-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |  9982 |   272K|  7355   (7)| 00:00:01 |
|   1 |  PX COORDINATOR              |            |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000   |  9982 |   272K|  7355   (7)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |            |  9982 |   272K|  7355   (7)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| MAJOR_TOM5 |  9982 |   272K|  7355   (7)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   4 - storage("CODE2"=42 AND "CODE3"=4)
       filter("CODE2"=42 AND "CODE3"=4)

Statistics
----------------------------------------------------------
         6  recursive calls
         0  db block gets
     45888  consistent gets
        68  physical reads
      5256  redo size
    149822  bytes sent via SQL*Net to client
       610  bytes received via SQL*Net from client
         4  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
     10051  rows processed

The CBO decides to NOT use the available index as it deems it too expensive, especially with such a poor Clustering Factor, to return the resultant 10,051 rows.

But what will Automatic Indexing do now. If we wait the 15 minute period until the next Automatic Indexing period and look at the resultant Automatic Indexing report:

INDEX DETAILS

-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
---------------------------------------------------------------------------------
| Owner | Table      | Index                | Key         | Type   | Properties |
---------------------------------------------------------------------------------
| BOWIE | MAJOR_TOM5 | SYS_AI_2ajmncxsmg189 | CODE2,CODE3 | B-TREE | NONE       |
---------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------

Parsing Schema Name  : BOWIE
SQL ID               : fmpwux2ptvasq
SQL Text             : select * from major_tom5 where code2=42 and code3=4
Improvement Factor   : 5.1x

Automatic Indexing has created a new index based on the column list CODE2, CODE3, because it considers such an index would improve performance of the query by a factor of 5.1x.

However, it has not recognised that the existing manual index based the column list CODE3, CODE2 would have done precisely the same job.

If we look further on in the Automatic Indexing report:

Execution Statistics:
-----------------------------
                              Original Plan                 Auto Index Plan
                              ----------------------------  ----------------------------
Elapsed Time (s):             993225                        26436
CPU Time (s):                 963727                        22535
Buffer Gets:                  137756                        9000
Optimizer Cost:               7355                          9069
Disk Reads:                   0                             26
Direct Writes:                0                             0
Rows Processed:               30153                         10051
Executions:                   3                             1

PLANS SECTION
---------------------------------------------------------------------------------------------
- Original
-----------------------------

Plan Hash Value  : 2129981950
---------------------------------------------------------------------------------------
| Id | Operation                      | Name       | Rows   | Bytes  | Cost  | Time    |
---------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |            |        |        |  7355 |         |
|  1 |  PX COORDINATOR                |            |        |        |       |         |
|  2 |    PX SEND QC (RANDOM)         | :TQ10000   |  10000 | 280000 |  7355 | 00:00:01|
|  3 |     PX BLOCK ITERATOR          |            |  10000 | 280000 |  7355 | 00:00:01|
|  4 |      TABLE ACCESS STORAGE FULL | MAJOR_TOM5 |  10000 | 280000 |  7355 | 00:00:01|
---------------------------------------------------------------------------------------

- With Auto Indexes
-----------------------------

Plan Hash Value  : 459198994
---------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows  | Bytes  | Cost | Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      | 10159 | 284452 | 9069 | 00:00:01 |
|   1 |   TABLE ACCESS BY INDEX ROWID BATCHED | MAJOR_TOM5           | 10159 | 284452 | 9069 | 00:00:01 |
| * 2 |    INDEX RANGE SCAN                   | SYS_AI_2ajmncxsmg189 | 10051 |        |   27 | 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE2"=42 AND "CODE3"=4)

We notice the new execution plan using the newly created Automatic Index actually has a greater CBO cost than the previous FTS execution plan.

As we discussed in the previous post on when Automatic Indexing creating indexes that are not ultimately used by the CBO, although Automatic Indexing has indeed created this index because it has determined it’s going to be more efficient by a factor of 5.1x due to the reduction in Buffer Gets (137756 buffer gets old plan / 3 executions = 45,919 / 9000 buffer gets with index = 5.1), the CBO considers the execution plan using the Automatic Index to have a larger cost at 9069 than the previous FTS cost at just 7355.

Again just as with the existing, logically equivalent manually created index, the reason why the new Automatic Index is deemed too expensive by the CBO is because it likewise has the same terrible Clustering Factor:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='MAJOR_TOM5';

INDEX_NAME               AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
MAJOR_TOM5_CODE3_CODE2_I NO  NO  VISIBLE   DISABLED      VALID      10000000       24181           8974538
SYS_AI_2ajmncxsmg189     YES NO  VISIBLE   DISABLED      VALID      10000000       23697           8974538

If we re-run the initial query again with the newly created Visible/Valid Automatic Index:

SQL> select * from major_tom5 where code3=4 and code2=42;

10051 rows selected.

Execution Plan
-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |  9982 |   272K|  7355   (7)| 00:00:01 |
|   1 |  PX COORDINATOR              |            |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000   |  9982 |   272K|  7355   (7)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |            |  9982 |   272K|  7355   (7)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| MAJOR_TOM5 |  9982 |   272K|  7355   (7)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

    4 - storage("CODE2"=42 AND "CODE3"=4)
        filter("CODE2"=42 AND "CODE3"=4)

Statistics
----------------------------------------------------------
         6  recursive calls
         0  db block gets
     45888  consistent gets
        68  physical reads
      5256  redo size
    149822  bytes sent via SQL*Net to client
       610  bytes received via SQL*Net from client
         4  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
     10051  rows processed

The CBO ignores the newly created Automatic Index as it did the logically equivalent manually created index and uses the previous, cheaper FTS execution plan.

Automatic Indexing was NOT able to recognise that we already had an equivalent manually created index and so now we have TWO indexes that the CBO simply ignores as being too expensive…

More on mixing Automatic and Manual Indexes on my next post.

Oracle 19c Automatic Indexing: Adding Columns To Existing Automatic Indexes (2+2=5) April 7, 2020

Posted by Richard Foote in 19c, 19c New Features, Add Column To Existing Index, Automatic Indexing, Oracle Indexes.
2 comments

2+2=5 Single

 

In my previous post, I discussed how when the following query is run:

select * from major_tom3 where code3=4 and code2=42;

the Automatic Indexing process will create an index on (CODE2, CODE3) but ultimately not use the index as the CBO considers the corresponding index based execution plan too expensive.

I’m going to expand on the demo and run now the following SQL for the first time (note these examples are run on the OLTP Autonomous Cloud service which explains the odd default parallel based execution plans):

SQL> select * from major_tom3
where code1=42 and code3=4 and code2=42;

10 rows selected.

Execution Plan
-------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |    10 |   280 |  7354   (7)| 00:00:01 |
|   1 |  PX COORDINATOR              |            |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000   |    10 |   280 |  7354   (7)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |            |    10 |   280 |  7354   (7)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| MAJOR_TOM3 |    10 |   280 |  7354   (7)| 00:00:01 |
-------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - storage("CODE1"=42 AND "CODE2"=42 AND "CODE3"=4)
    filter("CODE1"=42 AND "CODE2"=42 AND "CODE3"=4)

Statistics
----------------------------------------------------------
         6 recursive calls
         0 db block gets
     45853 consistent gets
         0 physical reads
         0 redo size
      1032 bytes sent via SQL*Net to client
       588 bytes received via SQL*Net from client
         2 SQL*Net roundtrips to/from client
         0 sorts (memory)
         0 sorts (disk)
        10 rows processed

 

This query has a predicate that includes both CODE2 and CODE3 filtering columns as previously, but now also a new filtering column on CODE1. This now makes the resultant SQL much more selective than the previous SQL, returning just 10 rows where the previous SQL returned 9968 rows.

The current explain plan still uses the previous Full Table Scan, as the only index available is the Automatic Index created previously based on CODE2, CODE3 which has already been shown to be too expensive to return the necessary rows. The additional filtering predicate based on CODE1 doesn’t make the index any more efficient, it still has to access the 9968 rows that match the CODE2, CODE3 predicates and then filter out most of the rows less the 10 that are actually required.

So what does the Automatic Index process do in this scenario?

Let’s look at the resultant Automatic Index report:

 

INDEX DETAILS

-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------------
| Owner | Table      | Index                | Key               | Type   | Properties |
---------------------------------------------------------------------------------------
| BOWIE | MAJOR_TOM3 | SYS_AI_cy8rs2dqb0nrp | CODE2,CODE3,CODE1 | B-TREE | NONE       |
---------------------------------------------------------------------------------------

The following indexes were dropped:
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner | Table      | Index                | Key         | Type   | Properties|
---------------------------------------------------------------------------------
| BOWIE | MAJOR_TOM3 | SYS_AI_bnyacywycxx8b | CODE2,CODE3 | B-TREE | NONE      |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

So the first thing to note is that Oracle first creates a new index based on columns CODE2,CODE3,CODE1.

It then drops the previously created index based on the CODE2,CODE3 columns.

If we look at the Verification Details section of the report:

VERIFICATION DETAILS

-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------

Parsing Schema Name  : BOWIE
SQL ID               : 1hv3d685x2cy4
SQL Text             : select * from major_tom3 where code1=43 and code3=4 and code2=42
Improvement Factor   : 45853.6x

-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 97by2q15zprgc
SQL Text            : select * from major_tom3 where code1=42 and code3=4 and code2=42
Improvement Factor  : 45856.2x

Execution Statistics:
-----------------------------
                     Original Plan                Auto Index Plan
                     ---------------------------- ----------------------------
Elapsed Time (s):    2815446                      1046
CPU Time (s):        2741134                      1013
Buffer Gets:         596135                       13
Optimizer Cost:      7354                         13
Disk Reads:          0                            2
Direct Writes:       0                            0
Rows Processed:      130                          10
Executions:          13                           1
PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 2354969370

--------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost | Time     |
--------------------------------------------------------------------------------
| 0  | SELECT STATEMENT          |            |      |       | 7354 |          |
| 1  | PX COORDINATOR            |            |      |       |      |          |
| 2  | PX SEND QC (RANDOM)       | :TQ10000   |   10 |   280 | 7354 | 00:00:01 |
| 3  | PX BLOCK ITERATOR         |            |   10 |   280 | 7354 | 00:00:01 |
| 4  | TABLE ACCESS STORAGE FULL | MAJOR_TOM3 |   10 |   280 | 7354 | 00:00:01 |
--------------------------------------------------------------------------------

Notes
-----
- dop = 2
- px_in_memory_imc = no
- px_in_memory = no
- With Auto Indexes
-----------------------------
Plan Hash Value : 2892362571

-----------------------------------------------------------------------------------------------------
|  Id | Operation                           | Name                 | Rows | Bytes | Cost | Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   10 |   280 |   13 | 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | MAJOR_TOM3           |   10 |   280 |   13 | 00:00:01 |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_cy8rs2dqb0nrp |   10 |       |    3 | 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE2"=42 AND "CODE3"=4 AND "CODE1"=42)
Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

So what’s going on here?

Oracle has correctly determined that an index based on all 3 columns would enable a much more efficient access path for the new SQL, by a factor of 45856.2x no less.

In order to service BOTH known queries that the Automatic Index process has determined would benefit from an index based on predicates (CODE2=42 and CODE3=4) and (CODE2=42 and CODE3=4 and CODE1=42), a single index based on CODE2,CODE3,CODE1 would suffice.

As such, the existing index based on just CODE2,CODE3 is redundant as the new index has the same leading columns. Therefore, the existing index can be safely dropped.

If we look at the definition of the indexes on this table:

SQL> select index_name, auto, constraint_index, visibility, status compression from user_indexes where table_name='MAJOR_TOM3';

INDEX_NAME           AUT CON VISIBILIT COMPRESSION   STATUS
-------------------- --- --- --------- ------------- --------
SYS_AI_cy8rs2dqb0nrp YES NO  VISIBLE   DISABLED      VALID

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='MAJOR_TOM3' order by index_name, column_position;

INDEX_NAME           COLUMN_NAME          COLUMN_POSITION
-------------------- -------------------- ---------------
SYS_AI_cy8rs2dqb0nrp CODE2                              1
SYS_AI_cy8rs2dqb0nrp CODE3                              2
SYS_AI_cy8rs2dqb0nrp CODE1                              3

 

We notice we now have just the new index, which is both VISIBLE and VALID, based on columns in CODE2,CODE3,CODE1 order.

So Automatic Indexing is clever enough to recognise the scenario where a new index can replace an existing index by adding additional columns to cater for new SQL workloads.

Now that’s rather impressive…

Much more on Automatic Indexing to come.