jump to navigation

Oracle Database 19c Automatic Indexing: Minimum Number Of Required Indexes (Low) January 20, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, Oracle Indexes.
add a comment

 

As I discussed in my previous posts, Oracle Automatic Indexing will try and create as few indexes as possible to satisfy existing workloads, even if that means reordering the columns in an existing index.

To illustrate how Automatic Indexing creates as few indexes as possible, I’ll create the following table which has a number of columns with differing numbers of distinct values:

SQL> create table thin_white_duke (id number, code1 number, code2 number, code3 number, code4 number, name varchar2(42));

Table created.

SQL> insert into thin_white_duke select rownum, ceil(dbms_random.value(0, 100)), ceil(dbms_random.value(0, 1000)), ceil(dbms_random.value(0, 10000)), ceil(dbms_random.value(0, 100000)), 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'THIN_WHITE_DUKE');

PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, density from user_tab_columns where table_name='THIN_WHITE_DUKE';

COLUMN_NAME     NUM_DISTINCT    DENSITY
--------------- ------------ ----------
ID                   9914368 1.0086E-07
CODE1                    100        .01
CODE2                   1000       .001
CODE3                  10000      .0001
CODE4                 100824 9.9183E-06
NAME                       1          1

 

I then run the following workload within a 15 minute window between Automatic Index tasks:

 

SQL> select * from thin_white_duke where id=42;

SQL> select * from thin_white_duke where code1=42;

SQL> select * from thin_white_duke where code2=42;

SQL> select * from thin_white_duke where code3=42;

SQL> select * from thin_white_duke where code4=42;

SQL> select * from thin_white_duke where code1=42 and code2=42;

SQL> select * from thin_white_duke where code1=42 and code3=42;

SQL> select * from thin_white_duke where code1=42 and code4=42;

SQL> select * from thin_white_duke where code2=42 and code1=42;

SQL> select * from thin_white_duke where code2=42 and code3=42;

SQL> select * from thin_white_duke where code2=42 and code4=42;

SQL> select * from thin_white_duke where code3=42 and code1=42;

SQL> select * from thin_white_duke where code3=42 and code2=42;

SQL> select * from thin_white_duke where code3=42 and code4=42;

SQL> select * from thin_white_duke where code4=42 and code1=42;

SQL> select * from thin_white_duke where code4=42 and code2=42;

SQL> select * from thin_white_duke where code4=42 and code3=42;

SQL> select * from thin_white_duke where code1=42 and code2=42 and code3=42;

SQL> select * from thin_white_duke where code1=42 and code2=42 and code4=42;

SQL> select * from thin_white_duke where code1=42 and code3=42 and code4=42;

SQL> select * from thin_white_duke where code2=42 and code1=42 and code3=42;

SQL> select * from thin_white_duke where code2=42 and code1=42 and code4=42;

SQL> select * from thin_white_duke where code2=42 and code3=42 and code4=42;

SQL> select * from thin_white_duke where code3=42 and code1=42 and code2=42;

SQL> select * from thin_white_duke where code3=42 and code1=42 and code4=42;

SQL> select * from thin_white_duke where code3=42 and code2=42 and code4=42;

SQL> select * from thin_white_duke where code4=42 and code1=42 and code2=42;

SQL> select * from thin_white_duke where code4=42 and code1=42 and code3=42;

SQL> select * from thin_white_duke where code4=42 and code2=42 and code3=42;

SQL> select * from thin_white_duke where code1=42 and code2=42 and code3=42 and code4=42;

 

Each of these queries have no choice but to perform a Full Table Scan as there are currently no indexes defined to the table. Each query uses a different column list, so for the 30 or so SQL statements, one could potentially create 30 or so different indexes to cover each and every SQL predicate combination used above.

But how many different indexes will Automatic Indexing create?

Let’s have a look…

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 15-JUL-2019 07:46:25
Activity end                 : 15-JUL-2019 07:48:56
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0

-------------------------------------------------------------------------------
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                                : 7
Indexes created (visible / invisible)           : 7 (7 / 0)
Space used (visible / invisible)                : 1.8 GB (1.8 GB / 0 B)
Indexes dropped                                 : 0
SQL statements verified                         : 29
SQL statements improved (improvement factor)    : 29 (147.2x)
SQL plan baselines created                      : 0
Overall improvement factor                      : 147.2x

-------------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------

Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

 

We can see that Automatic Indexing only create 7 different indexes, that’s it !!

If we look at the indexes that have been created:

 

INDEX DETAILS

-------------------------------------------------------------------------------
1.  The following indexes were created:
-------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
| Owner | Table           | Index                | Key                     | Type   | Properties |
--------------------------------------------------------------------------------------------------
| BOWIE | THIN_WHITE_DUKE | SYS_AI_0u1qx8vgtstkb | CODE4,CODE1,CODE2       | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_2j5g09nzrhqsw | CODE2,CODE3,CODE4       | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_4y26dtkybxq6k | CODE3,CODE4             | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_5pmdyk5pjay8a | CODE3,CODE1,CODE4       | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_6uqhvvzabg5n8 | ID                      | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_bwfbc6nah6uga | CODE2,CODE4             | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_fftcb8q17yy6g | CODE1,CODE2,CODE3,CODE4 | B-TREE | NONE       |
--------------------------------------------------------------------------------------------------

 

We can see how the 7 indexes can collectively cover all 30 odd different SQL predicates within the workload, because the leading columns of at least one index has the necessary columns of each SQL predicate.

If we look at but one SQL example within the Automatic Index report, the query with the predicate on just the CODE4 column:

 

Parsing Schema Name   : BOWIE
SQL ID                 : 20h1p88d1u80r
SQL Text               : select * from thin_white_duke where code4=42
Improvement Factor     : 1200.5x

Execution Statistics:
-----------------------------
Original Plan                  Auto Index Plan
----------------------------  ----------------------------
Elapsed Time (s):     679689                          1148
CPU Time (s):         724033                           933
Buffer Gets:          162070                            91
Optimizer Cost:         8617                           103
Disk Reads:                0                             2
Direct Writes:             0                             0
Rows Processed:          264                            88
Executions:                3                             1

PLANS SECTION
---------------------------------------------------------------------------------------------
- Original
-----------------------------
Plan Hash Value  : 2714752625
------------------------------------------------------------------------------------------
| Id | Operation                      | Name            | Rows | Bytes | Cost | Time     |
------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |                 |      |       | 8617 |          |
|  1 |  PX COORDINATOR                |                 |      |       |      |          |
|  2 |    PX SEND QC (RANDOM)         | :TQ10000        |   99 |  3366 | 8617 | 00:00:01 |
|  3 |     PX BLOCK ITERATOR          |                 |   99 |  3366 | 8617 | 00:00:01 |
|  4 |      TABLE ACCESS STORAGE FULL | THIN_WHITE_DUKE |   99 |  3366 | 8617 | 00:00:01 |
------------------------------------------------------------------------------------------

- With Auto Indexes

-----------------------------
Plan Hash Value  : 2447525579
-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |   88 |  2992 |  103 | 00:00:01 |
|   1 |   TABLE ACCESS BY INDEX ROWID BATCHED | THIN_WHITE_DUKE      |   88 |  2992 |  103 | 00:00:01 |
| * 2 |    INDEX RANGE SCAN                   | SYS_AI_0u1qx8vgtstkb |   99 |       |    3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------

 

We see it can now be serviced with the new SYS_AI_0u1qx8vgtstkb index, because it has the following columns (CODE4,CODE1,CODE2), with the CODE4 column as the leading column.

If we look at the details of all these new Automatic Indexes:

 

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes
where table_name='THIN_WHITE_DUKE';

INDEX_NAME           AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_6uqhvvzabg5n8 YES NO  VISIBLE   DISABLED      VALID      10000000       23553             53336
SYS_AI_fftcb8q17yy6g YES NO  VISIBLE   DISABLED      VALID      10000000       37322           9999819
SYS_AI_2j5g09nzrhqsw YES NO  VISIBLE   DISABLED      VALID      10000000       33154           9999815
SYS_AI_bwfbc6nah6uga YES NO  VISIBLE   DISABLED      VALID      10000000       27564           9999822
SYS_AI_5pmdyk5pjay8a YES NO  VISIBLE   DISABLED      VALID      10000000       31908           9999804
SYS_AI_4y26dtkybxq6k YES NO  VISIBLE   DISABLED      VALID      10000000       27697           9999818
SYS_AI_0u1qx8vgtstkb YES NO  VISIBLE   DISABLED      VALID      10000000       31783           9999819
         
SQL> select index_name, column_name, column_position from user_ind_columns where table_name='THIN_WHITE_DUKE' order by index_name, column_position;

INDEX_NAME           COLUMN_NAME     COLUMN_POSITION
-------------------- --------------- ---------------
SYS_AI_0u1qx8vgtstkb CODE4                         1
SYS_AI_0u1qx8vgtstkb CODE1                         2
SYS_AI_0u1qx8vgtstkb CODE2                         3
SYS_AI_2j5g09nzrhqsw CODE2                         1
SYS_AI_2j5g09nzrhqsw CODE3                         2
SYS_AI_2j5g09nzrhqsw CODE4                         3
SYS_AI_4y26dtkybxq6k CODE3                         1
SYS_AI_4y26dtkybxq6k CODE4                         2
SYS_AI_5pmdyk5pjay8a CODE3                         1
SYS_AI_5pmdyk5pjay8a CODE1                         2
SYS_AI_5pmdyk5pjay8a CODE4                         3
SYS_AI_6uqhvvzabg5n8 ID                            1
SYS_AI_bwfbc6nah6uga CODE2                         1
SYS_AI_bwfbc6nah6uga CODE4                         2
SYS_AI_fftcb8q17yy6g CODE1                         1
SYS_AI_fftcb8q17yy6g CODE2                         2
SYS_AI_fftcb8q17yy6g CODE3                         3
SYS_AI_fftcb8q17yy6g CODE4                         4

 

The 7 newly created Automatic Indexes are all VISIBLE and VALID and can collectively service all 30 odd different SQL predicates of the captured workload.

I just know from experience that many DBAs and Developers out there would create many more than just these 7 indexes, partly because it’s just easier to create a new index for each new SQL predicate that doesn’t currently have an appropriate index and partly because it’s not always easy to capture and know what all SQL predicate combinations might be in use by an application.

This is one of the really nice capabilities of Automatic Indexing, in that it tries to service the known workloads it captures with as few indexes as possible, that have all be proven first to indeed improve SQL performance.

In my next blog post, I’ll show another trick that Automatic Indexing can do to reduce the number of different indexes it needs to create…