jump to navigation

Oracle Database 19c Automatic Indexing: Minimum Number Of Required Indexes (Low) January 20, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, Oracle Indexes.
add a comment

 

As I discussed in my previous posts, Oracle Automatic Indexing will try and create as few indexes as possible to satisfy existing workloads, even if that means reordering the columns in an existing index.

To illustrate how Automatic Indexing creates as few indexes as possible, I’ll create the following table which has a number of columns with differing numbers of distinct values:

SQL> create table thin_white_duke (id number, code1 number, code2 number, code3 number, code4 number, name varchar2(42));

Table created.

SQL> insert into thin_white_duke select rownum, ceil(dbms_random.value(0, 100)), ceil(dbms_random.value(0, 1000)), ceil(dbms_random.value(0, 10000)), ceil(dbms_random.value(0, 100000)), 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'THIN_WHITE_DUKE');

PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, density from user_tab_columns where table_name='THIN_WHITE_DUKE';

COLUMN_NAME     NUM_DISTINCT    DENSITY
--------------- ------------ ----------
ID                   9914368 1.0086E-07
CODE1                    100        .01
CODE2                   1000       .001
CODE3                  10000      .0001
CODE4                 100824 9.9183E-06
NAME                       1          1

 

I then run the following workload within a 15 minute window between Automatic Index tasks:

 

SQL> select * from thin_white_duke where id=42;

SQL> select * from thin_white_duke where code1=42;

SQL> select * from thin_white_duke where code2=42;

SQL> select * from thin_white_duke where code3=42;

SQL> select * from thin_white_duke where code4=42;

SQL> select * from thin_white_duke where code1=42 and code2=42;

SQL> select * from thin_white_duke where code1=42 and code3=42;

SQL> select * from thin_white_duke where code1=42 and code4=42;

SQL> select * from thin_white_duke where code2=42 and code1=42;

SQL> select * from thin_white_duke where code2=42 and code3=42;

SQL> select * from thin_white_duke where code2=42 and code4=42;

SQL> select * from thin_white_duke where code3=42 and code1=42;

SQL> select * from thin_white_duke where code3=42 and code2=42;

SQL> select * from thin_white_duke where code3=42 and code4=42;

SQL> select * from thin_white_duke where code4=42 and code1=42;

SQL> select * from thin_white_duke where code4=42 and code2=42;

SQL> select * from thin_white_duke where code4=42 and code3=42;

SQL> select * from thin_white_duke where code1=42 and code2=42 and code3=42;

SQL> select * from thin_white_duke where code1=42 and code2=42 and code4=42;

SQL> select * from thin_white_duke where code1=42 and code3=42 and code4=42;

SQL> select * from thin_white_duke where code2=42 and code1=42 and code3=42;

SQL> select * from thin_white_duke where code2=42 and code1=42 and code4=42;

SQL> select * from thin_white_duke where code2=42 and code3=42 and code4=42;

SQL> select * from thin_white_duke where code3=42 and code1=42 and code2=42;

SQL> select * from thin_white_duke where code3=42 and code1=42 and code4=42;

SQL> select * from thin_white_duke where code3=42 and code2=42 and code4=42;

SQL> select * from thin_white_duke where code4=42 and code1=42 and code2=42;

SQL> select * from thin_white_duke where code4=42 and code1=42 and code3=42;

SQL> select * from thin_white_duke where code4=42 and code2=42 and code3=42;

SQL> select * from thin_white_duke where code1=42 and code2=42 and code3=42 and code4=42;

 

Each of these queries have no choice but to perform a Full Table Scan as there are currently no indexes defined to the table. Each query uses a different column list, so for the 30 or so SQL statements, one could potentially create 30 or so different indexes to cover each and every SQL predicate combination used above.

But how many different indexes will Automatic Indexing create?

Let’s have a look…

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 15-JUL-2019 07:46:25
Activity end                 : 15-JUL-2019 07:48:56
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0

-------------------------------------------------------------------------------
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                                : 7
Indexes created (visible / invisible)           : 7 (7 / 0)
Space used (visible / invisible)                : 1.8 GB (1.8 GB / 0 B)
Indexes dropped                                 : 0
SQL statements verified                         : 29
SQL statements improved (improvement factor)    : 29 (147.2x)
SQL plan baselines created                      : 0
Overall improvement factor                      : 147.2x

-------------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------

Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

 

We can see that Automatic Indexing only create 7 different indexes, that’s it !!

If we look at the indexes that have been created:

 

INDEX DETAILS

-------------------------------------------------------------------------------
1.  The following indexes were created:
-------------------------------------------------------------------------------
--------------------------------------------------------------------------------------------------
| Owner | Table           | Index                | Key                     | Type   | Properties |
--------------------------------------------------------------------------------------------------
| BOWIE | THIN_WHITE_DUKE | SYS_AI_0u1qx8vgtstkb | CODE4,CODE1,CODE2       | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_2j5g09nzrhqsw | CODE2,CODE3,CODE4       | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_4y26dtkybxq6k | CODE3,CODE4             | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_5pmdyk5pjay8a | CODE3,CODE1,CODE4       | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_6uqhvvzabg5n8 | ID                      | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_bwfbc6nah6uga | CODE2,CODE4             | B-TREE | NONE       |
| BOWIE | THIN_WHITE_DUKE | SYS_AI_fftcb8q17yy6g | CODE1,CODE2,CODE3,CODE4 | B-TREE | NONE       |
--------------------------------------------------------------------------------------------------

 

We can see how the 7 indexes can collectively cover all 30 odd different SQL predicates within the workload, because the leading columns of at least one index has the necessary columns of each SQL predicate.

If we look at but one SQL example within the Automatic Index report, the query with the predicate on just the CODE4 column:

 

Parsing Schema Name   : BOWIE
SQL ID                 : 20h1p88d1u80r
SQL Text               : select * from thin_white_duke where code4=42
Improvement Factor     : 1200.5x

Execution Statistics:
-----------------------------
Original Plan                  Auto Index Plan
----------------------------  ----------------------------
Elapsed Time (s):     679689                          1148
CPU Time (s):         724033                           933
Buffer Gets:          162070                            91
Optimizer Cost:         8617                           103
Disk Reads:                0                             2
Direct Writes:             0                             0
Rows Processed:          264                            88
Executions:                3                             1

PLANS SECTION
---------------------------------------------------------------------------------------------
- Original
-----------------------------
Plan Hash Value  : 2714752625
------------------------------------------------------------------------------------------
| Id | Operation                      | Name            | Rows | Bytes | Cost | Time     |
------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |                 |      |       | 8617 |          |
|  1 |  PX COORDINATOR                |                 |      |       |      |          |
|  2 |    PX SEND QC (RANDOM)         | :TQ10000        |   99 |  3366 | 8617 | 00:00:01 |
|  3 |     PX BLOCK ITERATOR          |                 |   99 |  3366 | 8617 | 00:00:01 |
|  4 |      TABLE ACCESS STORAGE FULL | THIN_WHITE_DUKE |   99 |  3366 | 8617 | 00:00:01 |
------------------------------------------------------------------------------------------

- With Auto Indexes

-----------------------------
Plan Hash Value  : 2447525579
-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |   88 |  2992 |  103 | 00:00:01 |
|   1 |   TABLE ACCESS BY INDEX ROWID BATCHED | THIN_WHITE_DUKE      |   88 |  2992 |  103 | 00:00:01 |
| * 2 |    INDEX RANGE SCAN                   | SYS_AI_0u1qx8vgtstkb |   99 |       |    3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------

 

We see it can now be serviced with the new SYS_AI_0u1qx8vgtstkb index, because it has the following columns (CODE4,CODE1,CODE2), with the CODE4 column as the leading column.

If we look at the details of all these new Automatic Indexes:

 

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes
where table_name='THIN_WHITE_DUKE';

INDEX_NAME           AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_6uqhvvzabg5n8 YES NO  VISIBLE   DISABLED      VALID      10000000       23553             53336
SYS_AI_fftcb8q17yy6g YES NO  VISIBLE   DISABLED      VALID      10000000       37322           9999819
SYS_AI_2j5g09nzrhqsw YES NO  VISIBLE   DISABLED      VALID      10000000       33154           9999815
SYS_AI_bwfbc6nah6uga YES NO  VISIBLE   DISABLED      VALID      10000000       27564           9999822
SYS_AI_5pmdyk5pjay8a YES NO  VISIBLE   DISABLED      VALID      10000000       31908           9999804
SYS_AI_4y26dtkybxq6k YES NO  VISIBLE   DISABLED      VALID      10000000       27697           9999818
SYS_AI_0u1qx8vgtstkb YES NO  VISIBLE   DISABLED      VALID      10000000       31783           9999819
         
SQL> select index_name, column_name, column_position from user_ind_columns where table_name='THIN_WHITE_DUKE' order by index_name, column_position;

INDEX_NAME           COLUMN_NAME     COLUMN_POSITION
-------------------- --------------- ---------------
SYS_AI_0u1qx8vgtstkb CODE4                         1
SYS_AI_0u1qx8vgtstkb CODE1                         2
SYS_AI_0u1qx8vgtstkb CODE2                         3
SYS_AI_2j5g09nzrhqsw CODE2                         1
SYS_AI_2j5g09nzrhqsw CODE3                         2
SYS_AI_2j5g09nzrhqsw CODE4                         3
SYS_AI_4y26dtkybxq6k CODE3                         1
SYS_AI_4y26dtkybxq6k CODE4                         2
SYS_AI_5pmdyk5pjay8a CODE3                         1
SYS_AI_5pmdyk5pjay8a CODE1                         2
SYS_AI_5pmdyk5pjay8a CODE4                         3
SYS_AI_6uqhvvzabg5n8 ID                            1
SYS_AI_bwfbc6nah6uga CODE2                         1
SYS_AI_bwfbc6nah6uga CODE4                         2
SYS_AI_fftcb8q17yy6g CODE1                         1
SYS_AI_fftcb8q17yy6g CODE2                         2
SYS_AI_fftcb8q17yy6g CODE3                         3
SYS_AI_fftcb8q17yy6g CODE4                         4

 

The 7 newly created Automatic Indexes are all VISIBLE and VALID and can collectively service all 30 odd different SQL predicates of the captured workload.

I just know from experience that many DBAs and Developers out there would create many more than just these 7 indexes, partly because it’s just easier to create a new index for each new SQL predicate that doesn’t currently have an appropriate index and partly because it’s not always easy to capture and know what all SQL predicate combinations might be in use by an application.

This is one of the really nice capabilities of Automatic Indexing, in that it tries to service the known workloads it captures with as few indexes as possible, that have all be proven first to indeed improve SQL performance.

In my next blog post, I’ll show another trick that Automatic Indexing can do to reduce the number of different indexes it needs to create…

Oracle Database 19c Automatic Indexing – Need Another Index (Another Brick in The Wall Part 2) January 16, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Oracle Indexes.
add a comment

I previously discussed how Automatic Indexing can effectively cleverly reorder an existing index if it means it can now use the new index to satisfy new SQL predicates. In this post, we’ll explore this example further with some new workloads.

So, we previously ran SQL queries with SQL predicates in the following combinations:

  • CODE1=42 and CODE2=42 and CODE3=42
  • CODE2=42 and CODE3=42
  • CODE3=42

 

so an index based on CODE3, CODE2, CODE1 is able to satisfy all 3 existing predicates.

If we now run a new query with the following predicates based on CODE2=42 and CODE1=42:

SQL> select * from major_tom where code2=42 and code1=42;

no rows selected

Execution Plan

-----------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)|Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |     1 |    28 |  7318   (6)|00:00:01 |
|   1 |  PX COORDINATOR              |           |       |       |            |         |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000  |     1 |    28 |  7318   (6)|00:00:01 |
|   3 |    PX BLOCK ITERATOR         |           |     1 |    28 |  7318   (6)|00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| MAJOR_TOM |     1 |    28 |  7318   (6)|00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - storage("CODE1"=42 AND "CODE2"=42)
    filter("CODE1"=42 AND "CODE2"=42)

Statistics
----------------------------------------------------------
         8  recursive calls
         4  db block gets
     45853  consistent gets
         0  physical reads
         0  redo size
       645  bytes sent via SQL*Net to client
       577  bytes received via SQL*Net from client
         1  SQL*Net roundtrips to/from client
         0  sorts (memory)
         0  sorts (disk)
         0  rows processed

 

The current previously re-ordered index can’t now satisfy this new predicate as the leading column of the index (CODE3) is not specified in the SQL predicate and the CBO considers an Index Skip Scan as too expensive an operation. So a Full Table Scan is chosen here by the CBO.

If we wait and see what Automatic Indexing decides to now do:

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------

Activity start               : 21-JUN-2019 07:41:55
Activity end                 : 21-JUN-2019 07:42:49
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0

-------------------------------------------------------------------------------
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------

Index candidates                              : 1
Indexes created (visible / invisible)         : 1 (1 / 0)
Space used (visible / invisible)              : 201.33 MB (201.33 MB / 0 B)
Indexes dropped                               : 0
SQL statements verified                       : 1
SQL statements improved (improvement factor)  : 1 (5.1x)
SQL plan baselines created                    : 0
Overall improvement factor                    : 5.1x

-------------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------

Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

-------------------------------------------------------------------------------

We notice that a new index has been created…

 

INDEX DETAILS

-------------------------------------------------------------------------------
The following indexes were created:
--------------------------------------------------------------------------------
| Owner | Table     | Index                | Key         | Type   | Properties |
--------------------------------------------------------------------------------
| BOWIE | MAJOR_TOM | SYS_AI_gmhdbjx21zcr1 | CODE1,CODE2 | B-TREE | NONE       |
--------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------

Parsing Schema Name  : BOWIE
SQL ID               : 3y246nzwdpybv
SQL Text             : select * from major_tom where code2=42 and code1=4
Improvement Factor   : 5.1x

 

So Automatic Indexing has decided to now create a new, second index (SYS_AI_gmhdbjx21zcr1) on the table based on CODE1, CODE2, as a single index is no longer capable to address all known SQL predicates.

 

SQL> select index_name, column_name, column_position from user_ind_columns 
where table_name='MAJOR_TOM' order by index_name, column_position;

INDEX_NAME                COLUMN_NAME          COLUMN_POSITION
------------------------- -------------------- ---------------
SYS_AI_00hxxxkgb821n      CODE3                              1
SYS_AI_00hxxxkgb821n      CODE2                              2
SYS_AI_00hxxxkgb821n      CODE1                              3
SYS_AI_gmhdbjx21zcr1      CODE1                              1
SYS_AI_gmhdbjx21zcr1      CODE2                              2

 

So we now have two indexes to cater for all known SQL predicates in which an index has been shown to improve performance.

Automatic Indexing will try and create the minimum number of indexes possible to cater for all known SQL workloads.

 

In my next post, we’ll look at a more complex example in which there could potentially be numerous different indexes that could cater for more extensive workloads and how Automatic Indexing copes rather splendidly…

Merry Christmas and Happy New Year !! (“Heroes”) December 24, 2019

Posted by Richard Foote in Oracle Indexes.
4 comments

I would like to take this opportunity to wish all my readers a very Merry Christmas and a most happy, peaceful and prosperous New Year.

My gift this year is not David Bowie and Bing Crosby doing their famous Christmas duet but a performance by Bowie of his classic “Heroes” as featured on the same Bing Crosby Christmas Special in 1977.

 

Enjoy 🙂

London March 2020: “Oracle Indexing Internals and Best Practices” and “Oracle Performance Diagnostics and Tuning” Seminars !! September 3, 2019

Posted by Richard Foote in Oracle Index Seminar, Oracle Indexes, Oracle Performance Diagnostics and Tuning Seminar.
add a comment

seminar photo

It’s with great excitement that I announce I’ll finally be returning to London, UK in March 2020 to run both of my highly acclaimed seminars. The dates and registration links are as follows:

23-24 March 2020: “Oracle Indexing Internals and Best Practices” seminar – Tickets and Registration Link

25-26 March 2020: “Oracle Performance Diagnostics and Tuning” Seminar – Tickets and Registration Link

You can also purchase tickets to both seminars at a special 20% combo discount:

23-26 March 2020:  Both “Oracle Indexing Internals and Best Practices” and “Oracle Performance Diagnostics and Tuning” Seminars – Tickets and Registration Link

The cost for each individual seminar is:

  • Early Bird Rate (enrollments prior to 31 January 2020) £990.00 (+ VAT)
  • General Rate  (enrollments post 31 January 2020) £1190 (+VAT)

The cost for the seminar combo is:

  • Early Bird Rate (enrollments prior to 31 January 2020) £1550.00 (+ VAT)
  • General Rate  (enrollments post 31 January 2020) £1900 (+VAT)

 

The venue is the rather nice Hilton London Kensington.

Prices include attendance to the seminar, both soft and hard copy of the extensive seminar materials, lunch and morning/afternoon tea/coffee.

Both seminars are very highly acclaimed, with past attendees universally applauding the quality and educational outcomes of the training.  They’re both aimed at Oracle Professionals (DBAs and Developers) who are interested in Performance Tuning and how to maximise the performance of both Oracle Databases and associated applications.

All the details of the Oracle Indexing Internals and Best Practices Seminar.

All the details of the Oracle Performance Diagnostic and Tuning Seminar.

Both seminars have strictly limited places to ensure a quality event for all attendees with venues booked with only small classes in mind. So I recommend booking early (as it’s cheaper) and to avoid possible disappointment. I don’t get to run these kind of events in the UK very often (it would be over 2 years since I last run seminars in London) so do take advantage of attending what will be a unique training opportunity while you can.

If you have any questions, please leave a comment or contact me at richard@richardfooteconsulting.com.

Hope to see you at one or both of these seminars next year !!

Oracle Database 19c Automatic Indexing: Default Index Column Order Part I (Anyway Anyhow Anywhere) September 2, 2019

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Index Column Order, Oracle Indexes.
1 comment so far

pin ups

The next thing I was curious about regarding Automatic Indexing was in which order would Oracle by default order the columns within an index. This can be a crucial decision with respect to the effectiveness of the index (but then again, may not be so crucial as well). Certainly one would expect the index column order be dependent on the SQL predicates running in the database and I’ll discuss all that in future posts, but what is the default behaviour here with regard index column order based (for now) on a single SQL predicate.

I could come up with a number of possible options that Oracle might adopt when determining the default index column order such as:

  • Column Name Order
  • Column ID Order
  • (Reverse) Column Cardinality Order
  • Best Clustering Factor
  • Other (Random even)

So to investigate this, I started with a basic table with 3 columns (CODE1, CODE2, CODE3) that had differing levels of cardinality:

SQL> create table major_tom (id number, code1 number, code2 number, code3 number, name varchar2(42));

Table created.

SQL> insert into major_tom select rownum, mod(rownum, 10)+1, ceil(dbms_random.value(0, 100)), ceil(dbms_random.value(0, 1000)), 'David Bowie' from dual connect by level  commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'MAJOR_TOM');

PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, density from user_tab_columns where table_name='MAJOR_TOM';

COLUMN_NAME          NUM_DISTINCT    DENSITY
-------------------- ------------ ----------
ID                        9914368 1.0086E-07
CODE1                          10  .00000005
CODE2                         100  .00000005
CODE3                        1000       .001
NAME                            1          1

I then ran the following query with a predicate based on the 3 columns CODE1, CODE2 and CODE3:

SQL> select * from major_tom where code3=42 and code2=42 and code1=4;

15 rows selected.

Execution Plan
------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |    10 |   280 |  7354   (7)| 00:00:01 |
|   1 |  PX COORDINATOR              |           |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000  |    10 |   280 |  7354   (7)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |           |    10 |   280 |  7354   (7)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| MAJOR_TOM |    10 |   280 |  7354   (7)| 00:00:01 |
------------------------------------------------------------------------------------------

If we look at the resultant Automatic Index:

INDEX DETAILS

-------------------------------------------------------------------------------
1 . The following indexes were created:
--------------------------------------------------------------------------------------
| Owner | Table     | Index                | Key               | Type   | Properties |
--------------------------------------------------------------------------------------
| BOWIE | MAJOR_TOM | SYS_AI_9mrs058nrg9d5 | CODE1,CODE2,CODE3 | B-TREE | NONE       |
--------------------------------------------------------------------------------------

 

SQL> select i.index_name, i.column_name, i.column_position, t.num_distinct
from user_ind_columns i, user_tab_columns t
where i.table_name = t.table_name and i.column_name = t.column_name and i.table_name='MAJOR_TOM'
order by i.index_name, i.column_position;

INDEX_NAME           COLUMN_NAME     COLUMN_POSITION NUM_DISTINCT
-------------------- --------------- --------------- ------------
SYS_AI_9mrs058nrg9d5 CODE1                         1           10
SYS_AI_9mrs058nrg9d5 CODE2                         2          100
SYS_AI_9mrs058nrg9d5 CODE3                         3         1000

 

We notice that the Automatic Index is in CODE1, CODE2, CODE3 order.

If we create a similar table, but this time have the columns with a different order of cardinality:

SQL> create table major_tom2 (id number, code1 number, code2 number, code3 number, name varchar2(42));

Table created.

SQL> insert into major_tom2 select rownum, mod(rownum, 1000)+1, ceil(dbms_random.value(0, 100)), ceil(dbms_random.value(0, 10)),
'David Bowie' from dual connect by level;

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'MAJOR_TOM2');

PL/SQL procedure successfully completed.

SQL> select * from major_tom where code3=42 and code2=42 and code1=4;

15 rows selected.

 

We notice that the resultant automatic index is still in the same CODE1, CODE2 and CODE3 order:

INDEX DETAILS

-------------------------------------------------------------------------------
1. The following indexes were created:
---------------------------------------------------------------------------------------
| Owner | Table      | Index                | Key               | Type   | Properties |
---------------------------------------------------------------------------------------
| BOWIE | MAJOR_TOM2 | SYS_AI_7w9t3tt9u171r | CODE1,CODE2,CODE3 | B-TREE | NONE       |
---------------------------------------------------------------------------------------

 

SQL> select i.index_name, i.column_name, i.column_position, t.num_distinct
from user_ind_columns i, user_tab_columns t
where i.table_name = t.table_name and i.column_name = t.column_name and i.table_name='MAJOR_TOM2'
order by i.index_name, i.column_position;

INDEX_NAME           COLUMN_NAME     COLUMN_POSITION NUM_DISTINCT
-------------------- --------------- --------------- ------------
SYS_AI_7w9t3tt9u171r CODE1                         1         1000
SYS_AI_7w9t3tt9u171r CODE2                         2          100
SYS_AI_7w9t3tt9u171r CODE3                         3           10

 

So we can eliminate column cardinality as being a contributing factor in Oracle deciding in which manner to order the indexed columns.

Which is unfortunate as we’ll see in a future post when we decide to implement Oracle Index Compression with Automatic Indexing.

In the next post, we’ll explore further other considerations and confirm how Oracle does indeed decide to order columns within an Automatic Index by default.

Oracle 19c Automatic Indexing: How Many Executions Does It Take? (One Shot) August 29, 2019

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Oracle Indexes.
2 comments

One shot single

One of the first questions I asked when playing with the new Oracle Database 19c Automatic Indexing feature was how many executions of an SQL does it take for a new index to be considered?

To find out, I create the following table:

SQL> create table bowie_one (id number constraint bowie_one_pk primary key, code number, name varchar2(42));

Table created.

SQL> insert into bowie_one select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_ONE');

PL/SQL procedure successfully completed.

I then ran the following query just once and checked to see if the Automatic Indexing task would pick this execution up and consider building a new index:

SQL> select * from bowie_one where code=42;

10 rows selected.

Execution Plan

------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |    10 |   230 |  6208   (7)| 00:00:01 |
|   1 |  PX COORDINATOR              |           |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000  |    10 |   230 |  6208   (7)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |           |    10 |   230 |  6208   (7)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| BOWIE_ONE |    10 |   230 |  6208   (7)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

4 - storage("CODE"=42)
    filter("CODE"=42)

Statistics
----------------------------------------------------------

   12  recursive calls
    0  db block gets
39000  consistent gets
    0  physical reads
  132  redo size
  867  bytes sent via SQL*Net to client
  588  bytes received via SQL*Net from client
    2  SQL*Net roundtrips to/from client
    0  sorts (memory)
    0  sorts (disk)
   10  rows processed

The following Automatic Indexing report detailed the following:

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 26-JUN-2019 13:03:30
Activity end                 : 26-JUN-2019 21:13:06
Executions completed         : 24
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 184.55 MB (184.55 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 3
SQL statements improved (improvement factor) : 1 (19500x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 6.9x
-------------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------

Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

So an index was indeed created. Later in the report:

INDEX DETAILS

-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------

-------------------------------------------------------------------------
| Owner | Table     | Index                | Key  | Type   | Properties |
-------------------------------------------------------------------------
| BOWIE | BOWIE_ONE | SYS_AI_5tabfu6wtkbdh | CODE | B-TREE | NONE       |
-------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS

-------------------------------------------------------------------------------The performance of the following statements improved:
-------------------------------------------------------------------------------

Parsing Schema Name  : BOWIE
SQL ID               : 9n89axkwrvw4b
SQL Text             : select * from bowie_one where code=42
Improvement Factor   : 19500x

Execution Statistics:
-----------------------------

                    Original Plan                 Auto Index Plan
                    ----------------------------  ----------------------------
Elapsed Time (s):   198342                        961
CPU Time (s):       187768                        1112
Buffer Gets:        39000                         13
Optimizer Cost:     6208                          14
Disk Reads:         0                             2
Direct Writes:      0                             0
Rows Processed:     10                            10
Executions:         1                             1

So the above details that an index on the CODE column of the BOWIE_ONE table was indeed created after just 1 execution.

For those wondering, yes the Elaspsed and CPU times are actually in Microseconds (1 millionth of a second) and not in seconds as stated…

The final section of the report details:

PLANS SECTION
---------------------------------------------------------------------------------------------
- Original
-----------------------------

Plan Hash Value  : 227986582
------------------------------------------------------------------------------------
| Id | Operation                      | Name      | Rows | Bytes | Cost | Time     |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |           |      |       | 6208 |          |
|  1 |  PX COORDINATOR                |           |      |       |      |          |
|  2 |    PX SEND QC (RANDOM)         | :TQ10000  |   10 |   230 | 6208 | 00:00:01 |
|  3 |     PX BLOCK ITERATOR          |           |   10 |   230 | 6208 | 00:00:01 |
|  4 |      TABLE ACCESS STORAGE FULL | BOWIE_ONE |   10 |   230 | 6208 | 00:00:01 |
------------------------------------------------------------------------------------

Notes
-----

- dop_op_reason = scan of object BOWIE.BOWIE_ONE
- dop = 2
- px_in_memory_imc = no
- px_in_memory = no
- With Auto Indexes
-----------------------------

Plan Hash Value  : 2734060610
-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |  10   |  230 |   14 | 00:00:01 |
|   1 |   TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_ONE            |  10   |  230 |   14 | 00:00:01 |
| * 2 |    INDEX RANGE SCAN                   | SYS_AI_5tabfu6wtkbdh |  10   |      |    3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------

* 2 - access("CODE"=42)

Notes
-----

- Dynamic sampling used for this statement ( level = 11 )

It details that indeed, a new plan using the newly Automatic Index would  be substantially more efficient.

If we look at details of the new Automatic Index:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor  from user_indexes where table_name='BOWIE_ONE';

INDEX_NAME             AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_ONE_PK           NO  YES VISIBLE   DISABLED      VALID      10000000       19642             57523
SYS_AI_5tabfu6wtkbdh   YES NO  VISIBLE   DISABLED      VALID      10000000       22285          10000000

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BOWIE_ONE' order by index_name, column_position;

INDEX_NAME             COLUMN_NAME     COLUMN_POSITION
---------------------- --------------- ---------------
BOWIE_ONE_PK           ID                            1
SYS_AI_5tabfu6wtkbdh   CODE                          1

The newly created Automatic Index is both Valid and Visible and so can be used globally within the database.

If I now re-run the original query:

SQL> select * from bowie_one where code=42;

10 rows selected.

Execution Plan
--------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |    10 |   230 |    13   (0)| 00:00:01 |
|   1 |  PX COORDINATOR                       |                      |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10001             |    10 |   230 |    13   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_ONE            |    10 |   230 |    13   (0)| 00:00:01 |
|   4 |     BUFFER SORT                       |                      |       |       |            |          |
|   5 |      PX RECEIVE                       |                      |    10 |       |     3   (0)| 00:00:01 |
|   6 |       PX SEND HASH (BLOCK ADDRESS)    | :TQ10000             |    10 |       |     3   (0)| 00:00:01 |
|   7 |        PX SELECTOR                    |                      |       |       |            |          |
|*  8 |           INDEX RANGE SCAN            | SYS_AI_5tabfu6wtkbdh |    10 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - access("CODE"=42)

Statistics
---------------------------------------------------------

 12  recursive calls
  0  db block gets
 13  consistent gets
  0  physical reads
  0  redo size
867  bytes sent via SQL*Net to client
588  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  2  sorts (memory)
  0  sorts (disk)
 10  rows processed

The CBO now uses the newly created Automatic Index.

So it only potentially takes just the one execution of an SQL statement for an Automatic Index to be created.

Therefore some caution needs to be exercised in environments where there may be a very large number of ad-hoc queries where specific indexes may not be necessary for once only executed predicate combinations.

That said, the Automatic Indexing process is highly efficient in building only the bare minimum of column indexed combinations to cater for all known SQL predicates.

More on this in a future post.

Speaking at Trivadis Performance Days 2019 August 28, 2019

Posted by Richard Foote in Oracle Indexes, Performance Days 2019.
add a comment

performance days

I’ll again be speaking at the wonderful Trivadis Performance Days 2019 conference in Zurich, Switzerland on 26-27 September.

There’s again another fantastic lineup of speakers, including:

  • CHRISTIAN ANTOGNINI
  • IVICA ARSOV
  • MARK ASHDOWN
  • SHASANK CHAVAN
  • EMILIANO FUSAGLIA
  • STEPHAN KÖHLER
  • JONATHAN LEWIS
  • FRANCK PACHOT
  • TANEL PODER
  • DANI SCHNIDER

 

I’ll be presenting two papers:

  • Oracle 18c and Oracle 19c New Indexing Features
  • Improving Performance with Indexing and Partitioning

 

For all the details, including the agenda and how to register: https://m.trivadis.com/performance-days-en

I can’t recommend this conference enough, one that focuses and specialises on improving performance in Oracle Database environments.

Oh, and the beer served is excellent as well.

 

performance days beers

 

Speaking at Oracle OpenWorld 2019 August 22, 2019

Posted by Richard Foote in OOW19, Oracle Indexes.
add a comment

OOW Speaking

It’s been remarkably 9 years since I’ve been to Oracle OpenWorld, but will finally get the opportunity to present there again this year (with many thanks to the Oracle ACE Director program for making this possible).

Details of my presentation are as follows:

Conference: Oracle OpenWorld

Session Type: Conference Session

Session ID: CON1432

Session Title: Oracle Database 19c: In-Depth Look into the New Automatic Indexing Feature

Room: Moscone South – Room 152A

Date: 09/17/19

Start Time: 11:15:00 AM

End Time: 12:00:00 PM

I promise it to be a fast-paced, content rich, action-packed presentation that will leave you much wiser about the cool capabilities of Oracle Database 19c Automatic Indexing.

The current allocated room is almost full with pre-registrations so if you’re interested in learning how Oracle’s new Automatic Indexing feature works, I recommend registering for the session ASAP (or to keep following my blog).

I’m really looking forward to catching with many of my Oracle friends again, so please do stop by to say hi if you see me loitering around 🙂

 

For the record, I’ve hardly changed a bit in the intervening last 9 years 🙂

oow-day-2-006

AUSOUG Connect 2019 Conference Series August 21, 2019

Posted by Richard Foote in AUSOUG, Connect 2019, Oracle Indexes.
add a comment

Connect 2019

 

AUSOUG will again be running their excellent CONNECT 2019 conference series this year at the following great venues:

  • Monday 14th October – Rendezvous Hotel In Melbourne
  • Wednesday 16th October –  Mercure Hotel in Perth

As usual, there’s a wonderful lineup of speakers from both Australia and overseas including:

  • Connor McDonald
  • Scott Wesley
  • Guy Harrison
  • Jeffrey Kemp
  • Francisco Munoz Alvarez
  • Gavin Soorma
  • Douglas Hood
  • Charles Kim
  • Craig Shallahamer
  • David Peake
  • Patrick Barel
  • Christopher Jones

to name but a few. You can see the full list of speakers here.

I’ll will also be presenting my fully updated and revised “10 Things You Might Not Know About Oracle Indexes But Really Should” presentation, that covers the more important and recent indexing capabilities that are not so widely known or understood but can be critical for optimal database/application performance.

If you’re an Oracle professional based in Australia, this is definitely the Oracle conference for you. Hopefully, I’ll get to meet many of you at either Melbourne or Perth 🙂

For more information, including how to register, visit the Connect 2019 site here.

 

ausoug pic

Oracle 19c Automatic Indexing: My First Auto Index (Absolute Beginners) August 19, 2019

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Oracle Indexes.
3 comments

absolute beginners

I am SOOOO struggling with this nightmare block editor but here goes. Please excuse any formatting issues below:

I thought it was time to show the new Oracle 19c Automatic Indexing feature in action and what better way than to go through how I created my first ever Automatic Index.

To start, I create a typically simple little table:

SQL> create table bowie (id number constraint bowie_pk primary key, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level  commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed.

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='BOWIE';

INDEX_NAME                  AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_PK                    NO  YES VISIBLE   DISABLED        VALID      10000000         19429               58133

The key column here is CODE, which is highly selective with just 10 rows on average per CODE value.

If I run the following query a number of times:

SQL> select * from bowie where code=42;

10 rows selected.

Execution Plan

--------------------------------------------------------------------------------------------
| Id  | Operation                      | Name     | Rows    | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |          |      10 |   230 |  6208   (7)| 00:00:01 |
|   1 |  PX COORDINATOR                |          |         |       |            |          |
|   2 |   PX SEND QC (RANDOM)          | :TQ10000 |      10 |   230 |  6208   (7)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR           |          |      10 |   230 |  6208   (7)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL  | BOWIE    |      10 |   230 |  6208   (7)| 00:00:01 |
---------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

4 - storage("CODE"=42)
filter("CODE"=42)

Statistics
----------------------------------------------------------
    6  recursive calls
    0  db block gets
39026  consistent gets
    0  physical reads
    0  redo size
  867  bytes sent via SQL*Net to client
  588  bytes received via SQL*Net from client
    2  SQL*Net roundtrips to/from client
    0  sorts (memory)
    0  sorts (disk)
   10  rows processed

The query runs slowly as it performs a Full Table Scan of a 10M row table when returning just 10 runs.

Perhaps an index would be a good idea…

With Auto Indexing, we just wait approx. 15 minutes until the Auto Index task completes, or manually run exec dbms_auto_index_internal.task_proc(true);

As discussed in my previous blog post, the Auto Indexing task will look at the workload over the past 15 minutes and determine if a new index might be warranted to improve the performance of an SQL.

We can query the results of the last Auto Index task by running the following:

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start                  : 20-JUN-2019 08:12:15
Activity end                    : 20-JUN-2019 08:12:55
Executions completed            : 1
Executions interrupted          : 0
Executions with fatal error     : 0
-------------------------------------------------------------------------------
SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                                : 1
Indexes created (visible / invisible)           : 1 (1 / 0)
Space used (visible / invisible)                : 184.55 MB (184.55 MB / 0 B)
Indexes dropped                                 : 0
SQL statements verified                         : 2
SQL statements improved (improvement factor)    : 1 (39044.8x)
SQL plan baselines created                      : 0
Overall improvement factor                      : 6.9x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------

Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

At this point in the report, we can see Oracle has verified 2 SQL statements and has created 1 new, visible index using 184.55 MB of space. It has improved 1 SQL statement by a factor of 39044.8x and improved things overall by a factor of 6.9x. (we’ll look at how Oracle determines these values in a later post).

The report continues with the Index Details section:

INDEX DETAILS

-------------------------------------------------------------------------------
The following indexes were created:
---------------------------------------------------------------------
| Owner | Table | Index                | Key  | Type   | Properties |
---------------------------------------------------------------------
| BOWIE | BOWIE | SYS_AI_600vgjmtqsgv3 | CODE | B-TREE | NONE       |
---------------------------------------------------------------------

Oracle has decided to create a new indexed called “SYS_AI_600vgjmtqsgv3” on the CODE column of the BOWIE table. Notice the mixed case naming convention for the new Auto Index, oh what fun and games to be had…

Next the Verification Details section:

VERIFICATION DETAILS

-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------

Parsing Schema Name  : BOWIE

SQL ID               : dd5gzx7skf6as
SQL Text             : select * from bowie where code=42
Improvement Factor   : 39044.8x

Execution Statistics:
-----------------------------

                     Original Plan                 Auto Index Plan
                     ----------------------------  ----------------------------

Elapsed Time (s):    3241698                       108
CPU Time (s):        3174021                       108
Buffer Gets:         663764                        13
Optimizer Cost:      6204                          14
Disk Reads:          0                             0
Direct Writes:       0                             0
Rows Processed:      170                           10
Executions:          17                            1

So the SQL we previously ran has an improvement factor of 39044.8x with the new plan that uses the newly created Auto Index. These numbers are a little nonsensical as we’ll see in a later post, but it does sound kinda impressive…

Finally, we get to the Plans Section of the report:

PLANS SECTION
---------------------------------------------------------------------------------------------
- Original
-----------------------------

Plan Hash Value  : 3567883234
-------------------------------------------------------------------------------------
| Id | Operation                      | Name     | Rows   | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |          |        |       | 6204 |          |
|  1 |  PX COORDINATOR                |          |        |       |      |          |
|  2 |    PX SEND QC (RANDOM)         | :TQ10000 |    425 | 20825 | 6204 | 00:00:01 |
|  3 |     PX BLOCK ITERATOR          |          |    425 | 20825 | 6204 | 00:00:01 |
|  4 |      TABLE ACCESS STORAGE FULL | BOWIE    |    425 | 20825 | 6204 | 00:00:01 |
-------------------------------------------------------------------------------------

- With Auto Indexes

-----------------------------
Plan Hash Value  : 493118340

-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |   10 |   230 |   14 | 00:00:01 |
|   1 |   TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE                |   10 |   230 |   14 | 00:00:01 |
| * 2 |    INDEX RANGE SCAN                   | SYS_AI_600vgjmtqsgv3 |   10 |       |    3 | 00:00:01 |

Here Oracle compares the original plan with the new plan that uses the new index. The new plan is much more efficient and so the index is created as a Valid, Visible index.

Note: the vast majority of my test cases were run on the Dedicated Autonomous Application Transaction Processing (ATP) environment, where parallelism is common for most plans by default.

Let’s look at details of the newly created Automatic Index:

 

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor  from user_indexes where table_name='BOWIE';

INDEX_NAME                AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_PK                  NO  YES VISIBLE   DISABLED      VALID      10000000       19429             58133
SYS_AI_600vgjmtqsgv3      YES NO  VISIBLE   DISABLED      VALID      10000000       22419          10000000

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='BOWIE' order by index_name, column_position;

INDEX_NAME                COLUMN_NAME          COLUMN_POSITION
------------------------- -------------------- ---------------
BOWIE_PK                  ID                                 1
SYS_AI_600vgjmtqsgv3      CODE                               1

 

There is new column column called AUTO in DBA_INDEXES to denote where an index has been automatically created by Oracle.

So the new SYS_AI_600vgjmtqsgv3 Automatic Index on the CODE column is both VISIBLE and VALID in this case,  meaning it can be globally used within the database. As we’ll see if future posts, this is not always the case with Automatic Indexes.

If we now re-run the initial SQL query I ran and look at the execution plan:

 

SQL> select * from bowie where code=42;

10 rows selected.

Execution Plan
--------------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |    10 |   230 |    13   (0)| 00:00:01 |
|   1 |  PX COORDINATOR                       |                      |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)                 | :TQ10001             |    10 |   230 |    13   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE                |    10 |   230 |    13   (0)| 00:00:01 |
|   4 |     BUFFER SORT                       |                      |       |       |            |          |
|   5 |      PX RECEIVE                       |                      |    10 |       |     3   (0)| 00:00:01 |
|   6 |       PX SEND HASH (BLOCK ADDRESS)    | :TQ10000             |    10 |       |     3   (0)| 00:00:01 |
|   7 |        PX SELECTOR                    |                      |       |       |            |          |
|*  8 |           INDEX RANGE SCAN            | SYS_AI_600vgjmtqsgv3      10         |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

8 - access("CODE"=42)

Statistics
----------------------------------------------------------
 12  recursive calls
  0  db block gets
 13  consistent gets
  0  physical reads
  0  redo size
867  bytes sent via SQL*Net to client
588  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  2  sorts (memory)
  0  sorts (disk)
 10  rows processed

 

We notice the new Automatic Index has been used by the CBO to substantially improve the performance of the query (just 13 consistent gets).

OK, so that’s the end of the Intro. In the next blog article, we’ll start looking at some of the specifics of how Auto Indexing works behind the covers and why it’s important to understand some of these workings…

2019 Public Appearances (What In The World) August 15, 2019

Posted by Richard Foote in Oracle Indexes.
add a comment

I’ll be presenting at a number of Oracle events over the remainder of the year. Details as follows:

Oracle Open World – San Francisco (16-19 September 2019)

  • Session Type: Conference
  • Session Session ID: CON1432
  • Session Title: Oracle Database 19c: In-Depth Look into the New Automatic Indexing Feature
  • Room: Moscone South – Room 152A Date: 09/17/19
  • Start Time: 11:15:00 AM
  • End Time: 12:00:00 PM

Link: https://events.rainfocus.com/widget/oracle/oow19/catalogow19?

Trivadis Performance Days 2019 (26-27 September 2019)

I have the following 2 presentations:

  • Indexing With Partitioning
  • Oracle Database 18c and 19c New Indexing Related Features

Link: https://www.trivadis.com/en/training/performance-days-2019-tvdpdays

AUSOUG Connect 2019 Series (Melbourne 14 October, Perth 16 October)

I have the following presentation:

  • 10 Things You Might Not Know But Really Should About Oracle Indexes

Link: https://www.ausoug.org.au/whats-on/connect2019/

ANZ Let’s Talk Database Series

I have the following 2 presentations:

  • Oracle Database 19c New Features
  • Oracle Exadata X8 New Features

Following are the confirmed dates:

  • Canberra: 22 October 2019
  • Sydney: 23 October 2019
  • Melbourne: 24 October 2019
  • Brisbane: 29 October 2019
  • Auckland: 30 October 2019
  • Wellington: 31 October 2019

Registration links coming soon.

UKOUG Techfest19 Brighton, UK (1-4 December 2019)

I have the following 3 presentations:

  • Oracle Database 19c New Features (Tuesday, 3 December 11:00-11:45am)
  • Oracle Indexing Q&A with Richard Foote (Tuesday, 3 December 2:45-3:30pm)
  • 10 Things You Might Not Know But Really Should About Oracle Indexes (Wednesday, 2:45-3:30pm)

Link: https://ukoug.org/page/techfest19

Hopefully I can catch you at one of these events. Don’t be shy, please say hi 🙂

Oracle 19c Automatic Indexing: Configuration (All I Need) July 29, 2019

Posted by Richard Foote in Automatic Indexing, AUTO_INDEX_COMPRESSION, AUTO_INDEX_DEFAULT_TABLESPACE, AUTO_INDEX_MODE, AUTO_INDEX_REPORT_RETENTION, AUTO_INDEX_RETENTION_FOR_AUTO, AUTO_INDEX_RETENTION_FOR_MANUAL, AUTO_INDEX_SCHEMA, AUTO_INDEX_SPACE_BUDGET, DBA_AUTO_INDEX_CONFIG, DBMS_AUTO_INDEX.CONFIGURE, Oracle Indexes, Oracle19c, SMB$CONFIG.
add a comment

In Rainbows

In this post, I’ll discuss how to configure the new Automatic Indexing capability introduced in Oracle Database 19c.

The intent of Oracle here is to make the configuration of Automatic Indexing as simplistic as possible, with as few levers for DBAs to potentially stuff up as possible. The ultimate goal would be to have a switch that just turns the feature on and that all necessary indexes then simply be created/modified/dropped as required. It’s not quite there yet, but it’ll no doubt get closer with each new release.

By default, Automatic Indexing is turned OFF. To turn on these capabilities, you simply run the following using the DBMS_AUTO_INDEX.CONFIGURE procedure:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_MODE‘,’IMPLEMENT‘);

PL/SQL procedure successfully completed.

That’s it. Automatic Indexing is now enabled and as discussed in the previous blog post on the Automatic Indexing Methodology, every 15 minutes, the SYS_AUTO_INDEX_TASK background task will kickoff and automatically create/replace/drop any database indexes as necessary.

Another option, is to enable Automatic Indexing in ‘REPORT ONLY‘ mode:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_MODE’,’REPORT ONLY‘);

PL/SQL procedure successfully completed.

This will only create new indexes as INVISIBLE indexes, which are not considered by default by the CBO. The intent here is that the DBA can investigate the newly created Automatic Indexes and decide whether turning on this feature for real would be a good idea. Need to exercise some caution with this option though, as the limited options regarding how to subsequently administer the created Invisible Automatic Indexing can be problematic. I’ll discuss all this is more detail in a future post.

To turn off Automatic Indexing, simply set the AUTO_INDEX_MODE to ‘OFF’:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_MODE’,’OFF‘);

PL/SQL procedure successfully completed.

Note here the documentation states “the existing auto indexes are disabled” which is incorrect. New Automatic Indexes will no longer be created, but existing Automatic Indexes will still be both Visible and Valid and available to the CBO for continued use.

By default, Automatic Indexing considers all tables in all “user created” schemas. However, this can be controlled with the AUTO_INDEX_SCHEMA option within the DBMS_AUTO_INDEX.CONFIGURE procedure. You can control which schemas to either explicitly include or exclude from Automatic Indexing considerations.

To add the BOWIE schema to an “Inclusion” list of schemas, you run the following:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_SCHEMA‘,’BOWIE’, TRUE);

PL/SQL procedure successfully completed.

 

To add the BOWIE schema to an “Exclusion” list of schemas, you run the following:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_SCHEMA‘,’BOWIE’, FALSE);

PL/SQL procedure successfully completed.

 

To remove the BOWIE schema from whichever list it belongs, you run the following:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_SCHEMA‘, ‘BOWIE’, NULL);

PL/SQL procedure successfully completed.

 

To remove all schemas from the “Inclusion” list, you run the following:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_SCHEMA‘, NULL, TRUE);

PL/SQL procedure successfully completed.

 

You can configure a tablespace to be the tablespace in which all Automatic Indexes are to now be created by running the following:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_DEFAULT_TABLESPACE‘,’INDEX_TS’);

PL/SQL procedure successfully completed

The INDEX_TS tablespace is now the location of all newly created Automatic Indexes.

You can also control how much of the configured Automatic Indexing tablespace is to be reserved for use by Automatic Indexes (default 50%) by running the following:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_SPACE_BUDGET‘, ’42’);

PL/SQL procedure successfully completed.

Now, only 42% of the INDEX_TS tablespace can be used by Automatic Indexes.

 

You can control the number of days (the default is 373 days) in which if an Automatic Index has been deemed NOT to have been used, it will be automatically dropped. The following command:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_RETENTION_FOR_AUTO‘, ’42’);

PL/SQL procedure successfully completed.

will automatically drop any Automatic Index that is deemed not to have been used in the last 42 days.

Note: Oracle uses the new Indexing Tracking feature introduced in Oracle 12.2 to determine if an index has/has not been used, which has limitations that could potentially result in an Automatic Index that has been “lightly” used during the retention period being dropped. This will be discussed in more detail in a future post.

A similar retention configuration can be implemented for manually created indexes. The following command:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_RETENTION_FOR_MANUAL’,’42’);

PL/SQL procedure successfully completed.

will automatically drop any manually created index that is deemed not to have been used in the last 42 days.

Note: the same Indexing Tracking limitations means that manually created indexes lightly used during the last 42 days could also be automatically dropped.

I will discuss various undocumented implications of automatically dropping both Automatic and Manual Indexes in future posts.

 

Very importantly,  the Automatic Indexing logs on which Automatic Indexing Reports are based are only retained by default for just 31 days. To change the time before the Automatic Indexing logs are deleted, you run the following:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_REPORT_RETENTION‘,’342’);

PL/SQL procedure successfully completed.

The Automatic Indexing logs are now retained for 342 days. These Automatic Indexing reports (and hence logs) are critical for understanding what the Automatic Indexing featuring is doing within the database and will be discussed extensively in future posts. I would recommend increasing the retention period from the 31 days default.

 

The final configuration option is not officially documented (yet) and controls whether or not Advanced Compression is used for Automatic Indexes. The default is that Advanced Compression is disabled, but this can be changed as follows:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_COMPRESSION‘,’ON’);

PL/SQL procedure successfully completed.

All Automatic Indexes are now automatically compressed using Advanced Low Compression.

I believe this was at some stage going to be the default behaviour, but due to licencing considerations and that many sites don’t have the Advanced Compression Option, this was subsequently changed. If you do have Advanced Compression, I would strongly recommend turning this ON, as implementing Advanced Low Compression is a relatively no-brainer beneficial decision.

I will however discuss the various undocumented implications of Advanced Compression in relation to Automatic Indexing in future posts.

The documented DBA_AUTO_INDEX_CONFIG view can be used to view the current setting for all of these configuration options:

SQL> select * from dba_auto_index_config;

PARAMETER_NAME                             PARAMETER_VALUE      LAST_MODIFIED                              MODIFIED_BY
---------------------------------------- -------------------- ---------------------------------------- --------------------
AUTO_INDEX_COMPRESSION                   OFF
AUTO_INDEX_DEFAULT_TABLESPACE
AUTO_INDEX_MODE                          REPORT ONLY           03-JUL-19 05.43.28.000000 AM            BOWIE
AUTO_INDEX_REPORT_RETENTION              31
AUTO_INDEX_RETENTION_FOR_AUTO            42                    20-JUN-19 06.32.06.000000 AM            BOWIE
AUTO_INDEX_RETENTION_FOR_MANUAL                                02-JUL-19 12.12.21.000000 AM            BOWIE
AUTO_INDEX_SCHEMA  schema IN (BOWIE)                           20-JUN-19 06.27.26.000000 AM            BOWIE
AUTO_INDEX_SPACE_BUDGET                  50

 

Additionally, you can view both the documented and undocumented settings regarding Automatic Indexing by looking at the SMB$CONFIG table:

 

SQL> select parameter_name, parameter_value
from sys.SMB$CONFIG;

PARAMETER_NAME                           PARAMETER_VALUE
---------------------------------------- ---------------
SPACE_BUDGET_PERCENT                                  10
PLAN_RETENTION_WEEKS                                  53
SPM_TRACING                                            0
AUTO_CAPTURE_PARSING_SCHEMA_NAME                       0
AUTO_CAPTURE_MODULE                                    0
AUTO_CAPTURE_ACTION                                    0
AUTO_CAPTURE_SQL_TEXT                                  0
AUTO_INDEX_SCHEMA                                      0
AUTO_INDEX_DEFAULT_TABLESPACE                          0
AUTO_INDEX_SPACE_BUDGET                               50
AUTO_INDEX_REPORT_RETENTION                           31
AUTO_INDEX_RETENTION_FOR_AUTO                          0
AUTO_INDEX_RETENTION_FOR_MANUAL                        0
AUTO_INDEX_MODE                                        0
_AUTO_INDEX_TRACE                                      0
_AUTO_INDEX_TASK_INTERVAL                            900
_AUTO_INDEX_TASK_MAX_RUNTIME                        3600
_AUTO_INDEX_IMPROVEMENT_THRESHOLD                     20
_AUTO_INDEX_REGRESSION_THRESHOLD                      10
_AUTO_INDEX_ABSDIFF_THRESHOLD                        100
_AUTO_INDEX_STS_CAPTURE_TASK                           0
_AUTO_INDEX_CONTROL                                    0
_AUTO_INDEX_DERIVE_STATISTICS                          0
_AUTO_INDEX_CONCURRENCY                                1
_AUTO_INDEX_SPA_CONCURRENCY                            1
_AUTO_INDEX_REBUILD_TIME_LIMIT                        30
_AUTO_INDEX_REBUILD_COUNT_LIMIT                        5
_AUTO_INDEX_REVERIFY_TIME                             30
AUTO_INDEX_COMPRESSION                                 0
AUTO_SPM_EVOLVE_TASK                                   0
AUTO_SPM_EVOLVE_TASK_INTERVAL                       3600
AUTO_SPM_EVOLVE_TASK_MAX_RUNTIME                    1800

 

I’ll discuss a number of these undocumented configuration options in future posts.

In my next post however, we’ll look at the Automatic Indexing feature in action with a very simple example to start with…

Oracle 19c Automatic Indexing: Methodology Introduction (After Today) July 24, 2019

Posted by Richard Foote in Automatic Indexing, Oracle Indexes, Oracle19c.
1 comment so far

young americans

For the past month or so I’ve been playing around extensively with the new Oracle 19c “Automatic Indexing” feature, so I thought it was time to start blogging about it.

Considering it’s only in “Version 1” status, my initial impression is very positive in that it works extremely well doing at what it’s initially designed to do. There are certainly “interesting” bits of behavior here and there and some missing functionality, but it’s a whole lot better than DB running sub-optimally due to missing indexes.

This post is purely an introduction in which I’ll just discuss the general methodology behind the Automatic Indexing (AI) capability, initially in a somewhat simplistic manner in order to convey the more important concepts. I’ll expand and discuss many of the actual complexities behind this cool feature in future posts.

The basic concept behind AI is that the DBA via a simple “switch” can turn on the AI  feature and have the Oracle Database automatically create/drop/modify any necessary database indexes as required.

The Oracle database continually captures SQL workloads and keeps track of any new column usages within SQL predicates (as well as associated plans and execution statistics) that might warrant the need for a new index. Note that currently, only equality predicates are considered when determining potential candidate Automatic Indexes.

By default, a background task is executed every 15 minutes that by default runs for up to an hour, to determine if the database can identify any new indexes that “might” be warranted. There are a number of reasons why the tasks might actually take considerably more time than the default 1 hour (I’ll expand on reasons why in future posts).

Candidate indexes that have been detected based on the previous 15 mins database workload are initially created as INVISIBLE/UNUSABLE indexes and hard parsed with the captured SQLs to determine if the index could be considered by the CBO.

If the candidate indexes are indeed viable, the indexes are then created as INVISIBLE/USABLE indexes and verified via the SQL Performance Analyzer using SQL Tuning Sets to determine if the performance of the captured SQLs have indeed improved from the existing plans when using the newly created Automatic Indexes.

If performance actually improves for all captured SQLs using a new Automatic Index, the Automatic Index is made VISIBLE and is now available for general database use. If performance is worse for all captured SQLs, the Automatic Index is made UNUSABLE again and hence not available to the CBO. If performance is better for some SQLs but worse for others, well the story gets a little complicated. In “theory”, the Automatic Index is made USABLE but SQL baselines are created for the SQLs that suffer performance degradation to not use the Automatic Index. Any such SQLs are effectively “blacklisted” and are not (easily) considered for future “new” AI deliberations.  As I’ll discuss in future posts, things are not actually quite as straightforward as that.

So depending on the scenario, Automatic Indexes can end up being in any of the following states:

  • Invisible and Unusable
  • Invisible and Valid
  • Visible and Unusable
  • Visible and Valid

When Visible and Valid, Automatic Indexes can ultimately be:

  • Used by the CBO
  • Not used by the CBO (even by the SQL that caused its creation)

As part of the identify candidate index process, Oracle will consider if a new index can be logically “merged” with an existing index and effectively replace an existing index by dropping and replacing it with a new Automatic Index.

AI will also monitor if existing (either Automatic or Manual) indexes are not currently being used within the database. If after a configurable period of time an index is deemed not to have been used within that time, the index will be automatically dropped during the AI 15 minute task.

This is the basic AI story. The actual story is a tad more complicated and which I’ll expand upon in many many future posts, so keep your questions until then 🙂

 

Importantly, AI is ONLY available on the Exadata platform or on Oracle Cloud environments. It is NOT available on standard On-Premises Oracle 19c deployments. Attempts to turn the feature on where not supported will only result in disappointment:

 

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_MODE’,’IMPLEMENT’);
BEGIN DBMS_AUTO_INDEX.CONFIGURE(‘AUTO_INDEX_MODE’,’IMPLEMENT’); END;

*
ERROR at line 1:
ORA-40216: feature not supported
ORA-06512: at “SYS.DBMS_SYS_ERROR”, line 79
ORA-06512: at “SYS.DBMS_AUTO_INDEX_INTERNAL”, line 9180
ORA-06512: at “SYS.DBMS_AUTO_INDEX”, line 283
ORA-06512: at line 1

 

I’ll next discuss the AI configuration options available to the DBA…

 

 

Bitmap Index On Column With 212552698 Distinct Values, What Gives? (I’d Rather Be High) April 23, 2019

Posted by Richard Foote in Advanced Index Compression, Autonomous Data Warehouse, Autonomous Database, Index Compression, Oracle Indexes.
add a comment

The Next Day

In my previous post on Indexing The Autonomous Warehouse, I highlighted how it might be necessary to create indexes to improve the performance and scalability of highly selective queries, as it might on any Data Warehouse running on an Exadata platform. In the post, I created a Bitmap Index and showed how improve SQL performance insured.

I’ve had two subsequent correspondences asking why I decided to use a Bitmap Index in my example, when the column in question had a (relatively) massive number of distinct values, some 212552698. What gives, wouldn’t it have been far more efficient to have used a standard B-Tree index instead, as Bitmap Indexes are only suitable for columns that have low-medium numbers of distinct values? There’s nothing particularly low-medium cardinality about 212552698 distinct values.

It’s a good question and as 2 people asked basically the same question, thought it worthy of a separate post to try to address it.

Firstly, I must admit the only calculation I performed mentally when deciding which type of index to use was to look at the number of distinct values, 212552698 and the number of rows in the table, 1 billion and determine that there’s approximately 5 rows per distinct value. With 5 repeated values on average, this was sufficient for the Bitmap Index to likely be the more efficient type of index and so I created a Bitmap Index in my demo, considering this was a Data Warehouse environment where potential locking issues relating to concurrent DMLs was not an issue.

As I’ve discussed previously, as with this classic post from way back in 2010, “So What Is A Good Cardinality Estimate For A Bitmap Index Column“, it’s not the number of distinct values that’s important, it’s the column cardinality and the average number of repeated column values that’s important when deciding if a Bitmap Index might be appropriate in a Data Warehouse environment.

As above post discusses, a Bitmap Index might only need the one index entry per column value and as the resultant index bitmap string can be very very highly compressed if there are few actual occurrences per value, a column with as many as 5 repeated values of average would likely be more efficient than a corresponding B-Tree Index. A B-Tree Index with 5 repeated values would require the value to be stored 5 times, with their 5 corresponding rowids, whereas the Bitmap Index might only need to store the indexed value once with its 2 corresponding rowids and a trivial amount for the highly compressed bitmap string.

If we create an equivalent B-Tree Index to the previously created Bitmap Index:

SQL> create index big_ziggy_lo_orderkey_i2 on big_ziggy(lo_orderkey) invisible;

SQL> select index_name, index_type, leaf_blocks, compression from user_indexes
     where table_name = 'BIG_ZIGGY';

INDEX_NAME                     INDEX_TYPE LEAF_BLOCKS COMPRESSION
------------------------------ ---------- ----------- -------------
BIG_ZIGGY_LO_ORDERKEY_I        BITMAP          843144 DISABLED
BIG_ZIGGY_LO_ORDERKEY_I2       NORMAL         2506552 DISABLED

We notice the Bitmap Index at 843144 leaf blocks is indeed substantially smaller than the equivalent B-Tree Index at 2506552 leaf blocks.

However, this is the Oracle Autonomous Data Warehouse environment where I have accessed to the Advanced Compression option and the capability to potentially create highly compressed B-Tree index structures. See various previous posts on Index Advanced Compression.

Perhaps, I can create a smaller structure to the Bitmap Index by using Index Advanced Compress. Let’s try initially with Advanced Compression Low:

SQL> drop index big_ziggy_lo_orderkey_i2;

Index dropped.

SQL> create index big_ziggy_lo_orderkey_i3 on big_ziggy(lo_orderkey) compress advanced low invisible;
...

SQL> select index_name, index_type, leaf_blocks, compression from user_indexes
     where table_name = 'BIG_ZIGGY';

INDEX_NAME                     INDEX_TYPE LEAF_BLOCKS COMPRESSION
------------------------------ ---------- ----------- -------------
BIG_ZIGGY_LO_ORDERKEY_I        BITMAP          843144 DISABLED
BIG_ZIGGY_LO_ORDERKEY_I3       NORMAL         1921296 ADVANCED LOW

We notice the B-Tree Index is indeed now smaller at just 1921296 leaf blocks, down from 2506552 leaf blocks, but still not as small as the 843144 leaf blocks of the Bitmap Index.

However, this is a Data Warehouse environment where the DML overheads associated with the maintenance of Advanced Compression High indexes may not be of such concern. Additionally, I might have ample CPU resources to cater for any additional CPU requirements of accessing such Advanced Compression High indexes. Therefore, let’s create a B-Tree Index with Advanced Compression High:

SQL> drop index big_ziggy_lo_orderkey_i3;

Index dropped.

SQL> create index big_ziggy_lo_orderkey_i4 on big_ziggy(lo_orderkey) compress advanced high invisible;
...

SQL> select index_name, index_type, leaf_blocks, compression from user_indexes
     where table_name = 'BIG_ZIGGY';

INDEX_NAME                     INDEX_TYPE LEAF_BLOCKS COMPRESSION
------------------------------ ---------- ----------- -------------
BIG_ZIGGY_LO_ORDERKEY_I        BITMAP          843144 DISABLED
BIG_ZIGGY_LO_ORDERKEY_I4       NORMAL          786537 ADVANCED HIGH

We notice the index has significantly reduced further in size and at just 786537 leaf blocks in now indeed smaller than the corresponding Bitmap Index.

Note that although it might be beneficial to use Advanced Compressed High on Bitmap Indexes, such an option is not currently supported.

Of course your mileage will vary on which Index Type and Compression Option will be most appropriate depending on the characteristics of your data, however it might be worth considering these options in environments where you have access to these indexing options.

But yes, a Bitmap Index might indeed be the better option, even if there are billions of distinct values (if there are say 10s of billions of rows) for the columns to be indexed…

Indexing The Oracle Autonomous Data Warehouse (Autobahn) April 2, 2019

Posted by Richard Foote in Autonomous Data Warehouse, Autonomous Database, Oracle Indexes.
2 comments

autobahn

When the Autonomous Data Warehouse was initially released in March 2018, one of the notable restrictions was that it did not support typical Data Warehouse performance related structures such as Indexes, Partitioning or Materialized Views. The platform was deemed efficient enough to not require these features, which would unnecessarily complicate the autonomous environment.

In September 2018, these restrictions were finally lifted with the (for now) manual creation of Indexes, Partitioning and Materialized Views now all supported. Oracle however states that “Oracle recommends that you do not manually create these structures, and leave performance optimizations to the Autonomous Data Warehouse. If you’re a highly skilled Oracle Database tuning expert and decide to manually create these access structures, please be advised to test the impact of your manual tuning efforts on your full workload.

If you’re not a “highly skilled Oracle Database tuning expert”, let me give you some advice on when to create indexes in the Autonomous Data Warehouse (on Oracle Database 18c where these features are NOT automatically created and maintained by autonomous processes).

I’ll initially focus on two of the key capabilities available in the Autonomous Data Warehouse environment that might indeed make some index structures unnecessary, although neither is unique to autonomous databases.

Let’s begin by creating a relatively large table based on the huge LINEORDER table found in the provided SSB schema. The following BIG_ZIGGY table is based on 1 billion rows from the LINEORDER table:

SQL> create table big_ziggy select * from ssb.lineorder where rownum<=1000000000;

Frustratingly, if you have a session that performs a database operation for an extended period, the session looses its connection and hangs indefinitely. You need to start a separate session to confirm when the task actually completes and track progress by say looking a v$session_longops or querying dba_segments.

So we have a decently sized BIG_ZIGGY table at around 27GB:

SQL> select table_name, num_rows, blocks, round(blocks/128) MB
from user_tables where table_name='BIG_ZIGGY';

TABLE_NAME     NUM_ROWS     BLOCKS         MB
------------ ---------- ---------- ----------
BIG_ZIGGY    1000000000    3534714      27615

Here are a couple of key points.

Firstly, if you run a query on a table using filtering columns for the first time, then none of the smarts in the autonomous database environments are going to help you here. Things will initially run slowly, because Oracle would not have detected yet there is anything that needs to be tuned. The tuning process is “reactive”, in that Oracle needs to find an issue first before it can potentially address it.

The second key point is that if only a small fraction of the overall data is required or need be accessed, if the data being accessed is very highly filtered, then a database index is likely going to be highly effective, even within the Autonomous Data Warehouse. This is precisely why indexes are typically required in databases.

Although very often, large volumes of data are indeed accessed in typical Data Warehouse workloads, this is not always the case. It’s certainly not uncommon for ad-hoc queries and the such to only access a small fraction of an available data set.

The following query on the large BIG_ZIGGY table only returns 9 rows:

SQL> select * from big_ziggy
where lo_orderkey = 2294059393;

9 rows selected.

Elapsed: 00:06:23.99

Execution Plan
----------------------------------------------------------
Plan hash value: 2097953955

---------------------------------------------------------------------------------------------------------
| Id | Operation                   | Name                       | Rows | Bytes | Cost (%CPU)| Time      |
---------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                            |    5 |   505 |  67250 (12)|  00:00:03 |
|  1 |  RESULT CACHE               | 2u5qwsq68ghc5bqn5spqhyqjyj |      |       |            |           |
|* 2 |   TABLE ACCESS STORAGE FULL | BIG_ZIGGY                  |    5 |   505 |  67250 (12)|  00:00:03 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("LO_ORDERKEY"=2294059393)
     filter("LO_ORDERKEY"=2294059393)

Result Cache Information (identified by operation id):
------------------------------------------------------

1 - column-count=17; dependencies=(BOWIE.BIG_ZIGGY); name="select * from big_
ziggy where lo_orderkey = 2294059393"

Note
-----

- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
      1 recursive calls
      0 db block gets
3530785 consistent gets
3530770 physical reads
      0 redo size
   2116 bytes sent via SQL*Net to client
    552 bytes received via SQL*Net from client
      2 SQL*Net roundtrips to/from client
      0 sorts (memory)
      0 sorts (disk)
      9 rows processed

The query takes nearly 6.5 minutes to complete which is clearly a long time to return a few rows.

Note that the Autonomous Data Warehouse platform is Exadata and Exadata has a number of key capabilities, such a Hybrid Columnar Compression (HCC), Smarts Scans, Storage Indexes and the such that can be extremely useful in Data Warehouse environments.

For example, the table is smaller and can be accessed more efficiently thanks to HCC:

SQL> select table_name, compression, compress_for
from user_tables where table_name='BIG_ZIGGY';

TABLE_NAME   COMPRESS COMPRESS_FOR
------------ -------- ------------------------------
BIG_ZIGGY     ENABLED QUERY HIGH ROW LEVEL LOCKING

Exadata Smart Scans and associated Storage Indexes can result in extremely efficient Full Table Scans by potentially skipping large sections of the physical data from having to be accessed and by returning just the required data set back to the database. I’ve blogged previously a number of times on the power of Storage Indexes and Exadata related smarts.

However, if a query is executed using specific filtering columns for the first time, then Storage Indexes will not have yet been created. If we look at the effectiveness of Storage Indexes when running the previous query:

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s
where n.statistic# = s.statistic# and n.name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME                                                                     MB
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index                             0
cell physical IO interconnect bytes returned by smart scan       7.79956818

Note that initially, no bytes are saved by storage indexes as they don’t initially exist. Although only a relatively small amount of data is actually returned to the database server, having to read the entire table is expensive and why the query is taking so long to complete.

If we run a number of different queries with a filter predicate on the same LO_CUSTKEY column and eventually run the following query:

SQL> select lo_orderkey, lo_linenumber, lo_custkey, lo_partkey, lo_quantity
from big_ziggy where lo_orderkey = 2294059491;

6 rows selected.

Elapsed: 00:01:41.95

Execution Plan
----------------------------------------------------------
Plan hash value: 2097953955

---------------------------------------------------------------------------------------------------------
| Id | Operation                   | Name                       | Rows | Bytes | Cost (%CPU)| Time      |
---------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                            |    5 |   125 |  67250 (12)|  00:00:03 |
|  1 |  RESULT CACHE               | 9ycjfft9fx7gy506tr9d98djdj |      |       |            |           |
|* 2 |   TABLE ACCESS STORAGE FULL | BIG_ZIGGY                  |    5 |   125 |  67250 (12)|  00:00:03 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("LO_ORDERKEY"=2294059491)
     filter("LO_ORDERKEY"=2294059491)

Result Cache Information (identified by operation id):
------------------------------------------------------

1 - column-count=5; dependencies=(BOWIE.BIG_ZIGGY); name="select lo_orderkey,
lo_linenumber, lo_custkey, lo_partkey, lo_quantity from big_ziggy where lo_orde
rkey = 2294059491"

Note
-----

- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
      1 recursive calls
      0 db block gets
3530785 consistent gets
3530770 physical reads
      0 redo size
    760 bytes sent via SQL*Net to client
    552 bytes received via SQL*Net from client
      2 SQL*Net roundtrips to/from client
      0 sorts (memory)
      0 sorts (disk)
      6 rows processed

We note that the overall elapsed times have now decreased, taking only 1 minute, 45 seconds to complete.

If we now look at the effectiveness of storage indexes:

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s
where n.statistic# = s.statistic# and n.name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME                                                                     MB
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index                     731.78125
cell physical IO interconnect bytes returned by smart scan       7.59265137

The storage index is now providing some benefit but as I’ve discussed previously, there are limitations to their effectiveness depending on data distribution and the physical clustering of the data.  So your mileage will vary on the effectiveness of smart scans and no, Oracle does not (currently) automatically re-cluster data to improve smart scan performance when most effective.

If you re-run the same query multiple times, another performance feature kicks in:

SQL> select lo_orderkey, lo_linenumber, lo_custkey, lo_partkey, lo_quantity from big_ziggy where lo_orderkey = 2294059491;

6 rows selected.

Elapsed: 00:00:01.74

Execution Plan
----------------------------------------------------------
Plan hash value: 2097953955

---------------------------------------------------------------------------------------------------------
|  Id | Operation                   | Name                       | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                            |    5 |   125 |  67250 (12)| 00:00:03 |
|   1 |  RESULT CACHE               | 9ycjfft9fx7gy506tr9d98djdj |      |       |            |          |
| * 2 |   TABLE ACCESS STORAGE FULL | BIG_ZIGGY                  |    5 |   125 |  67250 (12)| 00:00:03 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("LO_ORDERKEY"=2294059491)
     filter("LO_ORDERKEY"=2294059491)

Result Cache Information (identified by operation id):
------------------------------------------------------

1 - column-count=5; dependencies=(BOWIE.BIG_ZIGGY); name="select lo_orderkey,
lo_linenumber, lo_custkey, lo_partkey, lo_quantity from big_ziggy where lo_orderkey = 2294059491"

Note
-----

- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
  0 recursive calls
  0 db block gets
  0 consistent gets
  0 physical reads
  0 redo size
760 bytes sent via SQL*Net to client
552 bytes received via SQL*Net from client
  2 SQL*Net roundtrips to/from client
  0 sorts (memory)
  0 sorts (disk)
  6 rows processed

Response times are now under 2 seconds (Australia is a long way from Ashburn, USA), with Oracle automatically using the Result Cache to directly access the necessary result and hence not requiring any table related consistent reads. This is as efficient as it gets, but it’s dependant on similar result sets being returned and no data changes that would invalidate the results cache such as a new data load into the data warehouse:

SQL> insert into big_ziggy select * from big_ziggy where rownum  commit;

Commit complete.

SQL> select lo_orderkey, lo_linenumber, lo_custkey, lo_partkey, lo_quantity from big_ziggy where lo_orderkey = 2294059491;

6 rows selected.

Elapsed: 00:01:44.79

Execution Plan
----------------------------------------------------------
Plan hash value: 2097953955

---------------------------------------------------------------------------------------------------------
|  Id | Operation                   | Name                       | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                            |    5 |   125 |  67250 (12)| 00:00:03 |
|   1 |  RESULT CACHE               | 9ycjfft9fx7gy506tr9d98djdj |      |       |            |          |
| * 2 |   TABLE ACCESS STORAGE FULL | BIG_ZIGGY                  |    5 |   125 |  67250 (12)| 00:00:03 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("LO_ORDERKEY"=2294059491)
     filter("LO_ORDERKEY"=2294059491)

Result Cache Information (identified by operation id):
------------------------------------------------------

1 - column-count=5; dependencies=(BOWIE.BIG_ZIGGY); name="select lo_orderkey,
lo_linenumber, lo_custkey, lo_partkey, lo_quantity from big_ziggy where lo_orde
rkey = 2294059491"

Note
-----

- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
      1 recursive calls
      0 db block gets
3530808 consistent gets
3530786 physical reads
      0 redo size
    760 bytes sent via SQL*Net to client
    552 bytes received via SQL*Net from client
      2 SQL*Net roundtrips to/from client
      0 sorts (memory)
      0 sorts (disk)
      6 rows processed

After new data gets loaded into the table, we’re back to being reliant on the effectiveness of the Exadata smart scan.

So we have a scenario here where we either have excellent performance via the result cache or just average performance is we want data for specific data based on a moderately effective storage index and resultant smart scan (in a different session, the following query is run):

SQL> select lo_orderkey, lo_linenumber, lo_custkey, lo_partkey, lo_quantity from big_ziggy where lo_orderkey = 2294059492;

Elapsed: 00:02:03.99

Execution Plan
----------------------------------------------------------
Plan hash value: 2097953955

---------------------------------------------------------------------------------------------------------
|  Id | Operation                   | Name                       | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                            |    5 |   125 |  67250 (12)| 00:00:03 |
|   1 |  RESULT CACHE               | 4s0z6mkqpufc33mxb9ddvuc69r |      |       |            |          |
| * 2 |   TABLE ACCESS STORAGE FULL | BIG_ZIGGY                  |    5 |   125 |  67250 (12)| 00:00:03 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("LO_ORDERKEY"=2294059492)
     filter("LO_ORDERKEY"=2294059492)

Result Cache Information (identified by operation id):
------------------------------------------------------

1 - column-count=5; dependencies=(BOWIE.BIG_ZIGGY); name="select lo_orderkey,
lo_linenumber, lo_custkey, lo_partkey, lo_quantity from big_ziggy where lo_orde
rkey = 2294059492"
Note
-----
- automatic DOP: Computed Degree of Parallelism is 1
Statistics
----------------------------------------------------------
      1 recursive calls
      0 db block gets
3530785 consistent gets
3530770 physical reads
      0 redo size
    695 bytes sent via SQL*Net to client
    552 bytes received via SQL*Net from client
      2 SQL*Net roundtrips to/from client
      0 sorts (memory)
      0 sorts (disk)
      2 rows processed

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME                                                                     MB
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index                     731.78125
cell physical IO interconnect bytes returned by smart scan       7.59265137

To get both consistent and excellent performance when running a query that performs a high level of filtering, we need to create an appropriate index as we would in any other regular data warehouse database environment:

SQL> create bitmap index big_ziggy_lo_orderkey_i on big_ziggy(lo_orderkey);
SQL> select index_name, leaf_blocks, blevel, distinct_keys, status from user_indexes where table_name='BIG_ZIGGY';

INDEX_NAME                LEAF_BLOCKS     BLEVEL DISTINCT_KEYS STATUS
------------------------- ----------- ---------- ------------- --------
BIG_ZIGGY_LO_ORDERKEY_I        843144          3     212552698 VALID

If we now run random queries based on the highly selective LO_ORDERKEY column:

SQL> select lo_orderkey, lo_linenumber, lo_custkey, lo_partkey, lo_quantity from big_ziggy where lo_orderkey = 2294059489;

Elapsed: 00:00:01.67

Execution Plan
----------------------------------------------------------
Plan hash value: 1677294303

------------------------------------------------------------------------------------------------------------
| Id | Operation                      | Name                       | Rows | Bytes | Cost (%CPU)| Time      |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT               |                            |    5 |   125 |       5 (0)|  00:00:01 |
|  1 |  RESULT CACHE                  | 7tn121mujfw3kfg39z5pg4duwy |      |       |            |           |
|  2 |   TABLE ACCESS BY INDEX ROWID  | BIG_ZIGGY                  |    5 |   125 |       5 (0)|  00:00:01 |
|  3 |    BITMAP CONVERSION TO ROWIDS |                            |      |       |            |           |
|* 4 |     BITMAP INDEX SINGLE VALUE  | BIG_ZIGGY_LO_ORDERKEY_I    |      |       |            |           |
------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

4 - access("LO_ORDERKEY"=2294059489)

Result Cache Information (identified by operation id):
------------------------------------------------------

1 - column-count=5; dependencies=(BOWIE.BIG_ZIGGY); attributes=(ordered); nam
e="select lo_orderkey, lo_linenumber, lo_custkey, lo_partkey, lo_quantity from b
ig_ziggy where lo_orderkey = 2294059489"
Note
-----
- automatic DOP: Computed Degree of Parallelism is 1 because of no expensive
parallel operation

Statistics
----------------------------------------------------------
  1 recursive calls
  0 db block gets
  8 consistent gets
  0 physical reads
  0 redo size
647 bytes sent via SQL*Net to client
553 bytes received via SQL*Net from client
  2 SQL*Net roundtrips to/from client
  0 sorts (memory)
  0 sorts (disk)
  1 rows processed

We now get the required, consistent, high performance that we require, without having to resort to consuming additional CPU resources by increasing say the number of available CPUs and degree of parallelism to reduce response times.

In short, you index the Autonomous Data Warehouse as you would any database running on an Exadata platform…

Remember, this is all effectively version one of the Autonomous Data Warehouse. The capability of automatically creating necessary indexes is coming very soon…