jump to navigation

Automatic Indexes: Automatically Rebuild Unusable Indexes Part II (“I Wish You Would”) May 11, 2022

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Local Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Partitioned Indexes, Partitioning, Performance Tuning, Rebuild Unusable Indexes.
1 comment so far

Within a few hours of publishing my last blog piece on how Automatic Indexing (AI) can automatically rebuild indexes that have been placed in an UNUSABLE state, I was asked by a couple of readers a similar question: “Does this also work if just a single partition of an partitioned index becomes unusable”?

My answer to them both is that I’ve provided them the basic framework in the demo to check out the answer to that question for themselves (Note: a fantastic aspect of working with the Oracle Database is that it’s available for free to play around with, including the Autonomous Database environments).

But based on the principle that for every time someone asks a question, there’s probably a 100 others who potentially might be wondering the same thing, thought I’ll quickly whip up a demo to answer this for all.

I’ll begin with the same table format and data as my previous blog:

SQL> CREATE TABLE big_ziggy(id number, album_id number, country_id number, release_date date,
total_sales number) PARTITION BY RANGE (release_date)
(PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2018 VALUES LESS THAN (TO_DATE('01-JAN-2019', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2019 VALUES LESS THAN (TO_DATE('01-JAN-2020', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2020 VALUES LESS THAN (TO_DATE('01-JAN-2021', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2021 VALUES LESS THAN (TO_DATE('01-JAN-2022', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2022 VALUES LESS THAN (MAXVALUE));

Table created.

SQL> INSERT INTO big_ziggy SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> COMMIT;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_ZIGGY');

PL/SQL procedure successfully completed.

 

But this time, I’ll run a number of queries similar to the following, that also has a predicate based on the partitioned key (RELEASE_DATE) of the table:

SQL> select * FROM big_ziggy where release_date = '01-JUN-2017' and total_sales = 123456;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3599046327

----------------------------------------------------------------------------------------------------
| Id | Operation              | Name      | Rows | Bytes | Cost (%CPU) | Time     | Pstart | Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT       |           |    1 |    26 |    1051 (2) | 00:00:01 |        |       |
|  1 | PARTITION RANGE SINGLE |           |    1 |    26 |    1051 (2) | 00:00:01 |      3 |     3 |
|* 2 |  TABLE ACCESS FULL     | BIG_ZIGGY |    1 |    26 |    1051 (2) | 00:00:01 |      3 |     3 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage(("TOTAL_SALES"=123456 AND "RELEASE_DATE"=TO_DATE('2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
    filter(("TOTAL_SALES"=123456 AND "RELEASE_DATE"=TO_DATE('2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       5618 consistent gets
          0 physical reads
          0 redo size
        676 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

If we wait for the next AI task to kick in:

DBMS_AUTO_INDEX.REPORT_LAST_ACTIVITY()
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 11-MAY-2022 10:55:43
Activity end                : 11-MAY-2022 10:56:27
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 192.94 MB (192.94 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 6
SQL statements improved (improvement factor) : 3 (6670.1x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 2x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------
| Owner | Table     | Index                | Key                      | Type   | Properties |
---------------------------------------------------------------------------------------------
| BOWIE | BIG_ZIGGY | SYS_AI_6wv99zdbsy8ar | RELEASE_DATE,TOTAL_SALES | B-TREE | LOCAL      |
---------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

We can see that AI has indeed automatically created a LOCAL, partitioned index (on columns RELEASE_DATE, TOTAL_SALES) in this scenario, as we have an equality predicate based on the partitioned key (RELEASE_DATE).

Currently, all is well with the index, with all partitions in a USABLE state:

SQL> SELECT index_name, partitioned, auto, visibility, status FROM user_indexes WHERE table_name = 'BIG_ZIGGY';

INDEX_NAME                     PAR AUT VISIBILIT STATUS
------------------------------ --- --- --------- --------
SYS_AI_6wv99zdbsy8ar           YES YES VISIBLE   N/A

SQL> select index_name, partition_name, status from user_ind_partitions where index_name='SYS_AI_6wv99zdbsy8ar';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
SYS_AI_6wv99zdbsy8ar           ALBUMS_2015          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2016          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2017          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2018          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2019          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2020          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2021          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2022          USABLE

SQL> select index_name, column_name, column_position from user_ind_columns 
     where index_name='SYS_AI_6wv99zdbsy8ar';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_6wv99zdbsy8ar           RELEASE_DATE                  1
SYS_AI_6wv99zdbsy8ar           TOTAL_SALES                   2

 

But if we now do an offline reorg of a specific table partition:

SQL> alter table big_ziggy move partition albums_2017;

Table altered.

SQL> select index_name, partition_name, status from user_ind_partitions where index_name='SYS_AI_6wv99zdbsy8ar';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
SYS_AI_6wv99zdbsy8ar           ALBUMS_2015          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2016          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2017          UNUSABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2018          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2019          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2020          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2021          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2022          USABLE

 

We can see we’ve now made the associated Local Index partition UNUSABLE.

If we run the following query:

SQL> select * FROM big_ziggy where release_date = '01-JUN-2017' and total_sales = 123456;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3599046327

----------------------------------------------------------------------------------------------------
| Id | Operation              | Name      | Rows | Bytes | Cost (%CPU) | Time     | Pstart | Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT       |           |    1 |    26 |     986 (2) | 00:00:01 |        |       |
|  1 | PARTITION RANGE SINGLE |           |    1 |    26 |     986 (2) | 00:00:01 |      3 |     3 |
|* 2 |  TABLE ACCESS FULL     | BIG_ZIGGY |    1 |    26 |     986 (2) | 00:00:01 |      3 |     3 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage(("TOTAL_SALES"=123456 AND "RELEASE_DATE"=TO_DATE('2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
    filter(("TOTAL_SALES"=123456 AND "RELEASE_DATE"=TO_DATE('2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))

Statistics
----------------------------------------------------------
          3 recursive calls
          4 db block gets
       5578 consistent gets
       5571 physical reads
        924 redo size
        676 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

The CBO has no choice here but to do a full partition table scan.

If now wait again for the next AI task to strut its stuff:

SQL> select dbms_auto_index.report_last_activity() from dual;

DBMS_AUTO_INDEX.REPORT_LAST_ACTIVITY()
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 11-MAY-2022 11:42:42
Activity end                : 11-MAY-2022 11:43:13
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 192.94 MB (192.94 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 4
SQL statements improved (improvement factor) : 1 (5573x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 1.1x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------
| Owner | Table     | Index                | Key                      | Type   | Properties |
---------------------------------------------------------------------------------------------
| BOWIE | BIG_ZIGGY | SYS_AI_6wv99zdbsy8ar | RELEASE_DATE,TOTAL_SALES | B-TREE | LOCAL      |
---------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------


SQL> select index_name, partition_name, status from user_ind_partitions where index_name='SYS_AI_6wv99zdbsy8ar';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
SYS_AI_6wv99zdbsy8ar           ALBUMS_2015          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2016          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2017          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2018          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2019          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2020          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2021          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2022          USABLE

The index partition is now automatically in a USABLE state again.

If we look at the index object data:

SQL> select object_name, subobject_name, to_char(created, 'dd-Mon-yy hh24:mi:ss') created, to_char(last_ddl_time, 'dd-Mon-yy hh24:mi:ss’)
last_ddl_time from dba_objects where object_name='SYS_AI_6wv99zdbsy8ar';

OBJECT_NAME                    SUBOBJECT_NAME       CREATED                     LAST_DDL_TIME
------------------------------ -------------------- --------------------------- ---------------------------
SYS_AI_6wv99zdbsy8ar           ALBUMS_2015          11-May-22 10:41:33          11-May-22 10:56:14
SYS_AI_6wv99zdbsy8ar           ALBUMS_2016          11-May-22 10:41:33          11-May-22 10:56:15
SYS_AI_6wv99zdbsy8ar           ALBUMS_2017          11-May-22 10:41:33          11-May-22 11:42:42
SYS_AI_6wv99zdbsy8ar           ALBUMS_2018          11-May-22 10:41:33          11-May-22 10:56:18
SYS_AI_6wv99zdbsy8ar           ALBUMS_2019          11-May-22 10:41:33          11-May-22 10:56:19
SYS_AI_6wv99zdbsy8ar           ALBUMS_2020          11-May-22 10:41:33          11-May-22 10:56:20
SYS_AI_6wv99zdbsy8ar           ALBUMS_2021          11-May-22 10:41:33          11-May-22 10:56:22
SYS_AI_6wv99zdbsy8ar           ALBUMS_2022          11-May-22 10:41:33          11-May-22 10:56:22
SYS_AI_6wv99zdbsy8ar                                11-May-22 10:41:33          11-May-22 11:43:13

 

We can see that just the impacted index partition has been rebuilt.

The CBO can now successfully use the index to avoid the full partition table scan:

SQL> select * FROM big_ziggy where release_date = '01-JUN-2017' and total_sales = 123456;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3640710173

-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                                  | Name                 | Rows | Bytes | Cost (%CPU)| Time     | Pstart | Pstop |
-----------------------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                           |                      |    1 |    26 |      4 (0) | 00:00:01 |        |       |
|  1 | PARTITION RANGE SINGLE                     |                      |    1 |    26 |      4 (0) | 00:00:01 |      3 |     3 |
|  2 |  TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_ZIGGY            |    1 |    26 |      4 (0) | 00:00:01 |      3 |     3 |
|* 3 |   INDEX RANGE SCAN                         | SYS_AI_6wv99zdbsy8ar |    1 |       |      3 (0) | 00:00:01 |      3 |     3 |
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("RELEASE_DATE"=TO_DATE(' 2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TOTAL_SALES"=123456)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        676 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

I’ll leave it to the discernible reader to determine if this also works in the scenario where the partitioned index were to be global… ūüôā

Automatic Indexes: Scenarios Where Automatic Indexes NOT Created Part III (“Loaded”) April 28, 2022

Posted by Richard Foote in 19c, Advanced Index Compression, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Data Clustering, Exadata, Index Access Path, Index Column Order, Index Compression, Oracle, Oracle 21c, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Overloading.
add a comment

In my previous two posts, I’ve discussed scenarios where Automatic Indexing (AI) does not currently created automatic indexes and you may need to manually create the necessary indexes.

In this post, I’ll discuss a third scenario where AI will create an index, but you may want to manually create an even better one…

I’ll start by creating a relatively “large” table, with 20+ columns:

SQL> create table bowie_overload (id number, code1 number, code2 number, stuff1 varchar2(42), stuff2 varchar2(42), stuff3 varchar2(42), stuff4 varchar2(42), stuff5 varchar2(42), stuff6 varchar2(42), stuff7 varchar2(42), stuff8 varchar2(42), stuff9 varchar2(42), stuff10 varchar2(42), stuff11 varchar2(42), stuff12 varchar2(42), stuff13 varchar2(42), stuff14 varchar2(42), stuff15 varchar2(42), stuff16 varchar2(42), stuff17 varchar2(42), stuff18 varchar2(42), stuff19 varchar2(42), stuff20 varchar2(42), name varchar2(42));

Table created.

SQL> insert into bowie_overload select rownum, mod(rownum, 1000)+1, '42', 'David Bowie', 'Major Tom', 'Ziggy Stardust', 'Aladdin Sane', 'Thin White Duke', 'David Bowie', 'Major Tom', 'Ziggy Stardust', 'Aladdin Sane', 'Thin White Duke','David Bowie', 'Major Tom', 'Ziggy Stardust', 'Aladdin Sane', 'Thin White Duke','David Bowie', 'Major Tom', 'Ziggy Stardust', 'Aladdin Sane', 'Thin White Duke', 'The Spiders From Mars' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_OVERLOAD');

PL/SQL procedure successfully completed.

 

The main columns to note here are CODE1 which contains 1000 distinct values (and so is kinda selective on a 10M row table, but not spectacularly so, especially with a poor clustering factor) and CODE2 which always contains the same value of “42” (and so will compress wonderfully for maximum effect).

I’ll next run the following query a number of times:

SQL> select code1, code2 from bowie_overload where code1=42;

10000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1883860831

--------------------------------------------------------------------------------------------
| Id  | Operation                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                | 10000 | 70000 |  74817 (1) | 00:00:03 |
| * 1 | TABLE ACCESS STORAGE FULL | BOWIE_OVERLOAD | 10000 | 70000 |  74817 (1) | 00:00:03 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE1"=24)
    filter("CODE1"=24)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     869893 consistent gets
     434670 physical reads
          0 redo size
     183890 bytes sent via SQL*Net to client
       7378 bytes received via SQL*Net from client
        668 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
      10000 rows processed

 

Without an index, the CBO currently has no choice but to perform a FTS. An index on the CODE1 column would provide the necessary filtering to fetch and return the required rows.

BUT, if this query was important enough, we could improve things further by “Overloading” this index with the CODE2 column, so we could use the index exclusively to get all the necessary data, without having to access the table at all. Considering an index on just the CODE1 column would need to fetch a reasonable number of rows (10000) and would need to visit a substantial number of different table blocks due to its poor clustering, overloading the index in this scenario would substantially reduce the necessary workloads of this query.

So what does AI do in this scenario, is overloading an index considered?

If we look at the AI report:

GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 28-APR-2022 12:15:45
Activity end                : 28-APR-2022 12:16:33
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 134.22 MB (134.22 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 2
SQL statements improved (improvement factor) : 2 (47.1x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 47.1x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
| Owner | Table          | Index                | Key   | Type   | Properties |
-------------------------------------------------------------------------------
| BOWIE | BOWIE_OVERLOAD | SYS_AI_aat8t6ad0ux0h | CODE1 | B-TREE | NONE       |
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------

Parsing Schema Name : BOWIE
SQL ID              : bh5cuyv8ga0bt
SQL Text            : select code1, code2 from bowie_overload where code1=42
Improvement Factor  : 46.9x

Execution Statistics:
-----------------------------
                    Original Plan                Auto Index Plan
                    ---------------------------- ----------------------------
Elapsed Time (s):   42619069                     241844
CPU Time (s):       25387841                     217676
Buffer Gets:        12148771                     18499
Optimizer Cost:     74817                        10021
Disk Reads:         6085380                      9957
Direct Writes:      0                            0
Rows Processed:     140000                       10000
Executions:         14                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 1883860831

--------------------------------------------------------------------------------
| Id | Operation         | Name           | Rows  | Bytes | Cost  | Time       |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |                |       |       | 74817 |            |
|  1 | TABLE ACCESS FULL | BOWIE_OVERLOAD | 10000 | 70000 | 74817 | 00:00:03   |
--------------------------------------------------------------------------------

- With Auto Indexes
-----------------------------
Plan Hash Value : 2541132923

---------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows  | Bytes | Cost  | Time       |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |  9281 | 64967 | 10021 | 00:00:01   |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_OVERLOAD       |  9281 | 64967 | 10021 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_aat8t6ad0ux0h | 10000 |       |    18 | 00:00:01   |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE1"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We see that an automatic index on just the CODE1 column was created.

 

SQL> select index_name, auto, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_OVERLOAD';

INDEX_NAME                AUT VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_aat8t6ad0ux0h      YES VISIBLE   ADVANCED LOW  VALID      10000000       15363          10000000

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BOWIE_OVERLOAD' order by index_name, column_position;

INDEX_NAME                COLUMN_NAME     COLUMN_POSITION
------------------------- --------------- ---------------
SYS_AI_aat8t6ad0ux0h      CODE1                         1

 

If we now re-run the query (noting in Oracle21c after you invalidate the current cursor):

 

SQL> select code1, code2 from bowie_overload where code1=42;

10000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2541132923

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 |  Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      | 10000 | 70000 |   10021 (1)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_OVERLOAD       | 10000 | 70000 |   10021 (1)| 00:00:01 |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_aat8t6ad0ux0h | 10000 |       |      18 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE1"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      10021 consistent gets
          0 physical reads
          0 redo size
      50890 bytes sent via SQL*Net to client
         63 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
      10000 rows processed

The query now uses the newly created automatic index.

BUT, at 10021 consistent gets, it’s still doing a substantial amount to work here.

If we manually create another index that overloads the only other column (CODE2) required in this query:

SQL> create index bowie_overload_code1_code2_i on bowie_overload(code1,code2) compress advanced low;

Index created.

I’m using COMPRESS ADVANCED LOW as used by the automatic index, noting that CODE2 only contains the value “42” for all rows, making it particularly perfect for compression and a “best case” scenario when it comes to the minimal overheads potentially associated with overloading this index (I’m trying yo give AI every chance here):

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_OVERLOAD';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_aat8t6ad0ux0h           YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       15363          10000000
BOWIE_OVERLOAD_CODE1_CODE2_I   NO  NO  VISIBLE   ADVANCED LOW  VALID      10000000       15363          10000000

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BOWIE_OVERLOAD' order by index_name, column_position;

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
BOWIE_OVERLOAD_CODE1_CODE2_I   CODE1                         1
BOWIE_OVERLOAD_CODE1_CODE2_I   CODE2                         2
SYS_AI_aat8t6ad0ux0h           CODE1                         1

In fact, my manually created index is effectively the same size as the automatic index, with the same number (15363) of leaf blocks.

So I’m giving AI the best possible scenario in which it could potentially create an overloaded index.

But I’ve never been able to get AI to create overloaded indexes. Only columns in filtering predicates are considered for inclusion in automatic indexes.

If I now re-run my query again:

SQL> select code1, code2 from bowie_overload where code1=42;

10000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1161047960

-------------------------------------------------------------------------------------------------
| Id  | Operation        | Name                         |  Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                              | 10000 | 70000 |      18 (0)| 00:00:01 |
| * 1 | INDEX RANGE SCAN | BOWIE_OVERLOAD_CODE1_CODE2_I | 10000 | 70000 |      18 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("CODE1"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         21 consistent gets
          0 physical reads
          0 redo size
      50890 bytes sent via SQL*Net to client
         63 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
      10000 rows processed

We notice the CBO now uses the manually created index without any table access path, as it can just use the index to access the necessary data.

The number of consistent gets as a result has reduced significantly, down to just 21, a fraction of the previous 10021 when the automatic index was used.

So the scenario an of overloaded index that could significantly reduce database resources, which is currently not supported by AI, is another example of where may want to manually create a necessary index.

As always, this may change in the future releases…

Automatic Indexes: Scenarios Where Automatic Indexes NOT Created Part II (“Ragazzo Solo, Ragazza Sola” April 27, 2022

Posted by Richard Foote in 19c, 21c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Constraints, Exadata, Foreign Keys, Full Table Scans, Index Internals, Oracle, Oracle 21c, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Performance Tuning.
1 comment so far

In my last post, I discussed how Automatic Indexing doesn’t create an automatic index in the scenario where the minimum or maximum of a column is required.

Another scenario when an automatic index is not created is when we hit issues associated with a missing index on a Foreign Key (FK) constraint.

As I’ve discussed many times previously, if you delete a parent record without an index on the dependant FK constraints, you hit a number of issues including having to perform a (potentially expensive and problematic) Full Table Scan (FTS) on the child tables and the associated locking problems.

To illustrate, I’ll first create a small parent table:

SQL> create table daddy (id number constraint daddy_pk primary key , name varchar2(42));

Table created.

SQL> insert into daddy select rownum, 'David Bowie '|| rownum from dual connect by level <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DADDY');

PL/SQL procedure successfully completed.

And then a somewhat larger child table, with no index on the associated foreign key constraint:

SQL> create table kiddy (id number constraint kiddy_pk primary key , code1 number constraint daddy_fk references daddy(id), code2 number, code3 number, name varchar2(42));

Table created.

SQL> insert into kiddy select rownum, mod(rownum,1000)+1000 , mod(rownum, 10000)+1, mod(rownum, 100000)+1, 'Ziggy Stardust '|| rownum from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'KIDDY');

PL/SQL procedure successfully completed.

 

If we delete a number of parent rows, for example:

SQL> delete from daddy where id = 101;

1 row deleted.

Execution Plan
----------------------------------------------------------
Plan hash value: 1477800718

-------------------------------------------------------------------------------
| Id | Operation         | Name     | Rows | Bytes | Cost (%CPU) |   Time     |
-------------------------------------------------------------------------------
|  0 | DELETE STATEMENT  |          |    1 |     4 |       1 (0) |   00:00:01 |
|  1 | DELETE            | DADDY    |      |       |             |            |
|* 2 | INDEX UNIQUE SCAN | DADDY_PK |    1 |     4 |       1 (0) |   00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID"=101)

Statistics
----------------------------------------------------------
         18 recursive calls
         13 db block gets
     117462 consistent gets
      22292 physical reads
    4645500 redo size
        204 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

We notice that even though we only delete one row from a relatively small table, we perform a large number of consistent gets (117462) due to the necessary FTS on the child table, as Oracle is forced to check the table for any possible FK violations. Without an index on the child CODE1 column, Oracle has no choice but to perform the relatively expensive FTS.

Additionally, if we have an existing transaction of a child table (in Session 1):

SQL> insert into kiddy values (10000001,1042,1042,1042,'Iggy Pop');

1 row created.

And then in another session attempt to delete a parent row (in Session 2):

SQL> delete from daddy where id = 112;

The delete hangs in a locked state due to the child transaction in Session 1. This can lead to further locking issues in other sessions (Session 3):

insert into kiddy values (10000002,1042,1042,1042,'Iggy Pop');

 

The FTS on the child table and these associated locks can all be avoided by having an index on the FK constraint, as the index can then be used to effectively police the constraint during such delete operations.

What does AI do in this scenario?

Currently, nothing.

I’ve been unable to ever get AI to create a usable automatic index in this scenario. In Oracle Database 19c, I’ve not been able to get an AI created at all. In Oracle Database 21c, the best I’ve seen has been a Unusable/Invisible AI:

SQL> select index_name, index_type, auto, constraint_index, visibility, status, num_rows from user_indexes where table_n
ame='KIDDY';

INDEX_NAME                     INDEX_TYPE                  AUT CON VISIBILIT STATUS     NUM_ROWS
------------------------------ --------------------------- --- --- --------- -------- ----------
KIDDY_PK                       NORMAL                      NO  YES VISIBLE   VALID      10000004
SYS_AI_31thttf8v6r35           NORMAL                      YES NO  INVISIBLE UNUSABLE   10000004

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='KIDDY';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
KIDDY_PK                       ID                            1
SYS_AI_31thttf8v6r35           CODE1                         1

So you may need to manually create such an index on the FK constraint to improve performance and eliminate these locking issues:

SQL> create index kiddy_code1_i on kiddy(code1);

Index created.

SQL> delete from daddy where id = 142;

1 row deleted.

Execution Plan
----------------------------------------------------------
Plan hash value: 1477800718

-------------------------------------------------------------------------------
| Id | Operation         | Name     | Rows | Bytes | Cost (%CPU) |   Time     |
-------------------------------------------------------------------------------
|  0 | DELETE STATEMENT  |          |    1 |     4 |       1 (0) |   00:00:01 |
|  1 | DELETE            | DADDY    |      |       |             |            |
|* 2 | INDEX UNIQUE SCAN | DADDY_PK |    1 |     4 |       1 (0) |   00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID"=142)

Statistics
----------------------------------------------------------
          1 recursive calls
          8 db block gets
          2 consistent gets
          2 physical reads
        132 redo size
        204 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

Consistent gets have dropped off massively (down to just 8) as Oracle can now use the index to avoid the FTS search on the child table. The associated locking issues are eliminated as well.

Note: As always, this AI behaviour can always change in the future…

Automatic Indexes: Scenarios Where Automatic Indexes NOT Created Part I (“Always Crashing In The Same Car”) April 26, 2022

Posted by Richard Foote in 19c, 21c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, MAX, MIN, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
1 comment so far

As I’ve discussed previously, Oracle has increased the number of scenarios in which it will now create automatic indexes, such as with non-equality predicates and JSON expressions.

However, as of Oracle Database 21c, there are still a number of scenarios where an automatic index will NOT be created, even though an index might prove beneficial.

One such scenario is when the Min/Max of a column is required.

As I’ve discussed a number of times previously, Oracle can very efficiently use an index to determine either the Min or Max value of a column, by (hopefully) just visiting the first or last leaf block in an index. The INDEX FULL SCAN (MIN/MAX) execution plan path can be used explicitly for this purpose.

If I create a simple table as follows:

SQL> create table bowie_min (id number constraint bowie_min_pk primary key, code number, name varchar2(42));

Table created.

SQL> insert into bowie_min select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_MIN');

PL/SQL procedure successfully completed.

 

And then run the following queries a number of times that return the Min and Max of the CODE column:

SQL> select min(code) from bowie_min;

Execution Plan
----------------------------------------------------------
Plan hash value: 1068446691

----------------------------------------------------------------------------------------
| Id | Operation                 | Name      | Rows | Bytes | Cost (%CPU) | Time       |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |           |    1 |     5 |    6706 (2) | 00:00:01   |
|  1 | SORT AGGREGATE            |           |    1 |     5 |             |            |
|  2 | TABLE ACCESS STORAGE FULL | BOWIE_MIN |  10M |   47M |    6706 (2) | 00:00:01   |
----------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39430 consistent gets
      39421 physical reads
          0 redo size
        569 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

SQL> select max(code) from bowie_min;

Execution Plan
----------------------------------------------------------
Plan hash value: 1068446691

----------------------------------------------------------------------------------------
| Id | Operation                 | Name      | Rows | Bytes | Cost (%CPU) | Time       |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |           |    1 |     5 |    6706 (2) | 00:00:01   |
|  1 | SORT AGGREGATE            |           |    1 |     5 |             |            |
|  2 | TABLE ACCESS STORAGE FULL | BOWIE_MIN |  10M |   47M |    6706 (2) | 00:00:01   |
----------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39430 consistent gets
      39421 physical reads
          0 redo size
        569 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

 

Currently, the CBO has no choice but to use a Full Table Scan (FTS) as there is currently no index on the CODE column.

So what does Automatic Indexing (AI) make of things?

Nothing.

Currently, AI will not create an index in this scenario, no matter how many times I execute these queries.

If we look at the indexes on the table after a significant period of time after running these queries:

SQL> select index_name, auto from user_indexes where table_name='BOWIE_MIN';

INDEX_NAME   AUT
------------ ---
BOWIE_MIN_PK NO

No Automatic Indexes. To improve the performance of these queries, we currently have to manually create the associated index:

SQL> create index bowie_min_code_i on bowie_min(code);

Index created.

If we now re-run these queries and look at the execution plan:

SQL> select min(code) from bowie_min;

Execution Plan
----------------------------------------------------------
Plan hash value: 252811132

-----------------------------------------------------------------------------------------------
| Id | Operation                 | Name             | Rows | Bytes | Cost (%CPU) | Time       |
-----------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |                  |    1 |     5 |       3 (0) | 00:00:01   |
|  1 | SORT AGGREGATE            |                  |    1 |     5 |             |            |
|  2 | INDEX FULL SCAN (MIN/MAX) | BOWIE_MIN_CODE_I |    1 |     5 |       3 (0) | 00:00:01   |
-----------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        569 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

We can see that the CBO is now indeed using the index to return the Min/Max values with a vastly reduced number of consistent gets (down to just 3 from the previous 38538).

However, a key point here is that Automatic Indexes only works on an Exadata platform and Exadata has various smarts that potentially makes accessing data via a “FTS” in this manner much more efficient than in non-Exadata environments.

Oracle may well take the position that getting Min/Max data on a Exadata is potentially efficient enough and doesn’t on its own warrant the creation of an index.

More on this in future posts…

Automatic Indexing: Deferred Invalidations (“The Post War Dream”) April 19, 2022

Posted by Richard Foote in 21c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Deferred Invalidation, Exadata, Function Based Indexes, Index Access Path, Index Internals, JSON, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle Indexes, Richard's Blog.
1 comment so far

In my previous post on how JSON expressions can now be automatically indexed, I mentioned there was an outstanding issue with the associated CBO execution plan, immediately post the creation of the automatic index:

SQL> select * from bowie_json where json_value(bowie_order, '$.PONumber')='42';

Execution Plan
----------------------------------------------------------
Plan hash value: 832017402

------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                 | Rows  | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                      | 20000 |   12M |    1524 (1) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_JSON           | 20000 |   12M |    1524 (1) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | SYS_AI_ayvj257jd93cv | 8000  |       |       3 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(JSON_VALUE("BOWIE_ORDER" /*+ LOB_BY_VALUE */ FORMAT OSON , '$.PONumber' RETURNING
           VARCHAR2(4000) ERROR ON ERROR NULL ON EMPTY)='42')

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     234168 consistent gets
     200279 physical reads
          0 redo size
       1595 bytes sent via SQL*Net to client
        526 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

If we look at the number of recursive calls, we notice that it remains at 0. If we look at both the number of consistent gets (234168) and physical reads (200279), they both remain very high and identical to that of the previous Full Table Scan plan.

Basically, although autotrace suggests the newly created automatic index is being used, in fact the previous Full Table Scan plan is still being invoked.  (Note: this of course is one of the dangers of the autotrace plan, in that it might not display the actual plan being invoked).

So what’s going on here?

The Oracle Database 21c New Features Guide makes the following point: “an enhancement has been introduced to reduce the overhead of cursor invalidations when a new automatic index is created”.

Oracle 12.2 introduced a new feature in which one can now defer the invalidation of dependent SQL cursors when an index is created or modified. I’ve of course discussed this previously in this 12.2 Index Deferred Invalidation post.

When an automatic index is created in 21c, the current SQL cursors are NOT invalidated (to reduce the overhead of having to potentially reparse of large number of current SQL cursors). However, this means that currently inefficient SQL statements will keep their existing sub-optimal execution plans post the creation of newly created automatic indexes, until the existing SQL cursors aged out.

At which point, the new CBO plan using the automatic index will actually be invoked:

SQL> select * from bowie_json where json_value(bowie_order, '$.PONumber')='42';

Execution Plan
----------------------------------------------------------
Plan hash value: 832017402

------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                 | Rows | Bytes | Cost (%CPU) |  Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                      |    1 |   671 |       4 (0) |  00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_JSON           |    1 |   671 |       4 (0) |  00:00:01 |
|* 2 | INDEX RANGE SCAN                    | SYS_AI_ayvj257jd93cv |    1 |       |       3 (0) |  00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(JSON_VALUE("BOWIE_ORDER" /*+ LOB_BY_VALUE */ FORMAT OSON , '$.PONumber' RETURNING
           VARCHAR2(4000) ERROR ON ERROR NULL ON EMPTY)='42')

Statistics
----------------------------------------------------------
          30 recursive calls
           0 db block gets
          46 consistent gets
          11 physical reads
           0 redo size
        1595 bytes sent via SQL*Net to client
         526 bytes received via SQL*Net from client
           3 SQL*Net roundtrips to/from client
           0 sorts (memory)
           0 sorts (disk)
           1 rows processed

So just be aware in Oracle Database 21c that your beautifully created automatic indexes may not actually get used as desired for a period of time…

Automatic Indexing: JSON Expressions Part I (Making Plans For Nigel) April 13, 2022

Posted by Richard Foote in Automatic Indexing, Autonomous Database, CBO, Exadata, Function Based Indexes, Index statistics, JSON, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Virtual Columns.
1 comment so far

When Automatic Indexing was first released, one of the restrictions was that automatic indexes on JSON expressions were NOT supported.

However, the Oracle Database 21c doco mentions:

Automatic indexes can be single or multi-column. They are considered for the following: Selected expressions (for example, JSON expressions)“.

So on my (admittedly dodgy) “Exadata” VM, I thought I’ll check out how AI now indeed deals with JSON expressions.

I start by creating a simple little table that uses the new 21c JSON datatype and populate it with some JSON documents (note the PONumber key has effectively unique numeric values assigned):

SQL> CREATE TABLE bowie_json
       (id number,
        bowie_date date,
        bowie_order JSON);

SQL> insert into bowie_json
     select
     rownum,
     sysdate,
     '{"PONumber" : ' || rownum || ',
       "Reference" : "2022' || rownum || 'DBOWIE",
       "Requestor" : "David Bowie",
       "User" : "DBOWIE",
       "CostCenter" : "A42",
       "ShippingInstructions" : {"name" : "David Bowie",
                                 "Address": {"street" : "42 Ziggy Street",
                                             "city" : "Canberra",
                                              "state" : "ACT",
                                              "zipCode" : 2601,
                                              "country" : "Australia"},
                                 "Phone" : [{"type" : "Office", "number" : "417-555-7777"},
                                            {"type" : "Mobile", "number" : "417-555-1234"}]},
       "Special Instructions" : null,
       "AllowPartialShipment" : true,
       "LineItems" : [{"ItemNumber" : 1,
                       "Part" : {"Description" : "Hunky Dory",
                                 "UnitPrice" : 10.95},
                                  "Quantity" : 5.0},
                      {"ItemNumber" : 2,
                       "Part" : {"Description" : "Pin-Ups",
                                 "UnitPrice" : 10.95},
                                 "Quantity" : 3.0}]}'
from dual connect by level <= 2000000;

2000000 rows created.

SQL> commit;

Commit complete

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_JSON');

PL/SQL procedure successfully completed.

As always, it’s important to ensure the table has statistics, as AI does not work properly without them.

I then run a number of SQL statements, with different JSON expression based predicates, including:

SQL> select * from bowie_json where json_value(bowie_order, '$.PONumber')='42';

SQL> select * from bowie_json z where z.bowie_order.PONumber.number()=4242;

SQL> select * from bowie_json where json_value(bowie_order, '$.PONumber' returning number)=42;

Execution Plan
----------------------------------------------------------
Plan hash value: 1196930810

--------------------------------------------------------------------------------
| Id | Operation         | Name       | Rows  | Bytes | Cost (%CPU)| Time      |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |            | 20000 |   12M |  34476 (1) | 00:00:02  |
|* 1 | TABLE ACCESS FULL | BOWIE_JSON | 20000 |   12M |  34476 (1) | 00:00:02  |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(JSON_VALUE("BOWIE_ORDER" /*+ LOB_BY_VALUE */ FORMAT OSON
           , '$.PONumber' RETURNING NUMBER NULL ON ERROR)=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     259127 consistent gets
     200279 physical reads
          0 redo size
       1595 bytes sent via SQL*Net to client
        526 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

They all return just the one row, but must currently use a Full Table Scan with no indexes present.

So what does AI make of things?

The first thing to note is that running the AI last activity report generates the following error:

SQL> select dbms_auto_index.report_last_activity() report from dual;
ERROR:
ORA-30954: char 0 is invalid in json_value(BOWIE_ORDER, '$.PONumber' returning VA
ORA-06512: at "SYS.DBMS_AUTO_INDEX", line 177
ORA-06512: at "SYS.DBMS_AUTO_INDEX", line 107
ORA-06512: at "SYS.DBMS_AUTO_INDEX_INTERNAL", line 8676
ORA-06512: at "SYS.DBMS_AUTO_INDEX_INTERNAL", line 8676
ORA-06512: at "SYS.DBMS_AUTO_INDEX_INTERNAL", line 9226
ORA-06512: at "SYS.DBMS_AUTO_INDEX", line 89
ORA-06512: at "SYS.DBMS_AUTO_INDEX", line 167
ORA-06512: at line 1

no rows selected

If we look at the indexes now present with the table:

SQL> select index_name, index_type, auto, visibility, status, num_rows, leaf_blocks, clustering_factor from user_indexes
where table_name='BOWIE_JSON';

INDEX_NAME                INDEX_TYPE                AUT VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------- ------------------------- --- --------- -------- ---------- ----------- -----------------
SYS_IL0000081096C00003$$  LOB                       NO  VISIBLE   VALID
SYS_AI_ayvj257jd93cv      FUNCTION-BASED NORMAL     YES VISIBLE   VALID       2000000        5141            380000
SYS_AI_gpdkwzugdn055      FUNCTION-BASED NORMAL     YES VISIBLE   VALID       2000000        4596            200000

SQL> select index_name, column_expression from user_ind_expressions where table_name='BOWIE_JSON';

INDEX_NAME                COLUMN_EXPRESSION
------------------------- --------------------------------------------------------------------------------
SYS_AI_ayvj257jd93cv      JSON_VALUE("BOWIE_ORDER" FORMAT OSON , '$.PONumber' RETURNING VARCHAR2(4000) ERR
OR ON ERROR NULL ON EMPTY)

SYS_AI_gpdkwzugdn055      JSON_VALUE("BOWIE_ORDER" FORMAT OSON , '$.PONumber' RETURNING NUMBER ERROR ON ER
ROR NULL ON EMPTY)

We can see that AI has indeed created two new automatic indexes, one on the VARCHAR2 JSON expression and one on the NUMBER JSON expression.

If we re-run the SQLs, we notice 3 very important points. Note the following example was run soon after the automatic indexes were created:

SQL> select * from bowie_json where json_value(bowie_order, '$.PONumber')='42';

Execution Plan
----------------------------------------------------------
Plan hash value: 832017402

------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                 | Rows  | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                      | 20000 |   12M |    1524 (1) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_JSON           | 20000 |   12M |    1524 (1) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | SYS_AI_ayvj257jd93cv |  8000 |       |       3 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(JSON_VALUE("BOWIE_ORDER" /*+ LOB_BY_VALUE */ FORMAT OSON , '$.PONumber' RETURNING
           VARCHAR2(4000) ERROR ON ERROR NULL ON EMPTY)='42')

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     234168 consistent gets
     200279 physical reads
          0 redo size
       1595 bytes sent via SQL*Net to client
        526 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

The first point to note is that the CBO now chooses to use the newly created automatic index. As only one row is return, this is as one would hope.

But there are two other very important points/issues worth making about the above execution plan and associated costs and statistics. One is associated with new AI behaviour introduced in 21c and the other is associated with an old trap in relation to function-based indexes.

I’ll leave it to the discernible reader to spot these issues, before I cover them in Part II in the coming days…

Oracle 19c Automatic Indexing: Invisible/Valid Automatic Indexes (Bowie Rare) August 31, 2021

Posted by Richard Foote in 19c, 19c New Features, Attribute Clustering, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Exadata, Index Access Path, Index statistics, Invisible Indexes, Invisible/Valid Indexes, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle Indexes, Oracle Statistics, Oracle19c, Unusable Indexes.
1 comment so far

In my previous post, I discussed how newly created Automatic Indexes can have one of three statuses, depending the selectivity and effectiveness of the associated Automatic Index.

Indexes that improve performance sufficiently are created as Visible/Valid indexes and can be subsequently considered by the CBO. Indexes that are woeful and have no chance of improving performance are created as Invisible/Unusable indexes. Indexes considered potentially suitable but ultimately don’t sufficiently improve performance, are created as Invisible/Valid indexes.

Automatic Indexes are created as Visible/Valid indexes when shown to improve performance (by the _AUTO_INDEX_IMPROVEMENT_THRESHOLD parameter). But as I rarely came across Invisible/Valid Automatic Indexes (except for when Automatic Indexing is set to “Report Only” mode), I was curious to determine approximately at what point were such indexes created by the Automatic Indexing process.

To investigate things, I created a table with columns that contain data with various levels of selectivity, some of which should fall inside and outside the range of viability of any associated index, based on the cost of the associated Full Table Scan.

The following table has 32 columns of interest, each with a slight variation of distinct values giving small differences in overall column selectivity:

SQL> create table bowie_stuff1 (id number, code1 number, code2 number, code3 number, code4 number, code5 number, code6 number, code7 number, code8 number, code9 number, code10 number, code11 number, code12 number, code13 number, code14 number, code15 number, code16 number, code17 number, code18 number, code19 number, code20 number, code21 number, code22 number, code23 number, code24 number, code25 number, code26 number, code27 number, code28 number, code29 number, code30 number, code31 number, code32 number, name varchar2(42));

Table created.

SQL> insert into bowie_stuff1 
select rownum, 
       mod(rownum, 900)+1, 
       mod(rownum, 1000)+1, 
       mod(rownum, 1100)+1, 
       mod(rownum, 1200)+1, 
       mod(rownum, 1300)+1, 
       mod(rownum, 1400)+1, 
       mod(rownum, 1500)+1, 
       mod(rownum, 1600)+1, 
       mod(rownum, 1700)+1, 
       mod(rownum, 1800)+1, 
       mod(rownum, 1900)+1, 
       mod(rownum, 2000)+1, 
       mod(rownum, 2100)+1, 
       mod(rownum, 2200)+1, 
       mod(rownum, 2300)+1, 
       mod(rownum, 2400)+1, 
       mod(rownum, 2500)+1, 
       mod(rownum, 2600)+1, 
       mod(rownum, 2700)+1, 
       mod(rownum, 2800)+1, 
       mod(rownum, 2900)+1, 
       mod(rownum, 3000)+1, 
       mod(rownum, 3100)+1, 
       mod(rownum, 3200)+1, 
       mod(rownum, 3300)+1, 
       mod(rownum, 3400)+1, 
       mod(rownum, 3500)+1, 
       mod(rownum, 3600)+1, 
       mod(rownum, 3700)+1, 
       mod(rownum, 3800)+1, 
       mod(rownum, 3900)+1, 
       mod(rownum, 4000)+1,
       'THE RISE AND FALL OF ZIGGY STARDUST' 
from dual connect by level >=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

As always, it’s important that statistics be collected for Automatic Indexing to function properly:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_STUFF1', estimate_percent=>null);

PL/SQL procedure successfully completed.

 

So on a 10M row table, I have 32 columns with the number of distinct values varying by only 100 values per column (or by a selectivity of just 0.001%):

SQL> select column_name, num_distinct, density, histogram from dba_tab_columns where table_name='BOWIE_STUFF1' order by num_distinct;

COLUMN_NAME  NUM_DISTINCT    DENSITY HISTOGRAM
------------ ------------ ---------- ---------------
NAME                    1  .00000005 FREQUENCY
CODE1                 900    .001111 HYBRID
CODE2                1000       .001 HYBRID
CODE3                1100    .000909 HYBRID
CODE4                1200    .000833 HYBRID
CODE5                1300    .000769 HYBRID
CODE6                1400    .000714 HYBRID
CODE7                1500    .000667 HYBRID
CODE8                1600    .000625 HYBRID
CODE9                1700    .000588 HYBRID
CODE10               1800    .000556 HYBRID
CODE11               1900    .000526 HYBRID
CODE12               2000      .0005 HYBRID
CODE13               2100    .000476 HYBRID
CODE14               2200    .000455 HYBRID
CODE15               2300    .000435 HYBRID
CODE16               2400    .000417 HYBRID
CODE17               2500      .0004 HYBRID
CODE18               2600    .000385 HYBRID
CODE19               2700     .00037 HYBRID
CODE20               2800    .000357 HYBRID
CODE21               2900    .000345 HYBRID
CODE22               3000    .000333 HYBRID
CODE23               3100    .000323 HYBRID
CODE24               3200    .000312 HYBRID
CODE25               3300    .000303 HYBRID
CODE26               3400    .000294 HYBRID
CODE27               3500    .000286 HYBRID
CODE28               3600    .000278 HYBRID
CODE29               3700     .00027 HYBRID
CODE30               3800    .000263 HYBRID
CODE31               3900    .000256 HYBRID
CODE32               4000     .00025 HYBRID
ID               10000000          0 HYBRID

I’ll next run the below queries (based on a simple equality predicate on each column) several times each in batches of 8 queries, so as to not swamp the Automatic Indexing process with potential new index requests (the ramifications of which I’ll discuss in another future post):

SQL> select * from bowie_stuff1 where code1=42;
SQL> select * from bowie_stuff1 where code2=42;
SQL> select * from bowie_stuff1 where code3=42;
SQL> select * from bowie_stuff1 where code4=42;
SQL> select * from bowie_stuff1 where code5=42;
...
SQL> select * from bowie_stuff1 where code31=42;
SQL> select * from bowie_stuff1 where code32=42;

 

If we now look at the statuses of the Automatic Indexes subsequently created:

SQL> select i.index_name, c.column_name, i.auto, i.constraint_index, i.visibility, i.status, i.num_rows, i.leaf_blocks, i.clustering_factor
from user_indexes i, user_ind_columns c
where i.index_name=c.index_name and i.table_name='BOWIE_STUFF1' order by visibility, status;

INDEX_NAME             COLUMN_NAME  AUT CON VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------- ------------ --- --- --------- -------- ---------- ----------- -----------------
SYS_AI_5rw9j3d8pc422   CODE5        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_48q3j752csn1p   CODE4        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_9sgharttf3yr7   CODE3        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_8n92acdfbuh65   CODE2        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_brgtfgngu3cj9   CODE1        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_1tu5u4012mkzu   CODE11       YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_34b6zwgtm86rr   CODE12       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_gd0ccvdwwb4mk   CODE13       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_7k7wh28n3nczy   CODE14       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_67k2zjp09w101   CODE15       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_5fa6k6fm0k6wg   CODE10       YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_4624ju6bxsv57   CODE9        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_bstrdkkxqtj4f   CODE8        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_39xqjjar239zq   CODE7        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_6h0adp60faytk   CODE6        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_5u0bqdgcx52vh   CODE16       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_0hzmhsraqkcgr   CODE22       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_4x716k4mdn040   CODE21       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_6wsuwr7p6drsu   CODE20       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_b424tdjx82rwy   CODE19       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_3a2y07fqkzv8x   CODE18       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_8dp0b3z0vxzyg   CODE17       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_d95hnqayd7t08   CODE23       YES NO  VISIBLE   VALID      10000000       15366          10000000
SYS_AI_fry4zrxqtpyzg   CODE24       YES NO  VISIBLE   VALID      10000000       15366          10000000
SYS_AI_920asb69q1r0m   CODE25       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_026pa8880hnm2   CODE31       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_96xhzrguz2qpy   CODE32       YES NO  VISIBLE   VALID      10000000       15368          10000000
SYS_AI_3dq93cc7uxruu   CODE29       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_5nbz41xny8fvc   CODE28       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_fz4q9bhydu2qt   CODE27       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_0kwczzg3k3pfw   CODE26       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_4qd5tsab7fnwx   CODE30       YES NO  VISIBLE   VALID      10000000       15367          10000000

We can see we indeed have the 3 statuses of Automatic Indexes captured:

Columns with a selectivity equal or worse to that of COL5 with 1300 distinct values are created as Invisible/Unusable indexes. Returning 10M/1300 rows or a cardinality of approx. 7,693 or more rows is just too expensive for such indexes on this table to be viable. This represents a selectivity of approx. 0.077%.

Note how the index statistics for these Invisible/Unusable indexes are not accurate. They all have an estimated LEAF_BLOCKS of 21702 and a CLUSTERING_FACTOR of 4272987. However, we can see from the other indexes which are physically created that these are not correct and are substantially off the mark with the actual LEAF_BLOCKS being around 15364 and the CLUSTERING_FACTOR actually much worse at around 10000000.

Again worthy of a future post to discuss how Automatic Indexing processing has to make (potentially inaccurate) guesstimates for these statistics in its analysis of index viability when such indexes don’t yet physically exist.

Columns with a selectivity equal or better to that of COL23 which has 3100 distinct values are created as Visible/Valid indexes. Returning 10M/3100 rows or a cardinality of approx. 3226 or less rows is cheap enough for such indexes on this table to be viable. This represents a selectivity of approx. 0.032%.

So in this specific example, only those columns between 1400 and 3000 distinct values meet the “borderline” criteria in which the Automatic Indexing process creates Invisible/Valid indexes. This represents a very very narrow selectivity range of only approx. 0.045% in which such Invisible/Valid indexes are created. Or for this specific example, only those columns that return approx. between 3,333 and 7,143 rows from the 10M row table.

Now the actual numbers and total range of selectivities for which Invisible/Valid Automatic Indexes are created of course depends on all sorts of factors, such as the size/cost of FTS of the table and not least the clustering of the associated data (which I’ve blogged about ad nauseam).

The point I want to make is that the range of viability for such Invisible/Valid indexes is relatively narrow and the occurrences of such indexes relatively rare in your databases. As such, the vast majority of Automatic Indexes are likely to be either Visible/Valid or Invisible/Unusable indexes.

It’s important to recognised this when you encounter such Invisible/Valid Automatic Indexes (outside of “REPORT ONLY” implementations), as it’s an indication that such an index is a borderline case that is currently NOT considered by the CBO (because of it being Invisible).

However, this Invisible/Valid Automatic Index status should really change to either of the other two more common statuses in the near future.

I’ll expand on this point in a future post…

METHOD_OPT Default In Oracle Autonomous Databases (She’ll Drive The Big Car) March 2, 2021

Posted by Richard Foote in 19c, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, Histograms, METHOD_OPT, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Statistics.
1 comment so far

 

In a recent post on Invisible Automatic Indexes, I was puzzled by a couple of “oddities” in relation to some behaviour in the Oracle Autonomous Database Cloud environments.

The first one was how Oracle appeared to be creating Histograms on a much more regular basis than it had previously.

As one can see in the demo below, if I create and populate a table:

SQL> create table pink_floyd (id number, code number, create_date date, name varchar2(42));

Table created.

SQL> insert into pink_floyd select rownum, ceil(dbms_random.value(0, 5000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon' from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

And then collect statistics using the “default” options:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD');

PL/SQL procedure successfully completed.

All the columns in the table now have histograms, regardless of whether they’ve been used in SQL predicates or if they have data skew:

SQL> select column_name, num_distinct, density, histogram from dba_tab_columns where table_name='PINK_FLOYD';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- ------------ ---------- ---------------
ID                        9705425          0 HYBRID
CODE                         4835     .00005 HYBRID
CREATE_DATE                 50357     .00002 HYBRID
NAME                            1 4.9639E-08 FREQUENCY

 

The explanation for this is embarrassingly simple. A quick check on the default settings for METHOD_OPT shows the following:

SQL> select dbms_stats.get_prefs('METHOD_OPT') from dual;

DBMS_STATS.GET_PREFS('METHOD_OPT')
--------------------------------------------------------------------------------
FOR ALL COLUMNS SIZE 254

 

The default is FOR ALL COLUMNS 254, meaning that we will now indeed have histograms collected on all columns. With new capabilities such as High Frequency Statistics Collection, it’s interesting that Oracle has taken this approach but Oracle has obviously taken the attitude that with Exadata as the hosted infrastructure, it can afford to simply collect histograms globally on all columns in the Autonomous Database environments.

If you wanted to change this, you can do so by for example:

SQL> exec DBMS_STATS.SET_GLOBAL_PREFS ('METHOD_OPT', 'FOR ALL COLUMNS SIZE AUTO');

PL/SQL procedure successfully completed.

SQL> select dbms_stats.get_prefs('METHOD_OPT') from dual;

DBMS_STATS.GET_PREFS('METHOD_OPT')
--------------------------------------------------------------------------------
FOR ALL COLUMNS SIZE AUTO

 

So not an “oddity”, but expected behaviour now on Oracle Autonomous Databases.

The other “oddity” I noticed were Invisible Valid Automatic indexes at times being created. The explanation for this will be the topic of my next blog post…

Oracle 19c Automatic Indexing: Function-Based Indexes? Part II (If You Can See Me) February 5, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Function Based Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Virtual Columns.
1 comment so far

In my previous post, I discussed how Automatic Indexing does not currently support creating an index based on a function or expression predicate, even if it’s an equality predicate. You must manually create the associated function-based index.

However, if you have access to the application, there’s a better strategy when frequently searching on a function-based predicate. That’s to create a Virtual Column and use this column in your searching criteria (as mentioned by Connor McDonald in this comment).

To illustrate, I’m going to drop the previously manually created function-based index and hence the associated hidden virtual column, as Oracle quite rightly doesn’t allow you to have two virtual columns based on the same expression in the same table.

SQL> drop index david_upper_name_i;

Index dropped.

Since Oracle 11g, Oracle has supported the use of Visible Virtual Columns, a column that doesn’t physically exist, but defines a function/expression that can be easily accessed and populated when queried.

I’ll next create a Virtual Column called UPPER_NAME that is defined not based on a Data Type, but on the result on the UPPER function on the previously defined NAME column:

SQL> alter table david add (upper_name as (upper(name)));

Table altered.

Regardless of size of table, this column is added virtually instantly (pun fully intended), as no data is physically stored in the table itself. I view it (yep, another pun) as a “mini-view”, that can be used to hide complexity from the developer, with the actual data derived at run-time when the column is accessed in an SQL.

After I generate fresh statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID', estimate_percent=>null);

PL/SQL procedure successfully completed.

SQL> select column_name, hidden_column, virtual_column, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID';

COLUMN_NAME          HID VIR NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- --- --- ------------ ---------- ---------------
NAME                 NO  NO      10000000          0 HYBRID
MORE_STUFF9          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF8          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF7          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF6          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF5          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF4          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF3          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF2          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF10         NO  NO             1  .00000005 FREQUENCY
MORE_STUFF1          NO  NO             1  .00000005 FREQUENCY
ID                   NO  NO      10000000          0 HYBRID
CODE                 NO  NO         10000      .0001 HYBRID
UPPER_NAME           NO YES      10000000          0 HYBRID

Note how the UPPER_NAME virtual column is NOT hidden and now has up to date statistics.

We can now run this simplified query based on the new UPPER_NAME column, which does not need to include the potentially complex function expression:

SQL> select * from david where upper_name='DAVID BOWIE 42';

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2426813604

-----------------------------------------------------------------------------------
| Id | Operation                 | Name  | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |       |    1 |   200 |    3349 (6) | 00:00:01  | 
|* 1 | TABLE ACCESS STORAGE FULL | DAVID |    1 |   200 |    3349 (6) | 00:00:01  |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("UPPER_NAME"='DAVID BOWIE 42')
    filter("UPPER_NAME"='DAVID BOWIE 42')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     263469 consistent gets
     263452 physical reads
          0 redo size
       1328 bytes sent via SQL*Net to client
        375 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

If we look at portions of the subsequent Automatic Indexing report:

 

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 360.71 MB (360.71 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 2
SQL statements improved (improvement factor) : 2 (263476.8x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 263476.8x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
---------------------------------------------------------------------------
| Owner | Table | Index                | Key        | Type   | Properties |
---------------------------------------------------------------------------
| BOWIE | DAVID | SYS_AI_4k4mkgkw049ht | UPPER_NAME | B-TREE | NONE       |
---------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 7tfqh3pu526mt
SQL Text            : select * from david where upper_name='DAVID BOWIE 42'
Improvement Factor  : 263484.7x

Execution Statistics:
-----------------------------
                        Original Plan                Auto Index Plan
                        ---------------------------- ----------------------------
Elapsed Time (s):       1471249                      1414
CPU Time (s):           300584                       986
Buffer Gets:            3161816                      4
Optimizer Cost:         3349                         4
Disk Reads:             3161432                      3
Direct Writes:          0                            0
Rows Processed:         12                           1
Executions:             12                           1

PLANS SECTION
--------------------------------------------------------------------------------
- Original
-----------------------------
Plan Hash Value : 2426813604

-----------------------------------------------------------------------------
| Id | Operation                 | Name  | Rows | Bytes | Cost | Time       |
-----------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |       |      |       | 3349 |            |
|  1 | TABLE ACCESS STORAGE FULL | DAVID |    1 |   200 | 3349 | 00:00:01   |
-----------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no
- cardinality_feedback = yes

- With Auto Indexes
-----------------------------
Plan Hash Value : 1447691372

-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows | Bytes | Cost | Time      |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |    1 |   200 |    4 | 00:00:01  |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED | DAVID                |    1 |   200 |    4 | 00:00:01  |
| * 2 |  INDEX RANGE SCAN                    | SYS_AI_4k4mkgkw049ht |    1 |       |    3 | 00:00:01  |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("UPPER_NAME"='DAVID BOWIE 42')

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We see from the report that Automatic Indexing has now created the associated, implicitly created function-based index (SYS_AI_4k4mkgkw049ht) based on the virtual UPPER_NAME column:

SQL> select index_name, index_type, auto, constraint_index, visibility, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='DAVID';

INDEX_NAME           INDEX_TYPE                  AUT CON VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- --------------------------- --- --- --------- -------- ---------- ----------- -----------------
SYS_AI_4k4mkgkw049ht FUNCTION-BASED NORMAL       YES NO  VISIBLE   VALID      10000000       43104           2136839

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='DAVID' order by index_name, column_position;

INDEX_NAME           COLUMN_NAME          COLUMN_POSITION
-------------------- -------------------- ---------------
SYS_AI_4k4mkgkw049ht UPPER_NAME                         1

 

If we now re-run the SQL query:

SQL> select * from david where upper_name='DAVID BOWIE 4242';

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1447691372

------------------------------------------------------------------------------------------------------------
| Id | Operation                            | Name                 | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                     |                      |    1 |   200 |       4 (0) | 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID BATCHED | DAVID                |    1 |   200 |       4 (0) | 00:00:01 |
|* 2 |   INDEX RANGE SCAN                   | SYS_AI_4k4mkgkw049ht |    1 |       |       3 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("UPPER_NAME"='DAVID BOWIE 4242')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          5 consistent gets
          0 physical reads
          0 redo size
       1334 bytes sent via SQL*Net to client
        377 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

The CBO now uses the new Automatic Index to significantly improve the performance of the query.

So not only is using a user defined Virtual Column a cleaner solution with respect to the frequent use of a function-based expressions, but has the added advantage of being supported with Automatic Indexing.

Oracle 19c Automatic Indexing: Non-Equality Predicates Part II (Let’s Spend The Night Together) January 21, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Non-Equality Predicates, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Performance Tuning.
1 comment so far

In my previous post in this series, I discussed out Automatic Indexing currently does not consider Non-Equality predicates. Automatic Indexing will index columns based only on Equality predicates.

So how does Oracle handle the scenario when an SQL has a mixture of both Equality and Non-Equality predicates?

I’ll begin by creating two very similar tables, but with the second table having a more selective CODE column:

SQL> create table pink_floyd (id number, code number, create_date date, name varchar2(42));

Table created.

SQL> insert into pink_floyd select rownum, ceil(dbms_random.value(0, 5000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon'
from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD');

PL/SQL procedure successfully completed.


SQL> create table pink_floyd1 (id number, code number, create_date date, name varchar2(42));

Table created.

SQL> insert into pink_floyd1 select rownum, ceil(dbms_random.value(0, 25000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon'
from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD1');

PL/SQL procedure successfully completed.

 

So table PINK_FLOYD has 5,000 distinct CODE values, whereas table PINK_FLOYD1 has 25,000 distinct CODE values.

I’ll next run the following identical SQLs, which both use an Equality predicate on the CODE column and a Non-Equality predicate on the CREATE_DATE column. The CODE column provides some filtering (more so with the PINK_FLOYD1 table) but in combination with the CREATE_DATE column, results in the ultimate filtering with no rows returned:

 

SQL> select * from pink_floyd where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1152280033

----------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            |    1 |    40 |    844 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD |    1 |    40 |    844 (11) | 00:00:01  |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)
     filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      63660 consistent gets
      63649 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        380 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed


SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 564520720

-----------------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             |    1 |    41 |    856 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD1 |    1 |    41 |    856 (11) | 00:00:01  |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00','syyyy-mm-dd hh24:mi:ss'))
     filter("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00','syyyy-mm-dd hh24:mi:ss'))

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      64424 consistent gets
      64413 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

So how does Automatic Indexing handle this scenario. If we look at the subsequent Automatic Indexing report (highlights only):

 

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
-----------------------------------------------------------------------------
| Owner | Table       | Index                | Key  | Type   | Properties   |
-----------------------------------------------------------------------------
| BOWIE | PINK_FLOYD1 | SYS_AI_96snkmu4sk44g | CODE | B-TREE | NONE         |
-----------------------------------------------------------------------------
-------------------------------------------------------------------------------


-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 7wag3gbk0b3tm
SQL Text            : select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021'
Improvement Factor  : 64442.3x

Execution Statistics:
-----------------------------
                      Original Plan                Auto Index Plan
                      ---------------------------- ----------------------------
Elapsed Time (s):     568513                       2771
CPU Time (s):         275534                       1874
Buffer Gets:          1031078                      406
Optimizer Cost:       856                          405
Disk Reads:           1030609                      3
Direct Writes:        0                            0
Rows Processed:       0                            0
Executions:           16                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 564520720

-----------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost | Time       |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             |      |       |  856 |            |
|  1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD1 |    1 |    41 |  856 | 00:00:01   |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 2703636439

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |    1 |    41 |  405 | 00:00:01   |
| * 1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1          |    1 |    41 |  405 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_96snkmu4sk44g |  403 |       |    3 | 00:00:01   |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
* 2 - access("CODE"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

If we look at the definitions of all indexes currently on these tables:

SQL> select index_name, auto, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='PINK_FLOYD';

INDEX_NAME                     AUT VISIBILIT COMPRESSION   STATUS   NUM_ROWS   LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_dp2t0j12zux49           YES INVISIBLE ADVANCED LOW  UNUSABLE   10000000       21702           4161898

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='PINK_FLOYD';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_dp2t0j12zux49           CODE                          1


SQL> select index_name, auto, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='PINK_FLOYD1';

INDEX_NAME                     AUT VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_96snkmu4sk44g           YES VISIBLE   ADVANCED LOW  VALID      10000000       15400           9969473

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='PINK_FLOYD1';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_96snkmu4sk44g           CODE                          1

 

In both cases, Automatic Indexing only created an index on the CODE column, as it was the only column with an Equality predicate.

However, the Automatic Index on the table PINK_FLOYD remained in an INVISIBLE/UNUSABLE. That’s because an index on only the CODE column was not efficient enough to improve the performance of the SQL, due to the filtering not being sufficient enough and because of the relatively poor Clustering Factor.

The index on the table PINK_FLOYD1 was eventually created as a VISIBLE/VALID index, as its better filtering was sufficient to actually improve the performance of the SQL.

So if we re-run the first query:

SQL> select * from pink_floyd where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1152280033

----------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            |    1 |    40 |    844 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD |    1 |    40 |    844 (11) | 00:00:01  |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)
     filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      63660 consistent gets
      63649 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        380 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

It continues to use a Full Table Scan.

If we re-run the second query:

 

SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 2703636439

------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                 | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                      |    1 |    41 |     415 (0) | 00:00:01  |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1          |    1 |    41 |     415 (0) | 00:00:01  |
|* 2 | INDEX RANGE SCAN                    | SYS_AI_96snkmu4sk44g |  412 |       |       3 (0) | 00:00:01  |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
2 - access("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
        406 consistent gets
          0 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

If now uses the newly created Automatic Index, with an improved 406 Consistent Gets (down from the previous 64424 Consistent Gets with the FTS).

BUT if we were to manually create an index on BOTH CODE and CREATE_DATE columns:

SQL> create index pink_floyd1_code_create_date_i on pink_floyd1(code, create_date) compress advanced low;

Index created.

SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3366491378

----------------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                           | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                                |    1 |    41 |       4 (0) | 00:00:01  |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1                    |    1 |    41 |       4 (0) | 00:00:01  |
|* 2 | INDEX RANGE SCAN                    | PINK_FLOYD1_CODE_CREATE_DATE_I |    1 |       |       3 (0) | 00:00:01  |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
"CREATE_DATE" IS NOT NULL)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

Performance improves significantly further, by reducing Consistent Gets down to just 3.

So if you have SQL statements with a mixture of both Equality and Non-Equality predicates, you may encounter these 2 scenarios:

A potentially efficient index that is not created at all as the filtering on just the Equality based predicates are not sufficient to create a viable index, or

A potentially suboptimal Automatic Index that doesn’t contain useful filtering columns because they’re used in Non-Equality predicates…

Announcement: Both Of My Oracle Webinars Scheduled For February 2021 !! January 19, 2021

Posted by Richard Foote in Oracle, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Indexing Internals Webinar, Oracle Performance Diagnostics and Tuning Webinar, Richard Foote Seminars.
add a comment

I’m please to announce that both of my highly acclaimed webinars scheduled for February are now confirmed.

These webinars are a must for any Oracle DBA or Developer interested in designing, developing or maintaining high performance, highly scalable Oracle-based applications or databases.

However only a few places are currently available on each webinar with numbers very strictly limited, as I only run small classes to give every attendee the opportunity to get the most from the training experience.

Webinar details are as follows:

 

8-12 February 2021 (5pm-9pm AEDT) РOracle Indexing Internals and Best Practices Webinar  (International Customers Only)

This seminar examines most available Oracle index structures/options and discusses in considerable detail how indexes function, how/when they should be used and how they should be maintained. It covers many useful tips and strategies to maximise the benefits of indexes on application/database performance and scalability, as well as in maximising Oracle database investments. Much of the material is exclusive to this seminar and is not generally available in Oracle documentation or in Oracle University courses.

 

23-26 February 2021 (5pm-9pm AEDT) – Oracle Performance Diagnostics and Tuning Webinar
(International Customers Only)

The seminar will detail how to maximise the performance of both Oracle databases and associated applications and how to diagnose and address any performance issues as quickly and effectively as possible. It also provides opportunity to have your own AWR reports analysed to identify performance issues.

 

You can also purchase tickets to both webinars at a special combo discount:

Both ‚ÄúOracle Indexing Internals and Best Practices‚ÄĚ and ‚ÄúOracle Performance Diagnostics and Tuning‚ÄĚ Webinars running in February 2021:¬† ¬†(International Customers Only)

 

Each webinar is $1,800 (AUS) individually and $3,000 (AUS) when both purchased in the combo package (the Buy Now functionality is available for International Customers Only).

Please Note: If based in Australia, please contact me (at richard@richardfooteconsulting.com) for a tax invoice that includes GST and instructions on how to pay.

 

Please contact me directly if you wish to pay via invoice and direct bank transfer or if you have any questions.

 

For full content details of the “Oracle Indexing Internals and Best Practices” Webinar: https://richardfooteconsulting.com/indexing-seminar/

For full content details of the “Oracle Performance Diagnostics and Tuning” Webinar: https://richardfooteconsulting.com/performance-tuning-seminar/

 

Hopefully you can take advantage of the opportunity to participate in this unique training experience (see here for some testimonials)…

Oracle 19c Automatic Indexing: Indexing Partitioned Tables Part I (Conversation Piece) October 14, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Index Access Path, Local Indexes, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Partitioned Indexes, Partitioning, Performance Tuning.
2 comments

In this little series, I’m going to discuss how Automatic Indexing works in relation to Partitioning.

I’ve discussed Indexing and Partitioning many times previously and how Oracle has various options when indexing a partitioned table:

  • Non-Partitioned Index
  • Globally Partitioned Index
  • Locally Partitioned Index

So the question(s) are how does Automatic Indexing handle scenarios with partitioned objects?

A very important point to make at the start is that based on my research, the answer has already changed significantly since Automatic Indexing was first released. So it’s important to understand that Automatic Indexing is an ever evolving capability, that will advance and improve as time goes on.

I’ll focus on how the feature currently works (as of Oracle Database 19.5), but will mention previously identified behaviour as a reference on how things can easily change.

In my first simple little example, I’m just going to create a range-partitioned table, partitioned based on RELEASE_DATE, with a partition for each year’s worth of data:

SQL> CREATE TABLE big_bowie1(id number, album_id number, country_id number, release_date date,
total_sales number) PARTITION BY RANGE (release_date)
(PARTITION ALBUMS_2013 VALUES LESS THAN (TO_DATE('01-JAN-2014', 'DD-MON-YYYY')),
PARTITION ALBUMS_2014 VALUES LESS THAN (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')),
PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')),
PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')),
PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')),
PARTITION ALBUMS_2018 VALUES LESS THAN (TO_DATE('01-JAN-2019', 'DD-MON-YYYY')),
PARTITION ALBUMS_2019 VALUES LESS THAN (TO_DATE('01-JAN-2020', 'DD-MON-YYYY')),
PARTITION ALBUMS_2020 VALUES LESS THAN (MAXVALUE));

Table created.

 

I’ll now add about 8 years worth of data:

SQL> INSERT INTO big_bowie1 SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> COMMIT;

Commit complete.

 

As discussed previously, I’ll importantly collect statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_BOWIE1');

PL/SQL procedure successfully completed.

 

I’ll now run the following very selective query based the TOTAL_SALES column that is NOT part of the partitioning key:

 

SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42;

19 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2468051548

---------------------------------------------------------------------------------------------------------
| Id | Operation                | Name       | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop     |
---------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |            |   20 |   520 |    643 (15)| 00:00:01 |       |           |
|  1 | PARTITION RANGE ALL      |            |   20 |   520 |    643 (15)| 00:00:01 |     1 |         8 |
|* 2 | TABLE ACCESS STORAGE FULL| BIG_BOWIE1 |   20 |   520 |    643 (15)| 00:00:01 |     1 |         8 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("TOTAL_SALES"=42)
    filter("TOTAL_SALES"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      44014 consistent gets
       9516 physical reads
          0 redo size
       1107 bytes sent via SQL*Net to client
        369 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         19 rows processed

 

Without an index in place, the CBO has no choice but to use a FTS. But what will Automatic Indexing make of things?

If we look at the next Automatic Indexing report:

 

SQL> select dbms_auto_index.report_last_activity() from dual;

GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 13-OCT-2020 01:47:48
Activity end                : 13-OCT-2020 02:59:48
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 184.55 MB (184.55 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 2
SQL statements improved (improvement factor) : 1 (44119.6x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 25135.8x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner | Table      | Index                | Key         | Type   | Properties |
---------------------------------------------------------------------------------
| BOWIE | BIG_BOWIE1 | SYS_AI_2zt7rg40mxa4n | TOTAL_SALES | B-TREE | NONE       |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : chwm2gubm8fx9
SQL Text            : SELECT * FROM big_bowie1 WHERE total_sales = 42
Improvement Factor  : 44119.6x

Execution Statistics:
-----------------------------
                     Original Plan                Auto Index Plan
                     ---------------------------- ----------------------------
Elapsed Time (s):    4387193                      1173
CPU Time (s):        2599423                      1037
Buffer Gets:         749507                       22
Optimizer Cost:      643                          22
Disk Reads:          470976                       2
Direct Writes:       0                            0
Rows Processed:      323                          19
Executions:          17                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 2468051548

-----------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost | Time        |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            |      |       |  643 |             |
|  1 | PARTITION RANGE ALL       |            |   20 |   520 |  643 | 00:00:01    |
|  2 | TABLE ACCESS STORAGE FULL | BIG_BOWIE1 |   20 |   520 |  643 | 00:00:01    |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 937174207

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                 | Rows | Bytes | Cost | Time       |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                      |   19 |   494 |   22 | 00:00:01   |
|   1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE1           |   19 |   494 |   22 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                           | SYS_AI_2zt7rg40mxa4n |   19 |       |    3 | 00:00:01   |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("TOTAL_SALES"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We notice a couple of interesting points.

Firstly, yes Automatic Indexing has created an index based on the TOTAL_SALES column (SYS_AI_2zt7rg40mxa4n) as it improves performance by a reported 44119.6x.

Note also that the Automatic Index is a Non-Partitioned (Global) Index. From a performance perspective, this is the most efficient index to create to improve the performance of this query as the CBO only has the one index structure to navigate (vs. a LOCAL index that would require having to navigate down all 8 index structures for each table partition.

If we look at the index details:

SQL> SELECT index_name, partitioned, auto, visibility, status FROM user_indexes
WHERE table_name = 'BIG_BOWIE1';

INDEX_NAME                     PAR AUT VISIBILIT STATUS
------------------------------ --- --- --------- --------
SYS_AI_2zt7rg40mxa4n           NO  YES VISIBLE   VALID

 

We notice that this is indeed a Non-Partitioned Index, that is both VISIBLE and VALID and so can be potentially used by any database session.

If we now re-run the query:

SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42;

19 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 937174207

-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                                 | Name                 | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop    |
-----------------------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                          |                      |   20 |   520 |      23 (0)| 00:00:01 |       |          |
|  1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| BIG_BOWIE1           |   20 |   520 |      23 (0)| 00:00:01 | ROWID | ROWID    |
|* 2 | INDEX RANGE SCAN                          | SYS_AI_2zt7rg40mxa4n |   20 |       |       3 (0)| 00:00:01 |       |          |
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("TOTAL_SALES"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         23 consistent gets
          0 physical reads
          0 redo size
       1166 bytes sent via SQL*Net to client
        369 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         19 rows processed

 

We can see the query now uses the newly created Automatic Index and is indeed more efficient, performing now just 23 consistent gets (previously 44014 consistent gets).

 

However, this was NOT previous behaviour.

The documentation previously mentioned that only LOCAL indexes are used when indexing partitioned tables.

If we run the same demo on Oracle Database 19.3, we get the following report:

 

GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 14-OCT-2020 13:12:07
Activity end                : 14-OCT-2020 14:24:07
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 192.94 MB (192.94 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 1
SQL statements improved (improvement factor) : 1 (1950.5x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 1950.5x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner | Table      | Index                | Key         | Type   | Properties |
---------------------------------------------------------------------------------
| BOWIE | BIG_BOWIE1 | SYS_AI_8armv0hqq73fa | TOTAL_SALES | B-TREE | LOCAL      |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 2pp8ypramw30s
SQL Text            : SELECT * FROM big_bowie1 WHERE total_sales = 42
Improvement Factor  : 1950.5x

Execution Statistics:
-----------------------------
                     Original Plan                Auto Index Plan
                     ---------------------------- ----------------------------
Elapsed Time (s):    6996973                      27327
CPU Time (s):        6704215                      12819
Buffer Gets:         815306                       49
Optimizer Cost:      12793                        28
Disk Reads:          2                            40
Direct Writes:       0                            0
Rows Processed:      475                          25
Executions:          19                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 4294056405

-----------------------------------------------------------------------------
| Id | Operation          | Name       | Rows | Bytes | Cost  | Time        |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |            |      |       | 12793 |             |
| 1 | PARTITION RANGE ALL |            |   20 |   520 | 12793 | 00:00:01    |
| 2 | TABLE ACCESS FULL   | BIG_BOWIE1 |   20 |   520 | 12793 | 00:00:01    |
-----------------------------------------------------------------------------

- With Auto Indexes
-----------------------------
Plan Hash Value : 3781269341

--------------------------------------------------------------------------------------------------------------
|  Id | Operation                                 | Name                 | Rows | Bytes | Cost | Time        |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                      |   25 |   650 |   28 | 00:00:01    |
|   1 | PARTITION RANGE ALL                       |                      |   25 |   650 |   28 | 00:00:01    |
|   2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_BOWIE1           |   25 |   650 |   28 | 00:00:01    |
| * 3 | INDEX RANGE SCAN                          | SYS_AI_8armv0hqq73fa |   25 |       |   17 | 00:00:01    |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 3 - access("TOTAL_SALES"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

As we can see, in this scenario, the newly created Automatic Index has a “Property” of LOCAL.

If we look at its index details:

 

SQL> SELECT index_name, partitioned, auto, visibility, status FROM user_indexes
WHERE table_name = 'BIG_BOWIE1';

INDEX_NAME                     PAR AUT VISIBILIT STATUS
------------------------------ --- --- --------- --------
SYS_AI_8armv0hqq73fa           YES YES VISIBLE   N/A

SQL> SELECT index_name, partitioning_type, partition_count, locality FROM user_part_indexes
WHERE table_name = 'BIG_BOWIE1';

INDEX_NAME                     PARTITION PARTITION_COUNT LOCALI
------------------------------ --------- --------------- ------
SYS_AI_8armv0hqq73fa           RANGE                   8 LOCAL

 

We can see how a Local Index was previously created.

As such if we re-run an equivalent query:

SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42;

25 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3781269341

-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                                | Name                 | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop     |
-----------------------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                         |                      |   20 |   520 |      26 (0)| 00:00:01 |       |           |
|  1 | PARTITION RANGE ALL                      |                      |   20 |   520 |      26 (0)| 00:00:01 |     1 |         8 |
|  2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| BIG_BOWIE1           |   20 |   520 |      26 (0)| 00:00:01 |     1 |         8 |
|* 3 | INDEX RANGE SCAN                         | SYS_AI_8armv0hqq73fa |   20 |       |      17 (0)| 00:00:01 |     1 |         8 |
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("TOTAL_SALES"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         50 consistent gets
          0 physical reads
          0 redo size
       1555 bytes sent via SQL*Net to client
        409 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         25 rows processed

 

Although the query is returning 6 more rows (as with the random number generation, has a slightly different data set), it’s more expensive proportionally now having to perform 50 consistent gets as it now has to read 8 index structures rather than just the one.

So (IMHO), Automatic Indexing has improved here, creating a more efficient index structure than previously. So always bear in mind that Automatic Indexing is an evolving beast, improving and adapting as time moves on.

However, note the compromise here is that by having an effectively Global index structure, there may be some additional issues depending on any subsequent structural changes to the table.

More on Automatic Indexing and Partitioning in my next post…

Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part III (Do Anything You Say) October 8, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Performance Tuning, Stale Statistics.
2 comments

In Part I of this series, we saw how Automatic Indexing will not create a viable Automatic Index if there are stale or missing statistics on the underlining segments. In Part II we saw how these SQL statements effectively become blacklisted and when segment statistics are subsequently collected, Automatic Indexing will still not create viable Automatic Indexes when the SQL statements are re-run.

So how do we get Automatic Indexing to now kick in and create necessary indexes on these problematic SQLs?

As I’ve discussed previously in relation to blacklisted SQLs, we need to run a NEW SQL statement that hasn’t been blacklist that will result in a necessary index to be created. An easy way to do this is just to include a new comment within the previous SQL to give the SQL a new signature.

If we now run the following “new” SQL statement (identical to the problematic SQL but with a comment embedded):

SQL> select /* new */ * from bowie_stale where code=42;

        ID       CODE NAME
---------- ---------- ------------------------------------------
   1000041         42 David Bowie
   6000041         42 David Bowie
        41         42 David Bowie
   3000041         42 David Bowie
   7000041         42 David Bowie
   8000041         42 David Bowie
   4000041         42 David Bowie
   9000041         42 David Bowie
   5000041         42 David Bowie
   2000041         42 David Bowie

 

If we now wait to see what the next Automatic Indexing task makes of things:

 

SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 07-JUL-2020 06:34:49
Activity end                : 07-JUL-2020 06:35:54
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 142.61 MB (142.61 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 1
SQL statements improved (improvement factor) : 1 (19787.7x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 19787.7x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------
| Owner | Table       | Index                | Key  | Type   | Properties |
---------------------------------------------------------------------------
| BOWIE | BOWIE_STALE | SYS_AI_300kk2unp8tr0 | CODE | B-TREE | NONE       |
---------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

We see that the index on the CODE column (SYS_AI_300kk2unp8tr0) has now been created.

Further down the report:

 

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : du6psd0xmzpg5
SQL Text            : select /* new */ * from bowie_stale where code=42
Improvement Factor  : 19787.7x

Execution Statistics:
-----------------------------
                  Original Plan Auto           Index Plan
                  ---------------------------- ----------------------------
Elapsed Time (s): 137261                       2620
CPU Time (s):     84621                        1769
Buffer Gets:      277028                       13
Optimizer Cost:   544                          13
Disk Reads:       275947                       2
Direct Writes:    0                            0
Rows Processed:   70                           10
Executions:       7                            1

 

A new index was indeed created because of this new SQL statement, with a performance improvement of 19787.7x.

Further down the report to the Plans Section:

 

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 65903426

-----------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost | Time        |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |             |      |       |  544 |             |
| 1 | TABLE ACCESS STORAGE FULL | BOWIE_STALE |   10 |   230 |  544 | 00:00:01    |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 2558864466

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   10 |   230 |   13 | 00:00:01   |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_STALE          |   10 |   230 |   13 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_300kk2unp8tr0 |   10 |       |    3 | 00:00:01   |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We can see that the new plan using the new Automatic Index with a much lower CBO cost.

If we now look at the status of this index:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_STALE';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_STALE_PK                 NO  YES VISIBLE   DISABLED      VALID      10000000       20164             59110
SYS_AI_300kk2unp8tr0           YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       16891          10000000

 

We see that the index is now both VISIBLE and VALID (previously, it was INVISIBLE and UNUSABLE).

As such, the Automatic Index can now potentially be used by any SQL, including the previous problematic query.

So with a viable index now in place, if we re-run the initial problematic query:

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2558864466

------------------------------------------------------------------------------------------------------------
| Id | Operation                          | Name                 | Rows | Bytes | Cost (%CPU)| Time        |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                   |                      |   10 |   230 |      14 (0)| 00:00:01    |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_STALE          |   10 |   230 |      14 (0)| 00:00:01    |
|* 2 | INDEX RANGE SCAN                   | SYS_AI_300kk2unp8tr0 |   10 |       |       3 (0)| 00:00:01    |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         14 consistent gets
          0 physical reads
          0 redo size
        738 bytes sent via SQL*Net to client
        361 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

We see that finally, the SQL uses the new Automatic Index and is indeed much more efficient as a result, with just 14 consistent gets required (when previously it was 39430 consistent gets).

So if ever you come across the scenario where an SQL does not have an Automatic Index created when clearly it should, it could be that it has been blacklisted and needs a different SQL to actually generate the necessary index.

To avoid some of these issues, make sure you do not have stale or missing statistics when reliant on Automatic Indexing. The new High Frequency Statistics Collection capability to designed to specifically avoid such a scenario.

Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part I (Dead Against It) October 6, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Exadata X8, Full Table Scans, High Frequency Statistics Collection, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning, Stale Statistics, Unusable Indexes.
5 comments

A “golden rule” when working with Automatic Indexing is that things don’t work properly if there are stale statistics on the dependant objects. Stale statistics can of course be problematic but they can be particularly troublesome when dealing with Automatic Indexing.

In the Oracle Autonomous Database environments, this issue is addressed somewhat by the new High Frequency Statistics Collection capability, which helps to automatically collect stale statistics on a regular basis. However, in on-prem Exadata environments where this can more easily be turned off or collected less frequently, it’s a potential issue worth consideration.

I’ll start with a simple little table, with a CODE column that has lots of distinct values:

SQL> create table bowie_stale (id number constraint bowie_stale_pk primary key, code number, name varchar2(42));

Table created.

SQL> insert into bowie_stale select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

Importantly, I don’t collect statistics on this newly populated table…

SQL> select table_name, num_rows, blocks, last_analyzed from user_tables
where table_name='BOWIE_STALE';

TABLE_NAME        NUM_ROWS     BLOCKS LAST_ANAL
--------------- ---------- ---------- ---------
BOWIE_STALE

SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols
where table_name='BOWIE_STALE';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       LAST_ANAL
-------------------- ------------ ---------- --------------- ---------
ID                                           NONE
CODE                                         NONE
NAME                                         NONE

If we now run the following query a number of times while there are no statistics on the table:

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)| Time        |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |  437 | 21413 |    553 (16)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |  437 | 21413 |    553 (16)| 00:00:01    |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          6  recursive calls
          0  db block gets
      39026  consistent gets
          0  physical reads
          0  redo size
        867  bytes sent via SQL*Net to client
        588  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

 

The CBO has no choice but to use a FTS as I don’t yet have an index on the CODE column.

If I now wait for the next Automatic Indexing task to kick in AND if there are still NO statistics on the table:

 

SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual;

REPORT

--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 05-JUL-2020 06:36:31
Activity end                 : 05-JUL-2020 06:37:07
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates            : 1
Indexes created             : 0
Space used                  : 0 B
Indexes dropped             : 0
SQL statements verified     : 0
SQL statements improved     : 0
SQL plan baselines created  : 0
Overall improvement factor  : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

 

You can see that there was the one index candidate BUT no Automatic Index appears to have been created.

Assuming there are still no statistics:

 

SQL> select table_name, num_rows, blocks, last_analyzed from user_tables where table_name='BOWIE_STALE';

TABLE_NAME                       NUM_ROWS    BLOCKS  LAST_ANAL
------------------------------ ---------- ---------- ---------
BOWIE_STALE

SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols
where table_name='BOWIE_STALE2';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       LAST_ANAL
-------------------- ------------ ---------- --------------- ---------
ID                                           NONE
CODE                                         NONE
NAME                                         NONE

 

If we look now at what indexes exist on the table:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_STALE';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_STALE_PK                 NO  YES VISIBLE   DISABLED      VALID
SYS_AI_300kk2unp8tr0           YES NO  INVISIBLE DISABLED      UNUSABLE          0           0                 0

SQL> select index_name, column_name, column_position from user_ind_columns
where table_name='BOWIE_STALE2' order by index_name, column_position;

INDEX_NAME                     COLUMN_NAME          COLUMN_POSITION
------------------------------ -------------------- ---------------
BOWIE_STALE_PK                 ID                                 1
SYS_AI_300kk2unp8tr0           CODE                               1

 

We notice there is now an Automatic Index BUT it remains in an UNUSABLE/INVISIBLE state. This means the index can’t be used by the CBO.

So if we now re-run the SQL query again:

 

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)| Time        |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |  437 | 21413 |    553 (16)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |  437 | 21413 |    553 (16)| 00:00:01    |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          6  recursive calls
          0  db block gets
      39026  consistent gets
          0  physical reads
          0  redo size
        867  bytes sent via SQL*Net to client
        588  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

 

The CBO has no choice still but to use the FTS.

In Part II, we’ll see that once we get into this scenario, it can be a tad problematic to get ourselves out of it and get the Automatic Index created as we would like…

Oracle 19c Automatic Indexing: Data Skew Fixed By Baselines Part II (Sound And Vision) September 28, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, Baselines, CBO, Data Skew, Exadata, Explain Plan For Index, Full Table Scans, Histograms, Index Access Path, Index statistics, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c, Performance Tuning.
add a comment

 

In my previous post, I discussed how the Automatic Indexing task by using Dynamic Sampling Level=11 can correctly determine the correct query cardinality estimates and assume the CBO will likewise determine the correct cardinality estimate and NOT use an index if it would cause performance to regress.

However, if other database sessions DON’T use Dynamic Sampling at the same Level=11 and hence NOT determine correct cardinality estimates, newly created Automatic Indexes might get used by the CBO inappropriately and result inefficient execution plans.

Likewise, with incorrect CBO cardinality estimates, it might also be possible for newly created Automatic Indexes to NOT be used when they should be (as I’ve discussed previously).

These are potential issues if the Dynamic Sampling value differs between the Automatic Indexing task and other database sessions.

One potential way to make things more consistent and see how the Automatic Indexing behaves if it detects an execution plan where the CBO would use an Automatic Index that causes performance regression, is to disable Dynamic Sampling within the Automatic Indexing task.

This can be easily achieved by using the following hint which effectively disables Dynamic Sampling with the previous problematic query:

SQL> select /*+ dynamic_sampling(0) */ * from space_oddity where code in (190000, 170000, 150000, 130000, 110000, 90000, 70000, 50000, 30000, 10000);

1000011 rows selected.

Execution Plan
----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |  1005K|   135M| 11411   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| SPACE_ODDITY |  1005K|   135M| 11411   (1)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR
           "CODE"=70000 OR "CODE"=90000 OR "CODE"=110000 OR "CODE"=130000 OR
           "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      41169  consistent gets
          0  physical reads
          0  redo size
   13535504  bytes sent via SQL*Net to client
       2705  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000011  rows processed

 

The query currently has good cardinality estimates (1005K vs 1000011 rows returned) only because we currently have histograms in place for the CODE column. As such, the query correctly uses a FTS.

However, if we now remove the histogram on the CODE column:

SQL> exec dbms_stats.gather_table_stats(null, 'SPACE_ODDITY', method_opt=> 'FOR ALL COLUMNS SIZE 1’);

PL/SQL procedure successfully completed.

 

There is no way for the CBO to now determine the correct cardinality estimate because of the skewed data and missing histograms.

So what does the Automatic Indexing tasks make of things now. If we look at the next activity report:

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 18-AUG-2020 16:42:33
Activity end                 : 18-AUG-2020 16:43:06
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created                              : 0
Space used                                   : 0 B
Indexes dropped                              : 0
SQL statements verified                      : 1
SQL statements improved                      : 0
SQL plan baselines created (SQL statements)  : 1 (1)
Overall improvement factor                   : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

We can see that it has verified this one new statement and has created 1 new SQL Plan Baseline as a result.

If we look at the Verification Details part of this report:

 

VERIFICATION DETAILS
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
The following SQL plan baselines were created:
-------------------------------------------------------------------------------
Parsing Schema Name     : BOWIE
SQL ID                  : 3yz8unzhhvnuz
SQL Text                : select /*+ dynamic_sampling(0) */ * from
space_oddity where code in (190000, 170000, 150000,
130000, 110000, 90000, 70000, 50000, 30000, 10000)
SQL Signature           : 3910785437403172730
SQL Handle              : SQL_3645e6a2952fcf7a
SQL Plan Baselines (1)  : SQL_PLAN_3cjg6naakzmvu198c05b9

We can see Automatic Indexing has created a new SQL Plan Baseline for our query with Dynamic Sampling set to 0 thanks to the hint.

Basically, the Automatic Indexing task has found a new query and determined the CBO would be inclined to use the index, because it now incorrectly assumes few rows are to be returned. It makes the poor cardinality estimate because there are currently no histograms in place AND because it can’t now use Dynamic Sampling to get a more accurate picture of things on the fly because it has been disabled with the dynamic_sampling(0) hint.

Using an Automatic Index over the current FTS plan would make the performance of the SQL regress.

Therefore, to protect the current FTS plan, Automatic Indexing has created a SQL Plan Baseline that effectively forces the CBO to use the current, more efficient FTS plan.

This can be confirmed by looking at the DBA_AUTO_INDEX_VERIFICATIONS view:

 

SQL> select execution_name, original_buffer_gets, auto_index_buffer_gets, status
from dba_auto_index_verifications where sql_id = '3yz8unzhhvnuz';

EXECUTION_NAME             ORIGINAL_BUFFER_GETS AUTO_INDEX_BUFFER_GETS STATUS
-------------------------- -------------------- ---------------------- ---------
SYS_AI_2020-08-18/16:42:33                41169                 410291 REGRESSED

 

If we now re-run the SQL again (noting we still don’t have histograms on the CODE column):

SQL> select /*+ dynamic_sampling(0) */ * from space_oddity where code in (190000, 170000, 150000, 130000, 110000, 90000, 70000, 50000, 30000, 10000);

1000011 rows selected.

Execution Plan
----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |    32 |  4512 | 11425   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| SPACE_ODDITY |    32 |  4512 | 11425   (2)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR
           "CODE"=70000 OR "CODE"=90000 OR "CODE"=110000 OR "CODE"=130000 OR
           "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000)

Hint Report (identified by operation id / Query Block Name / Object Alias):

Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
1 -  SEL$1
U -  dynamic_sampling(0) / rejected by IGNORE_OPTIM_EMBEDDED_HINTS

Note
-----

- SQL plan baseline "SQL_PLAN_3cjg6naakzmvu198c05b9" used for this statement

Statistics
----------------------------------------------------------
          9  recursive calls
          4  db block gets
      41170  consistent gets
          0  physical reads
          0  redo size
   13535504  bytes sent via SQL*Net to client
       2705  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000011  rows processed

 

We can see the CBO is forced to use the SQL Plan Baseline “SQL_PLAN_3cjg6naakzmvu198c05b9” as created by the Automatic Indexing task to ensure the more efficient FTS is used and not the available Automatic Index.

So Automatic Indexing CAN create SQL PLan Baselines to protect SQL from performance regressions caused by inappropriate use of Automatic Indexes BUT it’s really hard and difficult for it to do this effectively if the Automatic Indexing tasks and other database sessions have differing Dynamic Sampling settings as it does by default…