jump to navigation

Costing Concatenated Indexes With Range Scan Predicates Part II (Coming Back To Life) July 27, 2022

Posted by Richard Foote in Automatic Indexing, CBO, Column Statistics, Concatenated Indexes, Explain Plan For Index, Full Table Scans, Index Access Path, Index Column Order, Index Column Reorder, Index Internals, Index statistics, Leaf Blocks, Non-Equality Predicates, Oracle, Oracle Blog, Oracle Cost Based Optimizer, Oracle General, Oracle Index Seminar, Oracle Indexes, Oracle Statistics, Performance Tuning, Richard Foote Training.
add a comment

In my previous Part I post, I discussed how the CBO basically stops the index leaf block access calculations after a non-equality predicate. This means that for an index with the leading indexed column being accessed via an unselective non-equality predicate, a large percentage of the index’s leaf blocks might need to be scanned, making the index access path unviable.

In the example in Part I, an index on the ID, CODE columns was too expensive due to the unselective range-scan predicate based on the leading ID column.

To provide the CBO a potentially much more efficient access path, we need an index with the more selective CODE predicate to be the leading column:

SQL> CREATE INDEX radiohead_code_id_i ON radiohead(code, id);

Index created.

SQL> SELECT index_name, blevel, leaf_blocks, clustering_factor

FROM user_indexes WHERE index_name = 'RADIOHEAD_CODE_ID_I';

INDEX_NAME                        BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
----------------------------- ---------- ----------- -----------------
RADIOHEAD_CODE_ID_I                    1         265             98619

If we now re-run the previous query:

SQL> SELECT * FROM radiohead WHERE id BETWEEN 1000 AND 5000 AND CODE = 140;

Execution Plan

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                     |     4 |    72 |     6   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| RADIOHEAD           |     4 |    72 |     6   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | RADIOHEAD_CODE_ID_I |     4 |       |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
        806  bytes sent via SQL*Net to client
        608  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          4  rows processed

We notice the CBO is now using this new index, as the costs for this index-based plan have dropped significantly, down to just 6 (from the previous 116). This overall cost of 6 is lower than the cost of 105 for the Full Table Scan and hence the reason why this index-based plan is now chosen by the CBO.

This is all due entirely to the significant drop in costs in accessing the index itself, now just 2 (from the previous 112).

Importantly, these much lower costs are accurate as we can tell via the reduced number of consistent reads, now just 7 (from 114 from the previous index-based plan).

If we now look at the associated costings:

Effective Index Selectivity = CODE selectivity x ID selectivity

= (1/10000) x ((5000-1000)/(10000-1) + 2 x (1/10000))

= 0.0001 x ((4000/9999) + 0.0002)

= 0.0001 x 0.40024)

= 0.000040024

Effective Table Selectivity = same as Index Selectivity

= 0.000040024

 

The effective index selectivity of 0.000040024 is now much lower than the previous (0.40024), as the CBO can now consider the product of the selectivities of both columns).

If we now plug this improved effective index selectivity into the index path costing calculations:

Index IO Cost = blevel +

ceil(effective index selectivity x leaf_blocks) +

ceil(effective table selectivity x clustering_factor)

 

Index IO Cost = 1  +  ceil(0.000040024 x 265) + ceil(0.000040024 x 99034)

= 1 + 1  + 4

= 2 + 4

= 6

Index Access Cost  = IO Costs + CPU Costs (in this plan, 0% of total costs and so unchanged from the IO costs)

= 2 + 4

= 6

We can see how the respective 2 and 6 improved CBO index costings are derived.

So again, it’s important to note that Automatic Indexing is doing entirely the correct thing with these examples, when it creates an index with the equality based predicate columns as the leading columns of the index…

Automatic Indexing 21c: Non-Equality Predicate Anomaly (“Strangers When We Meet”) July 14, 2022

Posted by Richard Foote in 21c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Exadata X8, Full Table Scans, Index Column Order, Invisible Indexes, Non-Equality Predicates, Oracle, Oracle 21c, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle Indexes, Performance Tuning, Richard Foote Training, Richard's Blog.
3 comments

I’m currently putting together some Exadata related training for a couple of customers and came across a rather strange anomaly with regard the status of Automatic Indexes, when created in part on unselective, non-equality predicates.

As discussed previously, Oracle Database 21c now allows the creation of Automatic Indexes based on non-equality predicates (previously, Automatic Indexes were only created on equality-based predicates).

But one appears to get rather odd resultant Automatic Indexes in the scenario where the non-equality predicate is not particularly selective but other predicates are highly selective.

To illustrate, I’ll create a basic table that has two columns (ID and CODE) that are both highly selective:

SQL> create table ziggy_new (id number, code number, name varchar2(42));

Table created.

SQL> insert into ziggy_new select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_NEW');

PL/SQL procedure successfully completed.

So there are currently no indexes on this table.

I’ll next run the following SQL (and others similar) a number of times:

SQL> select * from ziggy_new where code=42 and id between 1 and 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 3165184525

----------------------------------------------------------------------------------------
| Id  | Operation                  | Name      | Rows | Bytes | Cost (%CPU) | Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |    1 |    23 |    6738 (2) | 00:00:01 |
| * 1 |  TABLE ACCESS STORAGE FULL | ZIGGY_NEW |    1 |    23 |    6738 (2) | 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - storage("CODE"=42 AND "ID"<=100000 AND "ID">=1)
       filter("CODE"=42 AND "ID"<=100000 AND "ID">=1)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      38605 consistent gets
      38600 physical reads
          0 redo size
        725 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

Without any indexes, the CBO currently has no choice but to use a Full Table Scan.

But only 1 row is returned. The first equality predicate on the CODE column is highly selective and on its own would only return 10 rows out of the 10M row table. The second, non-equality range-based predicate on the ID column is nowhere near as selective and offers limited additional filtering.

The CBO stops calculating index related costs after a non-equality predicate column (as subsequent column values could exist anywhere within the preceding range), and so the more effective index here is one based on (CODE, ID) with the non-equality predicate column second,  or potentially just on the CODE column only, as the ID range offers minimal filtering benefits.

So what does Automatic Indexing make of things?

If we look at the subsequent Automatic Indexing report:

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 3
Indexes created (visible / invisible)        : 1 (0 / 1)
Space used (visible / invisible)             : 209.72 MB (0 B / 209.72 MB)
Indexes dropped                              : 0
SQL statements verified                      : 44
SQL statements improved (improvement factor) : 12 (64.7x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 1.6x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
----------------------------------------------------------------------------
| Owner | Table     | Index                | Key     | Type   | Properties |
----------------------------------------------------------------------------
| BOWIE | ZIGGY_NEW | SYS_AI_75j16xff1ag3j | CODE,ID | B-TREE | NONE       |
----------------------------------------------------------------------------

So Automatic Indexing has indeed created an index based on CODE,ID (a common Automatic Indexing trait appears to be to create an index based on all available predicates).

BUT the index is created as an INVISIBLE Index and so can not generally be used by database sessions.

SQL> select index_name, auto, visibility, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='ZIGGY_NEW';

INDEX_NAME                     AUT VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- -------- ---------- ----------- -----------------
SYS_AI_75j16xff1ag3j           YES INVISIBLE VALID      10000000       25123          10000000

SQL> select index_name, column_name, column_position
     from user_ind_columns where table_name='ZIGGY_NEW';

INDEX_NAME                     COLUMN_NAME  COLUMN_POSITION
------------------------------ ------------ ---------------
SYS_AI_75j16xff1ag3j           CODE                       1
SYS_AI_75j16xff1ag3j           ID                         2

 

So re-running the previous SQL statements continues to use a Full Table Scan:

SQL> select * from ziggy_new where code=42 and id between 1 and 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 3165184525

----------------------------------------------------------------------------------------
|  Id | Operation                  | Name      | Rows | Bytes | Cost (%CPU) | Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |    1 |    23 |    6738 (2) | 00:00:01 |
| * 1 |  TABLE ACCESS STORAGE FULL | ZIGGY_NEW |    1 |    23 |    6738 (2) | 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - storage("CODE"=42 AND "ID"<=100000 AND "ID">=1)
       filter("CODE"=42 AND "ID"<=100000 AND "ID">=1)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      38605 consistent gets
      38600 physical reads
          0 redo size
        725 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

 

Automatic Indexing appears to only create Invisible indexes when there is an inefficient non-equality predicate present. It won’t create the index as a Visible index, even though it would significantly benefit these SQL statements that caused its creation. And Automatic Indexing won’t create an index on just the highly selective CODE equality predicate, which would also be of much benefit to these SQL statements.

If we now run similar queries, but with much more selective non-equality predicates, such as:

SQL> select * from ziggy_new where code=1 and id between 1 and 10;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3165184525

----------------------------------------------------------------------------------------
|  Id | Operation                  | Name      | Rows | Bytes | Cost (%CPU) | Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT           |           |    1 |    23 |    6738 (2) | 00:00:01 |
| * 1 |  TABLE ACCESS STORAGE FULL | ZIGGY_NEW |    1 |    23 |    6738 (2) | 00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - storage("CODE"=1 AND "ID"<=10 AND "ID">=1)
       filter("CODE"=1 AND "ID"<=10 AND "ID">=1)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      38604 consistent gets
      38600 physical reads
          0 redo size
        503 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

Again, with no (Visible) index present, the CBO currently has no choice but to use the Full Table Scan.

But during the next cycle, after Automatic Indexing kicks in again:

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 5
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 209.72 MB (209.72 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 89
SQL statements improved (improvement factor) : 31 (71.9x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 1.7x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
----------------------------------------------------------------------------
| Owner | Table     | Index                | Key     | Type   | Properties |
----------------------------------------------------------------------------
| BOWIE | ZIGGY_NEW | SYS_AI_75j16xff1ag3j | CODE,ID | B-TREE | NONE       |
----------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : d4znwcu4h52ca
SQL Text            : select * from ziggy_new where code=42 and id between 1 and 10
Improvement Factor  : 38604x

Execution Statistics:
-----------------------------
                    Original Plan                Auto Index Plan
                    ---------------------------- ----------------------------
Elapsed Time (s):   3398605                      68
CPU Time (s):       3166824                      68
Buffer Gets:        463250                       3
Optimizer Cost:     6738                         4
Disk Reads:         463200                       0
Direct Writes:      0                            0
Rows Processed:     0                            0
Executions:         12                           1

PLANS SECTION
--------------------------------------------------------------------------------
-------------

- Original
-----------------------------
Plan Hash Value : 3165184525

--------------------------------------------------------------------------------
| Id | Operation                  | Name      | Rows | Bytes | Cost | Time     |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT           |           |      |       | 6738 |          |
|  1 |  TABLE ACCESS STORAGE FULL | ZIGGY_NEW |    1 |    23 | 6738 | 00:00:01 |
--------------------------------------------------------------------------------

- With Auto Indexes
-----------------------------
Plan Hash Value : 1514586396

-------------------------------------------------------------------------------------------------------
|  Id | Operation                            | Name                 | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |    1 |    23 |    4 | 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_NEW            |    1 |    23 |    4 | 00:00:01 |
| * 2 |   INDEX RANGE SCAN                   | SYS_AI_75j16xff1ag3j |    1 |       |    3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE"=42 AND "ID">=1 AND "ID"<=10)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

But this time, the index on the CODE,ID columns is created as a Visible index.

INDEX_NAME                     AUT VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- -------- ---------- ----------- -----------------
SYS_AI_75j16xff1ag3j           YES VISIBLE   VALID      10000000       25123          10000000

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='ZIGGY_NEW';

INDEX_NAME                     COLUMN_NAME  COLUMN_POSITION
------------------------------ ------------ ---------------
SYS_AI_75j16xff1ag3j           CODE                       1
SYS_AI_75j16xff1ag3j           ID                         2

So this index can be generally used, both by the newer SQLs that generated the now Visible index:

SQL> select * from ziggy_new where code=42 and id between 1 and 10;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1514586396

------------------------------------------------------------------------------------------------------------
| Id | Operation                            | Name                 | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                     |                      |    1 |    23 |       4 (0) | 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_NEW            |    1 |    23 |       4 (0) | 00:00:01 |
|* 2 |   INDEX RANGE SCAN                   | SYS_AI_75j16xff1ag3j |    1 |       |       3 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CODE"=42 AND "ID">=1 AND "ID"<=10)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        503 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

And also used by the SQLs with the unselective non-equality predicates, that Automatic Indexing would only create as Invisible indexes:

SQL> select * from ziggy_new where code=42 and id between 1 and 100000;

Execution Plan
----------------------------------------------------------
Plan hash value: 1514586396

------------------------------------------------------------------------------------------------------------
| Id | Operation                            | Name                 | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                     |                      |    1 |    23 |       4 (0) | 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_NEW            |    1 |    23 |       4 (0) | 00:00:01 |
|* 2 |   INDEX RANGE SCAN                   | SYS_AI_75j16xff1ag3j |    1 |       |       3 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CODE"=42 AND "ID">=1 AND "ID"<=100000)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          5 consistent gets
          0 physical reads
          0 redo size
        729 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

 

Automatic Indexing appears to currently not quite do the right thing with SQL statements that have unselective non-equality predicates, by creating such indexes as only Invisible Indexes, inclusive of the unselective columns.

Although an edge case, I would recommend looking through the list of created Automatic Indexes to see if any such Invisible/Valid indexes exists, as it can suggest there are current inefficient SQL statements that could benefit from such indexes being Visible.

Automatic Indexes: Automatically Rebuild Unusable Indexes Part IV (“Nothing Has Changed”) May 31, 2022

Posted by Richard Foote in 19c, 19c New Features, 21c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Column Order, Index Internals, Local Indexes, Mixing Auto and Manual Indexes, Oracle, Oracle 21c, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Indexing Internals Webinar, Oracle19c, Unusable Indexes.
1 comment so far

In a previous post, I discussed how Automatic Indexing (AI) does not automatically rebuild a manually built index that is in an Unusable state (but will rebuild an Unusable automatically created index).

The demo I used was a simple one, based on manually created indexes referencing a non-partitioned table.

In this post, I’m going to use a demo based on manually created indexes referencing a partitioned table.

I’ll start by creating a rather basic range-based partitioned table, using the RELEASE_DATE column to partition the data by year:

SQL> CREATE TABLE big_bowie (id number, album_id number, country_id number, release_date date,
total_sales number) PARTITION BY RANGE (release_date)
(PARTITION ALBUMS_2014 VALUES LESS THAN (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2018 VALUES LESS THAN (TO_DATE('01-JAN-2019', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2019 VALUES LESS THAN (TO_DATE('01-JAN-2020', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2020 VALUES LESS THAN (TO_DATE('01-JAN-2021', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2021 VALUES LESS THAN (MAXVALUE));

Table created.

SQL> INSERT INTO big_bowie SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_BOWIE');

PL/SQL procedure successfully completed.

I’ll next manually create a couple indexes; a non-partitioned index based on just the ALBUM_ID column and a prefixed locally partitioned index, based on the columns RELEASE_DATE, TOTAL_SALES:

 

SQL> create index album_id_i on big_bowie(album_id);

Index created.

SQL> create index release_date_total_sales_i on big_bowie(release_date, total_sales) local;

Index created.

 

If we now re-organise just partition ALBUMS_2017 (without using the ONLINE clause):

SQL> alter table big_bowie move partition albums_2017;

Table altered.

This results in the non-partitioned index and the ALBUMS_2017 local index partition becoming Unusable:

SQL> select index_name, status from user_indexes where table_name='BIG_BOWIE';

INDEX_NAME                     STATUS
------------------------------ --------
ALBUM_ID_I                     UNUSABLE
RELEASE_DATE_TOTAL_SALES_I     N/A

SQL> select index_name, partition_name, status from user_ind_partitions
     where index_name='RELEASE_DATE_TOTAL_SALES_I';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2014          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2015          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2016          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2017          UNUSABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2018          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2019          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2020          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2021          USABLE

Let’s now run a number of queries a number of times. The first series is based on a predicate on just the ALBUM_ID column, such as:

SQL> select * from big_bowie where album_id=42;

2000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1510748290

-------------------------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows | Bytes | Cost (%CPU) | Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           | 2000 | 52000 |    7959 (2) | 00:00:01 |       |       |
|   1 | PARTITION RANGE ALL |           | 2000 | 52000 |    7959 (2) | 00:00:01 |     1 |     8 |
| * 2 |  TABLE ACCESS FULL  | BIG_BOWIE | 2000 | 52000 |    7959 (2) | 00:00:01 |     1 |     8 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("ALBUM_ID"=42)
  - filter("ALBUM_ID"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      48593 consistent gets
      42881 physical reads
          0 redo size
      44289 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2000 rows processed

We’ll also run a series of queries based on both the RELEASE_DATE column using dates from the unusable index partition and the TOTAL_SALES column, such as:

SQL> select * from big_bowie where release_date='01-JUN-2017' and total_sales=42;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3245457041

----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows | Bytes | Cost (%CPU) | Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |    1 |    26 |     986 (2) | 00:00:01 |       |       |
|   1 | PARTITION RANGE SINGLE |           |    1 |    26 |     986 (2) | 00:00:01 |     4 |     4 |
| * 2 |  TABLE ACCESS FULL     | BIG_BOWIE |    1 |    26 |     986 (2) | 00:00:01 |     4 |     4 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("TOTAL_SALES"=42 AND "RELEASE_DATE"=TO_DATE(' 2017-06-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
   - filter("TOTAL_SALES"=42 AND "RELEASE_DATE"=TO_DATE(' 2017-06-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       5573 consistent gets
          0 physical reads
          0 redo size
        676 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

Without a valid/usable index, the CBO currently has no choice but to use a Full Table Scan on the first query, and a Full Partition Scan on the partition with the unusable local index.

So what does AI make of things? Does it rebuild the unusable manually created indexes so the associated indexes can be used to improve these queries?

If we wait until the next AI task completes and check out the indexes on the table:

SQL> select index_name, status, partitioned from user_indexes where table_name='BIG_BOWIE';

INDEX_NAME                     STATUS   PAR
------------------------------ -------- ---
RELEASE_DATE_TOTAL_SALES_I     N/A      YES
ALBUM_ID_I                     UNUSABLE NO
SYS_AI_aw2825ffpus5s           VALID    NO
SYS_AI_2hf33fpvnqztw           VALID    NO

SQL> select index_name, partition_name, status from user_ind_partitions
     where index_name='RELEASE_DATE_TOTAL_SALES_I';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2014          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2015          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2016          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2017          UNUSABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2018          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2019          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2020          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2021          USABLE

We notice that AI has created two new non-partitioned automatic indexes, while both the manually created indexes remain in the same unusable state. If we look at the columns associated with these new automatic indexes:

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BIG_BOWIE';

INDEX_NAME                     COLUMN_NAME          COLUMN_POSITION
------------------------------ -------------------- ---------------
ALBUM_ID_I                     ALBUM_ID                           1
RELEASE_DATE_TOTAL_SALES_I     RELEASE_DATE                       1
RELEASE_DATE_TOTAL_SALES_I     TOTAL_SALES                        2
SYS_AI_aw2825ffpus5s           ALBUM_ID                           1
SYS_AI_aw2825ffpus5s           RELEASE_DATE                       2
SYS_AI_2hf33fpvnqztw           TOTAL_SALES                        1
SYS_AI_2hf33fpvnqztw           RELEASE_DATE                       2

As we can see, AI has logically replaced both unusable indexes.

The manual index based on ALBUM_ID has been replaced with an inferior index based on the ALBUM_ID, RELEASE_DATE columns. Inferior in that the automatic index is both redundant (if only the manual index on ALBUM_ID were rebuilt) and in that it has the logically unnecessary RELEASE_DATE column to inflate the size of the index.

The manual index based on the RELEASE_DATE, TOTAL_SALES columns has been replaced with a redundant automatic index based on the reversed TOTAL_SALES, RELEASE_DATE columns.

Now, AI has indeed automatically addressed the current FTS performance issues associated with these queries by creating these indexes, but a better remedy would have been to rebuild the unusable manual indexes and hence negate the need for these redundant automatic indexes.

But currently (including with version 21.3), AI will NOT rebuild unusable manually created indexes, no matter the scenario, and will instead create additional automatic indexes if it’s viable for it to do so.

A reason why Oracle at times recommends dropping all current manually created secondary indexes before implementing AI (although of course this comes with a range of obvious issues and concerns).

If these manually created indexes didn’t exist, I’ll leave it as an exercise to the discernable reader on what automatic indexes would have been created…

As always, this restriction may change in future releases…

Automatic Indexes: Automatically Rebuild Unusable Indexes Part III (“Waiting For The Man”) May 17, 2022

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, Exadata, Full Table Scans, Manual Indexes, Mixing Auto and Manual Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle General, Oracle Indexes, Oracle19c, Unusable Indexes.
1 comment so far

I’ve previously discussed how Automatic Indexing (AI) will not only create missing indexes, but will also rebuild unusable indexes, be it a Global or Local index.

However, all my previous examples have been with Automatic Indexes. How does AI handle unusable indexes in which the indexes were manually created?

In my first demo, I’ll start by creating a basic non-partitioned table:

SQL> create table bowie_stuff (id number, album_id number, country_id number, release_date date, total_sales number);

Table created.

SQL> insert into bowie_stuff select rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BOWIE_STUFF');

PL/SQL procedure successfully completed.

We next manually create an index on the highly selective TOTAL_SALES column:

SQL> create index bowie_stuff_total_sales_i on bowie_stuff(total_sales);

Index created.

Let’s now invalidate the index by re-organising the table without the online clause:

SQL> alter table bowie_stuff move;

Table altered.

SQL> select index_name, status from user_indexes where table_name='BOWIE_STUFF';

INDEX_NAME                     STATUS
------------------------------ --------
BOWIE_STUFF_TOTAL_SALES_I      UNUSABLE

So the index is now in an UNUSABLE state.

To perk up the interest of AI, I’ll run a number of queries such as the following with a predicate condition on TOTAL_SALES:

select * from bowie_stuff where total_sales=42;

18 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 910563088

---------------------------------------------------------------------------------
| Id | Operation          | Name        | Rows | Bytes | Cost (%CPU) | Time     |
---------------------------------------------------------------------------------
|  0 | SELECT STATEMENT   |             |   20 |   520 |    7427 (2) | 00:00:01 |
|* 1 |  TABLE ACCESS FULL | BOWIE_STUFF |   20 |   520 |    7427 (2) | 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("TOTAL_SALES"=42)
    filter("TOTAL_SALES"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      42746 consistent gets
      42741 physical reads
          0 redo size
       1392 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         18 rows processed

Without a valid index, the CBO has no choice but to perform an expensive full table scan.

However, it doesn’t matter how long I wait or how many different queries I run similar to the above, AI currently will never rebuild an unusable index if the index was manually created.

AI will only rebuild unusable automatically created indexes.

I’ve discussed previously how automatic and manually created indexes often don’t gel well together and is one of the key reasons why Oracle recommends dropping all manually created secondary indexes if you wish to implement AI (using the DBMS_AUTO_INDEX.DROP_SECONDARY_INDEXES procedure, which I’ll discuss in a future post).

Things can get a little interesting with AI, if the underlining table is partitioned and you have manually created unusable indexes.

As I’ll discuss in my next post…

Automatic Indexes: Automatically Rebuild Unusable Indexes Part II (“I Wish You Would”) May 11, 2022

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Local Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Partitioned Indexes, Partitioning, Performance Tuning, Rebuild Unusable Indexes.
1 comment so far

Within a few hours of publishing my last blog piece on how Automatic Indexing (AI) can automatically rebuild indexes that have been placed in an UNUSABLE state, I was asked by a couple of readers a similar question: “Does this also work if just a single partition of an partitioned index becomes unusable”?

My answer to them both is that I’ve provided them the basic framework in the demo to check out the answer to that question for themselves (Note: a fantastic aspect of working with the Oracle Database is that it’s available for free to play around with, including the Autonomous Database environments).

But based on the principle that for every time someone asks a question, there’s probably a 100 others who potentially might be wondering the same thing, thought I’ll quickly whip up a demo to answer this for all.

I’ll begin with the same table format and data as my previous blog:

SQL> CREATE TABLE big_ziggy(id number, album_id number, country_id number, release_date date,
total_sales number) PARTITION BY RANGE (release_date)
(PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2018 VALUES LESS THAN (TO_DATE('01-JAN-2019', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2019 VALUES LESS THAN (TO_DATE('01-JAN-2020', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2020 VALUES LESS THAN (TO_DATE('01-JAN-2021', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2021 VALUES LESS THAN (TO_DATE('01-JAN-2022', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2022 VALUES LESS THAN (MAXVALUE));

Table created.

SQL> INSERT INTO big_ziggy SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> COMMIT;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_ZIGGY');

PL/SQL procedure successfully completed.

 

But this time, I’ll run a number of queries similar to the following, that also has a predicate based on the partitioned key (RELEASE_DATE) of the table:

SQL> select * FROM big_ziggy where release_date = '01-JUN-2017' and total_sales = 123456;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3599046327

----------------------------------------------------------------------------------------------------
| Id | Operation              | Name      | Rows | Bytes | Cost (%CPU) | Time     | Pstart | Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT       |           |    1 |    26 |    1051 (2) | 00:00:01 |        |       |
|  1 | PARTITION RANGE SINGLE |           |    1 |    26 |    1051 (2) | 00:00:01 |      3 |     3 |
|* 2 |  TABLE ACCESS FULL     | BIG_ZIGGY |    1 |    26 |    1051 (2) | 00:00:01 |      3 |     3 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage(("TOTAL_SALES"=123456 AND "RELEASE_DATE"=TO_DATE('2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
    filter(("TOTAL_SALES"=123456 AND "RELEASE_DATE"=TO_DATE('2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       5618 consistent gets
          0 physical reads
          0 redo size
        676 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

If we wait for the next AI task to kick in:

DBMS_AUTO_INDEX.REPORT_LAST_ACTIVITY()
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 11-MAY-2022 10:55:43
Activity end                : 11-MAY-2022 10:56:27
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 192.94 MB (192.94 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 6
SQL statements improved (improvement factor) : 3 (6670.1x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 2x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------
| Owner | Table     | Index                | Key                      | Type   | Properties |
---------------------------------------------------------------------------------------------
| BOWIE | BIG_ZIGGY | SYS_AI_6wv99zdbsy8ar | RELEASE_DATE,TOTAL_SALES | B-TREE | LOCAL      |
---------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

We can see that AI has indeed automatically created a LOCAL, partitioned index (on columns RELEASE_DATE, TOTAL_SALES) in this scenario, as we have an equality predicate based on the partitioned key (RELEASE_DATE).

Currently, all is well with the index, with all partitions in a USABLE state:

SQL> SELECT index_name, partitioned, auto, visibility, status FROM user_indexes WHERE table_name = 'BIG_ZIGGY';

INDEX_NAME                     PAR AUT VISIBILIT STATUS
------------------------------ --- --- --------- --------
SYS_AI_6wv99zdbsy8ar           YES YES VISIBLE   N/A

SQL> select index_name, partition_name, status from user_ind_partitions where index_name='SYS_AI_6wv99zdbsy8ar';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
SYS_AI_6wv99zdbsy8ar           ALBUMS_2015          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2016          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2017          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2018          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2019          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2020          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2021          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2022          USABLE

SQL> select index_name, column_name, column_position from user_ind_columns 
     where index_name='SYS_AI_6wv99zdbsy8ar';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_6wv99zdbsy8ar           RELEASE_DATE                  1
SYS_AI_6wv99zdbsy8ar           TOTAL_SALES                   2

 

But if we now do an offline reorg of a specific table partition:

SQL> alter table big_ziggy move partition albums_2017;

Table altered.

SQL> select index_name, partition_name, status from user_ind_partitions where index_name='SYS_AI_6wv99zdbsy8ar';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
SYS_AI_6wv99zdbsy8ar           ALBUMS_2015          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2016          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2017          UNUSABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2018          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2019          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2020          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2021          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2022          USABLE

 

We can see we’ve now made the associated Local Index partition UNUSABLE.

If we run the following query:

SQL> select * FROM big_ziggy where release_date = '01-JUN-2017' and total_sales = 123456;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3599046327

----------------------------------------------------------------------------------------------------
| Id | Operation              | Name      | Rows | Bytes | Cost (%CPU) | Time     | Pstart | Pstop |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT       |           |    1 |    26 |     986 (2) | 00:00:01 |        |       |
|  1 | PARTITION RANGE SINGLE |           |    1 |    26 |     986 (2) | 00:00:01 |      3 |     3 |
|* 2 |  TABLE ACCESS FULL     | BIG_ZIGGY |    1 |    26 |     986 (2) | 00:00:01 |      3 |     3 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage(("TOTAL_SALES"=123456 AND "RELEASE_DATE"=TO_DATE('2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))
    filter(("TOTAL_SALES"=123456 AND "RELEASE_DATE"=TO_DATE('2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')))

Statistics
----------------------------------------------------------
          3 recursive calls
          4 db block gets
       5578 consistent gets
       5571 physical reads
        924 redo size
        676 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

The CBO has no choice here but to do a full partition table scan.

If now wait again for the next AI task to strut its stuff:

SQL> select dbms_auto_index.report_last_activity() from dual;

DBMS_AUTO_INDEX.REPORT_LAST_ACTIVITY()
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 11-MAY-2022 11:42:42
Activity end                : 11-MAY-2022 11:43:13
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 192.94 MB (192.94 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 4
SQL statements improved (improvement factor) : 1 (5573x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 1.1x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-------------
| Owner | Table     | Index                | Key                      | Type   | Properties |
---------------------------------------------------------------------------------------------
| BOWIE | BIG_ZIGGY | SYS_AI_6wv99zdbsy8ar | RELEASE_DATE,TOTAL_SALES | B-TREE | LOCAL      |
---------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------


SQL> select index_name, partition_name, status from user_ind_partitions where index_name='SYS_AI_6wv99zdbsy8ar';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
SYS_AI_6wv99zdbsy8ar           ALBUMS_2015          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2016          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2017          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2018          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2019          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2020          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2021          USABLE
SYS_AI_6wv99zdbsy8ar           ALBUMS_2022          USABLE

The index partition is now automatically in a USABLE state again.

If we look at the index object data:

SQL> select object_name, subobject_name, to_char(created, 'dd-Mon-yy hh24:mi:ss') created, to_char(last_ddl_time, 'dd-Mon-yy hh24:mi:ss’)
last_ddl_time from dba_objects where object_name='SYS_AI_6wv99zdbsy8ar';

OBJECT_NAME                    SUBOBJECT_NAME       CREATED                     LAST_DDL_TIME
------------------------------ -------------------- --------------------------- ---------------------------
SYS_AI_6wv99zdbsy8ar           ALBUMS_2015          11-May-22 10:41:33          11-May-22 10:56:14
SYS_AI_6wv99zdbsy8ar           ALBUMS_2016          11-May-22 10:41:33          11-May-22 10:56:15
SYS_AI_6wv99zdbsy8ar           ALBUMS_2017          11-May-22 10:41:33          11-May-22 11:42:42
SYS_AI_6wv99zdbsy8ar           ALBUMS_2018          11-May-22 10:41:33          11-May-22 10:56:18
SYS_AI_6wv99zdbsy8ar           ALBUMS_2019          11-May-22 10:41:33          11-May-22 10:56:19
SYS_AI_6wv99zdbsy8ar           ALBUMS_2020          11-May-22 10:41:33          11-May-22 10:56:20
SYS_AI_6wv99zdbsy8ar           ALBUMS_2021          11-May-22 10:41:33          11-May-22 10:56:22
SYS_AI_6wv99zdbsy8ar           ALBUMS_2022          11-May-22 10:41:33          11-May-22 10:56:22
SYS_AI_6wv99zdbsy8ar                                11-May-22 10:41:33          11-May-22 11:43:13

 

We can see that just the impacted index partition has been rebuilt.

The CBO can now successfully use the index to avoid the full partition table scan:

SQL> select * FROM big_ziggy where release_date = '01-JUN-2017' and total_sales = 123456;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3640710173

-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                                  | Name                 | Rows | Bytes | Cost (%CPU)| Time     | Pstart | Pstop |
-----------------------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                           |                      |    1 |    26 |      4 (0) | 00:00:01 |        |       |
|  1 | PARTITION RANGE SINGLE                     |                      |    1 |    26 |      4 (0) | 00:00:01 |      3 |     3 |
|  2 |  TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_ZIGGY            |    1 |    26 |      4 (0) | 00:00:01 |      3 |     3 |
|* 3 |   INDEX RANGE SCAN                         | SYS_AI_6wv99zdbsy8ar |    1 |       |      3 (0) | 00:00:01 |      3 |     3 |
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("RELEASE_DATE"=TO_DATE(' 2017-06-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "TOTAL_SALES"=123456)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        676 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

I’ll leave it to the discernible reader to determine if this also works in the scenario where the partitioned index were to be global… 🙂

Automatic Indexes: Scenarios Where Automatic Indexes NOT Created Part II (“Ragazzo Solo, Ragazza Sola” April 27, 2022

Posted by Richard Foote in 19c, 21c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Constraints, Exadata, Foreign Keys, Full Table Scans, Index Internals, Oracle, Oracle 21c, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Performance Tuning.
1 comment so far

In my last post, I discussed how Automatic Indexing doesn’t create an automatic index in the scenario where the minimum or maximum of a column is required.

Another scenario when an automatic index is not created is when we hit issues associated with a missing index on a Foreign Key (FK) constraint.

As I’ve discussed many times previously, if you delete a parent record without an index on the dependant FK constraints, you hit a number of issues including having to perform a (potentially expensive and problematic) Full Table Scan (FTS) on the child tables and the associated locking problems.

To illustrate, I’ll first create a small parent table:

SQL> create table daddy (id number constraint daddy_pk primary key , name varchar2(42));

Table created.

SQL> insert into daddy select rownum, 'David Bowie '|| rownum from dual connect by level <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DADDY');

PL/SQL procedure successfully completed.

And then a somewhat larger child table, with no index on the associated foreign key constraint:

SQL> create table kiddy (id number constraint kiddy_pk primary key , code1 number constraint daddy_fk references daddy(id), code2 number, code3 number, name varchar2(42));

Table created.

SQL> insert into kiddy select rownum, mod(rownum,1000)+1000 , mod(rownum, 10000)+1, mod(rownum, 100000)+1, 'Ziggy Stardust '|| rownum from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'KIDDY');

PL/SQL procedure successfully completed.

 

If we delete a number of parent rows, for example:

SQL> delete from daddy where id = 101;

1 row deleted.

Execution Plan
----------------------------------------------------------
Plan hash value: 1477800718

-------------------------------------------------------------------------------
| Id | Operation         | Name     | Rows | Bytes | Cost (%CPU) |   Time     |
-------------------------------------------------------------------------------
|  0 | DELETE STATEMENT  |          |    1 |     4 |       1 (0) |   00:00:01 |
|  1 | DELETE            | DADDY    |      |       |             |            |
|* 2 | INDEX UNIQUE SCAN | DADDY_PK |    1 |     4 |       1 (0) |   00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID"=101)

Statistics
----------------------------------------------------------
         18 recursive calls
         13 db block gets
     117462 consistent gets
      22292 physical reads
    4645500 redo size
        204 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

We notice that even though we only delete one row from a relatively small table, we perform a large number of consistent gets (117462) due to the necessary FTS on the child table, as Oracle is forced to check the table for any possible FK violations. Without an index on the child CODE1 column, Oracle has no choice but to perform the relatively expensive FTS.

Additionally, if we have an existing transaction of a child table (in Session 1):

SQL> insert into kiddy values (10000001,1042,1042,1042,'Iggy Pop');

1 row created.

And then in another session attempt to delete a parent row (in Session 2):

SQL> delete from daddy where id = 112;

The delete hangs in a locked state due to the child transaction in Session 1. This can lead to further locking issues in other sessions (Session 3):

insert into kiddy values (10000002,1042,1042,1042,'Iggy Pop');

 

The FTS on the child table and these associated locks can all be avoided by having an index on the FK constraint, as the index can then be used to effectively police the constraint during such delete operations.

What does AI do in this scenario?

Currently, nothing.

I’ve been unable to ever get AI to create a usable automatic index in this scenario. In Oracle Database 19c, I’ve not been able to get an AI created at all. In Oracle Database 21c, the best I’ve seen has been a Unusable/Invisible AI:

SQL> select index_name, index_type, auto, constraint_index, visibility, status, num_rows from user_indexes where table_n
ame='KIDDY';

INDEX_NAME                     INDEX_TYPE                  AUT CON VISIBILIT STATUS     NUM_ROWS
------------------------------ --------------------------- --- --- --------- -------- ----------
KIDDY_PK                       NORMAL                      NO  YES VISIBLE   VALID      10000004
SYS_AI_31thttf8v6r35           NORMAL                      YES NO  INVISIBLE UNUSABLE   10000004

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='KIDDY';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
KIDDY_PK                       ID                            1
SYS_AI_31thttf8v6r35           CODE1                         1

So you may need to manually create such an index on the FK constraint to improve performance and eliminate these locking issues:

SQL> create index kiddy_code1_i on kiddy(code1);

Index created.

SQL> delete from daddy where id = 142;

1 row deleted.

Execution Plan
----------------------------------------------------------
Plan hash value: 1477800718

-------------------------------------------------------------------------------
| Id | Operation         | Name     | Rows | Bytes | Cost (%CPU) |   Time     |
-------------------------------------------------------------------------------
|  0 | DELETE STATEMENT  |          |    1 |     4 |       1 (0) |   00:00:01 |
|  1 | DELETE            | DADDY    |      |       |             |            |
|* 2 | INDEX UNIQUE SCAN | DADDY_PK |    1 |     4 |       1 (0) |   00:00:01 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID"=142)

Statistics
----------------------------------------------------------
          1 recursive calls
          8 db block gets
          2 consistent gets
          2 physical reads
        132 redo size
        204 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

Consistent gets have dropped off massively (down to just 8) as Oracle can now use the index to avoid the FTS search on the child table. The associated locking issues are eliminated as well.

Note: As always, this AI behaviour can always change in the future…

Automatic Indexes: Scenarios Where Automatic Indexes NOT Created Part I (“Always Crashing In The Same Car”) April 26, 2022

Posted by Richard Foote in 19c, 21c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, MAX, MIN, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
1 comment so far

As I’ve discussed previously, Oracle has increased the number of scenarios in which it will now create automatic indexes, such as with non-equality predicates and JSON expressions.

However, as of Oracle Database 21c, there are still a number of scenarios where an automatic index will NOT be created, even though an index might prove beneficial.

One such scenario is when the Min/Max of a column is required.

As I’ve discussed a number of times previously, Oracle can very efficiently use an index to determine either the Min or Max value of a column, by (hopefully) just visiting the first or last leaf block in an index. The INDEX FULL SCAN (MIN/MAX) execution plan path can be used explicitly for this purpose.

If I create a simple table as follows:

SQL> create table bowie_min (id number constraint bowie_min_pk primary key, code number, name varchar2(42));

Table created.

SQL> insert into bowie_min select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_MIN');

PL/SQL procedure successfully completed.

 

And then run the following queries a number of times that return the Min and Max of the CODE column:

SQL> select min(code) from bowie_min;

Execution Plan
----------------------------------------------------------
Plan hash value: 1068446691

----------------------------------------------------------------------------------------
| Id | Operation                 | Name      | Rows | Bytes | Cost (%CPU) | Time       |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |           |    1 |     5 |    6706 (2) | 00:00:01   |
|  1 | SORT AGGREGATE            |           |    1 |     5 |             |            |
|  2 | TABLE ACCESS STORAGE FULL | BOWIE_MIN |  10M |   47M |    6706 (2) | 00:00:01   |
----------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39430 consistent gets
      39421 physical reads
          0 redo size
        569 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

SQL> select max(code) from bowie_min;

Execution Plan
----------------------------------------------------------
Plan hash value: 1068446691

----------------------------------------------------------------------------------------
| Id | Operation                 | Name      | Rows | Bytes | Cost (%CPU) | Time       |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |           |    1 |     5 |    6706 (2) | 00:00:01   |
|  1 | SORT AGGREGATE            |           |    1 |     5 |             |            |
|  2 | TABLE ACCESS STORAGE FULL | BOWIE_MIN |  10M |   47M |    6706 (2) | 00:00:01   |
----------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39430 consistent gets
      39421 physical reads
          0 redo size
        569 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

 

Currently, the CBO has no choice but to use a Full Table Scan (FTS) as there is currently no index on the CODE column.

So what does Automatic Indexing (AI) make of things?

Nothing.

Currently, AI will not create an index in this scenario, no matter how many times I execute these queries.

If we look at the indexes on the table after a significant period of time after running these queries:

SQL> select index_name, auto from user_indexes where table_name='BOWIE_MIN';

INDEX_NAME   AUT
------------ ---
BOWIE_MIN_PK NO

No Automatic Indexes. To improve the performance of these queries, we currently have to manually create the associated index:

SQL> create index bowie_min_code_i on bowie_min(code);

Index created.

If we now re-run these queries and look at the execution plan:

SQL> select min(code) from bowie_min;

Execution Plan
----------------------------------------------------------
Plan hash value: 252811132

-----------------------------------------------------------------------------------------------
| Id | Operation                 | Name             | Rows | Bytes | Cost (%CPU) | Time       |
-----------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |                  |    1 |     5 |       3 (0) | 00:00:01   |
|  1 | SORT AGGREGATE            |                  |    1 |     5 |             |            |
|  2 | INDEX FULL SCAN (MIN/MAX) | BOWIE_MIN_CODE_I |    1 |     5 |       3 (0) | 00:00:01   |
-----------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        569 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

We can see that the CBO is now indeed using the index to return the Min/Max values with a vastly reduced number of consistent gets (down to just 3 from the previous 38538).

However, a key point here is that Automatic Indexes only works on an Exadata platform and Exadata has various smarts that potentially makes accessing data via a “FTS” in this manner much more efficient than in non-Exadata environments.

Oracle may well take the position that getting Min/Max data on a Exadata is potentially efficient enough and doesn’t on its own warrant the creation of an index.

More on this in future posts…

Oracle 19c Automatic Indexing: Non-Equality Predicates Part II (Let’s Spend The Night Together) January 21, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Non-Equality Predicates, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Performance Tuning.
1 comment so far

In my previous post in this series, I discussed out Automatic Indexing currently does not consider Non-Equality predicates. Automatic Indexing will index columns based only on Equality predicates.

So how does Oracle handle the scenario when an SQL has a mixture of both Equality and Non-Equality predicates?

I’ll begin by creating two very similar tables, but with the second table having a more selective CODE column:

SQL> create table pink_floyd (id number, code number, create_date date, name varchar2(42));

Table created.

SQL> insert into pink_floyd select rownum, ceil(dbms_random.value(0, 5000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon'
from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD');

PL/SQL procedure successfully completed.


SQL> create table pink_floyd1 (id number, code number, create_date date, name varchar2(42));

Table created.

SQL> insert into pink_floyd1 select rownum, ceil(dbms_random.value(0, 25000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon'
from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD1');

PL/SQL procedure successfully completed.

 

So table PINK_FLOYD has 5,000 distinct CODE values, whereas table PINK_FLOYD1 has 25,000 distinct CODE values.

I’ll next run the following identical SQLs, which both use an Equality predicate on the CODE column and a Non-Equality predicate on the CREATE_DATE column. The CODE column provides some filtering (more so with the PINK_FLOYD1 table) but in combination with the CREATE_DATE column, results in the ultimate filtering with no rows returned:

 

SQL> select * from pink_floyd where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1152280033

----------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            |    1 |    40 |    844 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD |    1 |    40 |    844 (11) | 00:00:01  |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)
     filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      63660 consistent gets
      63649 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        380 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed


SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 564520720

-----------------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             |    1 |    41 |    856 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD1 |    1 |    41 |    856 (11) | 00:00:01  |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00','syyyy-mm-dd hh24:mi:ss'))
     filter("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00','syyyy-mm-dd hh24:mi:ss'))

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      64424 consistent gets
      64413 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

So how does Automatic Indexing handle this scenario. If we look at the subsequent Automatic Indexing report (highlights only):

 

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
-----------------------------------------------------------------------------
| Owner | Table       | Index                | Key  | Type   | Properties   |
-----------------------------------------------------------------------------
| BOWIE | PINK_FLOYD1 | SYS_AI_96snkmu4sk44g | CODE | B-TREE | NONE         |
-----------------------------------------------------------------------------
-------------------------------------------------------------------------------


-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 7wag3gbk0b3tm
SQL Text            : select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021'
Improvement Factor  : 64442.3x

Execution Statistics:
-----------------------------
                      Original Plan                Auto Index Plan
                      ---------------------------- ----------------------------
Elapsed Time (s):     568513                       2771
CPU Time (s):         275534                       1874
Buffer Gets:          1031078                      406
Optimizer Cost:       856                          405
Disk Reads:           1030609                      3
Direct Writes:        0                            0
Rows Processed:       0                            0
Executions:           16                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 564520720

-----------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost | Time       |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             |      |       |  856 |            |
|  1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD1 |    1 |    41 |  856 | 00:00:01   |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 2703636439

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |    1 |    41 |  405 | 00:00:01   |
| * 1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1          |    1 |    41 |  405 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_96snkmu4sk44g |  403 |       |    3 | 00:00:01   |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
* 2 - access("CODE"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

If we look at the definitions of all indexes currently on these tables:

SQL> select index_name, auto, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='PINK_FLOYD';

INDEX_NAME                     AUT VISIBILIT COMPRESSION   STATUS   NUM_ROWS   LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_dp2t0j12zux49           YES INVISIBLE ADVANCED LOW  UNUSABLE   10000000       21702           4161898

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='PINK_FLOYD';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_dp2t0j12zux49           CODE                          1


SQL> select index_name, auto, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='PINK_FLOYD1';

INDEX_NAME                     AUT VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_96snkmu4sk44g           YES VISIBLE   ADVANCED LOW  VALID      10000000       15400           9969473

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='PINK_FLOYD1';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_96snkmu4sk44g           CODE                          1

 

In both cases, Automatic Indexing only created an index on the CODE column, as it was the only column with an Equality predicate.

However, the Automatic Index on the table PINK_FLOYD remained in an INVISIBLE/UNUSABLE. That’s because an index on only the CODE column was not efficient enough to improve the performance of the SQL, due to the filtering not being sufficient enough and because of the relatively poor Clustering Factor.

The index on the table PINK_FLOYD1 was eventually created as a VISIBLE/VALID index, as its better filtering was sufficient to actually improve the performance of the SQL.

So if we re-run the first query:

SQL> select * from pink_floyd where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1152280033

----------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            |    1 |    40 |    844 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD |    1 |    40 |    844 (11) | 00:00:01  |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)
     filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      63660 consistent gets
      63649 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        380 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

It continues to use a Full Table Scan.

If we re-run the second query:

 

SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 2703636439

------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                 | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                      |    1 |    41 |     415 (0) | 00:00:01  |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1          |    1 |    41 |     415 (0) | 00:00:01  |
|* 2 | INDEX RANGE SCAN                    | SYS_AI_96snkmu4sk44g |  412 |       |       3 (0) | 00:00:01  |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
2 - access("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
        406 consistent gets
          0 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

If now uses the newly created Automatic Index, with an improved 406 Consistent Gets (down from the previous 64424 Consistent Gets with the FTS).

BUT if we were to manually create an index on BOTH CODE and CREATE_DATE columns:

SQL> create index pink_floyd1_code_create_date_i on pink_floyd1(code, create_date) compress advanced low;

Index created.

SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3366491378

----------------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                           | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                                |    1 |    41 |       4 (0) | 00:00:01  |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1                    |    1 |    41 |       4 (0) | 00:00:01  |
|* 2 | INDEX RANGE SCAN                    | PINK_FLOYD1_CODE_CREATE_DATE_I |    1 |       |       3 (0) | 00:00:01  |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
"CREATE_DATE" IS NOT NULL)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

Performance improves significantly further, by reducing Consistent Gets down to just 3.

So if you have SQL statements with a mixture of both Equality and Non-Equality predicates, you may encounter these 2 scenarios:

A potentially efficient index that is not created at all as the filtering on just the Equality based predicates are not sufficient to create a viable index, or

A potentially suboptimal Automatic Index that doesn’t contain useful filtering columns because they’re used in Non-Equality predicates…

Oracle 19c Automatic Indexing: Non-Equality Predicates Part I (Lucy Can’t Dance) January 14, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, Exadata, Full Table Scans, Non-Equality Predicates, Oracle, Oracle Blog, Oracle Cloud, Oracle Indexes, Oracle19c.
6 comments

 

I’ve been waiting a while before posting a series on the various limitations associated with Automatic Indexing, in order to see how the feature matures over time.

The following have all been re-tested post 1 January 2021 on the Autonomous ATP Database Cloud service, using Oracle Database version 19.5.0.0.0.

In the Oracle Documentation (including version 21c), the only limitations with regard Automatic Indexing listed are the following:

  • Auto indexes are local B-tree indexes.
  • Auto indexes can be created for partitioned as well as non-partitioned tables.
  • Auto indexes cannot be created for temporary tables.

Well, as I discussed in the previous series on Automatic Indexing on Partitioned tables, we already saw how Oracle can actually also create Non-Partitioned (Global) indexes. So the limitation on Automatic Indexes being “local” indexes is not actually correct, even with 19c.

But are there other limitations that are not officially documented?

If you look at every example I’ve used previously with regard Automatic Indexing, they all feature Equality predicates. In the following examples, I’m going to run a series on Range Scan predicates that heavily filter and would benefit greatly from an index.

I first create a simple table with 10M rows:

SQL> create table ziggy1 (id number, code number, name varchar2(42));

Table created.

SQL> insert into ziggy1 select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY1');

PL/SQL procedure successfully completed.

 

I then run the following range scan queries several times that each return only a few rows:

SQL> select * from ziggy1 where id between 42 and 50;

9 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4062853157

------------------------------------------------------------------------------------
| Id | Operation                 | Name   | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |        |    8 |   184 |    538 (14) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | ZIGGY1 |    8 |   184 |    538 (14) | 00:00:01  |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ID"<=50 AND "ID">=42)
    filter("ID"<=50 AND "ID">=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39436 consistent gets
      39425 physical reads
          0 redo size
        596 bytes sent via SQL*Net to client
        369 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          9 rows processed


SQL> select * from ziggy1 where id < 0;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 4062853157

------------------------------------------------------------------------------------
| Id | Operation                 | Name   | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |        |    1 |    23 |    538 (14) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | ZIGGY1 |    1 |    23 |    538 (14) | 00:00:01  |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ID"<0)
    filter("ID"<0)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39436 consistent gets
      39425 physical reads
          0 redo size
        364 bytes sent via SQL*Net to client
        344 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

SQL> select * from ziggy1 where id > 100000000000;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 4062853157

------------------------------------------------------------------------------------
| Id | Operation                 | Name   | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |        |    1 |    23 |    538 (14) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | ZIGGY1 |    1 |    23 |    538 (14) | 00:00:01  |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ID">100000000000)
    filter("ID">100000000000)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39436 consistent gets
      39425 physical reads
          0 redo size
        364 bytes sent via SQL*Net to client
        355 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

If we look at the subsequent Automatic Indexing report:

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 13-JAN-2021 11:55:37
Activity end                : 13-JAN-2021 11:56:20
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates           : 0
Indexes created            : 0
Space used                 : 0 B
Indexes dropped            : 0
SQL statements verified    : 3
SQL statements improved    : 0
SQL plan baselines created : 0
Overall improvement factor : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

We notice NO Automatic Indexes were created.

We can run these queries endlessly and Automatic Indexing will never create associated Automatic Indexes:

SQL> select index_name, auto, constraint_index, visibility from user_indexes where table_name='ZIGGY1';

no rows selected

These queries are doomed to perform Full Table Scans unless indexes are manually created:

SQL> select * from ziggy1 where id between 42 and 50;

9 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4062853157

------------------------------------------------------------------------------------
| Id | Operation                 | Name   | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |        |    8 |   184 |    538 (14) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | ZIGGY1 |    8 |   184 |    538 (14) | 00:00:01  |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ID"<=50 AND "ID">=42)
    filter("ID"<=50 AND "ID">=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39436 consistent gets
      39425 physical reads
          0 redo size
        596 bytes sent via SQL*Net to client
        369 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          9 rows processed

 

Currently Automatic Indexes do not support Non-Equality predicates. Automatic Indexes are only created based on Equality-based predicates.

Obviously, Automatic Indexing is a fabulous feature and this might all change in the future. But with Non-Equality predicates so prevalent in SQL, it’s vital to note this current limitation when using and relying on Automatic Indexing…

Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part III (Do Anything You Say) October 8, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Performance Tuning, Stale Statistics.
2 comments

In Part I of this series, we saw how Automatic Indexing will not create a viable Automatic Index if there are stale or missing statistics on the underlining segments. In Part II we saw how these SQL statements effectively become blacklisted and when segment statistics are subsequently collected, Automatic Indexing will still not create viable Automatic Indexes when the SQL statements are re-run.

So how do we get Automatic Indexing to now kick in and create necessary indexes on these problematic SQLs?

As I’ve discussed previously in relation to blacklisted SQLs, we need to run a NEW SQL statement that hasn’t been blacklist that will result in a necessary index to be created. An easy way to do this is just to include a new comment within the previous SQL to give the SQL a new signature.

If we now run the following “new” SQL statement (identical to the problematic SQL but with a comment embedded):

SQL> select /* new */ * from bowie_stale where code=42;

        ID       CODE NAME
---------- ---------- ------------------------------------------
   1000041         42 David Bowie
   6000041         42 David Bowie
        41         42 David Bowie
   3000041         42 David Bowie
   7000041         42 David Bowie
   8000041         42 David Bowie
   4000041         42 David Bowie
   9000041         42 David Bowie
   5000041         42 David Bowie
   2000041         42 David Bowie

 

If we now wait to see what the next Automatic Indexing task makes of things:

 

SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 07-JUL-2020 06:34:49
Activity end                : 07-JUL-2020 06:35:54
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 142.61 MB (142.61 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 1
SQL statements improved (improvement factor) : 1 (19787.7x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 19787.7x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------
| Owner | Table       | Index                | Key  | Type   | Properties |
---------------------------------------------------------------------------
| BOWIE | BOWIE_STALE | SYS_AI_300kk2unp8tr0 | CODE | B-TREE | NONE       |
---------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

We see that the index on the CODE column (SYS_AI_300kk2unp8tr0) has now been created.

Further down the report:

 

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : du6psd0xmzpg5
SQL Text            : select /* new */ * from bowie_stale where code=42
Improvement Factor  : 19787.7x

Execution Statistics:
-----------------------------
                  Original Plan Auto           Index Plan
                  ---------------------------- ----------------------------
Elapsed Time (s): 137261                       2620
CPU Time (s):     84621                        1769
Buffer Gets:      277028                       13
Optimizer Cost:   544                          13
Disk Reads:       275947                       2
Direct Writes:    0                            0
Rows Processed:   70                           10
Executions:       7                            1

 

A new index was indeed created because of this new SQL statement, with a performance improvement of 19787.7x.

Further down the report to the Plans Section:

 

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 65903426

-----------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost | Time        |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |             |      |       |  544 |             |
| 1 | TABLE ACCESS STORAGE FULL | BOWIE_STALE |   10 |   230 |  544 | 00:00:01    |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 2558864466

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   10 |   230 |   13 | 00:00:01   |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_STALE          |   10 |   230 |   13 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_300kk2unp8tr0 |   10 |       |    3 | 00:00:01   |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We can see that the new plan using the new Automatic Index with a much lower CBO cost.

If we now look at the status of this index:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_STALE';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_STALE_PK                 NO  YES VISIBLE   DISABLED      VALID      10000000       20164             59110
SYS_AI_300kk2unp8tr0           YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       16891          10000000

 

We see that the index is now both VISIBLE and VALID (previously, it was INVISIBLE and UNUSABLE).

As such, the Automatic Index can now potentially be used by any SQL, including the previous problematic query.

So with a viable index now in place, if we re-run the initial problematic query:

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2558864466

------------------------------------------------------------------------------------------------------------
| Id | Operation                          | Name                 | Rows | Bytes | Cost (%CPU)| Time        |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                   |                      |   10 |   230 |      14 (0)| 00:00:01    |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_STALE          |   10 |   230 |      14 (0)| 00:00:01    |
|* 2 | INDEX RANGE SCAN                   | SYS_AI_300kk2unp8tr0 |   10 |       |       3 (0)| 00:00:01    |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         14 consistent gets
          0 physical reads
          0 redo size
        738 bytes sent via SQL*Net to client
        361 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

We see that finally, the SQL uses the new Automatic Index and is indeed much more efficient as a result, with just 14 consistent gets required (when previously it was 39430 consistent gets).

So if ever you come across the scenario where an SQL does not have an Automatic Index created when clearly it should, it could be that it has been blacklisted and needs a different SQL to actually generate the necessary index.

To avoid some of these issues, make sure you do not have stale or missing statistics when reliant on Automatic Indexing. The new High Frequency Statistics Collection capability to designed to specifically avoid such a scenario.

Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part II (Survive) October 7, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Internals, Index statistics, Oracle, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c, Performance Tuning, Stale Statistics.
1 comment so far

 

 

In my previous post, I discussed how having stale statistics, usually a bad idea, is especially problematic with regard Automatic Indexes as it usually results in viable automatic indexes only being created in an UNUSABLE/INVISIBLE state.

If we were to now to collect the missing statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_STALE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, blocks, last_analyzed from user_tables
where table_name='BOWIE_STALE';

TABLE_NAME             NUM_ROWS     BLOCKS LAST_ANAL
-------------------- ---------- ---------- ---------
BOWIE_STALE            10000000      39677 06-JUL-20

SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols
where table_name='BOWIE_STALE';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       LAST_ANAL
-------------------- ------------ ---------- --------------- ---------
ID                       10000000          0 HYBRID          06-JUL-20
CODE                       971092    .000001 HYBRID          06-JUL-20
NAME                            1 4.9416E-08 FREQUENCY       06-JUL-20

 

If we now repeatedly re-run the problematic query many times:

 

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 65903426

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)|  Time       |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |   10 |   230 |    544 (14)|  00:00:01   |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |   10 |   230 |    544 (14)|  00:00:01   |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39430 consistent gets
      39421 physical reads
          0 redo size
        610 bytes sent via SQL*Net to client
        361 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

 

The CBO is forced to use the FTS as the current Automatic Index is in an UNUSABLE/INVISIBLE state.

If we wait for the next Automatic Indexing reporting period:

 

SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 06-JUL-2020 05:12:42
Activity end                : 06-JUL-2020 05:13:34
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates           : 0
Indexes created            : 0
Space used                 : 0 B
Indexes dropped            : 0
SQL statements verified    : 0
SQL statements improved    : 0
SQL plan baselines created : 0
Overall improvement factor : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

 

We notice that the Automatic Indexing process has nothing to report. Even though the problematic query is repeatedly executed, the SQL is now effectively on a blacklist and is not re-considered by the Automatic Indexing process.

If we look at the index details on the table:

 

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_STALE';

INDEX_NAME             AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_STALE_PK         NO  YES VISIBLE   DISABLED      VALID      10000000       20164             59110
SYS_AI_300kk2unp8tr0   YES NO  INVISIBLE ADVANCED LOW  UNUSABLE   10000000       23058           4147514 

 

So the Automatic Index (SYS_AI_300kk2unp8tr0) is still UNUSABLE and INVISIBLE and can not be used by the CBO.

NOTE: In earlier patches of Oracle Database 19c (I’m using version 19.5.0.0.0 in this demo), I identified some scenarios after stale statistics when indexes were created in but in a VALID/INVISIBLE state, such that they could still not be used by the CBO in general database sessions.

If we simply re-run the same queries again from the time when the dependant object statistics were stale, any SQL is just ignored by the Automatic Indexing process.

As such, if we now subsequently re-run the problematic query again:

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 65903426

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)| Time        |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |   10 |   230 |    544 (14)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |   10 |   230 |    544 (14)| 00:00:01    |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39430 consistent gets
      39421 physical reads
          0 redo size
        610 bytes sent via SQL*Net to client
        361 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

Again, the CBO has no choice here with no viable VALID/VISIBLE index present but to perform a FTS, even though its getting the cardinality estimates spot on since statistics gathering.

 

In Part III I’ll discuss how to get this query to finally use the Automatic Index and improve its performance, although if you’re a regular reader of the blog you should already know the solution…

Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part I (Dead Against It) October 6, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Exadata X8, Full Table Scans, High Frequency Statistics Collection, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning, Stale Statistics, Unusable Indexes.
5 comments

A “golden rule” when working with Automatic Indexing is that things don’t work properly if there are stale statistics on the dependant objects. Stale statistics can of course be problematic but they can be particularly troublesome when dealing with Automatic Indexing.

In the Oracle Autonomous Database environments, this issue is addressed somewhat by the new High Frequency Statistics Collection capability, which helps to automatically collect stale statistics on a regular basis. However, in on-prem Exadata environments where this can more easily be turned off or collected less frequently, it’s a potential issue worth consideration.

I’ll start with a simple little table, with a CODE column that has lots of distinct values:

SQL> create table bowie_stale (id number constraint bowie_stale_pk primary key, code number, name varchar2(42));

Table created.

SQL> insert into bowie_stale select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

Importantly, I don’t collect statistics on this newly populated table…

SQL> select table_name, num_rows, blocks, last_analyzed from user_tables
where table_name='BOWIE_STALE';

TABLE_NAME        NUM_ROWS     BLOCKS LAST_ANAL
--------------- ---------- ---------- ---------
BOWIE_STALE

SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols
where table_name='BOWIE_STALE';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       LAST_ANAL
-------------------- ------------ ---------- --------------- ---------
ID                                           NONE
CODE                                         NONE
NAME                                         NONE

If we now run the following query a number of times while there are no statistics on the table:

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)| Time        |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |  437 | 21413 |    553 (16)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |  437 | 21413 |    553 (16)| 00:00:01    |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          6  recursive calls
          0  db block gets
      39026  consistent gets
          0  physical reads
          0  redo size
        867  bytes sent via SQL*Net to client
        588  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

 

The CBO has no choice but to use a FTS as I don’t yet have an index on the CODE column.

If I now wait for the next Automatic Indexing task to kick in AND if there are still NO statistics on the table:

 

SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual;

REPORT

--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 05-JUL-2020 06:36:31
Activity end                 : 05-JUL-2020 06:37:07
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates            : 1
Indexes created             : 0
Space used                  : 0 B
Indexes dropped             : 0
SQL statements verified     : 0
SQL statements improved     : 0
SQL plan baselines created  : 0
Overall improvement factor  : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

 

You can see that there was the one index candidate BUT no Automatic Index appears to have been created.

Assuming there are still no statistics:

 

SQL> select table_name, num_rows, blocks, last_analyzed from user_tables where table_name='BOWIE_STALE';

TABLE_NAME                       NUM_ROWS    BLOCKS  LAST_ANAL
------------------------------ ---------- ---------- ---------
BOWIE_STALE

SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols
where table_name='BOWIE_STALE2';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       LAST_ANAL
-------------------- ------------ ---------- --------------- ---------
ID                                           NONE
CODE                                         NONE
NAME                                         NONE

 

If we look now at what indexes exist on the table:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_STALE';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_STALE_PK                 NO  YES VISIBLE   DISABLED      VALID
SYS_AI_300kk2unp8tr0           YES NO  INVISIBLE DISABLED      UNUSABLE          0           0                 0

SQL> select index_name, column_name, column_position from user_ind_columns
where table_name='BOWIE_STALE2' order by index_name, column_position;

INDEX_NAME                     COLUMN_NAME          COLUMN_POSITION
------------------------------ -------------------- ---------------
BOWIE_STALE_PK                 ID                                 1
SYS_AI_300kk2unp8tr0           CODE                               1

 

We notice there is now an Automatic Index BUT it remains in an UNUSABLE/INVISIBLE state. This means the index can’t be used by the CBO.

So if we now re-run the SQL query again:

 

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)| Time        |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |  437 | 21413 |    553 (16)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |  437 | 21413 |    553 (16)| 00:00:01    |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          6  recursive calls
          0  db block gets
      39026  consistent gets
          0  physical reads
          0  redo size
        867  bytes sent via SQL*Net to client
        588  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

 

The CBO has no choice still but to use the FTS.

In Part II, we’ll see that once we get into this scenario, it can be a tad problematic to get ourselves out of it and get the Automatic Index created as we would like…

Oracle 19c Automatic Indexing: Data Skew Fixed By Baselines Part II (Sound And Vision) September 28, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, Baselines, CBO, Data Skew, Exadata, Explain Plan For Index, Full Table Scans, Histograms, Index Access Path, Index statistics, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c, Performance Tuning.
1 comment so far

 

In my previous post, I discussed how the Automatic Indexing task by using Dynamic Sampling Level=11 can correctly determine the correct query cardinality estimates and assume the CBO will likewise determine the correct cardinality estimate and NOT use an index if it would cause performance to regress.

However, if other database sessions DON’T use Dynamic Sampling at the same Level=11 and hence NOT determine correct cardinality estimates, newly created Automatic Indexes might get used by the CBO inappropriately and result inefficient execution plans.

Likewise, with incorrect CBO cardinality estimates, it might also be possible for newly created Automatic Indexes to NOT be used when they should be (as I’ve discussed previously).

These are potential issues if the Dynamic Sampling value differs between the Automatic Indexing task and other database sessions.

One potential way to make things more consistent and see how the Automatic Indexing behaves if it detects an execution plan where the CBO would use an Automatic Index that causes performance regression, is to disable Dynamic Sampling within the Automatic Indexing task.

This can be easily achieved by using the following hint which effectively disables Dynamic Sampling with the previous problematic query:

SQL> select /*+ dynamic_sampling(0) */ * from space_oddity where code in (190000, 170000, 150000, 130000, 110000, 90000, 70000, 50000, 30000, 10000);

1000011 rows selected.

Execution Plan
----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |  1005K|   135M| 11411   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| SPACE_ODDITY |  1005K|   135M| 11411   (1)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR
           "CODE"=70000 OR "CODE"=90000 OR "CODE"=110000 OR "CODE"=130000 OR
           "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      41169  consistent gets
          0  physical reads
          0  redo size
   13535504  bytes sent via SQL*Net to client
       2705  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000011  rows processed

 

The query currently has good cardinality estimates (1005K vs 1000011 rows returned) only because we currently have histograms in place for the CODE column. As such, the query correctly uses a FTS.

However, if we now remove the histogram on the CODE column:

SQL> exec dbms_stats.gather_table_stats(null, 'SPACE_ODDITY', method_opt=> 'FOR ALL COLUMNS SIZE 1’);

PL/SQL procedure successfully completed.

 

There is no way for the CBO to now determine the correct cardinality estimate because of the skewed data and missing histograms.

So what does the Automatic Indexing tasks make of things now. If we look at the next activity report:

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 18-AUG-2020 16:42:33
Activity end                 : 18-AUG-2020 16:43:06
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created                              : 0
Space used                                   : 0 B
Indexes dropped                              : 0
SQL statements verified                      : 1
SQL statements improved                      : 0
SQL plan baselines created (SQL statements)  : 1 (1)
Overall improvement factor                   : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

We can see that it has verified this one new statement and has created 1 new SQL Plan Baseline as a result.

If we look at the Verification Details part of this report:

 

VERIFICATION DETAILS
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
The following SQL plan baselines were created:
-------------------------------------------------------------------------------
Parsing Schema Name     : BOWIE
SQL ID                  : 3yz8unzhhvnuz
SQL Text                : select /*+ dynamic_sampling(0) */ * from
space_oddity where code in (190000, 170000, 150000,
130000, 110000, 90000, 70000, 50000, 30000, 10000)
SQL Signature           : 3910785437403172730
SQL Handle              : SQL_3645e6a2952fcf7a
SQL Plan Baselines (1)  : SQL_PLAN_3cjg6naakzmvu198c05b9

We can see Automatic Indexing has created a new SQL Plan Baseline for our query with Dynamic Sampling set to 0 thanks to the hint.

Basically, the Automatic Indexing task has found a new query and determined the CBO would be inclined to use the index, because it now incorrectly assumes few rows are to be returned. It makes the poor cardinality estimate because there are currently no histograms in place AND because it can’t now use Dynamic Sampling to get a more accurate picture of things on the fly because it has been disabled with the dynamic_sampling(0) hint.

Using an Automatic Index over the current FTS plan would make the performance of the SQL regress.

Therefore, to protect the current FTS plan, Automatic Indexing has created a SQL Plan Baseline that effectively forces the CBO to use the current, more efficient FTS plan.

This can be confirmed by looking at the DBA_AUTO_INDEX_VERIFICATIONS view:

 

SQL> select execution_name, original_buffer_gets, auto_index_buffer_gets, status
from dba_auto_index_verifications where sql_id = '3yz8unzhhvnuz';

EXECUTION_NAME             ORIGINAL_BUFFER_GETS AUTO_INDEX_BUFFER_GETS STATUS
-------------------------- -------------------- ---------------------- ---------
SYS_AI_2020-08-18/16:42:33                41169                 410291 REGRESSED

 

If we now re-run the SQL again (noting we still don’t have histograms on the CODE column):

SQL> select /*+ dynamic_sampling(0) */ * from space_oddity where code in (190000, 170000, 150000, 130000, 110000, 90000, 70000, 50000, 30000, 10000);

1000011 rows selected.

Execution Plan
----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |    32 |  4512 | 11425   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| SPACE_ODDITY |    32 |  4512 | 11425   (2)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR
           "CODE"=70000 OR "CODE"=90000 OR "CODE"=110000 OR "CODE"=130000 OR
           "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000)

Hint Report (identified by operation id / Query Block Name / Object Alias):

Total hints for statement: 1 (U - Unused (1))
---------------------------------------------------------------------------
1 -  SEL$1
U -  dynamic_sampling(0) / rejected by IGNORE_OPTIM_EMBEDDED_HINTS

Note
-----

- SQL plan baseline "SQL_PLAN_3cjg6naakzmvu198c05b9" used for this statement

Statistics
----------------------------------------------------------
          9  recursive calls
          4  db block gets
      41170  consistent gets
          0  physical reads
          0  redo size
   13535504  bytes sent via SQL*Net to client
       2705  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000011  rows processed

 

We can see the CBO is forced to use the SQL Plan Baseline “SQL_PLAN_3cjg6naakzmvu198c05b9” as created by the Automatic Indexing task to ensure the more efficient FTS is used and not the available Automatic Index.

So Automatic Indexing CAN create SQL PLan Baselines to protect SQL from performance regressions caused by inappropriate use of Automatic Indexes BUT it’s really hard and difficult for it to do this effectively if the Automatic Indexing tasks and other database sessions have differing Dynamic Sampling settings as it does by default…

Oracle 19c Automatic Indexing: Data Skew Fixed By Baselines Part I (The Prettiest Star)) September 25, 2020

Posted by Richard Foote in 19c, 19c New Features, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, Baselines, CBO, Data Skew, Exadata, Full Table Scans, Histograms, Index Access Path, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c, Performance Tuning.
1 comment so far

In my previous few blog posts, I’ve been discussing some issues in relation to how Automatic Indexes handle SQL statements that accesses skewed data. In this post, I’m going to setup the scenario in which Automatic Indexing can potentially use Baselines to help address some of these issues. BUT, as we’ll see, I’m having to manufacture things somewhat to make this work due to the problem of the Automatic Indexing task using Dynamic Sampling of level 11, whereas most usual database sessions do not.

To set things up, I’m going recap what I’ve previously discussed (but with a slight difference), by creating a table that has significant data skew on the CODE column, with most values very uncommon, but with a handful of values being very common:

SQL> create table space_oddity (id number constraint space_oddity_pk primary key, code number, name varchar2(142));

Table created.

SQL> begin
2     for i in 1..2000000 loop
3       if mod(i,2) = 0 then
4          insert into space_oddity values(i, ceil(dbms_random.value(0,1000000)), 'David Bowie is really Ziggy Stardust and his band are called The Spiders From Mars. Then came Aladdin Sane and the rest is history');
5       else
6          insert into space_oddity values(i, mod(i,20)*10000, 'Ziggy Stardust is really David Bowie and his band are called The Spiders From Mars. Then came Aladdin Sane and the rest is history.');
7       end if;
8     end loop;
9     commit;
10  end;
11  /

PL/SQL procedure successfully completed.

 

So most CODE values will only occur a few times if at all, but a few values divisible by 10000 have many many occurrences within the table.

Importantly, we will initially collect statistics with NO histograms on the CODE column, which is the default behaviour anyways if no SQL has previous run with predicates on the column:

SQL> exec dbms_stats.gather_table_stats(null, 'SPACE_ODDITY', method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

 

If we run a query based on a rare value for CODE:

SQL> set arraysize 5000

SQL> select * from space_oddity where code=25;

Execution Plan
----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |     3 |   423 | 11356   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| SPACE_ODDITY |     3 |   423 | 11356   (1)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CODE"=25)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      40974  consistent gets
          0  physical reads
          0  redo size
       1018  bytes sent via SQL*Net to client
        402  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          2  rows processed

 

Without an index, the CBO has no choice at this point but to perform a FTS. BUT note that the 2 rows returned is very similar to the 3 estimated rows, which would make an index likely the way to go if such an index existed.

However, the following SQL accesses many of the common values of CODE and returns many rows:

SQL> select * from space_oddity where code in (10000, 30000, 50000, 70000, 90000, 110000, 130000, 150000, 170000, 190000);

1000011 rows selected.

Execution Plan
----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |    32 |  4512 | 11425   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| SPACE_ODDITY |    32 |  4512 | 11425   (2)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR
           "CODE"=70000 OR "CODE"=90000 OR "CODE"=110000 OR "CODE"=130000 OR
           "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      41169  consistent gets
          0  physical reads
          0  redo size
   13535504  bytes sent via SQL*Net to client
       2678  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000011  rows processed

 

Again, without an index in place, the CBO has no choice but to perform a FTS but this is almost certainly the way to go regardless. BUT without a histogram on the CODE column, the CBO has got the cardinality estimate way way off and thinks only 32 rows are to be returned and not the actual 1000011 rows.

So what does Automatic Indexing make of things. Let’s wait and have a look at the next Automatic Indexing Report:

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 18-AUG-2020 15:57:14
Activity end                 : 18-AUG-2020 15:58:10
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                              : 1
Indexes created (visible / invisible)         : 1 (1 / 0)
Space used (visible / invisible)              : 35.65 MB (35.65 MB / 0 B)
Indexes dropped                               : 0
SQL statements verified                       : 1
SQL statements improved (improvement factor)  : 1 (40984.3x)
SQL plan baselines created                    : 0
Overall improvement factor                    : 40984.3x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
----------------------------------------------------------------------------
| Owner | Table        | Index                | Key  | Type   | Properties |
----------------------------------------------------------------------------
| BOWIE | SPACE_ODDITY | SYS_AI_82bdnqs7q8rtm | CODE | B-TREE | NONE       |
----------------------------------------------------------------------------

 

So Automatic Indexing has indeed created the index (SYS_AI_82bdnqs7q8rtm) on the CODE column BUT this is based on only the one SQL statement:

 

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name  : BOWIE
SQL ID               : 19sv1g6tt0g1y
SQL Text             : select * from space_oddity where code=25
Improvement Factor   : 40984.3x

Execution Statistics:
-----------------------------

                   Original Plan                 Auto Index Plan
                   ----------------------------  ----------------------------
Elapsed Time (s):  5417408                       139265
CPU Time (s):      1771880                       7797
Buffer Gets:       327876                        5
Optimizer Cost:    11356                         5
Disk Reads:        649                           2
Direct Writes:     0                             0
Rows Processed:    16                            2
Executions:        8                             1

 

The Automatic Indexing task has correctly identified a significant improvement of 40984.3x when using an index on the SQL statement that returned just the 2 rows. The other SQL statement that returns many rows IS NOT MENTIONED.

This is because the Automatic Indexing tasks uses Dynamic Sampling Level=11, meaning it determines the more accurate cardinality estimate on the fly and correctly identifies that a vast number of rows are going to be returned. As a result, it correctly determines that the new Automatic Indexing if used would be detrimental to performance and would not be used by the CBO.

BUT most importantly, it also makes the assumption that the CBO would automatically likewise make this same decision to NOT use any such index in other database sessions and so there’s nothing to protect.

BUT this assumption is incorrect IF other database sessions don’t likewise use Dynamic Sampling with Level=11.

BUT by default, including in Oracle’s Autonomous Database Transaction Processing Cloud environment, the Dynamic Sampling Level is NOT set to 11, but the 2.

Therefore, most database sessions will not be able to determine the correct cardinality estimate on the fly and so will incorrectly assume the number of returned rows is much less than in reality and potentially use any such new Automatic Index inappropriately…

So if we look at the Plans Section of the Automatic Indexing report:

 

PLANS SECTION

---------------------------------------------------------------------------------------------
- Original
-----------------------------

Plan Hash Value  : 2301175572
-----------------------------------------------------------------------------
| Id | Operation           | Name         | Rows | Bytes | Cost  | Time     |
-----------------------------------------------------------------------------
|  0 | SELECT STATEMENT    |              |      |       | 11356 |          |
|  1 |   TABLE ACCESS FULL | SPACE_ODDITY |    3 |   423 | 11356 | 00:00:01 |
-----------------------------------------------------------------------------

- With Auto Indexes

-----------------------------
Plan Hash Value  : 54782313
-------------------------------------------------------------------------------------------------------
| Id  | Operation                             | Name                 | Rows | Bytes | Cost | Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                      |                      |    3 |   423 |    5 | 00:00:01 |
|   1 |   TABLE ACCESS BY INDEX ROWID BATCHED | SPACE_ODDITY         |    3 |   423 |    5 | 00:00:01 |
| * 2 |    INDEX RANGE SCAN                   | SYS_AI_82bdnqs7q8rtm |    2 |       |    3 | 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------

* 2 - access("CODE"=25)

Notes
-----

- Dynamic sampling used for this statement ( level = 11 )

 

The new plan for the SQL returning 2 rows when using the new Automatic Index and is much more efficient with a significantly reduced cost (just 3 down from 11356).

But again, the plans for the SQL that returns many rows are not listed as the Automatic Indexing task has already determined that an index would make such a plan significantly less efficient.

If we now rerun the SQL the returns many rows (and BEFORE High Frequency Collection Statistics potentially kicks in):

SQL> select * from space_oddity where code in (10000, 30000, 50000, 70000, 90000, 110000, 130000, 150000, 170000, 190000);

1000011 rows selected.

Execution Plan
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |    32 |  4512 |    35   (0)| 00:00:01 |
|   1 |  INLIST ITERATOR                     |                      |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID BATCHED| SPACE_ODDITY         |    32 |  4512 |    35   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN                  | SYS_AI_82bdnqs7q8rtm |    32 |       |    12   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR "CODE"=70000 OR "CODE"=90000 OR
           "CODE"=110000 OR "CODE"=130000 OR "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     410422  consistent gets
          0  physical reads
          0  redo size
  145536076  bytes sent via SQL*Net to client
       2678  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000011  rows processed

 

Note that the cardinality estimate is still way way wrong, thinking that just 32 rows are to be returned, when is fact 1000011 rows are returned.

As a result, the CBO has decided to incorrectly use the new Automatic Index. Incorrectly, in that the number of consistent gets has increased 10x from the previous FTS plan (410,422 now, up from 41,169).

One way to resolve this is to collect histograms on the CODE column (or wait for the High Frequency Stats Collection to kick in):

SQL> exec dbms_stats.gather_table_stats(null, 'SPACE_ODDITY', method_opt=> 'FOR ALL COLUMNS SIZE 2048’);

PL/SQL procedure successfully completed.

If we now re-run this SQL:

SQL> select * from space_oddity where code in (190000, 170000, 150000, 130000, 110000, 90000, 70000, 50000, 30000, 10000);

1000011 rows selected.

Execution Plan
----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              |   996K|   133M| 11411   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| SPACE_ODDITY |   996K|   133M| 11411   (1)| 00:00:01 |
----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR
           "CODE"=70000 OR "CODE"=90000 OR "CODE"=110000 OR "CODE"=130000 OR
           "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
      41169  consistent gets
          0  physical reads
          0  redo size
   13535504  bytes sent via SQL*Net to client
       2678  bytes received via SQL*Net from client
        202  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
    1000011  rows processed

 

The cardinality estimate is now much more accurate and the the execution plan now uses the more efficient FTS.

In Part II, we’ll look at how the Automatic Indexing tasks can be made to identify the dangers of a new index to SQLs that might degrade in performance and how it will create a Baseline to protect against any such SQL regressions….

Oracle 19c Automatic Indexing: CBO Incorrectly Using Auto Indexes Part I (Neighborhood Threat) September 18, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Data Skew, Explain Plan For Index, Extended Statistics, Full Table Scans, Histograms, Index Access Path, Oracle, Oracle General, Oracle Indexes.
1 comment so far

Following on from my previous few posts on “data skew”, I’m now going to look at it from a slightly different perspective, where there is an inherent relationship between columns. The CBO has difficulties in recognising (by default) that some combinations of column values are far more common than other combinations, resulting in incorrect cardinality estimates and resultant poor execution plans.

As we’ll see, this skew in returned data can lead to poor execution plans due to the inappropriate use of newly created Automatic Indexes…

I’ll start by creating a simple table that has two columns of interest, CODE1 and CODE2:

SQL> create table iggy_pop (id number, code1 number, code2 number, name varchar2(42));

Table created.

SQL> insert into iggy_pop select rownum, mod(rownum, 100)+1, mod(rownum, 100)+1, 'David Bowie'
from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'IGGY_POP');

PL/SQL procedure successfully completed.

 

Both columns CODE1 and CODE2 each have 100 distinct values, so that the possible combinations of data from both columns is 100 x 100 = 10,000. HOWEVER, the values of CODE1 and CODE2 are always the same and so there is in fact only 100 distinct combinations of data because of this inherent relationship between columns.

If we run the following query for a combination of data that exists:

 

SQL> select * from iggy_pop where code1=42 and code2=42;

100000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3288467

--------------------------------------------------------------------------------------
| Id | Operation                | Name      | Rows | Bytes | Cost (%CPU)|   Time     |
--------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT         |          |   1000|  24000|    575 (15)|   00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL| IGGY_POP |   1000|  24000|    575 (15)|   00:00:01 |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE1"=42 AND "CODE2"=42)
    filter("CODE1"=42 AND "CODE2"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      40964 consistent gets
      40953 physical reads
          0 redo size
    1092240 bytes sent via SQL*Net to client
        581 bytes received via SQL*Net from client
         21 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
     100000 rows processed

 

Without an index, the CBO has no choice but to use a FTS. However, the interesting thing to note is how the cardinality estimate is way wrong, with 100,000 rows returned but only 1000 rows estimated. The CBO incorrect assumes that 1/10000th of the data is being returned and not actual the 1/100 (1%).

If we run a query on a combination of data that doesn’t exist:

SQL> select code1, code2, name from iggy_pop where code1=1 and code2=42;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3288467

--------------------------------------------------------------------------------------
| Id | Operation                | Name     | Rows | Bytes | Cost (%CPU)| Time        |
--------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |          | 1000 |  18000|    575 (15)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| IGGY_POP | 1000 |  18000|    575 (15)| 00:00:01    |
--------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE1"=1 AND "CODE2"=42)
    filter("CODE1"=1 AND "CODE2"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      40964 consistent gets
      40953 physical reads
          0 redo size
        368 bytes sent via SQL*Net to client
        377 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

The CBO still estimates that 1000 rows are to be returned. However, with no rows returned, an index would be a much better alternative than the current FTS in this case.

Let’s now wait and see what the Automatic Indexing process makes of all this (following are highlights from the Auto Indexing Last Activity report):

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 18-SEP-2020 01:24:17
Activity end                : 18-SEP-2020 01:25:29
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 134.22 MB (134.22 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 1
SQL statements improved (improvement factor) : 1 (41301.7x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 41301.7x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
| Owner | Table    | Index                | Key         | Type   | Properties |
-------------------------------------------------------------------------------
| BOWIE | IGGY_POP | SYS_AI_1awkddqkwa4f8 | CODE1,CODE2 | B-TREE | NONE       |
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

So Oracle does indeed create an automatic index on the CODE1, CODE2 columns. However, notice that only 1 statement has been verified and not the above two statements that I had executed during the previous period.

 

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : bdnf0barn3jk7
SQL Text            : select code1, code2, name from iggy_pop where code1=1 and code2=42
Improvement Factor  : 41301.7x

Execution Statistics:
-----------------------------
                  Original Plan                 Auto Index Plan
                  ---------------------------- ----------------------------
Elapsed Time (s): 72085                        1342
CPU Time (s):     39272                        679
Buffer Gets:      123907                       3
Optimizer Cost:   575                          4
Disk Reads:       122859                       2
Direct Writes:    0                            0
Rows Processed:   0                            0
Executions:       3                            1

 

So only the SQL that returned 0 rows has been reported. As expected, it runs much more efficiently with an index than via the previous FTS, with an Improvement Factor of some 41301.7x.

 

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 3288467

--------------------------------------------------------------------------------
| Id | Operation                | Name     | Rows | Bytes | Cost | Time        |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |          |      |       |  575 |             |
| 1 | TABLE ACCESS STORAGE FULL | IGGY_POP | 1000 | 18000 |  575 | 00:00:01    |
--------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 2496796491

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |    2 |    36 |    4 | 00:00:01   |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | IGGY_POP             |    2 |    36 |    4 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_1awkddqkwa4f8 |    1 |       |    3 | 00:00:01   |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE1"=1 AND "CODE2"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

If we look at the comparison between plans, the new plan of course uses the newly created Automatic Index.

The critical point to notice here however is that the cardinality estimates are almost spot for the new execution plan (2 rows is much closer to reality than the previous 1000).

The reason why it’s much more accurate is because the Auto Indexing process session uses the new Dynamic Sampling Level = 11. This enables the CBO to sample data on the fly and determine a much more accurate cardinality estimate than by default where the Dynamic Sampling Level=2.

This also explains why the other statement which returned many rows was not “verified”. Actually, it was but because the Auto Index process with Dynamic Sampling set to 11 correctly identified that too many rows were being returned to make any new index viable, this statement did NOT cause the new index to be kept.

So it was only the SQL that returned no rows that resulted in the newly created Automatic Index. The other statement was correctly determined by the Automatic Indexing process to run worse with the new index and so determined that the CBO would simply ignore the index if created.

BUT this assumption of the CBO ignoring the index is NOT correct as we’ll see…

If we look at the new Automatic Index:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='IGGY_POP';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_1awkddqkwa4f8           YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       15362           4083700

 

We can see the index is both VISIBLE and VALID and so can potentially be used now by ANY subsequent SQL statement.

Now the important thing to note is that the default for most sessions in a database is for Dynamic Sampling to be set to 2 and for Optimizer_Adaptive_Statistics=False. Importantly, this is also the case in Oracle’s Autonomous Transaction Processing Cloud service.

SQL> show parameter sampling

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_dynamic_sampling           integer     2
SQL> show parameter optimizer_adaptive

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
optimizer_adaptive_plans             boolean     TRUE
optimizer_adaptive_reporting_only    boolean     FALSE
optimizer_adaptive_statistics        boolean     FALSE

 

So this is DIFFERENT to the settings for the Automatic Indexing process. In a standard session, the CBO will NOT have the capability to accurately determine the correct cardinality estimates as we saw previously.

If we now re-run the SQL that returns no rows:

SQL> select code1, code2, name from iggy_pop where code1=1 and code2=42;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 2496796491

------------------------------------------------------------------------------------------------------------
| Id | Operation                          | Name                 | Rows | Bytes | Cost (%CPU)| Time        |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                   |                      | 1000 | 18000 |     413 (0)| 00:00:01    |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED| IGGY_POP             | 1000 | 18000 |     413 (0)| 00:00:01    |
|* 2 | INDEX RANGE SCAN                   | SYS_AI_1awkddqkwa4f8 | 1000 |       |       4 (0)| 00:00:01    |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE1"=1 AND "CODE2"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        368 bytes sent via SQL*Net to client
        377 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

The execution uses the new index, because even though it STILL thinks 1000 rows are to be returned, that’s sufficiently few for the index to be costed the cheaper option.

When when re-run the SQL that returns many many rows:

 

SQL> select * from iggy_pop where code1=42 and code2=42;

100000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2496796491

------------------------------------------------------------------------------------------------------------
| Id | Operation                          | Name                 | Rows | Bytes | Cost (%CPU)| Time        |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                   |                      | 1000 | 24000 |     413 (0)| 00:00:01    |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED| IGGY_POP             | 1000 | 24000 |     413 (0)| 00:00:01    |
|* 2 | INDEX RANGE SCAN                   | SYS_AI_1awkddqkwa4f8 | 1000 |       |       4 (0)| 00:00:01    |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE1"=42 AND "CODE2"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
         25 recursive calls
          0 db block gets
      41981 consistent gets
      40953 physical reads
          0 redo size
    1092240 bytes sent via SQL*Net to client
        581 bytes received via SQL*Net from client
         21 SQL*Net roundtrips to/from client
          1 sorts (memory)
          0 sorts (disk)
     100000 rows processed

 

Ouch. It also uses the new Automatic Index, because it also STILL thinks only 1000 rows are to be returned and just like the previous SQL statement, is determined to be the cheaper option.

BUT in this case it isn’t really the cheaper option, having to read the table potentially piecemeal at a time via the index, rather than more efficiently with fewer and larger multiblock reads via a FTS.

This is not really how Automatic is designed to work. Its meant to protect us from making SQL statements regress in performance BUT because there is a difference in how a normal session and the Automatic Indexing process determines the cost of execution plans, these scenarios can eventuate.

In my next blog I’ll look at how to address this specific scenario and then look at an example of how Automatic Indexing is really meant to work via the use of automated baselines…