jump to navigation

Oracle 19c Automatic Indexing: Invisible/Valid Automatic Indexes (Bowie Rare) August 31, 2021

Posted by Richard Foote in 19c, 19c New Features, Attribute Clustering, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Exadata, Index Access Path, Index statistics, Invisible Indexes, Invisible/Valid Indexes, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle Indexes, Oracle Statistics, Oracle19c, Unusable Indexes.
add a comment

In my previous post, I discussed how newly created Automatic Indexes can have one of three statuses, depending the selectivity and effectiveness of the associated Automatic Index.

Indexes that improve performance sufficiently are created as Visible/Valid indexes and can be subsequently considered by the CBO. Indexes that are woeful and have no chance of improving performance are created as Invisible/Unusable indexes. Indexes considered potentially suitable but ultimately don’t sufficiently improve performance, are created as Invisible/Valid indexes.

Automatic Indexes are created as Visible/Valid indexes when shown to improve performance (by the _AUTO_INDEX_IMPROVEMENT_THRESHOLD parameter). But as I rarely came across Invisible/Valid Automatic Indexes (except for when Automatic Indexing is set to “Report Only” mode), I was curious to determine approximately at what point were such indexes created by the Automatic Indexing process.

To investigate things, I created a table with columns that contain data with various levels of selectivity, some of which should fall inside and outside the range of viability of any associated index, based on the cost of the associated Full Table Scan.

The following table has 32 columns of interest, each with a slight variation of distinct values giving small differences in overall column selectivity:

SQL> create table bowie_stuff1 (id number, code1 number, code2 number, code3 number, code4 number, code5 number, code6 number, code7 number, code8 number, code9 number, code10 number, code11 number, code12 number, code13 number, code14 number, code15 number, code16 number, code17 number, code18 number, code19 number, code20 number, code21 number, code22 number, code23 number, code24 number, code25 number, code26 number, code27 number, code28 number, code29 number, code30 number, code31 number, code32 number, name varchar2(42));

Table created.

SQL> insert into bowie_stuff1 
select rownum, 
       mod(rownum, 900)+1, 
       mod(rownum, 1000)+1, 
       mod(rownum, 1100)+1, 
       mod(rownum, 1200)+1, 
       mod(rownum, 1300)+1, 
       mod(rownum, 1400)+1, 
       mod(rownum, 1500)+1, 
       mod(rownum, 1600)+1, 
       mod(rownum, 1700)+1, 
       mod(rownum, 1800)+1, 
       mod(rownum, 1900)+1, 
       mod(rownum, 2000)+1, 
       mod(rownum, 2100)+1, 
       mod(rownum, 2200)+1, 
       mod(rownum, 2300)+1, 
       mod(rownum, 2400)+1, 
       mod(rownum, 2500)+1, 
       mod(rownum, 2600)+1, 
       mod(rownum, 2700)+1, 
       mod(rownum, 2800)+1, 
       mod(rownum, 2900)+1, 
       mod(rownum, 3000)+1, 
       mod(rownum, 3100)+1, 
       mod(rownum, 3200)+1, 
       mod(rownum, 3300)+1, 
       mod(rownum, 3400)+1, 
       mod(rownum, 3500)+1, 
       mod(rownum, 3600)+1, 
       mod(rownum, 3700)+1, 
       mod(rownum, 3800)+1, 
       mod(rownum, 3900)+1, 
       mod(rownum, 4000)+1,
       'THE RISE AND FALL OF ZIGGY STARDUST' 
from dual connect by level >=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

As always, it’s important that statistics be collected for Automatic Indexing to function properly:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_STUFF1', estimate_percent=>null);

PL/SQL procedure successfully completed.

 

So on a 10M row table, I have 32 columns with the number of distinct values varying by only 100 values per column (or by a selectivity of just 0.001%):

SQL> select column_name, num_distinct, density, histogram from dba_tab_columns where table_name='BOWIE_STUFF1' order by num_distinct;

COLUMN_NAME  NUM_DISTINCT    DENSITY HISTOGRAM
------------ ------------ ---------- ---------------
NAME                    1  .00000005 FREQUENCY
CODE1                 900    .001111 HYBRID
CODE2                1000       .001 HYBRID
CODE3                1100    .000909 HYBRID
CODE4                1200    .000833 HYBRID
CODE5                1300    .000769 HYBRID
CODE6                1400    .000714 HYBRID
CODE7                1500    .000667 HYBRID
CODE8                1600    .000625 HYBRID
CODE9                1700    .000588 HYBRID
CODE10               1800    .000556 HYBRID
CODE11               1900    .000526 HYBRID
CODE12               2000      .0005 HYBRID
CODE13               2100    .000476 HYBRID
CODE14               2200    .000455 HYBRID
CODE15               2300    .000435 HYBRID
CODE16               2400    .000417 HYBRID
CODE17               2500      .0004 HYBRID
CODE18               2600    .000385 HYBRID
CODE19               2700     .00037 HYBRID
CODE20               2800    .000357 HYBRID
CODE21               2900    .000345 HYBRID
CODE22               3000    .000333 HYBRID
CODE23               3100    .000323 HYBRID
CODE24               3200    .000312 HYBRID
CODE25               3300    .000303 HYBRID
CODE26               3400    .000294 HYBRID
CODE27               3500    .000286 HYBRID
CODE28               3600    .000278 HYBRID
CODE29               3700     .00027 HYBRID
CODE30               3800    .000263 HYBRID
CODE31               3900    .000256 HYBRID
CODE32               4000     .00025 HYBRID
ID               10000000          0 HYBRID

I’ll next run the below queries (based on a simple equality predicate on each column) several times each in batches of 8 queries, so as to not swamp the Automatic Indexing process with potential new index requests (the ramifications of which I’ll discuss in another future post):

SQL> select * from bowie_stuff1 where code1=42;
SQL> select * from bowie_stuff1 where code2=42;
SQL> select * from bowie_stuff1 where code3=42;
SQL> select * from bowie_stuff1 where code4=42;
SQL> select * from bowie_stuff1 where code5=42;
...
SQL> select * from bowie_stuff1 where code31=42;
SQL> select * from bowie_stuff1 where code32=42;

 

If we now look at the statuses of the Automatic Indexes subsequently created:

SQL> select i.index_name, c.column_name, i.auto, i.constraint_index, i.visibility, i.status, i.num_rows, i.leaf_blocks, i.clustering_factor
from user_indexes i, user_ind_columns c
where i.index_name=c.index_name and i.table_name='BOWIE_STUFF1' order by visibility, status;

INDEX_NAME             COLUMN_NAME  AUT CON VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------- ------------ --- --- --------- -------- ---------- ----------- -----------------
SYS_AI_5rw9j3d8pc422   CODE5        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_48q3j752csn1p   CODE4        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_9sgharttf3yr7   CODE3        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_8n92acdfbuh65   CODE2        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_brgtfgngu3cj9   CODE1        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_1tu5u4012mkzu   CODE11       YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_34b6zwgtm86rr   CODE12       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_gd0ccvdwwb4mk   CODE13       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_7k7wh28n3nczy   CODE14       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_67k2zjp09w101   CODE15       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_5fa6k6fm0k6wg   CODE10       YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_4624ju6bxsv57   CODE9        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_bstrdkkxqtj4f   CODE8        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_39xqjjar239zq   CODE7        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_6h0adp60faytk   CODE6        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_5u0bqdgcx52vh   CODE16       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_0hzmhsraqkcgr   CODE22       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_4x716k4mdn040   CODE21       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_6wsuwr7p6drsu   CODE20       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_b424tdjx82rwy   CODE19       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_3a2y07fqkzv8x   CODE18       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_8dp0b3z0vxzyg   CODE17       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_d95hnqayd7t08   CODE23       YES NO  VISIBLE   VALID      10000000       15366          10000000
SYS_AI_fry4zrxqtpyzg   CODE24       YES NO  VISIBLE   VALID      10000000       15366          10000000
SYS_AI_920asb69q1r0m   CODE25       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_026pa8880hnm2   CODE31       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_96xhzrguz2qpy   CODE32       YES NO  VISIBLE   VALID      10000000       15368          10000000
SYS_AI_3dq93cc7uxruu   CODE29       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_5nbz41xny8fvc   CODE28       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_fz4q9bhydu2qt   CODE27       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_0kwczzg3k3pfw   CODE26       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_4qd5tsab7fnwx   CODE30       YES NO  VISIBLE   VALID      10000000       15367          10000000

We can see we indeed have the 3 statuses of Automatic Indexes captured:

Columns with a selectivity equal or worse to that of COL5 with 1300 distinct values are created as Invisible/Unusable indexes. Returning 10M/1300 rows or a cardinality of approx. 7,693 or more rows is just too expensive for such indexes on this table to be viable. This represents a selectivity of approx. 0.077%.

Note how the index statistics for these Invisible/Unusable indexes are not accurate. They all have an estimated LEAF_BLOCKS of 21702 and a CLUSTERING_FACTOR of 4272987. However, we can see from the other indexes which are physically created that these are not correct and are substantially off the mark with the actual LEAF_BLOCKS being around 15364 and the CLUSTERING_FACTOR actually much worse at around 10000000.

Again worthy of a future post to discuss how Automatic Indexing processing has to make (potentially inaccurate) guesstimates for these statistics in its analysis of index viability when such indexes don’t yet physically exist.

Columns with a selectivity equal or better to that of COL23 which has 3100 distinct values are created as Visible/Valid indexes. Returning 10M/3100 rows or a cardinality of approx. 3226 or less rows is cheap enough for such indexes on this table to be viable. This represents a selectivity of approx. 0.032%.

So in this specific example, only those columns between 1400 and 3000 distinct values meet the “borderline” criteria in which the Automatic Indexing process creates Invisible/Valid indexes. This represents a very very narrow selectivity range of only approx. 0.045% in which such Invisible/Valid indexes are created. Or for this specific example, only those columns that return approx. between 3,333 and 7,143 rows from the 10M row table.

Now the actual numbers and total range of selectivities for which Invisible/Valid Automatic Indexes are created of course depends on all sorts of factors, such as the size/cost of FTS of the table and not least the clustering of the associated data (which I’ve blogged about ad nauseam).

The point I want to make is that the range of viability for such Invisible/Valid indexes is relatively narrow and the occurrences of such indexes relatively rare in your databases. As such, the vast majority of Automatic Indexes are likely to be either Visible/Valid or Invisible/Unusable indexes.

It’s important to recognised this when you encounter such Invisible/Valid Automatic Indexes (outside of “REPORT ONLY” implementations), as it’s an indication that such an index is a borderline case that is currently NOT considered by the CBO (because of it being Invisible).

However, this Invisible/Valid Automatic Index status should really change to either of the other two more common statuses in the near future.

I’ll expand on this point in a future post…

Oracle 19c Automatic Indexing: The 3 Possible States Of Newly Created Automatic Indexes (“Don’t Sit Down”) August 24, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, CBO, Clustering Factor, Exadata, Invisible Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle Indexes, Oracle Statistics.
1 comment so far

As I discussed way back in February 2021 (doesn’t time fly!!), I discussed some oddity cases in which Automatic Indexes were being created in an Invisible/Valid state. At the time, I described it as unexpected behaviour as this wasn’t documented and seemed an odd outcome, one which I had only expected to find when Automatic Indexing was set in “REPORT ONLY” mode.

After further research and discussions with folks within Oracle, Automatic Indexes created in this state is indeed entirely expected, albeit in relatively rare scenarios. So I thought I’ll discuss the 3 possible states in which an Automatic Index can be created and explore things further in future blog posts.

The follow demo illustrates the 3 different states in which Automatic Indexes can be created.

I start by creating a table with 3 columns of note:

  • CODE1 which is highly selective and very likely to be used by the CBO if indexed
  • CODE2 which is relatively selective BUT likely NOT quite enough so to be used by the CBO if indexed
  • CODE3 which is very unselective and almost certainly won’t be used by the CBO if indexed
SQL> create table david_bowie (id number, code1 number, code2 number, code3 number, name varchar2(42));

Table created.

SQL> insert into david_bowie select rownum, mod(rownum, 1000000)+1, mod(rownum, 5000)+1, mod(rownum, 100)+1, 'THE RISE AND FALL OF ZIGGY STARDUST' from dual connect by level >=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID_BOWIE');

PL/SQL procedure successfully completed.

Note that in an Autonomous Database, these columns will all now have histograms (as previously discussed):

SQL> select column_name, num_distinct, density, histogram from dba_tab_columns where table_name='DAVID_BOWIE';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- ------------ ---------- ---------------
ID                        9705425          0 HYBRID
CODE1                      971092    .000001 HYBRID
CODE2                        4835    .000052 HYBRID
CODE3                         100  .00000005 FREQUENCY
NAME                            1 4.9460E-08 FREQUENCY

I’ll now run the following simple queries a number of times, using predicates on each of the 3 columns:

SQL> select * from david_bowie where code1=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1390211489

-----------------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             |   10 |   540 |    1076 (9) |  00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL | DAVID_BOWIE |   10 |   540 |    1076 (9) |  00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE1"=42)
     filter("CODE1"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      83297 consistent gets
      83285 physical reads
          0 redo size
        783 bytes sent via SQL*Net to client
        362 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed



SQL> select * from david_bowie where code2=42;

2000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1390211489

-----------------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             | 2068 |  109K |   1083 (10) |  00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL | DAVID_BOWIE | 2068 |  109K |   1083 (10) |  00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE2"=42)
     filter("CODE2"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      83297 consistent gets
      83285 physical reads
          0 redo size
      32433 bytes sent via SQL*Net to client
        362 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2000 rows processed



SQL> select * from david_bowie where code3=42;

100000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1390211489

-----------------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             | 100K | 5273K |   1090 (10) |  00:00:01 |
|* 1 | TABLE ACCESS STORAGE FULL | DAVID_BOWIE | 100K | 5273K |   1090 (10) |  00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE3"=42)
     filter("CODE3"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      83297 consistent gets
      83285 physical reads
          0 redo size
    1984026 bytes sent via SQL*Net to client
        571 bytes received via SQL*Net from client
         21 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
     100000 rows processed

 

Obviously with no indexes in place, they all currently use a FTS.

If we wait though until the next Automatic Indexing reporting period and look at the next Automatic Indexing report:

 

SQL> select dbms_auto_index.report_last_activity() from dual;

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 3
Indexes created (visible / invisible)        : 2 (1 / 1)
Space used (visible / invisible)             : 276.82 MB (142.61 MB / 134.22 MB)
Indexes dropped                              : 0
SQL statements verified                      : 2
SQL statements improved (improvement factor) : 1 (83301.1x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 2x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

 

We notice Automatic Indexing stated there were 3 index candidates, but has created 2 new indexes, one VISIBLE and one INVISIBLE.

Further down the report:

 

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
----------------------------------------------------------------------------
| Owner | Table       | Index                | Key   | Type   | Properties |
----------------------------------------------------------------------------
| BOWIE | DAVID_BOWIE | SYS_AI_48d67aycauayj | CODE1 | B-TREE | NONE       |
| BOWIE | DAVID_BOWIE | SYS_AI_cpw2p477wk6us | CODE2 | B-TREE | NONE       |
----------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

We see that one index was created on the CODE1 column and the other on the CODE2 column (note: in the current 19.12.0.1.0 version of the Transaction Processing Autonomous Database, the * to denote invisible indexes above is no longer present).

No index is listed as being created on the very unselective CODE3 column.

If we continue down the report:

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 6vp85adas9tq3
SQL Text            : select * from david_bowie where code1=42
Improvement Factor  : 83301.1x

Execution Statistics:
-----------------------------
                     Original Plan                Auto Index Plan
                     ---------------------------- ----------------------------
Elapsed Time (s):    246874                       1248
CPU Time (s):        139026                       694
Buffer Gets:         749710                       13
Optimizer Cost:      1076                         13
Disk Reads:          749568                       2
Direct Writes:       0                            0
Rows Processed:      90                           10
Executions:          9                            1

PLANS SECTION
--------------------------------------------------------------------------------
-------------

- Original
-----------------------------
Plan Hash Value : 1390211489

-----------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost | Time       |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             |      |       | 1076 |            |
|  1 | TABLE ACCESS STORAGE FULL | DAVID_BOWIE |   10 |   540 | 1076 |   00:00:01 |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 3510800558

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   10 |   540 |   13 |   00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | DAVID_BOWIE          |   10 |   540 |   13 |   00:00:01 |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_48d67aycauayj |   10 |       |    3 |   00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE1"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We see that the Visible Index was actually created on the CODE1 column, thanks to the perceived 83301.1x performance improvement.

If we look at the status of all indexes now on our table:

SQL> select i.index_name, c.column_name, i.auto, i.constraint_index, i.visibility, i.compression, i.status, i.num_rows, i.leaf_blocks, i.clustering_factor
from user_indexes i, user_ind_columns c where i.index_name=c.index_name and i.table_name='DAVID_BOWIE';

INDEX_NAME             COLUMN_NAME AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------- ----------- --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_48d67aycauayj   CODE1       YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       16891          10000000
SYS_AI_cpw2p477wk6us   CODE2       YES NO  INVISIBLE ADVANCED LOW  VALID      10000000       15369          10000000
SYS_AI_c8bkc2z4bxrzp   CODE3       YES NO  INVISIBLE ADVANCED LOW  UNUSABLE   10000000       20346           4173285

 

We see indexes with 3 different statuses:

  • CODE1 index is VISIBLE/VALID
  • CODE2 index is INVISIBLE/VALID
  • CODE3 index is INVISIBLE/UNUSABLE

The logic appears to be as follows:

If an index will demonstrably improve performance sufficiently, then the index is created as a VISIBLE and VALID index and can be subsequently used by the CBO.

If an index is demonstrably awful and has very little chance of ever being used by the CBO, it’s left INVISIBLE and put in an UNUSABLE state. It therefore takes up no space and will eventually be dropped. It will likely never be required, so no loss then if it doesn’t physically exist.

Interestingly, if an index is somewhat “borderline”, currently not efficient enough to be used by the CBO, but close enough perhaps that maybe things might change in the future to warrant such as index, then it is physically created as VALID but is not readily available to the CBO and remains in an INVISIBLE state. This index won’t have to be rebuilt in the future if indeed things change subsequently to enough to warrant future index usage.

It should of be noted that little of this is clearly documented and that it’s subject to change without notice. One of the key points of Automatic Indexing is that we can off-hand all this to Oracle and let Oracle worry about things. That said, it might be useful to understand why you might end up with indexes in different statuses and the subsequent impact this might make.

If we re-run the first query based on the CODE1 predicate:

SQL> select * from david_bowie where code1=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3510800558

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   10 |   540 |      14 (0) | 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | DAVID_BOWIE          |   10 |   540 |      14 (0) | 00:00:01 |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_48d67aycauayj |   10 |       |       3 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE1"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         14 consistent gets
          0 physical reads
          0 redo size
       1151 bytes sent via SQL*Net to client
        362 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

The CBO will indeed use the newly created Automatic Index.

But if we re-run either of the other 2 queries based on the CODE2 and CODE3 predicates:

SQL> select * from david_bowie where code2=42;

2000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1390211489

-----------------------------------------------------------------------------------------
| Id  | Operation                 | Name        | Rows | Bytes | Cost (%CPU) | Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |             | 2068 |  109K |   1083 (10) | 00:00:01 |
| * 1 | TABLE ACCESS STORAGE FULL | DAVID_BOWIE | 2068 |  109K |   1083 (10) | 00:00:01 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE2"=42)
    filter("CODE2"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      83297 consistent gets
      83285 physical reads
          0 redo size
      32433 bytes sent via SQL*Net to client
        362 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2000 rows processed

The CBO will not use an index as no VISIBLE/VALID indexes exist on these columns.

In future blog posts I’ll explore what is meant by “borderline” and what can subsequently happen to any such INVISIBLE/VALID Automatic Indexes…

Announcement: AUSOUG 1 Day Oracle Database AWR/Statspack Workshops with Richard Foote May 20, 2021

Posted by Richard Foote in AWR Reports, Oracle, Performance Tuning.
add a comment

It’s with much excitement that I announce a couple of in-person workshops I’ll be running in the coming months in conjunction with the Australian Oracle User Group (AUSOUG). The dates and locations are:

These will be 1 day workshops that discuss in detail when and how to effectively interpret and diagnose performance issues using Oracle Database AWR/Statspack Reports.

When an application suddenly runs “slow” or when people start complaining about the “poor performance” of the database, there’s often uncertainty in how to most quickly and most accurately determine the “root” cause of any such slowdown and effectively address any associated issues. One of the available tools in the DBA toolkit is the Automatic Workload Repository (AWR) Report, but many struggle to determine if it’s an appropriate report for the issue at hand and then how to read, decipher, and extract any useful information from the wealth of database diagnostic information available in AWR reports.

In this workshop, we explore a Tuning Methodology that helps Oracle professionals to both quickly and reliably determine if an AWR (or Statspack) report is the correct tool to use in determining the actual root causes of Oracle database performance issues.  Looking at several real-world scenarios, this workshop will show participants when to consider using AWR reports and then how to diagnose performance issues confidently and reliably by extracting the relevant details from such AWR reports.  Accurately determining the root causes of performance issues ensures the effectiveness and timeliness of any applied resolutions.

Participants will also be able to submit their own database AWR reports for analysis during the workshop and using the taught methodologies and techniques, identify any global performance issues within their own databases.

Both sessions will be followed by an end-of-day networking session.

AUSOUG are hosting these workshops at a very special price for their members, although it’s possible for non-members to also register (see link below):

AUSOUG One Day AWR/Statspack Workshop with Richard Foote

 

Places will be strictly limited to provide those in attendance plenty of time for questions and to have their AWR/Statspack reports anaylsed as part of the workshop. So please book early to avoid disappointment and to take advantage of the extra special early bird rate. At just $200 for AUSOUG members, it’s the best value training you will get in a very long time… 🙂

Future workshops may be added in the coming weeks, please stay tuned.

 

 

 

 

Oracle 19c Automatic Indexing: Function-Based Indexes? Part II (If You Can See Me) February 5, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Function Based Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Virtual Columns.
add a comment

In my previous post, I discussed how Automatic Indexing does not currently support creating an index based on a function or expression predicate, even if it’s an equality predicate. You must manually create the associated function-based index.

However, if you have access to the application, there’s a better strategy when frequently searching on a function-based predicate. That’s to create a Virtual Column and use this column in your searching criteria (as mentioned by Connor McDonald in this comment).

To illustrate, I’m going to drop the previously manually created function-based index and hence the associated hidden virtual column, as Oracle quite rightly doesn’t allow you to have two virtual columns based on the same expression in the same table.

SQL> drop index david_upper_name_i;

Index dropped.

Since Oracle 11g, Oracle has supported the use of Visible Virtual Columns, a column that doesn’t physically exist, but defines a function/expression that can be easily accessed and populated when queried.

I’ll next create a Virtual Column called UPPER_NAME that is defined not based on a Data Type, but on the result on the UPPER function on the previously defined NAME column:

SQL> alter table david add (upper_name as (upper(name)));

Table altered.

Regardless of size of table, this column is added virtually instantly (pun fully intended), as no data is physically stored in the table itself. I view it (yep, another pun) as a “mini-view”, that can be used to hide complexity from the developer, with the actual data derived at run-time when the column is accessed in an SQL.

After I generate fresh statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID', estimate_percent=>null);

PL/SQL procedure successfully completed.

SQL> select column_name, hidden_column, virtual_column, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID';

COLUMN_NAME          HID VIR NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- --- --- ------------ ---------- ---------------
NAME                 NO  NO      10000000          0 HYBRID
MORE_STUFF9          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF8          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF7          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF6          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF5          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF4          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF3          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF2          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF10         NO  NO             1  .00000005 FREQUENCY
MORE_STUFF1          NO  NO             1  .00000005 FREQUENCY
ID                   NO  NO      10000000          0 HYBRID
CODE                 NO  NO         10000      .0001 HYBRID
UPPER_NAME           NO YES      10000000          0 HYBRID

Note how the UPPER_NAME virtual column is NOT hidden and now has up to date statistics.

We can now run this simplified query based on the new UPPER_NAME column, which does not need to include the potentially complex function expression:

SQL> select * from david where upper_name='DAVID BOWIE 42';

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2426813604

-----------------------------------------------------------------------------------
| Id | Operation                 | Name  | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |       |    1 |   200 |    3349 (6) | 00:00:01  | 
|* 1 | TABLE ACCESS STORAGE FULL | DAVID |    1 |   200 |    3349 (6) | 00:00:01  |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("UPPER_NAME"='DAVID BOWIE 42')
    filter("UPPER_NAME"='DAVID BOWIE 42')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     263469 consistent gets
     263452 physical reads
          0 redo size
       1328 bytes sent via SQL*Net to client
        375 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

If we look at portions of the subsequent Automatic Indexing report:

 

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 360.71 MB (360.71 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 2
SQL statements improved (improvement factor) : 2 (263476.8x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 263476.8x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
---------------------------------------------------------------------------
| Owner | Table | Index                | Key        | Type   | Properties |
---------------------------------------------------------------------------
| BOWIE | DAVID | SYS_AI_4k4mkgkw049ht | UPPER_NAME | B-TREE | NONE       |
---------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 7tfqh3pu526mt
SQL Text            : select * from david where upper_name='DAVID BOWIE 42'
Improvement Factor  : 263484.7x

Execution Statistics:
-----------------------------
                        Original Plan                Auto Index Plan
                        ---------------------------- ----------------------------
Elapsed Time (s):       1471249                      1414
CPU Time (s):           300584                       986
Buffer Gets:            3161816                      4
Optimizer Cost:         3349                         4
Disk Reads:             3161432                      3
Direct Writes:          0                            0
Rows Processed:         12                           1
Executions:             12                           1

PLANS SECTION
--------------------------------------------------------------------------------
- Original
-----------------------------
Plan Hash Value : 2426813604

-----------------------------------------------------------------------------
| Id | Operation                 | Name  | Rows | Bytes | Cost | Time       |
-----------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |       |      |       | 3349 |            |
|  1 | TABLE ACCESS STORAGE FULL | DAVID |    1 |   200 | 3349 | 00:00:01   |
-----------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no
- cardinality_feedback = yes

- With Auto Indexes
-----------------------------
Plan Hash Value : 1447691372

-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows | Bytes | Cost | Time      |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |    1 |   200 |    4 | 00:00:01  |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED | DAVID                |    1 |   200 |    4 | 00:00:01  |
| * 2 |  INDEX RANGE SCAN                    | SYS_AI_4k4mkgkw049ht |    1 |       |    3 | 00:00:01  |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("UPPER_NAME"='DAVID BOWIE 42')

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We see from the report that Automatic Indexing has now created the associated, implicitly created function-based index (SYS_AI_4k4mkgkw049ht) based on the virtual UPPER_NAME column:

SQL> select index_name, index_type, auto, constraint_index, visibility, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='DAVID';

INDEX_NAME           INDEX_TYPE                  AUT CON VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- --------------------------- --- --- --------- -------- ---------- ----------- -----------------
SYS_AI_4k4mkgkw049ht FUNCTION-BASED NORMAL       YES NO  VISIBLE   VALID      10000000       43104           2136839

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='DAVID' order by index_name, column_position;

INDEX_NAME           COLUMN_NAME          COLUMN_POSITION
-------------------- -------------------- ---------------
SYS_AI_4k4mkgkw049ht UPPER_NAME                         1

 

If we now re-run the SQL query:

SQL> select * from david where upper_name='DAVID BOWIE 4242';

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1447691372

------------------------------------------------------------------------------------------------------------
| Id | Operation                            | Name                 | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                     |                      |    1 |   200 |       4 (0) | 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID BATCHED | DAVID                |    1 |   200 |       4 (0) | 00:00:01 |
|* 2 |   INDEX RANGE SCAN                   | SYS_AI_4k4mkgkw049ht |    1 |       |       3 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("UPPER_NAME"='DAVID BOWIE 4242')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          5 consistent gets
          0 physical reads
          0 redo size
       1334 bytes sent via SQL*Net to client
        377 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

The CBO now uses the new Automatic Index to significantly improve the performance of the query.

So not only is using a user defined Virtual Column a cleaner solution with respect to the frequent use of a function-based expressions, but has the added advantage of being supported with Automatic Indexing.

Oracle 19c Automatic Indexing: Function-Based Indexes? (No Plan) February 4, 2021

Posted by Richard Foote in 19c, 19c New Features, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Function Based Indexes, Oracle, Oracle Cloud, Oracle General, Oracle Indexes, Oracle19c, Virtual Columns.
3 comments

I previously discussed how Automatic Indexing only currently supports Equality based predicates.

The question I have today is does Automatic Indexing support function-based indexes? Let’s take a look.

The below DAVID table has the key column NAME which is an effectively unique VARCHAR2 column:

SQL> create table david (id number, code number, name varchar2(42), more_stuff1 varchar2(42), more_stuff2 varchar2(42), more_stuff3 varchar2(42), more_stuff4 varchar2(42), more_stuff5 varchar2(42), more_stuff6 varchar2(42), more_stuff7 varchar2(42), more_stuff8 varchar2(42), more_stuff9 varchar2(42), more_stuff10 varchar2(42));

Table created.

SQL> insert into david select rownum, mod(rownum, 10000)+1, 'David Bowie '|| rownum, 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust' from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID', estimate_percent=>null);

PL/SQL procedure successfully completed.

If we look at the current details of the table columns:

SQL> select column_name, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- ------------ ---------- ---------------
NAME                     10000000          0 HYBRID
MORE_STUFF9                     1  .00000005 FREQUENCY
MORE_STUFF8                     1  .00000005 FREQUENCY
MORE_STUFF7                     1  .00000005 FREQUENCY
MORE_STUFF6                     1  .00000005 FREQUENCY
MORE_STUFF5                     1  .00000005 FREQUENCY
MORE_STUFF4                     1  .00000005 FREQUENCY
MORE_STUFF3                     1  .00000005 FREQUENCY
MORE_STUFF2                     1  .00000005 FREQUENCY
MORE_STUFF10                    1  .00000005 FREQUENCY
MORE_STUFF1                     1  .00000005 FREQUENCY
ID                       10000000          0 HYBRID
CODE                        10000      .0001 HYBRID

We notice the same oddity of my previous post that all columns have histograms…

Let’s run the following query with an UPPER function-based predicate that returns only the one row:

SQL> select * from david where upper(name) = 'DAVID BOWIE 4242';

Execution Plan
----------------------------------------------------------
Plan hash value: 2426813604

-----------------------------------------------------------------------------------
| Id | Operation                 | Name  | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |       | 100K |   17M |    3350 (6) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | DAVID | 100K |   17M |    3350 (6) | 00:00:01  |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage(UPPER("NAME")='DAVID BOWIE 4242')
    filter(UPPER("NAME")='DAVID BOWIE 4242')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     263469 consistent gets
     263452 physical reads
          0 redo size
       1256 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

What does Automatic Indexing make of this scenario?

Basically, it does nothing. Currently, Automatic Indexing does NOT support such function-based indexes, even with equality based predicates (as of at least version 19.5.0.0.0). If we look at the next Automatic Indexing report:

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates           : 0
Indexes created            : 0
Space used                 : 0 B
Indexes dropped            : 0
SQL statements verified    : 2
SQL statements improved    : 0
SQL plan baselines created : 0
Overall improvement factor : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

No such function-based index is ever created by Automatic Indexing:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='DAVID';

no rows selected

To improve the performance of this query, one has to manually create the necessary function-based index:

SQL> create index david_upper_name_i on david(upper(name));

Index created.

If we now re-run the query:

SQL> select name from david where upper(name) = 'DAVID BOWIE 4242';

Execution Plan
----------------------------------------------------------
Plan hash value: 2675555529

----------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name               | Rows  | Bytes | Cost (%CPU) | Time     |
----------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                    |  100K | 4199K |    3175 (1) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | DAVID              |  100K | 4199K |    3175 (1) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | DAVID_UPPER_NAME_I | 40000 |       |       3 (0) | 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(UPPER("NAME")='DAVID BOWIE 4242')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          5 consistent gets
          0 physical reads
          0 redo size
        369 bytes sent via SQL*Net to client
        384 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

The query now uses the function-based index to significantly improve the performance of this query, with just 5 consistent gets.

Note however as with all function-based indexes, by default the estimated cardinality estimate and associated CBO costs are way off (100K rows are estimated, not the 1 row that is actually returned). This is due to the CBO having no real idea of the number and distribution of values coming out of the “black box” function-based predicate.

This is why Oracle automatically creates an hidden virtual column by which to store the necessary statistics associated to the function (in this case the SYS_NC00014$ column):

SQL> select column_name, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- ------------ ---------- ---------------
NAME                     10000000          0 HYBRID
MORE_STUFF9                     1  .00000005 FREQUENCY
MORE_STUFF8                     1  .00000005 FREQUENCY
MORE_STUFF7                     1  .00000005 FREQUENCY
MORE_STUFF6                     1  .00000005 FREQUENCY
MORE_STUFF5                     1  .00000005 FREQUENCY
MORE_STUFF4                     1  .00000005 FREQUENCY
MORE_STUFF3                     1  .00000005 FREQUENCY
MORE_STUFF2                     1  .00000005 FREQUENCY
MORE_STUFF10                    1  .00000005 FREQUENCY
MORE_STUFF1                     1  .00000005 FREQUENCY
ID                       10000000          0 HYBRID
CODE                        10000      .0001 HYBRID
SYS_NC00014$                                 NONE

But we need to first collect statistics on this hidden virtual column for the statistics to be populated:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID', no_invalidate=> false, method_opt=> 'FOR ALL HIDDEN COLUMNS SIZE 1');

SQL> select column_name, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- ------------ ---------- ---------------
NAME                     10000000          0 HYBRID
MORE_STUFF9                     1  .00000005 FREQUENCY
MORE_STUFF8                     1  .00000005 FREQUENCY
MORE_STUFF7                     1  .00000005 FREQUENCY
MORE_STUFF6                     1  .00000005 FREQUENCY
MORE_STUFF5                     1  .00000005 FREQUENCY
MORE_STUFF4                     1  .00000005 FREQUENCY
MORE_STUFF3                     1  .00000005 FREQUENCY
MORE_STUFF2                     1  .00000005 FREQUENCY
MORE_STUFF10                    1  .00000005 FREQUENCY
MORE_STUFF1                     1  .00000005 FREQUENCY
ID                       10000000          0 HYBRID
CODE                        10000      .0001 HYBRID
SYS_NC00014$              9947366          0 HYBRID

Now the CBO has the necessary statistics by which to determine a much more accurate cardinality estimate for the function-based predicate and so potentially a more efficient execution plan:

SQL> select * from david where upper(name) = 'DAVID BOWIE 4242';

Execution Plan
----------------------------------------------------------
Plan hash value: 2675555529

----------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name               | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                    |    1 |   200 |       4 (0) | 00:00:01  |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | DAVID              |    1 |   200 |       4 (0) | 00:00:01  |
|* 2 | INDEX RANGE SCAN                    | DAVID_UPPER_NAME_I |    1 |       |       3 (0) | 00:00:01  |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(UPPER("NAME")='DAVID BOWIE 4242')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          1 recursive calls
          0 db block gets
          5 consistent gets
          0 physical reads
          0 redo size
       1256 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

With the virtual column statistics in place, the CBO now has the cardinality estimate of 1 and associated costs spot on, which is always a good thing.

This requirement to collect the necessary statistics on the associated virtual column created as a result of the function-based index to ensure the index is costed and used effectively is perhaps but one reason why function-based indexes are currently not supported by Automatic Indexing.

As always, this can always change in the future…

Oracle Database 19c Automatic Indexing: Invisible Indexes Oddity (Wild Eyed Boy From Freecloud) February 3, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Automatic Table Statistics, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Exadata, Histograms, Invisible Indexes, Oracle, Oracle Cloud, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c.
2 comments

There have been a couple of “oddities” in relation to both Oracle Autonomous Databases and Automatic Indexing behaviour that I’ve seen frequently enough now (on Oracle 19.5.0.0.0) to make it worth a quick blog article.

The following is a simple test case that highlights both these issues. I’ll begin with a basic table, that has the key column CODE with a selectivity that would likely make it too expensive to be accessed via an associated index.

SQL> create table pink_floyd (id number, code number, create_date date, name varchar2(42));

Table created.

SQL> insert into pink_floyd select rownum, ceil(dbms_random.value(0, 5000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon' from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

Importantly, I’ll next collect statistics on this table using all the default attributes, including allowing Oracle to decide the merits of any column histogram:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD');

PL/SQL procedure successfully completed.

Note I’ve yet to run a single query against this table. And yet, if we look at the details of each of these columns:

SQL> select column_name, num_distinct, density, histogram from dba_tab_columns where table_name='PINK_FLOYD';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- ------------ ---------- ---------------
ID                        9705425          0 HYBRID
CODE                         4835     .00005 HYBRID
CREATE_DATE                 50357     .00002 HYBRID
NAME                            1 4.9639E-08 FREQUENCY

All the columns have a histogram !! This despite the columns not meeting either criteria normally required for a histogram, that the column be used in a SQL predicate AND for the column to have an uneven distribution of values.

None of these columns have yet to be used in a filtering predicate and none of these columns have a uneven distribution of values, even the CODE column as highlighted by looking at the minimum and maximum number of occurrences:

SQL> select min(code_count), max(code_count) from (select count(*) code_count from pink_floyd group by code);

MIN(CODE_COUNT) MAX(CODE_COUNT)
--------------- ---------------
           1845            2163

So it’s very odd for these histograms to be present.

If we run the following query with a filtering predicate based on the CODE column:

SQL> select * from pink_floyd where code=42;

2012 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1152280033

----------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            | 2068 | 82720 |    844 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD | 2068 | 82720 |    844 (11) | 00:00:01  |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      63655 consistent gets
      63645 physical reads
          0 redo size
      38575 bytes sent via SQL*Net to client
        360 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2012 rows processed

The CBO currently has no choice but to use a FTS with no index currently present. But what will Automatic Indexing make of things? If we look at the next automatic indexing report:

 

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                      : 2
Indexes created (visible / invisible) : 1 (0 / 1)
Space used (visible / invisible)      : 134.22 MB (0 B / 134.22 MB)
Indexes dropped                       : 0
SQL statements verified               : 1
SQL statements improved               : 0
SQL plan baselines created            : 0
Overall improvement factor            : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
----------------------------------------------------------------------------
| Owner | Table      | Index                  | Key  | Type   | Properties |
----------------------------------------------------------------------------
| BOWIE | PINK_FLOYD | * SYS_AI_dp2t0j12zux49 | CODE | B-TREE | NONE       |
----------------------------------------------------------------------------
-------------------------------------------------------------------------------

We notice that Oracle has created an Automatic Index, but it’s an INVISIBLE index !!

If we look at the details of this Automatic Index:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='PINK_FLOYD';

INDEX_NAME                AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_dp2t0j12zux49      YES NO  INVISIBLE ADVANCED LOW  VALID      10000000       15369           9845256

The index is in an INVISIBLE/VALID state, not the usual INVISIBLE/UNUSABLE state for an index for which Automatic Indexing decides an index is not efficient enough to be implement.

This is NOT expected behaviour.

Usually INVISIBLE/VALID indexes are created when Automatic Indexing is in “REPORT ONLY” mode, although I have come across this scenario when statistics are stale or missing. But in this case, Automatic Indexing is in “IMPLEMENT” mode and the table has recently collected statistics, albeit with odd histograms present (hence why I think these issues to be related).

If we run the same query again:

SQL> select * from pink_floyd where code=42;

2012 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1152280033

----------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            | 2068 | 82720 |    844 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD | 2068 | 82720 |    844 (11) | 00:00:01  |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      63655 consistent gets
      63645 physical reads
          0 redo size
      38575 bytes sent via SQL*Net to client
        360 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2012 rows processed

The CBO has again no option but to use the FTS as Invisible indexes can not be considered by the CBO. However, it’s important to note that such an index would not be used by the CBO anyways as it would be deemed too expensive to use than the current FTS.

If you’re relying on Automatic Indexing and have it in Implement mode, I would recommend checking for any indexes in this INVISIBLE/VALID state as they’re an indication that something has very likely gone wrong…

Oracle Database 19c Automatic Indexing: Index Compression Update (New Morning) January 27, 2021

Posted by Richard Foote in 19c, 19c New Features, Advanced Index Compression, Autonomous Database, Autonomous Transaction Processing, AUTO_INDEX_COMPRESSION, Exadata, Index Column Order, Index Compression, Oracle, Oracle Blog, Oracle General, Oracle Indexes, Oracle19c.
add a comment

 

I was reminded in a recent comment by Rajeshwaran Jeyabal that I hadn’t updated my post on Automatic Indexing with Advanced Compression that’s in need of a couple of amendments.

Initially when Automatic Indexing was released, the ability to set Advanced Compression was NOT included in the official documentation, although the EXEC DBMS_AUTO_INDEX.CONFIGURE( ‘AUTO_INDEX_COMPRESSION‘ , ‘ON’); option was readily accessible. This has now been fixed and the associated doco on setting Advanced Compression for Automatic Indexes can be found here.

The other significant change is that Advanced Compression Low is now the default behaviour when Automatic Indexes are created in the Oracle ATP Autonomous Database Cloud environment. This makes sense in that if you have access to the Advanced Compression option, setting all indexes to Advanced Compression Low is the no-brainer setting as I’ve discussed previously. So several of my more recent posts show how Automatic Indexes have been created with Advanced Compression Low set.

What hasn’t changed however is how Automatic Indexing does NOT consider the efficiency of an index in relation to Index Compression when deciding how to order the columns within the index.

The default order of columns within an index (when other SQL predicates are not a consideration) is simply the order by which the columns appear within the table. Even though an index could be significantly smaller thanks to Index Compression if columns with more repeated values are ordered first within an index, this is not something Automatic Indexing currently considers.

The demo in my original piece still works exactly the same in the current 19c database versions of the ATP Autonomous Cloud environments. Manually created indexes can be significantly smaller if index columns are reordered or dropped entirely if they don’t provide filtering benefits.

When reading my blog, please do take note of the date of blog piece, especially in relation to Automatic Indexing. Things are only accurate as at time of publication and may change subsequently.

I thank Rajeshwaran for getting me to pull my finger out and update my blog accordingly…

Oracle 19c Automatic Indexing: Non-Equality Predicates Part II (Let’s Spend The Night Together) January 21, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Non-Equality Predicates, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Performance Tuning.
1 comment so far

In my previous post in this series, I discussed out Automatic Indexing currently does not consider Non-Equality predicates. Automatic Indexing will index columns based only on Equality predicates.

So how does Oracle handle the scenario when an SQL has a mixture of both Equality and Non-Equality predicates?

I’ll begin by creating two very similar tables, but with the second table having a more selective CODE column:

SQL> create table pink_floyd (id number, code number, create_date date, name varchar2(42));

Table created.

SQL> insert into pink_floyd select rownum, ceil(dbms_random.value(0, 5000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon'
from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD');

PL/SQL procedure successfully completed.


SQL> create table pink_floyd1 (id number, code number, create_date date, name varchar2(42));

Table created.

SQL> insert into pink_floyd1 select rownum, ceil(dbms_random.value(0, 25000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon'
from dual connect by level <=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD1');

PL/SQL procedure successfully completed.

 

So table PINK_FLOYD has 5,000 distinct CODE values, whereas table PINK_FLOYD1 has 25,000 distinct CODE values.

I’ll next run the following identical SQLs, which both use an Equality predicate on the CODE column and a Non-Equality predicate on the CREATE_DATE column. The CODE column provides some filtering (more so with the PINK_FLOYD1 table) but in combination with the CREATE_DATE column, results in the ultimate filtering with no rows returned:

 

SQL> select * from pink_floyd where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1152280033

----------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            |    1 |    40 |    844 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD |    1 |    40 |    844 (11) | 00:00:01  |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)
     filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      63660 consistent gets
      63649 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        380 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed


SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 564520720

-----------------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             |    1 |    41 |    856 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD1 |    1 |    41 |    856 (11) | 00:00:01  |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00','syyyy-mm-dd hh24:mi:ss'))
     filter("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00','syyyy-mm-dd hh24:mi:ss'))

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      64424 consistent gets
      64413 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

So how does Automatic Indexing handle this scenario. If we look at the subsequent Automatic Indexing report (highlights only):

 

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
-----------------------------------------------------------------------------
| Owner | Table       | Index                | Key  | Type   | Properties   |
-----------------------------------------------------------------------------
| BOWIE | PINK_FLOYD1 | SYS_AI_96snkmu4sk44g | CODE | B-TREE | NONE         |
-----------------------------------------------------------------------------
-------------------------------------------------------------------------------


-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 7wag3gbk0b3tm
SQL Text            : select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021'
Improvement Factor  : 64442.3x

Execution Statistics:
-----------------------------
                      Original Plan                Auto Index Plan
                      ---------------------------- ----------------------------
Elapsed Time (s):     568513                       2771
CPU Time (s):         275534                       1874
Buffer Gets:          1031078                      406
Optimizer Cost:       856                          405
Disk Reads:           1030609                      3
Direct Writes:        0                            0
Rows Processed:       0                            0
Executions:           16                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 564520720

-----------------------------------------------------------------------------------
| Id | Operation                 | Name        | Rows | Bytes | Cost | Time       |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |             |      |       |  856 |            |
|  1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD1 |    1 |    41 |  856 | 00:00:01   |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 2703636439

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |    1 |    41 |  405 | 00:00:01   |
| * 1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1          |    1 |    41 |  405 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_96snkmu4sk44g |  403 |       |    3 | 00:00:01   |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 1 - filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
* 2 - access("CODE"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

If we look at the definitions of all indexes currently on these tables:

SQL> select index_name, auto, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='PINK_FLOYD';

INDEX_NAME                     AUT VISIBILIT COMPRESSION   STATUS   NUM_ROWS   LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_dp2t0j12zux49           YES INVISIBLE ADVANCED LOW  UNUSABLE   10000000       21702           4161898

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='PINK_FLOYD';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_dp2t0j12zux49           CODE                          1


SQL> select index_name, auto, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='PINK_FLOYD1';

INDEX_NAME                     AUT VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_96snkmu4sk44g           YES VISIBLE   ADVANCED LOW  VALID      10000000       15400           9969473

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='PINK_FLOYD1';

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_96snkmu4sk44g           CODE                          1

 

In both cases, Automatic Indexing only created an index on the CODE column, as it was the only column with an Equality predicate.

However, the Automatic Index on the table PINK_FLOYD remained in an INVISIBLE/UNUSABLE. That’s because an index on only the CODE column was not efficient enough to improve the performance of the SQL, due to the filtering not being sufficient enough and because of the relatively poor Clustering Factor.

The index on the table PINK_FLOYD1 was eventually created as a VISIBLE/VALID index, as its better filtering was sufficient to actually improve the performance of the SQL.

So if we re-run the first query:

SQL> select * from pink_floyd where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 1152280033

----------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            |    1 |    40 |    844 (11) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD |    1 |    40 |    844 (11) | 00:00:01  |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)
     filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-ddhh24:mi:ss') AND "CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      63660 consistent gets
      63649 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        380 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

It continues to use a Full Table Scan.

If we re-run the second query:

 

SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 2703636439

------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                 | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                      |    1 |    41 |     415 (0) | 00:00:01  |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1          |    1 |    41 |     415 (0) | 00:00:01  |
|* 2 | INDEX RANGE SCAN                    | SYS_AI_96snkmu4sk44g |  412 |       |       3 (0) | 00:00:01  |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
2 - access("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
        406 consistent gets
          0 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

If now uses the newly created Automatic Index, with an improved 406 Consistent Gets (down from the previous 64424 Consistent Gets with the FTS).

BUT if we were to manually create an index on BOTH CODE and CREATE_DATE columns:

SQL> create index pink_floyd1_code_create_date_i on pink_floyd1(code, create_date) compress advanced low;

Index created.

SQL> select * from pink_floyd1 where code=42 and create_date> '19-JAN-2021';

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3366491378

----------------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                           | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                                |    1 |    41 |       4 (0) | 00:00:01  |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | PINK_FLOYD1                    |    1 |    41 |       4 (0) | 00:00:01  |
|* 2 | INDEX RANGE SCAN                    | PINK_FLOYD1_CODE_CREATE_DATE_I |    1 |       |       3 (0) | 00:00:01  |
----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42 AND "CREATE_DATE">TO_DATE(' 2021-01-19 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND
"CREATE_DATE" IS NOT NULL)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          3 consistent gets
          0 physical reads
          0 redo size
        426 bytes sent via SQL*Net to client
        381 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

 

Performance improves significantly further, by reducing Consistent Gets down to just 3.

So if you have SQL statements with a mixture of both Equality and Non-Equality predicates, you may encounter these 2 scenarios:

A potentially efficient index that is not created at all as the filtering on just the Equality based predicates are not sufficient to create a viable index, or

A potentially suboptimal Automatic Index that doesn’t contain useful filtering columns because they’re used in Non-Equality predicates…

Announcement: Both Of My Oracle Webinars Scheduled For February 2021 !! January 19, 2021

Posted by Richard Foote in Oracle, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Indexing Internals Webinar, Oracle Performance Diagnostics and Tuning Webinar, Richard Foote Seminars.
add a comment

I’m please to announce that both of my highly acclaimed webinars scheduled for February are now confirmed.

These webinars are a must for any Oracle DBA or Developer interested in designing, developing or maintaining high performance, highly scalable Oracle-based applications or databases.

However only a few places are currently available on each webinar with numbers very strictly limited, as I only run small classes to give every attendee the opportunity to get the most from the training experience.

Webinar details are as follows:

 

8-12 February 2021 (5pm-9pm AEDT) – Oracle Indexing Internals and Best Practices Webinar  (International Customers Only)

This seminar examines most available Oracle index structures/options and discusses in considerable detail how indexes function, how/when they should be used and how they should be maintained. It covers many useful tips and strategies to maximise the benefits of indexes on application/database performance and scalability, as well as in maximising Oracle database investments. Much of the material is exclusive to this seminar and is not generally available in Oracle documentation or in Oracle University courses.

 

23-26 February 2021 (5pm-9pm AEDT) – Oracle Performance Diagnostics and Tuning Webinar
(International Customers Only)

The seminar will detail how to maximise the performance of both Oracle databases and associated applications and how to diagnose and address any performance issues as quickly and effectively as possible. It also provides opportunity to have your own AWR reports analysed to identify performance issues.

 

You can also purchase tickets to both webinars at a special combo discount:

Both “Oracle Indexing Internals and Best Practices” and “Oracle Performance Diagnostics and Tuning” Webinars running in February 2021:   (International Customers Only)

 

Each webinar is $1,800 (AUS) individually and $3,000 (AUS) when both purchased in the combo package (the Buy Now functionality is available for International Customers Only).

Please Note: If based in Australia, please contact me (at richard@richardfooteconsulting.com) for a tax invoice that includes GST and instructions on how to pay.

 

Please contact me directly if you wish to pay via invoice and direct bank transfer or if you have any questions.

 

For full content details of the “Oracle Indexing Internals and Best Practices” Webinar: https://richardfooteconsulting.com/indexing-seminar/

For full content details of the “Oracle Performance Diagnostics and Tuning” Webinar: https://richardfooteconsulting.com/performance-tuning-seminar/

 

Hopefully you can take advantage of the opportunity to participate in this unique training experience (see here for some testimonials)…

Oracle 19c Automatic Indexing: Non-Equality Predicates Part I (Lucy Can’t Dance) January 14, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, Exadata, Full Table Scans, Non-Equality Predicates, Oracle, Oracle Blog, Oracle Cloud, Oracle Indexes, Oracle19c.
4 comments

 

I’ve been waiting a while before posting a series on the various limitations associated with Automatic Indexing, in order to see how the feature matures over time.

The following have all been re-tested post 1 January 2021 on the Autonomous ATP Database Cloud service, using Oracle Database version 19.5.0.0.0.

In the Oracle Documentation (including version 21c), the only limitations with regard Automatic Indexing listed are the following:

  • Auto indexes are local B-tree indexes.
  • Auto indexes can be created for partitioned as well as non-partitioned tables.
  • Auto indexes cannot be created for temporary tables.

Well, as I discussed in the previous series on Automatic Indexing on Partitioned tables, we already saw how Oracle can actually also create Non-Partitioned (Global) indexes. So the limitation on Automatic Indexes being “local” indexes is not actually correct, even with 19c.

But are there other limitations that are not officially documented?

If you look at every example I’ve used previously with regard Automatic Indexing, they all feature Equality predicates. In the following examples, I’m going to run a series on Range Scan predicates that heavily filter and would benefit greatly from an index.

I first create a simple table with 10M rows:

SQL> create table ziggy1 (id number, code number, name varchar2(42));

Table created.

SQL> insert into ziggy1 select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY1');

PL/SQL procedure successfully completed.

 

I then run the following range scan queries several times that each return only a few rows:

SQL> select * from ziggy1 where id between 42 and 50;

9 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4062853157

------------------------------------------------------------------------------------
| Id | Operation                 | Name   | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |        |    8 |   184 |    538 (14) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | ZIGGY1 |    8 |   184 |    538 (14) | 00:00:01  |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ID"<=50 AND "ID">=42)
    filter("ID"<=50 AND "ID">=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39436 consistent gets
      39425 physical reads
          0 redo size
        596 bytes sent via SQL*Net to client
        369 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          9 rows processed


SQL> select * from ziggy1 where id < 0;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 4062853157

------------------------------------------------------------------------------------
| Id | Operation                 | Name   | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |        |    1 |    23 |    538 (14) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | ZIGGY1 |    1 |    23 |    538 (14) | 00:00:01  |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ID"<0)
    filter("ID"<0)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39436 consistent gets
      39425 physical reads
          0 redo size
        364 bytes sent via SQL*Net to client
        344 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

SQL> select * from ziggy1 where id > 100000000000;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 4062853157

------------------------------------------------------------------------------------
| Id | Operation                 | Name   | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |        |    1 |    23 |    538 (14) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | ZIGGY1 |    1 |    23 |    538 (14) | 00:00:01  |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ID">100000000000)
    filter("ID">100000000000)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39436 consistent gets
      39425 physical reads
          0 redo size
        364 bytes sent via SQL*Net to client
        355 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

If we look at the subsequent Automatic Indexing report:

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 13-JAN-2021 11:55:37
Activity end                : 13-JAN-2021 11:56:20
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates           : 0
Indexes created            : 0
Space used                 : 0 B
Indexes dropped            : 0
SQL statements verified    : 3
SQL statements improved    : 0
SQL plan baselines created : 0
Overall improvement factor : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

We notice NO Automatic Indexes were created.

We can run these queries endlessly and Automatic Indexing will never create associated Automatic Indexes:

SQL> select index_name, auto, constraint_index, visibility from user_indexes where table_name='ZIGGY1';

no rows selected

These queries are doomed to perform Full Table Scans unless indexes are manually created:

SQL> select * from ziggy1 where id between 42 and 50;

9 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4062853157

------------------------------------------------------------------------------------
| Id | Operation                 | Name   | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |        |    8 |   184 |    538 (14) | 00:00:01  |
|* 1 | TABLE ACCESS STORAGE FULL | ZIGGY1 |    8 |   184 |    538 (14) | 00:00:01  |
------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ID"<=50 AND "ID">=42)
    filter("ID"<=50 AND "ID">=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39436 consistent gets
      39425 physical reads
          0 redo size
        596 bytes sent via SQL*Net to client
        369 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          9 rows processed

 

Currently Automatic Indexes do not support Non-Equality predicates. Automatic Indexes are only created based on Equality-based predicates.

Obviously, Automatic Indexing is a fabulous feature and this might all change in the future. But with Non-Equality predicates so prevalent in SQL, it’s vital to note this current limitation when using and relying on Automatic Indexing…

Oracle 19c Automatic Indexing: Indexing Partitioned Tables Part II (Neighbourhood Threat) January 13, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Transaction Processing, CBO, Exadata, Local Indexes, Oracle, Oracle Cloud, Oracle General, Oracle Indexes, Oracle19c, Partitioned Indexes, Partitioning.
4 comments

In my first post on Automatic Indexing on Partitioned Tables, I discussed how Automatic Indexing (AI) can now create a Non-Partitioned index if deemed the most effective indexing structure (this wasn’t supported when AI was initially released). A Non-Partitioned index is indeed likely the most efficient indexing structure if the underlying table has many partitions and associated SQL equality predicates only reference non-partition key columns. A Non-Partitioned index ensure Oracle only needs to scan the single index structure and not all the partitions of a Local index.

But what if SQLs do reference the column by which the underlying table is partitioned?

The following SQL has an equality filtering predicate on the RELEASE_DATE column, the column by which the BIG_BOWIE1 table is partitioned:

SQL> SELECT * FROM big_bowie1 where release_date = to_date('2013-12-30 22:15:25', 'syyyy-mm-dd hh24:mi:ss');

no rows selected

If we look at the subsequent AI report:

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------
| Owner | Table      | Index                | Key          | Type   | Properties |
------------------------------------------------------------------------------------------------
| BOWIE | BIG_BOWIE1 | SYS_AI_14gpurjp8m76s | RELEASE_DATE | B-TREE | LOCAL      |
------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

We notice that Automatic Indexing has in this instance created a Local Index.

If we look further down the AI report:

-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 4mm3mbkk38pa8
SQL Text            : SELECT * FROM big_bowie1 where release_date = to_date('2013-12-30 22:15:25', 'syyyy-mm-dd hh24:mi:ss')
Improvement Factor  : 8339x

Execution Statistics:
-----------------------------
                  Original Plan                Auto Index Plan
                  ---------------------------- ----------------------------
Elapsed Time (s): 146957                       71
CPU Time (s):     146124                       71
Buffer Gets:      16678                        3
Optimizer Cost:   162                          4
Disk Reads:       0                            0
Direct Writes:    0                            0
Rows Processed:   0                            0
Executions:       2                            1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 4031749531

-----------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost | Time        |
-----------------------------------------------------------------------------------
| 0  | SELECT STATEMENT          |            |      |       |  162 |             |
| 1  | PARTITION RANGE SINGLE    |            | 3602 | 93652 |  162 | 00:00:01    |
| 2  | TABLE ACCESS STORAGE FULL | BIG_BOWIE1 | 3602 | 93652 |  162 | 00:00:01    |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 4049653350

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                                 | Name                 | Rows | Bytes | Cost | Time        |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                      |    3 |    78 |    4 | 00:00:01    |
|   1 | PARTITION RANGE SINGLE                    |                      |    3 |    78 |    4 | 00:00:01    |
|   2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_BOWIE1           |    3 |    78 |    4 | 00:00:01    |
| * 3 | INDEX RANGE SCAN                          | SYS_AI_14gpurjp8m76s |    1 |       |    3 | 00:00:01    |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 3 - access("RELEASE_DATE"=TO_DATE(' 2013-12-30 22:15:25', 'syyyy-mm-dd hh24:mi:ss'))

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We can see Automatic Indexing has created the index because it provides an average Improvement Factor of 8339x. As the necessary indexed column(s) matches the table partitioning key, it makes sense for the associated index be a Local index as Oracle is certain which specific index partition to visit based on the value of the equality predicate.

If we look at the details of this new AI:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BIG_BOWIE1';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_14gpurjp8m76s           YES NO  VISIBLE   ADVANCED LOW  N/A        20000000       30742          19941449
SYS_AI_8armv0hqq73fa           YES NO  VISIBLE   ADVANCED LOW  VALID      20000000       42697          19995451

SQL> select index_name, column_name, column_position from user_ind_columns where table_name='BIG_BOWIE1' order by index_name, column_position;

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
SYS_AI_14gpurjp8m76s           RELEASE_DATE                  1
SYS_AI_8armv0hqq73fa           TOTAL_SALES                   1

SQL> SELECT index_name, partitioning_type, partition_count, locality FROM user_part_indexes WHERE table_name = 'BIG_BOWIE1';

INDEX_NAME                     PARTITION PARTITION_COUNT LOCALI
------------------------------ --------- --------------- ------
SYS_AI_14gpurjp8m76s           RANGE                   8 LOCAL

SQL> select index_name, partition_name, status, compression from user_ind_partitions
where index_name in (select index_name from user_indexes where table_name='BIG_BOWIE1')
order by partition_position;

INDEX_NAME           PARTITION_NAME       STATUS   COMPRESSION
-------------------- -------------------- -------- -------------
SYS_AI_14gpurjp8m76s ALBUMS_2013          USABLE   ADVANCED LOW
SYS_AI_14gpurjp8m76s ALBUMS_2014          USABLE   ADVANCED LOW
SYS_AI_14gpurjp8m76s ALBUMS_2015          USABLE   ADVANCED LOW
SYS_AI_14gpurjp8m76s ALBUMS_2016          USABLE   ADVANCED LOW
SYS_AI_14gpurjp8m76s ALBUMS_2017          USABLE   ADVANCED LOW
SYS_AI_14gpurjp8m76s ALBUMS_2018          USABLE   ADVANCED LOW
SYS_AI_14gpurjp8m76s ALBUMS_2019          USABLE   ADVANCED LOW
SYS_AI_14gpurjp8m76s ALBUMS_2020          USABLE   ADVANCED LOW

 

We can see that indeed, a Visible, Usable, Local index was created by Automatic Indexing.

So depending on the column(s) within the index, Automatic Indexing can potentially create either a Local or Non-Partitioned index when indexing a partitioned table.

Oracle 19c Automatic Indexing: Currently Broken In Autonomous ATP Databases Part II (Fix You) January 12, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, Oracle, Oracle General, Oracle Indexes.
2 comments

 

Firstly, thank you for all those who contacted me regarding the issue with Oracle Automatic Indexing (AI) having stopped working within the Autonomous ATP Database Cloud service. It appears this issue was indeed widely spread and impacted numerous (if not all) Autonomous ATP Database Cloud services.

This was all possibly due to (unpublished) bug 32151108 that results in a STOPPED AI task status due to No Data Found exception.

The good news is that this all appears to now be fixed (at least in the Sydney Data Centre from at least 12th January 2021). I logged on for the first time in some weeks yesterday, where the AI tasks were again running successfully and creating the necessary indexes. As the cloud service had been stopped, upon restarting the database, the AI tasks had to address the significant backlog of SQL statements that had built up and needed to be actioned.

In the coming days, I’ll run a number of tests to confirm that all indeed is now well.

On Twitter Tanel Poder made the following pertinent comment:

The quality of support org & maintenance is even more important when consuming the entire DB platform as a service, with no access to OS, internals and patching yourself…

I couldn’t agree more.

Oracle 19c Automatic Indexing: Indexing Partitioned Tables Part I (Conversation Piece) October 14, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Index Access Path, Local Indexes, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Partitioned Indexes, Partitioning, Performance Tuning.
1 comment so far

In this little series, I’m going to discuss how Automatic Indexing works in relation to Partitioning.

I’ve discussed Indexing and Partitioning many times previously and how Oracle has various options when indexing a partitioned table:

  • Non-Partitioned Index
  • Globally Partitioned Index
  • Locally Partitioned Index

So the question(s) are how does Automatic Indexing handle scenarios with partitioned objects?

A very important point to make at the start is that based on my research, the answer has already changed significantly since Automatic Indexing was first released. So it’s important to understand that Automatic Indexing is an ever evolving capability, that will advance and improve as time goes on.

I’ll focus on how the feature currently works (as of Oracle Database 19.5), but will mention previously identified behaviour as a reference on how things can easily change.

In my first simple little example, I’m just going to create a range-partitioned table, partitioned based on RELEASE_DATE, with a partition for each year’s worth of data:

SQL> CREATE TABLE big_bowie1(id number, album_id number, country_id number, release_date date,
total_sales number) PARTITION BY RANGE (release_date)
(PARTITION ALBUMS_2013 VALUES LESS THAN (TO_DATE('01-JAN-2014', 'DD-MON-YYYY')),
PARTITION ALBUMS_2014 VALUES LESS THAN (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')),
PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')),
PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')),
PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')),
PARTITION ALBUMS_2018 VALUES LESS THAN (TO_DATE('01-JAN-2019', 'DD-MON-YYYY')),
PARTITION ALBUMS_2019 VALUES LESS THAN (TO_DATE('01-JAN-2020', 'DD-MON-YYYY')),
PARTITION ALBUMS_2020 VALUES LESS THAN (MAXVALUE));

Table created.

 

I’ll now add about 8 years worth of data:

SQL> INSERT INTO big_bowie1 SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> COMMIT;

Commit complete.

 

As discussed previously, I’ll importantly collect statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_BOWIE1');

PL/SQL procedure successfully completed.

 

I’ll now run the following very selective query based the TOTAL_SALES column that is NOT part of the partitioning key:

 

SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42;

19 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2468051548

---------------------------------------------------------------------------------------------------------
| Id | Operation                | Name       | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop     |
---------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |            |   20 |   520 |    643 (15)| 00:00:01 |       |           |
|  1 | PARTITION RANGE ALL      |            |   20 |   520 |    643 (15)| 00:00:01 |     1 |         8 |
|* 2 | TABLE ACCESS STORAGE FULL| BIG_BOWIE1 |   20 |   520 |    643 (15)| 00:00:01 |     1 |         8 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("TOTAL_SALES"=42)
    filter("TOTAL_SALES"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      44014 consistent gets
       9516 physical reads
          0 redo size
       1107 bytes sent via SQL*Net to client
        369 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         19 rows processed

 

Without an index in place, the CBO has no choice but to use a FTS. But what will Automatic Indexing make of things?

If we look at the next Automatic Indexing report:

 

SQL> select dbms_auto_index.report_last_activity() from dual;

GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 13-OCT-2020 01:47:48
Activity end                : 13-OCT-2020 02:59:48
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 184.55 MB (184.55 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 2
SQL statements improved (improvement factor) : 1 (44119.6x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 25135.8x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner | Table      | Index                | Key         | Type   | Properties |
---------------------------------------------------------------------------------
| BOWIE | BIG_BOWIE1 | SYS_AI_2zt7rg40mxa4n | TOTAL_SALES | B-TREE | NONE       |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : chwm2gubm8fx9
SQL Text            : SELECT * FROM big_bowie1 WHERE total_sales = 42
Improvement Factor  : 44119.6x

Execution Statistics:
-----------------------------
                     Original Plan                Auto Index Plan
                     ---------------------------- ----------------------------
Elapsed Time (s):    4387193                      1173
CPU Time (s):        2599423                      1037
Buffer Gets:         749507                       22
Optimizer Cost:      643                          22
Disk Reads:          470976                       2
Direct Writes:       0                            0
Rows Processed:      323                          19
Executions:          17                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 2468051548

-----------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows | Bytes | Cost | Time        |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |            |      |       |  643 |             |
|  1 | PARTITION RANGE ALL       |            |   20 |   520 |  643 | 00:00:01    |
|  2 | TABLE ACCESS STORAGE FULL | BIG_BOWIE1 |   20 |   520 |  643 | 00:00:01    |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 937174207

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                 | Rows | Bytes | Cost | Time       |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                      |   19 |   494 |   22 | 00:00:01   |
|   1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE1           |   19 |   494 |   22 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                           | SYS_AI_2zt7rg40mxa4n |   19 |       |    3 | 00:00:01   |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("TOTAL_SALES"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We notice a couple of interesting points.

Firstly, yes Automatic Indexing has created an index based on the TOTAL_SALES column (SYS_AI_2zt7rg40mxa4n) as it improves performance by a reported 44119.6x.

Note also that the Automatic Index is a Non-Partitioned (Global) Index. From a performance perspective, this is the most efficient index to create to improve the performance of this query as the CBO only has the one index structure to navigate (vs. a LOCAL index that would require having to navigate down all 8 index structures for each table partition.

If we look at the index details:

SQL> SELECT index_name, partitioned, auto, visibility, status FROM user_indexes
WHERE table_name = 'BIG_BOWIE1';

INDEX_NAME                     PAR AUT VISIBILIT STATUS
------------------------------ --- --- --------- --------
SYS_AI_2zt7rg40mxa4n           NO  YES VISIBLE   VALID

 

We notice that this is indeed a Non-Partitioned Index, that is both VISIBLE and VALID and so can be potentially used by any database session.

If we now re-run the query:

SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42;

19 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 937174207

-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                                 | Name                 | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop    |
-----------------------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                          |                      |   20 |   520 |      23 (0)| 00:00:01 |       |          |
|  1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| BIG_BOWIE1           |   20 |   520 |      23 (0)| 00:00:01 | ROWID | ROWID    |
|* 2 | INDEX RANGE SCAN                          | SYS_AI_2zt7rg40mxa4n |   20 |       |       3 (0)| 00:00:01 |       |          |
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("TOTAL_SALES"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         23 consistent gets
          0 physical reads
          0 redo size
       1166 bytes sent via SQL*Net to client
        369 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         19 rows processed

 

We can see the query now uses the newly created Automatic Index and is indeed more efficient, performing now just 23 consistent gets (previously 44014 consistent gets).

 

However, this was NOT previous behaviour.

The documentation previously mentioned that only LOCAL indexes are used when indexing partitioned tables.

If we run the same demo on Oracle Database 19.3, we get the following report:

 

GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 14-OCT-2020 13:12:07
Activity end                : 14-OCT-2020 14:24:07
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 192.94 MB (192.94 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 1
SQL statements improved (improvement factor) : 1 (1950.5x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 1950.5x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------------
| Owner | Table      | Index                | Key         | Type   | Properties |
---------------------------------------------------------------------------------
| BOWIE | BIG_BOWIE1 | SYS_AI_8armv0hqq73fa | TOTAL_SALES | B-TREE | LOCAL      |
---------------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 2pp8ypramw30s
SQL Text            : SELECT * FROM big_bowie1 WHERE total_sales = 42
Improvement Factor  : 1950.5x

Execution Statistics:
-----------------------------
                     Original Plan                Auto Index Plan
                     ---------------------------- ----------------------------
Elapsed Time (s):    6996973                      27327
CPU Time (s):        6704215                      12819
Buffer Gets:         815306                       49
Optimizer Cost:      12793                        28
Disk Reads:          2                            40
Direct Writes:       0                            0
Rows Processed:      475                          25
Executions:          19                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 4294056405

-----------------------------------------------------------------------------
| Id | Operation          | Name       | Rows | Bytes | Cost  | Time        |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT    |            |      |       | 12793 |             |
| 1 | PARTITION RANGE ALL |            |   20 |   520 | 12793 | 00:00:01    |
| 2 | TABLE ACCESS FULL   | BIG_BOWIE1 |   20 |   520 | 12793 | 00:00:01    |
-----------------------------------------------------------------------------

- With Auto Indexes
-----------------------------
Plan Hash Value : 3781269341

--------------------------------------------------------------------------------------------------------------
|  Id | Operation                                 | Name                 | Rows | Bytes | Cost | Time        |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                          |                      |   25 |   650 |   28 | 00:00:01    |
|   1 | PARTITION RANGE ALL                       |                      |   25 |   650 |   28 | 00:00:01    |
|   2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_BOWIE1           |   25 |   650 |   28 | 00:00:01    |
| * 3 | INDEX RANGE SCAN                          | SYS_AI_8armv0hqq73fa |   25 |       |   17 | 00:00:01    |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 3 - access("TOTAL_SALES"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

As we can see, in this scenario, the newly created Automatic Index has a “Property” of LOCAL.

If we look at its index details:

 

SQL> SELECT index_name, partitioned, auto, visibility, status FROM user_indexes
WHERE table_name = 'BIG_BOWIE1';

INDEX_NAME                     PAR AUT VISIBILIT STATUS
------------------------------ --- --- --------- --------
SYS_AI_8armv0hqq73fa           YES YES VISIBLE   N/A

SQL> SELECT index_name, partitioning_type, partition_count, locality FROM user_part_indexes
WHERE table_name = 'BIG_BOWIE1';

INDEX_NAME                     PARTITION PARTITION_COUNT LOCALI
------------------------------ --------- --------------- ------
SYS_AI_8armv0hqq73fa           RANGE                   8 LOCAL

 

We can see how a Local Index was previously created.

As such if we re-run an equivalent query:

SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42;

25 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3781269341

-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                                | Name                 | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop     |
-----------------------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                         |                      |   20 |   520 |      26 (0)| 00:00:01 |       |           |
|  1 | PARTITION RANGE ALL                      |                      |   20 |   520 |      26 (0)| 00:00:01 |     1 |         8 |
|  2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| BIG_BOWIE1           |   20 |   520 |      26 (0)| 00:00:01 |     1 |         8 |
|* 3 | INDEX RANGE SCAN                         | SYS_AI_8armv0hqq73fa |   20 |       |      17 (0)| 00:00:01 |     1 |         8 |
-----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("TOTAL_SALES"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         50 consistent gets
          0 physical reads
          0 redo size
       1555 bytes sent via SQL*Net to client
        409 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         25 rows processed

 

Although the query is returning 6 more rows (as with the random number generation, has a slightly different data set), it’s more expensive proportionally now having to perform 50 consistent gets as it now has to read 8 index structures rather than just the one.

So (IMHO), Automatic Indexing has improved here, creating a more efficient index structure than previously. So always bear in mind that Automatic Indexing is an evolving beast, improving and adapting as time moves on.

However, note the compromise here is that by having an effectively Global index structure, there may be some additional issues depending on any subsequent structural changes to the table.

More on Automatic Indexing and Partitioning in my next post…

Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part III (Do Anything You Say) October 8, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Performance Tuning, Stale Statistics.
2 comments

In Part I of this series, we saw how Automatic Indexing will not create a viable Automatic Index if there are stale or missing statistics on the underlining segments. In Part II we saw how these SQL statements effectively become blacklisted and when segment statistics are subsequently collected, Automatic Indexing will still not create viable Automatic Indexes when the SQL statements are re-run.

So how do we get Automatic Indexing to now kick in and create necessary indexes on these problematic SQLs?

As I’ve discussed previously in relation to blacklisted SQLs, we need to run a NEW SQL statement that hasn’t been blacklist that will result in a necessary index to be created. An easy way to do this is just to include a new comment within the previous SQL to give the SQL a new signature.

If we now run the following “new” SQL statement (identical to the problematic SQL but with a comment embedded):

SQL> select /* new */ * from bowie_stale where code=42;

        ID       CODE NAME
---------- ---------- ------------------------------------------
   1000041         42 David Bowie
   6000041         42 David Bowie
        41         42 David Bowie
   3000041         42 David Bowie
   7000041         42 David Bowie
   8000041         42 David Bowie
   4000041         42 David Bowie
   9000041         42 David Bowie
   5000041         42 David Bowie
   2000041         42 David Bowie

 

If we now wait to see what the next Automatic Indexing task makes of things:

 

SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 07-JUL-2020 06:34:49
Activity end                : 07-JUL-2020 06:35:54
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 0
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 142.61 MB (142.61 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 1
SQL statements improved (improvement factor) : 1 (19787.7x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 19787.7x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
1. The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
---------------------------------------------------------------------------
| Owner | Table       | Index                | Key  | Type   | Properties |
---------------------------------------------------------------------------
| BOWIE | BOWIE_STALE | SYS_AI_300kk2unp8tr0 | CODE | B-TREE | NONE       |
---------------------------------------------------------------------------
-------------------------------------------------------------------------------

 

We see that the index on the CODE column (SYS_AI_300kk2unp8tr0) has now been created.

Further down the report:

 

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : du6psd0xmzpg5
SQL Text            : select /* new */ * from bowie_stale where code=42
Improvement Factor  : 19787.7x

Execution Statistics:
-----------------------------
                  Original Plan Auto           Index Plan
                  ---------------------------- ----------------------------
Elapsed Time (s): 137261                       2620
CPU Time (s):     84621                        1769
Buffer Gets:      277028                       13
Optimizer Cost:   544                          13
Disk Reads:       275947                       2
Direct Writes:    0                            0
Rows Processed:   70                           10
Executions:       7                            1

 

A new index was indeed created because of this new SQL statement, with a performance improvement of 19787.7x.

Further down the report to the Plans Section:

 

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 65903426

-----------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost | Time        |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT          |             |      |       |  544 |             |
| 1 | TABLE ACCESS STORAGE FULL | BOWIE_STALE |   10 |   230 |  544 | 00:00:01    |
-----------------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 2558864466

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   10 |   230 |   13 | 00:00:01   |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_STALE          |   10 |   230 |   13 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_300kk2unp8tr0 |   10 |       |    3 | 00:00:01   |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We can see that the new plan using the new Automatic Index with a much lower CBO cost.

If we now look at the status of this index:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_STALE';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_STALE_PK                 NO  YES VISIBLE   DISABLED      VALID      10000000       20164             59110
SYS_AI_300kk2unp8tr0           YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       16891          10000000

 

We see that the index is now both VISIBLE and VALID (previously, it was INVISIBLE and UNUSABLE).

As such, the Automatic Index can now potentially be used by any SQL, including the previous problematic query.

So with a viable index now in place, if we re-run the initial problematic query:

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2558864466

------------------------------------------------------------------------------------------------------------
| Id | Operation                          | Name                 | Rows | Bytes | Cost (%CPU)| Time        |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                   |                      |   10 |   230 |      14 (0)| 00:00:01    |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_STALE          |   10 |   230 |      14 (0)| 00:00:01    |
|* 2 | INDEX RANGE SCAN                   | SYS_AI_300kk2unp8tr0 |   10 |       |       3 (0)| 00:00:01    |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         14 consistent gets
          0 physical reads
          0 redo size
        738 bytes sent via SQL*Net to client
        361 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

We see that finally, the SQL uses the new Automatic Index and is indeed much more efficient as a result, with just 14 consistent gets required (when previously it was 39430 consistent gets).

So if ever you come across the scenario where an SQL does not have an Automatic Index created when clearly it should, it could be that it has been blacklisted and needs a different SQL to actually generate the necessary index.

To avoid some of these issues, make sure you do not have stale or missing statistics when reliant on Automatic Indexing. The new High Frequency Statistics Collection capability to designed to specifically avoid such a scenario.

Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part II (Survive) October 7, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Internals, Index statistics, Oracle, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c, Performance Tuning, Stale Statistics.
1 comment so far

 

 

In my previous post, I discussed how having stale statistics, usually a bad idea, is especially problematic with regard Automatic Indexes as it usually results in viable automatic indexes only being created in an UNUSABLE/INVISIBLE state.

If we were to now to collect the missing statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_STALE');

PL/SQL procedure successfully completed.

SQL> select table_name, num_rows, blocks, last_analyzed from user_tables
where table_name='BOWIE_STALE';

TABLE_NAME             NUM_ROWS     BLOCKS LAST_ANAL
-------------------- ---------- ---------- ---------
BOWIE_STALE            10000000      39677 06-JUL-20

SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols
where table_name='BOWIE_STALE';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       LAST_ANAL
-------------------- ------------ ---------- --------------- ---------
ID                       10000000          0 HYBRID          06-JUL-20
CODE                       971092    .000001 HYBRID          06-JUL-20
NAME                            1 4.9416E-08 FREQUENCY       06-JUL-20

 

If we now repeatedly re-run the problematic query many times:

 

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 65903426

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)|  Time       |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |   10 |   230 |    544 (14)|  00:00:01   |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |   10 |   230 |    544 (14)|  00:00:01   |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39430 consistent gets
      39421 physical reads
          0 redo size
        610 bytes sent via SQL*Net to client
        361 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

 

The CBO is forced to use the FTS as the current Automatic Index is in an UNUSABLE/INVISIBLE state.

If we wait for the next Automatic Indexing reporting period:

 

SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 06-JUL-2020 05:12:42
Activity end                : 06-JUL-2020 05:13:34
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates           : 0
Indexes created            : 0
Space used                 : 0 B
Indexes dropped            : 0
SQL statements verified    : 0
SQL statements improved    : 0
SQL plan baselines created : 0
Overall improvement factor : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

 

We notice that the Automatic Indexing process has nothing to report. Even though the problematic query is repeatedly executed, the SQL is now effectively on a blacklist and is not re-considered by the Automatic Indexing process.

If we look at the index details on the table:

 

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_STALE';

INDEX_NAME             AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_STALE_PK         NO  YES VISIBLE   DISABLED      VALID      10000000       20164             59110
SYS_AI_300kk2unp8tr0   YES NO  INVISIBLE ADVANCED LOW  UNUSABLE   10000000       23058           4147514 

 

So the Automatic Index (SYS_AI_300kk2unp8tr0) is still UNUSABLE and INVISIBLE and can not be used by the CBO.

NOTE: In earlier patches of Oracle Database 19c (I’m using version 19.5.0.0.0 in this demo), I identified some scenarios after stale statistics when indexes were created in but in a VALID/INVISIBLE state, such that they could still not be used by the CBO in general database sessions.

If we simply re-run the same queries again from the time when the dependant object statistics were stale, any SQL is just ignored by the Automatic Indexing process.

As such, if we now subsequently re-run the problematic query again:

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 65903426

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)| Time        |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |   10 |   230 |    544 (14)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |   10 |   230 |    544 (14)| 00:00:01    |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      39430 consistent gets
      39421 physical reads
          0 redo size
        610 bytes sent via SQL*Net to client
        361 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

Again, the CBO has no choice here with no viable VALID/VISIBLE index present but to perform a FTS, even though its getting the cardinality estimates spot on since statistics gathering.

 

In Part III I’ll discuss how to get this query to finally use the Automatic Index and improve its performance, although if you’re a regular reader of the blog you should already know the solution…