Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part III (Do Anything You Say) October 8, 2020
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Performance Tuning, Stale Statistics.2 comments
In Part I of this series, we saw how Automatic Indexing will not create a viable Automatic Index if there are stale or missing statistics on the underlining segments. In Part II we saw how these SQL statements effectively become blacklisted and when segment statistics are subsequently collected, Automatic Indexing will still not create viable Automatic Indexes when the SQL statements are re-run.
So how do we get Automatic Indexing to now kick in and create necessary indexes on these problematic SQLs?
As I’ve discussed previously in relation to blacklisted SQLs, we need to run a NEW SQL statement that hasn’t been blacklist that will result in a necessary index to be created. An easy way to do this is just to include a new comment within the previous SQL to give the SQL a new signature.
If we now run the following “new” SQL statement (identical to the problematic SQL but with a comment embedded):
SQL> select /* new */ * from bowie_stale where code=42; ID CODE NAME ---------- ---------- ------------------------------------------ 1000041 42 David Bowie 6000041 42 David Bowie 41 42 David Bowie 3000041 42 David Bowie 7000041 42 David Bowie 8000041 42 David Bowie 4000041 42 David Bowie 9000041 42 David Bowie 5000041 42 David Bowie 2000041 42 David Bowie
If we now wait to see what the next Automatic Indexing task makes of things:
SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual; REPORT -------------------------------------------------------------------------------- GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 07-JUL-2020 06:34:49 Activity end : 07-JUL-2020 06:35:54 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 0 Indexes created (visible / invisible) : 1 (1 / 0) Space used (visible / invisible) : 142.61 MB (142.61 MB / 0 B) Indexes dropped : 0 SQL statements verified : 1 SQL statements improved (improvement factor) : 1 (19787.7x) SQL plan baselines created : 0 Overall improvement factor : 19787.7x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0 ------------------------------------------------------------------------------- INDEX DETAILS ------------------------------------------------------------------------------- 1. The following indexes were created: *: invisible ------------------------------------------------------------------------------- --------------------------------------------------------------------------- | Owner | Table | Index | Key | Type | Properties | --------------------------------------------------------------------------- | BOWIE | BOWIE_STALE | SYS_AI_300kk2unp8tr0 | CODE | B-TREE | NONE | --------------------------------------------------------------------------- -------------------------------------------------------------------------------
We see that the index on the CODE column (SYS_AI_300kk2unp8tr0) has now been created.
Further down the report:
VERIFICATION DETAILS ------------------------------------------------------------------------------- The performance of the following statements improved: ------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : du6psd0xmzpg5 SQL Text : select /* new */ * from bowie_stale where code=42 Improvement Factor : 19787.7x Execution Statistics: ----------------------------- Original Plan Auto Index Plan ---------------------------- ---------------------------- Elapsed Time (s): 137261 2620 CPU Time (s): 84621 1769 Buffer Gets: 277028 13 Optimizer Cost: 544 13 Disk Reads: 275947 2 Direct Writes: 0 0 Rows Processed: 70 10 Executions: 7 1
A new index was indeed created because of this new SQL statement, with a performance improvement of 19787.7x.
Further down the report to the Plans Section:
PLANS SECTION --------------------------------------------------------------------------------------------- - Original ----------------------------- Plan Hash Value : 65903426 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 544 | | | 1 | TABLE ACCESS STORAGE FULL | BOWIE_STALE | 10 | 230 | 544 | 00:00:01 | ----------------------------------------------------------------------------------- Notes ----- - dop = 1 - px_in_memory_imc = no - px_in_memory = no - With Auto Indexes ----------------------------- Plan Hash Value : 2558864466 ------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | ------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 230 | 13 | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_STALE | 10 | 230 | 13 | 00:00:01 | | * 2 | INDEX RANGE SCAN | SYS_AI_300kk2unp8tr0 | 10 | | 3 | 00:00:01 | ------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ------------------------------------------ * 2 - access("CODE"=42) Notes ----- - Dynamic sampling used for this statement ( level = 11 )
We can see that the new plan using the new Automatic Index with a much lower CBO cost.
If we now look at the status of this index:
SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='BOWIE_STALE'; INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ------------------------------ --- --- --------- ------------- -------- ---------- ----------- ----------------- BOWIE_STALE_PK NO YES VISIBLE DISABLED VALID 10000000 20164 59110 SYS_AI_300kk2unp8tr0 YES NO VISIBLE ADVANCED LOW VALID 10000000 16891 10000000
We see that the index is now both VISIBLE and VALID (previously, it was INVISIBLE and UNUSABLE).
As such, the Automatic Index can now potentially be used by any SQL, including the previous problematic query.
So with a viable index now in place, if we re-run the initial problematic query:
SQL> select * from bowie_stale where code=42; 10 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2558864466 ------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 10 | 230 | 14 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_STALE | 10 | 230 | 14 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | SYS_AI_300kk2unp8tr0 | 10 | | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("CODE"=42) Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 14 consistent gets 0 physical reads 0 redo size 738 bytes sent via SQL*Net to client 361 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
We see that finally, the SQL uses the new Automatic Index and is indeed much more efficient as a result, with just 14 consistent gets required (when previously it was 39430 consistent gets).
So if ever you come across the scenario where an SQL does not have an Automatic Index created when clearly it should, it could be that it has been blacklisted and needs a different SQL to actually generate the necessary index.
To avoid some of these issues, make sure you do not have stale or missing statistics when reliant on Automatic Indexing. The new High Frequency Statistics Collection capability to designed to specifically avoid such a scenario.
Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part II (Survive) October 7, 2020
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Internals, Index statistics, Oracle, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c, Performance Tuning, Stale Statistics.1 comment so far
In my previous post, I discussed how having stale statistics, usually a bad idea, is especially problematic with regard Automatic Indexes as it usually results in viable automatic indexes only being created in an UNUSABLE/INVISIBLE state.
If we were to now to collect the missing statistics:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_STALE'); PL/SQL procedure successfully completed. SQL> select table_name, num_rows, blocks, last_analyzed from user_tables where table_name='BOWIE_STALE'; TABLE_NAME NUM_ROWS BLOCKS LAST_ANAL -------------------- ---------- ---------- --------- BOWIE_STALE 10000000 39677 06-JUL-20 SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols where table_name='BOWIE_STALE'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM LAST_ANAL -------------------- ------------ ---------- --------------- --------- ID 10000000 0 HYBRID 06-JUL-20 CODE 971092 .000001 HYBRID 06-JUL-20 NAME 1 4.9416E-08 FREQUENCY 06-JUL-20
If we now repeatedly re-run the problematic query many times:
SQL> select * from bowie_stale where code=42; 10 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 65903426 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 230 | 544 (14)| 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE | 10 | 230 | 544 (14)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("CODE"=42) filter("CODE"=42) Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 39430 consistent gets 39421 physical reads 0 redo size 610 bytes sent via SQL*Net to client 361 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
The CBO is forced to use the FTS as the current Automatic Index is in an UNUSABLE/INVISIBLE state.
If we wait for the next Automatic Indexing reporting period:
SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual; REPORT -------------------------------------------------------------------------------- GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 06-JUL-2020 05:12:42 Activity end : 06-JUL-2020 05:13:34 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 0 Indexes created : 0 Space used : 0 B Indexes dropped : 0 SQL statements verified : 0 SQL statements improved : 0 SQL plan baselines created : 0 Overall improvement factor : 0x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0 -------------------------------------------------------------------------------
We notice that the Automatic Indexing process has nothing to report. Even though the problematic query is repeatedly executed, the SQL is now effectively on a blacklist and is not re-considered by the Automatic Indexing process.
If we look at the index details on the table:
SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='BOWIE_STALE'; INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ---------------------- --- --- --------- ------------- -------- ---------- ----------- ----------------- BOWIE_STALE_PK NO YES VISIBLE DISABLED VALID 10000000 20164 59110 SYS_AI_300kk2unp8tr0 YES NO INVISIBLE ADVANCED LOW UNUSABLE 10000000 23058 4147514
So the Automatic Index (SYS_AI_300kk2unp8tr0) is still UNUSABLE and INVISIBLE and can not be used by the CBO.
NOTE: In earlier patches of Oracle Database 19c (I’m using version 19.5.0.0.0 in this demo), I identified some scenarios after stale statistics when indexes were created in but in a VALID/INVISIBLE state, such that they could still not be used by the CBO in general database sessions.
If we simply re-run the same queries again from the time when the dependant object statistics were stale, any SQL is just ignored by the Automatic Indexing process.
As such, if we now subsequently re-run the problematic query again:
SQL> select * from bowie_stale where code=42; 10 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 65903426 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 230 | 544 (14)| 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE | 10 | 230 | 544 (14)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("CODE"=42) filter("CODE"=42) Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 39430 consistent gets 39421 physical reads 0 redo size 610 bytes sent via SQL*Net to client 361 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
Again, the CBO has no choice here with no viable VALID/VISIBLE index present but to perform a FTS, even though its getting the cardinality estimates spot on since statistics gathering.
In Part III I’ll discuss how to get this query to finally use the Automatic Index and improve its performance, although if you’re a regular reader of the blog you should already know the solution…
Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part I (Dead Against It) October 6, 2020
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Exadata X8, Full Table Scans, High Frequency Statistics Collection, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning, Stale Statistics, Unusable Indexes.5 comments
A “golden rule” when working with Automatic Indexing is that things don’t work properly if there are stale statistics on the dependant objects. Stale statistics can of course be problematic but they can be particularly troublesome when dealing with Automatic Indexing.
In the Oracle Autonomous Database environments, this issue is addressed somewhat by the new High Frequency Statistics Collection capability, which helps to automatically collect stale statistics on a regular basis. However, in on-prem Exadata environments where this can more easily be turned off or collected less frequently, it’s a potential issue worth consideration.
I’ll start with a simple little table, with a CODE column that has lots of distinct values:
SQL> create table bowie_stale (id number constraint bowie_stale_pk primary key, code number, name varchar2(42)); Table created. SQL> insert into bowie_stale select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000; 10000000 rows created. SQL> commit; Commit complete.
Importantly, I don’t collect statistics on this newly populated table…
SQL> select table_name, num_rows, blocks, last_analyzed from user_tables where table_name='BOWIE_STALE'; TABLE_NAME NUM_ROWS BLOCKS LAST_ANAL --------------- ---------- ---------- --------- BOWIE_STALE SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols where table_name='BOWIE_STALE'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM LAST_ANAL -------------------- ------------ ---------- --------------- --------- ID NONE CODE NONE NAME NONE
If we now run the following query a number of times while there are no statistics on the table:
SQL> select * from bowie_stale where code=42; 10 rows selected. Execution Plan ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 437 | 21413 | 553 (16)| 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE | 437 | 21413 | 553 (16)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("CODE"=42) filter("CODE"=42) Note ----- - dynamic statistics used: dynamic sampling (level=2) - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 6 recursive calls 0 db block gets 39026 consistent gets 0 physical reads 0 redo size 867 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
The CBO has no choice but to use a FTS as I don’t yet have an index on the CODE column.
If I now wait for the next Automatic Indexing task to kick in AND if there are still NO statistics on the table:
SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual; REPORT -------------------------------------------------------------------------------- GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 05-JUL-2020 06:36:31 Activity end : 05-JUL-2020 06:37:07 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 1 Indexes created : 0 Space used : 0 B Indexes dropped : 0 SQL statements verified : 0 SQL statements improved : 0 SQL plan baselines created : 0 Overall improvement factor : 0x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0
You can see that there was the one index candidate BUT no Automatic Index appears to have been created.
Assuming there are still no statistics:
SQL> select table_name, num_rows, blocks, last_analyzed from user_tables where table_name='BOWIE_STALE'; TABLE_NAME NUM_ROWS BLOCKS LAST_ANAL ------------------------------ ---------- ---------- --------- BOWIE_STALE SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols where table_name='BOWIE_STALE2'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM LAST_ANAL -------------------- ------------ ---------- --------------- --------- ID NONE CODE NONE NAME NONE
If we look now at what indexes exist on the table:
SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='BOWIE_STALE'; INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ------------------------------ --- --- --------- ------------- -------- ---------- ----------- ----------------- BOWIE_STALE_PK NO YES VISIBLE DISABLED VALID SYS_AI_300kk2unp8tr0 YES NO INVISIBLE DISABLED UNUSABLE 0 0 0 SQL> select index_name, column_name, column_position from user_ind_columns where table_name='BOWIE_STALE2' order by index_name, column_position; INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------------------------ -------------------- --------------- BOWIE_STALE_PK ID 1 SYS_AI_300kk2unp8tr0 CODE 1
We notice there is now an Automatic Index BUT it remains in an UNUSABLE/INVISIBLE state. This means the index can’t be used by the CBO.
So if we now re-run the SQL query again:
SQL> select * from bowie_stale where code=42; 10 rows selected. Execution Plan ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 437 | 21413 | 553 (16)| 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE | 437 | 21413 | 553 (16)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("CODE"=42) filter("CODE"=42) Note ----- - dynamic statistics used: dynamic sampling (level=2) - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 6 recursive calls 0 db block gets 39026 consistent gets 0 physical reads 0 redo size 867 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
The CBO has no choice still but to use the FTS.
In Part II, we’ll see that once we get into this scenario, it can be a tad problematic to get ourselves out of it and get the Automatic Index created as we would like…