Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part I (Dead Against It) October 6, 2020
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Exadata X8, Full Table Scans, High Frequency Statistics Collection, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning, Stale Statistics, Unusable Indexes.trackback
A “golden rule” when working with Automatic Indexing is that things don’t work properly if there are stale statistics on the dependant objects. Stale statistics can of course be problematic but they can be particularly troublesome when dealing with Automatic Indexing.
In the Oracle Autonomous Database environments, this issue is addressed somewhat by the new High Frequency Statistics Collection capability, which helps to automatically collect stale statistics on a regular basis. However, in on-prem Exadata environments where this can more easily be turned off or collected less frequently, it’s a potential issue worth consideration.
I’ll start with a simple little table, with a CODE column that has lots of distinct values:
SQL> create table bowie_stale (id number constraint bowie_stale_pk primary key, code number, name varchar2(42)); Table created. SQL> insert into bowie_stale select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000; 10000000 rows created. SQL> commit; Commit complete.
Importantly, I don’t collect statistics on this newly populated table…
SQL> select table_name, num_rows, blocks, last_analyzed from user_tables where table_name='BOWIE_STALE'; TABLE_NAME NUM_ROWS BLOCKS LAST_ANAL --------------- ---------- ---------- --------- BOWIE_STALE SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols where table_name='BOWIE_STALE'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM LAST_ANAL -------------------- ------------ ---------- --------------- --------- ID NONE CODE NONE NAME NONE
If we now run the following query a number of times while there are no statistics on the table:
SQL> select * from bowie_stale where code=42; 10 rows selected. Execution Plan ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 437 | 21413 | 553 (16)| 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE | 437 | 21413 | 553 (16)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("CODE"=42) filter("CODE"=42) Note ----- - dynamic statistics used: dynamic sampling (level=2) - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 6 recursive calls 0 db block gets 39026 consistent gets 0 physical reads 0 redo size 867 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
The CBO has no choice but to use a FTS as I don’t yet have an index on the CODE column.
If I now wait for the next Automatic Indexing task to kick in AND if there are still NO statistics on the table:
SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual; REPORT -------------------------------------------------------------------------------- GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 05-JUL-2020 06:36:31 Activity end : 05-JUL-2020 06:37:07 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 1 Indexes created : 0 Space used : 0 B Indexes dropped : 0 SQL statements verified : 0 SQL statements improved : 0 SQL plan baselines created : 0 Overall improvement factor : 0x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0
You can see that there was the one index candidate BUT no Automatic Index appears to have been created.
Assuming there are still no statistics:
SQL> select table_name, num_rows, blocks, last_analyzed from user_tables where table_name='BOWIE_STALE'; TABLE_NAME NUM_ROWS BLOCKS LAST_ANAL ------------------------------ ---------- ---------- --------- BOWIE_STALE SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols where table_name='BOWIE_STALE2'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM LAST_ANAL -------------------- ------------ ---------- --------------- --------- ID NONE CODE NONE NAME NONE
If we look now at what indexes exist on the table:
SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='BOWIE_STALE'; INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ------------------------------ --- --- --------- ------------- -------- ---------- ----------- ----------------- BOWIE_STALE_PK NO YES VISIBLE DISABLED VALID SYS_AI_300kk2unp8tr0 YES NO INVISIBLE DISABLED UNUSABLE 0 0 0 SQL> select index_name, column_name, column_position from user_ind_columns where table_name='BOWIE_STALE2' order by index_name, column_position; INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------------------------ -------------------- --------------- BOWIE_STALE_PK ID 1 SYS_AI_300kk2unp8tr0 CODE 1
We notice there is now an Automatic Index BUT it remains in an UNUSABLE/INVISIBLE state. This means the index can’t be used by the CBO.
So if we now re-run the SQL query again:
SQL> select * from bowie_stale where code=42; 10 rows selected. Execution Plan ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 437 | 21413 | 553 (16)| 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE | 437 | 21413 | 553 (16)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("CODE"=42) filter("CODE"=42) Note ----- - dynamic statistics used: dynamic sampling (level=2) - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 6 recursive calls 0 db block gets 39026 consistent gets 0 physical reads 0 redo size 867 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
The CBO has no choice still but to use the FTS.
In Part II, we’ll see that once we get into this scenario, it can be a tad problematic to get ourselves out of it and get the Automatic Index created as we would like…
[…] my previous post, I discussed how having stale statistics, usually a bad idea, is especially problematic with regard […]
LikeLike
[…] Part I of this series, we saw how Automatic Indexing will not create a viable Automatic Index if there are […]
LikeLike
[…] discussed previously, I’ll importantly collect […]
LikeLike
Richard – Thanks for this nice demo.
Hope you are running all these examples from ATP/ADW environment.
if so why these full scan’s are not parallelized like your pervious blog posts (https://richardfoote.wordpress.com/2020/09/16/oracle-19c-automatic-indexing-data-skew-part-iii-the-good-son/)?
what could be the reason for this?
LikeLike
Hi Rajeshwaran
I previously had access to a fully scalable Autonomous Environment but now only to the free tier version, with just the 1 OCPU, which explains some of the parallelism differences.
I run the demos in both the free tier Autonomous Database environments and in an Exadata like environment for which Auto Indexing would also be of interest.
LikeLike