Oracle 19c Automatic Indexing: Data Skew Fixed By Baselines Part II (Sound And Vision) September 28, 2020
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, Baselines, CBO, Data Skew, Exadata, Explain Plan For Index, Full Table Scans, Histograms, Index Access Path, Index statistics, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c, Performance Tuning.trackback
In my previous post, I discussed how the Automatic Indexing task by using Dynamic Sampling Level=11 can correctly determine the correct query cardinality estimates and assume the CBO will likewise determine the correct cardinality estimate and NOT use an index if it would cause performance to regress.
However, if other database sessions DON’T use Dynamic Sampling at the same Level=11 and hence NOT determine correct cardinality estimates, newly created Automatic Indexes might get used by the CBO inappropriately and result inefficient execution plans.
Likewise, with incorrect CBO cardinality estimates, it might also be possible for newly created Automatic Indexes to NOT be used when they should be (as I’ve discussed previously).
These are potential issues if the Dynamic Sampling value differs between the Automatic Indexing task and other database sessions.
One potential way to make things more consistent and see how the Automatic Indexing behaves if it detects an execution plan where the CBO would use an Automatic Index that causes performance regression, is to disable Dynamic Sampling within the Automatic Indexing task.
This can be easily achieved by using the following hint which effectively disables Dynamic Sampling with the previous problematic query:
SQL> select /*+ dynamic_sampling(0) */ * from space_oddity where code in (190000, 170000, 150000, 130000, 110000, 90000, 70000, 50000, 30000, 10000); 1000011 rows selected. Execution Plan ---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1005K| 135M| 11411 (1)| 00:00:01 | |* 1 | TABLE ACCESS FULL| SPACE_ODDITY | 1005K| 135M| 11411 (1)| 00:00:01 | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR "CODE"=70000 OR "CODE"=90000 OR "CODE"=110000 OR "CODE"=130000 OR "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 41169 consistent gets 0 physical reads 0 redo size 13535504 bytes sent via SQL*Net to client 2705 bytes received via SQL*Net from client 202 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1000011 rows processed
The query currently has good cardinality estimates (1005K vs 1000011 rows returned) only because we currently have histograms in place for the CODE column. As such, the query correctly uses a FTS.
However, if we now remove the histogram on the CODE column:
SQL> exec dbms_stats.gather_table_stats(null, 'SPACE_ODDITY', method_opt=> 'FOR ALL COLUMNS SIZE 1’); PL/SQL procedure successfully completed.
There is no way for the CBO to now determine the correct cardinality estimate because of the skewed data and missing histograms.
So what does the Automatic Indexing tasks make of things now. If we look at the next activity report:
SQL> select dbms_auto_index.report_last_activity() report from dual; REPORT -------------------------------------------------------------------------------- GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 18-AUG-2020 16:42:33 Activity end : 18-AUG-2020 16:43:06 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 0 Indexes created : 0 Space used : 0 B Indexes dropped : 0 SQL statements verified : 1 SQL statements improved : 0 SQL plan baselines created (SQL statements) : 1 (1) Overall improvement factor : 0x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0
We can see that it has verified this one new statement and has created 1 new SQL Plan Baseline as a result.
If we look at the Verification Details part of this report:
VERIFICATION DETAILS ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- The following SQL plan baselines were created: ------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : 3yz8unzhhvnuz SQL Text : select /*+ dynamic_sampling(0) */ * from space_oddity where code in (190000, 170000, 150000, 130000, 110000, 90000, 70000, 50000, 30000, 10000) SQL Signature : 3910785437403172730 SQL Handle : SQL_3645e6a2952fcf7a SQL Plan Baselines (1) : SQL_PLAN_3cjg6naakzmvu198c05b9
We can see Automatic Indexing has created a new SQL Plan Baseline for our query with Dynamic Sampling set to 0 thanks to the hint.
Basically, the Automatic Indexing task has found a new query and determined the CBO would be inclined to use the index, because it now incorrectly assumes few rows are to be returned. It makes the poor cardinality estimate because there are currently no histograms in place AND because it can’t now use Dynamic Sampling to get a more accurate picture of things on the fly because it has been disabled with the dynamic_sampling(0) hint.
Using an Automatic Index over the current FTS plan would make the performance of the SQL regress.
Therefore, to protect the current FTS plan, Automatic Indexing has created a SQL Plan Baseline that effectively forces the CBO to use the current, more efficient FTS plan.
This can be confirmed by looking at the DBA_AUTO_INDEX_VERIFICATIONS view:
SQL> select execution_name, original_buffer_gets, auto_index_buffer_gets, status from dba_auto_index_verifications where sql_id = '3yz8unzhhvnuz'; EXECUTION_NAME ORIGINAL_BUFFER_GETS AUTO_INDEX_BUFFER_GETS STATUS -------------------------- -------------------- ---------------------- --------- SYS_AI_2020-08-18/16:42:33 41169 410291 REGRESSED
If we now re-run the SQL again (noting we still don’t have histograms on the CODE column):
SQL> select /*+ dynamic_sampling(0) */ * from space_oddity where code in (190000, 170000, 150000, 130000, 110000, 90000, 70000, 50000, 30000, 10000); 1000011 rows selected. Execution Plan ---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 32 | 4512 | 11425 (2)| 00:00:01 | |* 1 | TABLE ACCESS FULL| SPACE_ODDITY | 32 | 4512 | 11425 (2)| 00:00:01 | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("CODE"=10000 OR "CODE"=30000 OR "CODE"=50000 OR "CODE"=70000 OR "CODE"=90000 OR "CODE"=110000 OR "CODE"=130000 OR "CODE"=150000 OR "CODE"=170000 OR "CODE"=190000) Hint Report (identified by operation id / Query Block Name / Object Alias): Total hints for statement: 1 (U - Unused (1)) --------------------------------------------------------------------------- 1 - SEL$1 U - dynamic_sampling(0) / rejected by IGNORE_OPTIM_EMBEDDED_HINTS Note ----- - SQL plan baseline "SQL_PLAN_3cjg6naakzmvu198c05b9" used for this statement Statistics ---------------------------------------------------------- 9 recursive calls 4 db block gets 41170 consistent gets 0 physical reads 0 redo size 13535504 bytes sent via SQL*Net to client 2705 bytes received via SQL*Net from client 202 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1000011 rows processed
We can see the CBO is forced to use the SQL Plan Baseline “SQL_PLAN_3cjg6naakzmvu198c05b9” as created by the Automatic Indexing task to ensure the more efficient FTS is used and not the available Automatic Index.
So Automatic Indexing CAN create SQL PLan Baselines to protect SQL from performance regressions caused by inappropriate use of Automatic Indexes BUT it’s really hard and difficult for it to do this effectively if the Automatic Indexing tasks and other database sessions have differing Dynamic Sampling settings as it does by default…
Hi Richard,
sorry i am lost at this.
but how does this auto sql plan baseline with DS hint of level = 0 solves the problem at (from this pervious) this article @ https://richardfoote.wordpress.com/2020/09/21/oracle-19c-automatic-indexing-cbo-incorrectly-using-auto-indexes-part-ii-sleepwalk/
select code1, code2, name from iggy_pop where code1=1 and code2=42;
estimated cardinality incorrectly computed as 50K – with extended stats and frequency histogram in place
LikeLike