Oracle 19c Automatic Indexing: Index Created But Not Actually Used (Because Your Young) March 30, 2020
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, CBO, Oracle Indexes.8 comments
The following is an interesting example of how Oracle Automatic Indexing is currently implemented that can result in an Automatic Index being created but ultimately ignored by the CBO.
To illustrate, we begin by creating a simple little table that has two columns of particular interest, CODE2 which has 100 distinct values and CODE3 which has only 10 distinct values. The data of both columns is very poorly clustered with data for both columns sprinkled throughout the table:
SQL> create table major_tom3 (id number, code1 number, code2 number, code3 number, name varchar2(42)); Table created. SQL> insert into major_tom3 select rownum, mod(rownum, 1000)+1, ceil(dbms_random.value(0, 100)), ceil(dbms_random.value(0, 10)), 'David Bowie' from dual connect by level 10000000; SQL> commit; Commit complete. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'MAJOR_TOM3'); PL/SQL procedure successfully completed.
When we run the following query (a total of 4 times), it returns close to 10K rows per execution:
SQL> select * from major_tom3 where code3=4 and code2=42; 9968 rows selected.
When we look at the Automatic Indexing report after the next Automatic Index job run:
INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: --------------------------------------------------------------------------------- | Owner | Table | Index | Key | Type | Properties| --------------------------------------------------------------------------------- | BOWIE | MAJOR_TOM3 | SYS_AI_bnyacywycxx8b | CODE2,CODE3 | B-TREE | NONE | ---------------------------------------------------------------------------------
We notice that Automatic Indexing has indeed created a new index based on the columns CODE2, CODE3.
VERIFICATION DETAILS ------------------------------------------------------------------------------- The performance of the following statements improved: ------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : 1h4j53jruuzht SQL Text : select * from major_tom3 where code3=4 and code2=42 Improvement Factor : 5.1x Execution Statistics: ----------------------------- Original Plan Auto Index Plan ---------------------------- ---------------------------- Elapsed Time (s): 1276903 29310 CPU Time (s): 1226240 25905 Buffer Gets: 183493 8954 Optimizer Cost: 7355 8996 Disk Reads: 0 26 Direct Writes: 0 0 Rows Processed: 39872 9968 Executions: 4 1
The reason why the index was created was because the Automatic Indexing process has determined the query will improve by a factor of 5.1 with the new index in place.
This has been calculated by first determining the average number of consistent gets per execution: 183493 total consistent gets / 4 executions = 45873 consistent gets average per execution
This average consistent gets is then divided by the number of consistent gets with the new index 45873 / 8954 = 5.1x.
So this “Improvement Factor” is determined primarily by the ratio of improvement based on consistent gets.
If we look now at the Plans Section of the Automatic Indexing report:
PLANS SECTION --------------------------------------------------------------------------------------------- - Original ----------------------------- Plan Hash Value : 2354969370 --------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | --------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 7355 | | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 10000 | 280000 | 7355 | 00:00:01 | | 3 | PX BLOCK ITERATOR | | 10000 | 280000 | 7355 | 00:00:01 | | 4 | TABLE ACCESS STORAGE FULL | MAJOR_TOM3 | 10000 | 280000 | 7355 | 00:00:01 | --------------------------------------------------------------------------------------- - With Auto Indexes ----------------------------- Plan Hash Value : 1676847804 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10820 | 302960 | 8996 | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | MAJOR_TOM3 | 10820 | 302960 | 8996 | 00:00:01 | | * 2 | INDEX RANGE SCAN | SYS_AI_bnyacywycxx8b | 9968 | | 27 | 00:00:01 | --------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ------------------------------------------ * 2 - access("CODE2"=42 AND "CODE3"=4)
We notice with some interest that the new “improved” plan actually has a CBO cost of 8996 which is greater than the original plan cost of just 7355.
This is because the CBO does not simply use Consistent Gets in its calculations, but a combination of anticipated I/O and CPU costs. Noting that Multi-block reads may physically read more blocks more efficiently than a fewer number of Single-block reads, it’s quite conceivable that the CBO would consider an execution plan with more consistent gets to be more efficient if the underlining I/Os are costed as being cheaper.
This is precisely what’s happening with this query…
If we look at the resultant Automatic Index:
SQL> select index_name, auto, constraint_index, visibility, compression, status from user_indexes where table_name='MAJOR_TOM3'; INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS -------------------- --- --- --------- ------------- -------- SYS_AI_bnyacywycxx8b YES NO VISIBLE DISABLED VALID SQL> select index_name, column_name, column_position from user_ind_columns where table_name='MAJOR_TOM3' order by index_name, column_position; INDEX_NAME COLUMN_NAME COLUMN_POSITION -------------------- -------------------- --------------- SYS_AI_bnyacywycxx8b CODE2 1 SYS_AI_bnyacywycxx8b CODE3 2
We can see that the newly created Automatic Index is both VISIBLE and VALID and so can potentially be used by any SQL within the database.
If we now re-run the query:
SQL> select * from major_tom3 where code3=4 and code2=42; 9968 rows selected. Execution Plan ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10045 | 274K| 7355 (7)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 10045 | 274K| 7355 (7)| 00:00:01 | | 3 | PX BLOCK ITERATOR | | 10045 | 274K| 7355 (7)| 00:00:01 | |* 4 | TABLE ACCESS STORAGE FULL| MAJOR_TOM3 | 10045 | 274K| 7355 (7)| 00:00:01 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - storage("CODE2"=42 AND "CODE3"=4) filter("CODE2"=42 AND "CODE3"=4) Statistics ---------------------------------------------------------- 11 recursive calls 4 db block gets 45859 consistent gets 0 physical reads 0 redo size 275084 bytes sent via SQL*Net to client 7892 bytes received via SQL*Net from client 666 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 9968 rows processed
We notice that the execution plan remains the same as previously, with the newly available index NOT used by the CBO. This is because the index is deemed too inefficient and the result index based execution plan too expensive by the CBO.
If we now re-run the query with a hint to make the CBO use the index:
SQL> select /*+ index(major_tom3) */ * from major_tom3 where code3=4 and code2=42; 9968 rows selected. Execution Plan -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10045 | 274K | 9065 (1)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 10045 | 274K | 9065 (1)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| MAJOR_TOM3 | 10045 | 274K | 9065 (1)| 00:00:01 | | 4 | BUFFER SORT | | | | | | | 5 | PX RECEIVE | | 10045 | | 27 (4)| 00:00:01 | | 6 | PX SEND HASH (BLOCK ADDRESS) | :TQ10000 | 10045 | | 27 (4)| 00:00:01 | | 7 | PX SELECTOR | | | | | | |* 8 | INDEX RANGE SCAN | SYS_AI_bnyacywycxx8b | 10045 | | 27 (4)| 00:00:01 | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 8 - access("CODE2"=42 AND "CODE3"=4) Statistics ---------------------------------------------------------- 16 recursive calls 4 db block gets 8958 consistent gets 0 physical reads 0 redo size 275084 bytes sent via SQL*Net to client 7892 bytes received via SQL*Net from client 666 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 9968 rows processed
The Automatic Index is now used, but it requires a hint for the index to be used. The cost of using the Automatic Index is too great, even though the Automatic Index has only just been created and created to specifically address this query.
Having one criteria (based on Consistent Gets) used by Automatic Indexing and another criteria (based on estimated I/O and CPU costs) as used by the CBO leaves open this possibility of the Automatic Indexing thinking a new index is a great idea, while the CBO thinks not.
In my next post I’ll show you how Automatic Indexing actually does a rather clever job of improving the current index definition with the introduction of a new SQL…
ANNOUNCEMENT: Replacement Oracle Indexing Internals and Performance Tuning Webinars 23-26 March 2020 !! March 11, 2020
Posted by Richard Foote in Oracle Indexing Internals Webinar, Oracle Performance Diagnostics and Tuning Webinar, Webinar.add a comment
London March 2020 Seminar Dates Cancelled 😦
Unfortunately, due to the Coronavirus crisis, both seminars scheduled for London during March 2020 have been cancelled. Full refunds have been processed for all attendees.
March 2020 Webinar Replacement Dates Announced !!
However, as a quick replacement primarily for those disappointed by the cancellation of the London seminars, I will now be running webinar versions of both events as replacements during the same times the seminars were previously scheduled.
Remaining places for the webinars can be purchased by anyone until all remaining places are taken. Simple click on the Buy Now button to purchase your place for this unique training opportunity.
23-24 March 2020 (running between 9:00 – 17:00 GMT): “Oracle Indexing Internals and Best Practices” Webinar:
25-26 March 2020 (running between 9:00 – 17:00 GMT): “Oracle Performance Diagnostics and Tuning” Webinar:
You can also purchase tickets to both webinars at a special combo discount:
23-26 March 2020 (running between 9:00 – 17:00 GMT): Both “Oracle Indexing Internals and Best Practices” and “Oracle Performance Diagnostics and Tuning” Webinars:
The cost for an individual webinar is $1,800 Australian Dollars.
The cost for the webinar combo is $3,000 Australian Dollars.
For full content details of “Oracle Indexing Internals and Best Practices” Webinar: https://richardfooteconsulting.com/indexing-seminar/
For full content details of “Oracle Performance Diagnostics and Tuning” Webinar: https://richardfooteconsulting.com/performance-tuning-seminar/
Please contact me directly if you wish to pay via invoice and direct bank transfer or if you have any questions.
Although I might not currently be able to run these events in London, hopefully you’ll still have the opportunity to attend this high acclaimed training 🙂