Oracle 19c Automatic Indexing: Common Index Creation Trap (Rat Trap) June 30, 2020
Posted by Richard Foote in 19c, 19c New Features, ASSM, Automatic Indexing, CBO, Clustering Factor, Data Clustering, Oracle Indexes, TABLE_CACHED_BLOCKS.1 comment so far
When I go to a customer site to resolve performance issues, one of the most common issues I encounter is in relation to inefficient SQL. And one of the most common causes for inefficient SQL I encounter is because of deficiencies the default manner by which the index Clustering Factor is calculated.
When it comes to both Automatic Indexes and in relation to the Oracle Autonomous Database Cloud Services, the “flawed” default manner by which the index Clustering Factor is calculated still applies. So we need to exercise some caution when Auto Indexes are created and the impact their default statistics can have on the performance of subsequent SQL statements.
To illustrate with a simple example, I’ll first create a table with the key column being the ID column which will be effectively unique. The table will be populated via a basic procedure that just inserts 1M rows. The procedure uses an ORDER sequence, such that the ID values are generated in a monotonically increasing manner:
SQL> create table bowie_assm (id number, code number, name varchar2(42)); Table created. SQL> create sequence bowie_assm_seq order; Sequence created. Procedure created. SQL> create or replace procedure pop_bowie_assm as 2 begin 3 for i in 1..1000000 loop 4 insert into bowie_assm values (bowie_assm_seq.nextval, mod(i,1000), 'DAVID BOWIE'); 5 commit; 6 end loop; 7 end; 8 / Procedure created.
However crucially, the procedure is executed by 3 different session concurrently, to simulate a multi user environment inserting into a table…
SQL> exec pop_bowie_assm PL/SQL procedure successfully completed.
We’ll now collect statistics on the table:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_ASSM'); PL/SQL procedure successfully completed. SQL> select table_name, num_rows, blocks from user_tables where table_name='BOWIE_ASSM'; TABLE_NAME NUM_ROWS BLOCKS --------------- ---------- ---------- BOWIE_ASSM 3000000 12137
So the table has 3M rows and is 12137 blocks in size.
If we run an SQL a few times where we select only the one ID value:
SQL> select * from bowie_assm where id = 42; Execution Plan ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 22 | 1934 (6)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 1 | 22 | 1934 (6)| 00:00:01 | | 3 | PX BLOCK ITERATOR | | 1 | 22 | 1934 (6)| 00:00:01 | |* 4 | TABLE ACCESS STORAGE FULL| BOWIE_ASSM | 1 | 22 | 1934 (6)| 00:00:01 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - storage("ID"=42) filter("ID"=42) Statistics ---------------------------------------------------------- 6 recursive calls 0 db block gets 12138 consistent gets 0 physical reads 0 redo size 707 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
The execution plan shows a Full Table Scan (FTS) is invoked, the only choice the CBO has without an index on the ID column. Clearly an index on the ID column would make the plan substantially more efficient with just 1 row selected from a 3M row table. Hopefully, Automatic Indexing will come to our rescue, so let’s check out the subsequent Automatic Indexing Report:
REPORT SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 1 Indexes created (visible / invisible) : 1 (1 / 0) Space used (visible / invisible) : 58.72 MB (58.72 MB / 0 B) Indexes dropped : 0 SQL statements verified : 2 SQL statements improved (improvement factor) : 1 (1.2x) SQL plan baselines created : 0 Overall improvement factor : 1.1x ------------------------------------------------------------------------------- INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: ------------------------------------------------------------------------- | Owner | Table | Index | Key | Type | Properties | ------------------------------------------------------------------------- | BOWIE | BOWIE_ASSM | SYS_AI_2w1pss6qbdz6z | ID | B-TREE | NONE | -------------------------------------------------------------------------
So yes indeed, an Automatic Index (SYS_AI_2w1pss6qbdz6z) was created on the ID column.
If we look at the default Clustering Factor of this index:
SQL> select index_name, auto, constraint_index, visibility, status, clustering_factor from user_indexes where table_name='BOWIE_ASSM'; INDEX_NAME AUT CON VISIBILIT STATUS CLUSTERING_FACTOR -------------------- --- --- --------- -------- ----------------- SYS_AI_2w1pss6qbdz6z YES NO VISIBLE VALID 2504869
We notice the Clustering Factor is relatively high at 2504869, much higher than the 12137 number of blocks in the table.
But if the ID column in the table has been loaded via a monotonically increasing sequence, doesn’t that mean the ID values have been inserted in approximately in ID order? If so, doesn’t that mean the ID column should have a “good” Clustering Factor” as the order of the rows in the table matches the order of the indexed values in the ID index?
Clearly not.
The reason being that the table is stored in the default Automatic Segment Space Management (ASSM) tablespace type, which is designed to avoid contention by concurrent inserts from different sessions. Therefore each of the 3 sessions inserting into the table are each assigned to different table blocks, resulting in the rows not being precisely inserted in ID order. It’s very close to ID order, the the ID values clustered within a few blocks from each other, but not precisely stored in ID order.
However, by default, the Clustering Factor is calculated by reading each index entry and determining if it references a ROWID that accesses a table block different from the PREVIOUS index entry. If it does differ, it increments the Clustering Factor, if it doesn’t differ and accesses the same table block as the previous index entry, the Clustering Factor is NOT incremented.
So in theory, we could have 100 rows that reside in just 2 different table blocks, but if the odd IDs live in one block and the even IDs live in the other block, meaning that each ID is stored in a different table block to the previous, the Clustering Factor would have a value of 100 for these 100 rows, even though they only occupy 2 table blocks. The Clustering Factor is therefore much higher than in reality it should be as ultimately only 2 different table blocks are accessed within a negligible time from each other.
This is the “flaw” with how the default Clustering Factor is calculated. By noting if a table block access differs only from the previous table block accessed, it leaves the Clustering Factor calculation susceptible to exaggerated high values when the data really is relatively well clustered within the table.
If we run the same SQL as previously which only selects one ID value:
SQL> select * from bowie_assm where id = 42; Execution Plan -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 22 | 4 (0)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 1 | 22 | 4 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_ASSM | 1 | 22 | 4 (0)| 00:00:01 | | 4 | BUFFER SORT | | | | | | | 5 | PX RECEIVE | | 1 | | 3 (0)| 00:00:01 | | 6 | PX SEND HASH (BLOCK ADDRESS) | :TQ10000 | 1 | | 3 (0)| 00:00:01 | | 7 | PX SELECTOR | | | | | | |* 8 | INDEX RANGE SCAN | SYS_AI_2w1pss6qbdz6z | 1 | | 3 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 8 - access("ID"=42) Statistics ---------------------------------------------------------- 12 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 707 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 1 rows processed
The CBO now uses the new Automatic Index as with just one row, the index is clearly more efficient regardless of the Clustering Factor value.
However, if we now run a query that selects a range of ID values, in this example between 42 and 4242 which represents only a relatively low 0.14% of the table:
SQL> select * from bowie_assm where id between 42 and 4242; 4201 rows selected. Execution Plan ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4202 | 92444 | 1934 (6)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 4202 | 92444 | 1934 (6)| 00:00:01 | | 3 | PX BLOCK ITERATOR | | 4202 | 92444 | 1934 (6)| 00:00:01 | |* 4 | TABLE ACCESS STORAGE FULL| BOWIE_ASSM | 4202 | 92444 | 1934 (6)| 00:00:01 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - storage("ID"<=4242 AND "ID">=42) filter("ID"<=4242 AND "ID">=42) Statistics ---------------------------------------------------------- 8 recursive calls 4 db block gets 12138 consistent gets 0 physical reads 0 redo size 54767 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 4201 rows processed
The CBO decides to use a Full Table Scan as it deems the index with the massive Clustering Factor to be too expensive, with it having to visit differing blocks for the majority of the estimated 4202 rows (note at 4201 actual rows returned, this estimate by the CBO is practically spot on).
If we force the use of the index via an appropriate hint:
SQL> select /*+ index (bowie_assm) */ * from bowie_assm where id between 42 and 4242; 4201 rows selected. Execution Plan -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4202 | 92444 | 3530 (1)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 4202 | 92444 | 3530 (1)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_ASSM | 4202 | 92444 | 3530 (1)| 00:00:01 | | 4 | BUFFER SORT | | | | | | | 5 | PX RECEIVE | | 4202 | | 12 (0)| 00:00:01 | | 6 | PX SEND HASH (BLOCK ADDRESS) | :TQ10000 | 4202 | | 12 (0)| 00:00:01 | | 7 | PX SELECTOR | | | | | | |* 8 | INDEX RANGE SCAN | SYS_AI_2w1pss6qbdz6z | 4202 | | 12 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 8 - access("ID">=42 AND "ID"<=4242) Statistics ---------------------------------------------------------- 12 recursive calls 0 db block gets 26 consistent gets 0 physical reads 0 redo size 54767 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 4201 rows processed
Note at an estimated cost of 3530, this is greater than the 1934 cost of the FTS which explains why the CBO decides the FTS is best. However, if we look at the number of Consistent Gets, it’s only 26, meaning the CBO is actually getting these costs way wrong.
Why?
Because of the grossly inflated Clustering Factor.
As I’ve discussed previously, Oracle 12.1 introduced a new TABLE_CACHED_BLOCKS preference. Rather than the default value of 1, we can set this to any value up to 255. When calculating the Clustering Factor during statistics collection, it will NOT increment the Clustering Factor if the index visits a table block again that was one of the last “x” distinct table blocks visited. So by setting TABLE_CACHED_BLOCKS to (say) 42, if the index visits a block that was one of the last 42 distinct table blocks previously visited, don’t now increment the Clustering Factor. This can therefore generate a much more “accurate” Clustering Factor which can be significantly smaller than previously. This in turn makes the index much more efficient to the CBO because it then estimates far fewer table blocks need be accessed during a range scan.
So let’s change the TABLE_CACHED_BLOCKS value for this table to 42 (don’t increment now the Clustering Factor value when collecting statistics if we visit again any of the last 42 differently accessed table blocks) and recollect the segment statistics:
SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE_ASSM', pname=>'TABLE_CACHED_BLOCKS', pvalue=>42); PL/SQL procedure successfully completed. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_ASSM', cascade=>true); PL/SQL procedure successfully completed.
If we now examine the new Clustering Factor value:
SQL> select index_name, auto, constraint_index, visibility, status, clustering_factor from user_indexes where table_name='BOWIE_ASSM'; INDEX_NAME AUT CON VISIBILIT STATUS CLUSTERING_FACTOR -------------------- --- --- --------- -------- ----------------- SYS_AI_2w1pss6qbdz6z YES NO VISIBLE VALID 11608
We can see that at just 11608 it’s substantially less than the previous 2504869.
If we now rerun the previous range scan SQL without the hint:
SQL> select * from bowie_assm where id between 42 and 4242; 4201 rows selected. Execution Plan -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 4202 | 92444 | 30 (4)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10001 | 4202 | 92444 | 30 (4)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_ASSM | 4202 | 92444 | 30 (4)| 00:00:01 | | 4 | BUFFER SORT | | | | | | | 5 | PX RECEIVE | | 4202 | | 12 (0)| 00:00:01 | | 6 | PX SEND HASH (BLOCK ADDRESS) | :TQ10000 | 4202 | | 12 (0)| 00:00:01 | | 7 | PX SELECTOR | | | | | | |* 8 | INDEX RANGE SCAN | SYS_AI_2w1pss6qbdz6z | 4202 | | 12 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 8 - access("ID">=42 AND "ID"<=4242) Statistics ---------------------------------------------------------- 12 recursive calls 0 db block gets 26 consistent gets 0 physical reads 0 redo size 54767 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 2 sorts (memory) 0 sorts (disk) 4201 rows processed
We can see the CBO now automatically uses the new Automatic Index. At a new cost of just 30, it’s substantially less than the previous index cost of 3530 and now much less than the 1934 for the FTS and so why the index is now automatically chosen by the CBO.
When Automatic Indexes are created, it’s usually a good idea to check on the Clustering Factor and because default ASSM tablespaces have a tendency to significantly escalate the values of index Clustering Factors, to look at recalculating them with an non-default setting of the TABLE_CACHED_BLOCKS statistics collection preference.
Of course, not only is this a good idea for Automatic Indexes, but for manually created indexes as well.
Although no doubt Autonomous Database Cloud services will look at these issues in the future, such self-tuning capabilities are not currently available. You will need to go in there and make these changes as necessary to fix the root issues with such inefficient SQL statements…
Oracle 19c Automatic Indexing: A More Complex Example (How Does The Grass Grow) June 16, 2020
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, CBO.3 comments
In this post I’m going to put together in a slightly more complex SQL example a number of the concepts I’ve covered thus far in relation to the current implementation of Oracle Automatic Indexing.
I’ll begin by creating three tables, a larger TABLE1 and two smaller TABLE2 and TABLE3 lookup tables. Each table is created with only a Primary Key, Unique index and currently have no secondary indexes (this demo was run in an ATP Autonomous Cloud environment hence the odd parallel execution plans):
SQL> create table table1 (id number not null, code1 number not null, grade1 number not null, name1 varchar2(42)); Table created. SQL> insert into table1 select rownum, mod(rownum, 1000)+1, mod(rownum, 200000)+1, 'David Bowie '|| rownum from dual connect by level <=1000000; 1000000 rows created. SQL> commit; Commit complete. SQL> create unique index table1_id_i on table1(id); Index created. SQL> alter table table1 add primary key(id); Table altered. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'TABLE1'); PL/SQL procedure successfully completed. SQL> create table table2 (id number not null, stuff number not null, name2 varchar2(42)); Table created. SQL> insert into table2 select rownum, mod(rownum, 100)+1, 'Ziggy Stardust ' || rownum from dual connect by level <=10000; 10000 rows created. SQL> commit; Commit complete. SQL> create unique index table2_id_i on table2(id); Index created. SQL> alter table table2 add primary key(id); Table altered. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'TABLE2'); PL/SQL procedure successfully completed. SQL> create table table3 (id number not null, code3 number not null, name3 varchar2(42)); Table created. SQL> insert into table3 select rownum, mod(rownum, 500)+1, 'Thin White Duke ' || rownum from dual connect by level <=1000; 1000 rows created. SQL> commit; Commit complete. SQL> create unique index table3_id_i on table3(id); Index created. SQL> alter table table3 add primary key(id); Table altered. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'TABLE3'); PL/SQL procedure successfully completed.
I’ll next run the following “complex” query a number of times:
SQL> select code1, grade1, name1, name2 from table1, table2 where table1.code1=table2.id and table1.grade1 in (select table3.id from table3 where table3.code3=42); 10 rows selected. Execution Plan ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 10 | 600 | 857 (7)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 10 | 600 | 857 (7)| 00:00:01 | | 3 | NESTED LOOPS | | 10 | 600 | 857 (7)| 00:00:01 | | 4 | NESTED LOOPS | | 10 | 600 | 857 (7)| 00:00:01 | |* 5 | HASH JOIN | | 10 | 360 | 852 (7)| 00:00:01 | | 6 | JOIN FILTER CREATE | :BF0000 | 2 | 16 | 2 (0)| 00:00:01 | |* 7 | TABLE ACCESS STORAGE FULL | TABLE3 | 2 | 16 | 2 (0)| 00:00:01 | | 8 | JOIN FILTER USE | :BF0000 | 1000K| 26M| 833 (5)| 00:00:01 | | 9 | PX BLOCK ITERATOR | | 1000K| 26M| 833 (5)| 00:00:01 | |* 10 | TABLE ACCESS STORAGE FULL | TABLE1 | 1000K| 26M| 833 (5)| 00:00:01 | |* 11 | INDEX UNIQUE SCAN | TABLE2_ID_I | 1 | | 0 (0)| 00:00:01 | | 12 | TABLE ACCESS BY INDEX ROWID | TABLE2 | 1 | 24 | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 5 - access("TABLE1"."GRADE1"="TABLE3"."ID") 7 - storage("TABLE3"."CODE3"=42) filter("TABLE3"."CODE3"=42) 10 - storage(SYS_OP_BLOOM_FILTER(:BF0000,"TABLE1"."GRADE1")) filter(SYS_OP_BLOOM_FILTER(:BF0000,"TABLE1"."GRADE1")) 11 - access("TABLE1"."CODE1"="TABLE2"."ID") Statistics ---------------------------------------------------------- 6 recursive calls 0 db block gets 5777 consistent gets 0 physical reads 0 redo size 1224 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
We note there are three predicates listed in which a column access could potentially benefit from an index being created:
table1.code1=table2.id (note that table2.id already has a unique index defined)
table1.grade1 in (Select…)
table3.code3=42
Let’s have a look at the corresponding Automatic Indexing report to see what the Oracle Automatic Indexing process made of this example:
REPORT -------------------------------------------------------------------------------- GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 26-JUN-2019 08:56:08 Activity end : 26-JUN-2019 08:56:53 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 3 Indexes created (visible / invisible) : 2 (2 / 0) Space used (visible / invisible) : 18.94 MB (18.94 MB / 0 B) Indexes dropped : 0 SQL statements verified : 2 SQL statements improved (improvement factor) : 1 (192.7x) SQL plan baselines created : 0 Overall improvement factor : 170.2x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0
We note there are 3 index candidates that were considered, BUT only 2 new indexes were actually created. Overall, the created indexes resulted in an estimated 192.7x improvement in the above SQL performance.
If we look further on in the report and at the actual indexes created:
INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: ------------------------------------------------------------------------ | Owner | Table | Index | Key | Type | Properties | ------------------------------------------------------------------------ | BOWIE | TABLE1 | SYS_AI_0p5nn18dt9bn1 | GRADE1 | B-TREE | NONE | | BOWIE | TABLE3 | SYS_AI_b4ntgxt6pdbfh | CODE3 | B-TREE | NONE | ------------------------------------------------------------------------ VERIFICATION DETAILS ------------------------------------------------------------------------------- The performance of the following statements improved: ------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : 21v8yqs9jrnzm SQL Text : select code1, grade1, name1, name2 from table1, table2 where table1.code1=table2.id and table1.grade1 in (select table3.id from table3 where table3.code3=42) Improvement Factor : 192.7x Execution Statistics: ----------------------------- Original Plan Auto Index Plan ---------------------------- ---------------------------- Elapsed Time (s): 299396 2170 CPU Time (s): 289510 1036 Buffer Gets: 28912 43 Optimizer Cost: 857 27 Disk Reads: 0 4 Direct Writes: 0 0 Rows Processed: 50 10 Executions: 5 1
We note the report states the two new indexes are created on the TABLE1.GRADE1 and TABLE3.CODE3 columns.
If look down further in the report and compare the before and after execution plans in the PLANS SECTION:
PLANS SECTION - Original ----------------------------- Plan Hash Value : 1597150496 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost | Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 857 | | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 10 | 600 | 857 | 00:00:01 | | 3 | NESTED LOOPS | | 10 | 600 | 857 | 00:00:01 | | 4 | NESTED LOOPS | | 10 | 600 | 857 | 00:00:01 | | 5 | HASH JOIN | | 10 | 360 | 852 | 00:00:01 | | 6 | JOIN FILTER CREATE | :BF0000 | 2 | 16 | 2 | 00:00:01 | | 7 | TABLE ACCESS STORAGE FULL | TABLE3 | 2 | 16 | 2 | 00:00:01 | | 8 | JOIN FILTER USE | :BF0000 | 1000000 | 28000000 | 833 | 00:00:01 | | 9 | PX BLOCK ITERATOR | | 1000000 | 28000000 | 833 | 00:00:01 | | 10 | TABLE ACCESS STORAGE FULL | TABLE1 | 1000000 | 28000000 | 833 | 00:00:01 | | 11 | INDEX UNIQUE SCAN | TABLE2_ID_I | 1 | | 0 | | | 12 | TABLE ACCESS BY INDEX ROWID | TABLE2 | 1 | 24 | 1 | 00:00:01 | ------------------------------------------------------------------------------------------------ - With Auto Indexes ----------------------------- Plan Hash Value : 2014256176 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 600 | 27 | 00:00:01 | | 1 | NESTED LOOPS | | 10 | 600 | 27 | 00:00:01 | | 2 | NESTED LOOPS | | 10 | 600 | 27 | 00:00:01 | | 3 | NESTED LOOPS | | 10 | 360 | 17 | 00:00:01 | | 4 | TABLE ACCESS BY INDEX ROWID BATCHED | TABLE3 | 2 | 16 | 3 | 00:00:01 | | * 5 | INDEX RANGE SCAN | SYS_AI_b4ntgxt6pdbfh | 2 | | 1 | 00:00:01 | | 6 | TABLE ACCESS BY INDEX ROWID BATCHED | TABLE1 | 5 | 140 | 7 | 00:00:01 | | * 7 | INDEX RANGE SCAN | SYS_AI_0p5nn18dt9bn1 | 5 | | 2 | 00:00:01 | | * 8 | INDEX UNIQUE SCAN | TABLE2_ID_I | 1 | | 0 | | | 9 | TABLE ACCESS BY INDEX ROWID | TABLE2 | 1 | 24 | 1 | 00:00:01 | ---------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ------------------------------------------ * 5 - access("TABLE3"."CODE3"=42) * 7 - access("TABLE1"."GRADE1"="TABLE3"."ID") * 8 - access("TABLE1"."CODE1"="TABLE2"."ID")
We can see that the new plan uses both the new automatic indexes and has a new improved cost of 27 (down from 857).
If we look at all the indexes that have been created on these tables:
SQL> select table_name, index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name like 'TABLE%'; TABLE_NAME INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ------------ ---------------------- --- --- --------- ------------- -------- ---------- ----------- ----------------- TABLE1 TABLE1_ID_I NO NO VISIBLE DISABLED VALID 1000000 2088 5202 TABLE1 SYS_AI_85dcbcnkgj0w0 YES NO INVISIBLE DISABLED UNUSABLE 1000000 2171 415525 TABLE1 SYS_AI_0p5nn18dt9bn1 YES NO VISIBLE DISABLED VALID 1000000 2221 1000000 TABLE2 TABLE2_ID_I NO NO VISIBLE DISABLED VALID 10000 20 44 TABLE3 TABLE3_ID_I NO NO VISIBLE DISABLED VALID 1000 2 5 TABLE3 SYS_AI_b4ntgxt6pdbfh YES NO VISIBLE DISABLED VALID 1000 3 999 SQL> select table_name, index_name, column_name, column_position from user_ind_columns where table_name like 'TABLE%' order by table_name, index_name, column_position; TABLE_NAME INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------ ---------------------- --------------- --------------- TABLE1 SYS_AI_0p5nn18dt9bn1 GRADE1 1 TABLE1 SYS_AI_85dcbcnkgj0w0 CODE1 1 TABLE1 TABLE1_ID_I ID 1 TABLE2 TABLE2_ID_I ID 1 TABLE3 SYS_AI_b4ntgxt6pdbfh CODE3 1 TABLE3 TABLE3_ID_I ID 1
We note that both indexes listed as created in the Auto Indexing report on the TABLE1.GRADE1 (and TABLE3.CODE3 columns are both listed as being VISIBLE and VALID. Both of these indexes have been proven to improve the performance of the SQL statement by reducing the number of logical reads.
However, there is also an Auto Indexed defined on the other potential indexed column table1.code1, called “SYS_AI_85dcbcnkgj0w0” that has been left in an INVISIBLE, UNUSABLE state. This index has been shown to be ineffective in improving SQL performance and has been converted back to an UNUSABLE state.
If we run the query again and look at the resultant execution plan:
SQL> select code1, grade1, name1, name2 from table1, table2 where table1.code1=table2.id and table1.grade1 in (select table3.id from table3 where table3.code3=24); 10 rows selected. Execution Plan ----------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 600 | 15 (0)| 00:00:01 | | 1 | PX COORDINATOR | | | | | | | 2 | PX SEND QC (RANDOM) | :TQ10000 | 10 | 600 | 15 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 10 | 600 | 15 (0)| 00:00:01 | | 4 | NESTED LOOPS | | 10 | 600 | 15 (0)| 00:00:01 | | 5 | NESTED LOOPS | | 10 | 360 | 10 (0)| 00:00:01 | | 6 | PX BLOCK ITERATOR | | | | | | |* 7 | TABLE ACCESS STORAGE FULL | TABLE3 | 2 | 16 | 2 (0)| 00:00:01 | | 8 | TABLE ACCESS BY INDEX ROWID BATCHED| TABLE1 | 5 | 140 | 4 (0)| 00:00:01 | |* 9 | INDEX RANGE SCAN | SYS_AI_0p5nn18dt9bn1 | 5 | | 1 (0)| 00:00:01 | |* 10 | INDEX UNIQUE SCAN | TABLE2_ID_I | 1 | | 0 (0)| 00:00:01 | | 11 | TABLE ACCESS BY INDEX ROWID | TABLE2 | 1 | 24 | 1 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 7 - storage("TABLE3"."CODE3"=24) filter("TABLE3"."CODE3"=24) 9 - access("TABLE1"."GRADE1"="TABLE3"."ID") 10 - access("TABLE1"."CODE1"="TABLE2"."ID") Statistics ---------------------------------------------------------- 10 recursive calls 4 db block gets 59 consistent gets 0 physical reads 0 redo size 1122 bytes sent via SQL*Net to client 588 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
We note the new execution plan now only uses one of newly created Automatic Indexes (SYS_AI_0p5nn18dt9bn1) and at just 59 consistent gets, is significantly more efficient than it was previously where it required 5777 consistent gets.
However, the other VISIBLE automatic index (SYS_AI_b4ntgxt6pdbfh on the BOWIE3 table) is NOT actually used in the new plan. Why?
Because as discussed previously, it’s entirely possible for the Auto Indexing process to consider a new plan with Auto Index to be more efficient because it uses less consistent gets BUT the CBO not use the plan because the plan has a higher cost. The plan generated by the Auto Index process has a cost of 27 and uses 43 buffer gets but the CBO uses the plan without the second Auto Index because it has a reduced cost of only 15, even though it uses 59 consistent gets.
So the Auto Indexing process can create any number of possible indexes for a particular query and may independently ultimately determine different states for the various candidate indexes and so only create and keep the necessary indexes to sufficiently improve an SQL.
We now have an SQL statement that automatic runs much more efficiently without human intervention thanks to these automatically created indexes…