Oracle 19c Automatic Indexing: Function-Based Indexes? Part II (If You Can See Me) February 5, 2021
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Function Based Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Virtual Columns.1 comment so far
In my previous post, I discussed how Automatic Indexing does not currently support creating an index based on a function or expression predicate, even if it’s an equality predicate. You must manually create the associated function-based index.
However, if you have access to the application, there’s a better strategy when frequently searching on a function-based predicate. That’s to create a Virtual Column and use this column in your searching criteria (as mentioned by Connor McDonald in this comment).
To illustrate, I’m going to drop the previously manually created function-based index and hence the associated hidden virtual column, as Oracle quite rightly doesn’t allow you to have two virtual columns based on the same expression in the same table.
SQL> drop index david_upper_name_i; Index dropped.
Since Oracle 11g, Oracle has supported the use of Visible Virtual Columns, a column that doesn’t physically exist, but defines a function/expression that can be easily accessed and populated when queried.
I’ll next create a Virtual Column called UPPER_NAME that is defined not based on a Data Type, but on the result on the UPPER function on the previously defined NAME column:
SQL> alter table david add (upper_name as (upper(name))); Table altered.
Regardless of size of table, this column is added virtually instantly (pun fully intended), as no data is physically stored in the table itself. I view it (yep, another pun) as a “mini-view”, that can be used to hide complexity from the developer, with the actual data derived at run-time when the column is accessed in an SQL.
After I generate fresh statistics:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID', estimate_percent=>null); PL/SQL procedure successfully completed. SQL> select column_name, hidden_column, virtual_column, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID'; COLUMN_NAME HID VIR NUM_DISTINCT DENSITY HISTOGRAM -------------------- --- --- ------------ ---------- --------------- NAME NO NO 10000000 0 HYBRID MORE_STUFF9 NO NO 1 .00000005 FREQUENCY MORE_STUFF8 NO NO 1 .00000005 FREQUENCY MORE_STUFF7 NO NO 1 .00000005 FREQUENCY MORE_STUFF6 NO NO 1 .00000005 FREQUENCY MORE_STUFF5 NO NO 1 .00000005 FREQUENCY MORE_STUFF4 NO NO 1 .00000005 FREQUENCY MORE_STUFF3 NO NO 1 .00000005 FREQUENCY MORE_STUFF2 NO NO 1 .00000005 FREQUENCY MORE_STUFF10 NO NO 1 .00000005 FREQUENCY MORE_STUFF1 NO NO 1 .00000005 FREQUENCY ID NO NO 10000000 0 HYBRID CODE NO NO 10000 .0001 HYBRID UPPER_NAME NO YES 10000000 0 HYBRID
Note how the UPPER_NAME virtual column is NOT hidden and now has up to date statistics.
We can now run this simplified query based on the new UPPER_NAME column, which does not need to include the potentially complex function expression:
SQL> select * from david where upper_name='DAVID BOWIE 42'; 1 row selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2426813604 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 200 | 3349 (6) | 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL | DAVID | 1 | 200 | 3349 (6) | 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("UPPER_NAME"='DAVID BOWIE 42') filter("UPPER_NAME"='DAVID BOWIE 42') Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 263469 consistent gets 263452 physical reads 0 redo size 1328 bytes sent via SQL*Net to client 375 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
If we look at portions of the subsequent Automatic Indexing report:
SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 1 Indexes created (visible / invisible) : 1 (1 / 0) Space used (visible / invisible) : 360.71 MB (360.71 MB / 0 B) Indexes dropped : 0 SQL statements verified : 2 SQL statements improved (improvement factor) : 2 (263476.8x) SQL plan baselines created : 0 Overall improvement factor : 263476.8x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0 ------------------------------------------------------------------------------- INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: ------------------------------------------------------------------------------- --------------------------------------------------------------------------- | Owner | Table | Index | Key | Type | Properties | --------------------------------------------------------------------------- | BOWIE | DAVID | SYS_AI_4k4mkgkw049ht | UPPER_NAME | B-TREE | NONE | --------------------------------------------------------------------------- ------------------------------------------------------------------------------- VERIFICATION DETAILS ------------------------------------------------------------------------------- The performance of the following statements improved: ------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : 7tfqh3pu526mt SQL Text : select * from david where upper_name='DAVID BOWIE 42' Improvement Factor : 263484.7x Execution Statistics: ----------------------------- Original Plan Auto Index Plan ---------------------------- ---------------------------- Elapsed Time (s): 1471249 1414 CPU Time (s): 300584 986 Buffer Gets: 3161816 4 Optimizer Cost: 3349 4 Disk Reads: 3161432 3 Direct Writes: 0 0 Rows Processed: 12 1 Executions: 12 1 PLANS SECTION -------------------------------------------------------------------------------- - Original ----------------------------- Plan Hash Value : 2426813604 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3349 | | | 1 | TABLE ACCESS STORAGE FULL | DAVID | 1 | 200 | 3349 | 00:00:01 | ----------------------------------------------------------------------------- Notes ----- - dop = 1 - px_in_memory_imc = no - px_in_memory = no - cardinality_feedback = yes - With Auto Indexes ----------------------------- Plan Hash Value : 1447691372 ------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | ------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 200 | 4 | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | DAVID | 1 | 200 | 4 | 00:00:01 | | * 2 | INDEX RANGE SCAN | SYS_AI_4k4mkgkw049ht | 1 | | 3 | 00:00:01 | ------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ------------------------------------------ * 2 - access("UPPER_NAME"='DAVID BOWIE 42') Notes ----- - Dynamic sampling used for this statement ( level = 11 )
We see from the report that Automatic Indexing has now created the associated, implicitly created function-based index (SYS_AI_4k4mkgkw049ht) based on the virtual UPPER_NAME column:
SQL> select index_name, index_type, auto, constraint_index, visibility, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='DAVID'; INDEX_NAME INDEX_TYPE AUT CON VISIBILIT STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR -------------------- --------------------------- --- --- --------- -------- ---------- ----------- ----------------- SYS_AI_4k4mkgkw049ht FUNCTION-BASED NORMAL YES NO VISIBLE VALID 10000000 43104 2136839 SQL> select index_name, column_name, column_position from user_ind_columns where table_name='DAVID' order by index_name, column_position; INDEX_NAME COLUMN_NAME COLUMN_POSITION -------------------- -------------------- --------------- SYS_AI_4k4mkgkw049ht UPPER_NAME 1
If we now re-run the SQL query:
SQL> select * from david where upper_name='DAVID BOWIE 4242'; 1 row selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1447691372 ------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 200 | 4 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | DAVID | 1 | 200 | 4 (0) | 00:00:01 | |* 2 | INDEX RANGE SCAN | SYS_AI_4k4mkgkw049ht | 1 | | 3 (0) | 00:00:01 | ------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("UPPER_NAME"='DAVID BOWIE 4242') Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 5 consistent gets 0 physical reads 0 redo size 1334 bytes sent via SQL*Net to client 377 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
The CBO now uses the new Automatic Index to significantly improve the performance of the query.
So not only is using a user defined Virtual Column a cleaner solution with respect to the frequent use of a function-based expressions, but has the added advantage of being supported with Automatic Indexing.
Oracle 19c Automatic Indexing: Function-Based Indexes? (No Plan) February 4, 2021
Posted by Richard Foote in 19c, 19c New Features, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Function Based Indexes, Oracle, Oracle Cloud, Oracle General, Oracle Indexes, Oracle19c, Virtual Columns.3 comments
I previously discussed how Automatic Indexing only currently supports Equality based predicates.
The question I have today is does Automatic Indexing support function-based indexes? Let’s take a look.
The below DAVID table has the key column NAME which is an effectively unique VARCHAR2 column:
SQL> create table david (id number, code number, name varchar2(42), more_stuff1 varchar2(42), more_stuff2 varchar2(42), more_stuff3 varchar2(42), more_stuff4 varchar2(42), more_stuff5 varchar2(42), more_stuff6 varchar2(42), more_stuff7 varchar2(42), more_stuff8 varchar2(42), more_stuff9 varchar2(42), more_stuff10 varchar2(42)); Table created. SQL> insert into david select rownum, mod(rownum, 10000)+1, 'David Bowie '|| rownum, 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust', 'Ziggy Stardust' from dual connect by level <=10000000; 10000000 rows created. SQL> commit; Commit complete. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID', estimate_percent=>null); PL/SQL procedure successfully completed.
If we look at the current details of the table columns:
SQL> select column_name, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM -------------------- ------------ ---------- --------------- NAME 10000000 0 HYBRID MORE_STUFF9 1 .00000005 FREQUENCY MORE_STUFF8 1 .00000005 FREQUENCY MORE_STUFF7 1 .00000005 FREQUENCY MORE_STUFF6 1 .00000005 FREQUENCY MORE_STUFF5 1 .00000005 FREQUENCY MORE_STUFF4 1 .00000005 FREQUENCY MORE_STUFF3 1 .00000005 FREQUENCY MORE_STUFF2 1 .00000005 FREQUENCY MORE_STUFF10 1 .00000005 FREQUENCY MORE_STUFF1 1 .00000005 FREQUENCY ID 10000000 0 HYBRID CODE 10000 .0001 HYBRID
We notice the same oddity of my previous post that all columns have histograms…
Let’s run the following query with an UPPER function-based predicate that returns only the one row:
SQL> select * from david where upper(name) = 'DAVID BOWIE 4242'; Execution Plan ---------------------------------------------------------- Plan hash value: 2426813604 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100K | 17M | 3350 (6) | 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL | DAVID | 100K | 17M | 3350 (6) | 00:00:01 | ----------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage(UPPER("NAME")='DAVID BOWIE 4242') filter(UPPER("NAME")='DAVID BOWIE 4242') Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 263469 consistent gets 263452 physical reads 0 redo size 1256 bytes sent via SQL*Net to client 381 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
What does Automatic Indexing make of this scenario?
Basically, it does nothing. Currently, Automatic Indexing does NOT support such function-based indexes, even with equality based predicates (as of at least version 19.5.0.0.0). If we look at the next Automatic Indexing report:
SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 0 Indexes created : 0 Space used : 0 B Indexes dropped : 0 SQL statements verified : 2 SQL statements improved : 0 SQL plan baselines created : 0 Overall improvement factor : 0x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0 -------------------------------------------------------------------------------
No such function-based index is ever created by Automatic Indexing:
SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='DAVID'; no rows selected
To improve the performance of this query, one has to manually create the necessary function-based index:
SQL> create index david_upper_name_i on david(upper(name)); Index created.
If we now re-run the query:
SQL> select name from david where upper(name) = 'DAVID BOWIE 4242'; Execution Plan ---------------------------------------------------------- Plan hash value: 2675555529 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100K | 4199K | 3175 (1) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | DAVID | 100K | 4199K | 3175 (1) | 00:00:01 | |* 2 | INDEX RANGE SCAN | DAVID_UPPER_NAME_I | 40000 | | 3 (0) | 00:00:01 | ---------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access(UPPER("NAME")='DAVID BOWIE 4242') Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 5 consistent gets 0 physical reads 0 redo size 369 bytes sent via SQL*Net to client 384 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
The query now uses the function-based index to significantly improve the performance of this query, with just 5 consistent gets.
Note however as with all function-based indexes, by default the estimated cardinality estimate and associated CBO costs are way off (100K rows are estimated, not the 1 row that is actually returned). This is due to the CBO having no real idea of the number and distribution of values coming out of the “black box” function-based predicate.
This is why Oracle automatically creates an hidden virtual column by which to store the necessary statistics associated to the function (in this case the SYS_NC00014$ column):
SQL> select column_name, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM -------------------- ------------ ---------- --------------- NAME 10000000 0 HYBRID MORE_STUFF9 1 .00000005 FREQUENCY MORE_STUFF8 1 .00000005 FREQUENCY MORE_STUFF7 1 .00000005 FREQUENCY MORE_STUFF6 1 .00000005 FREQUENCY MORE_STUFF5 1 .00000005 FREQUENCY MORE_STUFF4 1 .00000005 FREQUENCY MORE_STUFF3 1 .00000005 FREQUENCY MORE_STUFF2 1 .00000005 FREQUENCY MORE_STUFF10 1 .00000005 FREQUENCY MORE_STUFF1 1 .00000005 FREQUENCY ID 10000000 0 HYBRID CODE 10000 .0001 HYBRID SYS_NC00014$ NONE
But we need to first collect statistics on this hidden virtual column for the statistics to be populated:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID', no_invalidate=> false, method_opt=> 'FOR ALL HIDDEN COLUMNS SIZE 1'); SQL> select column_name, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM -------------------- ------------ ---------- --------------- NAME 10000000 0 HYBRID MORE_STUFF9 1 .00000005 FREQUENCY MORE_STUFF8 1 .00000005 FREQUENCY MORE_STUFF7 1 .00000005 FREQUENCY MORE_STUFF6 1 .00000005 FREQUENCY MORE_STUFF5 1 .00000005 FREQUENCY MORE_STUFF4 1 .00000005 FREQUENCY MORE_STUFF3 1 .00000005 FREQUENCY MORE_STUFF2 1 .00000005 FREQUENCY MORE_STUFF10 1 .00000005 FREQUENCY MORE_STUFF1 1 .00000005 FREQUENCY ID 10000000 0 HYBRID CODE 10000 .0001 HYBRID SYS_NC00014$ 9947366 0 HYBRID
Now the CBO has the necessary statistics by which to determine a much more accurate cardinality estimate for the function-based predicate and so potentially a more efficient execution plan:
SQL> select * from david where upper(name) = 'DAVID BOWIE 4242'; Execution Plan ---------------------------------------------------------- Plan hash value: 2675555529 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 200 | 4 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | DAVID | 1 | 200 | 4 (0) | 00:00:01 | |* 2 | INDEX RANGE SCAN | DAVID_UPPER_NAME_I | 1 | | 3 (0) | 00:00:01 | ---------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access(UPPER("NAME")='DAVID BOWIE 4242') Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 5 consistent gets 0 physical reads 0 redo size 1256 bytes sent via SQL*Net to client 381 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
With the virtual column statistics in place, the CBO now has the cardinality estimate of 1 and associated costs spot on, which is always a good thing.
This requirement to collect the necessary statistics on the associated virtual column created as a result of the function-based index to ensure the index is costed and used effectively is perhaps but one reason why function-based indexes are currently not supported by Automatic Indexing.
As always, this can always change in the future…
Oracle Database 19c Automatic Indexing: Invisible Indexes Oddity (Wild Eyed Boy From Freecloud) February 3, 2021
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Automatic Table Statistics, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Exadata, Histograms, Invisible Indexes, Oracle, Oracle Cloud, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c.2 comments
There have been a couple of “oddities” in relation to both Oracle Autonomous Databases and Automatic Indexing behaviour that I’ve seen frequently enough now (on Oracle 19.5.0.0.0) to make it worth a quick blog article.
The following is a simple test case that highlights both these issues. I’ll begin with a basic table, that has the key column CODE with a selectivity that would likely make it too expensive to be accessed via an associated index.
SQL> create table pink_floyd (id number, code number, create_date date, name varchar2(42)); Table created. SQL> insert into pink_floyd select rownum, ceil(dbms_random.value(0, 5000)), sysdate-mod(rownum, 50000)+1, 'Dark Side of the Moon' from dual connect by level <=10000000; 10000000 rows created. SQL> commit; Commit complete.
Importantly, I’ll next collect statistics on this table using all the default attributes, including allowing Oracle to decide the merits of any column histogram:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD'); PL/SQL procedure successfully completed.
Note I’ve yet to run a single query against this table. And yet, if we look at the details of each of these columns:
SQL> select column_name, num_distinct, density, histogram from dba_tab_columns where table_name='PINK_FLOYD'; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM -------------------- ------------ ---------- --------------- ID 9705425 0 HYBRID CODE 4835 .00005 HYBRID CREATE_DATE 50357 .00002 HYBRID NAME 1 4.9639E-08 FREQUENCY
All the columns have a histogram !! This despite the columns not meeting either criteria normally required for a histogram, that the column be used in a SQL predicate AND for the column to have an uneven distribution of values.
None of these columns have yet to be used in a filtering predicate and none of these columns have a uneven distribution of values, even the CODE column as highlighted by looking at the minimum and maximum number of occurrences:
SQL> select min(code_count), max(code_count) from (select count(*) code_count from pink_floyd group by code); MIN(CODE_COUNT) MAX(CODE_COUNT) --------------- --------------- 1845 2163
So it’s very odd for these histograms to be present.
If we run the following query with a filtering predicate based on the CODE column:
SQL> select * from pink_floyd where code=42; 2012 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1152280033 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2068 | 82720 | 844 (11) | 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD | 2068 | 82720 | 844 (11) | 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("CODE"=42) filter("CODE"=42) Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 63655 consistent gets 63645 physical reads 0 redo size 38575 bytes sent via SQL*Net to client 360 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2012 rows processed
The CBO currently has no choice but to use a FTS with no index currently present. But what will Automatic Indexing make of things? If we look at the next automatic indexing report:
SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 2 Indexes created (visible / invisible) : 1 (0 / 1) Space used (visible / invisible) : 134.22 MB (0 B / 134.22 MB) Indexes dropped : 0 SQL statements verified : 1 SQL statements improved : 0 SQL plan baselines created : 0 Overall improvement factor : 0x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0 ------------------------------------------------------------------------------- INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: *: invisible ------------------------------------------------------------------------------- ---------------------------------------------------------------------------- | Owner | Table | Index | Key | Type | Properties | ---------------------------------------------------------------------------- | BOWIE | PINK_FLOYD | * SYS_AI_dp2t0j12zux49 | CODE | B-TREE | NONE | ---------------------------------------------------------------------------- -------------------------------------------------------------------------------
We notice that Oracle has created an Automatic Index, but it’s an INVISIBLE index !!
If we look at the details of this Automatic Index:
SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='PINK_FLOYD'; INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ------------------------- --- --- --------- ------------- -------- ---------- ----------- ----------------- SYS_AI_dp2t0j12zux49 YES NO INVISIBLE ADVANCED LOW VALID 10000000 15369 9845256
The index is in an INVISIBLE/VALID state, not the usual INVISIBLE/UNUSABLE state for an index for which Automatic Indexing decides an index is not efficient enough to be implement.
This is NOT expected behaviour.
Usually INVISIBLE/VALID indexes are created when Automatic Indexing is in “REPORT ONLY” mode, although I have come across this scenario when statistics are stale or missing. But in this case, Automatic Indexing is in “IMPLEMENT” mode and the table has recently collected statistics, albeit with odd histograms present (hence why I think these issues to be related).
If we run the same query again:
SQL> select * from pink_floyd where code=42; 2012 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1152280033 ---------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ---------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2068 | 82720 | 844 (11) | 00:00:01 | |* 1 | TABLE ACCESS STORAGE FULL | PINK_FLOYD | 2068 | 82720 | 844 (11) | 00:00:01 | ---------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage("CODE"=42) filter("CODE"=42) Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 63655 consistent gets 63645 physical reads 0 redo size 38575 bytes sent via SQL*Net to client 360 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2012 rows processed
The CBO has again no option but to use the FTS as Invisible indexes can not be considered by the CBO. However, it’s important to note that such an index would not be used by the CBO anyways as it would be deemed too expensive to use than the current FTS.
If you’re relying on Automatic Indexing and have it in Implement mode, I would recommend checking for any indexes in this INVISIBLE/VALID state as they’re an indication that something has very likely gone wrong…