Oracle 19c Automatic Indexing: Invisible/Valid Automatic Indexes (Bowie Rare) August 31, 2021
Posted by Richard Foote in 19c, 19c New Features, Attribute Clustering, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Exadata, Index Access Path, Index statistics, Invisible Indexes, Invisible/Valid Indexes, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle Indexes, Oracle Statistics, Oracle19c, Unusable Indexes.1 comment so far
In my previous post, I discussed how newly created Automatic Indexes can have one of three statuses, depending the selectivity and effectiveness of the associated Automatic Index.
Indexes that improve performance sufficiently are created as Visible/Valid indexes and can be subsequently considered by the CBO. Indexes that are woeful and have no chance of improving performance are created as Invisible/Unusable indexes. Indexes considered potentially suitable but ultimately don’t sufficiently improve performance, are created as Invisible/Valid indexes.
Automatic Indexes are created as Visible/Valid indexes when shown to improve performance (by the _AUTO_INDEX_IMPROVEMENT_THRESHOLD parameter). But as I rarely came across Invisible/Valid Automatic Indexes (except for when Automatic Indexing is set to “Report Only” mode), I was curious to determine approximately at what point were such indexes created by the Automatic Indexing process.
To investigate things, I created a table with columns that contain data with various levels of selectivity, some of which should fall inside and outside the range of viability of any associated index, based on the cost of the associated Full Table Scan.
The following table has 32 columns of interest, each with a slight variation of distinct values giving small differences in overall column selectivity:
SQL> create table bowie_stuff1 (id number, code1 number, code2 number, code3 number, code4 number, code5 number, code6 number, code7 number, code8 number, code9 number, code10 number, code11 number, code12 number, code13 number, code14 number, code15 number, code16 number, code17 number, code18 number, code19 number, code20 number, code21 number, code22 number, code23 number, code24 number, code25 number, code26 number, code27 number, code28 number, code29 number, code30 number, code31 number, code32 number, name varchar2(42)); Table created. SQL> insert into bowie_stuff1 select rownum, mod(rownum, 900)+1, mod(rownum, 1000)+1, mod(rownum, 1100)+1, mod(rownum, 1200)+1, mod(rownum, 1300)+1, mod(rownum, 1400)+1, mod(rownum, 1500)+1, mod(rownum, 1600)+1, mod(rownum, 1700)+1, mod(rownum, 1800)+1, mod(rownum, 1900)+1, mod(rownum, 2000)+1, mod(rownum, 2100)+1, mod(rownum, 2200)+1, mod(rownum, 2300)+1, mod(rownum, 2400)+1, mod(rownum, 2500)+1, mod(rownum, 2600)+1, mod(rownum, 2700)+1, mod(rownum, 2800)+1, mod(rownum, 2900)+1, mod(rownum, 3000)+1, mod(rownum, 3100)+1, mod(rownum, 3200)+1, mod(rownum, 3300)+1, mod(rownum, 3400)+1, mod(rownum, 3500)+1, mod(rownum, 3600)+1, mod(rownum, 3700)+1, mod(rownum, 3800)+1, mod(rownum, 3900)+1, mod(rownum, 4000)+1, 'THE RISE AND FALL OF ZIGGY STARDUST' from dual connect by level >=10000000; 10000000 rows created. SQL> commit; Commit complete.
As always, it’s important that statistics be collected for Automatic Indexing to function properly:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_STUFF1', estimate_percent=>null); PL/SQL procedure successfully completed.
So on a 10M row table, I have 32 columns with the number of distinct values varying by only 100 values per column (or by a selectivity of just 0.001%):
SQL> select column_name, num_distinct, density, histogram from dba_tab_columns where table_name='BOWIE_STUFF1' order by num_distinct; COLUMN_NAME NUM_DISTINCT DENSITY HISTOGRAM ------------ ------------ ---------- --------------- NAME 1 .00000005 FREQUENCY CODE1 900 .001111 HYBRID CODE2 1000 .001 HYBRID CODE3 1100 .000909 HYBRID CODE4 1200 .000833 HYBRID CODE5 1300 .000769 HYBRID CODE6 1400 .000714 HYBRID CODE7 1500 .000667 HYBRID CODE8 1600 .000625 HYBRID CODE9 1700 .000588 HYBRID CODE10 1800 .000556 HYBRID CODE11 1900 .000526 HYBRID CODE12 2000 .0005 HYBRID CODE13 2100 .000476 HYBRID CODE14 2200 .000455 HYBRID CODE15 2300 .000435 HYBRID CODE16 2400 .000417 HYBRID CODE17 2500 .0004 HYBRID CODE18 2600 .000385 HYBRID CODE19 2700 .00037 HYBRID CODE20 2800 .000357 HYBRID CODE21 2900 .000345 HYBRID CODE22 3000 .000333 HYBRID CODE23 3100 .000323 HYBRID CODE24 3200 .000312 HYBRID CODE25 3300 .000303 HYBRID CODE26 3400 .000294 HYBRID CODE27 3500 .000286 HYBRID CODE28 3600 .000278 HYBRID CODE29 3700 .00027 HYBRID CODE30 3800 .000263 HYBRID CODE31 3900 .000256 HYBRID CODE32 4000 .00025 HYBRID ID 10000000 0 HYBRID
I’ll next run the below queries (based on a simple equality predicate on each column) several times each in batches of 8 queries, so as to not swamp the Automatic Indexing process with potential new index requests (the ramifications of which I’ll discuss in another future post):
SQL> select * from bowie_stuff1 where code1=42; SQL> select * from bowie_stuff1 where code2=42; SQL> select * from bowie_stuff1 where code3=42; SQL> select * from bowie_stuff1 where code4=42; SQL> select * from bowie_stuff1 where code5=42; ... SQL> select * from bowie_stuff1 where code31=42; SQL> select * from bowie_stuff1 where code32=42;
If we now look at the statuses of the Automatic Indexes subsequently created:
SQL> select i.index_name, c.column_name, i.auto, i.constraint_index, i.visibility, i.status, i.num_rows, i.leaf_blocks, i.clustering_factor from user_indexes i, user_ind_columns c where i.index_name=c.index_name and i.table_name='BOWIE_STUFF1' order by visibility, status; INDEX_NAME COLUMN_NAME AUT CON VISIBILIT STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ---------------------- ------------ --- --- --------- -------- ---------- ----------- ----------------- SYS_AI_5rw9j3d8pc422 CODE5 YES NO INVISIBLE UNUSABLE 10000000 21702 4272987 SYS_AI_48q3j752csn1p CODE4 YES NO INVISIBLE UNUSABLE 10000000 21702 4272987 SYS_AI_9sgharttf3yr7 CODE3 YES NO INVISIBLE UNUSABLE 10000000 21702 4272987 SYS_AI_8n92acdfbuh65 CODE2 YES NO INVISIBLE UNUSABLE 10000000 21702 4272987 SYS_AI_brgtfgngu3cj9 CODE1 YES NO INVISIBLE UNUSABLE 10000000 21702 4272987 SYS_AI_1tu5u4012mkzu CODE11 YES NO INVISIBLE VALID 10000000 15364 10000000 SYS_AI_34b6zwgtm86rr CODE12 YES NO INVISIBLE VALID 10000000 15365 10000000 SYS_AI_gd0ccvdwwb4mk CODE13 YES NO INVISIBLE VALID 10000000 15365 10000000 SYS_AI_7k7wh28n3nczy CODE14 YES NO INVISIBLE VALID 10000000 15365 10000000 SYS_AI_67k2zjp09w101 CODE15 YES NO INVISIBLE VALID 10000000 15365 10000000 SYS_AI_5fa6k6fm0k6wg CODE10 YES NO INVISIBLE VALID 10000000 15364 10000000 SYS_AI_4624ju6bxsv57 CODE9 YES NO INVISIBLE VALID 10000000 15364 10000000 SYS_AI_bstrdkkxqtj4f CODE8 YES NO INVISIBLE VALID 10000000 15364 10000000 SYS_AI_39xqjjar239zq CODE7 YES NO INVISIBLE VALID 10000000 15364 10000000 SYS_AI_6h0adp60faytk CODE6 YES NO INVISIBLE VALID 10000000 15364 10000000 SYS_AI_5u0bqdgcx52vh CODE16 YES NO INVISIBLE VALID 10000000 15365 10000000 SYS_AI_0hzmhsraqkcgr CODE22 YES NO INVISIBLE VALID 10000000 15366 10000000 SYS_AI_4x716k4mdn040 CODE21 YES NO INVISIBLE VALID 10000000 15366 10000000 SYS_AI_6wsuwr7p6drsu CODE20 YES NO INVISIBLE VALID 10000000 15366 10000000 SYS_AI_b424tdjx82rwy CODE19 YES NO INVISIBLE VALID 10000000 15366 10000000 SYS_AI_3a2y07fqkzv8x CODE18 YES NO INVISIBLE VALID 10000000 15365 10000000 SYS_AI_8dp0b3z0vxzyg CODE17 YES NO INVISIBLE VALID 10000000 15365 10000000 SYS_AI_d95hnqayd7t08 CODE23 YES NO VISIBLE VALID 10000000 15366 10000000 SYS_AI_fry4zrxqtpyzg CODE24 YES NO VISIBLE VALID 10000000 15366 10000000 SYS_AI_920asb69q1r0m CODE25 YES NO VISIBLE VALID 10000000 15367 10000000 SYS_AI_026pa8880hnm2 CODE31 YES NO VISIBLE VALID 10000000 15367 10000000 SYS_AI_96xhzrguz2qpy CODE32 YES NO VISIBLE VALID 10000000 15368 10000000 SYS_AI_3dq93cc7uxruu CODE29 YES NO VISIBLE VALID 10000000 15367 10000000 SYS_AI_5nbz41xny8fvc CODE28 YES NO VISIBLE VALID 10000000 15367 10000000 SYS_AI_fz4q9bhydu2qt CODE27 YES NO VISIBLE VALID 10000000 15367 10000000 SYS_AI_0kwczzg3k3pfw CODE26 YES NO VISIBLE VALID 10000000 15367 10000000 SYS_AI_4qd5tsab7fnwx CODE30 YES NO VISIBLE VALID 10000000 15367 10000000
We can see we indeed have the 3 statuses of Automatic Indexes captured:
Columns with a selectivity equal or worse to that of COL5 with 1300 distinct values are created as Invisible/Unusable indexes. Returning 10M/1300 rows or a cardinality of approx. 7,693 or more rows is just too expensive for such indexes on this table to be viable. This represents a selectivity of approx. 0.077%.
Note how the index statistics for these Invisible/Unusable indexes are not accurate. They all have an estimated LEAF_BLOCKS of 21702 and a CLUSTERING_FACTOR of 4272987. However, we can see from the other indexes which are physically created that these are not correct and are substantially off the mark with the actual LEAF_BLOCKS being around 15364 and the CLUSTERING_FACTOR actually much worse at around 10000000.
Again worthy of a future post to discuss how Automatic Indexing processing has to make (potentially inaccurate) guesstimates for these statistics in its analysis of index viability when such indexes don’t yet physically exist.
Columns with a selectivity equal or better to that of COL23 which has 3100 distinct values are created as Visible/Valid indexes. Returning 10M/3100 rows or a cardinality of approx. 3226 or less rows is cheap enough for such indexes on this table to be viable. This represents a selectivity of approx. 0.032%.
So in this specific example, only those columns between 1400 and 3000 distinct values meet the “borderline” criteria in which the Automatic Indexing process creates Invisible/Valid indexes. This represents a very very narrow selectivity range of only approx. 0.045% in which such Invisible/Valid indexes are created. Or for this specific example, only those columns that return approx. between 3,333 and 7,143 rows from the 10M row table.
Now the actual numbers and total range of selectivities for which Invisible/Valid Automatic Indexes are created of course depends on all sorts of factors, such as the size/cost of FTS of the table and not least the clustering of the associated data (which I’ve blogged about ad nauseam).
The point I want to make is that the range of viability for such Invisible/Valid indexes is relatively narrow and the occurrences of such indexes relatively rare in your databases. As such, the vast majority of Automatic Indexes are likely to be either Visible/Valid or Invisible/Unusable indexes.
It’s important to recognised this when you encounter such Invisible/Valid Automatic Indexes (outside of “REPORT ONLY” implementations), as it’s an indication that such an index is a borderline case that is currently NOT considered by the CBO (because of it being Invisible).
However, this Invisible/Valid Automatic Index status should really change to either of the other two more common statuses in the near future.
I’ll expand on this point in a future post…