jump to navigation

Oracle 19c Automatic Indexing: Function-Based Indexes? Part II (If You Can See Me) February 5, 2021

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Function Based Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Virtual Columns.
add a comment

In my previous post, I discussed how Automatic Indexing does not currently support creating an index based on a function or expression predicate, even if it’s an equality predicate. You must manually create the associated function-based index.

However, if you have access to the application, there’s a better strategy when frequently searching on a function-based predicate. That’s to create a Virtual Column and use this column in your searching criteria (as mentioned by Connor McDonald in this comment).

To illustrate, I’m going to drop the previously manually created function-based index and hence the associated hidden virtual column, as Oracle quite rightly doesn’t allow you to have two virtual columns based on the same expression in the same table.

SQL> drop index david_upper_name_i;

Index dropped.

Since Oracle 11g, Oracle has supported the use of Visible Virtual Columns, a column that doesn’t physically exist, but defines a function/expression that can be easily accessed and populated when queried.

I’ll next create a Virtual Column called UPPER_NAME that is defined not based on a Data Type, but on the result on the UPPER function on the previously defined NAME column:

SQL> alter table david add (upper_name as (upper(name)));

Table altered.

Regardless of size of table, this column is added virtually instantly (pun fully intended), as no data is physically stored in the table itself. I view it (yep, another pun) as a “mini-view”, that can be used to hide complexity from the developer, with the actual data derived at run-time when the column is accessed in an SQL.

After I generate fresh statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'DAVID', estimate_percent=>null);

PL/SQL procedure successfully completed.

SQL> select column_name, hidden_column, virtual_column, num_distinct, density, histogram from dba_tab_cols where table_name='DAVID';

COLUMN_NAME          HID VIR NUM_DISTINCT    DENSITY HISTOGRAM
-------------------- --- --- ------------ ---------- ---------------
NAME                 NO  NO      10000000          0 HYBRID
MORE_STUFF9          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF8          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF7          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF6          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF5          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF4          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF3          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF2          NO  NO             1  .00000005 FREQUENCY
MORE_STUFF10         NO  NO             1  .00000005 FREQUENCY
MORE_STUFF1          NO  NO             1  .00000005 FREQUENCY
ID                   NO  NO      10000000          0 HYBRID
CODE                 NO  NO         10000      .0001 HYBRID
UPPER_NAME           NO YES      10000000          0 HYBRID

Note how the UPPER_NAME virtual column is NOT hidden and now has up to date statistics.

We can now run this simplified query based on the new UPPER_NAME column, which does not need to include the potentially complex function expression:

SQL> select * from david where upper_name='DAVID BOWIE 42';

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2426813604

-----------------------------------------------------------------------------------
| Id | Operation                 | Name  | Rows | Bytes | Cost (%CPU) | Time      |
-----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |       |    1 |   200 |    3349 (6) | 00:00:01  | 
|* 1 | TABLE ACCESS STORAGE FULL | DAVID |    1 |   200 |    3349 (6) | 00:00:01  |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("UPPER_NAME"='DAVID BOWIE 42')
    filter("UPPER_NAME"='DAVID BOWIE 42')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     263469 consistent gets
     263452 physical reads
          0 redo size
       1328 bytes sent via SQL*Net to client
        375 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

If we look at portions of the subsequent Automatic Indexing report:

 

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 360.71 MB (360.71 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 2
SQL statements improved (improvement factor) : 2 (263476.8x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 263476.8x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
---------------------------------------------------------------------------
| Owner | Table | Index                | Key        | Type   | Properties |
---------------------------------------------------------------------------
| BOWIE | DAVID | SYS_AI_4k4mkgkw049ht | UPPER_NAME | B-TREE | NONE       |
---------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 7tfqh3pu526mt
SQL Text            : select * from david where upper_name='DAVID BOWIE 42'
Improvement Factor  : 263484.7x

Execution Statistics:
-----------------------------
                        Original Plan                Auto Index Plan
                        ---------------------------- ----------------------------
Elapsed Time (s):       1471249                      1414
CPU Time (s):           300584                       986
Buffer Gets:            3161816                      4
Optimizer Cost:         3349                         4
Disk Reads:             3161432                      3
Direct Writes:          0                            0
Rows Processed:         12                           1
Executions:             12                           1

PLANS SECTION
--------------------------------------------------------------------------------
- Original
-----------------------------
Plan Hash Value : 2426813604

-----------------------------------------------------------------------------
| Id | Operation                 | Name  | Rows | Bytes | Cost | Time       |
-----------------------------------------------------------------------------
|  0 | SELECT STATEMENT          |       |      |       | 3349 |            |
|  1 | TABLE ACCESS STORAGE FULL | DAVID |    1 |   200 | 3349 | 00:00:01   |
-----------------------------------------------------------------------------

Notes
-----
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no
- cardinality_feedback = yes

- With Auto Indexes
-----------------------------
Plan Hash Value : 1447691372

-------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                 | Rows | Bytes | Cost | Time      |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                      |    1 |   200 |    4 | 00:00:01  |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED | DAVID                |    1 |   200 |    4 | 00:00:01  |
| * 2 |  INDEX RANGE SCAN                    | SYS_AI_4k4mkgkw049ht |    1 |       |    3 | 00:00:01  |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("UPPER_NAME"='DAVID BOWIE 42')

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We see from the report that Automatic Indexing has now created the associated, implicitly created function-based index (SYS_AI_4k4mkgkw049ht) based on the virtual UPPER_NAME column:

SQL> select index_name, index_type, auto, constraint_index, visibility, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='DAVID';

INDEX_NAME           INDEX_TYPE                  AUT CON VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- --------------------------- --- --- --------- -------- ---------- ----------- -----------------
SYS_AI_4k4mkgkw049ht FUNCTION-BASED NORMAL       YES NO  VISIBLE   VALID      10000000       43104           2136839

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='DAVID' order by index_name, column_position;

INDEX_NAME           COLUMN_NAME          COLUMN_POSITION
-------------------- -------------------- ---------------
SYS_AI_4k4mkgkw049ht UPPER_NAME                         1

 

If we now re-run the SQL query:

SQL> select * from david where upper_name='DAVID BOWIE 4242';

1 row selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1447691372

------------------------------------------------------------------------------------------------------------
| Id | Operation                            | Name                 | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                     |                      |    1 |   200 |       4 (0) | 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID BATCHED | DAVID                |    1 |   200 |       4 (0) | 00:00:01 |
|* 2 |   INDEX RANGE SCAN                   | SYS_AI_4k4mkgkw049ht |    1 |       |       3 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("UPPER_NAME"='DAVID BOWIE 4242')

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
          5 consistent gets
          0 physical reads
          0 redo size
       1334 bytes sent via SQL*Net to client
        377 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

The CBO now uses the new Automatic Index to significantly improve the performance of the query.

So not only is using a user defined Virtual Column a cleaner solution with respect to the frequent use of a function-based expressions, but has the added advantage of being supported with Automatic Indexing.