jump to navigation

Oracle Database 19c Automatic Indexing: Index Compression (Ghosteen) December 9, 2019

Posted by Richard Foote in 19c, 19c New Features, Advanced Index Compression, Automatic Indexing, AUTO_INDEX_COMPRESSION, Index Column Order, Index Compression, Index Internals.
trackback

 

 

In my previous post on Automatic Indexing, I discussed how the default index column order (in absence of other factors) is column id, the order in which the columns are defined in the table. In this post, I’ll explore if this changes if index compression is also implemented.

By default, Automatic Indexing does NOT use index compression. However, if you have access to the Advanced Compression option, you have the choice to turn on index compression in the following manner:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_COMPRESSION','ON');

PL/SQL procedure successfully completed.

Note: the AUTO_INDEX_COMPRESSION parameter is not actually documented, which could be a documentation bug or that Oracle is not yet ready to release this capability. The above will enable Automatic Indexes to be created with Compress Advanced Low, which is the “no-brain” index compression option as it will compress (deduplicate) safely with negligible CPU overheads. However, index column order is still critical to ensure effective compression as we shall see…

We begin by creating a simple table, that has four columns of interest, CODE1, CODE2, CODE3 and STATUS. They are defined in this order within the table, but CODE1 has the most number of distinct values (5000000 distinct values), then CODE2 (1000), then CODE3 (10) and finally STATUS which only has the 1 distinct value.

SQL> create table bowie_compress (id number, code1 number, code2 number, code3 number, status varchar2(42), name varchar2(42));

Table created.

SQL> insert into bowie_compress select rownum, mod(rownum, 5000000)+1, mod(rownum, 1000)+1, mod(rownum, 10)+1, 'COMPLETED’,

'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_COMPRESS');

PL/SQL procedure successfully completed.

 

In terms of being the most efficient from a compression perspective, it would be better to have the index defined in STATUS, CODE3, CODE2, CODE1 order, so that the leading columns in the index have the most duplicated values that enable effective deduplication and hence index compression. I’ve discussed the importance of index column for effective index compression a number of times previously. Arguably, it would be better not to actually index the STATUS column at all as with just 1 distinct value, provides no effective filtering benefits.

Having the CODE1 column as the leading column however with so many distinct values would effectively make the index non-compressible (with LOW compression), as the leading column would have too many distinct values to benefit much from compression.

So how does Automatic Indexing handle this scenario? Does it keep the same default index column order or does it alter the index column order to provide better index compression benefits?

Let’s run the following SQL with all four columns in the predicates and see what index Automatic Indexing creates…

 

SQL> select * from bowie_compress where code1=42 and code2=42 and code3=2 and status='COMPLETED';

Execution Plan

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     1 |    41 |   9998  (5)| 00:00:01 |
|   1 |  PX COORDINATOR              |                |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000       |     1 |    41 |   9998  (5)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |                |     1 |    41 |   9998  (5)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| BOWIE_COMPRESS |     1 |    41 |   9998  (5)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

4 - storage("CODE1"=42 AND "CODE2"=42 AND "CODE3"=2 AND "STATUS"='COMPLETED')
     filter("CODE1"=42 AND "CODE2"=42 AND "CODE3"=2 AND "STATUS"='COMPLETED')

Statistics
----------------------------------------------------------
        6 recursive calls
        0 db block gets
    63562 consistent gets
        0 physical reads
        0 redo size
     1038 bytes sent via SQL*Net to client
      588 bytes received via SQL*Net from client
        2 SQL*Net roundtrips to/from client
        0 sorts (memory)
        0 sorts (disk)
        2 rows processed

 

If we look at the Automatic Indexing report for the period in which the above SQL was run:

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 18-JUL-2019 00:18:35
Activity end                 : 18-JUL-2019 00:19:58
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                                : 1
Indexes created (visible / invisible)           : 1 (1 / 0)
Space used (visible / invisible)                : 293.6 MB (293.6 MB / 0 B)
Indexes dropped                                 : 0
SQL statements verified                         : 1
SQL statements improved (improvement factor)    : 1 (63563.9x)
SQL plan baselines created                      : 0
Overall improvement factor                      : 63563.9x

-------------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

-------------------------------------------------------------------------------
INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
--------------------------------------------------------------------------------------------------
| Owner | Table          | Index                | Key                      | Type   | Properties |
--------------------------------------------------------------------------------------------------
| BOWIE | BOWIE_COMPRESS | SYS_AI_bkdhrsd29vd4f | CODE1,CODE2,CODE3,STATUS | B-TREE | NONE       |
--------------------------------------------------------------------------------------------------

 

We see that Automatic Index has created the index with all four columns from the SQL predicate in again the default column order as the column order as defined in the table (CODE1, CODE2, CODE3, STATUS). Even though Automatic Index Compression was enabled, it hasn’t considered the column cardinalities in its determination of best index column order.

Automatic Indexing has the tendency to index ALL columns specified in SQL predicates, regardless of whether all such columns provide filtering benefits AND does not consider the best column order from a compression perspective when determining index column order. So definitely room for improvement here methinks.

If we look at the definition and size of the resultant Automatic Index:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_COMPRESS';

INDEX_NAME                   AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_bkdhrsd29vd4f         YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       35451          10000000

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BOWIE_COMPRESS' order by index_name, column_position;

INDEX_NAME                   COLUMN_NAME     COLUMN_POSITION
---------------------------- --------------- ---------------
SYS_AI_bkdhrsd29vd4f         CODE1                         1
SYS_AI_bkdhrsd29vd4f         CODE2                         2
SYS_AI_bkdhrsd29vd4f         CODE3                         3
SYS_AI_bkdhrsd29vd4f         STATUS                        4

 

We note the index has 35451 leaf blocks.

If we were to create the index manully in a more appropriate manner from a compression perspective, with the index columns defined in reverse order and also with another index without the redundant STATUS column:

SQL> create index bowie_compress_best_order_i on bowie_compress(status, code3, code2, code1) compress advanced low;

Index created.

SQL> create index bowie_compress_best_order2_i on bowie_compress(code3, code2, code1) compress advanced low;

Index created.

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_COMPRESS';

INDEX_NAME                   AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_bkdhrsd29vd4f         YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       35451          10000000
BOWIE_COMPRESS_BEST_ORDER_I  NO  NO  VISIBLE   ADVANCED LOW  VALID      10000000       23509          10000000
BOWIE_COMPRESS_BEST_ORDER2_I NO  NO  VISIBLE   ADVANCED LOW  VALID      10000000       23462          10000000

 

We notice the resultant indexes are substantially smaller, at just 23509 and 23462 leaf blocks respectively.

It could well be that Index Compression is not yet documented because Oracle appreciates that more work yet needs to be done to create indexes optimally from a compression perspective…

Comments»

No comments yet — be the first.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: