jump to navigation

Automatic Indexes: Scenarios Where Automatic Indexes NOT Created Part III (“Loaded”) April 28, 2022

Posted by Richard Foote in 19c, Advanced Index Compression, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Data Clustering, Exadata, Index Access Path, Index Column Order, Index Compression, Oracle, Oracle 21c, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Overloading.
add a comment

In my previous two posts, I’ve discussed scenarios where Automatic Indexing (AI) does not currently created automatic indexes and you may need to manually create the necessary indexes.

In this post, I’ll discuss a third scenario where AI will create an index, but you may want to manually create an even better one…

I’ll start by creating a relatively “large” table, with 20+ columns:

SQL> create table bowie_overload (id number, code1 number, code2 number, stuff1 varchar2(42), stuff2 varchar2(42), stuff3 varchar2(42), stuff4 varchar2(42), stuff5 varchar2(42), stuff6 varchar2(42), stuff7 varchar2(42), stuff8 varchar2(42), stuff9 varchar2(42), stuff10 varchar2(42), stuff11 varchar2(42), stuff12 varchar2(42), stuff13 varchar2(42), stuff14 varchar2(42), stuff15 varchar2(42), stuff16 varchar2(42), stuff17 varchar2(42), stuff18 varchar2(42), stuff19 varchar2(42), stuff20 varchar2(42), name varchar2(42));

Table created.

SQL> insert into bowie_overload select rownum, mod(rownum, 1000)+1, '42', 'David Bowie', 'Major Tom', 'Ziggy Stardust', 'Aladdin Sane', 'Thin White Duke', 'David Bowie', 'Major Tom', 'Ziggy Stardust', 'Aladdin Sane', 'Thin White Duke','David Bowie', 'Major Tom', 'Ziggy Stardust', 'Aladdin Sane', 'Thin White Duke','David Bowie', 'Major Tom', 'Ziggy Stardust', 'Aladdin Sane', 'Thin White Duke', 'The Spiders From Mars' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_OVERLOAD');

PL/SQL procedure successfully completed.

 

The main columns to note here are CODE1 which contains 1000 distinct values (and so is kinda selective on a 10M row table, but not spectacularly so, especially with a poor clustering factor) and CODE2 which always contains the same value of “42” (and so will compress wonderfully for maximum effect).

I’ll next run the following query a number of times:

SQL> select code1, code2 from bowie_overload where code1=42;

10000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1883860831

--------------------------------------------------------------------------------------------
| Id  | Operation                 | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |                | 10000 | 70000 |  74817 (1) | 00:00:03 |
| * 1 | TABLE ACCESS STORAGE FULL | BOWIE_OVERLOAD | 10000 | 70000 |  74817 (1) | 00:00:03 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE1"=24)
    filter("CODE1"=24)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
     869893 consistent gets
     434670 physical reads
          0 redo size
     183890 bytes sent via SQL*Net to client
       7378 bytes received via SQL*Net from client
        668 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
      10000 rows processed

 

Without an index, the CBO currently has no choice but to perform a FTS. An index on the CODE1 column would provide the necessary filtering to fetch and return the required rows.

BUT, if this query was important enough, we could improve things further by “Overloading” this index with the CODE2 column, so we could use the index exclusively to get all the necessary data, without having to access the table at all. Considering an index on just the CODE1 column would need to fetch a reasonable number of rows (10000) and would need to visit a substantial number of different table blocks due to its poor clustering, overloading the index in this scenario would substantially reduce the necessary workloads of this query.

So what does AI do in this scenario, is overloading an index considered?

If we look at the AI report:

GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 28-APR-2022 12:15:45
Activity end                : 28-APR-2022 12:16:33
Executions completed        : 1
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                             : 1
Indexes created (visible / invisible)        : 1 (1 / 0)
Space used (visible / invisible)             : 134.22 MB (134.22 MB / 0 B)
Indexes dropped                              : 0
SQL statements verified                      : 2
SQL statements improved (improvement factor) : 2 (47.1x)
SQL plan baselines created                   : 0
Overall improvement factor                   : 47.1x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
| Owner | Table          | Index                | Key   | Type   | Properties |
-------------------------------------------------------------------------------
| BOWIE | BOWIE_OVERLOAD | SYS_AI_aat8t6ad0ux0h | CODE1 | B-TREE | NONE       |
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------

Parsing Schema Name : BOWIE
SQL ID              : bh5cuyv8ga0bt
SQL Text            : select code1, code2 from bowie_overload where code1=42
Improvement Factor  : 46.9x

Execution Statistics:
-----------------------------
                    Original Plan                Auto Index Plan
                    ---------------------------- ----------------------------
Elapsed Time (s):   42619069                     241844
CPU Time (s):       25387841                     217676
Buffer Gets:        12148771                     18499
Optimizer Cost:     74817                        10021
Disk Reads:         6085380                      9957
Direct Writes:      0                            0
Rows Processed:     140000                       10000
Executions:         14                           1

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 1883860831

--------------------------------------------------------------------------------
| Id | Operation         | Name           | Rows  | Bytes | Cost  | Time       |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |                |       |       | 74817 |            |
|  1 | TABLE ACCESS FULL | BOWIE_OVERLOAD | 10000 | 70000 | 74817 | 00:00:03   |
--------------------------------------------------------------------------------

- With Auto Indexes
-----------------------------
Plan Hash Value : 2541132923

---------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows  | Bytes | Cost  | Time       |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |  9281 | 64967 | 10021 | 00:00:01   |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_OVERLOAD       |  9281 | 64967 | 10021 | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_aat8t6ad0ux0h | 10000 |       |    18 | 00:00:01   |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE1"=42)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

We see that an automatic index on just the CODE1 column was created.

 

SQL> select index_name, auto, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_OVERLOAD';

INDEX_NAME                AUT VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_aat8t6ad0ux0h      YES VISIBLE   ADVANCED LOW  VALID      10000000       15363          10000000

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BOWIE_OVERLOAD' order by index_name, column_position;

INDEX_NAME                COLUMN_NAME     COLUMN_POSITION
------------------------- --------------- ---------------
SYS_AI_aat8t6ad0ux0h      CODE1                         1

 

If we now re-run the query (noting in Oracle21c after you invalidate the current cursor):

 

SQL> select code1, code2 from bowie_overload where code1=42;

10000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2541132923

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 |  Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      | 10000 | 70000 |   10021 (1)| 00:00:01 |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_OVERLOAD       | 10000 | 70000 |   10021 (1)| 00:00:01 |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_aat8t6ad0ux0h | 10000 |       |      18 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE1"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      10021 consistent gets
          0 physical reads
          0 redo size
      50890 bytes sent via SQL*Net to client
         63 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
      10000 rows processed

The query now uses the newly created automatic index.

BUT, at 10021 consistent gets, it’s still doing a substantial amount to work here.

If we manually create another index that overloads the only other column (CODE2) required in this query:

SQL> create index bowie_overload_code1_code2_i on bowie_overload(code1,code2) compress advanced low;

Index created.

I’m using COMPRESS ADVANCED LOW as used by the automatic index, noting that CODE2 only contains the value “42” for all rows, making it particularly perfect for compression and a “best case” scenario when it comes to the minimal overheads potentially associated with overloading this index (I’m trying yo give AI every chance here):

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_OVERLOAD';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_aat8t6ad0ux0h           YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       15363          10000000
BOWIE_OVERLOAD_CODE1_CODE2_I   NO  NO  VISIBLE   ADVANCED LOW  VALID      10000000       15363          10000000

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BOWIE_OVERLOAD' order by index_name, column_position;

INDEX_NAME                     COLUMN_NAME     COLUMN_POSITION
------------------------------ --------------- ---------------
BOWIE_OVERLOAD_CODE1_CODE2_I   CODE1                         1
BOWIE_OVERLOAD_CODE1_CODE2_I   CODE2                         2
SYS_AI_aat8t6ad0ux0h           CODE1                         1

In fact, my manually created index is effectively the same size as the automatic index, with the same number (15363) of leaf blocks.

So I’m giving AI the best possible scenario in which it could potentially create an overloaded index.

But I’ve never been able to get AI to create overloaded indexes. Only columns in filtering predicates are considered for inclusion in automatic indexes.

If I now re-run my query again:

SQL> select code1, code2 from bowie_overload where code1=42;

10000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1161047960

-------------------------------------------------------------------------------------------------
| Id  | Operation        | Name                         |  Rows | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                              | 10000 | 70000 |      18 (0)| 00:00:01 |
| * 1 | INDEX RANGE SCAN | BOWIE_OVERLOAD_CODE1_CODE2_I | 10000 | 70000 |      18 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("CODE1"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         21 consistent gets
          0 physical reads
          0 redo size
      50890 bytes sent via SQL*Net to client
         63 bytes received via SQL*Net from client
          3 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
      10000 rows processed

We notice the CBO now uses the manually created index without any table access path, as it can just use the index to access the necessary data.

The number of consistent gets as a result has reduced significantly, down to just 21, a fraction of the previous 10021 when the automatic index was used.

So the scenario an of overloaded index that could significantly reduce database resources, which is currently not supported by AI, is another example of where may want to manually create a necessary index.

As always, this may change in the future releases…

Oracle Database 19c Automatic Indexing: Index Compression Update (New Morning) January 27, 2021

Posted by Richard Foote in 19c, 19c New Features, Advanced Index Compression, Autonomous Database, Autonomous Transaction Processing, AUTO_INDEX_COMPRESSION, Exadata, Index Column Order, Index Compression, Oracle, Oracle Blog, Oracle General, Oracle Indexes, Oracle19c.
add a comment

 

I was reminded in a recent comment by Rajeshwaran Jeyabal that I hadn’t updated my post on Automatic Indexing with Advanced Compression that’s in need of a couple of amendments.

Initially when Automatic Indexing was released, the ability to set Advanced Compression was NOT included in the official documentation, although the EXEC DBMS_AUTO_INDEX.CONFIGURE( ‘AUTO_INDEX_COMPRESSION‘ , ‘ON’); option was readily accessible. This has now been fixed and the associated doco on setting Advanced Compression for Automatic Indexes can be found here.

The other significant change is that Advanced Compression Low is now the default behaviour when Automatic Indexes are created in the Oracle ATP Autonomous Database Cloud environment. This makes sense in that if you have access to the Advanced Compression option, setting all indexes to Advanced Compression Low is the no-brainer setting as I’ve discussed previously. So several of my more recent posts show how Automatic Indexes have been created with Advanced Compression Low set.

What hasn’t changed however is how Automatic Indexing does NOT consider the efficiency of an index in relation to Index Compression when deciding how to order the columns within the index.

The default order of columns within an index (when other SQL predicates are not a consideration) is simply the order by which the columns appear within the table. Even though an index could be significantly smaller thanks to Index Compression if columns with more repeated values are ordered first within an index, this is not something Automatic Indexing currently considers.

The demo in my original piece still works exactly the same in the current 19c database versions of the ATP Autonomous Cloud environments. Manually created indexes can be significantly smaller if index columns are reordered or dropped entirely if they don’t provide filtering benefits.

When reading my blog, please do take note of the date of blog piece, especially in relation to Automatic Indexing. Things are only accurate as at time of publication and may change subsequently.

I thank Rajeshwaran for getting me to pull my finger out and update my blog accordingly…

Oracle Database 19c Automatic Indexing: Index Compression (Ghosteen) December 9, 2019

Posted by Richard Foote in 19c, 19c New Features, Advanced Index Compression, Automatic Indexing, AUTO_INDEX_COMPRESSION, Index Column Order, Index Compression, Index Internals.
2 comments

 

 

In my previous post on Automatic Indexing, I discussed how the default index column order (in absence of other factors) is column id, the order in which the columns are defined in the table. In this post, I’ll explore if this changes if index compression is also implemented.

By default, Automatic Indexing does NOT use index compression. However, if you have access to the Advanced Compression option, you have the choice to turn on index compression in the following manner:

SQL> EXEC DBMS_AUTO_INDEX.CONFIGURE('AUTO_INDEX_COMPRESSION','ON');

PL/SQL procedure successfully completed.

Note: the AUTO_INDEX_COMPRESSION parameter is not actually documented, which could be a documentation bug or that Oracle is not yet ready to release this capability. The above will enable Automatic Indexes to be created with Compress Advanced Low, which is the “no-brain” index compression option as it will compress (deduplicate) safely with negligible CPU overheads. However, index column order is still critical to ensure effective compression as we shall see…

We begin by creating a simple table, that has four columns of interest, CODE1, CODE2, CODE3 and STATUS. They are defined in this order within the table, but CODE1 has the most number of distinct values (5000000 distinct values), then CODE2 (1000), then CODE3 (10) and finally STATUS which only has the 1 distinct value.

SQL> create table bowie_compress (id number, code1 number, code2 number, code3 number, status varchar2(42), name varchar2(42));

Table created.

SQL> insert into bowie_compress select rownum, mod(rownum, 5000000)+1, mod(rownum, 1000)+1, mod(rownum, 10)+1, 'COMPLETED’,

'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_COMPRESS');

PL/SQL procedure successfully completed.

 

In terms of being the most efficient from a compression perspective, it would be better to have the index defined in STATUS, CODE3, CODE2, CODE1 order, so that the leading columns in the index have the most duplicated values that enable effective deduplication and hence index compression. I’ve discussed the importance of index column for effective index compression a number of times previously. Arguably, it would be better not to actually index the STATUS column at all as with just 1 distinct value, provides no effective filtering benefits.

Having the CODE1 column as the leading column however with so many distinct values would effectively make the index non-compressible (with LOW compression), as the leading column would have too many distinct values to benefit much from compression.

So how does Automatic Indexing handle this scenario? Does it keep the same default index column order or does it alter the index column order to provide better index compression benefits?

Let’s run the following SQL with all four columns in the predicates and see what index Automatic Indexing creates…

 

SQL> select * from bowie_compress where code1=42 and code2=42 and code3=2 and status='COMPLETED';

Execution Plan

-----------------------------------------------------------------------------------------------
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                |     1 |    41 |   9998  (5)| 00:00:01 |
|   1 |  PX COORDINATOR              |                |       |       |            |          |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000       |     1 |    41 |   9998  (5)| 00:00:01 |
|   3 |    PX BLOCK ITERATOR         |                |     1 |    41 |   9998  (5)| 00:00:01 |
|*  4 |     TABLE ACCESS STORAGE FULL| BOWIE_COMPRESS |     1 |    41 |   9998  (5)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

4 - storage("CODE1"=42 AND "CODE2"=42 AND "CODE3"=2 AND "STATUS"='COMPLETED')
     filter("CODE1"=42 AND "CODE2"=42 AND "CODE3"=2 AND "STATUS"='COMPLETED')

Statistics
----------------------------------------------------------
        6 recursive calls
        0 db block gets
    63562 consistent gets
        0 physical reads
        0 redo size
     1038 bytes sent via SQL*Net to client
      588 bytes received via SQL*Net from client
        2 SQL*Net roundtrips to/from client
        0 sorts (memory)
        0 sorts (disk)
        2 rows processed

 

If we look at the Automatic Indexing report for the period in which the above SQL was run:

 

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 18-JUL-2019 00:18:35
Activity end                 : 18-JUL-2019 00:19:58
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates                                : 1
Indexes created (visible / invisible)           : 1 (1 / 0)
Space used (visible / invisible)                : 293.6 MB (293.6 MB / 0 B)
Indexes dropped                                 : 0
SQL statements verified                         : 1
SQL statements improved (improvement factor)    : 1 (63563.9x)
SQL plan baselines created                      : 0
Overall improvement factor                      : 63563.9x

-------------------------------------------------------------------------------
SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

-------------------------------------------------------------------------------
INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
--------------------------------------------------------------------------------------------------
| Owner | Table          | Index                | Key                      | Type   | Properties |
--------------------------------------------------------------------------------------------------
| BOWIE | BOWIE_COMPRESS | SYS_AI_bkdhrsd29vd4f | CODE1,CODE2,CODE3,STATUS | B-TREE | NONE       |
--------------------------------------------------------------------------------------------------

 

We see that Automatic Index has created the index with all four columns from the SQL predicate in again the default column order as the column order as defined in the table (CODE1, CODE2, CODE3, STATUS). Even though Automatic Index Compression was enabled, it hasn’t considered the column cardinalities in its determination of best index column order.

Automatic Indexing has the tendency to index ALL columns specified in SQL predicates, regardless of whether all such columns provide filtering benefits AND does not consider the best column order from a compression perspective when determining index column order. So definitely room for improvement here methinks.

If we look at the definition and size of the resultant Automatic Index:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_COMPRESS';

INDEX_NAME                   AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_bkdhrsd29vd4f         YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       35451          10000000

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BOWIE_COMPRESS' order by index_name, column_position;

INDEX_NAME                   COLUMN_NAME     COLUMN_POSITION
---------------------------- --------------- ---------------
SYS_AI_bkdhrsd29vd4f         CODE1                         1
SYS_AI_bkdhrsd29vd4f         CODE2                         2
SYS_AI_bkdhrsd29vd4f         CODE3                         3
SYS_AI_bkdhrsd29vd4f         STATUS                        4

 

We note the index has 35451 leaf blocks.

If we were to create the index manully in a more appropriate manner from a compression perspective, with the index columns defined in reverse order and also with another index without the redundant STATUS column:

SQL> create index bowie_compress_best_order_i on bowie_compress(status, code3, code2, code1) compress advanced low;

Index created.

SQL> create index bowie_compress_best_order2_i on bowie_compress(code3, code2, code1) compress advanced low;

Index created.

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_COMPRESS';

INDEX_NAME                   AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------------- --- --- --------- ------------- -------- ---------- ----------- -----------------
SYS_AI_bkdhrsd29vd4f         YES NO  VISIBLE   ADVANCED LOW  VALID      10000000       35451          10000000
BOWIE_COMPRESS_BEST_ORDER_I  NO  NO  VISIBLE   ADVANCED LOW  VALID      10000000       23509          10000000
BOWIE_COMPRESS_BEST_ORDER2_I NO  NO  VISIBLE   ADVANCED LOW  VALID      10000000       23462          10000000

 

We notice the resultant indexes are substantially smaller, at just 23509 and 23462 leaf blocks respectively.

It could well be that Index Compression is not yet documented because Oracle appreciates that more work yet needs to be done to create indexes optimally from a compression perspective…

Bitmap Index On Column With 212552698 Distinct Values, What Gives? (I’d Rather Be High) April 23, 2019

Posted by Richard Foote in Advanced Index Compression, Autonomous Data Warehouse, Autonomous Database, Index Compression, Oracle Indexes.
add a comment

The Next Day

In my previous post on Indexing The Autonomous Warehouse, I highlighted how it might be necessary to create indexes to improve the performance and scalability of highly selective queries, as it might on any Data Warehouse running on an Exadata platform. In the post, I created a Bitmap Index and showed how improve SQL performance insured.

I’ve had two subsequent correspondences asking why I decided to use a Bitmap Index in my example, when the column in question had a (relatively) massive number of distinct values, some 212552698. What gives, wouldn’t it have been far more efficient to have used a standard B-Tree index instead, as Bitmap Indexes are only suitable for columns that have low-medium numbers of distinct values? There’s nothing particularly low-medium cardinality about 212552698 distinct values.

It’s a good question and as 2 people asked basically the same question, thought it worthy of a separate post to try to address it.

Firstly, I must admit the only calculation I performed mentally when deciding which type of index to use was to look at the number of distinct values, 212552698 and the number of rows in the table, 1 billion and determine that there’s approximately 5 rows per distinct value. With 5 repeated values on average, this was sufficient for the Bitmap Index to likely be the more efficient type of index and so I created a Bitmap Index in my demo, considering this was a Data Warehouse environment where potential locking issues relating to concurrent DMLs was not an issue.

As I’ve discussed previously, as with this classic post from way back in 2010, “So What Is A Good Cardinality Estimate For A Bitmap Index Column“, it’s not the number of distinct values that’s important, it’s the column cardinality and the average number of repeated column values that’s important when deciding if a Bitmap Index might be appropriate in a Data Warehouse environment.

As above post discusses, a Bitmap Index might only need the one index entry per column value and as the resultant index bitmap string can be very very highly compressed if there are few actual occurrences per value, a column with as many as 5 repeated values of average would likely be more efficient than a corresponding B-Tree Index. A B-Tree Index with 5 repeated values would require the value to be stored 5 times, with their 5 corresponding rowids, whereas the Bitmap Index might only need to store the indexed value once with its 2 corresponding rowids and a trivial amount for the highly compressed bitmap string.

If we create an equivalent B-Tree Index to the previously created Bitmap Index:

SQL> create index big_ziggy_lo_orderkey_i2 on big_ziggy(lo_orderkey) invisible;

SQL> select index_name, index_type, leaf_blocks, compression from user_indexes
     where table_name = 'BIG_ZIGGY';

INDEX_NAME                     INDEX_TYPE LEAF_BLOCKS COMPRESSION
------------------------------ ---------- ----------- -------------
BIG_ZIGGY_LO_ORDERKEY_I        BITMAP          843144 DISABLED
BIG_ZIGGY_LO_ORDERKEY_I2       NORMAL         2506552 DISABLED

We notice the Bitmap Index at 843144 leaf blocks is indeed substantially smaller than the equivalent B-Tree Index at 2506552 leaf blocks.

However, this is the Oracle Autonomous Data Warehouse environment where I have accessed to the Advanced Compression option and the capability to potentially create highly compressed B-Tree index structures. See various previous posts on Index Advanced Compression.

Perhaps, I can create a smaller structure to the Bitmap Index by using Index Advanced Compress. Let’s try initially with Advanced Compression Low:

SQL> drop index big_ziggy_lo_orderkey_i2;

Index dropped.

SQL> create index big_ziggy_lo_orderkey_i3 on big_ziggy(lo_orderkey) compress advanced low invisible;
...

SQL> select index_name, index_type, leaf_blocks, compression from user_indexes
     where table_name = 'BIG_ZIGGY';

INDEX_NAME                     INDEX_TYPE LEAF_BLOCKS COMPRESSION
------------------------------ ---------- ----------- -------------
BIG_ZIGGY_LO_ORDERKEY_I        BITMAP          843144 DISABLED
BIG_ZIGGY_LO_ORDERKEY_I3       NORMAL         1921296 ADVANCED LOW

We notice the B-Tree Index is indeed now smaller at just 1921296 leaf blocks, down from 2506552 leaf blocks, but still not as small as the 843144 leaf blocks of the Bitmap Index.

However, this is a Data Warehouse environment where the DML overheads associated with the maintenance of Advanced Compression High indexes may not be of such concern. Additionally, I might have ample CPU resources to cater for any additional CPU requirements of accessing such Advanced Compression High indexes. Therefore, let’s create a B-Tree Index with Advanced Compression High:

SQL> drop index big_ziggy_lo_orderkey_i3;

Index dropped.

SQL> create index big_ziggy_lo_orderkey_i4 on big_ziggy(lo_orderkey) compress advanced high invisible;
...

SQL> select index_name, index_type, leaf_blocks, compression from user_indexes
     where table_name = 'BIG_ZIGGY';

INDEX_NAME                     INDEX_TYPE LEAF_BLOCKS COMPRESSION
------------------------------ ---------- ----------- -------------
BIG_ZIGGY_LO_ORDERKEY_I        BITMAP          843144 DISABLED
BIG_ZIGGY_LO_ORDERKEY_I4       NORMAL          786537 ADVANCED HIGH

We notice the index has significantly reduced further in size and at just 786537 leaf blocks in now indeed smaller than the corresponding Bitmap Index.

Note that although it might be beneficial to use Advanced Compressed High on Bitmap Indexes, such an option is not currently supported.

Of course your mileage will vary on which Index Type and Compression Option will be most appropriate depending on the characteristics of your data, however it might be worth considering these options in environments where you have access to these indexing options.

But yes, a Bitmap Index might indeed be the better option, even if there are billions of distinct values (if there are say 10s of billions of rows) for the columns to be indexed…

12.2 Index Advanced Compression “High” Part IV (The Width of a Circle) March 10, 2017

Posted by Richard Foote in 12c Rel 2, Advanced Index Compression, Index Compression, Oracle Indexes.
2 comments

DAVID_BOWIE_THE+MAN+WHO+SOLD+THE+WORLD-291998

A quick post (for me) with a long weekend beckoning…

In Part I, Part II and Part III of looking at the new Index Advanced Compression level of “High”, we discussed how it can significantly decrease the size of your indexes in a manner not previously possible. This can result in significant reductions of index storage and the potential subsequent reduction in IO and memory related overheads.

This is of course good.

However, if you have applications which have tables/indexes that are very sensitive regarding DML performance, you need to exercise caution before compressing indexes in this manner. This is due to the extra CPU overheads and file IO wait times that can result in maintaining the highly compressed index structures.

To quickly illustrate this point, let’s first look at the timings when performing DML with an associated index that has no compression:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> create index bowie_code_idx on bowie(code) nocompress;

Index created.

SQL> insert into bowie select rownum, rownum, 'ZIGGY STARDUST'
from dual connect by level <= 1000000;

1000000 rows created.
Elapsed: 00:00:06.61

SQL> commit;

Commit complete.

SQL> update bowie set code = 42 where id between 250000 and 499999;

250000 rows updated.
Elapsed: 00:00:12.91

If we now repeat the same demo, but this time with an index that’s compressed with advanced compression set to “HIGH”:


SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> create index bowie_code_idx on bowie(code) compress advanced high;

Index created.

SQL> insert into bowie select rownum, rownum, 'ZIGGY STARDUST' from dual connect by level <= 1000000;

1000000 rows created.
Elapsed: 00:00:39.08

SQL> commit;

Commit complete.

SQL> update bowie set code = 42 where id between 250000 and 499999;

250000 rows updated.
Elapsed: 00:01:09.83

We see there’s a significant increase in timings when both inserting into the table and when updating the highly compressed indexed column.

Therefore you need to consider the detrimental impact on DML performance due to the additional resources required in maintaining the highly compressed indexes, as it might offset the potential benefits of having the smaller index structures. Your mileage may vary.

More to come 🙂

12.2 Index Advanced Compression “High” Part III (I Can’t Give Everything Away) January 25, 2017

Posted by Richard Foote in 12c Rel 2, 12c Release 2 New Features, Advanced Index Compression, Oracle Indexes.
1 comment so far

If you like the look of the new Index Advanced Compression setting of “High” available with Oracle Database 12c Release 2 as I’ve discussed previously in Part I and Part II, well there’s a way to make this the default index compression method in your 12.2 database.

Let’s begin by creating a table and explicitly creating a NOCOMPRESS index in the BOWIE tablespace:


SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_code_idx on bowie(code) tablespace bowie nocompress;

Index created.

If we look at the size and compression type of this index:


SQL> select index_name, tablespace_name, leaf_blocks, compression
from dba_indexes where index_name='BOWIE_CODE_IDX';

INDEX_NAME      TABLESPACE_NAME      LEAF_BLOCKS COMPRESSION
--------------- -------------------- ----------- -------------
BOWIE_CODE_IDX  BOWIE                       1939 DISABLED

We notice the index has 1939 leaf blocks and that index compression is indeed disabled as expected.

Let’s now drop the index and recreate again it in the BOWIE tablespace, but this time without explicitly stating any compression option:


SQL> drop index bowie_code_idx;

Index dropped.

SQL> create index bowie_code_idx on bowie(code) tablespace bowie;

Index created.

SQL> exec dbms_stats.gather_index_stats(ownname=>null, indname=>'BOWIE_CODE_IDX');

PL/SQL procedure successfully completed.

If we look at the index now:


SQL> select index_name, tablespace_name, leaf_blocks, compression from dba_indexes
where index_name='BOWIE_CODE_IDX';

INDEX_NAME      TABLESPACE_NAME      LEAF_BLOCKS COMPRESSION
--------------- -------------------- ----------- -------------
BOWIE_CODE_IDX  BOWIE                        355 ADVANCED HIGH

We notice the index now only has 355 leaf blocks (down from 1939 leaf blocks) and that it has automatically used the new index advanced compression option of “HIGH”.

The secret lies with the following new settings.

Firstly, with the new db_index_compression_inheritance parameter, you can specify how during index creation the index inherits its index compression attributes (tablespace or table or not at all):


SQL> alter system set db_index_compression_inheritance=tablespace scope=both;

System altered.

Secondly, you can give a tablespace an index compression attribute on how indexes are compressed by default within the tablespace:


SQL> alter tablespace bowie default index compress advanced high;

Tablespace altered.

SQL> select tablespace_name, def_index_compression, index_compress_for
from dba_tablespaces where tablespace_name = 'BOWIE';

TABLESPACE_NAME      DEF_INDE INDEX_COMPRES
-------------------- -------- -------------
BOWIE                ENABLED  ADVANCED HIGH

So in this database, all indexes created within the BOWIE tablespace are automatically created with index advanced compression set to HIGH.

There are however some disadvantages with high index advanced compression that need to be considered.

More to come.

12.2 Index Advanced Compression “High” Part II (One Of My Turns) December 12, 2016

Posted by Richard Foote in 12c Rel 2, 12c Release 2 New Features, Advanced Index Compression, Oracle Indexes.
4 comments

In Part I, I introduced the new Index Advanced Compression default value of “HIGH”, which has the potential to significantly compress indexes much more than previously possible. This is due to new index compression algorithms that do more than simply de-duplicate indexed values within a leaf block.

Previously, any attempt to completely compress a Unique Index was doomed to failure as a Unique Index by definition only has unique values and so has nothing to de-duplicate. As such, you were previously restricted (quite rightly) to only being able to compress n-1 columns within a Unique Index. An attempt compress all columns in a Unique Index would only result in a larger index structure due to the associated overheads of the prefix-table within the leaf blocks.

But what happens if we now use Index Advanced Compression set to “HIGH” on a Unique Index ?

Let’s see.

Let’s first create a simple table with a unique ID column:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, rownum, 'ZIGGY STARDUST' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

Let’s start by creating an uncompressed unique index on the ID column:

SQL> create unique index bowie_id_i on bowie(id);

Index created.

SQL> select index_name, leaf_blocks, compression from user_indexes where index_name='BOWIE_ID_I';

INDEX_NAME   LEAF_BLOCKS COMPRESSION
------------ ----------- -------------
BOWIE_ID_I          2088 DISABLED

So the uncompressed unique index has 2088 leaf blocks.

If we try and use normal compression on the index:

SQL> alter index bowie_id_i rebuild compress;
alter index bowie_id_i rebuild compress
*
ERROR at line 1:
ORA-25193: cannot use COMPRESS option for a single column key

We get an error saying we’re not allowed to compress a single column unique index. Doing so makes no sense, as there’s no benefit in de-duplicating such an index.

If we attempt to use advanced index compression with a value of “LOW”:

SQL> alter index bowie_id_i rebuild compress advanced low;
alter index bowie_id_i rebuild compress advanced low
*
ERROR at line 1:
ORA-25193: cannot use COMPRESS option for a single column key

We get the same error. Although advanced index compression of LOW is clever enough to automatically compress only those leaf blocks where there is a benefit in compression, there can be no such index leaf block that benefits from compression via the de-duplication method. Therefore, the error is really there to just let you know that you’re wasting your time in attempting to do this on a unique index.

If however we use the new HIGH option with index advanced compression:

SQL> alter index bowie_code_i rebuild compress advanced high;

Index altered.

SQL> exec dbms_stats.gather_index_stats(ownname=>null, indname=>'BOWIE_ID_I');

PL/SQL procedure successfully completed.

SQL> select index_name, leaf_blocks, compression from user_indexes where index_name='BOWIE_ID_I';

INDEX_NAME   LEAF_BLOCKS COMPRESSION
------------ ----------- -------------
BOWIE_ID_I           965 ADVANCED HIGH

Not only does it not give us an error, but it has actually managed to successfully compress such a unique index containing nothing but a bunch of unique numbers to just 965 leaf blocks, down from 2088. The index is now less than half its previous size.

So any Oracle B-tree index, even if it’s a single column unique index, is a possible candidate to be compressed with “High” advanced index compression.

More to come.

12.2 Index Advanced Compression “High” – Part I (High Hopes) December 6, 2016

Posted by Richard Foote in 12c Rel 2, 12c Release 2 New Features, Advanced Index Compression, Oracle Indexes.
8 comments

Oracle first introduced Advanced Compression for Indexes in 12.1 as I’ve discussed here a number of times.

With Oracle Database 12c Release 2, you can now use Index Advanced Compression “High” to further (and potentially dramatically) improve the index compression ratio.  Instead of simply de-duplicating the index entries within an index leaf block, High Index Compression uses more complex compression algorithms and stores the index entries in a Compression Unit (similar to that as used with Hybrid Columnar Compression). The net result is generally a much better level of compression, but at the potential cost of more CPU resources to both access and maintain the index structures.

To give you an idea on the possible compression improvements, let’s re-run the demo I used previously when I first discussed Advanced Index Compression.

So I first create a table, where the CODE column that has many distinct values, but a portion (25%) of data that is replicated:

SQL> create table bowie (id number, code number, name varchar2(30));

Table created.

SQL> insert into bowie select rownum, rownum, 'ZIGGY STARDUST' from dual connect by level <= 1000000;

1000000 rows created.

SQL> update bowie set code = 42 where id between 250000 and 499999;

250000 rows updated.

SQL> commit;

Commit complete.

I then create an index on the CODE column and check out its initial size:

SQL> create index bowie_code_i on bowie(code);

Index created.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME   LEAF_BLOCKS COMPRESSION
------------ ----------- -------------
BOWIE_CODE_I        2158 DISABLED

 

If I just use normal compression on this index:

SQL> alter index bowie_code_i rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME   LEAF_BLOCKS COMPRESSION
------------ ----------- -------------
BOWIE_CODE_I        2684 ENABLED

 

We notice the index actually increases in size (2684 up from 2158), as most (75%) of the CODE values are unique and so the overheads associated with the resultant prefix table in the leaf blocks used with normal index compression overrides the savings of compression on the 25% of the index where compression is beneficial.

If we use “Low” advanced index compression as introduced in 12.1:

SQL> alter index bowie_code_i rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME   LEAF_BLOCKS COMPRESSION
------------ ----------- -------------
BOWIE_CODE_I        2057 ADVANCED LOW

 

We notice the index has now indeed decreased in size (2057 down from 2158), as Oracle has automatically compressed just the 25% of the index where compression was beneficial and not touched the 75% of the index where compression wasn’t possible when de-duplicating values.

If we now however use the new 12.2 Advanced Index Compression “High” option:

SQL> alter index bowie_code_i rebuild compress advanced high;

Index altered.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME   LEAF_BLOCKS COMPRESSION
------------ ----------- -------------
BOWIE_CODE_I           0 ADVANCED HIGH

Wow, an index with now no leaf blocks, that’s unbelievably small. Actually, I don’t believe it as this is due to bug 22094934. We need to gather index statistics to see the new index size:

 

SQL> exec dbms_stats.gather_index_stats(ownname=>null, indname=>'BOWIE_CODE_I');

PL/SQL procedure successfully completed.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME   LEAF_BLOCKS COMPRESSION
------------ ----------- -------------
BOWIE_CODE_I         815 ADVANCED HIGH

 

We notice that the index hasn’t just gone now a tad in size, but is now substantially smaller than before (down to just 815 leaf blocks, rather than the smaller 2057 from 2158 reduction we previously achieved with low index advanced compression.

So Index Advanced Compression, with the now default “HIGH” option can substantially reduce index sizes. Note this new capability of course requires the Advanced Compression Option.

More to come.

Index Advanced Compression: Multi-Column Index Part II (Blow Out) September 24, 2015

Posted by Richard Foote in Advanced Index Compression, Concatenated Indexes, Index Column Order, Index Rebuild, Oracle Indexes.
1 comment so far

I previously discussed how Index Advanced Compression can automatically determine not only the correct number of columns to compress, but also the correct number of columns to compress within specific leaf blocks of the index.

However, this doesn’t mean we can just order the columns within the index without due consideration from a “compression” perspective. As I’ve discussed previously, the column order within an index can be very important (especially with regard the use of the index if the leading column of the index is not specified in the SQL), including with regard to the possible compression capabilities of an index.

Advanced Index Compression does not change this and if we order columns inappropriately, one of the consequences can be the index simply can’t be compressed.

To illustrate, back to my simple little example:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <= 1000000; 1000000 rows created. SQL> commit;

Commit complete.

But this time, I’m going to create the index with the columns the other way around than I had in the previous post. The effectively unique ID column is now the leading column in the index, followed by the CODE column that indeed has many duplicate values. There is a “myth” that suggests this is actually a more “efficient” way to order an index, put the column with most distinct values first in the index. This is of course not true (yes, I’ve covered this one before as well).

SQL> create index bowie_idx on bowie(id, code) pctfree 0;

Index created.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2363          2

So the index without compression has 2363 leaf blocks.

As the leading column is effectively unique, we simply can’t now compress this index effectively. That’s because compression requires there to be duplicate index entries starting with at least the leading column. If the leading column has few (or no) duplicates, then by compressing the index Oracle is effectively creating a prefixed entry within the leaf block for each and every index entry. The whole point of index (or table) compression is to effectively de-duplicate the index values but there’s nothing to de-duplicate if there are no repeating values in at least the leading column of the index.

If we attempt to just compress fully the index anyways:

SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3115          2

It actually results in a bigger, not smaller index. The leaf blocks has gone up from 2363 to 3115.

Unlike the previous post where the columns in the index were the other way around, if we attempt to just compress the first column, it makes no difference to the inefficiency of the index compression because the number of prefix entries we create remains exactly the same:

SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3115          2

So the index remains at the higher 3115 leaf blocks.

The good thing with Advanced Index Compression is that we can “give it a go”, but it will not result in a larger index structure. If there’s nothing to compress within a leaf block, Oracle just ignores it and moves on to the next leaf block. If there’s nothing to compress at all within the index, the index remains the same as if it’s not been compressed:

SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2363          2

So the index is now back to 2363 leaf blocks, the same as if it wasn’t compressed at all. No it hasn’t helped, but at least it hasn’t made things worse.

So the order of the columns still plays a vital part in the “compress-ability” of the index, even with Index Advanced Compression at your disposal. If both the ID and CODE columns are referenced in your code, then having CODE as the leading column of the index would both improve the manner in which the index can be compressed and make a Skip-Scan index scan viable in the case when the CODE column might not occasionally be specified.

Now, if we change the leading column and create some duplicates (in this case, we update about 10% of the rows to now have duplicate values in the leading ID column):

SQL> update bowie set id=42 where id between 442000 and 542000;

100001 rows updated.

SQL> commit;

Commit complete.

SQL> alter index bowie_idx rebuild nocompress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2338          2

With a whole bunch of IDs with a value of 42, the non-compressed index now has 2338 leaf blocks. Yes, 10% of the leading columns have duplicates, but 90% of the index doesn’t and remains effectively unique. So if we try and compress this index now:

SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2941          2

The compressed index now has 2941 leaf blocks and is still larger than the 2338 non-compressed index. Yes, it’s compressed the 10% of the index that it could, but the inefficiencies in dealing with the other 90% has resulted in an overall larger index. So not too good really.

Again, compressing just the leading ID column doesn’t improve matters:

SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2977          2

In fact, at 2977 it’s even worse than compressing all the index because by compressing both columns, we could also effectively compress the duplicate CODE columns as well within that 10% of the index where we had duplicate ID values. With compressing just the ID column, we don’t get the benefit of compressing the duplicate CODE values. So not very good either.

In either case, compressing the index is ineffective as we end up with a bigger, not smaller index.

But not with Index Advanced Compression:

SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2265          2

We now have a index structure at just 2265 leaf blocks that is indeed smaller than the non-compressed index (2338 leaf blocks) because Oracle can now compress just the 10% of index where compression is effective and just ignore the rest of the index (90%) where compression is ineffective.

The best of both worlds, where Index Advanced Compression can compress just the part of an index where it effectively can and ignore and not make matters worse in any parts of the index where index compression is ineffective.

An indexing no-brainer …

Index Advanced Compression: Multi-Column Index Part I (There There) September 17, 2015

Posted by Richard Foote in 12c, Advanced Index Compression, Concatenated Indexes, Index Rebuild, Oracle Indexes.
1 comment so far

I’ve discussed Index Advanced Compression here a number of times previously. It’s the really cool additional capability introduced to the Advanced Compression Option with 12.1.0.2, that not only makes compressing indexes a much easier exercise but also enables indexes to be compressed more effectively than previously possible.

Thought I might look at a multi-column index to highlight just how truly cool this new feature is in automatically managing the compression of indexes.

First, let’s create a little table and multi-column index:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(code, id) pctfree 0;

Index created.

OK, the key thing to note here is that the leading CODE column in the index only has 10 distinct values and so is repeated very frequently. However, the second ID column is effectively unique such that the index entry overall is also likewise effectively unique. I’ve created this index initially with no compression, but with a PCTFREE 0 to make the non-compressed index as small as possible.

If we look at the size of the index:

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2361          2

We notice the index currently has 2361 leaf blocks.

I’ve previously discussed how index compression basically de-duplicates the indexed values by storing them in a pre-fixed table within the index leaf block. These pre-fixed entries are them referenced in the actual index entries, meaning it’s only now necessary to store repeated values once within a leaf block. Only repeated index values within an index leaf block can therefore be effectively compressed.

In this example, it would be pointless in compressing both indexed columns as this would only result in a unique pre-fixed entry for each any every index entry, given that the ID column is unique. In fact, the overhead of having the pre-fixed table for each and every index entry would actually result in a larger, not small overall index structure.

To show how compressing the whole index would be a really dumb idea for this particular index:

SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3120          2

The COMPRESS option basically compresses the whole index and we note that rather than creating a smaller, compressed index structure, the index is in fact bigger at 3120 leaf blocks.

However, as the leading CODE column in the index only has 10 distinct values and so is heavily repeated, it would make sense to just compress this first CODE column only in the index. This of course requires us to fully understand the data associated with the index.

We can do this by specifying just how many leading columns to compress (in this case just 1):

SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2002          2

We note the index is indeed smaller than it was originally, now at just 2002 leaf blocks.

So this requires us to make the correct decision in how many columns in the index to compress. Getting this wrong can result in a worse, not better overall index structure.

Now with Advanced Index Compression, we don’t have to make this decision, we can simply let Oracle do it for us. As discussed previously, Oracle can go through each leaf block and decide how to best compress each leaf block. In this case, it can automatically determine that it’s only beneficial to compress the CODE column throughout the index.

If we compress this index with the new COMPRESS ADVANCED LOW clause:

SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2002          2

We note we get the index at the nice, small 2002 leaf blocks, as if we used the correct COMPRESS 1 decision.

However, the story gets a little better than this …

Let’s now modify the contents of the table so that we create some duplicates also for the second ID column:

SQL> update bowie set id=42 where id between 442000 and 542000;

100001 rows updated.

SQL> commit;

Commit complete.

OK, so for about 10% of rows, the ID column value is indeed repeated with the value 42. However, for the remaining 90% of rows (and hence index entries), the ID column remains effectively unique. So we have this 10% section of the index where ID is indeed heavily repeated with the value 42, but everywhere else within the index the ID remain unique.

If we rebuild this index again with no compression:

SQL> alter index bowie_idx rebuild nocompress pctfree 0;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2336          2

We now end up with 2336 leaf blocks (a little smaller than before the update as we’re replacing 10% of the IDs with a smaller value of just 42).

However, the vast majority (90%) of the index entries are still unique, so attempting to compress the entire index is again unlikely to be beneficial:

SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2946          2

Indeed, the index is again now bigger at 2946 than it was when it wasn’t compressed.

We can again effectively compress just the CODE column in the index:

SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         1977          2

OK, just compressing the CODE column has indeed resulted in a smaller index structure (just 1977 leaf blocks) as it did before.

Without Advanced Index Compression we have the option to not compress the index (the result is average), compress both columns (the result is worse) or compress just the leading column (the result is better). It’s an all or nothing approach to index compression with the best method decided at the overall index level.

We don’t have the option to compress just the leading column when it makes sense to do so, but to also compress both columns in just the 10% portion of the index where it also makes sense to do so (when we have lots of repeating 42 values for ID).

We do have this option though with Advanced Index Compression and indeed this is performed automatically by Oracle in just those leaf blocks where it’s beneficial because the decision on how to compress an index is not performed at the overall index level but at the leaf block level. As such, Advanced Index Compression has the potential to compress an index in a manner that was simply not possible previously:

SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         1941          2

We notice the index is now even smaller at just 1941 leaf blocks than it was when just compressing the leading column as we now also compress the CODE column in just that 10% of the table where we also had repeating ID values.

I can’t emphasise enough just how cool this feature is !!

In fact, I would recommend something I don’t usually recommend and that is rebuilding all your indexes at least once (where you know the leading column has some repeated values) with the Advanced Index Compression option, so that all indexes can be compressed to their optimal manner.

Note though that this does require the Advanced Compression Option !!

More later 🙂

Index Advanced Compression vs. Bitmap Indexes (Candidate) October 31, 2014

Posted by Richard Foote in 12c, Advanced Index Compression, Bitmap Indexes, Oracle Indexes.
7 comments

A good question from Robert Thorneycroft I thought warranted its own post. He asked:

I have a question regarding bitmapped indexes verses index compression. In your previous blog titled ‘So What Is A Good Cardinality Estimate For A Bitmap Index Column ? (Song 2)’ you came to the conclusion that ‘500,000 distinct values in a 1 million row table’ would still be a viable scenario for deploying bitmapped indexes over non-compressed b-tree indexes.

Now b-tree index compression is common, especially with the release of Advanced Index Compression how does this affect your conclusion? Are there still any rules of thumb which can be used to determine when to deploy bitmapped indexes instead of compressed b-tree indexes or has index compression made bitmapped indexes largely redundant?”

 

If you’re not familiar with Bitmap Indexes, it might be worth having a read of my previous posts on the subject.

Now Advanced Index Compression introduced in 12.1.0.2 has certainly made compressing indexes a lot easier and in many scenarios, more efficient than was previously possible. Does that indeed mean Bitmap Indexes, that are relatively small and automatically compressed, are now largely redundant ?

The answer is no, Bitmap Indexes are still highly relevant in Data Warehouse environments as they have a number of key advantages in the manner they get compressed over B-Tree Indexes.

Compression of a B-Tree index is performed within a leaf block where Oracle effectively de-duplicates the index entries (or parts thereof). This means that a highly repeated index value might need to be stored repeatedly in each leaf block. Bitmap index entries on the other hand can potentially span the entire table and only need to be split if the overall size of the index entries exceeds 1/2 a block. Therefore, the number of indexed values stored in a Bitmap Index can be far less than with a B-tree.

However, it’s in the area of storing the associated rowids where Bitmap Indexes can have the main advantage. With a B-tree index, even when highly compressed, each and every index entry must have an associated rowid stored in the index. If you have say 1 million index entries, that’s 1 million rowids that need to be stored, regardless of the compression ratio. With a Bitmap Index, an index entry has 2 rowids to specify the range of rows covered by the index entry, but this might be sufficient to cover the entire table. So depending on the number of distinct values being indexed in say a million row table, there may be dramatically fewer than 1 million rowids stored in the Bitmap Index.

To show how Bitmap Indexes are generally much smaller than corresponding compressed B-Tree indexes, a few simple examples.

In example 1, I’m going to create a B-Tree Index that is perfect candidate for compression. This index has very large indexed values that are all duplicates and so will compress very effectively:

SQL> create table ziggy (id number, weird varchar2(100));

Table created.

SQL> insert into ziggy select rownum, 'THE RISE AND FALL OF ZIGGY STARDUST AND THE SPIDERS FROM MARS'
     from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index ziggy_weird_i on ziggy(weird) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY_WEIRD_I';

INDEX_NAME        BLEVEL LEAF_BLOCKS   NUM_ROWS
------------- ---------- ----------- ----------
ZIGGY_WEIRD_I          2        9175    1000000

SQL> drop index ziggy_weird_i2;

Index dropped.

SQL> create index ziggy_weird_i on ziggy(weird) pctfree 0 compress advanced low;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY_WEIRD_I';

INDEX_NAME        BLEVEL LEAF_BLOCKS   NUM_ROWS
------------- ---------- ----------- ----------
ZIGGY_WEIRD_I          2        1389    1000000

 

So this index has compressed down from 9175 leaf blocks to just 1389. That’s impressive.

However, this scenario is also the perfect case for a Bitmap Index with large, highly repeated index entries. If we compare the compressed B-Tree Index with a corresponding Bitmap index:

SQL> create bitmap index ziggy_weird_i on ziggy(weird) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY_WEIRD_I';

INDEX_NAME        BLEVEL LEAF_BLOCKS   NUM_ROWS
------------- ---------- ----------- ----------
ZIGGY_WEIRD_I          1          21         42

 

At just a tiny 21 leaf blocks, the Bitmap Index wins by a mile.

In example 2, I’m going to create an index that still almost a perfect case for compressing a B-Tree Index, but far less so for a Bitmap Index. I’m going to create enough duplicate entries to just about fill a specific leaf block, so that each leaf block only has 1 or 2 distinct index values. However, as we’ll have many more distinct indexed values overall, this means we’ll need more index entries in the corresponding Bitmap Index.

SQL> create table ziggy2 (id number, weird varchar2(100));

Table created.

SQL> insert into ziggy2 select rownum, 'THE RISE AND FALL OF ZIGGY STARDUST AND THE SPIDERS FROM MARS'||mod(rownum,1385)
     from dual connect by level<=1000000;

1000000 rows created.

SQL> commit;

Commit complete.
SQL> create index ziggy2_weird_i on ziggy2(weird) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY2_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY2_WEIRD_I          2        9568    1000000

SQL> drop index ziggy2_weird_i;

Index dropped.

SQL> create index ziggy2_weird_i on ziggy2(weird) pctfree 0 compress advanced low;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY2_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY2_WEIRD_I          2        1401    1000000

 

So we have a relatively large indexed column that has some 1385 distinct values but each value just about fills out a compress leaf block. If we look at the compression of the index, we have reduced the index down from 9568 leaf blocks to just 1401 leaf blocks. Again, a very impressive compression ratio.

Unlike the previous example where we had just the one value, we now have some 1385 index entries that need to be created as a minimum for our Bitmap Index. So how does it compare now ?

SQL> drop index ziggy2_weird_I;

Index dropped.

SQL> create bitmap index ziggy2_weird_i on ziggy2(weird) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY2_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY2_WEIRD_I          2         462       1385

 

Although the Bitmap Index is much larger than it was in the previous example, at just 464 leaf blocks it’s still significantly smaller than the corresponding compressed 1401 leaf block B-Tree index.

OK, example 3, we’re going to go into territory where no Bitmap Index should tread (or so many myths would suggest). We going to index a column in which each value only has the one duplicate. So for our 1 million row table, the column will have some 500,000 distinct values.

With relatively few duplicate column values, the compression of our B-Tree Indexes is not going to be as impressive. However, because the indexed values are still relatively large, any reduction here would likely have some overall impact:

SQL> create table ziggy3 (id number, weird varchar2(100));

Table created.

SQL> insert into ziggy3 select rownum, 'THE RISE AND FALL OF ZIGGY STARDUST AND THE SPIDERS FROM MARS'||mod(rownum,500000)
     from dual connect by level<=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index ziggy3_weird_i on ziggy3(weird) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY3_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY3_WEIRD_I          2        9891    1000000

SQL> drop index ziggy3_weird_i;

Index dropped.

SQL> create index ziggy3_weird_i on ziggy3(weird) pctfree 0 compress advanced low;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY3_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY3_WEIRD_I          2        6017    1000000

 

So the compression ratio is not as good now, coming down to 6017 leaf blocks from 9891. However, this will surely be better than a Bitmap Index with 500,000 distinct values …

 

SQL> drop index ziggy3_weird_i;

Index dropped.

SQL> create bitmap index ziggy3_weird_i on ziggy3(weird) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY3_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY3_WEIRD_I          2        5740     500000

 

So even in this extreme example, the Bitmap Index at 5740 leaf blocks is still smaller than the corresponding compressed B-Tree Index at 6017 leaf blocks.

In this last example 4, it’s a scenario similar to the last one, except the index entries themselves are going to be much smaller (a few byte number column vs. the 60 odd byte varchar2). Therefore, the rowids of the index entries will be a much larger proportion of the overall index entry size. Reducing the storage of index values via compression will be far less effective, considering the prefix table in a compressed index comes with some overhead.

SQL> create table ziggy4 (id number, weird number);

Table created.

SQL> insert into ziggy4 select rownum, mod(rownum,500000) from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index ziggy4_weird_i on ziggy4(weird) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY4_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY4_WEIRD_I          2        1998    1000000

SQL> drop index ziggy4_weird_i;

Index dropped.

SQL> create index ziggy4_weird_i on ziggy4(weird) pctfree 0 compress advanced low;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY4_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY4_WEIRD_I          2        1998    1000000

 

So Index Advanced Compression has decided against compressing this index, it’s just not worth the effort. If we force compression:

 

SQL> drop index ziggy4_weird_i;

Index dropped.

SQL> create index ziggy4_weird_i on ziggy4(weird) pctfree 0 compress;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY4_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY4_WEIRD_I          2        2065    1000000

 

We notice the index has actually increased in size, up to 2065 leaf blocks from 1998. The overheads of the prefix table over-ride the small efficiencies of reducing the duplicate number indexed values.

Meanwhile the corresponding Bitmap Index:

SQL> drop index ziggy4_weird_i;

Index dropped.

SQL> create bitmap index ziggy4_weird_i on ziggy4(weird) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='ZIGGY4_WEIRD_I';

INDEX_NAME         BLEVEL LEAF_BLOCKS   NUM_ROWS
-------------- ---------- ----------- ----------
ZIGGY4_WEIRD_I          2        1817     500000

 

Is still smaller at 1817 leaf blocks than the best B-Tree index has to offer.

So the answer is no, Bitmap Indexes are not now redundant now we have Index Advanced Compression. In Data Warehouse environments, as long as they don’t reference column values that are approaching uniqueness,  Bitmap Indexes are likely going to be smaller than corresponding compressed B-Tree indexes.

Index Compression Part VI: 12c Index Advanced Compression Block Dumps (Tumble and Twirl) October 9, 2014

Posted by Richard Foote in 12c, Advanced Index Compression, Block Dumps, Index Compression, Oracle Indexes.
5 comments

Sometimes, a few pictures (or in this case index block dumps) is better than a whole bunch of words 🙂

In my previous post, I introduced the new Advanced Index Compression feature, whereby Oracle automatically determines how to best compress an index. I showed a simple example of an indexed column that had sections of index entries that were basically unique (and so don’t benefit from compression) and other sections with index entries that had many duplicates (that do compress well). Advanced Index Compression enables Oracle to automatically just compress those index leaf blocks where compression is beneficial.

If we look at a couple of partial block dumps from this index, first a dump from a leaf block that did have duplicate index entries:

Leaf block dump
===============
header address 216542820=0xce82e64
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0xa0: opcode=0: iot flags=-C- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 651
kdxcofbo 1346=0x542
kdxcofeo 2172=0x87c
kdxcoavs 826
kdxlespl 0
kdxlende 0
kdxlenxt 25166046=0x18000de
kdxleprv 25166044=0x18000dc
kdxledsz 0
kdxlebksz 8036
kdxlepnro 1
kdxlepnco 1 (Adaptive)
prefix row#0[8031] flag: -P—–, lock: 0, len=5
col 0; len 2; (2):  c1 2b
prc 651
row#0[8022] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 5c
psno 0
row#1[8013] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 5d
psno 0
row#2[8004] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 5e
psno 0
row#3[7995] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 5f
psno 0
row#4[7986] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 60
psno 0

row#650[2172] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 8d 00 10
psno 0
—– end of leaf block Logical dump —–

The red section is a portion of the index header that determines the number of rows in the prefix table of the index (kdxlepnro 1). The prefix table basically lists all the distinct column values in the leaf blocks that are to be compressed. The value 1 denotes there is actually only just the 1 distinct column value in this specific leaf block (i.e. all index entries have the same indexed value). This section also denotes how many of the indexed columns are to be compressed (kdxlepnco 1). As this index only has the one column, it also has a value of 1. Note this value can potentially be anything between 0 (no columns compressed) up to the number of columns in the index.  The (Adaptive) reference tells us that Index Advanced Compression has been used and that the values here can change from leaf block to leaf block depending on the data characteristics of the index entries within each leaf block (a dump of a basic compressed index will not have the “Adaptive” reference).

The green section is the compression prefix table and details all the unique combinations of index entries to be compressed within the leaf block. As all indexed values are the same in this index (value 42, internally represented as c1 2b hex), the prefix table only has the one row. prc 651 denotes that all 651 index entries in this leaf block have this specific indexed value.

Next follows all the actual index entries, which now only consist of the rowid (the 6 byte col 0 column) as they all reference psno 0, which is the unique row id of the only row within the prefix table (row#0).

So rather than storing the indexed value 651 times, we can just store the index value (42) just the once within the prefix table and simply reference it from within the actual index entries. This is why index compression can save us storage, storing something once within a leaf block rather than multiple times.

If we now look at a partial block dump of another index leaf block within the index, that consists of many differing (basically unique) index entries:

Leaf block dump
===============
header address 216542820=0xce82e64
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0xa0: opcode=0: iot flags=-C- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 449
kdxcofbo 938=0x3aa
kdxcofeo 1754=0x6da
kdxcoavs 816
kdxlespl 0
kdxlende 0
kdxlenxt 25168667=0x1800b1b
kdxleprv 25168665=0x1800b19
kdxledsz 0
kdxlebksz 8036
kdxlepnro 0
kdxlepnco 0 (Adaptive)
row#0[8022] flag: ——-, lock: 0, len=14
col 0; len 4; (4):  c3 58 3d 2c
col 1; len 6; (6):  01 80 12 e6 00 41
row#1[8008] flag: ——-, lock: 0, len=14
col 0; len 4; (4):  c3 58 3d 2d
col 1; len 6; (6):  01 80 12 e6 00 42
row#2[7994] flag: ——-, lock: 0, len=14
col 0; len 4; (4):  c3 58 3d 2e
col 1; len 6; (6):  01 80 12 e6 00 43


row#448[1754] flag: ——-, lock: 0, len=14
col 0; len 4; (4):  c3 58 41 5c
col 1; len 6; (6):  01 80 12 ee 00 1d
—– end of leaf block Logical dump —–

We notice that in the red section, both kdxlepnro 0 and kdxlepnco 0 (Adaptive) have a value of 0, meaning we have no rows and no columns within the prefix table. As such, we have no prefix table at all here and that this leaf block has simply not been compressed.

If we look at the actual index entries, they all have an additional column now in blue, that being the actual indexed value as all the index values in this leaf block are different from each other. Without some form of index entry duplication, there would be no benefit from compression and Index Advanced Compression has automatically determined this and not bothered to compress this leaf block. An attempt to compress this block would have actually increased the necessary overall storage for these index entries, due to the additional overheads associated with the prefix table (note it has an additional 2 byes of overhead per row within the prefix table).

I’ll next look at an example of a multi-column index and how Index Advanced Compression handles which columns in the index to compress.

Index Compression Part V: 12c Advanced Index Compression (Little Wonder) October 2, 2014

Posted by Richard Foote in 12c, Advanced Index Compression, Index Compression, Oracle Indexes.
3 comments

I’ve finally managed to find some free time in the evening to write a new blog piece 🙂

This will have to be the record for the longest time between parts in a series, having written Part IV of this Index Compression series way way back in February 2008 !! Here are the links to the previous articles in the series:

Index Compression Part I (Low)

Index Compression Part II (Down Is The New Up)

Index Compression Part III (2+2=5)

Index Compression Part IV (Packt Like Sardines In a Crushd Tin Box)

 

As I’ve previously discussed, compressing an index can be an excellent way to permanently reduce the size of an index in a very cost effective manner. Index entries with many duplicate values (or duplicate leading columns within the index) can be “compressed” by Oracle to reduce both storage overheads and potentially access overheads for large index scans. Oracle basically de-duplicates repeated indexed column values within each individual leaf block by storing each unique occurrence in a prefix section within the block, as I explain in the above links.

But it’s important to compress the right indexes in the right manner. If indexes do not have enough repeated data, it’s quite possible to make certain indexes larger rather than smaller when using compression (as the overheads of having the prefix section in the index block outweighs the benefits of limited reduction of repeated values). So one needs to be very selective on which indexes to compress and take care to compress the correct number of columns within the index. Oracle will only protect you from yourself if you attempt to compress all columns in a unique index, as in this scenario there can be no duplicate values to compress. This is all discussed in Part II and Part III of the series.

So, wouldn’t it be nice if Oracle made it all a lot easier for us and automatically decided which indexes to compress, which columns within the index to compress and which indexes to simply not bother compressing at all. Additionally, rather than an all or nothing approach in which all index leaf blocks are compressed in the same manner, wouldn’t it be nice if Oracle decided for each and every individual leaf block within the index how to best compress it. For those index leaf block that have no duplicate entries, do nothing, for those with some repeated columns just compress them and for those leaf blocks with lots of repeated columns and values to compress all of them as efficiently as possible.

Well, wish no more 🙂

With the recent release of Oracle Database 12.1.0.2, one of the really cool new features that got introduced was Advanced Index Compression. Now a warning from the get-go. The use of Advanced Index Compression requires the Advanced Compression Option and this option is automatically enabled with Enterprise Edition. So only use this feature if you are licensed to do so 🙂

The best way as always to see this new feature in action is via a simple little demo.

To begin, I’ll create a table with a CODE column that is populated with unique values:

SQL> create table bowie (id number, code number, name varchar2(30));

Table created.

SQL> insert into bowie select rownum, rownum, 'ZIGGY STARDUST' from dual connect by level <= 1000000;

1000000 rows created.

I’ll now create a section of data within the table in which we have many repeated values:

SQL> update bowie set code = 42 where id between 250000 and 499999;

250000 rows updated.

SQL> commit;

Commit complete.

So I’ve fabricated the data such that the values in the CODE column are effectively unique within 75% of the table but the other 25% consists of repeated values.

From an index compression perspective, this index really isn’t a good candidate for normal compression as most of the CODE data contains unique data that doesn’t compress. However, it’s a shame that we can’t easily just compress the 25% of the index that would benefit from compression (without using partitioning or some such).

If we create a normal B-Tree index on the CODE column without compression:

SQL> create index bowie_code_i on bowie(code);

Index created.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME           LEAF_BLOCKS COMPRESSION
-------------------- ----------- -------------
BOWIE_CODE_I                2157 DISABLED

 

We notice the index consists of 2157 leaf blocks.

If we now try to use normal compression on the index:

SQL> alter index bowie_code_i rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME           LEAF_BLOCKS COMPRESSION
-------------------- ----------- -------------
BOWIE_CODE_I                2684 ENABLED

 

We notice that the compressed index rather than decrease in size has actually increased in size, up to 2684 leaf blocks. So the index has grown by some 25% due to the fact the index predominately contains unique values which don’t compress at all and the resultant prefix section in the leaf blocks becomes nothing more than additional overhead. The 25% section of the index containing all the repeated values has indeed compressed effectively but these savings are more than offset by the increase in size associated with the other 75% of the index where the index entries had no duplication.

However, if we use the new advanced index compression capability via the COMPRESS ADVANCED LOW clause:

SQL> alter index bowie_code_i rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME           LEAF_BLOCKS COMPRESSION
-------------------- ----------- -------------
BOWIE_CODE_I                2054 ADVANCED LOW

 

We notice the index has now indeed decreased in size from the original 2157 leaf blocks down to 2054. Oracle has effectively ignored all those leaf blocks where compression wasn’t viable and compressed just the 25% of the index where compression was effective. Obviously, the larger the key values (remembering the rowids associated with the index entries can’t be compressed) and the larger the percentage of repeated data, the larger the overall compression returns.

With Advanced Index Compression, it’s viable to simply set it on for all your B-Tree indexes and Oracle will uniquely compress automatically each individual index leaf block for each and every index as effectively as it can for the life of the index.

12.1.0.2 Released With Cool Indexing Features (Short Memory) July 25, 2014

Posted by Richard Foote in 12c, Advanced Index Compression, Attribute Clustering, Database In-Memory, Zone Maps.
2 comments

Oracle Database 12.1.0.2 has finally been released and it has a number of really exciting goodies from an indexing perspective which include:

  • Database In-Memory Option, which enables specific portions of the database to be in dual format, in both the existing row based format and additionally into an efficient memory only columnar based format. This in turn enables analytical based processing to access the real-time data in the In-Memory Store extremely fast, potentially faster and more effectively than via standard analytical based database indexes.
  • Advanced Index Compression, which allows Oracle to automatically choose the appropriate compression method for each individual leaf block, rather than having to manually select a single compression method across the whole index. This makes compressing an index a breeze and much more effective than previously possible.
  • Zone Maps, which enables Storage Index like capabilities to be manually configured and physically implemented inside the database, to eliminate unnecessary accesses of table storage via much smaller objects than conventional database indexes.
  • Attribute Clustering, a new table attribute which enables much better clustering of table data and we all know how both compression and index structures love table data to be well clustered.

These are all topics I’ll be covering in the coming weeks so stay tuned 🙂