Index Advanced Compression: Multi-Column Index Part II (Blow Out) September 24, 2015
Posted by Richard Foote in Advanced Index Compression, Concatenated Indexes, Index Column Order, Index Rebuild, Oracle Indexes.
I previously discussed how Index Advanced Compression can automatically determine not only the correct number of columns to compress, but also the correct number of columns to compress within specific leaf blocks of the index.
However, this doesn’t mean we can just order the columns within the index without due consideration from a “compression” perspective. As I’ve discussed previously, the column order within an index can be very important (especially with regard to the use of the index if the leading column of the index is not specified in the SQL), including with regard to the possible compression capabilities of an index.
Advanced Index Compression does not change this, and if we order columns inappropriately, one of the consequences can be that the index simply can’t be compressed.
To illustrate, back to my simple little example:
SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.
But this time, I’m going to create the index with the columns the other way around from the previous post. The effectively unique ID column is now the leading column in the index, followed by the CODE column that indeed has many duplicate values. There is a “myth” that suggests this is actually a more “efficient” way to order an index: put the column with the most distinct values first. This is of course not true (yes, I’ve covered this one before as well).
SQL> create index bowie_idx on bowie(id, code) pctfree 0;

Index created.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2363          2
So the index without compression has 2363 leaf blocks.
As the leading column is effectively unique, we simply can’t now compress this index effectively. That’s because compression requires there to be duplicate index entries starting with at least the leading column. If the leading column has few (or no) duplicates, then by compressing the index Oracle effectively creates a prefix entry within the leaf block for each and every index entry. The whole point of index (or table) compression is to de-duplicate the indexed values, but there’s nothing to de-duplicate if there are no repeating values in at least the leading column of the index.
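As an aside, Oracle can tell us as much itself. Validating the index structure populates INDEX_STATS with Oracle’s view of the optimal prefix column count (OPT_CMPR_COUNT) and the estimated space saving (OPT_CMPR_PCTSAVE). Something along these lines (noting that VALIDATE STRUCTURE locks the table while it runs, and that INDEX_STATS only holds the most recently validated index in the current session):

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select name, opt_cmpr_count, opt_cmpr_pctsave from index_stats;

-- OPT_CMPR_COUNT is Oracle's suggested number of prefix columns and OPT_CMPR_PCTSAVE the
-- estimated space saving; with the effectively unique ID column leading, I would expect a
-- suggested prefix count of 0 and next to no saving.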
If we attempt to just fully compress the index anyway:
SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3115          2
It actually results in a bigger, not smaller, index. The leaf block count has gone up from 2363 to 3115.
Unlike in the previous post, where the columns in the index were the other way around, compressing just the first column makes no difference to the inefficiency of the index compression, because the number of prefix entries we create remains exactly the same:
SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3115          2
So the index remains at the higher 3115 leaf blocks.
The good thing with Advanced Index Compression is that we can simply “give it a go”, knowing it will not result in a larger index structure. If there’s nothing to compress within a leaf block, Oracle just ignores it and moves on to the next leaf block. If there’s nothing to compress at all within the index, the index remains the same size as if it had not been compressed:
SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2363          2
So the index is now back to 2363 leaf blocks, the same as if it wasn’t compressed at all. No, it hasn’t helped, but at least it hasn’t made things worse.
So the order of the columns still plays a vital part in the “compress-ability” of the index, even with Index Advanced Compression at your disposal. If both the ID and CODE columns are referenced in your code, then having CODE as the leading column of the index would both improve the manner in which the index can be compressed and make an Index Skip Scan viable on those occasions when the CODE column is not specified.
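For example, with the index created on (CODE, ID) as in the previous post, a query that specifies only the ID column could still potentially use the index via a skip scan, as CODE has only 10 distinct values. A quick, purely illustrative way to check (the specific ID value here is arbitrary):

SQL> explain plan for select * from bowie where id = 123456;

Explained.

SQL> select * from table(dbms_xplan.display);

-- against a bowie(code, id) index I would expect an INDEX SKIP SCAN of BOWIE_IDX here, as Oracle
-- only has to probe the 10 CODE "logical subindexes"; against this post's bowie(id, code) index
-- the same query simply uses a normal index range scan, but a query on just CODE would have to
-- fall back to something like a full table scan or an index fast full scan.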
Now, if we introduce some duplicates into the leading column (in this case, we update about 10% of the rows so they now have duplicate values in the leading ID column):
SQL> update bowie set id=42 where id between 442000 and 542000;

100001 rows updated.

SQL> commit;

Commit complete.

SQL> alter index bowie_idx rebuild nocompress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2338          2
With a whole bunch of IDs now having a value of 42, the non-compressed index has 2338 leaf blocks. Yes, 10% of the leading column values now have duplicates, but 90% of the index doesn’t and remains effectively unique. So if we try to compress this index now:
SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2941          2
The compressed index now has 2941 leaf blocks and is still larger than the 2338 leaf blocks of the non-compressed index. Yes, it has compressed the 10% of the index that it could, but the inefficiencies in dealing with the other 90% have resulted in an overall larger index. So not too good really.
Again, compressing just the leading ID column doesn’t improve matters:
SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2977          2
In fact, at 2977 leaf blocks it’s even worse than compressing the whole index, because by compressing both columns we could at least also compress the duplicate CODE values within that 10% of the index where we had duplicate ID values. By compressing just the ID column, we don’t get the benefit of compressing those duplicate CODE values. So not very good either.
In either case, compressing the index is ineffective as we end up with a bigger, not smaller index.
But not with Index Advanced Compression:
SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2265          2
We now have an index structure at just 2265 leaf blocks that is indeed smaller than the non-compressed index (2338 leaf blocks), because Oracle can now compress just the 10% of the index where compression is effective and simply ignore the remaining 90% where compression is ineffective.
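If you want to confirm the compression attribute now associated with the index, it’s visible in the data dictionary (just an illustrative check; on 12.1.0.2 I would expect the COMPRESSION column to report something along the lines of ADVANCED LOW):

SQL> select index_name, compression from dba_indexes where table_name='BOWIE';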
The best of both worlds: Index Advanced Compression compresses just those parts of an index where it can do so effectively, and leaves alone (without making matters worse) those parts of the index where compression is ineffective.
An indexing no-brainer …
Index Advanced Compression: Multi-Column Index Part I (There There) September 17, 2015
Posted by Richard Foote in 12c, Advanced Index Compression, Concatenated Indexes, Index Rebuild, Oracle Indexes.
I’ve discussed Index Advanced Compression here a number of times previously. It’s the really cool additional capability introduced with the Advanced Compression Option in 12.1.0.2 that not only makes compressing indexes a much easier exercise but also enables indexes to be compressed more effectively than previously possible.
Thought I might look at a multi-column index to highlight just how truly cool this new feature is in automatically managing the compression of indexes.
First, let’s create a little table and multi-column index:
SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(code, id) pctfree 0;

Index created.
OK, the key thing to note here is that the leading CODE column in the index has only 10 distinct values and so is repeated very frequently. However, the second ID column is effectively unique, such that the overall index entry is likewise effectively unique. I’ve created this index initially with no compression, but with PCTFREE 0 to make the non-compressed index as small as possible.
If we look at the size of the index:
SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2361          2
We notice the index currently has 2361 leaf blocks.
I’ve previously discussed how index compression basically de-duplicates the indexed values by storing them in a prefix table within the index leaf block. These prefix entries are then referenced by the actual index entries, meaning it’s only necessary to store repeated values once within a leaf block. Only repeated index values within an index leaf block can therefore be effectively compressed.
In this example, it would be pointless to compress both indexed columns, as this would only result in a unique prefix entry for each and every index entry, given that the ID column is unique. In fact, the overhead of having a prefix entry for each and every index entry would actually result in a larger, not smaller, overall index structure.
To show how compressing the whole index would be a really dumb idea for this particular index:
SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3120          2
The COMPRESS option basically compresses the whole index and we note that rather than creating a smaller, compressed index structure, the index is in fact bigger at 3120 leaf blocks.
However, as the leading CODE column in the index has only 10 distinct values and so is heavily repeated, it would make sense to compress just this first CODE column in the index. This of course requires us to fully understand the data associated with the index.
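A quick, purely illustrative way to confirm this understanding is to simply compare the number of distinct values in each column:

SQL> select count(*), count(distinct code), count(distinct id) from bowie;

-- CODE has just 10 distinct values across the 1000000 rows while ID is unique,
-- so only the leading CODE column is a sensible candidate for prefix compression here.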
We can do this by specifying just how many leading columns to compress (in this case just 1):
SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2002          2
We note the index is indeed smaller than it was originally, now at just 2002 leaf blocks.
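The number of prefix columns used is also recorded in the data dictionary (again just an illustrative check; I would expect PREFIX_LENGTH to come back as 1 for this index, with COMPRESSION reported as ENABLED):

SQL> select index_name, compression, prefix_length from dba_indexes where table_name='BOWIE';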
So this requires us to make the correct decision on how many columns in the index to compress. Getting this wrong can result in a worse, not better, overall index structure.
Now, with Advanced Index Compression, we don’t have to make this decision; we can simply let Oracle do it for us. As discussed previously, Oracle can go through each leaf block and decide how best to compress it. In this case, it can automatically determine that it’s only beneficial to compress the CODE column throughout the index.
If we compress this index with the new COMPRESS ADVANCED LOW clause:
SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2002          2
We note we get the index at the nice, small 2002 leaf blocks, as if we had made the correct COMPRESS 1 decision ourselves.
However, the story gets a little better than this …
Let’s now modify the contents of the table so that we create some duplicates also for the second ID column:
SQL> update bowie set id=42 where id between 442000 and 542000;

100001 rows updated.

SQL> commit;

Commit complete.
OK, so for about 10% of the rows, the ID column value is indeed repeated with the value 42. However, for the remaining 90% of rows (and hence index entries), the ID column remains effectively unique. So we have this 10% section of the index where ID is heavily repeated with the value 42, but everywhere else within the index the ID values remain unique.
If we rebuild this index again with no compression:
SQL> alter index bowie_idx rebuild nocompress pctfree 0;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2336          2
We now end up with 2336 leaf blocks (a little smaller than before the update, as we’re replacing 10% of the ID values with the smaller value of just 42).
However, the vast majority (90%) of the index entries are still unique, so attempting to compress the entire index is again unlikely to be beneficial:
SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2946          2
Indeed, at 2946 leaf blocks, the index is again bigger than it was when it wasn’t compressed.
We can again effectively compress just the CODE column in the index:
SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         1977          2
OK, just compressing the CODE column has indeed resulted in a smaller index structure (just 1977 leaf blocks) as it did before.
Without Advanced Index Compression, we have the option to not compress the index (the result is average), compress both columns (the result is worse) or compress just the leading column (the result is better). It’s an all-or-nothing approach to index compression, with the best method decided at the overall index level.
We don’t have the option to compress just the leading column where that makes sense, while also compressing both columns in just the 10% portion of the index where that makes sense (where we have lots of repeating 42 values for ID).
We do have this option though with Advanced Index Compression, and indeed this is performed automatically by Oracle in just those leaf blocks where it’s beneficial, because the decision on how to compress an index is made not at the overall index level but at the leaf block level. As such, Advanced Index Compression has the potential to compress an index in a manner that was simply not possible previously:
SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         1941          2
We notice the index is now even smaller, at just 1941 leaf blocks, than it was when compressing just the leading column, as we now also compress the ID column in just that 10% of the index where we had repeating ID values.
I can’t emphasise enough just how cool this feature is !!
In fact, I would recommend something I don’t usually recommend and that is rebuilding all your indexes at least once (where you know the leading column has some repeated values) with the Advanced Index Compression option, so that all indexes can be compressed in their optimal manner.
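If you do go down this path, a simple way to get started is to generate the rebuild statements straight from the data dictionary. A minimal sketch only (the BOWIE owner filter is purely illustrative, each generated statement should of course be reviewed before being run, and you may want to add ONLINE, tablespace or parallel clauses as appropriate):

SQL> select 'alter index ' || owner || '.' || index_name || ' rebuild compress advanced low;' as rebuild_ddl
     from dba_indexes
     where owner = 'BOWIE'
     and index_type = 'NORMAL'      -- b-tree indexes only
     and partitioned = 'NO'         -- partitioned indexes are rebuilt a partition at a time
     and compression = 'DISABLED';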
Note though that this does require the Advanced Compression Option !!
More later 🙂
Presenting the Hotsos Symposium Training Day – 10 March 2016 (Heat) September 16, 2015
Posted by Richard Foote in Oracle Indexes.
I’ve just accepted an invitation to present the Hotsos Symposium Training Day on 10 March 2016 in sunny Dallas, Texas. In the age of Exadata and In-Memory databases, it’ll be an updated and consolidated version of my Index Internals and Best Practices seminar. With an emphasis on using indexes appropriately to boost performance, it’ll feature lots of tips, tricks and references to David Bowie 🙂
The Hotsos Symposium (6-10 March 2016), with its focus on performance analysis, is one of the very best conferences I’ve ever presented at and I’m very excited at the prospect of returning next year. The confirmed list of presenters is already impressive: Tanel Põder, Bryn Llewellyn, Kellyn Pot’Vin, Jim Czuprynski, Jeff Smith, Kerry Osborne, Carlos Sierra and now Richard Foote, as I’ll likely present a session or two during the conference proper as well.
If you get the chance, do yourself a favour and join me at one of the best Oracle conferences available and hopefully stay on for the Training Day where there’ll be lots of learning and fun to be had !!