jump to navigation

Index Compression Part I (Low) February 17, 2008

Posted by Richard Foote in Index Compression, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
trackback

Index compression is perhaps one of the most under used and neglected index options available. It has the potential to substantially reduce the overall size of Non-Unique indexes and multi-column Unique indexes, in some scenarios dramatically so. A smaller index, especially if it stays permanently smaller without any subsequent expensive maintenance operations, it always a nice thing. Not only will it potentially save storage, but if the resultant index contains fewer leaf blocks, that’s potentially fewer LIOs and from the Cost Based Optimizer’s point of view, potentially a cheaper execution plan option.

All possible without a single index rebuild in sight …

However like most things Oracle, index compression also has the potential to be misused and to cause rather than solve problems. So it really helps to understand how index compression works and how it’s actually implemented before we rush into anything.

The first point to understand is that index compression is implemented at the index block level. Basically, Oracle stores each distinct combination of compressed column values found within a specific index leaf block in a “Prefix” table within the leaf block and assigns each combination a unique prefix number.

The more numbers of distinct compressed column values, the more entries in the prefix table and the larger the prefixed related data. The fewer numbers of distinct compressed column values, the fewer entries in the prefix table and the smaller the prefix related data. Generally, the fewer entries in the prefix table, the better the compression.

Oracle now no longer stores these compressed columns within the actual index row entries. These compressed columns are substituted and referenced with the details stored in the prefix table entry, which are shared by all index row entries with the same corresponding prefix value.

Only leading columns (which in a Non-Unique Index can potentially be all the columns in an index, in a Unique Index can potentially be all but the last column in the index) can be compressed. Therefore, the prefix table is in the same logical order as they would appear in the index row entries. The values of the prefix values always appear within the index row entries in a sequential manner, noting that (hopefully) several row entries share the same prefix number.

Let’s say we currently have a nocompress Non-Unique index on a NAME column with the following entries:

0: David Bowie

  : ROWID

1: David Bowie

  : ROWID

2: David Bowie

  : ROWID

3: David Bowie

  : ROWID

4: David Bowie

  : ROWID

5: David Jones

  : ROWID

6: David Jones

  : ROWID

7: David Jones

  : ROWID

After the index is compressed, the leaf block entries would look logically something like this:

Prefix

0: David Bowie

1: David Jones

0:0

  : ROWID

1: 0

  : ROWID

2: 0

  : ROWID

3: 0

  : ROWID

4: 0

  : ROWID

5: 1

  : ROWID

6: 1

  : ROWID

7: 1

  : ROWID

Importantly, each distinct combination of compressed column values is now stored just the once within an individual index leaf block. In the example above, we previously stored “David Bowie” 5 times. In the compressed index leaf block, we now only store “David Bowie” once, with the prefix value now being referenced by each index entry instead.

The net result being we can now (hopefully) store many more index entries per leaf block meaning we don’t need as many leaf blocks in our index.

To compress an index, simply specify the COMPRESS option:

CREATE INDEX bowie_table_i ON bowie_table(col1, col2) COMPRESS 2;

The number after the COMPRESS keyword denotes how many columns to compress. The default is all columns in a Non-Unique index and all columns except the last column in a Unique index.

This demo, Index Compression Part I shows how an appropriately compressed index can substantially reduce the overall size of an index. It also shows a couple of index leaf block dumps to highlight how index compression is implemented.

In Part II, I’ll show you how you can really stuff things up by implementing index compression totally inappropriately …

Comments»

1. praveen - February 18, 2008

Its true compressing an index improves the performance. But with my real experience compressed indexes will deteriorate the performance if the table is prone for huge concurrent multi row dml statements.

Like

2. DavidB - February 18, 2008

So by “improve performance” you mean that the time saved by doing fewer LIOs (because the index is smaller) is greater than the extra CPU time spent processing the “prefix” table?

Looking forward to the next article Richard.

Like

3. Neil Johnson - February 19, 2008

Excellent demo – a point made simply and clearly. I’ve been looking forward to index compression cropping up.

Like

4. Richard Foote - February 20, 2008

Hi Praveen

Do you have any actual “real experience” details to share ?

I’m curious to know what specifically you mean by deteriorate performance, what you define as hugely concurrent and the specifics of multi row DML statement you found to be problematic.

Any additional info would be appreciated.

Like

5. Richard Foote - February 20, 2008

Hi David

Exactly. Most index related execution plans would likely not see any difference but some larger range scan, fast full scan, etc. queries might. And this is a permanent, no maintenance required, relatiively minimal overhead benefit. If you can reduce the leaf blocks in an index, permanently, by (say) half then it’s something worth considering.

Like

6. Richard Foote - February 20, 2008

Hi Neil

Thanks for the comments. Unfortunately, I can only make points simply 😉

More stuff on compression coming up.

Like

7. Brian Tkatch - February 20, 2008

Am i understanding this correctly?

The compression helps because it is less bytes overall. Which means that if the INDEXed COLUMN is a small amount of data, for example, a status field that uses one byte, compression wouldn’t actually help.

If that is the case, is there a basic idea of how many bytes would need to be saved to possibly help? Obviously, a rule cannot be given, but if it is true that for one byte there is no benefit, can the same be said for two or three?

Like

8. Richard Foote - February 21, 2008

Hi Brian

Your question has convienced me for there to be at least a Part IV to this discussion !!

You understand correctly in that compression helps because it’s less bytes overall. However my attempt to describe things as they’re “logically” implemented and some of my wording may have confused matters regarding how it’s all physically implemeted (which I’ve amended in places).

The short answer is that yes, compression can help even with a status field because we still have the potential to save bytes. It all depends again on how many repeating compressed values we have on average per leaf block.

Two quick points.

One, even a single byte column requires 2 bytes in the index row entry, one byte for the column value and 1 byte for the column length. So it’s 2 bytes per index row entry (less the compression overheads) we could potentially save by compressing

Two, I refer to “pointers” in the blog entries in my attempt to logically describe the relationship between index row entries and the prefix entries. However, physically, the prefix entries are actually stored “next to” the corresponding index entries. Remember, these prefix entries can only refer to the leading columns of the index and these must be in the same logically order as the index entries themselves. Therefore, these “pointers” are really references or logically references if you like to the corresponding prefix entry. In actual fact, they’re inexpensive in that in themselves they don’t require storage, the “cost” is in the overheads associated with the prefix entry they’re associated with.

So in answer, yes you could maybe compress a really small column if it has many repeating values and yes, I need to expand on things more in another blog entry 🙂

Like

9. Brian Tkatch - February 21, 2008

Thanx for the explanation.

I see i am missing quite a bit. I anxiously await your next entries!

Like

10. Richard Foote - February 22, 2008

Hi Brian

They’re coming, they’re coming 😉

Like

11. Avirup Sen - June 26, 2009

Hi Richard,
First of all thanks for this blog. I found it really helpful.I know this is quite an old blog, but please help me with some of my doubts.

1) If a non unique index involving multiple columns is created, then to get the maximum benefits of key compression should one use the column with the lowest cardinality (lowest number of distinct values) to compress?

If this is true, then please answer my 2nd question as well

2) I have to create a non-unique index on a table involving 2 columns (C1, C2)
C2 has the lower cardinality between C1 and C2. However data in the table is frequently queried using C1. So, C1 needs to be the first column in the index definition. In that case, I won’t be able to compress C2. Is that true? Because Compress will be something like the following

create index t1_idx01 on t1(c1,c2) compress 1;

Like

Richard Foote - June 30, 2009

Hi Avirup

1) Yes, if the leading column has less cardinality, then likely it can be compressed more effectively. Oracle can only compress an index block by not having to restore repeating values. If the second column is very distinct, then you don’t want to include it in the columns to be compressed.

2) Yes that’s true as well. There’s not much point in having a compressed index if Oracle is unlikely to use it. By having C1 has the leading column, Oracle can still use the index if only C1 is being queried but it won’t be able to be compressed effectively. If C2 has very very few distinct values, Oracle might consider an index skip scan if it’s the leading column and only C1, the second column is queried but this will not be as efficient as having it as the leading column.

Like

12. A few words on Index Compression « The Dutch Prutser's Blog - April 6, 2010

[…] a much more thorough discussion on index compression see Richard Foote’s fabulous Oracle blog. ( part 1 , part 2 , part 3 And part […]

Like

13. Marcel - July 6, 2010

I am doing a compress in a especific partition in a table and we have 5 local indexes in this table, all indexes become UNUSABLE and I did the rebuild UNUSABLE LOCAL INDEXES,
Are these indexes partition COMPRESSED too? how can I check it?

thanks.

Like

Richard Foote - July 20, 2010

Hi Marcel

No they’re not.

DBA_IND_PARTITIONS has a COMPRESSION column.

Like

14. Amir Riaz - November 25, 2010

nice blog.

Like

15. Ganesh - January 5, 2011

Nice article Richard. I am planning to implement index compression for some of the tables in my 10.2.0.4 production DB. Are there specific restrictions on indexes once they are made compressed, just like we have for compressed tables? (For example, we cannot drop a column from a 10G compressed table etc.)

Like

Richard Foote - January 25, 2011

Hi Ganesh

None that come to mind. You can’t compress a bitmap index (they’re already effectively compressed), you need to rebuild an index to enable/disable compression but other than that, none that I can think off on the top of my head.

Like

16. Mark Brady - April 27, 2011

When you say “Oracle now no longer stores these compressed columns within the actual index row entries.” Could you include the versions? I know that I go back to websites years after I read them and can’t figure out what they were talking about because of relative language. I don’t want to interrupt your conversational flow, so a Foote-note?

Like

Richard Foote - April 27, 2011

Hi Mark

Oh dear, you really put your Foote in it with that pun !!

What I meant by that statement and what I meant by “now” is that once compressed, Oracle no longer stores the compressed columns within the index entry.

This is true for all compressed columns within an index for all versions since index compression was introduced (sometime way back in Oracle8 I think).

Like

17. Oracle Index Compression « Technical Notes of Jeson Martajaya - June 25, 2011

[…] has an index compression feature. It reduces I/O read but increases CPU consumption. If you have bottleneck in I/O (aka I/O […]

Like

18. davidbudac - September 15, 2011

Hi Richard,
Could you please make the demo (https://richardfoote.files.wordpress.com/2008/02/index-compression-demo-part-i.txt) public again? At the moment it says Access denied.

Thanks,
David

Like

19. briankp - February 16, 2012

Hi Richard,

The link for the demo is not accessible. Can you please make it available to public view? Thanks.

https://richardfoote.files.wordpress.com/2008/02/index-compression-demo-part-i.txt

— 403: Access Denied —

This file requires authorization:

You must both be a user of this blog as well as be currently logged into WordPress.com

Like

20. Pratik Mehta - December 23, 2013

Very nice insightful article and discussion. Thanks Richard.

Like

21. Index design | Mohamed Houri’s Oracle Notes - September 4, 2014

[…] to learn and master indexes I would encourage them to read the world expert person in this field, Richard Foote. He has an excellent blog with several articles about almost all what one has to know about indexes […]

Like

22. Index compression–working out the compression number | Learning is not a spectator sport - August 5, 2015

[…] Foote did a series of informative posts on Index Compression which concludes that there is whole lot of positives about index compression, and very little […]

Like


Leave a reply to Richard Foote Cancel reply