So when does an Oracle B-Tree index actually increase in height ?
I’ve basically been asked this same question a number of times over the past few days with regard to the discussions on indexes and different block sized tablespaces, so I thought it might be worth quickly sharing the answer to a wider audience.
Imagine a new, empty table and a corresponding new, empty index. At this stage, the index structure basically consists of one, empty block. The index has a BLEVEL of 0 (from DBA_INDEXES) and a HEIGHT of 1 (from INDEX_STATS), yes it can be confusing This block is basically the Root block of the index as it’s the first (and currently only) block to be accessed during an index scan, but at this stage is used to also store the actual index entries as well (and so can kinda be viewed as being a Leaf block as well).
We now start to insert rows into the table and thus row entries into the index. These index entries basically consist of the indexed column(s) and its corresponding ROWID, and are sorted based on the indexed column values.
Eventually, this single index block will fill; Oracle simply can’t add any more index entries into it. Now comes the fun bit.
When Oracle wants to insert a new index entry but it can’t as this Root index block is full, Oracle will allocate two new index blocks. If the new index entry is the maximum value currently to be indexed, Oracle will move all the index entries from the full block and put it into one of the new index blocks and place the new index entry into the other block. This is known as a 90-10 index block split.
If the new index entry isn’t the maximum value, Oracle will place the lower 1/2 valued index entries into one new block and the other 1/2 into the other new block. This is known as a 50-50 index block split.
These two new blocks are now the new leaf blocks in the index structure.
The contents of the previously single filled block is now totally replaced with pointers to the two new blocks. This block therefore remains the Root block in the index structure. These pointers basically consist of the Relative Block Address (RBA) to the new index blocks and a value which represents the lowest indexed value found in the specific referenced leaf block. These indexed values in the Root block are now used by Oracle as the method by which it can navigate the index structure to find the specific index leaf block containing a required indexed entry.
The index has just increased in height and now has a BLEVEL of 1 and a HEIGHT of 2.
As we continue to add more rows into the table, we add more index entries into our 2 leaf blocks. Eventually they will fill again and will again perform either a 90-10 or 50-50 block split depending on the new index value to be inserted. With a non Root block split, only one additional index block is allocated and the index entries are distributed between the full and new index block. Each time a leaf block splits in a BLEVEL 1 index, a new entry is also added into the Root block to point to the new Leaf block.
Once we have enough Leaf blocks, the Root block will again eventually fill. At this point, Oracle will again allocate two new blocks and distribute the contents of the Root block into these two new blocks, again 90-10 or 50-50 depending on the new indexed value to be inserted. The contents of the Root block is now totally replaced with pointers to these 2 new “Branch” blocks which of course in turn now contain the pointers to the Leaf blocks.
The index has again increased in height and we now have an index with a BLEVEL of 2 and a HEIGHT of 3.
As the leaf blocks continue fill and split, a new entry is added to the corresponding Branch block each time. When these Branch blocks fill and split, a new entry is added to the Root block. When the Root block eventually fills, it will again allocate 2 new blocks and so the index grows in height again.
So basically, an index increases in height whenever the index Root block splits and the two new allocated blocks result in a new level within the index structure. Note the index Root block remains the same throughout the entire life of the index, no matter the index height.
Note also a Root block split is the only time an index increases in height. Therefore, the number of levels between the Root block and any/all of the Leaf blocks is always and must always be the same. Hence, an Oracle B-Tree index is always structurally height balanced, always.
Larger Block Index Tablespace and Small Index Scans – Performance Improvement ? (Let Down) March 31, 2008Posted by Richard Foote in Index Block Size, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Tablespace Management.
Thought it might be worth looking at the impact on the performance of Unique and small index range scans as typical in OLTP environments, when an index is rebuilt in a larger block tablespace.
Warning, this discussion will again primarily be an exercise in simple mathematics. However, a few little details to start the ball rolling.
I’ve already discussed how in many scenarios, by increasing the index block size, the height of the index can remain unchanged. One can’t simply assume a larger block index results in an index with a lesser height. However, as we shall see, even when indexes do reduce in height, the so-called performance benefits can be somewhat ”exaggerated”.
Note also in OLTP environments, the vast majority of index related access paths are either unique scans or small index scans. We’re only interested in a specific customer, in a specific order, in the bank balances of specific accounts, with specific customers buying specific products with specific credit cards, etc. Remember also that say a block size of 8K can generally store many hundreds of index entries per index leaf block and that even a humble little 2K leaf block can often store a three figure number of index entries …
So let’s say in our first example we currently have an 8K index with a height of 3 which we rebuild in a 16K tablespace and the index height remains the same. What are the comparative costs of performing a simple index look-up.
In the 8K index, we need to read 3 x 8K (root, branch and leaf block) to read the index entry of interest for a total of 24K. However, in the new 16K index, we now need to read 3 x 16K (again, a root, branch and index leaf block) to read the index entry of interest, for a total of 48K.
That’s double the potential physical I/O if the blocks are not currently cached, that’s double the memory that needs to be used in the buffer cache to store the index blocks and that’s potentially more CPU we need to use in order to pin and process the larger index blocks.
Hummm, these larger blocks aren’t too impressive so far …
But what if we actually reduce the height after moving an index to a larger block tablespace ?
In the 8K index, we still have our 3 x 8K = 24K. If we rebuild in an 16K block tablespace and reduce it’s height by 1, we now only need to read 2 x 16K = 32K. That’s 3 consistent reads at 24K vs. 2 consistent reads at 32K. Yes, we reduce the number of consistent reads which is a good thing but we still have a larger index footprint which again means possibly more data off disk, more memory and more CPU resources to process the index blocks.
Let’s hope our 2K block index with a height of 2 reduces its height somewhat if we decide to rebuild it in 32K blocks because that’s 2 x 2K = 4K vs 2 x 32K = 64K otherwise. But even if we did, although we reduce the consistent reads, which is a good thing, that’s still 2 x 2K = 4K vs 1 x 32K = 32K.
But we can potentially reduce the height of a 2K index if rebuild in a much larger block size (in specific cases) by more than 1 level, right ?
Yes, a 4 level 2K index, 4 x 2K = 8K, might very well now only require 2 levels in a 32K index, 2 x 32K = 64K. But that’s still 64K of index data we need to access vs. 8K in the smaller block. Consistent reads reduce but our footprint for small index range scans is still larger, sometimes by a considerably amount depending on the change of block size.
You begin to see the issue …
It’s a bit like someone saying they’re going to improve the skyline somewhat and only build appartment buildings that have fewer numbers of floors than previously. But if the floors they build are 2 times or 4 times or even 16 times higher than the previous floors and they only reduce the number of floors by a moderate amount, is the building really lower ?
Have we really reduced the height of the building ?
Yes, the lift only has to stop at fewer floors which can be more “efficient” but it takes longer each time as it goes from one floor to the next. So is it really that much quicker to get to the top, or more specifically in our discussion, from the top to the bottom of the building ?
Surely though, things must run faster, performance must actually improve, things must be more efficient by having larger sized index blocks else why bother ? Why indeed …
This demo on the Performance Impact Of Small Index Scans In A Larger Index Block Size shows how performance may actually worsen, not improve if we move indexes into a larger block size. It creates an index on a well clustered column (so the index is at it’s most efficient and potentially impacts performance the most), first with an 8K block size and then with a 16K block size. The size of the index was carefully chosen so that the larger index block size did indeed reduce the height of the index (although the range of values when this was possible was actually quite limited with these block sizes). A simple PL/SQL procedure then performs a massive number of single row look-ups and the CPU and elapsed times are monitored. The results show when the associated blocks are either cached or not cached, the larger block index incurs larger CPU related costs and results in overall slower response times, despite the fact it actually has a lower height than the smaller block index.
It’s not precisely the same as a large scale environment as PL/SQL has subtle little differences and efficiencies when compared to multiple, separate transactions. Also, it’s not the specific results that are important here but the overall general approach. You can pick any index you want to investigate, you can pick whatever block sizes you may be interested in, you can monitor and benchmark by checking out specific session details or by tracing the specific sessions, you can determine what the comparative costs and response times may be and you can determine what may cause any performance differences by digging deeper into the session statistics or trace files.
No matter what index or block size you select, you will likely come to the same conclusion. The overall performance benefits of your OLTP transactions when you move indexes to a larger block tablespace will likely be “disappointing”.
If a real estate developer comes knocking on your door with a promise that the empty block around the corner will have an apartment building with only a few floors and it shouldn’t ruin the view too much, you may just want to ask them quietly exactly how many floors they’re planning to build and exactly how high each floor will be.
Else the resulting view may turn out more disappointing than you’re led to believe
Store Indexes In A Larger Block Tablespace: Height Reduction 1/2 Myth (Five Foot One) March 26, 2008Posted by Richard Foote in Index Block Size, Index Height, Index Internals, Oracle General, Oracle Indexes, Oracle Myths, Tablespace Management.
A common misconception with using a larger block tablespace specifically for indexes is that this will result in a reduction in the height of indexes and hence ”flatten” index structures.
However, this is only partly true.
A few little generalisations to begin with.
First, most databases out there have a default block size of 8K. I won’t go into a big discussion on what the database block size should be set to (maybe another time), however I will say most databases these days have a default block size of 8K and that it’s certainly questionable to have the database block size set to 2K.
Note also in many common platforms (e.g. Windows, Linux) the maximum block size limit is 16K. So in many environments, when we talk about moving indexes into a bigger sized block, it specifically involves moving from an 8K to a 16K block size.
Yes, in theory you could move an index from (say) a 2K block size up as high as a 32K block size but you would need to question why the default is so low to begin with and whether the upper value is actually supported in your environment.
I make this point because the difference between block sizes makes a huge difference in the probability of the index height actually being reduced.
So let’s start with an example of moving indexes between an 8K default block size to a 16K block tablespace, not least because the arithmetic is easier and extrapolate out as we go along.
So in our example, the new block size is double or 2 times that of the default one. By doubling the block size, we effectively 1/2 the number of necessary leaf blocks in the index structure. By having fewer leaf blocks we also therefore reduce the overall associated block level overheads so the actual reduction in leaf blocks could be a tad more, but we’ll say a 1/2 reduction to keep the numbers nice and simple.
Note the reduction in leaf blocks in therefore simply 1 / the ratio of block increase (1/2). Moving from a 2K block to a 32K block is 16 times larger so we’ll have approximately 1/16 the number of leaf blocks.
So how does (say) halving the number of leaf blocks impact the overall height of the index ?
We obviously can’t reduce the height of an index with a height of just 1. The index consists of just the one block so a larger block would simply mean the block having more free space.
To reduce the height of an index with a height of 2 (back to 1), we therefore must be able to store all index entries within a single block. Therefore, in the 8K to 16K example, the index can only have 2 full leaf blocks for this to be possible. If an index has 3 or more “filled” leaf blocks, the index must remain at a height of 2 as we can’t fit all the index entries into the single larger index block.
Importantly therefore, all indexes with a height of 2 with more than 2 full leaf blocks would not reduce in height by simply doubling the block size. This could very well be the vast majority of indexes at this level.
For an index with a height of 2, the index must have less full leaf blocks than the ratio of block increase for a height reduction to be possible. In our best case scenario, 2K block to 32K block, any index with more than 16 full leaf blocks would not reduce in height.
To reduce the height of an index with a height of 3 (back to 2), we must therefore be able to store all intermediate branch blocks into the one branch (root) block. When we double the block size, we therefore 1/2 the leaf blocks and 1/2 again the necessary branch blocks. Therefore the necessary branch blocks is 1/(2×2) = 1/4 that of the default block size. Therefore any index with a height of 3 that has more than 4 full intermediate branch blocks will again not reduce in height as again all the necessary branch information would not fit in one root block.
Importantly therefore, all indexes with a height of 3 with more than 4 full intermediate branch blocks would not reduce in height by simply doubling the block size. Again, this could very well be a significant proportion of all indexes at this level. Note also in many databases, the vast majority of indexes have a height of 3 or less so by simply doubling the index block size, most indexes would not reduce in height …
For an index with a height of 3, the index must have less full intermediate branch blocks than the ratio of block increase to the power of 2 for a height reduction to be possible. In our best case scenario, the 2K block to 32K block, only those indexes with more than 16×16=256 full intermediate branch blocks will reduce in height. This is therefore likely to be a far higher proportion of all such indexes.
You see the pattern …
To reduce the height of an index with a height of 4 (back to 3), we must therefore store all first level intermediate branch blocks into the one branch (root) block. When we double the block size, we therefore 1/2 the leaf blocks, 1/2 again the second level intermediate branch blocks and 1/2 again the first level intermediate branch blocks. Therefore the necessary first level intermediate branch blocks is 1/(2x2x2) = 1/8 that of the default block size. Therefore any index with a height of 4 that has more than 8 full intermediate first level branch blocks would again not reduce in height as again all the necessary first level branch information would not fit in the one root block.
Importantly therefore, all indexes with a height of 4 with more than 8 full intermediate branch blocks will not reduce in height by simply doubling the block size. However, as the index height increases, the ratio of indexes where this is likely to be the case decreases.
For our best case scenario, 2K to 32K, we now start hitting very large numbers 16x16x16=4096 so the likelihood of a index height reduction is very very high.
And so on …
The important point being that by simply doubling the index block size, in most databases, the vast majority of indexes are actually quite unlikely to reduce in height as the index needs to be within very limited size boundaries for the index height to reduce. The greater the index height however, the greater the index size boundaries whereby an index height reduction is possible.
Also, the greater the index block increase, proportionally the fewer the index blocks and so greater the likelihood of an index height reduction.
This demo on the Impact Of Block Size On Index Height illustrates that by simply doubling the index block size, the height of an index (in various sizes) rarely decreases.
One final point. With our height 4 index example, note the index can only have a maximum of 8 first level branch blocks for the height to reduce. Therefore, in effect, we’re replacing a maximum of 9 x 8K branch blocks with 1 x 16K block. If this index is frequently accessed, these 9 branch blocks are likely cached and we only need to read two of these blocks anyways for an index range scan (for a total of 16K). After the rebuild, we still need to read this block (16K again) anyways so from a purely performance perspective with regard to just simply reducing the index height, the so-called performance benefits are often very much exaggerated.
As we’ll see in the next epic episode of this series, performance can actually decrease
Next time someone claims moving indexes into a larger block size will decrease the height and flatten an index, remember it really does depend. In many databases, especially when the index block size is just doubled, it’s actually quite surprising just how unlikely it is for an index to actually decrease in height.
Store Indexes In A Larger Block Tablespace: The Multiblock Read Myth Part II (The Fly) March 20, 2008Posted by Richard Foote in Index Block Size, Index statistics, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Tablespace Management.
Thought I might begin by mentioning a lovely little story by Billy Verreynne in this OTN Forum Thread.
Basically a scientist is doing research on the behaviour of flies. He notes when he opens the jar lid of a trapped fly and claps his hands, the fly takes off and flies away. One day, he decides to pull the wings off the fly. When he claps his hands, the fly just sits there. No matter how loud he claps, other than a slight rocking motion, the fly just doesn’t budge.
Excitedly, he writes in his journal his latest discovery. When you pull the wings off a fly, it goes stone deaf !!
This is soooo funny because this sort of thing happens all the time in the Oracle world (and elsewhere). I see it time after time after time, we’ve all done it.
Hey, I just compressed a segment into one extent and performance improved. Conclusion, storing a segment in one extent improves performance.
Hey, I just separated my indexes from my tables and performance improved. Conclusion, separating indexes from tables improves performance.
Hey, I just moved my indexes into a larger block sized tablespace and my Index Fast Full Scan performance improved. Conclusion, a larger index block size improves Index Fast Full Scan performance.
Lots of flies with no wings. Lots of people thinking flies go deaf when they don’t have wings. An action appears to cause an effect, but does it really …
As Billy himself suggests, the fundamental reason why so many people think “flies with no wings go deaf” is that many don’t understand what’s actually going on. Many simply don’t understand the basic workings, the fundamental processes and mechanisms involved in how Oracle functions or how Oracle performs a specific operation. Tuning by observation, tuning by making change “A” without understanding all the implications of such a change and subsequently making suppositions on the results of such a change, ultimately means we have lots of people thinking flies without wings are deaf.
As I discussed in Part 1, Oracle performs exactly the same sized I/O during a multiblock read, regardless of the block size of the segment. Exactly the same. Without understanding this simple fact, one could very easily come to a wrong conclusion regarding the ramification of block sizes on multiblock reads. Without understanding flies don’t have ears or sound sensors in their wings, one could very well come to a wrong conclusion regarding the ramifications of removing the wings from a fly.
If we perform an Index Fast Full Scan and performance improves, it can’t be because associated multiblock I/Os are more efficient. A fly doesn’t have sound sensors in its wings. There must be another explanation. Conversely, in my example in Part I of this discussion, performance went worse with a larger index block size (as it did in Greg Rahn’s example on this OTN Forum Thread), but again not as a result of multiblock read performance.
So how could the performance of an Index Fast Full Scan change (for better or worse), if one simply rebuilds the index within a larger block size tablespace ? Well there are of course many possible reasons, with the two more obvious explanations being the following:
1) Most randomly inserted indexes have a PCT_USED value ranging between 70-75% as these indexes perform random 50-50 block splits that are subsequently in differing stages of being filled. By rebuilding an index (say back to a default PCTFREE of 10%) one might increase index compactness by say 15% and hence decrease the overall index size (note reduced block overheads may also reduce the index a little as well, depending on index size and differences in block sizes). The Fast Full Index Scan is the access path that potentially benefits most by defragmenting an index as the associated costs are proportional to the overall size of the index. Reducing the size of an index could therefore impact subsequent performance. However, rebuilding the index in the current block size would likely achieve a similar result (plus block overheads), compacting the index and resulting in potentially better performance (although once the index blocks begin to split again, the index would eventually return back to its previous state). Therefore, it’s not the bigger block size but the resultant defragmentation of the index that’s improved matters. The fly isn’t deaf, it just needs its wings to fly …
2) By storing an index in a larger block tablespace, the index must physically be stored on a different database file. This file could be on a faster disk, improving performance, this file could be on a faster part of the disk, improving performance, this file could be on a disk with far less disk contention, improving performance, etc. etc. It’s not the larger block size that’s improved (or worsened) performance, it’s the new physical characteristics of where the index is now stored. If one were to rebuild the index with the current block size and use the same physical characteristics of the larger block index, subsequent performance would likewise increase (or decrease). The fly isn’t deaf, it just needs its wings to fly …
There are many other possible reasons, the system was less busy when using the larger block index, more of the index was physically cached when using the larger block index, etc. etc.
Of course, an Index Fast Full Scan is rarely a scalability issue anyways. Do we really want our applications to perform hundreds of large, concurrent Index Fast Full Scans ? Tuning the application to avoid these overheads should be the focus rather than moving indexes into a larger block tablespace in the vain hope it will improve things dramatically. But that’s the topic of another discussion …
Can the performance of an Index Fast Full scan change after moving the index to a larger block size tablespace. Absolutely. However, it doesn’t necessarily mean such a change in performance is a direct result of the index having a larger block size and that multiblock read performance has improved.
It doesn’t mean moving indexes to larger block size tablespaces suddenly makes Oracle go deaf …
UPDATE: I’ve added this simple little demo that illustrates how performance improves when an index is rebuilt in a larger block tablespace. This will of course suggest to some folk that the larger block tablespace improved the performance but what actually improved things was rebuilding the fragmented index to be a more efficient structure. The larger block tablespace was not the fix, rebuilding the index was the important factor. In fact, by rebuilding the index in the original smaller block tablespace, not only do we also improve performance, but things are further improved as we reduce CPU overheads incurred by the larger block tablespace and as a result elapsed times are further improved.
add a comment
Just wanted to quickly mention an excellent new blog, the Oracle Multimedia Blog, that might be of interest to some of you.
It’s run by fellow Canberra resident, Marcel Kratochvil, a well known Oracle identity, who is almost as well known as myself but nowhere near as good looking
Marcel and I go way back, having worked together at Oracle Corporation in the mid 1990′s. Marcel was also my partner in crime when we won the runner’s up award for best paper at Oracle Openworld in Brisbane in 1999 with our paper / theatre production called “Good DBA / Bad DBA”. For those who might remember that fateful day, he was the over-acting “Cowboy”, I was the understated and somewhat professional looking “Airline Pilot”
Marcel is a most knowledgeable and clever fellow who among his many achievements won the Oracle PL/SQL Developer Of the Year award in 2004 and has recently been made an Oracle Ace.
I wish Marcel the very best with his new Blog and encourage everyone to check it out.
One of the great myths surrounding the use of differing block sizes is that storing indexes in larger block sizes somehow dramatically improves the performance of index related multiblock reads.
Oracle performs index multiblock reads when performing an Index Fast Full Scan, when it basically treats the index structure as a skinny version of the table. It reads the entire index structure, multiple blocks at a time, “throwing away” any non leaf blocks as it stumbles across them.
The theory goes that by storing indexes in larger size blocks, we would obviously have fewer index related blocks. If we need to read the entire index, as we do with an Index Fast Full Scan, surely it must be more efficient if we have fewer, larger index blocks.
The evidence looks convincing. Here’s a link to an extract from a book by Robin Schumacher where he clearly shows a dramatic “improvement” by using an index tablespace with double the block size. In one query using an 8K block index, the consistent reads during an Index Fast Full Scan was 421. However, when the same index was recreated as a 16K block index, the same query only used 211 consistent gets, 1/2 of what it was previously.
Conclusive “proof” that the 16K block index improved performance wouldn’t you say ?
Well actually, it’s only conclusive proof that the number of consistent gets has dropped, whether it actually improves “performance” is another thing entirely.
There are a couple of little “details” that many don’t quite appreciate. The devil is always in the details …
The first point to note is that when Oracle performs a multiblock read, it uses the value in the db_file_multiblock_read_count parameter to determine how many blocks to read per multiblock read (with system statistics, Oracle itself can determine how best to set this value).
So if the db_file_multiblock_read_count value were set to say 16, Oracle will attempt to read as many as 16 blocks at a time during a multiblock read operation.
Note this value is based on the default block size of the database. So if the default block size is 8K and the db_file_multiblock_read_count is 16, Oracle will try and read 16 x 8K blocks at a time during a multiblock read operation.
However, if there’s a non-default block sized segment (say 16K), Oracle will adjust the number of blocks that are actually read during a multiblock read operation so that the maximum size of the overall multiblock read is identical to that of the default block size.
So if the db_file_multiblock_read_count is 16 and the default block size is 8K, a multiblock read of an object in a 16K tablespace will only read 8 blocks at a time (and not 16). A multiblock read of an object in a 2K tablespace will read 64 blocks at a time.
The actual size of a multiblock read therefore is identical regardless of the block size of an object within a database.
An easy way to highlight this is to simply trace a session and see the specific size of corresponding multiblock read operations.
A sample from a trace on an 8K block index (with the db_file_multiblock_read_count set to 16), performing an Index Fast Full Scan looks like this:
WAIT #1: nam=’db file scattered read’ ela= 1487 file#=8 block#=1050 blocks=16 obj#=78294 tim=615409554677
WAIT #1: nam=’db file scattered read’ ela= 1377 file#=8 block#=1066 blocks=16 obj#=78294 tim=615409557777
WAIT #1: nam=’db file scattered read’ ela= 1143 file#=8 block#=1082 blocks=16 obj#=78294 tim=615409561563
Note that Oracle is reading 16 x 8K blocks (128K) per multiblock read operation.
However, when the index is recreated in a 16K block size tablespace, the Fast Full Index Scan looks like this:
WAIT #1: nam=’db file scattered read’ ela= 1413 file#=6 block#=14 blocks=8 obj#=78296 tim=626802128684
WAIT #1: nam=’db file scattered read’ ela= 1447 file#=6 block#=22 blocks=8 obj#=78296 tim=626802131649
WAIT #1: nam=’db file scattered read’ ela= 2014 file#=6 block#=30 blocks=8 obj#=78296 tim=626802135222
Note that Oracle is now only reading 8 x 16K blocks (128K) per multiblock operation.
Both indexes are effectively doing exactly the same work, both are effectively reading up to 128K of data per multiblock read …
It’s like paying someone $50 per 1/2 hour of work and then deciding to make things more “efficient” by paying them $100 per hour of work instead. In the end, you’re still just paying them $800 for an 8 hour day’s work regardless …
Note a larger block size will have less associated block overheads with there being less actual blocks so the overall size of an index may reduce a little, depending on index size and differences in block sizes. Therefore any possible improvements will only be restricted to the potential savings in the overall index size. With many databases having default block sizes of 8k and a maximum block size restricted to 16k, these savings may be minimum or non-existent.
This demo on the impact of different block sizes on multiblock read operations shows how Oracle actually performs the same sized reads when performing multiblock reads from differing block sized tablespaces, with the performance of the index in the larger block size tablespace being somewhat worse in this specific example.
With Oracle effectively performing identical work behind the scenes, the performance between different block size tablespaces is likely to be similar. You’re still paying $800 a day regardless …
Although it’s often claimed that multiblock reads is one of the key areas where larger index block sizes are beneficial, a claim based generally on the simplistic fact the number of consistent reads is reduced, the reality of the situation is somewhat different …
A suggestion that seems to pop up on a routine basis on various forums and discussion boards is that we should be storing our Indexes in a larger block-size tablespace. For example, if our database block size is set to 8K, we should be creating separate (say) 16K block tablespaces specifically for our indexes. Doing so will improve performance as the index will have a flatter, more efficient structure. Multiblock reads will also be more efficient (or so the theory goes) as we would be reading fewer index blocks during such scans.
Oracle introduced the concept of having different tablespaces in a database with different block sizes back in 9i Release 1 in order to make transportable tablespaces between databases with differing block sizes possible. However, there’s nothing preventing one creating a new tablespace with a non-default block size and assigning objects to these tablespaces.
In principle, storing indexes (in particular) in a larger block size sounds like a really good idea doesn’t it ?
I’ll be discussing the pros and cons of this approach in future postings but just some initial thoughts to get everyone thinking about it:
All tablespaces with a non-default block size requires a separate, non-default block size buffer cache to be manually configured
Non-Default buffer caches are not automatically sized as part of Oracle’s automatic memory management and must be manually tuned and sized, potentially increasing administrative overheads
Non-Default buffer caches do not have an associated KEEP or RECYCLED pool and so all objects with the same non-default block size must reside in the same buffer cache
The possibility of unnecessarily caching blocks from an infrequently accessed object and wasting memory is therefore likely to increase
The possibility of unnecessarily aging out blocks from a more frequently accessed object is also likely to increase, thus increasing I/O related overheads
Although the height of an index may reduce if stored in a larger block size, in many cases it may not actually change at all
In those cases when the height of an index is actually reduced, the actual performance benefit of such a height reduction is often overstated
The reduction of index leaf blocks (a much more telling possible advantage) is only beneficial to very specific types of queries
Larger blocks often have the disadvantage of greater contention, which can lead to performance related issues
Indexes with larger block sizes have a significantly greater I/O and memory related footprint in relation to most OLTP related index scans
Index related multiblock reads on larger block sized segments actually have no real benefit when compared to multiblock reads on smaller block sized segments
Most databases out there in the “real world” only use default block size tablespaces so the risks associated with finding bugs, CBO anomalies, etc. increase once non-default block sizes are introduced
Although in specific scenarios with specific applications, there may be some potential benefits of using non-default blocks sizes, in general, the disadvantages of using non-default block sizes usually out weigh these potential benefits.
As we shall see …
I’ve already written a few posts regarding concatenated indexes and things to consider (and not consider) when deciding the column order of a concatenated (composite) index.
A comment I see from time to time is that the whole question of column order within an index is now somewhat redundant anyways as Oracle since 9i has introduced the Index Skip Scan access path. Prior to 9i, if the leading column of an index wasn’t specified in a predicate, the index was effectively ignored by the CBO. However, if the leading column isn’t referenced now, Oracle can use the index anyways via an Index Skip Scan access path.
So the column order in a concatenated index doesn’t matter that much now, right ?
Well, not quite ….
An Index Skip Scan can only actually be used and considered by the CBO in very specific scenarios and is often an indicator there’s either a missing index or an exisiting index has the columns in the wrong order.
If the leading column of an index is missing, it basically means the values in subsequently referenced columns in the index can potentially appear anywhere within the index structure as the index entries are sorted primarily on the leading indexed column. So if we have column A with 100,000 distinct values and column B with 100,000 distinct values and an index based on (A,B), all index entries are sorted primarily on column A and within a specific value of column A, sorted by column B. Therefore if we attempt a search on just Column B = 42, these values could potentially appear anywhere within the index structure and so the index can not generally be effectively used.
However, what if the leading column actually contained very few distinct values ? Yes, the subsequent column(s) values could appear anywhere within the index structure BUT if these subsequent columns have relatively high cardinality, once we’ve referenced the required index entries for a specific occurrence of a leading column value, we can ignore all subsequent index row entries with the same leading column value. If the leading column has few distinct values, this means we can potentially “skip” several/many leaf blocks until the leading column value changes again, where we can again ”probe” the index looking for the subsequent indexed column values of interest.
So if we have a leading column with few distinct values, we may be able to use the index “relatively” efficiently by probing the index as many times as we have distinct leading column values.
On the other hand, if the leading column has relatively high cardinality then an Index Skip Scan is not a viable option. An index can generally store many hundreds of index entries per index leaf block, depending on the block size and the average size of the index row entries of course. So if the leading column were to change just once on average within the subsequent index leaf block, Oracle would be forced to scan the next index leaf block anyways as it may contain the required index row entry.
For Oracle to be able to “skip” an index leaf block, the leaf block must contain nothing but the same leading column value as last changed in a preceding leaf block. Hopefully, there are many leaf blocks where the leading column value doesn’t change and so hopefully there are many leaf blocks that can potentially be “skipped”.
Therefore, the cardinality of the leading column is crucial for an Index Skip Scan to be viable. In the example above where we had 100,000 distinct values for columns A and B, unless the table is massive, it’s unlikely an Index Skip Scan will be viable regardless of which column is the first column in the index. However, if column B only had 10 distinct values, then an index based on (B,A) may very well be able to use an Index Skip Scan whereas an index on (A,B) would not.
Note though that an Index Skip Scan must probe the index at least as many times as there are distinct values of the leading column. This will not be as efficient as an index that only requires the one index probe. Therefore although a query with a predicate based on A=42 could use an Index Skip Scan with an index on (B,A) assuming column B had few distinct values, an index on (A,B) or (A) would be more efficient as it would only require the one index probe.
However, if the performance of index (B,A) were “good enough” and/or a search on just A=42 was uncommon, then the index on (B,A) may be quite adequate and an index on (A,B) may be unnecessary. The index on (B,A) would also be able to handle queries based on columns A and B and queries based on just column B (providing the CBO determined the selectivity acceptable, which it might for unevenly distributed rare values of column B).
See this Index Skip Scan demo to see when it all may prove useful.
No, an Index Skip Scan doesn’t mean we don’t need to consider the column order of an index. If anything, it’s something else that needs to be considered and along with index compression, is another reason why low cardinality leading index columns have advantages.
Index Compression Part IV (Packt Like Sardines In a Crushd Tin Box) February 29, 2008Posted by Richard Foote in Index Compression, Index Internals, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
A great question by Brian Tkatch has convinced me there needs to be at least a Part IV in this particular mini-series. The basic question is what size does the column need to be for it to be effectively compressed ?
If a column is only just the one byte, then it can’t really be compressed as the prefix related overheads would likely make the exercise pointless, right ?
Well not quite …
True a larger column that is heavily repeated will yield a better return from compression than a smaller column, but even a single byte column could be effectively compressed, IF there are sufficient repeated values per index leaf block.
Previously, I illustrated a logical representation of how Oracle compresses a block. The following is more of a physical representation of a compressed block:
Prefix 0: David Bowie
Prefix 1: David Jones
Remember, only the leading columns in an index can be compressed and index row entries must be sorted by these leading columns. Therefore each prefix entry is referenced by one or more logically consecutive index row entries and each index row entry can only have the one index prefix. The corresponding prefix for a specific index row entry is therefore simply the prefix entry that precedes it within the leaf block.
If we look at sample prefix entries:
prefix row#0 flag: -P—-, lock: 0, len=6
col 0; len 3; (3): c2 08 43
prefix row#1 flag: -P—-, lock: 0, len=6
col 0; len 3; (3): c2 08 44
Note that the prefix row#0 starts at address 8030 while the next prefix entry row#1 starts at address 8015. That’s a total of 15 byes and yet the length of the prefix entry is just 6. Where’s the other 9 bytes ?
Well note the prc value for prefix row#0 is 1 meaning this prefix entry is only referenced by the one index row, as is the next prefix entry as well. The leading column in this particular example is actually unique meaning each and every prefix entry only has one corresponding index row.
If we look at the corresponding row entry, guess what it’s length will be …
row#0 flag: ——, lock: 0, len=9
col 0; len 6; (6): 04 81 aa 8b 00 69
row#1 flag: ——, lock: 0, len=9
col 0; len 6; (6): 04 81 aa 8b 00 6a
If you guessed 9, well done. Note that it’s starting address is 8021, which is the starting address of the previous entry 3015 + 6 bytes for the prefix entry which equals 8021 (as Oracle actually fills index blocks from the “bottom up”). Note that index entry row#0 has a psno of 0 meaning it’s corresponding prefix entry references prefix row#0.
Note that the index row length of 9 is made up of 2 byes for locks and flags info, 6 byes for the rowid and 1 byte for the rowid length (as this index is a non-unique index, albeit with unique index values).
Therefore the psno value actually uses 0 bytes as its value can be purely derived from it’s position within the leaf block in relation to its parent prefix value.
Let’s now look at an index row entry with a leading column that is heavily repeated but only one byte in size.
row#0 flag: ——, lock: 0, len=11
col 0; len 1; (1): 41
col 1; len 6; (6): 04 81 12 8a 02 67
row#1 flag: ——, lock: 0, len=11
col 0; len 1; (1): 41
col 1; len 6; (6): 04 81 12 8a 02 68
row#2 flag: ——, lock: 0, len=11
col 0; len 1; (1): 41
col 1; len 6; (6): 04 81 12 8a 02 69
Note the same leading column (hex 41) is repeated again and again within the leaf block and uses a total of 2 bytes of storage, 1 byte for the value and 1 byte for the column length. That means for each and every index row entry, this column uses 2 bytes. Note the length of the index row entries is currently 11 bytes.
Let’s look at the same index compressed.
prefix row#0 flag: -P—-, lock: 0, len=4
col 0; len 1; (1): 41
row#0 flag: ——, lock: 0, len=9
col 0; len 6; (6): 04 81 12 8b 00 42
row#1 flag: ——, lock: 0, len=9
col 0; len 6; (6): 04 81 12 8b 00 43
row#2 flag: ——, lock: 0, len=9
col 0; len 6; (6): 04 81 12 8b 00 44
First thing to note is that there is only the one prefix row entry for this particular leaf block and that all 727 index row entries all share the same prefix row entry.
The length of the prefix entry is 4, 2 bytes for these flags and locking info, 1 byte for the prefix column value and 1 bye for the prefix column length. That’s just 4, tiny little byes.
Note that each index row entry is now only 9 bytes reduced from 11 as we no longer need to store the 2 bytes for the repeated column entry. That means for a total overhead of 4 byes, we are able to save 2 bytes in each and every index row entry.
Not bad at all and as such represents a useful saving. In this particular example, the index reduced from 163 leaf blocks down to 138 leaf blocks, just by compressing a single byte column as the column has very low cardinality.
Note however if this leading column were unique for every row, we would increase the storage of this column from 2 bytes for every index row entry up to 4 bytes for every index row entry as every index entry would have it’s own associated prefix entry. That’s why compression can potentially increase the size of an index if the compressed column(s) has too high a cardinality due to the additional 2 byte overhead required to store all prefix row entries.
The following Index Compression Part IV demo shows all this in a little more detail.
There’s more to index (and indeed table) compression than simply compressing an index (or table) for the sake of it …
BTW, the spelling mistakes in the post title are fully intentional
Oracle Diagnostic Tools – Some Random Thoughts (Hammer To Fall) February 26, 2008Posted by Richard Foote in Oracle Cost Based Optimizer, Oracle General, Performance Tuning, Richard's Musings.
There’s been a bit of discussion lately and some interesting opinions aired, such as those by Daniel Fink , Doug Burns, and Alex Gorgachev regarding the whole issue of the usefulness or otherwise of various Oracle diagnostic tools (such as Statspack, AWR, ADDM, Extended Tracing, etc).
Are they really useful in diagnosing problems or do they simply provide a whole bunch of numbers that most folk can’t readily decipher ?
Boy, I could go on and on with this one …
IMHO, there’s no doubt that many (most even ?) people use many of these “diagnostic tools” incorrectly and use their view and perception of the data as some kind of justification for what really amounts to nothing more than a guess when attempting to determine the root cause and solution of a performance issue.
Look at all that time spent performing sequential reads, boy you need faster disks. Buffer Busy Waits are in your top 5 wait events, obviously need to add more freelists somewhere. Your session’s running slow eh, well statspack is showing enqueue waits times are really high, it’s probably that. If not, it might have something to do with those log file sync times (whatever they are).
I have no doubt about Mogens Norgaard’s assertion (as referred to by Daniel Fink) that if you take 2 people of the same skill in separate rooms and arm them with the same Statspack report, they would both come up with different suggestions on what needs to be done. I’ve lost count of the number of threads in forums that start with someone complaining about “performance”, posting a statspack report and ending up with 10 different people making 10 different suggestions.
I’ll digress somewhat, but this is a story I use to try and explain to my kids what it is I actually do.
Mum says she’s heading off to the local shops to get some milk for our breakfast cereal. The shops are close, a few minutes drive away and I expect she’ll be back in 10 minutes or so. 2 hours later, she finally gets home, drops off the milk and leaves again in a huff before we can ask what happened. Kids, my job is to find out what took her so long.
It’s at this point the kids say that we don’t normally have cereal, but toast anyways and if I’m angry at mum for taking so long I should have gone and got the milk myself, but I try to get the discussion back on track …
I know there was plenty of fuel in the car, the weather was fine and clear and that the traffic generally wasn’t too busy at the time. There’s also usually plenty of milk at the shop. Now I’ve been listening to the radio and have a pretty good idea of the overall traffic and weather conditions. There was a truck that overturned nearby, resulting in some pretty major traffic delays, causing lots of folk to run late.
So I “guess” it must have been the truck that likely caused the delay, right ?
The kids suggest it was probably someone she met and she just got caught up having a long chat. Then again, maybe the car broke down or maybe she didn’t have enough money and had to go to town to get to the bank or maybe …
The point of course is that we simply don’t know. Any diagnosis based on the data at hand would be nothing but a guess. Yes there maybe some historical additional reference we could use (like she sometimes gets caught up with someone for a chat) but it would still be a guess to assume it was the issue this time.
There maybe some “global” events taking place that could be a factor (such as the overturned truck, or really bad weather) but again, there’s nothing to directly suggest it’s the actual issue.
Yet how often do people when trying to determine the cause of specific performance issues turn to global, generalised, averaged to death statistics, charts and reports to diagnose the problem ? How often do people “hope” that because the weather looks bad, that it’s a likely cause of the problem. Or because the amount of traffic is more than a certain threshold or slower than a certain ratio that maybe that’s the cause of the issue.
How often do people make these (sometimes educated) guesses, get it wrong, apply a solution that makes no difference, take another stab, wrong again, yet another guess and yet another guess until eventually either they hit the right diagnosis and solution or the problem goes away (only to come back at some other point in time) …
A key issue is that many of these fancy diagnosis tools, reports and lists of statistics are used inappropriately. It’s a potential diagnostic tool but often the wrong tool for the problem at hand. A really really sharp and shiny and fancy looking saw when the problem is a “protruding nail” that needs to be hit with a hammer.
Another problem is that there’s generally no process or methodology associated with looking at these reports and charts and database wide statistics. As such they’re open to interpretation and differing views. Most databases can be improved somehow in all sorts of different ways but what is the precise reason for a specific (or general) database performance issue. Everyone is slightly sick or imperfect in some way or another (with the possible exception of David Bowie), but what exactly is it that’s killing us …
However the key problem is that often, very often, most of these diagnostic tools, reports and flashing screens don’t actually provide the necessary information to make an educated diagnosis. The answer to the problem is simply not to be found in statspack, or in the v$ views or on the radio or in the weather report. 10 people can look at a statspack report and 10 different solutions can improve the database in 10 different ways but none of them may necessarily solve the specific database performance issue that’s causing stress to the business.
The actual issue is buried and hidden and drowned out in a sea of numbers, averages, statistics and people who are all predominately able to get their milk in a timely manner.
Not that a “saw” is a totally useless tool. Jonathan Lewis recently referenced a rather nice article by Connie Green on how a saw can be used effectively for slicing through some issues. The traffic report can be a useful source of information.
However, the Oracle marketing machine has certainly been promoting many of these “shiny saws” to the point where many see them as being the tools of choice, no matter the performance issue at hand. Oracle says they provide great, “useful” information so they must be really really useful in solving my performance issues. The answer to my problems has got to be in here somewhere right, if I just know where to look ?
The problem is not necessarily with the diagnostic tools themselves but in the manner in which they’re often used and attempted to be applied to specific issues. A saw is not particularly useful at driving in a nail. You need a hammer for that …
Back to my little story.
Kids, imagine if we had a camera on mum and saw exactly what she was doing for the entire time she was away getting the milk. We could actually see where all the time was spent, see exactly what happened in the 2 hours she was away. We can account for every moment and see exactly what happened during the 1 hour and 50 minutes of “lost” time.
Then we could see that in fact, the car worked fine, she took another route to the local shops bypassing the truck, got the milk at the counter straightaway. However, when she got back to the car she had a problem unlocking the car door as the key was quite bent and got to the point where she just couldn’t open the door.
In fact, out of the 2 hours, 1 hour and 50 minutes was spent frustratingly trying to open the car door.
So it was a “locking” problem all along
No guesses. No assumptions. No ifs and maybes. We know exactly the root cause of the problem.
Therefore no wasted effort and time filling the car up with petrol, no need to drive the longer way around to miss that interchange, no need to demand she stop chatting at the shops (thank goodness), none of which if applied would have actually resolved the issue, none of which would have prevented the same problem from reoccurring again next time …
And that of course is the information Extended Tracing can provide. IMHO, if only this hammer were used more often, if this tool was considered more often to knock in that “protruding nail”, if people posted an extended trace file more frequently, then it would be a big step in the right general direction in correctly diagnosing problems.
Is a 10046 event, DBMS_SUPPORT, DBMS_MONITOR, etc. perfect ? No, of course not. Although there are constant improvements with most releases, it can be difficult to setup in some environments and configurations, it can be difficult to interpret and piece together, it can tell you what the issue might be without telling why, it may only tell you that the problem isn’t Oracle related (although that in itself can be useful), it requires the issue to generally be repeatable, it has overheads, etc. etc.
However, in most scenarios, when applied appropriately, it can provide the necessary information to diagnose the exact cause of performance issues. In most scenarios, it can take the guess work out of the equation and save time by driving one directly to the correct diagnosis, first time.
I’ll add this point in as well. Most people working on other software solutions trying to resolve performance issues, would faint with disbelief at the level of instrumentation available in Oracle. No it’s not perfect, but boy, things could be a lot lot worse.
My general recommendation is this. When you want to determine and diagnose the cause of specific or general database performance issues, consider extended tracing as the first tool within the toolkit. You want to know why the milk took so long, ensure you have a camera available and record what happens.
If you want to be on the lookout for low hanging fruit, if you want to have a global view of the general road conditions, of the weather, of the fuel left in the tank, and proactively see what areas in a database may benefit from some attention, then look at using “saws” such as Statspack, ADDM, etc.
IMHO, if the Oracle diagnostic tools were used more appropriately, if more people read Connie Green’s article, if more people investigated and applied the use of extended tracing, then I’m sure the perception of their usefulness would increase as well.
Meanwhile, I’m going to use a hammer to see if I can get this damn key straightened out …
BTW, the kids think I’m some kind of private investigator who follows and monitors people all day long so I guess I need to try again in explaining what it is I actually do …
Index Compression Part III (2+2=5) February 22, 2008Posted by Richard Foote in Index Compression, Index Internals, Oracle General, Oracle Indexes, Performance Tuning.
1 comment so far
As previously discussed, compression is only beneficial when the compressed columns have repeated values within an index leaf block. If a compressed column has repeated values, Oracle will basically store the one copy of the compressed column (or columns) as a prefix entry within a leaf block and each index row entry will then be associated with it’s corresponding prefix entry.
That being the case, it makes no sense in compressing a single column Unique index. Each index entry must be unique, there can’t be any repeatable, duplicate column values. Therefore compression will be not only totally ineffective, but would actually result in an index structure that increases rather than decreases in size due to the extra overheads associated with having a prefix index entry for each and every index row entry.
It also makes no sense in compressing every column in a concatenated, multi-column Unique index. For exactly the same reasons. Each compressed index column combination must be unique and would result in a prefix entry for each and every index row entry. Compression would be worse than useless and result in an increased index structure.
However, Oracle does its best to protect ourselves from ourselves.
For a start, it does not allow one to create a compressed index on a single column Unique index. Attempt to do so and Oracle generates an “ORA-25193: cannot use COMPRESS option for a single column key”. Hey, even the error message is nice and meaningful.
If the Unique index has multiple columns, the default prefix length value (number of compressed columns) is the number of indexed columns minus one, not all columns as it is for a Non-Unique index. See, Oracle is doing its best here to prevent a useless attempt at index compression.
If you specify a prefix length value equal or greater than the number of columns in a Unique index, Oracle generates an “ORA-25194: invalid COMPRESS prefix length value”. These are not restrictions but designed to stop the creation of inefficient compressed indexes.
Note however that Non-Unique indexes can be used to police primary Key (PK) and Unique Key (UK) constraints. I’ve discussed all this previously. The constraints might be Unique, the data might be unique but the index is Non-Unique and so these “protections” fly out the window. There is nothing stopping one creating a single column compressed Non-Unique index to police a PK or UK constraint. There’s nothing preventing you from creating a concatenated, Non-Unique index with all columns compressed, from policing a PK or UK constraint. In fact, you can even create a fully compressed Non-Unique index at the same time as the PK or UK constraint with Oracle’s extended constraint and index creation syntax.
Nothing stopping you, except perhaps the realisation that it would be a very bad and futile thing to implement as the resultant index will be guaranteed to be less efficient than an equivalent nocompress index.
Yes, it might make sense to compress just the leading columns of a Unique index or a Non-Unique index that’s policing a PK or UK constraint, if such leading columns have sufficient repeating values to make compression effective. But it would simply not make sense to compress all columns from such indexes.
See this demo Index Compression Part III on how compression works and doesn’t work for Unique Indexes.
In Part IV, we’ll look at the issue of what specific columns might benefit from compression and look a little closer at how storage is actually saved.
Whoever said there can’t possibly be enough things to discuss on a Blog that focuses mainly on Oracle indexes
Index Compression Part II (Down Is The New Up) February 20, 2008Posted by Richard Foote in Index Compression, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
Compression can be very beneficial when the compressed columns of an index have many repeated values within a leaf block as these repeated values are only stored once and subsequently referred to within the index row entries themselves.
However, when the leading column is very selective or the compressed columns are very selective and have very few or possibly no repeating values in an index, then we have a problem.
In these scenarios, Oracle will create a massive prefix table, housing most if not all the individual column values anyways, as the prefix table stores all unique combinations of compressed column values within a leaf block.
The index row entries will point or refer to a prefix entry which is hardly shared (if at all) by other index row entries.
Compression in these cases is not only pointless but potentially worse than pointless as we now not only have to store a similar amount of index column data anyways, but additional compression related overheads such as those associated with the prefix table entries.
Net result can be no effective compression or indeed leaf blocks that can no longer store as many index values due to these associated compression overheads, resulting in indexes increasing rather than decreasing in size as a consequence.
For concatenated, multi-column indexes, the order of the indexed columns can make a huge difference to the compression of an index. If the leading column or columns (depending on how many columns are compressed) have many repeated column value combinations, not a problem as compression will be effective.
However if the leading column is very selective, then compression by definition will be ineffective, as there must be many distinct column values in the prefix table as a result which are less likely to shared by the index row entries.
For compression to be effective, the prefix table should have considerably fewer entries than index row entries in most leaf blocks. If there are approximately (say) 200 index entries per no-compressed leaf block, you want the number of distinct column combinations within the leaf block to be substantially less than 200.
For index compression to be feasible, ensure low cardinality columns are the leading columns in a concatenated index, else compression may be futile or worse.
See this demo on Index Compression Part II to see how compressing an index can be either effective or plain terrible and how the order of columns in a concatenated index can make all the difference to the effectiveness of index compression.
Next I’ll cover index compression with Unique Indexes …
Index Compression Part I (Low) February 17, 2008Posted by Richard Foote in Index Compression, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
Index compression is perhaps one of the most under used and neglected index options available. It has the potential to substantially reduce the overall size of Non-Unique indexes and multi-column Unique indexes, in some scenarios dramatically so. A smaller index, especially if it stays permanently smaller without any subsequent expensive maintenance operations, it always a nice thing. Not only will it potentially save storage, but if the resultant index contains fewer leaf blocks, that’s potentially fewer LIOs and from the Cost Based Optimizer’s point of view, potentially a cheaper execution plan option.
All possible without a single index rebuild in sight …
However like most things Oracle, index compression also has the potential to be misused and to cause rather than solve problems. So it really helps to understand how index compression works and how it’s actually implemented before we rush into anything.
The first point to understand is that index compression is implemented at the index block level. Basically, Oracle stores each distinct combination of compressed column values found within a specific index leaf block in a “Prefix” table within the leaf block and assigns each combination a unique prefix number.
The more numbers of distinct compressed column values, the more entries in the prefix table and the larger the prefixed related data. The fewer numbers of distinct compressed column values, the fewer entries in the prefix table and the smaller the prefix related data. Generally, the fewer entries in the prefix table, the better the compression.
Oracle now no longer stores these compressed columns within the actual index row entries. These compressed columns are substituted and referenced with the details stored in the prefix table entry, which are shared by all index row entries with the same corresponding prefix value.
Only leading columns (which in a Non-Unique Index can potentially be all the columns in an index, in a Unique Index can potentially be all but the last column in the index) can be compressed. Therefore, the prefix table is in the same logical order as they would appear in the index row entries. The values of the prefix values always appear within the index row entries in a sequential manner, noting that (hopefully) several row entries share the same prefix number.
Let’s say we currently have a nocompress Non-Unique index on a NAME column with the following entries:
0: David Bowie
1: David Bowie
2: David Bowie
3: David Bowie
4: David Bowie
5: David Jones
6: David Jones
7: David Jones
After the index is compressed, the leaf block entries would look logically something like this:
0: David Bowie
1: David Jones
Importantly, each distinct combination of compressed column values is now stored just the once within an individual index leaf block. In the example above, we previously stored “David Bowie” 5 times. In the compressed index leaf block, we now only store “David Bowie” once, with the prefix value now being referenced by each index entry instead.
The net result being we can now (hopefully) store many more index entries per leaf block meaning we don’t need as many leaf blocks in our index.
To compress an index, simply specify the COMPRESS option:
CREATE INDEX bowie_table_i ON bowie_table(col1, col2) COMPRESS 2;
The number after the COMPRESS keyword denotes how many columns to compress. The default is all columns in a Non-Unique index and all columns except the last column in a Unique index.
This demo, Index Compression Part I shows how an appropriately compressed index can substantially reduce the overall size of an index. It also shows a couple of index leaf block dumps to highlight how index compression is implemented.
In Part II, I’ll show you how you can really stuff things up by implementing index compression totally inappropriately …
Clustering Factor: A Consideration in Concatenated Index Leading Column Decision (Sweet Thing) February 15, 2008Posted by Richard Foote in Clustering Factor, Concatenated Indexes, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
Short but sweet today.
I last discussed how high cardinality columns shouldn’t necessarily be in the leading column of a concatenated index as they don’t provide the performance benefit as sometimes claimed.
If all things are equal and the columns in the concatenated index are all likely to be referenced, a simple consideration that is often forgotten when deciding which column to have as the leading index column is the Clustering Factor of the corresponding columns.
As previously discussed, the Clustering Factor determines how well aligned or ordered the index entries are in relation to the rows in the parent table. So if the rows are ordered within the table on a particular column or columns (such as a sequential ID column, a monotonically increasing date or time-stamp, etc), then an index on these columns is likely to have a very good Clustering Factor. Consequently less IOs will be required to retrieve all the required rows via the index as all the required rows will be housed in relatively few, well clustered data blocks.
It therefore makes sense to at least consider the Clustering Factor of the various columns in a concatenated index. Why ? Because if the leading column has a very good Clustering Factor, the concatenated index by definition must also have a very good Clustering Factor as all indexes are primarily ordered based on the leading indexed column. A concatenated index with a good Clustering Factor is going to be more efficient in retrieving rows from the table and more importantly, will be considered more desirably by the CBO when costing access path options.
Of course, the opposite is also true. By having a leading column with a poor Clustering Factor will mean the concatenated index will have a poor Clustering Factor, making it less efficient and less likely to be considered by the CBO.
As such, the Clustering Factor of each corresponding column in a concatenated index is at least worthy of some consideration when making the decision on how best to order the indexed columns.
This demo on Index Column Order and Clustering Factor shows how the order of columns in a concatenated index has a big impact on the Clustering Factor of the resultant index.
UPDATE: However as Tom Kyte has stated in the comments, in virtually all cases, the Clustering Factor is not really a factor (yes, pun fully intended) as subsequently columns are generally going to impact the CF anyways or the selectivity of the index is such that the improved CF is not relevant anyways.
More relevant considerations regarding the ordering of columns in an index coming I promise
A common myth or mis-perception is that when deciding how to order the columns in a concatenated, multi-column index, one should avoid placing low cardinality columns in front.
For example, if you want to create an index on two columns, column ID which has many many distinct values and column CODE which has very few distinct values, create the index as (ID, CODE) as it’ll be far more efficient than a corresponding index on (CODE, ID).
The reasoning goes that by creating the (CODE, ID) index, one decreases the performance and efficiency of using the index as Oracle will have to scan through multiple index leaf blocks containing the low cardinality column, until it eventually finds the specific index entries of interest.
Or so the theory goes …
In actual fact, there’s no real difference in navigating to the specific leaf block of interest for an index on (ID, CODE) compared to an index based on (CODE, ID), providing both indexed columns are known.
The important fact that’s missed is that the branch index entries contain column entries based on all indexed columns, or at least on as much as is necessary to uniquely identify the required navigational path. Therefore, Oracle can directly navigate to the leaf block of interest, no matter the index order, providing all index column values are know.
The only slight overhead that an index based on (CODE,ID) will have is that these branch index entries are going to be somewhat larger as it will likely require both columns for the branch index entries but likely only the one column the other way around. However, branch blocks usually take up a small percentage of the overall index structure and this (usually) minor overhead is very unlikely to make a difference to the index height.
This demo on Index Column Cardinality Order shows how Oracle navigates to a specific leaf block of interest in the same manner and with the same costs, regardless of the ordering of low and high cardinality columns in the index. It also shows and describes a couple of index branch block dumps to highlight how Oracle uses the column values to define the necessary navigational path.
So the high cardinality column shouldn’t necessarily be the column given leading column status.
In actual fact there are a number of good reasons why the low cardinality column could be considered as the better option as the leading column. For a start, the index can be compressed much more efficiently if the leading column has lower cardinality. Also, an Index Skip Scan can at least be considered if the leading column has lower cardinality.
Of course, the ordering of columns in an index can be very significant and can make a huge difference to the possible efficiency of an index for other key reasons as well. Whether the leading column is always going to be a known value is an important consideration, as is the clustering factor of the leading column.
All good discussions for another day