jump to navigation

Store Indexes In A Larger Block Tablespace: Height Reduction 1/2 Myth (Five Foot One) March 26, 2008

Posted by Richard Foote in Index Block Size, Index Height, Index Internals, Oracle General, Oracle Indexes, Oracle Myths, Tablespace Management.
trackback

A common misconception with using a larger block tablespace specifically for indexes is that this will result in a reduction in the height of indexes and hence “flatten” index structures.

However, this is only partly true.

A few little generalisations to begin with.

First, most databases out there have a default block size of 8K. I won’t go into a big discussion on what the database block size should be set to (maybe another time), however I will say most databases these days have a default block size of 8K and that it’s certainly questionable to have the database block size set to 2K.

Note also in many common platforms (e.g. Windows, Linux) the maximum block size limit is 16K. So in many environments, when we talk about moving indexes into a bigger sized block, it specifically involves moving from an 8K to a 16K block size.

Yes, in theory you could move an index from (say) a 2K block size up as high as a 32K block size but you would need to question why the default is so low to begin with and whether the upper value is actually supported in your environment.

I make this point because the difference between block sizes makes a huge difference in the probability of the index height actually being reduced.

So let’s start with an example of moving indexes between an 8K default block size to a 16K block tablespace, not least because the arithmetic is easier and extrapolate out as we go along.

So in our example, the new block size is double or 2 times that of the default one. By doubling the block size, we effectively 1/2 the number of necessary leaf blocks in the index structure. By having fewer leaf blocks we also therefore reduce the overall associated block level overheads so the actual reduction in leaf blocks could be a tad more, but we’ll say a 1/2 reduction to keep the numbers nice and simple.

Note the reduction in leaf blocks in therefore simply 1 / the ratio of block increase (1/2). Moving from a 2K block to a 32K block is 16 times larger so we’ll have approximately 1/16 the number of leaf blocks.

So how does (say) halving the number of leaf blocks impact the overall height of the index ?

We obviously can’t reduce the height of an index with a height of just 1. The index consists of just the one block so a larger block would simply mean the block having more free space.

To reduce the height of an index with a height of 2 (back to 1), we therefore must be able to store all index entries within a single block. Therefore, in the 8K to 16K example, the index can only have 2 full leaf blocks for this to be possible. If an index has 3 or more “filled” leaf blocks, the index must remain at a height of 2 as we can’t fit all the index entries into the single larger index block.

Importantly therefore, all indexes with a height of 2 with more than 2 full leaf blocks would not reduce in height by simply doubling the block size. This could very well be the vast majority of indexes at this level.

For an index with a height of 2, the index must have less full leaf blocks than the ratio of block increase for a height reduction to be possible. In our best case scenario, 2K block to 32K block, any index with more than 16 full leaf blocks would not reduce in height.

To reduce the height of an index with a height of 3 (back to 2), we must therefore be able to store all intermediate branch blocks into the one branch (root) block. When we double the block size, we therefore 1/2 the leaf blocks and 1/2 again the necessary branch blocks. Therefore the necessary branch blocks is 1/(2×2) = 1/4 that of the default block size. Therefore any index with a height of 3 that has more than 4 full intermediate branch blocks will again not reduce in height as again all the necessary branch information would not fit in one root block.

Importantly therefore, all indexes with a height of 3 with more than 4 full intermediate branch blocks would not reduce in height by simply doubling the block size. Again, this could very well be a significant proportion of all indexes at this level. Note also in many databases, the vast majority of indexes have a height of 3 or less so by simply doubling the index block size, most indexes would not reduce in height …

For an index with a height of 3, the index must have less full intermediate branch blocks than the ratio of block increase to the power of 2 for a height reduction to be possible. In our best case scenario, the 2K block to 32K block, only those indexes with more than 16×16=256 full intermediate branch blocks will reduce in height. This is therefore likely to be a far higher proportion of all such indexes.

You see the pattern …

To reduce the height of an index with a height of 4 (back to 3), we must therefore store all first level intermediate branch blocks into the one branch (root) block. When we double the block size, we therefore 1/2 the leaf blocks, 1/2 again the second level intermediate branch blocks and 1/2 again the first level intermediate branch blocks. Therefore the necessary first level intermediate branch blocks is 1/(2x2x2) = 1/8 that of the default block size. Therefore any index with a height of 4 that has more than 8 full intermediate first level branch blocks would again not reduce in height as again all the necessary first level branch information would not fit in the one root block.

Importantly therefore, all indexes with a height of 4 with more than 8 full intermediate branch blocks will not reduce in height by simply doubling the block size. However, as the index height increases, the ratio of indexes where this is likely to be the case decreases.

For our best case scenario, 2K to 32K, we now start hitting very large numbers 16x16x16=4096 so the likelihood of a index height reduction is very very high.

And so on …

The important point being that by simply doubling the index block size, in most databases, the vast majority of indexes are actually quite unlikely to reduce in height as the index needs to be within very limited size boundaries for the index height to reduce. The greater the index height however, the greater the index size boundaries whereby an index height reduction is possible.

Also, the greater the index block increase, proportionally the fewer the index blocks and so greater the likelihood of an index height reduction.

This demo on the Impact Of Block Size On Index Height illustrates that by simply doubling the index block size, the height of an index (in various sizes) rarely decreases.

One final point. With our height 4 index example, note the index can only have a maximum of 8 first level branch blocks for the height to reduce. Therefore, in effect, we’re replacing a maximum of 9 x 8K branch blocks with 1 x 16K block. If this index is frequently accessed, these 9 branch blocks are likely cached and we only need to read two of these blocks anyways for an index range scan (for a total of 16K). After the rebuild, we still need to read this block (16K again) anyways so from a purely performance perspective with regard to just simply reducing the index height, the so-called performance benefits are often very much exaggerated.

As we’ll see in the next epic episode of this series, performance can actually decrease :(

Next time someone claims moving indexes into a larger block size will decrease the height and flatten an index, remember it really does depend. In many databases, especially when the index block size is just doubled, it’s actually quite surprising just how unlikely it is for an index to actually decrease in height.

About these ads

Comments»

1. John - March 27, 2008

I wonder if the performance gains seen by some people following index rebuilds with larger block sizes is more due to an increase in index density following a freshly built index.

I wonder if an index on a table in an OLTP system with many INSERT, UPDATE and DELETE operations will become quite sparsely populated i.e. many leaf blocks not being near maximum capacity. UPDATEs on some indexed columns may cause entries to move within the leaf blocks e.g. updating a shipping or some other date. It is possible that the new destination leaf block needs to be split to store the updated entry, resulting in two blocks 50% full. Equally DELETEs would leave empty slots within leaf blocks.

As you point out, a rebuild with the same block size would result in a smaller index in such a situation, as each leaf block would now be near full capacity. And there would be a consequential reduction in branch blocks too. Combined with an increase in block size, this could result in a reduction in the height of the index. But as you say, this would not be due to the use of a larger block size on its own.

Is there a relatively simple way of comparing the current size of an index on disk to its size if rebuilt? i.e. its estimated optimal size if built again from scratch. This would help identify indexes that would shrink a lot from such a rebuild, regardless of block size. I suppose this is some of the AWR functionality of Oracle and the Segment Advisor which identifies inefficiently stored segments and advises rebuilds.

John

2. Richard Foote - March 27, 2008

Hi John

That’s one of the points I’ve been trying to make. You rebuild an index, maybe the index rebuild itself contributes to the overall improvement in some scenarios.

Or maybe the new tablespace uses faster disks, faster parts of a disk, a disk with less contention, better memory management, etc.

Remember, a randomly populated index on average is approximately 70-75% used. Index Fast Full Scans are the execution plans most impacted by reducing the overall index size as the costs are directly proportional.

Yes the segment advisor can give you info on indexes where there may be space savings, although note these space savings may only be temporary as the index will simply split out again in the future at a cost.

3. Gary - March 27, 2008

I’ve been wondering about whether its the separation of the caches. The obvious way to test is a comparison after
a) rebuild to the default block size
b) rebuild to the larger block size
c) rebuild to a smaller block size.
This would determine whether it is the rebuild, the block size or the separation.

4. Thomas Kyte - March 28, 2008

@Gary

or just alter the segments to be in the keep or recycle pool – eg: REMOVE the rebuild from the equation all together.

You do not need multiple block sizes for something as trivial as cache segregation. And it is so much easier to do the keep/recycle pools than it is to rebuild the entire index….

5. Richard Foote - March 28, 2008

Hi Gary

If you move an index to another, separate cache, then it could have an impact on the performance of the index (for better, for worse, or maybe not at all). If the default buffer cache is poorly sized but the bigger block cache is sized appropriately, the index may now have better caching characteristics and so perhaps perform better as well. However, tune the default cache (or the bigger block cache) and performance becomes comparable again.

Your test wouldn’t however necessarily determine whether it’s the rebuild, the block size or the separation as it could also be the disk performance that makes the difference, or the load on the system at the time of the benchmarks, or the sql parsing overheads, etc.

However, you would likely find at the end of the day, there’s likely not going to be much difference either way …

6. Richard Foote - March 28, 2008

Hi Tom

I noticed on your site today a question on Undo Tablespace Blocksize where you said:

“BUT – do not use multiple block sizes for anything other than transporting data from database A to database B where the block size in A is different from B. No silver bullets with this “trick”, nothing you want to do in real life. The cases whereby multiple blocksizes are useful are typically limited to benchmarks, old wives tales, and very exceptionally rare conditions.”

It perfectly sums it all up :)

7. Log Buffer #90: a Carnival of the Vanities for DBAs - March 28, 2008

[...] Foote cracks what he calls the height reduction 1/2-myth, beginning, “A common misconception with using a larger block tablespace specifically for [...]

8. Gary - March 29, 2008

“If the default buffer cache is poorly sized…” But size isn’t necessarily the only thing that can cause buffer cache issues. There is that demo SQL to artificially set a buffer cache hit ratio which shows that SQL can make a buffer cache behave ‘sub-optimallly’.
This is mostly a ‘thought experiment’ on what might be the effect of separating index and table buffer caches. I suspect that in most cases there wouldn’t be a benefit and it would probably be detrimental (otherwise Oracle would tweak the caching algorithm accordingly themselves). But in odd cases it may mitigate some effects of bad SQL/plans (sort of like having a smaller library cache may improve the performance of an application not using binds).
If I had an environment which had shown a performance benefit, it would be interesting to see if there had been any significant change in the number of index blocks in the buffer cache. If not, it would at least rule this out as a contributory factor.

9. Thomas Kyte - March 30, 2008

@Gary

not to be redundant… but….

or just alter the segments to be in the keep or recycle pool – eg: REMOVE the rebuild from the equation all together.

pretty much *every* time I meet someone that says “large blocksizes are better for indexes”, we discover “oh, it was physical IO, gee I see”…..

10. Richard Foote - March 30, 2008

Hi Gary

Correct, size isn’t the only thing that can cause discrepancies, it was just an example. As Tom suggests, if you just want to test separating indexes from tables at the cache level, simply assign the indexes in question to a different buffer pool (keep or recycle).

The point I was making which I think you agree with, is that by rebuilding an index in a different sized tablespace, you use a different cache and it could be caching characteristics that cause performance differences. When making comparisons between the two, ensure that indeed they’re the same so you can rule it out as a contributing factor.

I think we agree :)

11. Sum Gai - March 30, 2008

“You rebuild an index, maybe the index rebuild itself contributes to the overall improvement in some scenarios.”

But index rebuilding RARELY helps, eh?

12. Richard Foote - March 30, 2008

Hi Sum

Where’s the conflict between “in some scenarios” and “rarely” ?

Read my rebuild presentation and you’ll see I use a classic example of when a rebuild may be benefical …

13. Sum Gai - April 2, 2008

“Where’s the conflict between “in some scenarios” and “rarely” ?”

How frequent is it? You seem to advocate hardly ever rebuilding.

14. Richard Foote - April 3, 2008

Hi Sum

Not very frequent at all, however it’s an example of why rebuilding in a larger block tablespace might appear to improve performance, at least temporarily, in some scenarios.

Appear, because it’s actually the reduced number of blocks in the index that’s improved the Index Fast Full Scan, not the larger block. A rebuild in the same block tablespace would yield similar results.

Temporarily, because once the index leaf blocks split again and the index increases in size again, performance returns back to what it once was.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 1,713 other followers

%d bloggers like this: