jump to navigation

How To Rebuild And Make An Index Bigger, Not Smaller (Carry That Weight) January 13, 2009

Posted by Richard Foote in Index Shrink, Oracle Myths.
22 comments

I sometimes hear suggestions along the lines of:

 “when you rebuild an index, at least you make the index as small and efficient as possible, even if it doesn’t necessarily improve performance”

or

“when you rebuild an index, at least you’ll always save some space and storage if nothing else”.

However, this of course is not necessarily the case. There are many scenarios where by rebuilding an index, you can actually make the index bigger, not smaller, you can make the index use more storage, not less storage, you can make the index less efficient, not more efficient and hence you can make the exercise worse than useless, not just useless.

It all depends.

Here are just a few little examples whereby rebuilding the index has resulted in a larger index than it was before the index was rebuilt. Note in all of these scenarios, I’ll be sticking with the default PCTFREE value of 10%, which is by far the most commonly set PCTFREE value of every Oracle index out there in existence.

The first example is very simple and so very common. It’s simply an index on a monotonically increasing value, as used by many Primary Keys out there in the “real world”. 

First we create the table and associated index. Nice and simple …

SQL> create table bowie (id number);

Table created.

SQL> create index bowie_i on bowie(id);

Index created.

We now populate the index with a whole bunch of monotonically increasing values …

SQL> insert into bowie select rownum from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete

Let’s see how big our index might now be …

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2048    1999    16024128      100

Note the index has a “perfect” PCT_USED value of 100%, you can’t get any better than that !! Why ? Because as the index is on a monotonically increasing value, all the inserts take place on the “right hand side” of the index and Oracle performs it’s 90-10 index block split operation, rather than the 50-50 block split. 

If we try and rebuild such an index …

SQL> alter index bowie_i rebuild;

Index altered.

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2304    2226    17848160       90

We notice that the index is now actually bigger, not smaller. Not only that, but we have now introduced 10% free space that will likely not be reused.

 

Another example.

This time we build a similar table but this time populate it with randomly generated numbers, not monotonically increasing values.

SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number);

Table created.

SQL> insert into bowie select ceil(dbms_random.value(0,100000)) from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

Let’s now create an index on this numeric column …

SQL> create index bowie_i on bowie(id);

Index created.

Let’s now add more random numbers to the table …

SQL> insert into bowie select ceil(dbms_random.value(0,100000)) from dual connect by level <=50000;

50000 rows created.

SQL> commit;

Commit complete.

OK, let’s check out the index …

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2304    2214    17760192       95

OK, the index has a PCT_USED value of 95%.

However, after the rebuild …

SQL> alter index bowie_i rebuild;

Index altered.

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2432    2324    18640192       90

The index is now bigger, with more leaf blocks and a worse PCT_USED value.

If we rebuild an index before the index has had the opportunity to use its available free space, an index rebuild can introduce more free space, not less. Now this may not necessarily be a bad thing, but it’s another example of the index potentially being bigger, not smaller after a rebuild.

 

Here’s yet another example.

Similar to the previous example, but this time we’ll throw in a whole bunch of update statements as well. As we know, an update statement within an index is effectively a delete statement, followed by an insert statement. All that “wasted” deleted space should be nicely cleaned up after an index rebuild, resulting in a smaller index structure. Or so we might think …

First, create and populate the table with randomly generated numbers similar to before …

SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number, value number);

Table created.

SQL> insert into bowie select rownum, ceil(dbms_random.value(0,100000)) from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

Now let’s create the index and aftewards insert a bunch of new rows as before.

SQL> create index bowie_i on bowie(value);

Index created.

SQL> insert into bowie select rownum+1000000, ceil(dbms_random.value(0,100000)) from dual connect by level <=50000;

50000 rows created.

SQL> commit;

Commit complete.

Now let’s run a little procedure that will update roughly 10% of all the data (a  reasonable overall percentage), with new random numbers.

SQL> begin
  2  for i in 1..105000 loop
  3    update bowie set value = ceil(dbms_random.value(0,100000))
  4    where id = i;
  5    commit;
  6  end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

OK, let’s now look at the index, wasted space and all …

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2304    2234    17952320       94

Interestingly, even with all the random updates (and associated deletes), the index is relatively efficient at 2234 leaf blocks.

Let’s see how many of the 105,000 updated entries have remained deleted:

SQL> select del_lf_rows from index_stats;

DEL_LF_ROWS
-----------
       1129

Not many at all, just 1129 deleted entries remain, less than 1% !!

In fact, the vast majority of the deleted entries have all been automatically cleaned out by Oracle. Much more on the myth that an index with lots of update activity needs to be rebuilt to come in the near future …

Let’s rebuild this index now.

SQL> alter index bowie_i rebuild;

Index altered.

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2432    2324    18640192       90

The index is again bigger than it was previously. Even after we’ve added in a bunch of rows and updated 10% of the entire table.

The point here of course is that even if an index is heavily updated, the deleted space can generally be subsequently reused. Like I said, more on all this in a future post.

For now however, just note that rebuilding an index doesn’t necessarily mean the resultant index will now be smaller or more efficient or more pristine. In many cases, rebuilding an index blindly can result in a larger index structure than it was prior to the rebuild.

Bear this in mind the next time you read or hear someone suggest an index rebuild will always ensure the new index is smaller and as efficient as possible …

Index Rebuild vs. Coalesce vs. Shrink Space (Pigs – 3 Different Ones) February 8, 2008

Posted by Richard Foote in Index Coalesce, Index Rebuild, Index Shrink, Oracle General, Oracle Indexes, Performance Tuning.
31 comments

Previously, I discussed how an ALTER INDEX  … COALESCE is going to be less expensive in terms of using resources than an equivalent ALTER INDEX … SHRINK SPACE COMPACT (or ALTER INDEX … SHRINK SPACE) as the Coalesce doesn’t have to concern itself with ensuring all leaf blocks at the physical end of the index segment have all been moved to allow for the storage to be de-allocated from the index segment. If you just want to de-fragment an index and not necessarily reduce the overall space allocated to the segment, use Coalesce rather than the Shrink options as it’s cheaper.

But what about an ALTER INDEX … REBUILD, when, if ever, should it be used ?

Well the answer is as with most things Oracle, it depends.

Scenario One.

We have a table and the application deletes historical data but in a manner in which leaf blocks are not being entirely emptied. Basically, older stuff is removed, but it’s only removed in a random manner, from approximately the “earlier” 10% of the table. The index is sequenced which means only those leaf blocks in the “left-most” 10% of the index structure are impacted but all this deleted space is “deadwood” as new index entries are only being inserted into the “right-most” part of the index.

Note that basically 90% of the index is fine and very well utilised, it’s only 10% of the index that’s problematic. Of the problem 10% of leaf blocks, there’s plenty of free or deleted space, with many leaf blocks almost but not quite empty.

Coalesce (and indeed Shrink) will basically run through these 10% of fragmented leaf blocks and will merge the index row entries into as few leaf blocks as possible. With the 90% of blocks that are fine, Coalesce will basically read and then ignore them from any processing as there’s nothing that can be done for them.

Rebuild on the other hand will take an entirely different approach. It will (generally) read the entire existing index structure and will build a brand new, bright and shining index segment.  As part of this process, it will rebuild the entire index, it has no choice (assuming the index isn’t partitioned, but that’s another story) and will rebuild the 90% of the index that was actually perfect to begin with. Rebuilding 90% of something that doesn’t need rebuilding doesn’t sound particularly efficient and indeed it isn’t. As a result, the index rebuild will use substantially more resources and generate substantially more redo than an equivalent Coalesce (or Shrink Space).

Scenario Two.

We have an application that deletes data and it deletes data throughout the entire index structure. The deletes are significant with a substantial proportion of the overall rows having been deleted. Additionally, the table is not going to be repopulated with anything like the same volume of data or it won’t be repopulated for a substantial period of time. As such, all this deleted index space is “deadwood” as it’s not going to be used any time soon, if at all.

Now typically in this sort of scenario, it’s of course the table as much as the associated indexes that needs to be rebuilt. That’s a key point. However, maybe Full Table Scans are not an issue for this table so the wasted space in the table is not of urgent concern. Maybe the table in not in an ASSM tablespace or in a database that supports a Table Shrink command and maybe moving the table is not an immediate option due to availability concerns. For whatever reason (or lack of reason), the index needs to be de-fragmented.

Note it’s the entire index that’s problematic here and there could be portions of the index that have very few remaining index entries.

Now poor Coalesce (and indeed Shrink) has a bit of an issue here. They both merge index entries from two blocks into the one block where it can. However, if leaf blocks are really empty, these merged index entries may in turn be merged and moved again with index entries from yet another leaf block. And maybe yet again with another leaf block. And again and again … So a specific index entry may actually be moved into several different leaf blocks during the entire process. Each of these moves requires resources and generates redo and takes time.

Now the rebuild has an entirely different approach. As mentioned, it will basically (generally) read the entire exisiting index structure and will build a brand new one, but importantly as it does so will only have to locate a specific index entry once and once only. Also, as it’s the entire index structure that’s problematic, there’s no issue with fixing the entire index, as it’s all “broken”.

As a result of only having to deal with an existing index entry the once vs. the Coalesce which may relocate a specific index entry many times, the index rebuild is going to be substantially more efficient and potentially use significantly less resources and generate less redo.

This demo of the Differences between a Coalesce, Shrink Space and Rebuild shows when one out performs the other.

Basically, Coalesce is particularly efficient and uses less resources when the percentage of the overall index structure that’s problematic and fragmented is relatively small (less than approximately 20-25% of leaf blocks). Rebuild is particularly efficient when the percentage of the overall index structure that’s problematic and fragmented is relatively large and the average degree of fragmentation within an index leaf block is relatively high.  Note Pre 10g, an index needed to have at least 50% free space less pctfree in neighbouring leaf blocks for a Coalesce to be effective.

Now Rebuild (and Rebuild Online) potentially have locking implications that need to be considered although as we’ll see later, 11g has addressed some of these issues …

Differences and Similarities Between Index Coalesce and Shrink Space February 6, 2008

Posted by Richard Foote in Index Coalesce, Index Shrink, Oracle General, Oracle Indexes, Performance Tuning.
8 comments

As already discussed, ALTER INDEX COALESCE in 10g onwards works in a very similar manner to ALTER INDEX SHRINK SPACE.

However, there are a number of key differences.

The first thing to point out is that each command has a slightly different purpose.

Coalesce is designed specifically to reduce fragmentation within an index but not to deallocate any freed up blocks which are placed on the freelist and recycled by subsequent block splits.

Shrink is designed specifically to reduce the overall size of an index segment, resetting the High Water Mark (HWM) and releasing any excess storage as necessary.

The key difference being that Shrink must reorganise the index leaf blocks in such a way that all the freed up, now empty blocks are all grouped together at “one end” of the index segment. All these blocks can then be deallocated and removed from the index segment. This means that specific leaf block entries must be removed from these specific blocks, in order to free up the leaf blocks in this manner.

Although Coalesce in 10g performs the operation in a similar manner to that of the Shrink Space, it can be more “lazy” in how it deals with the subsequent empty blocks and places then on the segment freelist as necessary.

COALESCE and SHRINK SPACE COMPACT are logically equivalent commands. Both options will “defragment” an index by “merging” index entries where possible thus reducing the number of blocks within the logical index structure. Both will result in the same number of leaf blocks within the index and both will result in the index height not being changed.

However, there are two key differences.

1) The SHRINK SPACE COMPACT option has the disadvantage of being more expensive to process as it has to concern itself with ensuring all necessary blocks can be emptied from the physical “end” of the index segment to be subsequently deallocated. This will result in more undo and redo being generated during the defragmentation of the index than would have been generated by the same corresponding COALESCE command.

2) The SHRINK SPACE COMPACT option has the advantage of being able to immediately deallocate the empty blocks, thereby reducing the actual size of the index segment by issuing a subsequent SHRINK SPACE option (although of course this can be performed in the one step by issuing SHRINK SPACE in the first place). However, the COALESCE option will not be able to just deallocate the free space. A subsequent Index SHRINK SPACE command on a previously coalesced index will require additional undo and redo than that of a previously “Shrunk” index as the necessary empty blocks are removed from the freelist and redistributed to allow for the de-allocation of blocks and the resetting of the High Water Mark of the index segment.

Note also that the Shrink option can only be used in Automatic Segment Space Management (ASSM) tablespaces.

Use Coalesce when the intent is to just defragment an index, knowing that the freed leaf blocks will be recycled by subsequent block splits, as it uses less resources than an equivalent Index Shrink Space.

Use Shrink Space when the intent is to reduce the actual storage allocated to an index, for example in the scenario where a table has permanently reduced its size and the index is unlikely to reuse the freed storage.

This demo highlights the Differences (and similarities) between an Index Coalesce and an Index Shrink Space.

Note however, that an index REBUILD might actually use substantially less resources than either a Coalesce or a Shrink Space and might reduce the height of an index as well.

But that’s a discussion for another day …

Index Internals – Rebuilding The Truth December 11, 2007

Posted by Richard Foote in Index Coalesce, Index Height, Index Internals, Index Rebuild, Index Shrink, Index statistics, Oracle Indexes, Oracle Myths, Oracle Opinion, Richard's Musings.
8 comments

The issue of when to rebuild indexes crops up again and again and again. Unfortunately, so do the same incorrect, myth-filled uninspired responses which considering how important and “key” (no pun intended) indexes are to database design and performance generally, is very puzzling indeed.

In the coming days I’ll post why I believe these index related myths simply refuse to go away …

This presentation was originally written several years ago but is still as relevant today as it’s always been.

Recently updated version: Index Internals – Rebuilding The Truth

Follow

Get every new post delivered to your Inbox.

Join 1,808 other followers