jump to navigation

Index Rebuild, the Need vs the Implications Support Note 989093.1 (Getting Better) March 5, 2014

Posted by Richard Foote in Doc 122008.1, Doc 989093.1, Index Rebuild, Oracle Indexes, Oracle Myths.
4 comments

Once upon a time, Oracle Support had a note called Script: Lists All Indexes that Benefit from a Rebuild (Doc ID 122008.1) which lets just say I didn’t view in a particularly positive light :) Mainly because it gave dubious advice which included that indexes should be rebuilt if:

  • Deleted entries represent 20% or more of current entries
  • The index depth is more than 4 levels

It then detailed a script that ran a Validate Structure across all indexes in the database that didn’t belong in either the SYS or SYSTEM schema.

This script basically read through and sequentially locked all tables (maybe multiple times) in the database in order to list indexes that might not actually need a rebuild while potentially missing out on some that do. I could write a script that achieved the same result with far less overheads. For example, SELECT index_name FROM DBA_INDEXES where index_name like ‘A%’ and owner not in (‘SYS’, ‘SYSTEM’) would achieve a very similar result :)

Thankfully, note 122008.1 was eventually removed from My Oracle Support (MOS) some time ago, interestingly soon after I discussed the ramifications of this script in my Oracle Index seminars :)

I recently stumbled upon another related note on MOS regarding index rebuilds, Index Rebuild, the Need vs the Implications (Doc ID 989093.1). Although not perfect (for example while it mentions ANALYZE INDEX VALIDATE STRUCTURE can now be performed online, doing so means that INDEX_STATS is not populated making it a little pointless in this context), it is a significant improvement on the previous note and certainly well worth a read for Oracle newbies. 

It also references a script to investigate a b-tree index structure (Doc ID 989186.1) that doesn’t rely on the Validate Structure of an index, making it a far less problematic to use, while also keeping a useful history of index characteristics. Also worth checking out.

Index Rebuild – Does it use the Index or the Table ? (Nothing Touches Me) May 15, 2012

Posted by Richard Foote in Index Rebuild, Oracle Indexes, Secondary Indexes.
10 comments

A common question that gets asked is does Oracle access the index itself or the parent table during an index rebuild to extract the necessary data for the index ? Thought it might be worth a blog post to discuss.

Now if the index is currently in an UNUSABLE state, then Oracle clearly can’t use the existing index during the index rebuild operation. So we’ll assume both table and index are hunky dory.

OK, to setup the first demo (using 11.2.0.1), we create and populate a table and index with the index being somewhat smaller than the parent table as is most common:

SQL> create table bowie (id number, code number, name1 varchar2(30), name2 varchar2(30), name3 varchar2(30), name4 varchar2(30), name5 varchar2(30), name6 varchar2(30), name7 varchar2(30), name8 varchar2(30), name9 varchar2(30), name10 varchar2(30));

Table created.

SQL> insert into bowie select rownum, mod(rownum, 100), 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE','DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_code_i on bowie(code);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', estimate_percent=>null, cascade=> true);

PL/SQL procedure successfully completed.

If we look at the corresponding size of table and index:

SQL> select table_name, blocks from dba_tables where table_name = 'BOWIE';

TABLE_NAME                         BLOCKS
------------------------------ ----------
BOWIE                               19277

SQL> select index_name, leaf_blocks from dba_indexes where index_name = 'BOWIE_CODE_I';

INDEX_NAME                     LEAF_BLOCKS
------------------------------ -----------
BOWIE_CODE_I                          1948

As is common, the table is somewhat larger than the corresponding index.

Now in my first demo, I’m just going to perform a normal offline Index Rebuild. I’ll however trace the session to see what might be happening behind the scenes (the good old alter session set events ’10046 trace name context forever, level 12′; still does the job). I’ll also flush the buffer cache as well to ensure the trace file shows me which blocks from which object get accessed.

SQL> alter system flush buffer_cache;

System altered.

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> alter index bowie_code_i rebuild;

Index altered.

SQL> alter session set events '10046 trace name context off';

Session altered.

There’s lots of information of interest in the resultant trace file, well, for someone with an unhealthy interest in Oracle indexes anyways :) However, the portion that’s of direct interest in this discussion is to see which object Oracle accesses in order to read the necessary data for the index rebuild. The trace file will contain a relatively extensive section with the following wait events (the following is just a short sample):

WAIT #6: nam=’db file scattered read’ ela= 933 file#=4 block#=79339 blocks=5 obj#=75737 tim=20402099526
WAIT #6: nam=’db file scattered read’ ela= 1016 file#=4 block#=79344 blocks=8 obj#=75737 tim=20402102334
WAIT #6: nam=’db file scattered read’ ela= 978 file#=4 block#=79353 blocks=7 obj#=75737 tim=20402106904
WAIT #6: nam=’db file scattered read’ ela= 9519 file#=4 block#=80000 blocks=8 obj#=75737 tim=20402119605
WAIT #6: nam=’db file scattered read’ ela= 2800 file#=4 block#=80009 blocks=7 obj#=75737 tim=20402131869

….

If we query the database for the identity of object 75737:

SQL> select object_name from dba_objects where object_id = 75737;

OBJECT_NAME
-----------------------

BOWIE_CODE_I

We can see that Oracle has accessed the data from the Index itself, using multi-block reads. As the index is the smallest segment that contains the necessary data, Oracle can very efficiently read all the required data (the expensive bit) from the index itself, perform a sort of all the data (as a multi-block read will not return the data in a sorted format) and complete the rebuild process relatively quickly. Note the table is locked throughout the entire index rebuild operation preventing DML operations on the table/index and so for an offline index rebuild, Oracle can access the Index segment without complication.

I’m going to repeat the same process but this time perform an Online index rebuild operation:

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> alter index bowie_code_i rebuild online;

Index altered.

SQL> alter session set events '10046 trace name context off';

Session altered.

We notice this time there are many more wait events than previously and that another object is referenced:

WAIT #5: nam=’db file scattered read’ ela= 8259 file#=4 block#=5635 blocks=5 obj#=75736 tim=4520179453
WAIT #5: nam=’db file scattered read’ ela= 1656 file#=4 block#=5640 blocks=8 obj#=75736 tim=4520181368
WAIT #5: nam=’db file scattered read’ ela= 891 file#=4 block#=5649 blocks=7 obj#=75736 tim=4520182459
WAIT #5: nam=’db file scattered read’ ela= 886 file#=4 block#=5656 blocks=8 obj#=75736 tim=4520183544
WAIT #5: nam=’db file scattered read’ ela= 827 file#=4 block#=5665 blocks=7 obj#=75736 tim=4520184579

SQL> select object_name from dba_objects where object_id = 75736;

OBJECT_NAME
-------------------------

BOWIE

This time, the much larger BOWIE parent table has been accessed. So with an Online rebuild, Oracle is forced to use the parent table to access the data for the rebuild operation due to the concurrency issues associated with changes being permitted to the underlying table/index during the rebuild process. So although an online index rebuild has availability advantages, it comes at the cost of having to access the parent table which can result in much additional I/O operations. So if you don’t have availability concerns, an offline index rebuild is probably going to be the more efficient option.

In fact, Oracle can be quite clever in deciding which object to access with an offline rebuild …

In this next example, I’m going to create another table/index, only this time the index is somewhat larger than the parent table. This scenario is less common but certainly possible depending on circumstances:

SQL> create table bowie2 (id number, code number, name varchar2(30));

Table created.

SQL> insert into bowie2 select rownum, mod(rownum,100), 'DAVID BOWIE' from dual connect by level<= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie2_code_i on bowie2(code) pctfree 90;

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE2', estimate_percent=>null, cascade=> true);

PL/SQL procedure successfully completed.

SQL> select table_name, blocks from dba_tables where table_name = 'BOWIE2';

TABLE_NAME                         BLOCKS
------------------------------ ----------
BOWIE2                               3520

SQL> select index_name, leaf_blocks from dba_indexes where index_name = 'BOWIE2_CODE_I';

INDEX_NAME                     LEAF_BLOCKS
------------------------------ -----------
BOWIE2_CODE_I                        21726

So the index is indeed much larger than the table. Which object will Oracle access now during an offline rebuild ?

SQL> alter system flush buffer_cache;

System altered.

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> alter index bowie2_code_i rebuild;

Index altered.

SQL> alter session set events '10046 trace name context off';

Session altered.

A look at the trace file reveals:

WAIT #15: nam=’db file scattered read’ ela= 2278 file#=4 block#=81723 blocks=5 obj#=75744 tim=8570990574
WAIT #15: nam=’db file scattered read’ ela= 2733 file#=4 block#=81728 blocks=8 obj#=75744 tim=8570994765
WAIT #15: nam=’db file scattered read’ ela= 2398 file#=4 block#=81737 blocks=7 obj#=75744 tim=8570999057
WAIT #15: nam=’db file scattered read’ ela= 2661 file#=4 block#=81744 blocks=8 obj#=75744 tim=8571003369
WAIT #15: nam=’db file scattered read’ ela= 1918 file#=4 block#=81753 blocks=7 obj#=75744 tim=8571006709

SQL> select object_name from dba_objects where object_id = 75744;

OBJECT_NAME
----------------------------

BOWIE2

In this case, the smaller table segment is accessed. So during an offline rebuild, Oracle will access either the table or index, depending on which one is smaller and cheaper to read.

What if we now create another index that also contains the CODE column which is smaller than both the table and the existing index.

SQL> create index bowie2_code_id_i on bowie2(code, id);

Index created.

SQL> select index_name, leaf_blocks from dba_indexes where index_name = 'BOWIE2_CODE_ID_I';

INDEX_NAME                     LEAF_BLOCKS
------------------------------ -----------
BOWIE2_CODE_ID_I                      2642

SQL> alter system flush buffer_cache;

System altered.

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> alter index bowie2_code_i rebuild;

Index altered.

SQL> alter session set events '10046 trace name context off';

Session altered.

A look at the trace file reveals:

WAIT #6: nam=’db file scattered read’ ela= 2070 file#=4 block#=85179 blocks=5 obj#=75747 tim=8925949081
WAIT #6: nam=’db file scattered read’ ela= 2864 file#=4 block#=85184 blocks=8 obj#=75747 tim=8925957161
WAIT #6: nam=’db file scattered read’ ela= 2605 file#=4 block#=85193 blocks=7 obj#=75747 tim=8925969901
WAIT #6: nam=’db file scattered read’ ela= 10636 file#=4 block#=85536 blocks=8 obj#=75747 tim=8925989726
WAIT #6: nam=’db file scattered read’ ela= 2188 file#=4 block#=85545 blocks=7 obj#=75747 tim=8925996890

SQL> select object_name from dba_objects where object_id = 75747;

OBJECT_NAME
------------------------------

BOWIE2_CODE_ID_I

In this case, the smaller alterative index is actually accessed. So it might not be the table or the index being rebuilt that gets accessed, but the smallest segment that contains the data of interest which in this case is another index entirely.

My final little demo brings me back to the subject of secondary indexes on Index Organized Tables (IOTs) I’ve been recently discussing. In this example, I create an IOT and a much smaller secondary index:

SQL> create table bowie3 (id number constraint bowie_pk primary key, code number, name1 varchar2(30), name2 varchar2(30), name3 varchar2(30), name4 varchar2(30), name5 varchar2 (30), name6 varchar2(30), name7 varchar2(30), name8 varchar2(30), name9 varchar2(30), name10 varchar2(30)) organization index;

Table created.

SQL> insert into bowie3 select rownum, mod(rownum, 100), 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE', 'DAVID BOWIE','DAVID BOWIE','DAVID BOWIE', 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie3_code_i on bowie3(code);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE3', estimate_percent=>null, cascade=> true);

PL/SQL procedure successfully completed.

SQL> select index_name, leaf_blocks from dba_indexes where table_name = 'BOWIE3';

INDEX_NAME                     LEAF_BLOCKS
------------------------------ -----------
BOWIE_PK                             16950
BOWIE3_CODE_I                         2782

So the secondary index is much smaller. However, if I rebuild it offline:

SQL> alter system flush buffer_cache;

System altered.

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> alter index bowie3_code_i rebuild;

Index altered.

SQL> alter session set events '10046 trace name context off';

Session altered.

A look at the trace file reveals:

WAIT #5: nam=’db file scattered read’ ela= 13019 file#=4 block#=217856 blocks=4 obj#=75733 tim=8949436015
WAIT #5: nam=’db file scattered read’ ela= 1869 file#=4 block#=72915 blocks=5 obj#=75733 tim=8949438360
WAIT #5: nam=’db file scattered read’ ela= 3023 file#=4 block#=72920 blocks=8 obj#=75733 tim=8949442877
WAIT #5: nam=’db file scattered read’ ela= 2381 file#=4 block#=72929 blocks=7 obj#=75733 tim=8949448410
WAIT #5: nam=’db file scattered read’ ela= 2613 file#=4 block#=72936 blocks=8 obj#=75733 tim=8949453521

SQL> select object_name from dba_objects where object_id = 75733;

OBJECT_NAME
---------------------------

BOWIE_PK

In this case, we see that the much larger IOT PK segment is accessed and not the smaller secondary index. When rebuilding the secondary index of an IOT, Oracle has no choice but to access the parent IOT PK segment itself as of course the secondary index doesn’t contain all the necessary information required for the index rebuild operation. The physical guess component within the secondary index might be stale and the only way for Oracle to determine the correct current address of all the rows is to access the IOT PK segment. This is another disadvantage of secondary indexes associated with IOTs, even offline index rebuilds must access the potentially much larger IOT PK segment in order to ensure the correctness of the physical guess components of the logical rowids.

So the general answer of whether an index rebuild accesses the table or index is that it depends and that it could very well be neither of them …

Rebuilding Indexes and the Clustering Factor Solution (Move On) September 25, 2011

Posted by Richard Foote in Clustering Factor, Index Rebuild, Indexing Myth, Oracle Indexes, Quiz, Reverse Key Indexes.
1 comment so far

Excellent !! This quiz created quite a bit of debate and it was nice to sit back and read some interesting discussions.

The Clustering Factor is basically the measurement of how well aligned the data in the underlining table is in relation to the index and is the number of table related I/Os required to read the entire table via a full index scan. A value of the CF approaching the number of blocks in the table suggests the data is reasonably well sorted/clustered in relation to the index (although the CF could in theory be somewhat less than the blocks in the table of course). A value of the CF approaching the number of index entries suggests the data is not particularly well sorted/clustered in relation to the index and means we may need to re-visit the same table block numerous times to get the required data, thus decreasing the efficiency of using the index, increasing the costs associated with using the index and therefore decreasing the likelihood of the CBO using the index.

So for an index rebuild to actual have an impact on the CF on an index, means either the rows in the table needs to change or the order of the index entries needs to change.

However, when we typically rebuild an index, it has no impact at all on the table and so can’t possibly change the order of the rows there. Additionally, no matter how fragmented or inefficient the index structure might be, an index rebuild doesn’t change the order of the index entries either as they’re always sorted within the index in the order of the indexed columns.

Therefore an index rebuild typically has no impact at all on the CF of an index, no matter the current value of the CF.

However, there is an exception to this rule.

If we rebuild the index and change it from a NOREVERSE index to a REVERSE index, we now do change the order of the index. Significantly so, as the index entries are now in the order of the index column values when reversed. Therefore this can in turn significantly change the CF of an index.

Conversely, if we rebuild an index and change it from REVERSE to NOREVERSE, we likewise significantly change the order of the index entries and hence the value of the CF.

For a nice little demo, see David Aldridge’s comment or my previous discussion on Reverse Key Indexes.

Of course, it’s always nice to see new ideas and something I hadn’t considered was Gary Myer’s comment regarding changing the logic behind a Function-Based Index prior to a rebuild …

So the moral of this story is that no matter how poorly fragmented the index, how high or low the current CF of an index might be, rebuilding an index in order to improve the CF is a futile exercise and will change less than diddly-squat, except in the above mentioned special circumstances.

Now, back to another hearing of Pink Floyd’s masterpiece “The Dark Side of the Moon” in all its surround sound glory :)

Rebuilding Indexes and the Clustering Factor Quiz (One Of The Few) September 20, 2011

Posted by Richard Foote in Clustering Factor, Index Rebuild, Oracle Indexes, Quiz.
35 comments

Today’s question has been prompted by various recent comments regarding the Clustering Factor (CF) of an index and how to change the CF requires a reorg of the underlining table.

It used to be quite a common myth that if the CF of an index was greater that “X” or based on some nonsensical formula the CF was greater than “Y”, then rebuilding the index somehow made everything better again. I believe it’s now much more commonly accepted that rebuilding an index does not change the CF of an index in any way. By rebuilding an index, the actual order of the index entries remains unchanged as does the order of the rows within the table and so the resultant CF can’t possibly change.

Pick any index, no matter how bad the CF or badly fragmented the index or table, take fresh statistics and after rebuilding the index (and fresh statistics if compute statistics on the index isn’t performed), the CF will remain the same.

However, there are nearly always exceptions …

Give an example of when rebuilding an index will significantly change the actual CF of an index, assuming 100% accurate statistics are gathered before/after the index rebuild on a table with no DML.

There are actually two such examples that spring to mind :)

An Index Only Performs How Much Work ??? November 12, 2009

Posted by Richard Foote in Index Rebuild, Oracle Indexes, Oracle Myths.
34 comments

One of the main reasons suggested for performing periodic index rebuilds is to improve “performance”. After rebuilding indexes, applications now run so much faster. Users notice a significant improvement in performance. And so on.
 
There are of course situations when rebuilding specific indexes can indeed improve performance, here’s but one example I’ve discussed previously.
 
However, the question I always ask when someone claims an index rebuild has made things run faster is to simply ask why. Why is such a simple, but powerful question. Why have things improved ? What has specifically changed as a result of rebuilding the index, that Oracle has now reduced the overall work associated with performing the activity, to the point that things run noticeably faster.
 
Knowing why is really important because it confirms that indeed there was an improvement and that it was indeed associated directly with the index rebuild. It means when a similar situation arises again, you know how to directly resolve the problem appropriately and effectively next time. Also knowing why means you can determine the specific root cause, perhaps preventing things from deteriorating so bad in the first place, such that rebuilding the index becomes unnecessary. Prevention being the best cure …
 
Now the most common answer I get for why rebuilding an index has been so beneficial is because the index is now so much smaller after the rebuild that the overheads of reading the index have substantially reduced. If the index is smaller, it means one can read the same amount of index related data with less I/Os. Less I/Os means better performance.
 
For example, if you can reduce the index by half, you can fit the index into only half the number of leaf blocks and that’s a 50% performance improvement right there.
 
Well, firstly it assumes that the index has indeed reduced by half. It would actually be a relatively unusual index for it to be so poorly fragmented or for it to have so much “wasted” space that it could be rebuilt into only half the number of leaf blocks.
 
Possible but somewhat unusual.
 
However, it also assumes that by having a 50% performance improvement, reading the index blocks constitutes the vast majority of the work. Again possible in some scenarios.
 
With most index related activities though, reading the index blocks actually constitutes only a small percentage of the overall work. In most cases, the index only contributes a very small percentage of the overall I/O activity. Therefore by potentially only reducing a small percentage of the overall work by rebuilding just the index, the overall impact is generally going to be minimal.
 
I thought I might perform some basic mathematics to illustrate and put into perspective just what little impact index rebuilding can have in improving performance, even if by doing so the index dramatically reduces in size, because the index itself actually constitutes only a small percentage of the overall costs.
 
Let say we have one of these more unusual indexes that is so poorly fragmented that it has approximately 50% wasted space throughout the entire index structure. Let’s say rebuilding such an index reduces the overall size of the index by half.
 

Before the index rebuild, an index has 50% of wasted space and say:
 
Height of 3
 
1 Root Block
 
50 Intermediate Branch Blocks
 
20,000 Leaf blocks
 

After the rebuild, the index has no wasted space and has now:
 
Height of 3
 
1 Root Block
 
25 Intermediate Branch Blocks
 
10,000 Leaf Blocks
 

Let’s assume the table contains 2M rows and that they fit in 100,000 blocks (i.e. the index is about 1/10 that of the table and the average row size is such that we fit say 20 rows on average per block). Let’s also assume there’s nothing special about this index and that it has an “average” clustering factor of 1M, before and after the rebuild ;) 1M being somewhere in the middle of possible clustering factor values.

The first thing to note is that the height remains the same after such a rebuild, even though the index is only now half the size. It would be extremely unlikely and the index would have to be particularly small and within a very narrow range of sizes for all the contents of all the intermediate branch blocks to fit within just the one root block. The only way for the index height to reduce down from 3 would be for the contents of all intermediate branches to fit within the root block. Possible, but again quite unusual. 

OK, let’s look at the cost of various scans before and after the rebuild, using the index costing formula I’ve discussed recently.
 
If we’re after just one row (a unique index perhaps), then to read the one row before the rebuild would be:
 
1 I/O for the root block + 1 I/O for a branch block + 1 I/O for a leaf block + 1 I/O for the table block = 4 LIOs.
 
After the rebuild, the total cost would be:
 
1 I/O for the root block + 1 I/O for a branch block + 1 I/O for a leaf block + 1 I/O for the table block = 4 LIOs.
 
In effect, no change. Well, we’re not likely to have too much of a performance improvement there.

 
 
Let’s increase the size of the range scan. What if we retrieve 100 rows (or approx. 0.005% of data):
 
Before the rebuild it would be
 
1 I/O for the root block +
1 I/O for a branch block +
1 for all the leaf blocks (0.00005 x 20000) +
50 for all the table blocks (0.00005 x 1M)
= 53.
 
After the rebuild it would be:
 
1 I/O for the root block +
1 I/O for a branch block +
1 for all the leaf blocks (0.00005 x 10000) +
50 for all the table blocks (0.00005 x 1M)
= 53.
 
Again, no difference …

 
 
OK, let’s increase the number of rows accessed substantially to 10,000 (or approx. 0.5% of the data).
 
Before the rebuild it would be:
 
1 I/O for the root block +
1 I/O for a branch block +
100 for all the leaf blocks (0.005 x 20000) +
5000 for all the table blocks (0.005 x 1M)
= 5102.
 

After the rebuild it would be:
 
1 I/O for the root block +
1 I/O for a branch block +
50 for all the leaf blocks (0.005 x 10000) +
5000 for all the table blocks (0.005 x 1M)
= 5052.
 
Or in percentage terms, a reduction of I/Os of approximately 1%. That’s just 1 tiny little percent …
 
So even an index that accesses 10,000 rows, a reasonable number and at 0.5% a reasonable percentage of the overall table, even an index that has reduced in size by half, a substantial reduction in size, only reduces the overall number of I/Os by an unimpressive 1% for such a query in the above scneario.
 
Would reducing I/Os on such a query by 1% really improve performance “substantially” ? Will users really notice much of a difference ?
 
It’s of course all in the numbers and in how much work the actual index is performing, in how many I/Os are actually performed by the index itself and in how much of a reduction in the overall I/Os an index rebuild will actually contribute. I’ll leave it to you to plug in different ranges of selectivity to see what impact rebuilding such an index with the above characteristics might have.
 
The point is that for the vast majority of index related queries, most of the work is performed in getting the data out of the table, not the index.
 
Therefore, reducing the size of an index, even by possibly a substantial amount, may not necessarily reduce the overall I/Os associated with a query if the index only performs a tiny fraction of all the work. You could eliminate the index entirely and providing Oracle could still magically retrieve just the 0.5% of rows of interest in the above example, performance for such a query would unlikely improve “significantly”.
 
So not only must an index reduce in size but the activities associated with directly reading the index must constitute a significant proportion of the overall work (eg. fast full index scan, index only scans, etc.) or something else must have changed for performance to have improved “significantly”.

Rebuilding Indexes Every Sunday Afternoon !! October 28, 2009

Posted by Richard Foote in Index Rebuild, Oracle Indexes, Richard's Musings.
88 comments

I just had to share this amusing article on “Scheduling Oracle Index Rebuilding“.

Regular readers of this blog can no doubt pick the various inaccuracies and strawman arguments with this article. It’s was nice however to reminisce and take a stroll down memory lane back to the 80′s and 90′s when many DBAs indeed did spend every Sunday afternoon rebuilding indexes and the such during so-called maintainance windows.

However, if you’re like me and now work on sites where there is no such Sunday maintainance window because your users actually require and demand 24 x 7 access to their applications, because organisations still want to sell their products and services online during Sunday afternoons, because governments still want border control applications functioning on Sunday afternoons, because consumers still want access to their bank savings on Sunday afternoons, because police still need to access critical information about possible criminal activities on Sunday afternoons, because airlines still want to fly aircraft on Sunday afternoons, etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. etc. ……..

then perhaps the article may not be that useful to you afterall.

Of course, maybe you do have a maintainance window of some description but then have that other really annoying problem associated with modern databases, of them being big, really really big and so more difficult to maintain. Back in the 80′s and 90′s, databases were relatively small by todays standards and it was conceivable for indexes to be rebuilt with little or no thought. There was generally little point, but at least you could get away with it. However, if you’re like me and you have databases with indexes that total in the many terabytes, it just isn’t feasible or practical to rebuild all such indexes, even if you wanted to.

No. If you do have a maintainance window, you may want to make sure the valuable time is spent well, on activities that actually make a difference.

Of course you may also look after databases like I do where the percentage of indexes that actually benefit from a rebuild is not around 30% as suggested by the article but significantly below 1%. Therefore rebuilding 99+% of indexes for no practical gain or purpose may not be considered a “wise” use of maintainance opportunities.

One thing this article did make me ponder though is the number of database sites out there where the DBAs demand and force their businesses and associated end-users into having forced downtimes, into having their applications and business processes impacted and made unavailable (say) every Sunday morning, not because it’s actually necessary or because there’s a business or technical benefit but just because their DBAs say or think they still need to follow the processes and maintainance activities of the 80′s and 90′s. Interesting question that …

One last point I would make with this article (which BTW will change and be modified in the future, I can guarantee that, just save a copy and see). Why would you bother with validating the structure of all indexes if you plan to rebuild them all anyways ? And who is the more inept ? Someone in the Netherlands who attempts to run a script that attempts to validate structure of all indexes or someone who writes a script to validate structure indexes with an ALTER INDEX command ;)

Personally, I spend every Sunday afternoon relaxing and enjoying the weekend, playing sport, perhaps doing a bit of gardening and on a lovely sunny day, just sitting back with family and friends enjoying a barbecue and a few cold beers.

Meanwhile, all my database applications just run on happily along with no impact on business activities at all …

Larger Block Tablespace For Indexes Revisited Part III (Prove Yourself) March 2, 2009

Posted by Richard Foote in Index Block Size, Index Height, Index Internals, Index Rebuild, Oracle Indexes, Oracle Myths.
10 comments

Time to look a little at an Index Fast Full Scan (IFFS) with respect to moving indexes into a larger block tablespace.

To start, a few points about an IFFS. When performing an IFFS, Oracle will read each and every block in the index structure, including all branch and leaf blocks (up to the HWM), using multi-block read operations. So although an IFFS has its place, it’s a relatively expensive operation, especially if the index is large, as it needs to read all blocks in the current index structure. Therefore, it’s not generally considered a “common” operation, as in traditional index range scans, especially in OLTP environments. Effectively, an IFFS is the index equivalent of a Full Table Scan (FTS) and is performed in a similar fashion. During an IFFS, Oracle basically treats the index as if it were an smaller version of the table from which it can extract the necessary data.

So if an IFFS were indeed to be very common and could be performed so much more effectively in a larger block tablespace, this begs the question why not move tables in a larger block tablespace as well, so FTS can have the same benefits as the index. And if you move both tables and indexes into a larger block tablespace, why not create the database in the larger block and have done with it? Which brings us around to the question of selecting an appropriate block size for the database, which is a different discussion to the so-called benefits of just moving indexes into a larger block tablespace. For another day perhaps …

Now, although Robin’s demo clearly shows consistent gets are basically halved when an index is rebuilt with double the block size (which all sounds very impressive), what it doesn’t highlight however is that during an IFFS, approximately the same amount of data is still actually being read. Although there are certainly some efficiencies in having fewer logical reads, if each consistent gets is more expensive while the overall data being read is similar, you’re simply not going to get an enormous performance boost that simply halving consistent gets might suggest. Claims that things will suddenly run twice as fast (or 120 times as fast) purely by moving indexes into a larger block tablespace are shall we say “exaggerated” to say the least. However, as all index blocks within an index are being read with no subsequent table accesses, an IFFS is going to be as good as it gets when we talk about any possible performance improvements of moving indexes into a larger block tablespace.

Again, it’s all very simple to “give it a go” and see for yourself exactly what improvements one might expect. Move an index into a larger block tablespace and see those IFFS fly. Using exactly the same setup from Robin’s demo as in Parts I and II, let’s see the actual differences …

First, get some current session stats:

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’ or n.name = ‘index fast full scans (full)’);

NAME                             VALUE
---------------------------- ---------
CPU used by this session         35856
consistent gets               10685006
physical reads                    2582
index fast full scans (full)     33003

Now execute 1000 IFFS using the index in an 8K block tablespace …
SQL> declare
  2  v_count number;
  3  begin
  4    for i in 1..1000 loop
  5       select /*+ index_ffs(bowie) */ count(*) into v_count from bowie;
  6    end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:11.23

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’ or n.name = ‘index fast full scans (full)’);

NAME                             VALUE
---------------------------- ---------
CPU used by this session         36952
consistent gets               11106006
physical reads                    2582
index fast full scans (full)     34003

 
Note that response times are roughly 11.23 secs, mostly consisting of CPU at 10.96 secs. Note also consistent gets are 1000 x 421 as one would expect from Robin’s demo. Now lets see how things differ when we move the index into a 16K block tablespace.

 
SQL> alter index bowie_idx rebuild tablespace ts_16k;

Index altered.

 
SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’ or n.name = ‘index fast full scans (full)’);

NAME                             VALUE
---------------------------- ---------
CPU used by this session         43192
consistent gets               12372513
physical reads                    2789
index fast full scans (full)     40003

SQL> declare
  2  v_count number;
  3  begin
  4    for i in 1..1000 loop
  5       select /*+ index_ffs(bowie) */ count(*) into v_count from bowie;
  6    end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:11.14

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’ or n.name = ‘index fast full scans (full)’);

NAME                               VALUE
----------------------------- ----------
CPU used by this session           44260
consistent gets                 12583513
physical reads                      2789
index fast full scans (full)       41003

 
Note there’s only a slight improvement in response times and CPU usage of a couple of percentage points, even though the number of consistent gets has halved. Now your milage may vary a tad but what you will not see is things suddenly using 1/2 the resources or running twice as fast (or running 120 times as fast). Because at the end of the day, both sets of IFFS are reading and processing approximately the same amount of data.

Note also that any such possible minor improvments in IFFS performance is likely going to be cancelled out or more likely overtaken by the increased costs associated with more typical index range scans due to the index not actually reducing in height as discussed in Part II. Again, what’s more common, smaller index range scans or IFFS ?

Robin’s demo and the demo above however assumes the index is fully cached, with no physical I/O (PIO) operations being performed. Now as an IFFS needs to read the entire index, it’s quite likely that an IFFS would require some physical I/O, especially if the index is large. Surely then, if an index is in a larger block tablespace, such physical I/Os would be more efficient because Oracle would be able to read more data with a multi-block read based on a larger block  size?

The answer is unfortunately no.  Another (rather poor) analogy to get the point across.

You go to the bank to take some money out. You’re only allowed to take out a maximum of $1000 per day from your account. You ask for this in $50 notes and you get 20 x $50 = $1000 in total. Your mate wants to take out more money and asks for the money in $100 notes. However, as the limit is set to $1000 a day, regardless of the denomination of the bank notes, he simply gets 10 x $100 = $1000. You both end up with exactly the same amount of money, it’s just that you have more bank notes.

An Oracle multi-block read works in exactly the same manner. There’s a maximum amount of data you can read which is calculated by the default block size x db_file_multiblock_read_count. If the default block size is 8K and the db_file_multiblock_read_count = 8, then the maximum size of a multiblock read is 8K x 8 = 64K.

If you attempt to perform a multi-block read using a non-default block size, Oracle simply divides the 64K by the default block size to determine how many blocks to read. So if you now attempt to perform a multi-block read from a 16K tablespace, Oracle will only read 16K x 4 blocks = 64K. The I/O size is the same regardless of the blocksize.

And this makes perfect sense. If you’ve tuned the size of a multi-block read perfectly with the default block size in mind, why would you want to (say) suddenly double this perfect size when you double the block size in another tablespace. If doubling a multi-block read size were to suddenly improved things, why not also double the multi-block read size for the default block size as well …

Again, it’s very easy and simple to test how changing the block size in a specific tablespace makes no difference to the behaviour of an associated multi-block read.

First flush the buffer cache and trace a session, while performing a multi-block read via an IFFS , first with a default 8K block size index:

SQL> show parameter db_file_m

NAME                             TYPE VALUE
----------------------------- ------- -----
db_file_multiblock_read_count integer     8

SQL> alter system flush buffer_cache;

System altered.

SQL> alter session set events ’10046 trace name context forever, level 12′;

Session altered.

SQL> select /*+ index_ffs(bowie) */ count(*) from bowie;
 
The trace file will show that blocks=8 are being read during the IFFS multiblock read operation as the following extract from the trace file highlights:

 
WAIT #2: nam=’db file scattered read’ ela= 22897 file#=7 block#=38929 blocks=8 obj#=95742 tim=1217521768059
WAIT #2: nam=’db file scattered read’ ela= 47158 file#=7 block#=38937 blocks=8 obj#=95742 tim=1217521816003
WAIT #2: nam=’db file scattered read’ ela= 37776 file#=7 block#=38945 blocks=8 obj#=95742 tim=1217521854530

However, when you trace a session as it performs the IFFS with a 16K block

WAIT #1: nam=’db file scattered read’ ela= 55181 file#=6 block#=1030 blocks=4obj#=95742 tim=867071895617
WAIT #1: nam=’db file scattered read’ ela= 68932 file#=6 block#=1034 blocks=4obj#=95742 tim=867071965238
WAIT #1: nam=’db file scattered read’ ela= 67136 file#=6 block#=1038 blocks=4obj#=95742 tim=867072142114
 

We notice that Oracle is now only reading 4 blocks at a time, ensuring that the same 64K is read each time. So increasing the blocksize of an index doesn’t suddenly impact the manner or efficiency in which multi-block PIOs are performed.

Next, we’ll perform a similar test as before, comparing the difference between the 8K and 16K block index during an IFFS but this time with PIOs introduced. There are various ways one could do this. Make the buffer cache too small to fit an index or make the index too big for the buffer cache. Keeping with Robin’s demo, I’ll use exactly the same index definitions but this time run the following procedure in another session that constantly flushes the buffer caches. So in one session:

SQL> begin
  2  for i in 1..10000 loop
  3    execute immediate (‘alter system flush buffer_cache’);
  4  end loop;
  5  end;
  6  /

PL/SQL procedure successfully completed.

 
Meanwhile in another session, run a number of IFFS (200 this time) with an 8K block size index:

 
SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’ or n.name = ‘index fast full scans (full)’);

NAME                             VALUE
---------------------------- ---------
CPU used by this session          4851
consistent gets                1536024
physical reads                  423992
index fast full scans (full)      3205

 
SQL> declare
  2  v_count number;
  3  begin
  4    for i in 1..200 loop
  5     select /*+ index_ffs(bowie) */ count(*) into v_count from bowie;
  6   end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:35.92

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’ or n.name = ‘index fast full scans (full)’);

NAME                             VALUE
---------------------------- ---------
CPU used by this session          5248
consistent gets                1620224
physical reads                  507651
index fast full scans (full)      3405

 

Note the response time is 35.92 secs and that we used approximately 3.97 secs. This time we have performed a how bunch of PIO operations.

Same thing, this time with the 16K index:

 
SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’ or n.name = ‘index fast full scans (full)’);

NAME                             VALUE
---------------------------- ---------
CPU used by this session          6695
consistent gets                1789531
physical reads                  633216
index fast full scans (full)      4205

 
SQL> declare
  2  v_count number;
  3  begin
  4    for i in 1..200 loop
  5     select /*+ index_ffs(bowie) */ count(*) into v_count from bowie;
  6   end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:36.59

 
SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’ or n.name = ‘index fast full scans (full)’);

NAME                             VALUE
---------------------------- ---------
CPU used by this session          7118
consistent gets                1831731
physical reads                  674861
index fast full scans (full)      4405

 
We now note that the 16K block index has the slightly slower response time at 36.59 secs and uses a tad more CPU at 4.23 secs. As soon as we introduce PIOs, the PIOs contribute significantly to the overall costs and may well result in the larger block index being more costly depending on how the PIO is performed. So even in Robin’s example, depending on the nature of PIOs, even an IFFS can potentially run slower with an index in a larger block tablespace. Remembering of course that larger indexes are less likely to be fully cached and are more likely to incur PIOs during an IFFS.

Here’s a demo I’ve used before that uses a different approach to highlight the same issue (which doesn’t rely on a different session “interfering” with things). Another excellent example is this one by Greg Rahn on this OTN thread where he shows an IFFS performing a tad slower in a larger block tablespace. As I’ve been saying, just give it a go and see for yourself.

Perhaps, simply halving the consistent gets when doubling the index block size doesn’t tell the whole story …

The next time you see someone reference Robin’s demo as some sort of “proof” that by moving indexes into a larger block tablespace and (perhaps) reducing the number of consistent reads, the index will somehow be flatter and/or more efficient, just remember that the index may not necessarily be flatter, it might actually have the same height and that subsequent index operations, including index range scans and IFFS could very well be more expensive, not less expensive, to perform.

The next time you see someone claim that by moving indexes into a larger block tablespace, performance has suddenly improved by 100% or that things suddenly run 120 times faster, just ask one simple question. Why ?Ask what else has changed, what else might be contributing to the incredible performance improvements, because when you actually test things out for yourself, you’ll noticed results are not anywhere near as “impressive”.

Larger Block Tablespace For Indexes Revisted: Part II (Money) February 23, 2009

Posted by Richard Foote in Index Block Size, Index Height, Index Rebuild, Oracle Indexes, Oracle Myths.
7 comments

In Part I I looked at Robin Schumacher’s “classic” example of the “so-called” benefits of rebuilding an index into a larger block tablespace. I started by highlighting that by simply rebuilding an index in a larger block tablespace and (say) halving the number of associated index blocks, it doesn’t necessarily mean the index will result in a “flatter structure” , that the index height will reduce. Robin’s demo is in fact a perfect example of this.

Let me start Part II by following this up with a little story …

“I went with a friend of mine to get some cash from the bank the other day in order to buy the latest David Bowie Box-Set. I got my cash in $50 notes but being an efficient, cost saving sort of bloke, my mate got out the same amount of cash in $100 notes. Although we both had the same amount of cash, his wallet was that little bit more compact and “efficient” than mine as he had less actual bank notes to carry.

However, when we got to the record store, we were surprised to discover that the actual “cost” of the David Bowie Box-Set was 2 bank notes of any domination, but with no change being given. Therefore, it cost me 2 x $50 notes to make my purchase. Unfortunately for my mate, it cost him 2 x $100 for the same thing as $100 notes were the smallest denomination he could use. Yes, I was a little bit mean not lending him some of my $50 notes but I was a little disappointed myself for not having any $5 notes on me at the time ;)

OK, it’s not a perfect analogy but you perhaps get the point …

If you have to pay with a quantity of bank notes and you only have larger bank notes, you end up paying more than you would if you could only have paid with the same number of smaller denomination bank notes.

If you have to pay for an index scan in database blocks and you’re forced to use larger index blocks, you actually end up paying more if you have to read the same number of index blocks anyways.

This is one of the potential dangers with rebuilding indexes in a larger block tablespace. And like I said, Robin’s little demo is a perfect example of this. You might indeed reduce and halve the number of index blocks but this might not be sufficient to actually flatten the index structure and reduce the actual height of the index. The height of the index after rebuilding the index can remain the same, so the minimum cost of performing a small range scan increases. Even if you reduce the index height, you can still end up paying more if the savings do not compensate you enough for the additional overhead associated with now having to read and process larger index blocks. 

Therefore, you potentially start paying more for smaller index range scans because you might not actually reduce the number of index blocks you visit for these types of index scans.

Taking exactly the same table/index definitions and data used to replicate Robin’s example in Part I, let’s see if there’s any difference in the costs associated with performing a number of small index range scans after rebuilding the index in a 16K block tablespace.

Again, it’s a simple case of just giving it a go and see for yourself. What resources are used if you perform a series of small index scans ? Do things really run faster ? Do we really use less resources ?  Having rebuilt such an index in a larger block tablespace and halved the number of associated leaf blocks, are we really better off ?

Let’s begin by setting up the same example as Robin’s demo as before …

SQL> create table bowie (id number not null, value varchar2(10));

Table created.

SQL> insert into bowie select rownum, ‘BOWIE’ from dual connect by level <=187200;

187200 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’BOWIE’, cascade=>true, estimate_percent=>null, method_opt=> ‘FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

SQL> select index_name, blevel from user_indexes where index_name = ‘BOWIE_IDX’;

INDEX_NAME BLEVEL
---------- ------
BOWIE_IDX       1

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select height, btree_space, used_space from index_stats;

HEIGHT BTREE_SPACE USED_SPACE
------ ----------- ----------
     2     3336032    2988168

 

Note the index height when built in an 8K block tablespace is 2 …

Let’s now capture the current amount of CPU used by the session:

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 134 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’);

NAME                          VALUE
------------------------ ----------
CPU used by this session      36555
consistent gets            22520369
physical reads                 3750

 

Now we run a series of small index range scans …

SQL> set timing on

SQL> declare
  2  v_id    number;
  3  v_value varchar2(10);
  4  begin
  5   for o in 1..10 loop
  6    for i in 1..187200 loop
  7    select id, value into v_id, v_value from bowie where id = i;
  8    end loop;
  9  end loop;
 10  end;
 11  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:28.42

 

Let’s see how our resource stats have changed …

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 134 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’);

NAME                          VALUE
------------------------ ----------
CPU used by this session      45346
consistent gets            28140519
physical reads                 3750

 

We note we have used approximately 87.91 CPU seconds. (Note: You can run this a number of times and determine an average figure).

Let’s now rebuild the index again in a 16K block tablespace:

SQL> alter index bowie_idx rebuild tablespace ts_16k;

Index altered.

SQL> select index_name, blevel from user_indexes where index_name = ‘BOWIE_IDX’;

INDEX_NAME BLEVEL
---------- ------
BOWIE_IDX       1

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select height, btree_space, used_space from index_stats;

    HEIGHT BTREE_SPACE USED_SPACE
---------- ----------- ----------
         2     3351776    2985662 

Note the index height remains at 2 …

If we run the same series of small index scans:

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 134 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’);

NAME                          VALUE
------------------------ ----------
CPU used by this session      45381
consistent gets            28142640
physical reads                 3957

SQL> declare
  2  v_id    number;
  3  v_value varchar2(10);
  4  begin
  5   for o in 1..10 loop
  6    for i in 1..187200 loop
  7    select id, value into v_id, v_value from bowie where id = i;
  8    end loop;
  9  end loop;
 10  end;
 11  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:42.44

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 134 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’);

NAME                          VALUE
------------------------ ----------
CPU used by this session      54484
consistent gets            33760690
physical reads                 3957

We note that elapsed times have increased and we have now increased our overall CPU consumption to 91.03 CPU seconds as well.

As we can see, there has been no advantage with rebuilding the in index in the 16K block tablespace for these smaller index scans. In fact, there’s actually been an increase in the overall elapsed times and an increase in the overall CPU. Performance has not improved but has in fact worsened overall for these queries after rebuilding the index in a larger block tablespace.

You begin to get the point …

And of course, indexes are generally far more typically to be used in small index range scan operations than they are in performing Index Fast Full Index Scans. Just compare the numbers of index fast full scans vs. index fetch by key operations in your databases if you want some indication. 

However, Robin’s specific example used a SQL statement that performed an Index Fast Full Scan. Surely, such operations would improve dramatically if indexes were only rebuilt in a larger block tablespace ? Surely such operations would be able to read data more quickly, especially if we have to go to disk, as we would be able to read more data with each associated multiblock read if the index were built in a larger block tablespace ? Surely it would be worth all the extra effort and management considerations ?

Things will run at least 2 times faster, maybe even 150 times faster, right  ;)

We’ll see how the results can be a tad “disappointing” in the next post …

Index Create and Rebuild Locking Improvements in 11g (Ch Ch Ch Changes) February 11, 2008

Posted by Richard Foote in 11g, Index Rebuild, Locking Issues, Oracle General, Oracle Indexes.
15 comments

Although the CREATE INDEX … ONLINE and ALTER INDEX … REBUILD ONLINE options have been available for a long while, they can still introduce locking issues in highly active databases.

Oracle requires a table lock on the index base table at the start of the CREATE or REBUILD process (to guarantee DD information) and a lock at the end of the process (to merge index changes made during the rebuild into the final index structure).

These locks have two implications. Firstly, if there’s an active transaction on the base table of the index being created or rebuilt at the time one of these locks is required, the indexing process will hang. This will of course impact the time it takes to complete the indexing process. However the second far more serious issue is that any other active transactions on the base table starting after the indexing process hangs will likewise be locked and be prevented from continuing, until the indexing process obtains and releases its locks. In highly concurrent environments with many transactions, this can cause serious disruptions to the response times of these impacted transactions. Of course, depending on the time the initial locking transactions take to commit or rollback, this backlog of locked transactions can be quite significant.

Oracle11g has made some improvements in the locking implications regarding creating or rebuilding indexes online.

During the creation or rebuilding of an index online, Oracle still requires two associated table locks on the base table at the start and end of indexing process. If there’s an active transaction on the base table at the time one of these locks is required, the indexing process will still hang as its done previously until all these prior active transactions have completed. No change so far.

However, if the indexing process has been locked out and subsequent transactions relating to the base table start afterwards, these transactions will no longer in turn be locked out by the indexing table locks and are able to complete successfully. The indexing process no longer impacts other concurrent transactions on the base table, it will be the only process potentially left hanging while waiting to acquire its associated lock resource.

This means it may not be quite so “risky” to urgently introduce that new index or rebuild that troublesome index during core business hours due to the reduced locking implications introduced in 11g.

See this demo for Index Rebuild Locking Issues in 10g and the 11g Improvements.

Does this means we can now simply rebuild all our indexes, whenever ? Ummmm, no ;)

Index Rebuild vs. Coalesce vs. Shrink Space (Pigs – 3 Different Ones) February 8, 2008

Posted by Richard Foote in Index Coalesce, Index Rebuild, Index Shrink, Oracle General, Oracle Indexes, Performance Tuning.
30 comments

Previously, I discussed how an ALTER INDEX  … COALESCE is going to be less expensive in terms of using resources than an equivalent ALTER INDEX … SHRINK SPACE COMPACT (or ALTER INDEX … SHRINK SPACE) as the Coalesce doesn’t have to concern itself with ensuring all leaf blocks at the physical end of the index segment have all been moved to allow for the storage to be de-allocated from the index segment. If you just want to de-fragment an index and not necessarily reduce the overall space allocated to the segment, use Coalesce rather than the Shrink options as it’s cheaper.

But what about an ALTER INDEX … REBUILD, when, if ever, should it be used ?

Well the answer is as with most things Oracle, it depends.

Scenario One.

We have a table and the application deletes historical data but in a manner in which leaf blocks are not being entirely emptied. Basically, older stuff is removed, but it’s only removed in a random manner, from approximately the “earlier” 10% of the table. The index is sequenced which means only those leaf blocks in the “left-most” 10% of the index structure are impacted but all this deleted space is “deadwood” as new index entries are only being inserted into the “right-most” part of the index.

Note that basically 90% of the index is fine and very well utilised, it’s only 10% of the index that’s problematic. Of the problem 10% of leaf blocks, there’s plenty of free or deleted space, with many leaf blocks almost but not quite empty.

Coalesce (and indeed Shrink) will basically run through these 10% of fragmented leaf blocks and will merge the index row entries into as few leaf blocks as possible. With the 90% of blocks that are fine, Coalesce will basically read and then ignore them from any processing as there’s nothing that can be done for them.

Rebuild on the other hand will take an entirely different approach. It will (generally) read the entire existing index structure and will build a brand new, bright and shining index segment.  As part of this process, it will rebuild the entire index, it has no choice (assuming the index isn’t partitioned, but that’s another story) and will rebuild the 90% of the index that was actually perfect to begin with. Rebuilding 90% of something that doesn’t need rebuilding doesn’t sound particularly efficient and indeed it isn’t. As a result, the index rebuild will use substantially more resources and generate substantially more redo than an equivalent Coalesce (or Shrink Space).

Scenario Two.

We have an application that deletes data and it deletes data throughout the entire index structure. The deletes are significant with a substantial proportion of the overall rows having been deleted. Additionally, the table is not going to be repopulated with anything like the same volume of data or it won’t be repopulated for a substantial period of time. As such, all this deleted index space is “deadwood” as it’s not going to be used any time soon, if at all.

Now typically in this sort of scenario, it’s of course the table as much as the associated indexes that needs to be rebuilt. That’s a key point. However, maybe Full Table Scans are not an issue for this table so the wasted space in the table is not of urgent concern. Maybe the table in not in an ASSM tablespace or in a database that supports a Table Shrink command and maybe moving the table is not an immediate option due to availability concerns. For whatever reason (or lack of reason), the index needs to be de-fragmented.

Note it’s the entire index that’s problematic here and there could be portions of the index that have very few remaining index entries.

Now poor Coalesce (and indeed Shrink) has a bit of an issue here. They both merge index entries from two blocks into the one block where it can. However, if leaf blocks are really empty, these merged index entries may in turn be merged and moved again with index entries from yet another leaf block. And maybe yet again with another leaf block. And again and again … So a specific index entry may actually be moved into several different leaf blocks during the entire process. Each of these moves requires resources and generates redo and takes time.

Now the rebuild has an entirely different approach. As mentioned, it will basically (generally) read the entire exisiting index structure and will build a brand new one, but importantly as it does so will only have to locate a specific index entry once and once only. Also, as it’s the entire index structure that’s problematic, there’s no issue with fixing the entire index, as it’s all “broken”.

As a result of only having to deal with an existing index entry the once vs. the Coalesce which may relocate a specific index entry many times, the index rebuild is going to be substantially more efficient and potentially use significantly less resources and generate less redo.

This demo of the Differences between a Coalesce, Shrink Space and Rebuild shows when one out performs the other.

Basically, Coalesce is particularly efficient and uses less resources when the percentage of the overall index structure that’s problematic and fragmented is relatively small (less than approximately 20-25% of leaf blocks). Rebuild is particularly efficient when the percentage of the overall index structure that’s problematic and fragmented is relatively large and the average degree of fragmentation within an index leaf block is relatively high.  Note Pre 10g, an index needed to have at least 50% free space less pctfree in neighbouring leaf blocks for a Coalesce to be effective.

Now Rebuild (and Rebuild Online) potentially have locking implications that need to be considered although as we’ll see later, 11g has addressed some of these issues …

Index Internals – Rebuilding The Truth December 11, 2007

Posted by Richard Foote in Index Coalesce, Index Height, Index Internals, Index Rebuild, Index Shrink, Index statistics, Oracle Indexes, Oracle Myths, Oracle Opinion, Richard's Musings.
7 comments

The issue of when to rebuild indexes crops up again and again and again. Unfortunately, so do the same incorrect, myth-filled uninspired responses which considering how important and “key” (no pun intended) indexes are to database design and performance generally, is very puzzling indeed.

In the coming days I’ll post why I believe these index related myths simply refuse to go away …

This presentation was originally written several years ago but is still as relevant today as it’s always been.

Recently updated version: Index Internals – Rebuilding The Truth

Follow

Get every new post delivered to your Inbox.

Join 1,703 other followers