
An Index Only Performs How Much Work ??? November 12, 2009

Posted by Richard Foote in Index Rebuild, Oracle Indexes, Oracle Myths.

One of the main reasons suggested for performing periodic index rebuilds is to improve “performance”. After rebuilding indexes, applications now run so much faster. Users notice a significant improvement in performance. And so on.
 
There are of course situations when rebuilding specific indexes can indeed improve performance, here’s but one example I’ve discussed previously.
 
However, the question I always ask when someone claims an index rebuild has made things run faster is to simply ask why. Why is such a simple, but powerful question. Why have things improved ? What has specifically changed as a result of rebuilding the index such that Oracle now performs less overall work for the activity, to the point that things run noticeably faster ?
 
Knowing why is really important because it confirms that there indeed was an improvement and that it was indeed associated directly with the index rebuild. It means when a similar situation arises again, you know how to resolve the problem appropriately and effectively next time. Knowing why also means you can determine the specific root cause, perhaps preventing things from deteriorating so badly in the first place, such that rebuilding the index becomes unnecessary. Prevention being the best cure …
 
Now the most common answer I get for why rebuilding an index has been so beneficial is that the index is now so much smaller after the rebuild that the overheads of reading the index have substantially reduced. If the index is smaller, it means one can read the same amount of index related data with fewer I/Os. Fewer I/Os means better performance.
 
For example, if you can reduce the index by half, you can fit the index into only half the number of leaf blocks and that’s a 50% performance improvement right there.
 
Well, firstly it assumes that the index has indeed reduced by half. It would actually be a relatively unusual index that is so poorly fragmented, or has so much “wasted” space, that it could be rebuilt into only half the number of leaf blocks.
 
Possible but somewhat unusual.
 
However, a claimed 50% performance improvement also assumes that reading the index blocks constitutes the vast majority of the work. Again possible in some scenarios.
 
With most index related activities though, reading the index blocks actually constitutes only a small percentage of the overall work. In most cases, the index only contributes a very small percentage of the overall I/O activity. Therefore by potentially only reducing a small percentage of the overall work by rebuilding just the index, the overall impact is generally going to be minimal.
 
I thought I might perform some basic mathematics to illustrate and put into perspective just how little impact an index rebuild can have on performance, even one that dramatically reduces the size of the index, when the index itself constitutes only a small percentage of the overall costs.
 
Let’s say we have one of these more unusual indexes that is so poorly fragmented it has approximately 50% wasted space throughout the entire index structure. Let’s say rebuilding such an index reduces the overall size of the index by half.
 

Before the index rebuild, an index has 50% of wasted space and say:
 
Height of 3
 
1 Root Block
 
50 Intermediate Branch Blocks
 
20,000 Leaf blocks
 

After the rebuild, the index has no wasted space and has now:
 
Height of 3
 
1 Root Block
 
25 Intermediate Branch Blocks
 
10,000 Leaf Blocks
 

Let’s assume the table contains 2M rows and that they fit in 100,000 blocks (i.e. the rebuilt index is about 1/10 the size of the table, and the average row size is such that we fit on average 20 rows per block). Let’s also assume there’s nothing special about this index and that it has an “average” clustering factor of 1M, before and after the rebuild 😉 1M being somewhere in the middle of possible clustering factor values.
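A quick sanity check of the arithmetic behind these assumed figures (all values are taken straight from the scenario above, not from any real database):

```python
# Sanity-check the scenario's arithmetic (assumed figures from the example).
rows = 2_000_000
table_blocks = 100_000
leaf_blocks_before = 20_000   # index with ~50% wasted space
leaf_blocks_after = 10_000    # after a rebuild to full density

print(rows // table_blocks)         # average rows per table block -> 20
print(rows // leaf_blocks_before)   # index entries per leaf block, before -> 100
print(rows // leaf_blocks_after)    # index entries per leaf block, after -> 200
```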

The first thing to note is that the height remains the same after such a rebuild, even though the index is now only half the size. The only way for the index height to reduce down from 3 would be for the contents of all the intermediate branch blocks to fit within just the one root block, which would require a particularly small index within a very narrow range of sizes. Possible, but again quite unusual.

OK, let’s look at the cost of various scans before and after the rebuild, using the index costing formula I’ve discussed recently.
 
If we’re after just one row (a unique index perhaps), then to read the one row before the rebuild would be:
 
1 I/O for the root block + 1 I/O for a branch block + 1 I/O for a leaf block + 1 I/O for the table block = 4 LIOs.
 
After the rebuild, the total cost would be:
 
1 I/O for the root block + 1 I/O for a branch block + 1 I/O for a leaf block + 1 I/O for the table block = 4 LIOs.
 
In effect, no change. Well, we’re not likely to have too much of a performance improvement there.

 
 
Let’s increase the size of the range scan. What if we retrieve 100 rows (or approx. 0.005% of data):
 
Before the rebuild it would be
 
1 I/O for the root block +
1 I/O for a branch block +
1 for all the leaf blocks (0.00005 x 20000) +
50 for all the table blocks (0.00005 x 1M)
= 53.
 
After the rebuild it would be:
 
1 I/O for the root block +
1 I/O for a branch block +
1 for all the leaf blocks (0.00005 x 10000) +
50 for all the table blocks (0.00005 x 1M)
= 53.
 
Again, no difference …

 
 
OK, let’s increase the number of rows accessed substantially to 10,000 (or approx. 0.5% of the data).
 
Before the rebuild it would be:
 
1 I/O for the root block +
1 I/O for a branch block +
100 for all the leaf blocks (0.005 x 20000) +
5000 for all the table blocks (0.005 x 1M)
= 5102.
 

After the rebuild it would be:
 
1 I/O for the root block +
1 I/O for a branch block +
50 for all the leaf blocks (0.005 x 10000) +
5000 for all the table blocks (0.005 x 1M)
= 5052.
 
Or in percentage terms, a reduction of I/Os of approximately 1%. That’s just 1 tiny little percent …
 
So even for an index scan that accesses 10,000 rows (a reasonable number, and at 0.5% a reasonable percentage of the overall table), and even for an index that has reduced in size by half (a substantial reduction), the overall number of I/Os reduces by an unimpressive 1% for such a query in the above scenario.
 
Would reducing I/Os on such a query by 1% really improve performance “substantially” ? Will users really notice much of a difference ?
 
It’s of course all in the numbers and in how much work the actual index is performing, in how many I/Os are actually performed by the index itself and in how much of a reduction in the overall I/Os an index rebuild will actually contribute. I’ll leave it to you to plug in different ranges of selectivity to see what impact rebuilding such an index with the above characteristics might have.
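To make the plugging-in easy, here’s a small sketch of the simplified costing arithmetic used above (block visits down the index, a share of the leaf blocks, then a share of the clustering factor for the table visits); it is not the full CBO formula, just the arithmetic from this post:

```python
from math import ceil

def index_scan_lios(height, leaf_blocks, clustering_factor, selectivity):
    """Approximate LIOs for an index range scan plus table access, using the
    simplified arithmetic from the post: (height - 1) block visits to reach
    the leaf level, a share of the leaf blocks, and a share of the
    clustering factor for the table block visits."""
    branch_visits = height - 1
    leaf_visits = ceil(leaf_blocks * selectivity)
    table_visits = ceil(clustering_factor * selectivity)
    return branch_visits + leaf_visits + table_visits

rows_in_table = 2_000_000
for rows in (100, 10_000):
    sel = rows / rows_in_table
    before = index_scan_lios(3, 20_000, 1_000_000, sel)  # fragmented index
    after = index_scan_lios(3, 10_000, 1_000_000, sel)   # rebuilt index
    print(rows, before, after, f"{(before - after) / before:.1%} saved")
```

This reproduces the 53 vs 53 and 5102 vs 5052 figures above; try other selectivities and clustering factors to see when the index share of the work becomes significant.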
 
The point is that for the vast majority of index related queries, most of the work is performed in getting the data out of the table, not the index.
 
Therefore, reducing the size of an index, even by a substantial amount, may not necessarily reduce the overall I/Os associated with a query if the index only performs a tiny fraction of all the work. You could eliminate the index entirely and, providing Oracle could still magically retrieve just the 0.5% of rows of interest in the above example, performance for such a query would be unlikely to improve “significantly”.
 
So not only must an index reduce in size, but the activities associated with directly reading the index must constitute a significant proportion of the overall work (e.g. fast full index scans, index-only scans, etc.), or something else must have changed, for performance to have improved “significantly”.

Comments»

1. Marcin Przepiorowski - November 12, 2009

Hi Richard,

From a statistical point of view you are right. But what about your experience ? I got the impression that for an index that bad, a rebuild can help. Please correct me if I am wrong. If the index size was 20,000 blocks and after a rebuild it was 10,000, it means there was 50% unused space (e.g. due to concurrent row deletion and insertion). For an unknown number of index blocks, the number of rows inside a block can be smaller than the average of 100 rows per index block (2M rows / 20,000 blocks), and for others much bigger. As I understand it, the number of LIOs will depend on the condition values – for some conditions it can be better than for others (e.g. for id=1 it will be OK, for id=1000 not OK). After an index rebuild your statistics calculations will be independent of the condition values because all index blocks will have more or less the same number of rows.

For sure that kind of index is very unusual, and for common indexes a rebuild can help only in the case of Index Full Scans.

regards,


Richard Foote - November 13, 2009

Hi Marcin

In my theoretical example, I’ve assumed the index is fragmented evenly throughout the entire index structure. Therefore, I’ve assumed you would save approximately 1/2 the associated index I/Os during an index scan, regardless of which part of the index you scan. Sure, there may be examples where an index is more fragmented in some parts of the index structure and so more fragmented for specific index values than others.

However, in my experience, it’s quite rare for an index to be so fragmented AND for an associated index related scan to constitute a significant proportion of the overall I/Os.

Possible but unusual.


2. PdV - November 13, 2009

Good points, and glad someone is making them.

I think the LIOs of a proper index access (unique or range scan) will always be largely determined by just the B-level, and rebuilding will not make much difference to the LIOs on that index.
And a rebuilt index has a tendency to re-grow to its “stable” size, or even out of control, anyway. Hence the limited benefit of rebuilds.

I’m sure you can come up with plenty of exceptions to my statements here, but in general it’s the B-level, and not the index size, that determines the LIOs.

However, I can see one additional case where significantly reducing the size of an index might be useful: efficient use of the db_cache.
If buffers in the db_cache are needed elsewhere, space wastage is bad.
The reclaimed space will benefit other segments, at least until the application DML grows the index segment back to its bigger size.

Just my two eurocents.
Nice to see you back on topic!
Look forward to some interesting posts still.


Richard Foote - November 13, 2009

Hi Piet

Like everything, it depends.

A small scan repeated many times within say a nested loop is going to be impacted by the height.

LIOs as a result of a large range scan, accessing many 1000s of rows is more likely to be impacted by the leaf blocks.

The buffer cache is certainly one consideration, however generally a moderate amount of additional memory is likely to counter-balance the net effects of additional PIOs in a more cost effective manner than constant rebuilding.


3. David Aldridge - November 13, 2009

It’s arguable that making the index smaller improves the chance of avoiding physical reads, either on the index itself or on other segments (such as the table being accessed). It might be useful to have a look at v$segment_statistics to check the rate at which physical reads are being incurred as it’s generally pretty low, in my humble experience.

For example:

with data as
(select owner,
object_name,
sum(decode(statistic_name,'logical reads' ,value,0)) logical_reads ,
sum(decode(statistic_name,'physical reads' ,value,0)) physical_reads ,
sum(decode(statistic_name,'physical writes',value,0)) physical_writes
from v$segment_statistics
--where object_type like 'INDEX%' and
where (statistic_name like '%reads%' or statistic_name like '%writes%')-- and
-- value > 0
group by owner,
object_name
)
select object_name,
owner,
logical_reads,
physical_reads,
physical_writes,
case logical_reads
when 0 then null
else Round(physical_reads/logical_reads,6)
end phys_div_log
from data
/


Richard Foote - November 13, 2009

Hi David

As I said to Piet, the caching characteristic of the database is certainly a consideration. However, note again in my example that it’s the table that generates by far the greater LIOs (and hence likely PIOs). If detected as an issue, with modern memory capabilities a moderate increase in the buffer cache is often a cheaper, more cost effective solution than constant index (and table) reorgs.


4. Narendra - November 13, 2009

I think it is fair to say that if the application has many queries that use the index via an index full scan or an index fast full scan, then an index rebuild that results in a large reduction in branch and leaf blocks will help a lot (aka make all those queries perform better).
One question though. Is it possible that an index rebuild will help (or appear to help) in cases where the table is accessed using an index with a bad clustering factor, which results in the index access contributing a large percentage of the work ?
Something like the Throw-Away case.


Richard Foote - November 13, 2009

Hi Narendra

It’s actually the other way around. An index with a good clustering factor is more likely to benefit from a rebuild as it therefore contributes a higher percentage of the overall LIOs.

Change the CF in my example to a better value and redo the maths.

The other point I would make is that if many queries use either a full index scan or a fast full index scan, especially in an OLTP environment, especially if the index is large, then something is likely not right as both are relatively expensive operations to perform. I would concentrate on why such an index is being constantly used in such a manner rather than simply rebuilding the index.


Narendra - November 14, 2009

Richard,

Not sure I understood the point. I thought the Throw-away case as mentioned in the documentation was about returning many more rows from the index (thereby visiting many blocks) than are returned by the corresponding table access. Isn’t it possible that such an index is reduced to half because of a rebuild ? If yes, won’t that have any positive impact on the query ?
I must admit though I am a bit confused about how exactly a plan like that mentioned in the documentation can occur. I tried to search for an example but could not find one. Would appreciate it if you can point me to an example.


Jonathan Lewis - November 16, 2009

Narendra,
Looking at the plan (and row counts) in the “throwaway” example, I think the point they are making can be explained with an example like this:

You have an order table, with a customer_id column and an order_date column. You have a query which accesses this table with a join on customer_id and filters on a predicate using order_date, e.g:

where ord.customer_id = tabX.customer_id
and ord.order_date > trunc(sysdate – 7)

You have an index on customer_id, and an index on order_date. The optimizer chooses to use the index on customer_id. But if the average customer has placed 150 orders, yet only one in the last week, the row count statistics would be similar to the row counts shown in the throwaway example: 150 rows from the index for each row from the table.

In that case, rebuilding the index would be a total waste of effort, since almost all the work of the query would be done visiting 149 rows in the table to discard them because their dates were too early. The correct solution, obviously, would be to create the correct index, viz: (customer_id, order_date).

Now – this multi-column index probably would run at about 50% space utilisation (if every customer placed a lot of orders over a long period of time), but you might still be perfectly happy to ignore that “wastage” for various reasons – including the possibility that by the time your data size was large enough for it to matter you won’t want to rebuild an index that’s that big because of the time it takes to complete the job.
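The scale of the waste in Jonathan’s scenario can be sketched numerically. The figures below (index entries per leaf block, one table block visit per row) are illustrative assumptions, not measurements from his example:

```python
from math import ceil

# Assumed, illustrative numbers: the customer_id index identifies 150
# orders, of which only 1 survives the order_date filter.
orders_per_customer = 150
entries_per_leaf_block = 100   # assumed index entry density

# Root + branch block, plus the leaf blocks holding the 150 entries.
index_lios = 2 + ceil(orders_per_customer / entries_per_leaf_block)
# Poorly clustered table: roughly one table block visit per order row,
# 149 of which are then thrown away by the date predicate.
table_lios = orders_per_customer

share = index_lios / (index_lios + table_lios)
print(index_lios, table_lios, f"index share of LIOs: {share:.1%}")
```

Even halving the index would save at most a couple of the ~154 LIOs; almost all the work is visiting the 149 table rows that get discarded, which is why the composite index, not a rebuild, is the fix.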


Richard Foote - November 17, 2009

Thank you Jonathan for answering Narendra’s query, and apologies to you Narendra for missing the link that was actually in your previous post.

Again it comes back to my main point: if the I/Os associated directly with index blocks only make up a small percentage of the overall I/Os, then an index rebuild is unlikely to be beneficial for such a query.

Also note that an index with a poor CF is even less likely to benefit from a rebuild as a poor CF generally means an even greater proportion of table block visits vs. index block visits.


Narendra - November 18, 2009

Hello Jonathan/Richard,

Thank you for your detailed explanation and my apologies for late reply.
I just tried to create an example as you had described in order to observe the “Throw-away” behavior. But I could not get the results. Can you please help me?
SQL> create table orders as select object_id oid, object_name odesc,object_type cid, created from all_objects ;

Table created.

SQL> create table customers as select distinct object_type cid, 'CUST'||object_type cname from all_objects ;

Table created.

SQL> select count(*) from orders ;

COUNT(*)
———-
40584

SQL> select count(*) from customers ;

COUNT(*)
———-
23

SQL> select created, count(*) from all_objects where created > trunc(sysdate) – 7 group by created ;

CREATED COUNT(*)
——— ———-
17-NOV-09 1
17-NOV-09 1
17-NOV-09 1
17-NOV-09 1

SQL> exec dbms_stats.gather_table_stats(user, 'ORDERS') ;

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(user, 'CUSTOMERS') ;

PL/SQL procedure successfully completed.

SQL> create index created_ind on orders(created) ;

Index created.

SQL> explain plan for select ord.oid, ord.odesc, cu.cname from orders ord, customers cu
2 where ord.cid = cu.cid and ord.created > trunc(sysdate) – 7 ;

Explained.

SQL> select * from table(dbms_xplan.display) ;

PLAN_TABLE_OUTPUT
———————————————————————————————————————————–
Plan hash value: 4144700952

——————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
——————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 69 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN | | 1 | 69 | 7 (15)| 00:00:01 |
| 2 | TABLE ACCESS BY INDEX ROWID| ORDERS | 1 | 47 | 3 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN | CREATED_IND | 1 | | 2 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | CUSTOMERS | 23 | 506 | 3 (0)| 00:00:01 |
——————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

1 - access("ORD"."CID"="CU"."CID")
3 - access("ORD"."CREATED">TRUNC(SYSDATE@!)-7)

17 rows selected.


5. Richard Foote - November 15, 2009

Hi Narendra

Not sure what bit of the documentation you’re referring to ? If you can post a link to the specific part of the doco, I’ll see what relevance it has to the point I’m trying to make, which is simply that if index related I/Os are only (say) 1% of the total work within a execution plan step, then improving the index structure will only make a 1% improvement at best.


6. Blogroll Report 06/11/2009-13/11/2009 « Coskan’s Approach to Oracle - November 17, 2009

[…] Richard Foote-An Index Only Performs How Much Work ??? […]


7. Barfo Rama - November 18, 2009

Let’s consider the case of an index that has a certain subset of blocks that are accessed often. They would be sitting out there in the SGA buffers. If those blocks are condensed to half their former size, that would be half of that part of the SGA, right?

Now, those are accessed often, so they would be pretty much always there in the SGA.

Now, let’s consider that a number of indices have these types of characteristics. Wouldn’t there be a case where the difference between these two size groups makes a difference on what other blocks are being cycled out of the buffers? There could be a large difference in performance when you compare a thrashed I/O because these popular blocks are big in toto, versus cpu buffer access. Given a large system with multiple apps that all have built up “kinda wrong” indexes, it doesn’t have to be a narrow edge case, does it?

So how rare is this kind of situation, where blindly rebuilding all indices could make a difference? More importantly, how would one quantify it to see if it is the situation?


Richard Foote - November 18, 2009

Hi Narendra

I try to keep my examples as simple as possible, so they can illustrate the point in a clear manner. You’re confusing things a tad with the join and the type of join used. To see throwaway in action, you really only need the one table retrieving a number of rows via an index but filtering most of them out after hitting the table.

So basing this on your example:

SQL> create table bowie as select * from dba_objects order by object_name ;

Table created.

SQL> create index bowie_created_i on bowie(created);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> select object_id from bowie where created > '01-OCT-2009' and object_id = 102343;

OBJECT_ID
———-
102343

Execution Plan
———————————————————-
Plan hash value: 2578339375

———————————————————————————————–
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
———————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | 13 | 117(0)| 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID| BOWIE | 1 | 13 | 117(0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | BOWIE_CREATED_I | 308 | | 2(0)| 00:00:01 |
———————————————————————————————–

The index is retrieving a whole heap of rows based on the date predicate but you filter (throw away) most of them because of the filter condition on object_id.

Ideally, you would have an index on either both columns or, in this case, just the object_id column.

However, getting this back to the point I make in the post, if you look at the trace file of such a query, you would notice most of the associated I/Os come from getting the data out of the table, not from reading the index:

STAT #2 id=1 cnt=1 pid=0 pos=1 obj=102584 op='TABLE ACCESS BY INDEX ROWID BOWIE (cr=85 pr=11 pw=11 time=0 us cost=117 size=13 card=1)'
STAT #2 id=2 cnt=157 pid=1 pos=1 obj=102585 op='INDEX RANGE SCAN BOWIE_CREATED_I (cr=3 pr=2 pw=2 time=2 us cost=2 size=0 card=308)'

85 consistent reads in total for the table access step, of which just 3 were in accessing the index.

Rebuilding such an index in order to improve the performance of this query would yield minimal improvement.

Hope this helps and makes some kinda sense 🙂


Narendra - November 18, 2009

Richard,

Thanks for your response and example.
Apologies if my questions have diverted your original topic; that was not my intention. I was under the impression that the “Throw-away” case could also occur if the index grows larger and eventually reaches a stage where the index access needs to visit more blocks than the corresponding table access. I was wrong, as I had confused “number of blocks” with “number of rows”. The “Throw-Away” example talks about the number of rows, which is influenced by the index column selectivity.
Your example clearly shows that even in “throw-away” case, the consistent gets to access index will be less than those to access corresponding table.

p.s. BTW, as I was writing this, I was wondering whether it is possible that an index growing in size (i.e. more leaf blocks) will result in more consistent gets.


Richard Foote - November 18, 2009

Hi Barfo

Yes, true.

But a few points to put things in perspective.

You won’t get a 50% reduction with most indexes, even less so with some pctfree to spare, so most indexes won’t have these types of characteristics.

Next point is if caching issues are impacting performance, the far easier option is to simply increase memory allocation as necessary, rather than rebuilding indexes all the time, assuming you just pick those indexes that yield an actual benefit and reduce in size.

As many I/Os are associated with the table, why isn’t equal consideration given to tables, considering they also “waste” space after (say) deletions, and with pctused commonly set at 40% on most tables in non-ASSM tablespaces ? With most tables, the tables themselves generate most of the I/O, unless the table is heavily and almost exclusively accessed via the PK.

Finally, you can always check these things and see if excessive PIOs are causing a problem. v$segment_statistics has useful data in determining which indexes (or tables) have PIOs, v$sql gives you data on which sql is causing PIOs. One can benchmark the impact of index rebuilds by looking at the PIOs of any segment of interest (all if necessary) and any sql statements of interest and see if total I/Os significantly reduce and/or if total times for sql statements reduce.

So it can be quantified to a reasonable extent and it can be determined whether index rebuilds really are beneficial.

Are the SQL statements that are important to your business really running faster after rebuilding that index, or those indexes, or all your indexes ?

In my experience, rebuilding most indexes makes no measurable difference. However, rebuilding that specific index might make a measurable difference for that SQL in that application. So I strongly suggest not rebuilding all your indexes when it’s only that handful that makes a difference.

Basically, there’s no need to guess these things when one can actually know as Oracle is so well instrumented. If you know it’s made a difference, great. What was the difference, why the difference, you know. If you don’t know then you just don’t know.

However, you may get the same reduction in PIOs by simply increasing the buffer cache …


8. Richard Foote - November 19, 2009

Hi Narendra

Yes, an index “growing” in size can result in additional LIOs if it means Oracle now has to visit more leaf blocks (and in an extreme case, an additional branch block) than it did previously.

In my last example in the blog piece, the index scan accessing 10,000 rows had to read an additional 50 leaf blocks before the index was rebuilt.

However, this represented only 1% of all the LIOs performed during the scan and so these additional I/Os were relatively trivial.

However, if the LIOs on the index consisted of a far greater percentage of the total LIOs, then perhaps possible savings would be more significant.

So like most things it depends. But as you don’t generally save (say) 50% during an index rebuild/coalesce, and as index scans don’t generally constitute a high percentage of the overall I/Os unless it’s a reasonably large scan of a good CF index, an index-only scan of some description, or a tiny index scan, the overall benefits of such a rebuild/coalesce are likely to be minimal.


9. Log Buffer #170: a Carnival of the Vanities for DBAs | The Pythian Blog - November 21, 2009

[…] Foote asks, An index only performs how much work???, the result of looking into exactly whi index rebuilds can improve performance so […]


10. Puzzled - November 26, 2009

We recently encountered an interesting behaviour.

The following two queries were run one after the other with no data/object changes in between.

col1 is a number(10) and col2 is a date. We are on Oracle 11.1.0.7.

select col1, col2 from tab1
where col1 < 2147483647
and col2 <= sysdate ;

0 db block gets
18696 consistent gets

select col1, col2 from tab1
where col1 <= 2147483646
and col2 <= sysdate ;

0 db block gets
469 consistent gets

The second query is faster and uses much less gets because it is most likely not sifting thru
some of the index rows with the 2147483647 value. The question is why/how ?

Is it an optimizer bug or known feature that <= performs better than < ?

For reference :

select count(*) from tab1 where col1 = 2147483647 ;

COUNT(*)
———-
8384159


11. Puzzled - November 26, 2009

Couple of data points :

There is a local index on col1 + col2 and the explain plan is the
same for both statements

SELECT STATEMENT
— 1.1 PARTITION RANGE ALL (Partitions: 1 – 19)
—- 2.1 INDEX RANGE SCAN I_HW_PRIORITY_ELIGIBLE_UTC INDEX (Partitions: 1 – 19)


Richard Foote - November 26, 2009

Hi Puzzled

Interesting !! A couple of things that spring to mind.

Firstly, although they’re kinda equivalent, they’re not actually the same and are costed differently by the CBO.

The first statement without the equality condition has a selectivity costed by the CBO as being approximately:

(2147483647 – lowest value) / (highest value – lowest value)

So the selectivity is calculated as being the % of rows in the range 2147483647 to the lowest value out of the total range of values.

However, with an equality condition, it’s costed as being:

(2147483646 – lowest value) / (highest value – lowest value)

PLUS

the selectivity of one additional value (the value of 2147483646 itself). It’s treated as if the query was:

col1 < 2147483646 or col1 = 2147483646

So this may not end up with the same cost (depending on how the data is distributed within the ranges) and so the plan might be different.

But you say the plans are the same …

So the question I have is: as you have a significant number of rows with col1 = 2147483647, did you likewise have many values of 2147483646, and may many of them have been deleted ? It might be that Oracle is forced to read through a whole bunch of leaf blocks that are actually empty until it finally gets to the first occurrence of 2147483647 ? As it's reading each of the 19 partitions, it might only be in a few of them where this might be happening ?

I would recommend tracing a session that runs slowly (dbms_monitor) and see within the trace file what all the additional blocks are that are being hit. If they are index blocks, dump a few of them and see what the contents might be.

In answer to your question: no, a <= should not run faster, but it depends on how it's actually costed by the CBO and what values might possibly sit in the boundary just below the <.
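Richard’s two selectivity expressions above can be written as a small sketch, assuming the textbook uniform-distribution formula with no histograms and no CBO rounding or capping at 1; the low/high/num_distinct values below are purely illustrative, not statistics from the table in this thread:

```python
def range_selectivity(limit, low, high, num_distinct, inclusive=False):
    """Uniform-distribution selectivity for col < limit, or for
    col <= limit when inclusive=True (costed as col < limit plus
    the one extra value col = limit)."""
    sel = (limit - low) / (high - low)
    if inclusive:
        sel += 1 / num_distinct   # selectivity of the boundary value itself
    return sel

# Illustrative column statistics only.
low, high, ndv = 0, 2_147_483_647, 1_000

sel_lt = range_selectivity(2_147_483_647, low, high, ndv)                  # col1 <  2147483647
sel_le = range_selectivity(2_147_483_646, low, high, ndv, inclusive=True)  # col1 <= 2147483646
print(sel_lt, sel_le)
```

With these made-up statistics the two predicates cost almost identically, which is why identical plans are unsurprising; the real CBO applies further rounding and adjustments on top of this.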


12. Henish - November 27, 2009

from your above reply

“did you likewise have many values of 2147483646 and
may many of them have been deleted ? It might be Oracle is forced to
read through a whole bunch of leaf blocks that are actually empty until it
finally gets to the first occurrence of 2147483647 ?”

I did not get your point here,

shouldn't it be that for the predicate "col1 < 2147483646 or col1 = 2147483646" it also has to go
through the empty leaf blocks to check col1 = 2147483646 ?

Thanks


Richard Foote - November 27, 2009

Hi Henish

You are of course correct.

Note to self. Don’t try and write a reply that requires some thought just before going to bed 🙂

I go back to my last point, that being tracing the session and seeing why the extra I/Os, assuming the execution plans are indeed identical.


13. Puzzled - November 28, 2009

Thanks guys for the reply.

Our first thought was the empty leaf blocks as well, since we
update the 2147483646 value a lot and move it to 2147483647.

So we did a CTAS of the original table, put on the same index
of col1+col2 and ran the queries. Same result – big diff in gets and performance – and explain plans. Note that the copied table was idle and no dml was happening.

We did run some traces for both queries.
Let me find them and paste them here.


Richard Foote - December 2, 2009

HI Puzzled

You mention here that there are big diffs in “explain plans”, but you previously said there were no diffs in explain plans.

Which is it ? Because a diff in explain plans could easily be explained by the diff in costings, as I originally described.


14. Puzzled - December 3, 2009

No difference in explain plans.
I meant big difference in gets and performance.

Sorry, my wording was confusing in the previous post.

Here are the trace files. Notice the difference in the cr value between
both. BTW, we have an SR and an Oracle bug open on this now 🙂

alter session set events '10046 trace name context forever, level 12';

=====================
PARSING IN CURSOR #11 len=135 dep=0 uid=125 oct=3 lid=125 tim=1230257056699497 hv=947105316 ad='f3230100'
select PRIORITY, count(*) from tab1
where PRIORITY <= 2147483646
and ELIGIBLE_UTC <= sysdate-6
group by PRIORITY
END OF STMT
PARSE #11:c=3999,e=3608,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1230257056699489
BINDS #11:
EXEC #11:c=0,e=79,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1230257056699649
WAIT #11: nam='SQL*Net message to client' ela= 2 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1230257056699679
FETCH #11:c=3000,e=2880,p=0,cr=17,cu=0,mis=0,r=0,dep=0,og=1,tim=1230257056702591
WAIT #11: nam='SQL*Net message from client' ela= 1252 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1230257056703912

=====================
PARSING IN CURSOR #11 len=134 dep=0 uid=125 oct=3 lid=125 tim=1230257033438778 hv=2542220544 ad='d7fce140'
select PRIORITY, count(*) from tab1
where PRIORITY < 2147483647
and ELIGIBLE_UTC <= sysdate-6
group by PRIORITY
END OF STMT
PARSE #11:c=0,e=48,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1230257033438776
BINDS #11:
EXEC #11:c=0,e=53,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1230257033438878
WAIT #11: nam='SQL*Net message to client' ela= 1 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1230257033438903
FETCH #11:c=1210815,e=1186557,p=0,cr=6862,cu=0,mis=0,r=0,dep=0,og=1,tim=1230257034625483
WAIT #11: nam='SQL*Net message from client' ela= 1467 driver id=1413697536 #bytes=1 p3=0 obj#=-1 tim=1230257034627110


Richard Foote - December 6, 2009

Hi Puzzled

The next thing to check is to dump some of the additional index blocks being accessed and see if the contents give any clue.

It’s not obvious (to me) why this would be happening.

Let us know if the SR gets any useful info.


15. Puzzled - December 3, 2009

One other thing –

I originally mentioned that ELIGIBLE_UTC was a date column.
It is actually a timestamp column though we only store thru SS in that.


16. Puzzled - January 20, 2010

Just to give you guys an update.

Oracle’s stance was that the extra gets were due to accounting for
fractions/decimals (in spite of the column being number(10,0)).
After a lot of back and forth with Oracle, they have raised an
enhancement request for this.

See Bug # 9292477


Richard Foote - January 24, 2010

Thanks for the update, much appreciated.


17. Kwangseo - January 22, 2014

After a massive deletion, index performance degrades severely.
Below I simulated this interesting phenomenon with my analysis.

regards.

http://inhim.blog.me/100204134961


18. Richard Foote - February 14, 2014

Hi Kwangseo

I guess “interesting” is in the eye of the beholder 🙂

Yes, if you perform a massive deletion and don’t re-insert data back, then the deleted space doesn’t get re-used and a rebuild/coalesce of the index and just as importantly a reorg of the table might be warranted.

This is one of the “classic” use cases for a possible index rebuild, as I’ve discussed previously such as here:

Click to access index-internals-rebuilding-the-truth-ii.pdf


