jump to navigation

The CBO CPU Costing Model: Indexes vs. Full Table Scans November 25, 2009

Posted by Richard Foote in CBO, Full Table Scans, Oracle Indexes, System Statistics.

As previously promised, I thought I might look at how the CBO goes about costing a Full Table Scan (FTS) with system statistics and the CPU costing model, so we can understand why the CBO may have chosen one option over the other.
WARNING: You might need to grab a calculator to help you along :)

To illustrate, I’m simply going to use the original BOWIE_STUFF table and index setup I created in my earlier Introduction to the CBO. I’ll however recreate the demo here again from scratch to refresh your memory:
I first create a table that has 100,000 rows, with an indexed “ID” column that has 100 distinct, evenly distributed values. For those mathematically challenged, this means each distinct value will return 1000 rows.

SQL> CREATE TABLE bowie_stuff AS SELECT (mod(rownum,100)+1)*10 id, 'Ziggy Stardust' name FROM dual CONNECT BY LEVEL <= 100000;
Table created.
SQL> CREATE INDEX bowie_stuff_i ON bowie_stuff(id);
Index created.
SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BOWIE_STUFF', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
PL/SQL procedure successfully completed.
SQL> select blocks from dba_tables where table_name='BOWIE_STUFF';

Note the table has 329 blocks. It’s a number I’ll refer to a number of times throughout.

SQL> SELECT index_name, blevel, leaf_blocks, clustering_factor FROM user_indexes WHERE index_name = 'BOWIE_STUFF_I';

------------- ------

----------- -----------------
        207             32900

Note also that the index has a blevel of 1, 207 leaf blocks and a rather poor Clustering Factor (CF) of 32900, not close at all to the number of blocks in the table. As we’ll see, the CF is so bad that the CBO will choose a FTS over the index.

SQL> show parameter db_file_multi
NAME                          VALUE
----------------------------- -----
db_file_multiblock_read_count    16

Note the db_file_multiblock_read_count is manually set to 16. Relevant when calculating the cost of a FTS with the I/O costing model. Less so with CPU costing in use as we’ll see.

Finally, if we look at the system statistics in place:

SQL> select pname, pval1 from sys.aux_stats$ where pname in ('SREADTIM', 'MREADTIM', 'MBRC', 'CPUSPEED');

-------- -----
MBRC        10

All of these values will be relevant when calculating the cost of a FTS with the CPU costing model.
OK, we now have all the information we need to determine how the CBO will treat both index and FTS activities on this table.
Let’s start by refreshing ourselves with how the I/O based CBO model will deal with such a scenario.

SQL> alter session set "_optimizer_cost_model" = io;
Session altered.

OK, let’s just run a simple query that selects data for a specific ID. Remember, there are 100 evenly distributed distinct IDs so this query will return 1% of the data (1000 rows):

SQL> set autotrace traceonly
SQL> SELECT * FROM bowie_stuff WHERE id = 420;
| Id  | Operation         | Name        | Rows  | Bytes | Cost  |
|   0 | SELECT STATEMENT  |             |  1000 | 18000 |       |
|*  1 |  TABLE ACCESS FULL| BOWIE_STUFF |  1000 | 18000 |       |


Note: the CBO has decided to use a FTS to select the 1% of rows as it has the lowest associated cost.
As previously discussed, the cost of using the index is approximately:
index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)
 = 1 + (207 x 0.01) + (32900 x 0.01) = 1 + 3 + 329 = 333        

Note: the 1 for the blevel can be dropped by the CBO bringing the cost down to 332, to be discussed another time .
As previously discussed, the FTS cost is approximately:
segment header I/O + ceil(table blocks/fudged mbrc value) 

Note: for a db_file_multiblock_read_count of 16, the adjusted, “fudged” value used by the CBO is approximately 10.4.

Therefore, for the above example, the FTS cost is calculated as:
= 1 + ceil(329/10.4) = 1 + 32 = 33
33 is significantly less than 333 so the FTS easily wins out.

So how do things change when using System Statistics and the CPU costing model ? How does the CBO calculate the cost of the FTS with the above system statistics in place ?
As I’ve previously discussed, the significant change with the CPU costing model is not so much how it impacts the cost of index accesses but that of the FTS.
Let’s run the same SQL, but this time using the CPU costing model:

SQL> alter session set "_optimizer_cost_model" = cpu;
Session altered.
SQL> SELECT * FROM bowie_stuff WHERE id = 420;
1000 rows selected.

| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time    |
|   0 | SELECT STATEMENT  |             |  1000 | 18000 |     70(5)  | 00:00:01|
|*  1 |  TABLE ACCESS FULL| BOWIE_STUFF |  1000 | 18000 |     70(5)  | 00:00:01|

Note: CBO is still picking the FTS as the CF is truly awful. However the cost of the FTS has increased significantly from 33 to 70, although nowhere near the approximate 333 cost of using the index.
So why has the FTS cost increased and how is this new cost calculated ?
As previously discussed , the CPU costing formula is basically:
(sum of all the single block I/Os x average wait time for a single block I/O +
 sum of all the multiblock I/Os x average wait time for a multiblock I/O +
 sum of all the required CPU cycles / CPU cycles per second)
average wait time for a single block I/O

If we first focus on the single block I/O portion of the formula, the only single block read considered by the CBO during a FTS is the one associated with reading the segment header. Note that the average wait time for a single block read is the SREADTIM system statistic.
If there’s just the one single block I/O, the single block read portion of the formula effectively equates to (1 x sreadtim) / sreadtim, which just equals 1. So 1 is basically added to the cost with regard to reading the segment header as it is with the I/O costing model.
OK, lets next look at the portion of the formula with regard to multiblock I/Os.
The sum of all the multiblock I/Os is calculated in a similar manner as it was with the I/O costing model. It’s simply the number of blocks in the table below the HWM (329 in our example) but this time divided by the MBRC system statistic. Note however the MBRC statistic isn’t some fudged, somewhat arbitrarily set figure based on the db_file_multiblock_read_count parameter, but the actual average size of multiblock I/Os in the specific database environment. Note also that the average wait time for a multiblock read is the MREADTIM system statistic.
So the total wait time for all multiblock reads in the above example is:
sum of all the multiblock I/Os x average wait time for a multiblock I/O = (BLOCKS/MBRC) x MREADTIM = ceil(329/10) x 10 = 330.
This value is then divided by the average wait time for a single block read (the SREADTIM system statistic) to give the overall cost of multiblock reads, but expressed in units of single block I/Os.
The total cost for multiblock I/Os is therefore:

So the total costs associated for all I/Os is the 1 for reading the segment header plus 66 for all the multiblock reads = 67.
However, the cost of the FTS is 70, not 67. Where does the additional cost of 3 come from ?
Well, that’s the CPU portion of the formula. The CBO has determined that the FTS operation will require ‘x’ number of CPU cycles and this value is then divided by the CPUSPEED to determine how long this CPU activity will take.
This CPU elapsed figure is then again divided by the average wait of a single block read (SREADTIM) to also put the CPU costs in units of single block reads. In this example, the total CPU related costs amount to 3.
Oracle gives us an indication of what the CPU component is in the overall cost within the execution plan via the %CPU value (which is 5 in the above execution plan). The (%CPU) value is the ceil of the overall percentage of CPU costs as calculated by the following formula:
%CPU = ceil(CPU related costs/overall costs)
So in our example, %CPU = ceil(3/70 x 100) = ceil(4.29) = 5% (as indeed displayed in the above execution plan).

Again, all the costs associated with a FTS with the CPU costing model can be derived and make some kinda sense. Providing all the necessary inputs are all actually correct and valid, the CBO will indeed correctly decide to use a FTS over an index when it’s the less expensive option.
I’ll next expand these points and why understanding how these costs are derived can be extremely useful.

You can now put your calculators away :)

An Index Only Performs How Much Work ??? November 12, 2009

Posted by Richard Foote in Index Rebuild, Oracle Indexes, Oracle Myths.

One of the main reasons suggested for performing periodic index rebuilds is to improve “performance”. After rebuilding indexes, applications now run so much faster. Users notice a significant improvement in performance. And so on.
There are of course situations when rebuilding specific indexes can indeed improve performance, here’s but one example I’ve discussed previously.
However, the question I always ask when someone claims an index rebuild has made things run faster is to simply ask why. Why is such a simple, but powerful question. Why have things improved ? What has specifically changed as a result of rebuilding the index, that Oracle has now reduced the overall work associated with performing the activity, to the point that things run noticeably faster.
Knowing why is really important because it confirms that indeed there was an improvement and that it was indeed associated directly with the index rebuild. It means when a similar situation arises again, you know how to directly resolve the problem appropriately and effectively next time. Also knowing why means you can determine the specific root cause, perhaps preventing things from deteriorating so bad in the first place, such that rebuilding the index becomes unnecessary. Prevention being the best cure …
Now the most common answer I get for why rebuilding an index has been so beneficial is because the index is now so much smaller after the rebuild that the overheads of reading the index have substantially reduced. If the index is smaller, it means one can read the same amount of index related data with less I/Os. Less I/Os means better performance.
For example, if you can reduce the index by half, you can fit the index into only half the number of leaf blocks and that’s a 50% performance improvement right there.
Well, firstly it assumes that the index has indeed reduced by half. It would actually be a relatively unusual index for it to be so poorly fragmented or for it to have so much “wasted” space that it could be rebuilt into only half the number of leaf blocks.
Possible but somewhat unusual.
However, it also assumes that by having a 50% performance improvement, reading the index blocks constitutes the vast majority of the work. Again possible in some scenarios.
With most index related activities though, reading the index blocks actually constitutes only a small percentage of the overall work. In most cases, the index only contributes a very small percentage of the overall I/O activity. Therefore by potentially only reducing a small percentage of the overall work by rebuilding just the index, the overall impact is generally going to be minimal.
I thought I might perform some basic mathematics to illustrate and put into perspective just what little impact index rebuilding can have in improving performance, even if by doing so the index dramatically reduces in size, because the index itself actually constitutes only a small percentage of the overall costs.
Let say we have one of these more unusual indexes that is so poorly fragmented that it has approximately 50% wasted space throughout the entire index structure. Let’s say rebuilding such an index reduces the overall size of the index by half.

Before the index rebuild, an index has 50% of wasted space and say:
Height of 3
1 Root Block
50 Intermediate Branch Blocks
20,000 Leaf blocks

After the rebuild, the index has no wasted space and has now:
Height of 3
1 Root Block
25 Intermediate Branch Blocks
10,000 Leaf Blocks

Let’s assume the table contains 2M rows and that they fit in 100,000 blocks (i.e. the index is about 1/10 that of the table and the average row size is such that we fit say 20 rows on average per block). Let’s also assume there’s nothing special about this index and that it has an “average” clustering factor of 1M, before and after the rebuild ;) 1M being somewhere in the middle of possible clustering factor values.

The first thing to note is that the height remains the same after such a rebuild, even though the index is only now half the size. It would be extremely unlikely and the index would have to be particularly small and within a very narrow range of sizes for all the contents of all the intermediate branch blocks to fit within just the one root block. The only way for the index height to reduce down from 3 would be for the contents of all intermediate branches to fit within the root block. Possible, but again quite unusual. 

OK, let’s look at the cost of various scans before and after the rebuild, using the index costing formula I’ve discussed recently.
If we’re after just one row (a unique index perhaps), then to read the one row before the rebuild would be:
1 I/O for the root block + 1 I/O for a branch block + 1 I/O for a leaf block + 1 I/O for the table block = 4 LIOs.
After the rebuild, the total cost would be:
1 I/O for the root block + 1 I/O for a branch block + 1 I/O for a leaf block + 1 I/O for the table block = 4 LIOs.
In effect, no change. Well, we’re not likely to have too much of a performance improvement there.

Let’s increase the size of the range scan. What if we retrieve 100 rows (or approx. 0.005% of data):
Before the rebuild it would be
1 I/O for the root block +
1 I/O for a branch block +
1 for all the leaf blocks (0.00005 x 20000) +
50 for all the table blocks (0.00005 x 1M)
= 53.
After the rebuild it would be:
1 I/O for the root block +
1 I/O for a branch block +
1 for all the leaf blocks (0.00005 x 10000) +
50 for all the table blocks (0.00005 x 1M)
= 53.
Again, no difference …

OK, let’s increase the number of rows accessed substantially to 10,000 (or approx. 0.5% of the data).
Before the rebuild it would be:
1 I/O for the root block +
1 I/O for a branch block +
100 for all the leaf blocks (0.005 x 20000) +
5000 for all the table blocks (0.005 x 1M)
= 5102.

After the rebuild it would be:
1 I/O for the root block +
1 I/O for a branch block +
50 for all the leaf blocks (0.005 x 10000) +
5000 for all the table blocks (0.005 x 1M)
= 5052.
Or in percentage terms, a reduction of I/Os of approximately 1%. That’s just 1 tiny little percent …
So even an index that accesses 10,000 rows, a reasonable number and at 0.5% a reasonable percentage of the overall table, even an index that has reduced in size by half, a substantial reduction in size, only reduces the overall number of I/Os by an unimpressive 1% for such a query in the above scneario.
Would reducing I/Os on such a query by 1% really improve performance “substantially” ? Will users really notice much of a difference ?
It’s of course all in the numbers and in how much work the actual index is performing, in how many I/Os are actually performed by the index itself and in how much of a reduction in the overall I/Os an index rebuild will actually contribute. I’ll leave it to you to plug in different ranges of selectivity to see what impact rebuilding such an index with the above characteristics might have.
The point is that for the vast majority of index related queries, most of the work is performed in getting the data out of the table, not the index.
Therefore, reducing the size of an index, even by possibly a substantial amount, may not necessarily reduce the overall I/Os associated with a query if the index only performs a tiny fraction of all the work. You could eliminate the index entirely and providing Oracle could still magically retrieve just the 0.5% of rows of interest in the above example, performance for such a query would unlikely improve “significantly”.
So not only must an index reduce in size but the activities associated with directly reading the index must constitute a significant proportion of the overall work (eg. fast full index scan, index only scans, etc.) or something else must have changed for performance to have improved “significantly”.

Zurich and Paris: Nice Way To Spend A Week November 10, 2009

Posted by Richard Foote in Richard's Musings, Travel.

I’ve just returned from a bit of whirlwind visit to Europe, teaching my Index Internals Seminar in two lovely cities, Zurich in Switzerland and Paris in France.

Coming from Australia, everything is relatively new with nearly anything over a 100 years old regarded as historically significant. I therefore love visiting European cities which are so rich in history and where you can walk past buildings regarded as “new” but actually built before Australia was even discovered by the Europeans.

I’ve been to Switzerland before but never to Zurich, so I was keen to spend my free day there doing what I enjoy doing best when visiting a city for the first time, just walking and exploring. I was really lucky in that the weather on my free day was next to perfect, clear and sunny, whereas it rained on all the other days I was in Switzerland.

I started my walk from the impressive “Zurich Hauptbahnhof” (Railway Station) which is actually a lovely old building in its own right. I then made my way down the picturesque Limmat, past wonderful buildings such as St. Peter’s Church with what is considered the largest clock face in Europe, Fraumunster Church which has some of the most beautiful stained glass windows I’ve ever seen, the huge 700-800 year old Grossmunster Cathedral and the elegant Opera House.

After walking around the shores of Zurich Lake for a while, I decided to take a cruise where I just sat on the top of the ferry-boat, soaking up the sun and all the amazing views, including the massive Alps looming on the horizon. Zurich Lake certainly puts Canberra’s own little Lake Burley Griffin to shame although I can cycle around Burley Griffin during my lunchtime, not sure I could do the same around Zurich Lake. I then just walked and walked throughout the “old town” district, exploring all the lovely old buildings with their bay windows and balconies and all the narrow alleyways, squares and little market places. I of course bought some Swiss chocolates for the folks back at home.

2009 Switzerland France 047

As it grew dark, Zurich lit up and all the lovely old buildings shone with a new brilliance. The only incident I had of note was on the train back to the hotel, when the train inspector upon checking my ticket gave me a shocked look, full of disdain as I was inadvertently sitting in the first class carriage !! I was led like a naughty boy to where I belonged with all the other second class citizens. The seminar on the following days went really well with an excellent class of folk (pun of course fully intended).

Later in the week, I went to Paris to present my second seminar. Paris is a city I had previously visited, many years ago as the first destination of my honeymoon. So I had to be careful not to have too good a time, else there might be a few problems awaiting me back at home :) Again I was really fortunately, having a perfect sunny day on my free day while it rained on every other day of my visit. I decided to basically walk from the Eiffel Tower to Notre-Dame Cathedral while crossing every single bridge on the Seine in between. Paris is absolutely one of my favourite cities in the world, it’s such a unique and special place. The walk was fantastic (I highly recommend it to anyone who hasn’t done it), with just so much to see and enjoy along the way. And boy are there a lot of bridges (I think I counted 14 bridges along the way), but none as impressive as the grand Pont Alexandre III.

2009 Switzerland France 108

Apart from seeing the Eiffel Tower at the start, I spent time at the charming Place de la Concorde, spent a few hours at the incredible d’Orsay Museum (incredible if you’re like me and love Impressionist Paintings by masters such as Van Gogh, Monet and Renoir), went past the one and only Louvre Museum and then onto Saint Louis Island and the stunning Notre Dame Cathedral where I spent some time just marvelling at the place.

I decided to walk all the way back, but this time exploring primarily the north bank with its shops, restaurants and sites such as Saint-Jacques Tower and Place Vendome. As it got dark, I finally made it up to the Avenue Des Champs Elysees and the famous Arc De Triomphe. To finish what was a fantastic day, I then walked all the way across back to where I had begun and the Eiffel Tower at night, so I could see it in all it’s lit-up glory and ambience, with its search beams circling the clear evening skies.

The only problem I had was being continually hassled again and again by people who kept saying I had just dropped a “golden ring” (I guess in the hope I would buy it as some sort of bargain). There must obviously be a good market for shower curtain rings in Paris …

The seminar was again great with a good crowd asking some really good questions. Everyone was very patient as I tried to not talk too fast for those whom English was a little rusty. For those that know me, talking slowly is not a skill that comes easily to me :)

I had a great time but thankfully, that’s all my travel over with for the forseeable future. Next year I will definitely reduce my travel commitments, spending much more time at home. Which of course also means perhaps spending more time on this blog as well :)


Get every new post delivered to your Inbox.

Join 1,703 other followers