The CPU Costing Model: A Few Thoughts Part V (Reality)January 13, 2010

Posted by Richard Foote in CBO, Multiblock Reads, Oracle Cost Based Optimizer, Richard's Musings, System Statistics.

There’s plenty more I could talk about regarding the CBO CPU costing model and system statistics but I’ll make this my final little comment on this subject for now.

As previously discussed, the CPU costing model basically takes the time it takes to perform the all necessary I/O related activities and all the time it takes to perform all necessary CPU related activities and adds them together to get the overall time to complete a task. The CBO then takes this total and divides it by the average time to perform a single block I/O so that it expresses the overall costs in units of single block I/Os.

There are two advantages with expressing CBO costs in this manner.

Firstly, it makes the move from the old I/O costing model a little easier in that the “units” of cost under both CBO costing models is very similar.

With the I/O costing model, the unit of cost was also basically the number of I/Os. It’s just that the CBO made no (automatic) distinction between the I/O costs associated with single and multiblock reads. The cost was simply the expected total number of I/Os for a given execution plan, with single block and multiblock I/Os being consider the same (unless the optimiser_index_cost_adj parameter kicked in).

With the CPU costing modelling, the costs are expressed specifically in units of single block I/Os. However, the CBO automatically takes into consideration and differentiates the relative costs associated with multiblock I/Os (and CPU operations) and incorporates them automatically into the final cost.

The other nice advantage is that one can use the actual cost values as an indication of how long an operation or execution plan is likely to take. The overall execution times of the plan are divided by the average time of a single block I/O when using the CPU costing formula. Therefore by multiplying these cost values out again by the average time of a single block I/O (SREADTIM system statistic), one can have an indicative idea of the overall expected execution time.

The overall execution times as estimated by the CBO using the CPU costing model is therefore basically = cost of execution plan multiplied by SREADTIM system statistic.

Using my previous example with the FTS where the overall cost of the execution plan was 70, and the SREADTIM system statistic was 5:

the overall execution time as estimated by the CBO is approximately 70 x 5 = 350 ms.

Now this of course is only an indicative value. As all system related statistics are simply averages, there could obviously be discrepancies with how long specific I/Os take to actually perform, the size and number of specific multiblock read operations, etc. There may also be caching characteristics of objects that may influence the actual number of physical reads and associated wait times, it doesn’t take into consideration time taken to actually return data to the “client”, etc. etc. etc.

However, it provides one with a rough “ballpark figure”. If the actual executions times in the above example were (say) 20 seconds, then it’s a strong indication that the CBO may have got it wrong, that it may have calculated the wrong cost and maybe as a result the wrong execution plan. Somewhere, something such as the segment statistics, the system statistics, optimizer parameters, etc. may be inaccurate and is causing the CBO to get its costings incorrect.

The CBO cost value doesn’t compare well to reality and so is perhaps worthy of further investigation.

The cost values associated with CPU costing model is not some random, ambiguous, mysterious number but a value that can often be derived and which can be most useful in determining and resolving problematic SQL statements and execution plans.

The CPU Costing Model: A Few Thoughts Part IV (Map of the Problematique)January 7, 2010

Posted by Richard Foote in CBO, Oracle Cost Based Optimizer, System Statistics.

It’s called the CPU Costing model because among other things, it includes the time associated with performing CPU operations.

The CPU Costing model formula once again:

(sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O

So the portion detailing the sum of all required CPU cycles divided by the CPU cycles per second can potentially contribute a significant proportion of the overall costs.

When I previously discussed the costs associated with the CPU model between using an index and a FTS, the FTS example I used had an overall cost of 70 but I calculated that the I/O component costs were only 67. Therefore the costs directly related to CPU operations with the FTS example was 3.

However, these CPU specific costs in this example may vary from database to database, although the FTS might be identical as might the required CPU cycles. However, a variable in all this is the CPU cycles per second system statistic (CPUSPEED) associated with a particular database environment.

Obviously, the faster the CPUs, the quicker it can perform the necessary CPU operations associated with the FTS (or any operation for that matter). Conversely, the slower the CPUs, the longer it will take to complete the necessary CPU related operations. The CPU costing model formula automatically takes all this into consideration.

In the previous example, the CPUSPEED system statistic was 1745.

Let’s now run the identical FTS but this time with a faster and a slower CPU and see how this might adjust the overall related costs when using the CPU costing model.

One can simulate a “faster” CPU by simply adjusting the CPUSPEED system statistic. Let’s make the CPUs appear 10 times faster:

SQL> exec dbms_stats.set_system_stats(pname=>’cpuspeed’, pvalue=>17450);

PL/SQL procedure successfully completed.

OK, let’s now see how this impacts the cost of the FTS:

SQL> SELECT * FROM bowie_stuff WHERE id = 420;

1000 rows selected.

Execution Plan
———————————————————-
Plan hash value: 910563088

——————————————————————-
|Id|Operation         |Name       |Rows|Bytes|Cost (%CPU)|Time    |
——————————————————————-
| 0|SELECT STATEMENT  |           |1000|18000|   67   (0)|00:00:01|
|*1| TABLE ACCESS FULL|BOWIE_STUFF|1000|18000|   67   (0)|00:00:01|
——————————————————————-

We notice that the overall cost has reduced from 70 down to 67. The cost of 3 that was previously attributed to just the CPU related costs have all disappeared and the costs are now just the 67 in relation to the I/O component.

The CPU is now so fast that it can effectively perform all the necessary operations in a negligible amount of time. An even faster CPU will not further improve the costs associated with this FTS as the costs now only include the I/O related components.

The (%CPU) value of (0) gives us this information if you didn’t follow how I derived the I/O cost of 67 in my previous post.

If we go the other way and now make the CPU about 1/10 the speed of the original example:

SQL> exec dbms_stats.set_system_stats(pname=>’cpuspeed’, pvalue=>175);

PL/SQL procedure successfully completed.

SQL> SELECT * FROM bowie_stuff WHERE id = 420;

1000 rows selected.

Execution Plan
———————————————————-
Plan hash value: 910563088

——————————————————————-
|Id|Operation         |Name       |Rows|Bytes|Cost (%CPU)|Time    |
——————————————————————-
| 0|SELECT STATEMENT  |           |1000|18000|   93  (28)|00:00:01|
|*1| TABLE ACCESS FULL|BOWIE_STUFF|1000|18000|   93  (28)|00:00:01|
——————————————————————-

We now notice the overall costs have jumped up considerably up from 70 up 93.

The costs associated directly with CPU activities have now increased up from 3 to 26. The CPU component is in the ballpark of 10 times as expensive/significant when you take into account rounding errors (the original 3 value was rounded accordingly). Remember also that these figures are times expressed in units of time it takes to perform a single block I/O.

The CPUs are now so slow that it takes a considerable amount of time to complete all the required CPU operations.

Note that the (%CPU) value is now a significant (28%) of the overall costs as derived from the following formula:

round(cpu related cost/total cost) x 100 = round(26/93 x 100) = 28.

So having a faster (or possibly slower) CPU when performing a hardware upgrade/change can result in potentially different execution plan costings (and as such different execution plans) when using the CPU CBO costing model.

It’s called the CPU costing model for a reason and as one would indeed hope, the speed of said CPU(s) can directly impact the associated costs and decisions made by the CBO.

The CPU Costing Model: A Few Thoughts Part III (Bang Bang)December 21, 2009

Posted by Richard Foote in CBO, Multiblock Reads, System Statistics.

One of the advantages of system statistics and the CPU costing model is in how the CBO deals with multiblock reads and the impact of the db_file_multiblock_read_count parameter.

When performing a full table (or fast full index) scan, the CBO needs to determine just how many multiblock read operations are likely to be performed so the associated operation can be costed correctly. The CBO simply takes the number of blocks to be read (e.g. for a table, BLOCKS in dba_tables), and divides this by the “effective” multiblock read count.

With the I/O costing model, the effective multiblock read count value is derived directly from the db_file_multiblock_read_count parameter. However, the CBO doesn’t use the actual db_file_multiblock_read_count value as the CBO knows that most multiblock read operations are unlikely to read the maximum possible number of  blocks.

The CBO takes this into consideration and therefore doesn’t simply divide the blocks in the table by the full db_file_multiblock_read_count value. It uses a “fudged” or an “adjusted” multiblock read value in its calculations. Jonathan Lewis discusses this very well in his Cost-Based Oracle Fundamentals book and lists how the adjusted multiblock read value for the following examples is adjusted:

db_file_multiblock_read_count of 8 is adjusted to 6.59
db_file_multiblock_read_count of 16 is adjusted to 10.40
db_file_multiblock_read_count of 32 is adjusted to 16.39
db_file_multiblock_read_count of 64 is adjusted to 25.84

System Statistics and the CPU costing model simplifies and addresses some of these issues.

The first point is that the CBO no longer uses a blind guess based on the size of the db_file_multiblock_read_count parameter, but rather an “educated” guess based on the actual average size of multiblock read operations in the specific database environment. The MBRC system statistic is the actual average size of a multiblock read operation during the period in which system statistics are collected.

Like any average or single generic figure, it will not always be right but at least it’s based on the actual average multiblock read size in the specific database environment. You can’t really ask more from an average figure than to use the actual average figure in your environment.

However, the other important point is that simply increasing the value of the db_file_multiblock_read_count parameter is no longer quite so dangerous as it will not now directly impact the costs of multiblock read operations if indeed increasing the parameter has no such or very minimal effect. Only if the average MBRC is actually changed and reflected in any updated system statistics will the subsequent CBO costs of FTS be adjusted.

Therefore, you can potentially apply one of 2 strategies when setting the db_file_multiblock_read_count parameter with the CPU costing model:

1) Simply set it to the highest supported value in your environment and get the “biggest bang for your buck” while the actual average MBRC is automatically captured by the system statistics and used for costing purposes by the CBO, or

2) Simply leave the parameter unset and let Oracle automatically determine and set the parameter to the largest physical read size supported by your environment (recommended)

Either way, you allow the multiblock reads to be as large and efficient as possible, you allow the actual average multiblock read size to be as large and efficient as possible, but the CBO will only use the actual average multiblock read that is achieved in your database environment when determining the cost of FTS operations. This means multiblock reads will be as large and as efficient as possible but will likely be costed appropriately by the CBO.

Not only are the I/O costs relating to multiblock reads likely to be more accurate in general as a result of using system statistics, but just as importantly, they also take into consideration the actual CPU related costs of such operations and automatically incorporate these into the final costs as well.

To be discussed next.

So in summary, when using CPU costing model, don’t set the db_file_multiblock_read_count parameter, let Oracle determine the maximum optimal size for you and ensure the system statistics are accurate and reflect the actual average MBRC size in your database environment.

The CPU Costing Model – A Few Thoughts Part IIDecember 14, 2009

Posted by Richard Foote in CBO, OPTIMIZER_INDEX_COST_ADJ, Oracle Cost Based Optimizer, Oracle Indexes, System Statistics.

As previously discussed, the formula used by the CBO using the CPU costing model is basically:

(sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O

When determining the multiblock I/Os costs associated with a FTS, the CBO basically:

– determines the number of multiblock operations (blocks in dba_tables / mbrc system statistic)

– then multiplies this out by the average wait time of a multiblock I/O (mreadtim system statistic)

to determine the total wait time for all expected multiblock read operations.

-This total wait time of all multiblock read operations is then finally divided by the average wait time for a single block I/O (sreadtim system statistic) to express the final cost in units of single block I/Os.

Remember these average wait times associated with both single and multiblock I/Os are actual wait times for these events as experienced in the specific database environment and captured during the collection of system statistics.

Therefore, the formula automatically takes into consideration and incorporates into the calculations any discrepancies and differences in wait times between a single and a multiblock I/O.

For example, if a multiblock I/O actually takes (say) 10ms to perform on average, while a single block I/O only takes (say) 5ms to perform on average, then the formula will automatically make the costs of performing multiblock reads to be twice as expensive as the costs associated with performing the single block reads as performed by index scans.

These discrepancies in costs and trying to make a level playing field when comparing the multiblock I/Os costs associated with FTS vs. the single block I/Os costs associated with index scan is precisely what the optimizer_index_cost_adj parameter was designed to addressed.

Rather than treat both types of I/Os as being the same, which is the default behaviour with the I/O costing model, the optimizer_index_cost_adj parameter is designed to adjust the single block read costs to ensure that they are indeed costed as being (say) 1/2 the cost as that of a typical multiblock I/O.

However, when using the CPU costing model, the optimizer_index_cost adj parameter is effectively redundant as the necessary adjustments are already incorporated into the final costs. The total time required to perform a multiblock read operation is divided by the time it takes on average to perform a single block read operation. Using the optimizer_index_cost_adj parameter, although supported and permissible, will likely result in the final CBO costs being adjusted inappropriately as the index related single block I/Os will “double-dip” and potentially reduce both as a result of the system statistic differences between sreadtim and mreadtim and also as a result of the optimizer_index_cost_adj parameter as well.

The system stats are much preferred provided they’re accurate and kept reasonably up to date, because one doesn’t need to “manually” change any associated database parameter.

Not only are the comparative differences between sreadtim and mreadtim maintained, but so are other useful system statistics such as the mbrc statistic to be discussed next.

So in summary, when using the CPU costing model, do not set the optimizer_index_cost_adj parameter at all. Leave it alone, collect representative system statistics and let the system statistics look after the comparative costs between single and multiblock I/Os for you automatically.

The CPU Costing Model: A Few Thoughts Part IDecember 8, 2009

Posted by Richard Foote in CBO, Oracle Cost Based Optimizer, System Statistics.

In the coming days, I’ll post a series of little entries highlighting a specific point in relation to the use of system statistics and the CPU cost model. In my previous post, we looked at how the cost of a FTS is calculated using the CPU costing model and how this generally results in an increase in the associated FTS cost over the I/O costing model.

The table in my demo though had an index with an appalling clustering factor and even though the cost of the FTS increased substantially from 33 to 70, this cost was still significantly less than the large cost associated with using such an inefficient index. So in that specific example, the change of FTS costs as introduced with the CPU costing model made no difference to the final execution plan.

The key point I want to emphasise with this post,  is that by increasing FTS costs as is common with the CPU costing model over the I/O costing model, this can of course potentially result in entirely different execution plans, especially if a candidate index has a reasonable clustering factor. Substantially increasing the associated costs of a FTS can be very significant, especially where the difference in costs between a FTS and an index can be much narrower for well clustered indexes.

In this previous I/O Costing Model example using the BOWIE_STUFF2 table, the index had an excellent clustering factor. However the query resulted in a FTS as the cost of 65 was just a little less than using an associated index:

SQL> select * from bowie_stuff2 where id in (20, 30, 40, 50, 60);
10000 rows selected.

Execution Plan
———————————————————-
Plan hash value: 573616353
——————————————————————
| Id | Operation | Name | Rows | Bytes | Cost |
——————————————————————
| 0 | SELECT STATEMENT | | 10000 | 175K| 65 |
|* 1 | TABLE ACCESS FULL| BOWIE_STUFF2 | 10000 | 175K| 65 |
——————————————————————

Remember, this was “addressed” and the CBO started using the index, by manually adjusting the optimizer_index_cost_adj parameter from its default value to a value of 75 as explained in this previous post on the effects of the optimizer_index_cost_adj parameter.

However, with system stats and the use of the CPU costing model, the extra FTS cost can have a direct impact on the resultant execution plan. Running the same query again, but this time without changing any optimizer parameters and using the same system stats as in my last post on the CPU Costing Model:

PNAME    PVAL1
-------- -----
CPUSPEED  1745
MBRC        10

SQL> select * from bowie_stuff2 where id in (20, 30, 40, 50, 60);

10000 rows selected.

Execution Plan
———————————————————-
Plan hash value: 2964430066

———————————————————————————————–
| Id  | Operation                    | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
———————————————————————————————–
|   0 | SELECT STATEMENT             |                | 10000 |   175K|       69(2)| 00:00:01 |
|   1 |  INLIST ITERATOR             |                |       |       |            |          |
|   2 |   TABLE ACCESS BY INDEX ROWID| BOWIE_STUFF2   | 10000 |   175K|       69(2)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | BOWIE_STUFF2_I | 10000 |       |       25(0)| 00:00:01 |
———————————————————————————————–

We notice that the CBO has now chosen the index automatically, without having to make any changes to the optimizer_index_cost_adj parameter at all.

Previously, the FTS costs were 65. However, the current costs for a FTS are at least:

(ceil(blocks/mbrc) x mreadtime) / sreadtime = (ceil(659/10) x 10) / 5 = 132.

132 is already way greater than the 69 cost associated with using the above index and the 132 cost doesn’t even take into consideration any additional costs related to CPU usage.

So in general, using the CPU costing model will likely increase the associated costs of FTS, making indexes automatically more “attractive” to the CBO as a result. This change alone in how the FTS in particular is costed using the CPU costing model can have a major impact in execution plans chosen by the CBO. This is but one of the key reasons why things can change so dramatically when moving from 9i to 10g where the CPU costing model is the default.

The CBO CPU Costing Model: Indexes vs. Full Table ScansNovember 25, 2009

Posted by Richard Foote in CBO, Full Table Scans, Oracle Indexes, System Statistics.

As previously promised, I thought I might look at how the CBO goes about costing a Full Table Scan (FTS) with system statistics and the CPU costing model, so we can understand why the CBO may have chosen one option over the other.

WARNING: You might need to grab a calculator to help you along 🙂

To illustrate, I’m simply going to use the original BOWIE_STUFF table and index setup I created in my earlier Introduction to the CBO. I’ll however recreate the demo here again from scratch to refresh your memory:

I first create a table that has 100,000 rows, with an indexed “ID” column that has 100 distinct, evenly distributed values. For those mathematically challenged, this means each distinct value will return 1000 rows.

SQL> CREATE TABLE bowie_stuff AS SELECT (mod(rownum,100)+1)*10 id, 'Ziggy Stardust' name FROM dual CONNECT BY LEVEL <= 100000;

Table created.

SQL> CREATE INDEX bowie_stuff_i ON bowie_stuff(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BOWIE_STUFF', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> select blocks from dba_tables where table_name='BOWIE_STUFF';

BLOCKS
------
329

Note the table has 329 blocks. It’s a number I’ll refer to a number of times throughout.

SQL> SELECT index_name, blevel, leaf_blocks, clustering_factor FROM user_indexes WHERE index_name = 'BOWIE_STUFF_I';

INDEX_NAME    BLEVEL
------------- ------
BOWIE_STUFF_I      1

LEAF_BLOCKS CLUSTERING_FACTOR
----------- -----------------
207             32900

Note also that the index has a blevel of 1, 207 leaf blocks and a rather poor Clustering Factor (CF) of 32900, not close at all to the number of blocks in the table. As we’ll see, the CF is so bad that the CBO will choose a FTS over the index.

SQL> show parameter db_file_multi

NAME                          VALUE
----------------------------- -----

Note the db_file_multiblock_read_count is manually set to 16. Relevant when calculating the cost of a FTS with the I/O costing model. Less so with CPU costing in use as we’ll see.

Finally, if we look at the system statistics in place:

SQL> select pname, pval1 from sys.aux_stats\$ where pname in ('SREADTIM', 'MREADTIM', 'MBRC', 'CPUSPEED');

PNAME    PVAL1
-------- -----
CPUSPEED  1745
MBRC        10

All of these values will be relevant when calculating the cost of a FTS with the CPU costing model.

OK, we now have all the information we need to determine how the CBO will treat both index and FTS activities on this table.

Let’s start by refreshing ourselves with how the I/O based CBO model will deal with such a scenario.

SQL> alter session set "_optimizer_cost_model" = io;

Session altered.

OK, let’s just run a simple query that selects data for a specific ID. Remember, there are 100 evenly distributed distinct IDs so this query will return 1% of the data (1000 rows):

SQL> set autotrace traceonly
SQL> SELECT * FROM bowie_stuff WHERE id = 420;

-----------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost  |
-----------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |  1000 | 18000 |       |
|*  1 |  TABLE ACCESS FULL| BOWIE_STUFF |  1000 | 18000 |       |
-----------------------------------------------------------------

Note: the CBO has decided to use a FTS to select the 1% of rows as it has the lowest associated cost.

As previously discussed, the cost of using the index is approximately:

index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

= 1 + (207 x 0.01) + (32900 x 0.01) = 1 + 3 + 329 = 333

Note: the 1 for the blevel can be dropped by the CBO bringing the cost down to 332, to be discussed another time .

As previously discussed, the FTS cost is approximately:

segment header I/O + ceil(table blocks/fudged mbrc value)

Note: for a db_file_multiblock_read_count of 16, the adjusted, “fudged” value used by the CBO is approximately 10.4.

Therefore, for the above example, the FTS cost is calculated as:

= 1 + ceil(329/10.4) = 1 + 32 = 33

33 is significantly less than 333 so the FTS easily wins out.

So how do things change when using System Statistics and the CPU costing model ? How does the CBO calculate the cost of the FTS with the above system statistics in place ?

As I’ve previously discussed, the significant change with the CPU costing model is not so much how it impacts the cost of index accesses but that of the FTS.

Let’s run the same SQL, but this time using the CPU costing model:

SQL> alter session set "_optimizer_cost_model" = cpu;

Session altered.

SQL> SELECT * FROM bowie_stuff WHERE id = 420;

1000 rows selected.

--------------------------------------------------------------------------------
| Id  | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time    |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |             |  1000 | 18000 |     70(5)  | 00:00:01|
|*  1 |  TABLE ACCESS FULL| BOWIE_STUFF |  1000 | 18000 |     70(5)  | 00:00:01|
--------------------------------------------------------------------------------

Note: CBO is still picking the FTS as the CF is truly awful. However the cost of the FTS has increased significantly from 33 to 70, although nowhere near the approximate 333 cost of using the index.

So why has the FTS cost increased and how is this new cost calculated ?

As previously discussed , the CPU costing formula is basically:

(sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O

If we first focus on the single block I/O portion of the formula, the only single block read considered by the CBO during a FTS is the one associated with reading the segment header. Note that the average wait time for a single block read is the SREADTIM system statistic.

If there’s just the one single block I/O, the single block read portion of the formula effectively equates to (1 x sreadtim) / sreadtim, which just equals 1. So 1 is basically added to the cost with regard to reading the segment header as it is with the I/O costing model.

OK, lets next look at the portion of the formula with regard to multiblock I/Os.

The sum of all the multiblock I/Os is calculated in a similar manner as it was with the I/O costing model. It’s simply the number of blocks in the table below the HWM (329 in our example) but this time divided by the MBRC system statistic. Note however the MBRC statistic isn’t some fudged, somewhat arbitrarily set figure based on the db_file_multiblock_read_count parameter, but the actual average size of multiblock I/Os in the specific database environment. Note also that the average wait time for a multiblock read is the MREADTIM system statistic.

So the total wait time for all multiblock reads in the above example is:

sum of all the multiblock I/Os x average wait time for a multiblock I/O = (BLOCKS/MBRC) x MREADTIM = ceil(329/10) x 10 = 330.

This value is then divided by the average wait time for a single block read (the SREADTIM system statistic) to give the overall cost of multiblock reads, but expressed in units of single block I/Os.

The total cost for multiblock I/Os is therefore:

((BLOCKS/MBRC) x MREADTIM)/ SREADTIM = 330/5 = 66.

So the total costs associated for all I/Os is the 1 for reading the segment header plus 66 for all the multiblock reads = 67.

However, the cost of the FTS is 70, not 67. Where does the additional cost of 3 come from ?

Well, that’s the CPU portion of the formula. The CBO has determined that the FTS operation will require ‘x’ number of CPU cycles and this value is then divided by the CPUSPEED to determine how long this CPU activity will take.

This CPU elapsed figure is then again divided by the average wait of a single block read (SREADTIM) to also put the CPU costs in units of single block reads. In this example, the total CPU related costs amount to 3.

Oracle gives us an indication of what the CPU component is in the overall cost within the execution plan via the %CPU value (which is 5 in the above execution plan). The (%CPU) value is the ceil of the overall percentage of CPU costs as calculated by the following formula:

%CPU = ceil(CPU related costs/overall costs)

So in our example, %CPU = ceil(3/70 x 100) = ceil(4.29) = 5% (as indeed displayed in the above execution plan).

Again, all the costs associated with a FTS with the CPU costing model can be derived and make some kinda sense. Providing all the necessary inputs are all actually correct and valid, the CBO will indeed correctly decide to use a FTS over an index when it’s the less expensive option.

I’ll next expand these points and why understanding how these costs are derived can be extremely useful.

You can now put your calculators away 🙂

The CBO CPU Costing Model and Indexes – Another IntroductionSeptember 16, 2009

Posted by Richard Foote in CBO, Index statistics, Oracle Indexes, System Statistics.

I’ve previously discussed some basic concepts and formulas regarding how the CBO derives index related costings via the I/O costing model. Time to look at system statistics and the CPU costing model with specific regard to indexes.

The first point I would make is that the CPU costing model has some significant improvements over the older I/O costing method and I would strongly recommend adopting the CPU costing model where possible. I’ll explain some of these improvements and advantages over the coming posts.

The I/O costing model basically looks at the cost of a specific execution plan in terms of the estimated number of physical I/Os. The less I/Os, the less costly and more efficient the execution plan and the faster the expected response times. There are however a number of short falls with this basic I/O costing strategy in that is doesn’t automatically differentiate between the costs associated with different types of I/Os (eg. between single block and multiblock reads), it doesn’t automatically determine a typical or average size of a multiblock I/O and it doesn’t cost and take into consideration the time and overheads associated with likely CPU resources.

The CPU costing model attempts to take into consideration these previous limitations. It automatically takes into consideration discrepancies between the time to complete an average single block I/O versus a multiblock I/O, automatically determines the average size of a multiblock I/Os so it can more accurately determine the likely number of multiblock I/Os in a FTS and automatically determines the expected CPU time for a specific task.

To use the CBO CPU costing model, one needs to collect system statistics so that CBO has this additional information, based on the actual system hardware characteristics (Note: since 10g, the hidden parameter _optimizer_cost_model defaults to ‘cpu’ and so is used by default). You do this with the dbms_stats.gather_system_stats procedure. You can collect “Noworkload” statistics in which Oracle basically randomly reads the database data files to determine base statistics such as the average I/O seek time, the average I/O transfer speed and the CPU speed. However,  I would rather recommend the collection of “Workload” stats which are based on the actual workload characteristics of your hardware, based on the real load on your system during the time in which system statistics are gathered (in which case Noworkload statistics are simply ignored).

You can gather Workload system statistics by either running:

dbms_stats.gather_system_stats(‘START’) to start the system stats collection process followed by dbms_stats.gather_system_stats(‘STOP’) to stop the collection process over a typical, workload period, or

dbms_stats.gather_system_stats(‘INTERVAL’, interval=> 120) to say collect system workload stats over a 120 minute period.

To view the collected system statistics, query SYS.AUX_STATS\$.

SQL> SELECT pname, pval1 FROM SYS.AUX_STATS\$
WHERE pname IN (‘SREADTIM’, ‘MREADTIM’, ‘MBRC’, ‘CPUSPEED’);

PNAME             PVAL1
------------ ----------
CPUSPEED           1745
MBRC                 10

The four systems statistics that I’ll focus on for now are:

SREADTIM – time in milliseconds for a single block I/O
MREADTIM– time in milliseconds for a multiblock I/O
CPUSPEED – million of CPU cycles per second
MBRC – average number of blocks actually read during multiblock read operations

In the above figures, just note therefore that a multiblock read on average takes approximately double the time of that of a single block read and that on average, 10 blocks are read during a multiblock read operation. This provides the CBO with vital information regarding how to now cost and compare potential execution plans.

The CBO CPU costing model basically looks at the total time required to complete an execution plan by summing:

total time to complete all single block I/O activity +
total time to complete all multiblock I/O activity +
total time to complete all the CPU activity

This can basically be calculated by:

sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second

In theory, this should provide the total response time to service an execution plan. However, to keep the actual “cost” figures calculated by the CBO consistent with the I/O costing model, the CBO divides this total time by the average time for a single block I/O, such that the full formula becomes:

(sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the multiblock I/Os x average wait time for a multiblock I/O +
sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O

The final “cost” figure, even with the CPU costing model, is therefore still expressed in units of single block I/Os. This is an important point …

So how does the CBO determine the value of the various figures within this formula ? Well as we’ll see, the CBO get’s the required information both from the system statistics and from the costing formulas previously discussed with the I/O costing model.

However, for index related access paths, there’s some good news regarding being able to simplify matters somewhat.

The first bit of good news is that from the perspective of an index access path, there are no multiblock I/Os (except for a Fast Full Index Scan) and so the CPU costing formula can be simplified for indexes to remove the multiblock read component and be just:

(sum of all the single block I/Os x average wait time for a single block I/O +
sum of all the required CPU cycles / CPU cycles per second)
/
average wait time for a single block I/O

Secondly, if the CPU component is relatively trivial, it may not be sufficient enough to count towards the final cost. As smaller index scans are likely to consume little CPU, it means the CPU component can also generally be ignored. This reduces the formula for such index scans to just:

(sum of all the single block I/Os x average wait time for a single block I/O)
/
average wait time for a single block I/O

However, the average wait time for a single block I/O now becomes redundant in this simplified equation, reducing the cost to now be just:

sum of all the single block I/Os

Well the next bit of good news for those that have followed my previous blog entries with regard to the CBO and Indexes is that the previous formulas regarding the I/O costing model are still applicable when determining the sum of all expected I/Os. The sum of all the single block I/Os associated with an index scan is still basically:

sum of all the single block I/Os = index blevel + ceil(index selectivity x leaf blocks) + ceil(table selectivity x clustering factor)

In other words, for smaller index scan execution plans, the cost calculated by CBO using the CPU costing model is the same as with the I/O costing model. So no, I wasn’t wasting everyone’s time discussing the various formulas using the older I/O costing model 🙂

If we run the same demo as I ran previously in my initial post regarding the CBO and Indexes where the total cost of the index access plan was 18, but this time using the system statistics listed above:

SQL> alter session set “_optimizer_cost_model” = cpu;

Session altered.

SQL> SELECT * FROM bowie_stuff2 WHERE id = 420;

2000 rows selected.

Execution Plan
———————————————————-
Plan hash value: 134336835

——————————————————————————–
|Id|Operation                   |Name          |Rows|Bytes|Cost (%CPU)|Time    |
——————————————————————————–
| 0|SELECT STATEMENT            |              |2000|36000|   18   (0)|00:00:01|
| 1| TABLE ACCESS BY INDEX ROWID|BOWIE_STUFF2  |2000|36000|   18   (0)|00:00:01|
|*2|  INDEX RANGE SCAN          |BOWIE_STUFF2_I|2000|     |    9   (0)|00:00:01|
——————————————————————————–

We notice that the cost remains exactly the same at 9 for the index range scan component and exactly the same at 18 for the total execution plan when comparing the cost of using the IO costing model vs. the CPU costing model. Introducing system statistics hasn’t changed things for this particular index related execution plan.

And this is a very common observation. As indexes use single block I/Os, as the CBO cost remains as a unit of single block I/Os and as CPU consumption for an index scan is often trivial, the resultant costs for index access paths often remain unchanged with the CPU costing model.

Previously, we looked at how changing parameters such as the optimizer_index_cost_adj impacts the costings of index related execution plans to create a level playing field between index and FTS execution plans.

The key point to make with regard to system statistics and the CPU costing model is that in general, the system statistics and the associated formula will automatically ensure a level playing field. However, unlike the optimizer parameters, it will do so by typically adjusting the associated costs of the FTS (rather than the index accesses) as the true costs and wait times associated with multiblock FTS are calculated, but are divided by and expressed in units of single block reads.

So rather than decreasing the associated costs of an index access path, system statistics and the CPU costing model will typically create a level playing by automatically increasing the associated costs of a FTS as appropriate.

To be discussed further …