The CPU Costing Model: A Few Thoughts Part V (Reality) January 13, 2010Posted by Richard Foote in CBO, Multiblock Reads, Oracle Cost Based Optimizer, Richard's Musings, System Statistics.
There’s plenty more I could talk about regarding the CBO CPU costing model and system statistics but I’ll make this my final little comment on this subject for now.
As previously discussed, the CPU costing model basically takes the time it takes to perform the all necessary I/O related activities and all the time it takes to perform all necessary CPU related activities and adds them together to get the overall time to complete a task. The CBO then takes this total and divides it by the average time to perform a single block I/O so that it expresses the overall costs in units of single block I/Os.
There are two advantages with expressing CBO costs in this manner.
Firstly, it makes the move from the old I/O costing model a little easier in that the “units” of cost under both CBO costing models is very similar.
With the I/O costing model, the unit of cost was also basically the number of I/Os. It’s just that the CBO made no (automatic) distinction between the I/O costs associated with single and multiblock reads. The cost was simply the expected total number of I/Os for a given execution plan, with single block and multiblock I/Os being consider the same (unless the optimiser_index_cost_adj parameter kicked in).
With the CPU costing modelling, the costs are expressed specifically in units of single block I/Os. However, the CBO automatically takes into consideration and differentiates the relative costs associated with multiblock I/Os (and CPU operations) and incorporates them automatically into the final cost.
The other nice advantage is that one can use the actual cost values as an indication of how long an operation or execution plan is likely to take. The overall execution times of the plan are divided by the average time of a single block I/O when using the CPU costing formula. Therefore by multiplying these cost values out again by the average time of a single block I/O (SREADTIM system statistic), one can have an indicative idea of the overall expected execution time.
The overall execution times as estimated by the CBO using the CPU costing model is therefore basically = cost of execution plan multiplied by SREADTIM system statistic.
Using my previous example with the FTS where the overall cost of the execution plan was 70, and the SREADTIM system statistic was 5:
the overall execution time as estimated by the CBO is approximately 70 x 5 = 350 ms.
Now this of course is only an indicative value. As all system related statistics are simply averages, there could obviously be discrepancies with how long specific I/Os take to actually perform, the size and number of specific multiblock read operations, etc. There may also be caching characteristics of objects that may influence the actual number of physical reads and associated wait times, it doesn’t take into consideration time taken to actually return data to the “client”, etc. etc. etc.
However, it provides one with a rough “ballpark figure”. If the actual executions times in the above example were (say) 20 seconds, then it’s a strong indication that the CBO may have got it wrong, that it may have calculated the wrong cost and maybe as a result the wrong execution plan. Somewhere, something such as the segment statistics, the system statistics, optimizer parameters, etc. may be inaccurate and is causing the CBO to get its costings incorrect.
The CBO cost value doesn’t compare well to reality and so is perhaps worthy of further investigation.
The cost values associated with CPU costing model is not some random, ambiguous, mysterious number but a value that can often be derived and which can be most useful in determining and resolving problematic SQL statements and execution plans.
The CPU Costing Model: A Few Thoughts Part III (Bang Bang) December 21, 2009Posted by Richard Foote in CBO, Multiblock Reads, System Statistics.
add a comment
One of the advantages of system statistics and the CPU costing model is in how the CBO deals with multiblock reads and the impact of the db_file_multiblock_read_count parameter.
When performing a full table (or fast full index) scan, the CBO needs to determine just how many multiblock read operations are likely to be performed so the associated operation can be costed correctly. The CBO simply takes the number of blocks to be read (e.g. for a table, BLOCKS in dba_tables), and divides this by the “effective” multiblock read count.
With the I/O costing model, the effective multiblock read count value is derived directly from the db_file_multiblock_read_count parameter. However, the CBO doesn’t use the actual db_file_multiblock_read_count value as the CBO knows that most multiblock read operations are unlikely to read the maximum possible number of blocks.
This is due to a number of factors. For example a multiblock read operation can’t span across extent boundaries. The more common reason however is that Oracle will not read a block into the buffer cache that is already cached and so will split a multiblock read operation at the point where a block is already in cache. Therefore, if the db_file_multiblock_read_count is set to (say) 16 blocks, many of the actual multiblock read operations may read something less than the maximum 16 blocks if some of the blocks in the table are already cached. So if within the next 16 possible blocks to be read, the 6th block is already in cache, the multiblock read will only consist of the 5 uncached blocks and the next mutliblock read will commence from the 7th block.
The CBO takes this into consideration and therefore doesn’t simply divide the blocks in the table by the full db_file_multiblock_read_count value. It uses a “fudged” or an “adjusted” multiblock read value in its calculations. Jonathan Lewis discusses this very well in his Cost-Based Oracle Fundamentals book and lists how the adjusted multiblock read value for the following examples is adjusted:
db_file_multiblock_read_count of 8 is adjusted to 6.59
db_file_multiblock_read_count of 16 is adjusted to 10.40
db_file_multiblock_read_count of 32 is adjusted to 16.39
db_file_multiblock_read_count of 64 is adjusted to 25.84
Now these adjusted values are nothing more than globally implemented “guesses”, adjusted with the assumption that the larger the multiblock read operation, the more likely a block will actually be cached and the less likely the maximum possible multiblock read is going to be actually performed. Not only are these adjusted values simply guesses, but they’re derived directly from the db_file_multiblock_read_count parameter. By increasing this parameter, you directly impact the actual costs associated with FTS as the value by which to divide the number of blocks in the segment is impacted. Adjusting this parameter needs to be very carefully considered as it not only changes the number of blocks that might be read per multiblock read but the associated CBO costs as well. Simply increasing the db_file_multiblock_read_count blindly might lead Oracle to start favouring FTS operations inappropriately, as the increase might not actually result in fewer, more efficient multiblock reads but it will result in lower FTS costs.
System Statistics and the CPU costing model simplifies and addresses some of these issues.
The first point is that the CBO no longer uses a blind guess based on the size of the db_file_multiblock_read_count parameter, but rather an “educated” guess based on the actual average size of multiblock read operations in the specific database environment. The MBRC system statistic is the actual average size of a multiblock read operation during the period in which system statistics are collected.
Like any average or single generic figure, it will not always be right but at least it’s based on the actual average multiblock read size in the specific database environment. You can’t really ask more from an average figure than to use the actual average figure in your environment.
However, the other important point is that simply increasing the value of the db_file_multiblock_read_count parameter is no longer quite so dangerous as it will not now directly impact the costs of multiblock read operations if indeed increasing the parameter has no such or very minimal effect. Only if the average MBRC is actually changed and reflected in any updated system statistics will the subsequent CBO costs of FTS be adjusted.
Therefore, you can potentially apply one of 2 strategies when setting the db_file_multiblock_read_count parameter with the CPU costing model:
1) Simply set it to the highest supported value in your environment and get the “biggest bang for your buck” while the actual average MBRC is automatically captured by the system statistics and used for costing purposes by the CBO, or
2) Simply leave the parameter unset and let Oracle automatically determine and set the parameter to the largest physical read size supported by your environment (recommended)
Either way, you allow the multiblock reads to be as large and efficient as possible, you allow the actual average multiblock read size to be as large and efficient as possible, but the CBO will only use the actual average multiblock read that is achieved in your database environment when determining the cost of FTS operations. This means multiblock reads will be as large and as efficient as possible but will likely be costed appropriately by the CBO.
Not only are the I/O costs relating to multiblock reads likely to be more accurate in general as a result of using system statistics, but just as importantly, they also take into consideration the actual CPU related costs of such operations and automatically incorporate these into the final costs as well.
To be discussed next.
So in summary, when using CPU costing model, don’t set the db_file_multiblock_read_count parameter, let Oracle determine the maximum optimal size for you and ensure the system statistics are accurate and reflect the actual average MBRC size in your database environment.