
Oracle OpenWorld – Day 5 Highlights October 16, 2009

Posted by Richard Foote in Oracle OpenWorld, Richard's Musings.

Started the day desperate again for a coffee at the OTN lounge.

I was looking forward to catching Kevin Closson at the Unconference but unfortunately the session was cancelled at the last minute. So spent some time having a nice chat with Chris Muir and Tim Hall who I had the pleasure to meet when I attended the ODOUG conference in Monterey earlier in the year.

I then attended an excellent roundtable discussion forum with the Oracle folks who write and manage the Oracle Cost Based Optimizer. Some excellent questions (including mine on the use of AUTO with method_opt 😉 ), although I wasn’t totally convinced with the answer given. However, I had to clap and clap rather loudly when they made special mention of not setting the OPTIMIZER_INDEX_COST_ADJ parameter, recommending rather strongly not to touch this parameter at all. Something I’ve been suggesting for quite some time. I also got some encouraging feedback on resolving a couple of key issues I have with statistics gathering:

1) A better way of collecting the clustering factor statistics for indexes so that Oracle can remember more than just the last block it visited

2) The ability to collect extended statistics that span more than one table

Both are being actively looked at. However, the highlight for me was the fact they were playing “The Best Of Bowie” album after the presentation, just for me !!

Also had the pleasure of meeting Christian Antognini in person, whose writings I greatly respect.

My final session of the day and of the conference was a round table discussion with the Oracle Real-World Performance folks, including Graham Wood, who I had the pleasure to meet earlier in the conference, and Greg Rahn, who I also met today for the first time. Again, a very good session with some very good questions and answers, especially the question of when one should rebuild an index and the answer that one should only do so in special cases. The fact that these guys promote this best practice in a large public forum such as this (although the panel picked me out from the audience and suggested that perhaps I should be up there answering this question !!) can only help in getting the message out there.

And so it ends. I had a fantastic time here at Oracle OpenWorld, met some fantastic people and made still more Oracle friends, but I’m now really looking forward to getting home after over 3 weeks on the road.

Comments»

1. josh - October 17, 2009

Hi Richard,

Can you share some more about the clustering factor statistics issue that was discussed?


Richard Foote - October 19, 2009

Hi Josh / Alberto

The way I describe this issue is that the stats gathering function used by Oracle to calculate the clustering factor of an index is somewhat like a goldfish. It has very limited memory and can only remember the last table block visited when deciding whether to increment the CF. Even if Oracle visits a new table block that was visited just before the previous table block, Oracle goes nope, haven’t seen this block before, it’s not the same as the last table block, and so increments the CF.

This is a fundamental flaw in the calculation of the CF, because table blocks only recently visited (and recently visited might be say 5, or 10 or even 20 blocks ago) are almost certainly still cached in memory during an index scan, and yet the CBO considers reading such blocks again as physical I/Os in its calculations.
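The goldfish-like counting described above can be sketched in a few lines of Python (a purely hypothetical illustration, not Oracle’s actual code): walk the table block ids in index-key order and increment the CF whenever the current block differs from the one seen immediately before it.

```python
def clustering_factor(block_ids):
    """Count the CF the way the comment describes: memory of one block only."""
    cf = 0
    last = None
    for block in block_ids:
        if block != last:  # only compares against the immediately previous block
            cf += 1
        last = block
    return cf

# Rows alternating between just two table blocks look terribly clustered,
# even though both blocks would almost certainly stay cached during a scan.
print(clustering_factor([1, 2, 1, 2, 1, 2]))  # 6 -- worst possible
print(clustering_factor([1, 1, 1, 2, 2, 2]))  # 2 -- best possible
```

Both lists touch the same two blocks the same number of times; only the ordering differs, yet the computed CF swings between best and worst case.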

Features designed to prevent contention issues (such as freelists, freelist groups, ASSM, etc.) have the effect of ensuring recently inserted rows go into a small group of table blocks, but they have the nasty side effect of destroying the CF of an index as such rows are no longer sequenced precisely in the table blocks.

My recommendation, which the CBO team have taken on board and are actively investigating, is to simply change the default “history” value of a recently accessed block when calculating the CF from 1 to a more suitable value (such as 5 or 7 or 10 or whatever). This is a relatively simple change to implement (although there are of course large testing considerations), but it means that for many tables that are actually relatively well clustered but currently have a terrible CF, the function will calculate a far more “accurate” CF from the CBO costing perspective.
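The suggested fix can be sketched the same way (again hypothetical, with a made-up `history` parameter): keep a small sliding window of recently seen table blocks instead of remembering only the last one.

```python
from collections import deque

def clustering_factor_history(block_ids, history=1):
    """CF variant that remembers the last `history` table blocks visited."""
    cf = 0
    recent = deque(maxlen=history)  # sliding window of recently seen blocks
    for block in block_ids:
        if block not in recent:  # only a genuinely "new" block bumps the CF
            cf += 1
        recent.append(block)
    return cf

rows = [1, 2, 1, 2, 1, 2]
print(clustering_factor_history(rows, history=1))  # 6 -- today's behaviour
print(clustering_factor_history(rows, history=5))  # 2 -- far more realistic
```

With `history=1` the function degenerates to the current behaviour; a modestly larger window gives the alternating-blocks case a CF that better reflects what an index scan would actually pay.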


Marcin Przepiorowski - October 19, 2009

Hi Richard,

Did they give you any timeline? Oracle 12g?

regards,
Marcin


josh - October 19, 2009

Hi Richard,

Is this something related to the sys_op_countchg function?


Richard Foote - October 25, 2009

Hi Josh

Yes, sys_op_countchg is the function that needs some tweaking IMHO (or at least the default parameters that are used).


2. Alberto Dell'Era - October 17, 2009

Same here 🙂


3. Alberto Dell'Era - October 19, 2009

Thanks Richard – I see the advantage, especially since the run-time engine does not necessarily access the blocks following strictly the index-key order, but might collect the rowids in “batches”, order them, and then access the blocks, thus effectively minimizing the number of (consistent) gets and physical gets as well.

Since the most common SQL operation on indexes is the equality filter predicate, one possible alternative might be to introduce an “equality clustering factor”, that is, to calculate the number of distinct block ids belonging to each key value, and average them. Maybe one set of them: one for the first column, one for the first and the second, etc., to cope better with equality predicates referencing just some of the leading columns.
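The “equality clustering factor” idea could be sketched roughly as follows (hypothetical name and data, purely illustrative): for each key value, count the distinct table blocks its rows live in, then average across key values.

```python
from collections import defaultdict

def equality_clustering_factor(key_block_pairs):
    """Average number of distinct table blocks per index key value."""
    blocks_per_key = defaultdict(set)
    for key, block in key_block_pairs:
        blocks_per_key[key].add(block)  # distinct blocks for this key
    return sum(len(b) for b in blocks_per_key.values()) / len(blocks_per_key)

# key "a" spans 2 blocks, key "b" spans 1 -> average 1.5 block visits
# for a typical equality predicate on this column.
pairs = [("a", 1), ("a", 1), ("a", 2), ("b", 3), ("b", 3)]
print(equality_clustering_factor(pairs))  # 1.5
```

Such a figure would estimate the block visits for an equality lookup directly, rather than deriving them from a whole-index ordering statistic.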

Reasoning such as this can be found in Jonathan Lewis’ “Cost Based Oracle”, in the chapter about the clustering factor. I don’t have the book handy so I can’t check whether he said the very same thing or something similar; for sure I remember that he suggested collecting a set of clustering factors (and I remember him referencing sys_op_countchg, by the way).


Richard Foote - October 25, 2009

Hi Alberto

Good thoughts. There are a number of possible alternatives like those you’ve mentioned, or simply pick a historical value based on the average keys per leaf block (as Oracle can sort all the rowids in a leaf block and determine the average distinct table block visits per leaf block), or simpler still just pick a single value higher than 1 such as 5 or 10 or 42 or whatever, which would likely provide a better, more realistic CF value as table blocks referenced within these times during an index scan are likely to be cached in most databases.
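The leaf-block based alternative mentioned here might look something like this (a hypothetical sketch with made-up names): for each leaf block, take the table block ids referenced by its rowids, count the distinct blocks, and average across leaf blocks to derive a sensible “history” value.

```python
def avg_distinct_blocks_per_leaf(leaf_blocks):
    """Average count of distinct table blocks referenced per index leaf block.

    `leaf_blocks` is a list where each entry is the list of table block ids
    pointed to by the rowids stored in one leaf block.
    """
    return sum(len(set(blocks)) for blocks in leaf_blocks) / len(leaf_blocks)

# Two leaf blocks, each touching 2 distinct table blocks -> average 2.0,
# a candidate "history" size for the CF calculation.
leaves = [[1, 1, 2, 2], [2, 3, 3, 3]]
print(avg_distinct_blocks_per_leaf(leaves))  # 2.0
```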


4. Richard Foote - October 25, 2009

Hi Marcin

No timeline other than it’s an issue they’re aware of and are pursuing.


