jump to navigation

Merry Christmas + A Little Present Of Sorts December 23, 2008

Posted by Richard Foote in Richard's Musings.
7 comments

I just want to wish all my readers a very Merry Christmas and a wonderful, index rebuild free New Year !!

I have a number of topics in the pipeline for 2009, including a look at Index Organized Tables, a discussion on the SYS_OP_COUNTCHG function and how Oracle uses it to calculate the Clustering Factor (not very well in some cases) and the systematic destruction of a number of index related myths that just refuse to go away. So beware all you little myth spreaders out there …   😉

2008 has been a great year for me personally with lots happening. However, the biggest and most wonderful surprise of all was not being awarded an Oracle ACE Director, but last night when I was awarded the The World’s Greatest Business Mind. I’m so very honoured and humbled by this award and I would like thank everyone who made it all possible, especially mum and dad.

I’m going to celebrate by enjoying some lovely mince pies while listening to David Bowie and Bing Crosby as they perform that Christmas Classic “Little Drummer Boy/Peace On Earth“. Enjoy !!

Have a great New Year everyone 🙂

Richard Foote Blog is One Year Old Today !! (Happy Birthday) December 11, 2008

Posted by Richard Foote in Richard's Musings.
9 comments

Wow, hasn’t the year just flown by !!

It doesn’t seem that long ago when I thought it might be a bit of fun to start a blog and write every so often on Oracle related matters, focusing in particular on trying to address some of those common questions regarding indexes that regularly pop up all the time. And of course mention David Bowie, Radiohead and Pink Floyd every so often.

But that was all 98 posts and exactly 12 months ago to the day.

There have been some really good discussions along the way (1057 comments in total), with some people being great and adding real value while others have highlighted what can happen to you when sadly you don’t manage to keep up with the times 😉

The most popular post has been this one regarding how the behaviour of DBMS_STATS changed between 9i and 10g.

The top searched item to get to the blog is strangely enough “Richard Foote”, with “METHOD_OPT” coming in second place. Obviously statistics is a subject matter worth getting into and discussing more deeply.

Perhaps the most controversial post was this very early one on why I thought (and still think) there are so many Oracle myths. Unfortunately, there are still people out there who have little idea, the most dangerous being those who have little idea they have little idea and those who know they have little idea but try and pretend otherwise !! As I mention in the piece, there are still a few dinosaurs out there who desperately cling to the old, incorrect ways else fear losing credibility. Hopefully, some of my writings over the year have helped to clarify a few points, for those of you of course interested in having a few points clarified.

The most fun post was probably my little April Fool’s joke. Some of the reactions were just hysterical !!

It’s been a really hectic year, with various overseas trips, conferences, seminar presentations and a new Radiohead album to contend with. Next year is looking busy as well but the focus will be on less work related travel and more family fun travel time which I’m really looking forward to. However there’ll no doubt be the opportunity to catch up with many of you at a conference or seminar somewhere on this little planet of ours.

Thanks for all your support and contributions in my first year at this blogging lark, there’s still lots more to discuss I assure you. For a start, Radiohead have been nominated for a whole bunch of 2009 Grammy awards …

Function-Based Indexes and Missing Statistics (No Surprises) December 4, 2008

Posted by Richard Foote in Function Based Indexes, Oracle Cost Based Optimizer, Oracle Indexes.
9 comments

Thought I might mention a common trap associated with Function-Based Indexes following a recent question on the OTN Database Forum.

First of all, considered this “simple” little scenario. You have a column in a table that’s a standard VARCHAR2 field and it has 100 distinct values. Therefore, if you select just one value, assuming even distribution of data, you’ll select approximately 1% of the data.

However, this table can potentially allow mixed case fields, with values such as ‘Bowie’, ‘bowie’, ‘BowiE’, etc.

Therefore, you decide to write a query that first converts all the fields to UPPER case such that you can now compare all possible values of a specific name to say ‘BOWIE’.

Remembering you previously had a selectivity of 1%, what’s the selectively of the column after you’ve converted everything to upper case ?

Ummmm, actually, there’s no easy answer to that. It all depends on how many actual different names you have in your data. In theory, you may now have just 1 distinct value or you may have 100 distinct values, you can’t really tell.

However, what if the function did something more complex and converted the name to some numerical value based on the sum of all the characters via some weird formula. In theory, you could have no distinct values in that everything could get converted to a NULL or you could possibly have as many distinct values as there are rows in the table.

Again, Oracle can’t really tell how many rows may get selected based on the determination of such a function.

When a predicate is used with a function call, Oracle has great difficulty in determining the correct selectivity. Therefore when you create a Function-Based Index, which Oracle assumes you may wish to access, Oracle behind the scenes creates a hidden virtual column on the parent table in order to capture the data characteristics of the function so that the CBO can make an accurate determination of the selectivity associated with using the function.

If Oracle knows the low value, the high value and the number of distinct values of the data associated with the function on a column, it can then accurately determine the associated selectivity and cardinality when the function is used on the column (assuming even distribution of data) and hence calculate an accurate cost and determine whether the use of the function-based index is appropriate.

However, and here comes the trap, when a function-based index is created, Oracle will now (since 10g) automatically calculate the statistics associated with the index (such as the blevel, number of leaf blocks, clustering factor, etc.) but it will NOT calculate the statistics associated with the hidden virtual column as these statistics are associated with the parent table, not directly with the index itself.

Here I create a new function-based index on a table:

SQL> create index pink_floyd_upp_tab_name_i on pink_floyd(upper(table_name));

 

Index created.

 

But there are still no statistics on the hidden virtual column created by Oracle behind the scenes.

 

SQL> select column_name, num_distinct, hidden_column, virtual_column from dba_tab_cols where table_name=’PINK_FLOYD’;

 

COLUMN_NAME     NUM_DISTINCT HID VIR
--------------- ------------ --- ---
OWNER                     74 NO  NO
TABLE_NAME              5739 NO  NO
NUM_ROWS                 886 NO  NO
BLOCKS                   152 NO  NO
SYS_NC00005$                 YES YES

Therefore, even after you’ve created the function-based index, Oracle still has no idea on the selectivity associated the function because the necessary virtual column statistics are still missing. Oracle simply takes a “guess” and as with most guesses, it’s quite likely to be wrong which means the selectivity is likely to be wrong which means the costings is likely to be wrong which means the execution plan could very well be wrong as well.

What should you do after you create a function-based index ? You should collect the statistics on the hidden virtual column so that the CBO can accurately determine the cardinality associated with using the function and so make correct decisions regarding the most appropriate execution plan.

The ‘FOR ALL HIDDEN COLUMNS’ method_opt option with DBMS_STATS is one way of collecting just the necessary table statistics on these hidden virtual columns:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> ‘PINK_FLOYD’, estimate_percent=>null, cascade=>true, method_opt=> ‘FOR ALL HIDDEN COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.
SQL> select column_name, num_distinct, hidden_column, virtual_column from dba_tab_cols where table_name=’PINK_FLOYD’;

COLUMN_NAME     NUM_DISTINCT HID VIR
--------------- ------------ --- ---
OWNER                     74 NO  NO
TABLE_NAME              5739 NO  NO
NUM_ROWS                 886 NO  NO
BLOCKS                   152 NO  NO
SYS_NC00005$            5739 YES YES

This demo on missing virtual column statistics asscociated with function-based indexes goes through this whole issue.

My early Christmas wish to Oracle would be for the CBO to at least consider the DISTINCT_KEYS index statistic when determining the correctly selectivity when hidden virtual column statistics are missing on function-based indexes, rather than take it’s 0.4% / 1% guess …

More on virtual columns, hidden and otherwise, in the future …