jump to navigation

How To Really Create A Function-Based Index (A Day In The Life) July 12, 2009

Posted by Richard Foote in Function Based Indexes, Humour, Oracle Indexes.
trackback

“I heard the news today, oh boy” (John Lennon 1967).

It was with some amusement that someone pointed out Don Burleson’s latest 11 July 2009 “Oracle News” piece : “How to index on a SQL CASE Statement” (the article it links to is dated 29 December 2008).

It must obviously be a very very slow news day as the ability to create a function-based index has been around for a long time, the SQL CASE “statement” (expression actually) being around since at least Oracle8i days. So it’s not exactly “new” news.

However, what’s particularly amusing in this little “news” piece, is the SQL statement used to demonstrate how to create an index on a SQL CASE statement:

create index
case_index as
(case SOURCE_TRAN
when ‘PO’ then PO_ID
when ‘VOUCHER’ then voucher_id
ELSE journal_id
end = ‘0000000001’
END);

No, this is not some new weird piece of SQL. The problem of course is that this most definitely is NOT how to create a function-based index with a CASE statement. It’s fundamentally syntactically totally wrong, not just in one place, but in several places.

If you were to just TEST this SQL, you would get the following error:

SQL> create table case_tab (source_tran varchar2(5), po_id varchar2(5), voucher_id varchar2(5), journal_id varchar2(5));

Table created.

SQL> create index case_index as (case source_tran when ‘PO’ then po_id when ‘VOUCHER’ then voucher_id else journal_id end = ‘00001’ end);
create index case_index as (case source_tran when ‘PO’ then po_id when ‘VOUCHER’ then voucher_id else journal_id end = ‘00001’ end)
                        *
ERROR at line 1:
ORA-00969: missing ON keyword

 

You see, to create an index, you must at the very least need to specify the table that is being indexed …

Additionally, the AS statement is incorrect and the CASE statement itself is wrong and incorrectly specifies an additional END condition. There’s a lot wrong here.

If you really want to create such an index, then you need the SQL syntax to at least look something like this:

SQL> create index case_tab_idx on case_tab (case source_tran when ‘PO’ then po_id when ‘VOUCHER’ then voucher_id else journal_id end);

Index created.

 

The “News” article then suggests you need to collect statistics on the index:

EXEC DBMS_STATS.gather_index_stats(‘OWNER’, ‘CASE_INDEX’);

However, this advice is again not quite correct. Remember, this “News” piece is only a day or two old and Oracle has been automatically collecting statistics on indexes as they’re created (or rebuilt) since Oracle 10g. Providing the table already has statistics, Oracle will automatically collect statistics on the index as it’s being created. There is therefore no need to collect index statistics again after the index has been created.

But but but, what does not have statistics and what really should be collected after you create any new function-based index are the statistics on the virtual column that is created on the parent table as part of creating any function-based index. This is vitally important, else the function-index may not be used by the CBO as expected. For example:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> ‘CASE_TAB’, estimate_percent=>null, cascade=>true, method_opt=> ‘FOR ALL HIDDEN COLUMNS SIZE 1’);

I discuss this issue in some detail the post “Function-Based Indexes And Missing Statistics“.

So some advice in summary:

1) If you’re going to create a function-based index, get the syntax correct. It helps,  it really does.

2) Ensure you collect the associated statistics on the table virtual column that is created automatically behind the scenes.

3) Don’t believe everything you read or hear, even if it’s on the “news” …  🙂

 

UPDATE: 13 July 2009: The “News” article has now been amended to expand the discussion on statistics and reference the dbms_stats.gather_table_stats procedure for collecting hidden column statistics on the function-based index virtual columns as I discussed. The article is still dated 29 December 2008.

I also notice the newly added gathering hidden column statistics example used in both the news article and in the referenced “important notes” is simply a cut ‘n’ paste from this very blog post. I can easily tell because:

  1. It uses the same table_name as I made up in my example (CASE_TAB)
  2. It has the same error as in my original version with the missing quote when defining the table_name 😉

Comments»

1. Rob van Wijk - July 12, 2009

Very typical 🙂

Also, there is no such thing as a SQL CASE statement. SQL only knows CASE expressions, PL/SQL has a CASE statement.

Regards,
Rob.

Like

Richard Foote - July 14, 2009

Hi Rob

True Indeed 🙂

Like

2. Tony - July 12, 2009

If they guy can’t even proof his own article then why would I want him “tuning” my production system. How Don has risen to such elite status in the Oracle community is a marketing wonder.

Like

Boneist - July 13, 2009

Really? He has “elite” status?!

*blinks*

Not in my corner, he doesn’t! (And what status he does have would be based on his very own words… unless, of course, he followed his own rule and it’s an imposter that (used to) post on the OTN forums…. *{;-)

Like

3. Asif Momen - July 13, 2009

Hi,

Yea…everything that’s printed/published isn’t really always true. Here’s another case:

http://www.remote-dba.net/oracle_10g_tuning/t_oracle10g_dynamic_sampling.htm

The document is undated and has two points mentioned incorrectly.

1) “By default, Oracle will sample 32 random blocks.”

2) “It is also important to know that dynamic sampling does not occur on tables that contain less than 32 blocks.”

You will find why these points are incorrect here:

http://momendba.blogspot.com/2008/11/dynamic-sampling-myths-dispelled.html

Regards

Asif Momen

Like

Richard Foote - July 14, 2009

Hi Asif

Oh, there’s a few … 😦

Like

4. Malcolm Chalmers - July 13, 2009

Thank you. Not only have you have pointed out that even ‘experts’ can make mistakes, but you have explained clearly and completely why it is wrong.
I read Don’s blog quite often and always take what he says with a grain of salt. As I do everything on the internet. But I will be double checking his code even more so now. 🙂

Like

Richard Foote - July 14, 2009

Hi Malcolm

Everyone makes mistakes, everyone.

However, it’s how people repond to making a mistake, it’s how people react, it’s how people learn and it’s how people can make the mistake a positive step in furthering self-improvement that’s important and reveiling about the person.

That and the frequency which they make mistakes of course 😉

It’s one way I determine who really deserves the title “expert” or not, despite how much one might make the self claim …

Like

5. Bernard Polarski - July 13, 2009

Articles terminated with “Don Burleson”: it is not a signature, it is a warning. One should reconfigure its browser for pages where the String Burleson appears so that a pop up triggers and signal ‘Site not reliable’. On Google it is a pleague as the suspect seems to pays so that its crappy numerous sites appear first. A pleague

Like

6. PdV - July 14, 2009

Richard,

Thx for the heads up on the virtual/hidden column item, I had overseen that.
And a good piece overall.

As for the google becoming the “manual” of the DBA, well, those who use it will just gradually have to discover what works and what doesnt.

I find it a sign of dba-maturity when someone has seen through the more infamous top-result-sites that shout “buy my stampeding book”.

And vice-versa, if someone has certain books on their desk, or uses certain sites/printouts, I thread carefully.
And Prove stuff with scripts. Think for yourself. And Prove.

The brits have a way of putting it: To Assume makes an A-s-s out of U and Me.

Like

Richard Foote - July 14, 2009

Hi Piet

The brits also have a way of getting draws in cricket when they don’t deserve to 🙂

Like

7. Guy Harrison - July 14, 2009

In 11g, I think there are two other ways to get statistics collected for indexed expressions:

1) Collect extended statistics directly on the expression. So for instance, if we had a function SALES_CATEGORY, we might do this:

DBMS_STATS.gather_table_stats (ownname => USER,
tabname => ‘SALES’,
method_opt => ‘FOR ALL COLUMNS FOR COLUMNS
(sale_category(amount_sold))’ );

2) Create a virtual column on the expression, then index that column. So for the same example as above we might create the following virtual column, then index the column and collect stats as usual:

SQL> ALTER TABLE SALES ADD sales_category GENERATED
2 ALWAYS AS (sale_category(amount_sold));

I think I like the first method better, because the statistics will still exist even if the index is dropped and – unlike the second approach – it doesn’t change the logical structure of the table.

Like

Richard Foote - July 14, 2009

Hi Guy

Yes indeed, 11g has some nice features with regard to extended stats, visible virtual columns and the such and their use with indexed expressions.

Extended stats that span multiple columns is a nice touch too, although Oracle’s ability to now use the index stats to determine better cardinality estimates sometimes negates the necessity of some of these stats.

It would be really nice if Oracle could extend multi-column stats to span across tables …

Like

Mohamed Houri - July 17, 2009

One of the most important qualities an expert has is its modesty and its availability for others. In addition to that, he should absolutely possess a good human education what ever his technical education is. A very long time ago, an Egyptian poet expressed this fact in a very interesting poem from which I have extracted and translated the most significant “phrase”
“Don’t think that a science will suffice alone, if the scientist does not enrich his science with good human education”

That is said, it’s absolutely no shame to make mistakes; we all know that people who don’t make mistakes are people who don’t work. And if we start working we are subject to errors. It is through the errors we make that we learn and acquire a practical knowledge.

The problem resides in the superiority the auto proclaimed experts show. They have no consideration for people contradicting their claims and myths instead of discussing with them and challenging their ideas.

Unfortunately, very often, such a kind of auto proclaimed experts has good communication skills so that they arrive to gather many persons behind their myths.

And more seriously, the world became very large; internet has suppressed the information border so that we are all subject to a full bench of wrong and false information and we all have to furnish efforts in order to distinguish the bad from the good one.

Hopefully, Richard, Jonathan Lewis, Tom Kyte and many others have saved us the time effort that we would have spent in distinguishing the Oracle material they are giving us. Without any doubt they represent the real expert one can take as an example to follow.

Like

8. Richard Foote - July 20, 2009

Hi Mohamed

Thank you for your kind comparisions.

Unfortunately, yes we have much “information polution” as I discussed in this post:

Why are there so many Oracle related myths ? The Inconvenient Truth

There are many qualities in a true expert. Being wrong on a continual basis is certainly not one of them 😉

Like

9. Andrew - March 22, 2010

I do not need the internet to be accurate. That would be a silly expectation.

What I do like about the internet and BC is that I can search for a vague concept and find interesting key words.

Perhaps I did not know that function based indexes existed. If I read about it after a search that ended up at BC I might go give it a try. I will probably go get Oracle published doc’s on how to do it right and do some benchmarking.

If the only one saying you are an expert is yourself it is nice that you have convinced at least one person!

This is true on the internet as well as brick and mortar office buildings. Regardless of age, young internet users will eventually figure it out. For those that can’t I am sure they have a lot more problems in life than BC’s interesting PR campaign.

At the moment I am trying to figure out how to tune a recursive/hierarchy model I am creating. I like this article you wrote it is giving me a few ideas.

Thank you

Like

Richard Foote - March 24, 2010

Hi Andrew

Yes, expecting the internet to be accurate is certainly a big ask 🙂

If someone claims to be an “expert” with 25 years+ of experience, one might have shall we say certain “expectations”.

I’ve stopped having any such expectations a long time ago …

Like

10. Brian TKatch - March 22, 2010

Richard, do you provide the proofread-and-fix service to Don for free?

Like

Andy - March 23, 2010

@Brian,

I’ve asked Richard the same question before 😉 The latest news is that Richard was given an Amazon voucher by someone from BC. So, naturally, I wonder if Richard has exchanged that voucher for BC’s Performance Tuning book :p

Like

Richard Foote - March 24, 2010

Hi Brian

All free and delivered with a smile 🙂

Note BTW, this article is still hopeless wrong at the time of writing this comment.

I only have time to be bothered with just the occasional, generally index related gaff.

When someone discusses for example how to invoke a nested loop join but only has the one table in the FROM clause in all the examples given:

http://www.dba-oracle.com/t_tuning_sql_nested_loops_joins.htm

well, there’s only so much you can do to help, other than wipe the tears away 😦

Like

11. Automatic Indexing: JSON Expressions Part II (“Without You I’m Nothing” | Richard Foote's Oracle Blog - April 14, 2022

[…] the number of rows (2 million) in the table. As I’ve discussed previously (as in this rather humourous post), the issue here is that the CBO has no idea what the expected cardinality might be, as the output […]

Like


Leave a comment