jump to navigation

How To Really Create A Function-Based Index (A Day In The Life) July 12, 2009

Posted by Richard Foote in Function Based Indexes, Humour, Oracle Indexes.
21 comments

“I heard the news today, oh boy” (John Lennon 1967).

It was with some amusement that someone pointed out Don Burleson’s latest 11 July 2009 “Oracle News” piece : “How to index on a SQL CASE Statement” (the article it links to is dated 29 December 2008).

It must obviously be a very very slow news day as the ability to create a function-based index has been around for a long time, the SQL CASE “statement” (expression actually) being around since at least Oracle8i days. So it’s not exactly “new” news.

However, what’s particularly amusing in this little “news” piece, is the SQL statement used to demonstrate how to create an index on a SQL CASE statement:

create index
case_index as
(case SOURCE_TRAN
when ‘PO’ then PO_ID
when ‘VOUCHER’ then voucher_id
ELSE journal_id
end = ‘0000000001’
END);

No, this is not some new weird piece of SQL. The problem of course is that this most definitely is NOT how to create a function-based index with a CASE statement. It’s fundamentally syntactically totally wrong, not just in one place, but in several places.

If you were to just TEST this SQL, you would get the following error:

SQL> create table case_tab (source_tran varchar2(5), po_id varchar2(5), voucher_id varchar2(5), journal_id varchar2(5));

Table created.

SQL> create index case_index as (case source_tran when ‘PO’ then po_id when ‘VOUCHER’ then voucher_id else journal_id end = ‘00001’ end);
create index case_index as (case source_tran when ‘PO’ then po_id when ‘VOUCHER’ then voucher_id else journal_id end = ‘00001’ end)
                        *
ERROR at line 1:
ORA-00969: missing ON keyword

 

You see, to create an index, you must at the very least need to specify the table that is being indexed …

Additionally, the AS statement is incorrect and the CASE statement itself is wrong and incorrectly specifies an additional END condition. There’s a lot wrong here.

If you really want to create such an index, then you need the SQL syntax to at least look something like this:

SQL> create index case_tab_idx on case_tab (case source_tran when ‘PO’ then po_id when ‘VOUCHER’ then voucher_id else journal_id end);

Index created.

 

The “News” article then suggests you need to collect statistics on the index:

EXEC DBMS_STATS.gather_index_stats(‘OWNER’, ‘CASE_INDEX’);

However, this advice is again not quite correct. Remember, this “News” piece is only a day or two old and Oracle has been automatically collecting statistics on indexes as they’re created (or rebuilt) since Oracle 10g. Providing the table already has statistics, Oracle will automatically collect statistics on the index as it’s being created. There is therefore no need to collect index statistics again after the index has been created.

But but but, what does not have statistics and what really should be collected after you create any new function-based index are the statistics on the virtual column that is created on the parent table as part of creating any function-based index. This is vitally important, else the function-index may not be used by the CBO as expected. For example:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> ‘CASE_TAB’, estimate_percent=>null, cascade=>true, method_opt=> ‘FOR ALL HIDDEN COLUMNS SIZE 1’);

I discuss this issue in some detail the post “Function-Based Indexes And Missing Statistics“.

So some advice in summary:

1) If you’re going to create a function-based index, get the syntax correct. It helps,  it really does.

2) Ensure you collect the associated statistics on the table virtual column that is created automatically behind the scenes.

3) Don’t believe everything you read or hear, even if it’s on the “news” …  🙂

 

UPDATE: 13 July 2009: The “News” article has now been amended to expand the discussion on statistics and reference the dbms_stats.gather_table_stats procedure for collecting hidden column statistics on the function-based index virtual columns as I discussed. The article is still dated 29 December 2008.

I also notice the newly added gathering hidden column statistics example used in both the news article and in the referenced “important notes” is simply a cut ‘n’ paste from this very blog post. I can easily tell because:

  1. It uses the same table_name as I made up in my example (CASE_TAB)
  2. It has the same error as in my original version with the missing quote when defining the table_name 😉