
Index Skip Scan – Does Index Column Order Matter Any More ? (Warning Sign) March 10, 2008

Posted by Richard Foote in Index Access Path, Index Skip Scan, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning.
35 comments

I’ve already written a few posts regarding concatenated indexes and things to consider (and not consider) when deciding the column order of a concatenated (composite) index.

A comment I see from time to time is that the whole question of column order within an index is now somewhat redundant anyways as Oracle since 9i has introduced the Index Skip Scan access path. Prior to 9i, if the leading column of an index wasn’t specified in a predicate, the index was effectively ignored by the CBO. However, if the leading column isn’t referenced now, Oracle can use the index anyways via an Index Skip Scan access path.

So the column order in a concatenated index doesn’t matter that much now, right ?

Well, not quite ….

An Index Skip Scan can only actually be used and considered by the CBO in very specific scenarios and is often an indicator that there's either a missing index or an existing index with its columns in the wrong order.

If the leading column of an index is missing, it basically means the values in subsequently referenced columns in the index can potentially appear anywhere within the index structure, as the index entries are sorted primarily on the leading indexed column. So if we have column A with 100,000 distinct values and column B with 100,000 distinct values and an index based on (A,B), all index entries are sorted primarily on column A and, within a specific value of column A, sorted by column B. Therefore, if we attempt a search on just column B = 42, these values could potentially appear anywhere within the index structure and so the index generally can't be used effectively.

However, what if the leading column actually contained very few distinct values ? Yes, the subsequent column(s) values could appear anywhere within the index structure BUT if these subsequent columns have relatively high cardinality, once we’ve referenced the required index entries for a specific occurrence of a leading column value, we can ignore all subsequent index row entries with the same leading column value. If the leading column has few distinct values, this means we can potentially “skip” several/many leaf blocks until the leading column value changes again, where we can again “probe” the index looking for the subsequent indexed column values of interest.

So if we have a leading column with few distinct values, we may be able to use the index “relatively” efficiently by probing the index as many times as we have distinct leading column values.

On the other hand, if the leading column has relatively high cardinality then an Index Skip Scan is not a viable option. An index can generally store many hundreds of index entries per index leaf block, depending on the block size and the average size of the index row entries of course. So if the leading column were to change just once on average within the subsequent index leaf block, Oracle would be forced to scan the next index leaf block anyways as it may contain the required index row entry.

For Oracle to be able to “skip” an index leaf block, the leaf block must contain nothing but the same leading column value as last changed in a preceding leaf block. Hopefully, there are many leaf blocks where the leading column value doesn’t change and so hopefully there are many leaf blocks that can potentially be “skipped”.

Therefore, the cardinality of the leading column is crucial for an Index Skip Scan to be viable. In the example above where we had 100,000 distinct values for columns A and B, unless the table is massive, it’s unlikely an Index Skip Scan will be viable regardless of which column is the first column in the index. However, if column B only had 10 distinct values, then an index based on (B,A) may very well be able to use an Index Skip Scan whereas an index on (A,B) would not.
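As a rough sketch of the kind of scenario involved (this isn't the linked demo; the table, data and index names below are simply made up for illustration):

CREATE TABLE skip_scan_test AS
SELECT MOD(ROWNUM,10) AS b,      -- just 10 distinct values: a low cardinality leading column
       ROWNUM         AS a,      -- 100,000 distinct values
       'DAVID BOWIE'  AS name
FROM dual CONNECT BY LEVEL <= 100000;

CREATE INDEX skip_scan_test_b_a_i ON skip_scan_test(b, a);

EXEC dbms_stats.gather_table_stats(ownname=>NULL, tabname=>'SKIP_SCAN_TEST');

-- Leading column b isn't referenced, but with only 10 distinct values of b
-- the CBO may well cost and use an INDEX SKIP SCAN of skip_scan_test_b_a_i
SELECT * FROM skip_scan_test WHERE a = 42;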

Note though that an Index Skip Scan must probe the index at least as many times as there are distinct values of the leading column. This will not be as efficient as an index that only requires the one index probe. Therefore although a query with a predicate based on A=42 could use an Index Skip Scan with an index on (B,A) assuming column B had few distinct values, an index on (A,B) or (A) would be more efficient as it would only require the one index probe.

However, if the performance of index (B,A) were “good enough” and/or a search on just A=42 was uncommon, then the index on (B,A) may be quite adequate and an index on (A,B) may be unnecessary. The index on (B,A) would also be able to handle queries based on columns A and B and queries based on just column B (providing the CBO determined the selectivity acceptable, which it might for unevenly distributed rare values of column B).

See this Index Skip Scan demo to see when it all may prove useful.

No, an Index Skip Scan doesn't mean we don't need to consider the column order of an index. If anything, it's something else that needs to be considered and, along with index compression, is another reason why low cardinality leading index columns have advantages.

Introduction To Reverse Key Indexes: Part II (Another Myth Bites The Dust) January 16, 2008

Posted by Richard Foote in Index Access Path, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Myths, Performance Tuning, Reverse Key Indexes.
27 comments

In Part I, we saw how with Reverse Key Indexes, Oracle will basically take the indexed value, reverse it and then index the reversed value. As a result, data that would ordinarily be logically sorted within an index structure will now be randomly distributed. This therefore negates the use of Reverse Key Indexes with range predicates, with the CBO not even considering them in its costings.

This is all the information we need to dispel a rather bizarre suggestion that has been doing the rounds regarding using Reverse Key Indexes to deal with LIKE predicates that have a leading wildcard. For example, such a suggestion can be found here and within an OTN discussion here.

Basically the suggestion is to:

1) Create a Reverse Key Index on the column to be searched with a LIKE predicate having a leading wildcard (such as % or _).

2) Instead of writing the query as usual, e.g.

SELECT * FROM bowie_table WHERE name LIKE '%BOWIE'

rewrite the query programmatically such as:

SELECT * FROM bowie_table WHERE name LIKE 'EIWOB%';

by reversing the required text and now having the wildcard at the end.

The Reverse Key Index stores the data in a reversed format (so 'BOWIE' is stored as 'EIWOB'), so Oracle should be able to use the Reverse Key Index to efficiently find all rows that start with 'EIWOB' as they're all grouped together within the index structure, right ?

Ummm, wrong.

Ignoring the fact the example in the above link is somewhat meaningless as it uses both a leading and a trailing wildcard in both queries, and so assuming the first query only has a leading wildcard and the second query only has a trailing wildcard, this suggested use of a Reverse Key Index cannot possibly work on any current version of Oracle.

There are a few fundamental problems with this suggestion but in summary not only will it not work but worse, it will actually return the wrong results.

The suggestion is correct insofar as using a normal index to return data with a LIKE predicate containing a leading wildcard does indeed negate the use of an index range scan; the CBO doesn't even consider it. An index hint may push Oracle to use a Full Index Scan, but not an Index Range Scan.

However, using a Reverse Key Index to solve this is unfortunately doomed to failure for two very simple reasons.

One, as we have already seen, Oracle also ignores Index Range Scans for Reverse Key Indexes with range predicates and unfortunately, a query such as WHERE name LIKE 'EIWOB%' is a range predicate. The CBO simply doesn't consider the Reverse Key Index in its deliberations.

Two, Oracle of course has no possible way of knowing that when you say LIKE 'EIWOB%', what you really mean is search for all records ending with BOWIE, LIKE '%BOWIE'. How can Oracle possibly know this ? Even if it could use the index (which it can't), Oracle would simply reverse the search string anyways and use the index to look for indexed entries beginning with 'BOWIE' within the index structure, remembering everything is of course stored in reverse within the index.

So Oracle is actually searching for all records starting with ‘EIWOB’, not ending with ‘BOWIE’ which are two entirely different things.

The net result of using this suggested strategy is not good.

1) Oracle ignores the Reverse Key Index anyways as a LIKE ‘EIWOB%’ is a range predicate
2) Oracle therefore performs a Full Table Scan anyways
3) As the query is effectively searching for all records that start with ‘EIWOB’, not as expected all records that end with ‘BOWIE’, the two queries in the example will actually return completely different results

The Reverse Key Indexes Part II Demo shows how this suggested use of a Reverse Key Index is a very very bad idea …

However, if you want to solve the issue of efficiently finding the results of a LIKE ‘%BOWIE’, there are some possible approaches one could take that will use an index and return correct results.

One possible solution (as mentioned in the OTN link listed at the beginning) is to create a Function-Based Index using the REVERSE Function, (Warning: this function is undocumented and unsupported):

CREATE INDEX bowie_reverse_func_i ON bowie(REVERSE(name));

A query such as WHERE REVERSE(name) LIKE ‘EIWOB%’ or better still WHERE REVERSE(name) LIKE REVERSE(‘%BOWIE’) can now both potentially use the index.

The reverse function will reverse the name column (from say ‘DAVID BOWIE’ to ‘EIWOB DIVAD’) and the LIKE range predicate can work with the index as it’s a Function-Based index rather than a Reverse Key Index and it’s not using a LIKE with a leading wildcard. A column containing ‘DAVID BOWIE’, but stored as ‘EIWOB DIVAD’ within the index, can be found efficiently via an index range scan using this Function-Based Index.
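To put it together, here's a minimal sketch reusing the bowie table and the bowie_reverse_func_i index created above (again remembering REVERSE is undocumented and unsupported):

-- The predicate expression matches the indexed expression REVERSE(name), and the
-- reversed search string has only a trailing wildcard ('EIWOB%'), so an index
-- range scan of bowie_reverse_func_i is possible
SELECT * FROM bowie WHERE REVERSE(name) LIKE REVERSE('%BOWIE');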

I’ve included an example on effectively using a Function-Based Index with the Reverse Function at the end of the above demo. There’s also a discussion and other alternatives at Gints Plivna’s Blog.

Another alternative is to use an Oracle Text Index, which also has the capability of dealing logically with queries such as %BOWIE% but as they say, that’s a topic for another day.

More on Reverse Key Indexes to come as well.

Introduction To Reverse Key Indexes: Part I January 14, 2008

Posted by Richard Foote in Index Access Path, Oracle Cost Based Optimizer, Oracle Indexes, Reverse Key Indexes.
27 comments

Following on from the “8 things You May Not Know About Indexes”, #7 regarding Reverse Key Indexes requires a number of posts to do the subject justice. However, Part I will focus on the specific issue related to point #7, namely:

“A REVERSE index can quite happily be used by the CBO to perform index range scans within an execution plan”.

Reverse Key Indexes are designed to resolve a specific issue, that being index block contention. Many indexes in busy database environments with lots of concurrent inserts (and in some scenarios updates and deletes as well) can suffer from index block contention (as highlighted by high levels of “buffer busy waits” and “read by other session” wait events for the index segments). Monotonically increasing indexes, such as Primary Keys generated by a sequence, are especially prone to contention as all inserts need to access the maximum “right-most” leaf block.  This is of particular concern in RAC environments, where this “hot” index block needs to be accessed by all the instances and is being bounced around the various SGAs causing expensive block transfers between instances.

A solution is to make the index a Reverse Key Index.

CREATE INDEX bowie_reverse_idx ON bowie(id) REVERSE;

A Reverse Key Index simply takes the index column values and reverses them before inserting into the index. “Conceptually”, say the next generated ID is 123456, Oracle will reverse it to 654321 before inserting into the index. It will then take the next generated ID 123457 and reverse it to 754321 and insert it into the index and so on. By doing this, inserts are spread across the whole index structure, ensuring the right most block is no longer the only index leaf block being hammered. Index contention is dramatically reduced or eliminated entirely.

Reverse Key Indexes address a specific problem but may in turn introduce a number of problems themselves.

One problem is the simple fact index entries are no longer sorted in their natural order. Value 123456 is no longer adjacent to value 123457 in the index structure, they’re likely to be found in completely different leaf blocks. Therefore a range predicate (such as BETWEEN 123450 and 123460) can no longer be found by a single index probe, Oracle would be forced to search for each specific index value separately as each value in the range is likely to be in differing leaf blocks.

This makes it all just too difficult and troublesome for the Cost Based Optimizer (CBO). As a result, the CBO totally ignores Reverse Key Indexes when processing Range Predicates (eg. BETWEEN, <, >, <=, >=, LIKE etc.). Even innocent looking range predicates such as “BETWEEN 123456 and 123457”, with just the 2 values of interest are ignored by the CBO. A 10053 trace shows how the CBO totally ignores Reverse Key Indexes and doesn’t even bother to cost such accesses when processing Range Predicate conditions.

In the above example and in scenarios where it’s possible and practical to convert range predicates, use an IN clause instead, e.g. “IN (123456, 123457)” as Oracle will convert this into an OR clause with each equality condition usable with the Reverse Key Index.
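For example, a little sketch (not from a linked demo) reusing the bowie table and bowie_reverse_idx index from above:

-- A range predicate: the CBO won't even cost the reverse key index
SELECT * FROM bowie WHERE id BETWEEN 123456 AND 123457;

-- The equivalent IN list is expanded into OR'd equality conditions,
-- each of which can use bowie_reverse_idx
SELECT * FROM bowie WHERE id IN (123456, 123457);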

Oracle is also clever enough to identify equality conditions that may be written as a range scan (e.g. BETWEEN 123456 and 123456) and use a Reverse Key Index accordingly.

Hints won’t work either. You may be able to force Oracle into performing a Full Index Scan but it will not perform an Index Range Scan with a Range Predicate.

But doesn't all this mean I'm wrong when I suggested a Reverse Key Index can be used by the CBO to perform Index Range Scans ?

No 🙂

I’ve only described how Oracle ignores the use of a Reverse Key Index for Range Predicates, however Index Range Scans are quite possible.

Remember, a Reverse Key Index will reverse all values, but if two index entries happen to have the same value, or happen to have the same leading column value, then all such entries are indeed stored together and are logically adjacent to one another.

For example, if the Reverse Key Index is Non-Unique, Oracle must perform an Index Range Scan, even for equality predicates. I discussed this in some detail when discussing the differences between a Unique and a Non-Unique Index. Even if the column or columns have a PK or a Unique Key Constraint, Oracle will still check the next index entry "just in case" there are indeed duplicate values. Also, although usually created on monotonically increasing columns, there's nothing preventing you from creating a Reverse Key Index on a Non-Unique column, and all duplicate values must reside together in the index structure. Therefore an equality search that uses any Non-Unique Reverse Key Index will generate an Index Range Scan access path.

But even Unique indexes can be used to perform an Index Range Scan.

If you have a multi-column Unique Index but not all columns are being searched (although the leading column must be known), then again, all index values with the same leading column (or columns) must be stored together in the Reverse Key Index and an Index Range Scan can be performed for such equality conditions.
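As a hypothetical sketch (the table and index names are made up), a two column Unique Reverse Key Index searched on just its leading column:

CREATE UNIQUE INDEX album_code_id_i ON bowie_albums(code, id) REVERSE;

-- Each column value is reversed individually, so all entries with code = 42
-- remain stored together and an INDEX RANGE SCAN is possible, even though
-- the index itself is Unique
SELECT * FROM bowie_albums WHERE code = 42;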

For some examples of what I’ve discussed see this Reverse Key Part I Demo.

So yes, a Reverse Key Index can indeed be used by the CBO to perform Index Range Scans.

There are also a number of other issues (and indeed myths) associated with Reverse Key Indexes that will be discussed in the fullness of time.

Introduction to Fake / Virtual / NOSEGMENT Indexes January 11, 2008

Posted by Richard Foote in Fake Indexes, Index Access Path, NOSEGMENT Option, Oracle Cost Based Optimizer, Oracle Indexes, Virtual Indexes.
10 comments

OK, as promised, the answer to index fact #5 you may not have known:

“It’s possible to make the CBO reference and use within an execution plan indexes that don’t in actual fact exist”.

Before I start, please note this feature is not officially documented other than the odd Metalink note and requires the setting of an undocumented parameter to work, so please exercise caution.

Fake Indexes (also known as Virtual or Nosegment Indexes) have been around for a long time, since 8i days. They’re used primarily by Oracle Enterprise Manager and its Tuning Pack which has various wizards that can do “what if” type analysis. One of these is the Index Wizard which can kinda “pretend” to create an index and see what the Cost Based Optimizer might do if such an index really existed.

It’s possible to create these Fake indexes manually by using the NOSEGMENT clause when creating an index:

CREATE INDEX Bowie_idx ON Bowie_Table(Ziggy) NOSEGMENT;

This will populate some (but not many) data dictionary related tables but will not actually create an index segment or consume any actual storage. It's not maintained in any way by DML operations on the parent table and it can't be altered or rebuilt as can a conventional, "real" index (it will generate an ORA-08114 error if you try to do so). You can analyze or run dbms_stats over the index but the index is not treated as analyzed as such (as can be seen via a 10053 trace).

It's only visible to the CBO if a session has the following parameter set:

ALTER SESSION SET "_use_nosegment_indexes" = true;

The CBO will now consider the index and potentially include it within an execution plan. However, at execution time Oracle can of course not use the index and will revert to the next best thing.
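As a quick sketch of how one might check whether the CBO would pick the fake index, reusing the bowie_idx example and session parameter from above (and assuming ziggy is a numeric column):

ALTER SESSION SET "_use_nosegment_indexes" = true;

EXPLAIN PLAN FOR
SELECT * FROM bowie_table WHERE ziggy = 42;

SELECT * FROM TABLE(dbms_xplan.display);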

A Fake index is basically an index you have when you don’t really have an index in order to see if it could be useful if it really existed.

This Fake Indexes Demo shows how they work and can be used.

1 down, 7 to go … 😉

Introduction To Linguistic Indexes – Part I January 3, 2008

Posted by Richard Foote in Index Access Path, Indexing Tricks, Linguistic Indexes, Oracle Indexes.
11 comments

Characters are sorted by default based on the numeric values defined by the default character encoding scheme (known as Binary Sorting). For us Australians, this is fine as we (generally) speak English and the English alphabet is nicely sorted in ascending order by the ASCII and EBCDIC standards. However, many other languages are not so fortunate as a binary sort does not sort the data in many languages' alphabetic sort order. Oracle has many Globalization Support features to help users of other languages get over these issues (all very interesting and topics for many a Blog entry in the future).

However, even us Australians have issues when it comes to “case-insensitive” searches, where data may be stored in many different cases (eg. Ziggy, ZIGGY, ZiGgY, etc.) and we want to return all data that matches a character value, regardless of its case.

The issue of course is that by default, all text searches are case-sensitive. For example a search WHERE name=’ZIGGY’ will only return ‘ZIGGY’ but not ‘Ziggy’ or ‘ZiGgY’ etc.

The standard fix is for the application to convert the data to a consistent case when performing the search. For example a search WHERE UPPER(Name) = ‘ZIGGY’ will return all values of “ZIGGY” regardless of their case but this will negate the use of any standard index on the Name column.

Therefore, a Function-Based index is required, say based on UPPER(Name), to ensure an efficient index access is possible for case insensitive searches.
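For example, a minimal sketch (the table name is made up for illustration):

CREATE INDEX bowie_upper_name_i ON bowie_table(UPPER(name));

-- The application must reference the same UPPER(name) expression
-- for this function-based index to be considered
SELECT * FROM bowie_table WHERE UPPER(name) = 'ZIGGY';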

However, this often requires an additional index to be created and for the application to be explicitly written to make use of the function-based index defined function.

Now the best cure for this problem is simply to ensure all data is stored in a consistent case (ZIGGY, ZIGGY, ZIGGY) but this may not always be practical or even desirable in some cases.

Another possible solution is the use of a Linguistic Index. This is an index that is created based on a specific case insensitive linguistic language or multilingual option that ensures the index entries are sorted in the linguistic language order, not on the default binary order of the database encoding scheme.

Basic steps are:

1) Create a Linguistic Index, eg.

CREATE INDEX case_search_ling_name_i ON case_search(NLSSORT(name,'NLS_SORT=GENERIC_M_CI'));

2) Set NLS_SORT in the session (or as an initialisation parameter) to use the required Linguistic sort option, eg.

ALTER SESSION SET NLS_SORT='GENERIC_M_CI';

Simply append _CI in the Linguistic sort option to make it Case-Insensitive or _AI to make it Accent-Insensitive.

(Note: if binary ordering is generally adequate, NLS_SORT can simply be set to ‘BINARY_CI’ for Binary Case-Insensitive searches)

3) Set NLS_COMP in the session (or as an initialisation parameter) to use Linguistic Sorts/Case Insensitive Searches, eg.

ALTER SESSION SET NLS_COMP='LINGUISTIC';

A search now based on WHERE name=’ZIGGY’ will automatically perform a case-insensitive search without the need to modify the application to use specific functions.

For a full demo, see Use Linguistic Indexes Demo.

However, before you rush out and start using Linguistic Indexes to possibly simplify the use of case insensitive searches, note there are various disadvantages to Linguistic Indexes, which can somewhat dampen their appeal. These will be covered in Part II of this series.

Differences between Unique and Non-Unique Indexes (Part II) December 21, 2007

Posted by Richard Foote in Index Access Path, Index Internals, Indexing Tricks, Oracle Cost Based Optimizer, Oracle Indexes, Primary Key, Unique Indexes.
25 comments

The most significant difference between a Unique and a Non-Unique index is of course the simple fact that in one index, all index entries MUST be unique and in the other index there can be duplicates of an index entry.

Although an obvious distinction between the two, it's also a crucial difference.

When Oracle uses a Unique Index to scan for a specific value (via an equality predicate on all indexed columns or when policing a constraint), there can only be one of two possible results. The value can exist, returning at the very most one row, or the value doesn't exist, returning 0 rows. That's it, 1 row or none. The value either exists or it doesn't.

This fact means Oracle doesn’t have to worry about a whole bunch of things when dealing with Unique indexes during equality or unique checking processes. It doesn’t have to check the next index entry just in case there’s a second or more entries with the same value. It doesn’t have to worry about the potential of having to skip across to the next leaf page if the specific value it reads happens to be the maximum value in the current leaf page. It doesn’t have to worry about pointers to these “adjacent” leaf blocks changing on it due to block splits. It doesn’t have to concern itself with potentially visiting more than the one table data block during the index access operation.

Life is simple, it’s either 1 row or none.

Not so for Non-Unique indexes. With a Non-Unique index, there are no such guarantees. With a Non-Unique index, there are 3 categories of possibilities. An index scan could return 0 rows, it could return 1 row or it could return more than one row. It could potentially need to go and visit more than the current leaf block to return all the matching rows. It could potentially need to go and visit more than one table block.

Life’s not quite so “simple” for a Non-Unique index.

Note also and most importantly that life gets no easier for a Non-Unique index that polices a PK or Unique key constraint.

Even though there's a PK or Unique constraint on a column, to Oracle, it's just another Non-Unique index with the same "vague" possibilities. Remember that PK and Unique constraints can be enabled with NOVALIDATE and that they can be DEFERRABLE, meaning that even with a PK or Unique constraint, there's still the possibility of duplicate index entries in the Non-Unique index.

This means that Oracle has to concern itself with a number of additional overheads, including having to “check” the next index entry, “just in case” it matches the required index value. It has to concern itself even with the possibility of having to visit the next index leaf block, “just in case”.

You will note when Oracle performs an equality search using a Unique Index, Oracle will perform an “INDEX UNIQUE SCAN” because the index entries MUST be unique.

You will note however when Oracle performs an equality search using a Non-Unique index, even if there's a PK or Unique constraint on the column(s), Oracle will perform an INDEX RANGE SCAN, because it needs to scan multiple index entries "just in case".
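Here's a minimal sketch of the two scenarios (the table and index names are made up):

-- PK policed by a Unique index: equality lookups use an INDEX UNIQUE SCAN
CREATE TABLE ziggy (id NUMBER, name VARCHAR2(30));
CREATE UNIQUE INDEX ziggy_pk_i ON ziggy(id);
ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id);

SELECT * FROM ziggy WHERE id = 42;

-- The same PK policed by a Non-Unique index: equality lookups use an INDEX RANGE SCAN
CREATE TABLE major_tom (id NUMBER, name VARCHAR2(30));
CREATE INDEX major_tom_pk_i ON major_tom(id);
ALTER TABLE major_tom ADD CONSTRAINT major_tom_pk PRIMARY KEY(id);

SELECT * FROM major_tom WHERE id = 42;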

So are there any actual implications as a result of any of this ?

Yes.

When Oracle actually reads an index and processes the associated blocks in the buffer cache(s), Oracle uses a number of latches. These latches are used primarily to "protect" memory structures from concurrent activity. Very simplistically, by grabbing a latch, Oracle effectively performs a "lock" on the associated memory structure, performs whatever activity needs to be performed and releases the latch. These latches get grabbed and released (hopefully) extremely quickly (order of 1/10s of ms), but it's a non-zero cost.

The issue with latches is that they're a point of serialisation. If two (or more) processes want a specific latch, one (or more) has to wait. Latches also burn CPU. Only a teensy weeny bit at a time but some CPU nonetheless. They burn CPU while acquiring the latch and, if they fail due to latch contention, while attempting again and again to acquire the latch. They also burn CPU while performing the specific operation necessary of the latch.

Basically, the more latches, the greater the potential for contention, the greater the potential for latch related wait activity and perhaps most important of all, more CPU is required. In busy systems, there can be massive numbers of latch events and the best way to tune these events is to reduce where possible the number of latches required by the database environment. It’s one of the key reasons we try and reduce LIOs in a database as much as possible, to reduce the latch and CPU load on the system.

Because of the differences highlighted between Unique and Non-Unique indexes, the number and manner of latches required between the two indexes differs. And it differs significantly …

In this little demo, Latch Differences Between Unique and Non-Unique Indexes Demo, we compare the latches required to read an identical table, using a 2 level index. The  differences between the latch overheads of a Unique and a Non-Unique index are most interesting.

When using a Unique Index, Oracle required 3 consistent gets (one for the index root block, one for the leaf block and one for the table block). BUT, each consistent get was a consistent gets – examination, a special type of consistent get which only requires 1 latch (rather than the standard 2 latches).

So that’s a sum of 3 latches.

However, when using a Non-Unique index, Oracle required 4 consistent gets (one for the index root block, one for the leaf block, one for the table block and an additional one to recheck the leaf block for any duplicate index entries). BUT, only the 1 consistent read (for the index root block) was actually the “cheaper” consistent gets – examination, the other 3 were the more costly 2 latch variety.

So that’s a sum of 7 latches.

3 latches for the Unique index and 7 latches for the Non-Unique index.

That’s an increase of 133.3% in latches between the two types of indexes.

Now, the height of the index will change the ratio of latch difference between the two indexes. Also, in a busy system, there could potentially be differences in the types of latches used due to the current state or additional activity in a block.

However, the potential difference in latch requirements between a Unique or Non-Unique index can be very significant. But does a few additional latches here and there really make much of a difference ?

Well, of course it depends. On small scale systems with smaller loads, fewer indexes, fewer users and excess resources, the noticeable differences may be negligible.

However, in larger scale (especially OLTP) environments, a particular index may be accessed 100s or maybe 1000s of times a second. There may be 1000s of tables with 1000s of corresponding PK and Unique constraints policed by 1000s of Unique (or Non-Unique) indexes. It’s therefore not really of question of a few latches here or there. It’s a question of potentially a very significant proportion of overall latch related overheads.

Potentially when accessed, Non-Unique indexes could be generating double the latch related overheads for equality unique scan or unique checking index activity. Remember, the best way to tune latches and reduce latch contention is to simply reduce the requirement and load for latches.

The overall reduction in CPU and latch related wait activity could be significant between Unique and Non-Unique indexes, because by using Non-Unique indexes you require in the order of double the latches for such activities.

Note also this doesn't require any special parameters to be set or special tuning or monitoring by the DBA. It simply requires using Unique indexes to police PK or Unique constraints when there's no specific requirement for Non-Unique indexes. You then potentially gain a benefit each and every time the index is used for unique scan accesses.

Guess what type of access is extremely common in large scale OLTP environments …

The next time you complain about high CPU consumption or high latch contention and you've tuned the application to death, just ask yourself how many Non-Unique indexes are policing your PK or Unique Key constraints …

Local Index Issue With Partitioned PK and Unique Key Constraints December 20, 2007

Posted by Richard Foote in Constraints, Index Access Path, Local Indexes, Oracle Indexes, Partitioning, Performance Tuning, Unique Indexes.
12 comments

Nuno Souto (Noons) also asked a really interesting question on my Differences between Unique and Non-Unique Indexes blog entry (comment 4) that I thought worthy of a separate blog entry to do the answer justice. The question was:

“Isn’t it still the case that unique indexes cannot be locally partitioned unless the partition key is part of the index key? Not sure if 11g removes this. If still so, that would weigh heavily in favour of non-unique indexing for PK on a table potentially requiring local index partitions.”

Simplistically, the answer to the first part is Yes it is still the case, even in 11g and the answer to the second part is No, it wouldn’t weigh heavily in favour of non-unique indexing for PK on a table requiring local index partitions. It wouldn’t actually be a consideration at all.

Let me explain why.

Firstly, there is a really really good reason why Oracle doesn't allow us to create a Local Unique Index in which the Partition key is not part of the index. It's called protecting us from ourselves !!

Let’s start by mentioning constraints again.

Remember, the main reason we have indexes policing PK and Unique constraints is so that Oracle can very quickly and efficiently determine whether or not a new value already exists. Do a quick index look-up, is the value there, yes or no, allow the insert (or update), yes or no.

Just imagine for one moment what would happen if Oracle actually allowed us to create a Unique Local index in which the index didn’t include the partitioned column(s).

Let's say a table is Range Partitioned on column 'A' and we try and create a Unique Local index on just column 'B'. Let's assume we have (say) 500 table partitions, meaning we must therefore have 500 local index partitions as well. When we insert a new row, its value of column B will be inserted into the corresponding local index partition, as governed by the value of column A for the new row. However, Oracle can't just check this one index partition for uniqueness to ensure the value of column B doesn't already exist, Oracle would need to check all 500 index partitions because it would be possible for our new value of column B to have previously been inserted into any of the other 499 partitions !!

Each and every insert into our partitioned table (partitioned by column A) would therefore require Oracle to check all (say) 500 index partitions each and every time to check for duplicates of column B. Again, it's important to understand that any given value of column B could potentially be in any of the 500 partitions, IF Oracle allowed us to create a Local Partitioned Index just on column B.

Checking all 500 index partitions looking for a specific value of column B would obviously be impractical, inefficient and totally un-scalable. Therefore Oracle doesn't allow us to do this. It doesn't allow us to create a Local index in which the indexed columns don't include the partitioning columns as well.

This is actually a good thing.

If you want to create a Unique index on a partitioned table, you MUST either add all the partitioned columns and make them part of the LOCAL unique index (that way each and every insert only has to check the one local index partition, as the partition key value is known now that it's part of the index), or you must create it as a GLOBAL index (in which case, again, Oracle only has to check the one index structure).
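A little sketch to illustrate (made up names; the specific error message is left out, the point being simply that the first index creation below isn't allowed):

CREATE TABLE bowie_part (a NUMBER, b NUMBER)
PARTITION BY RANGE (a)
(PARTITION p1 VALUES LESS THAN (1000),
 PARTITION p2 VALUES LESS THAN (MAXVALUE));

-- Not allowed: a LOCAL Unique index that doesn't include the partitioning column
CREATE UNIQUE INDEX bowie_part_b_i ON bowie_part(b) LOCAL;

-- Allowed: include the partitioning column in the LOCAL Unique index ...
CREATE UNIQUE INDEX bowie_part_b_a_i ON bowie_part(b, a) LOCAL;

-- ... or create a GLOBAL Unique index on just column b
CREATE UNIQUE INDEX bowie_part_b_gi ON bowie_part(b);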

It actually makes a lot of sense to do this.

Moving onto the second part of the question. Let’s just use a Local Non-Unique index to police our PK constraints then.

Fortunately this isn’t allowed either for exactly the same reasons. You can’t create a Local Non-unique index to police a PK (or Unique) constraint if the Constraint does not also include the partitioned columns. Otherwise again, Oracle would need to check each and every index partition to determine whether the constraint has been violated or not.

If you attempt to use an existing Local Non-Unique index to police a PK or Unique constraint that does not contain the partitioned columns, you will get an error saying Oracle can't create the (by default Global) index to police the constraint, because the Local Non-Unique index (useless from a constraint policing point of view) already exists on those columns.

Again, if you want to use a Non-Unique index to police a PK or Unique constraint, you must either ensure the constraint includes all the partitioned columns, in which case the index can be Local, or you must use a Global Non-Unique index.

In other words, the rules apply equally to both Unique and Non-Unique indexes.

So it’s not really a case of Oracle not allowing one to create a Local Unique index without including the partitioned columns (although that’s of course true) but really a case of Oracle not allowing a PK or Unique *constraint*  to be policed via *any* Local index (whether Unique or Non-Unique), unless the partitioned columns are also included.

Little demo to illustrate: Local Index Issue With Partitioned PK and Unique Key Constraints

Outlier Values – An Enemy Of The Index December 13, 2007

Posted by Richard Foote in Index Access Path, Indexing Tricks, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Outlier Values.
15 comments

Outlier values are basically values that sit way way outside a column's normal range of values.

Data can be a funny thing and sometimes there are values that are naturally "exceptional". However, very commonly, outlier values are used by applications to represent bizarre default values, to avoid confusion with legitimate values. For example, I look after an application that uses the American Date Of Independence as its "default" date.

Usually, these weird outlier values are used to avoid null values, as nulls can be problematic and cannot be indexed (well, actually you can index a null column but we'll leave that for another blog entry).

However, outlier values while (maybe) solving one problem, can introduce some very significant problems in return.

Firstly, the CBO "hates" outlier values as they can totally screw up the CBO's selectivity calculations. The selectivity of a range scan is basically calculated by the CBO to be the number of values in the range of interest divided by the full range of possible values (i.e. the max value minus the min value). Therefore, if this calculation is invalidated by a massive and disproportionate "hole" in the full range of possible values, the CBO can get things horribly wrong.
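As a rough worked sketch (made up data and numbers, and ignoring the CBO's various refinements to the basic formula):

CREATE TABLE bowie_outlier AS
SELECT ROWNUM AS id, 'DAVID BOWIE' AS name
FROM dual CONNECT BY LEVEL <= 100000;

-- a single outlier "default" value becomes the new maximum
INSERT INTO bowie_outlier VALUES (9999999, 'DEFAULT');
COMMIT;

EXEC dbms_stats.gather_table_stats(ownname=>NULL, tabname=>'BOWIE_OUTLIER');

-- Range selectivity is roughly (2000 - 1000) / (max - min):
--   without the outlier: 1000 / 99999   ~ 1%     => ~1,000 rows estimated (about right)
--   with the outlier:    1000 / 9999998 ~ 0.01%  => ~10 rows estimated (horribly wrong)
SELECT * FROM bowie_outlier WHERE id BETWEEN 1000 AND 2000;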

See here for a simple demonstration:  Outlier Selectivity Problem

Additionally, indexes "hate" outlier values as they prevent Oracle from using 90-10 block splits to keep indexes nice and compact, forcing it to use 50-50 block splits instead. Basically, a 90-10 block split is considered if and only if the index entry to be inserted is equal to or greater than the current maximum value. An outlier value that is also the maximum value usually means monotonically increasing values (such as sequences, dates, etc.) never actually insert a new maximum value. Therefore, not only do such indexes perform 50-50 splits, but this 50% of free space is never used, as all new values are almost, but not quite, maximum values.

Little demo to highlight this problem: Outlier Index Space Utilisation Problem 

In summary, avoid outlier values if at all possible.  They generally cause more problems than they solve !!

Invisible Indexes December 11, 2007

Posted by Richard Foote in 11g, Index Access Path, Invisible Indexes, Oracle Cost Based Optimizer, Oracle Indexes, Performance Tuning.
15 comments

New in 11g are “Invisible Indexes”, which are basically indexes that exist and are maintained by Oracle but are “invisible” to the CBO. Specific sessions can be set to see these invisible indexes as necessary.

Potentially useful if one has a problematic (and very large) index causing performance issues that you want to make invisible until the specific issue is addressed, without the expense of having to drop and later recreate the index. Also useful if you want to introduce a new index but want it to be invisible until it's been given a workout first in a specific "test" session.
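The basic syntax looks something like this (a sketch with made up names, not the linked demo):

-- create an index as invisible from the outset
CREATE INDEX bowie_name_i ON bowie(name) INVISIBLE;

-- or toggle an existing index
ALTER INDEX bowie_name_i VISIBLE;
ALTER INDEX bowie_name_i INVISIBLE;

-- allow a specific session to see (and have the CBO consider) invisible indexes
ALTER SESSION SET optimizer_use_invisible_indexes = true;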

Here’s a bit of a demo: Invisible Indexes