Unique Bitmap Indexes Part II (You Can’t Do That) March 30, 2010
Posted by Richard Foote in Bitmap Indexes, Oracle Indexes, Unique Indexes.6 comments
As discussed in the previous post, a Bitmap index on a unique column will be larger than a corresponding Btree index due to the additional overheads associated with each index entry (such as the additional rowid, the additional column length bytes and the bitmap column itself). Oracle therefore attempts to protect you from explicitly creating such a “Unique Bitmap” index.
For example, you can not specify both UNIQUE and BITMAP when creating an index. To do so would make little sense.
A bitmap index must therefore be Non-Unique by definition. Any attempt to explicitly create a Unique Bitmap index will fail.
SQL> drop index bowie_bitmap_i;
Index dropped.
SQL> create unique bitmap index bowie_bitmap_i on bowie(id) pctfree 0;
create unique bitmap index bowie_bitmap_i on bowie(id) pctfree 0
*
ERROR at line 1:
ORA-00968: missing INDEX keyword
SQL> create bitmap unique index bowie_bitmap_i on bowie(id) pctfree 0;
create bitmap unique index bowie_bitmap_i on bowie(id) pctfree 0
*
ERROR at line 1:
ORA-00968: missing INDEX keyword
The CREATE INDEX syntax only caters for either the BITMAP or the UNIQUE option.
Although Oracle permits the use of a Non-Unique index to police either a Primary Key (PK) or Unique Key (UK) constraint, a bitmap index is not permitted to police such constraints. Again, it makes little sense having a bitmap index police such constraints as an equivalent Btree index is going to be more efficient.
If an existing bitmap index exists on a column, Oracle can not use it to police the constraint:
SQL> create bitmap index bowie_bitmap_i on bowie(id) pctfree 0;
Index created.
SQL> alter table bowie add primary key (id);
alter table bowie add primary key (id)
*
ERROR at line 1:
ORA-01408: such column list already indexed
Oracle is attempting to create a Btree index to police the new PK constraint but it can’t create it as an existing bitmap index already exists. Oracle will not create a Btree index if the same column list is already indexed.
It makes no difference if we if declare the constraint as deferrable (or invalidate) where a Non-Unique index is required:
SQL> alter table bowie add primary key (id) novalidate;
alter table bowie add primary key (id) novalidate
*
ERROR at line 1:
ORA-01408: such column list already indexed
SQL> alter table bowie add primary key (id) deferrable;
alter table bowie add primary key (id) deferrable
*
ERROR at line 1:
ORA-01408: such column list already indexed
Attempting to create a Bitmap index at the same time as the constraint is equally fruitless:
SQL> alter table bowie add primary key (id) using index (create bitmap index bowie_bitmap_i on bowie(id));
alter table bowie add primary key (id) using index (create bitmap index bowie_bitmap_i on bowie(id))
*
ERROR at line 1:
ORA-00968: missing INDEX keyword
SQL> alter table bowie add primary key (id) using index (create bitmap index bowie_bitmap_i on bowie(id) deferrable);
alter table bowie add primary key (id) using index (create bitmap index bowie_bitmap_i on bowie(id) deferrable)
*
ERROR at line 1:
ORA-00968: missing INDEX keyword
So definitely, looking at creating a Bitmap index on a unique column is not a sensible thing to attempt both because the resultant bitmap index would be larger than a corresponding Btree index if permitted and because in many scenarios as discussed, Oracle simply won’t let you do it anyways.
OK, so a unique column is not suitable for a Bitmap index. The question remains at what point does it make sense to create a bitmap index ? The answer is reasonably obvious when one understands the structure of both types of indexes although the answer may surprise some folks. I’ll look at this question next …
Unique Bitmap Indexes Part I (Unnatural Selection) March 24, 2010
Posted by Richard Foote in Bitmap Indexes, Index Internals, Oracle Indexes, Unique Indexes.17 comments
As I’ve discussed previously, a Bitmap index can be considered over a B-tree index (where concurrent DML is not an issue) even if there are potentially tens of millions of distinct values, in a table that has say hundreds of millions of rows.
However, if a column is unique or “approaches” uniqueness, then one has gone too far and the bitmap index is going to be larger and less efficient than an equivalent b-tree index. So you wouldn’t consider a bitmap index on a column with a million distinct values if the table itself only has in the vicinity of a million rows as well.
To understand why a column approaching uniqueness shouldn’t be considered as a bitmap index, one only needs to understand the structure and corresponding differences of index entries in both bitmap and b-tree indexes.
I’ll begin by creating a simple table and populating it with a million rows.
SQL> create table bowie (id number, name varchar2(20)); Table created. SQL> insert into bowie select rownum, 'Ziggy Stardust' from dual connect by level <= 1000000; 1000000 rows created. SQL> commit; Commit complete.
Note that the ID column is unique. We can therefore create a Unique b-tree index:
SQL> create unique index bowie_unique_i on bowie(id) pctfree 0; Index created. SQL> select index_name, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name = 'BOWIE_UNIQUE_I'; INDEX_NAME BLEVEL LEAF_BLOCKS DISTINCT_KEYS --------------- ---------- ----------- ------------- BOWIE_UNIQUE_I 2 1875 1000000
Note that the unique index has 1875 leaf blocks. If we dump a leaf block and look at some (say 5) of the index entries:
row#0[8025] flag: ------, lock: 0, len=11, data:(6): 02 00 6b 0a 00 00 col 0; len 2; (2): c1 02 row#1[8014] flag: ------, lock: 0, len=11, data:(6): 02 00 6b 0a 00 01 col 0; len 2; (2): c1 03 row#2[8003] flag: ------, lock: 0, len=11, data:(6): 02 00 6b 0a 00 02 col 0; len 2; (2): c1 04 row#3[7992] flag: ------, lock: 0, len=11, data:(6): 02 00 6b 0a 00 03 col 0; len 2; (2): c1 05 row#4[7981] flag: ------, lock: 0, len=11, data:(6): 02 00 6b 0a 00 04
We notice the length of these first 5 index entries are all 11 bytes (len=11).
An index entry from this Unique index basically consists of the indexed value (col 0) which is 2 bytes in length in the above sample plus the following overhead:
2 bytes for flags and locks
6 bytes for the rowid
1 byte for the index column length
So there’s a total of 9 bytes of overhead per index entry in this index in addition to the index value itself. Note also there’s an index entry for each and every indexed value. This is always the case for a non-compressed b-tree index.
If we now compare this with an equivalent Non-Unique index on the same column:
SQL> drop index bowie_unique_i; Index dropped. SQL> create index bowie_nonunique_i on bowie(id) pctfree 0; Index created. SQL> select index_name, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name = 'BOWIE_NONUNIQUE_I'; INDEX_NAME BLEVEL LEAF_BLOCKS DISTINCT_KEYS ----------------- ---------- ----------- ------------- BOWIE_NONUNIQUE_I 2 1999 1000000
We notice the index is now somewhat larger than the equivalent Unique index, with there now being 1999 leaf blocks, an increase of 124 leaf blocks. A block dump of a leaf block reveals the key difference:
row#0[8024] flag: ------, lock: 0, len=12 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 00 6b 0a 00 00 row#1[8012] flag: ------, lock: 0, len=12 col 0; len 2; (2): c1 03 col 1; len 6; (6): 02 00 6b 0a 00 01 row#2[8000] flag: ------, lock: 0, len=12 col 0; len 2; (2): c1 04 col 1; len 6; (6): 02 00 6b 0a 00 02 row#3[7988] flag: ------, lock: 0, len=12 col 0; len 2; (2): c1 05 col 1; len 6; (6): 02 00 6b 0a 00 03 row#4[7976] flag: ------, lock: 0, len=12 col 0; len 2; (2): c1 06 col 1; len 6; (6): 02 00 6b 0a 00 04
As I’ve discussed previously, Oracle makes the Non-Unique index effectively unique by adding the rowid as an additional indexed column within the index entry (col 1 is this additional index column comprising the rowid). There are therefore 2 columns in the index entry, not just the one (denoted by col 0 and col 1). This ensures all duplicate indexed values are subsequently sorted in rowid order within the index and can be efficiently accessed as necessary.
The consequence of this subtle difference is that an additional byte is now required to store the length of the rowid column and so the total overhead increases from 9 bytes to 10 bytes per index entry. The overall length of an index entry has therefore increased from 11 to 12 byes (len=12) and this results in the overall increase of 124 leaf blocks in the index, required to effectively store these additional 1 million bytes.
If we now create the index as an equivalent bitmap index:
SQL> drop index bowie_nonunique_i; Index dropped. SQL> create bitmap index bowie_bitmap_i on bowie(id) pctfree 0; Index created. SQL> select index_name, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name = 'BOWIE_BITMAP_I'; INDEX_NAME BLEVEL LEAF_BLOCKS DISTINCT_KEYS ----------------- ---------- ----------- ------------- BOWIE_BITMAP_I 2 3124 1000000
We now notice the index has increased substantially from 1999 leaf blocks for the Non-Unique index to 3124 leaf blocks.
Again, a dump of an index leaf block highlights the reason for the increase:
row#0[8015] flag: ------, lock: 0, len=21 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 00 6b 0a 00 00 col 2; len 6; (6): 02 00 6b 0a 00 07 col 3; len 1; (1): 00 row#1[7994] flag: ------, lock: 0, len=21 col 0; len 2; (2): c1 03 col 1; len 6; (6): 02 00 6b 0a 00 00 col 2; len 6; (6): 02 00 6b 0a 00 07 col 3; len 1; (1): 01 row#2[7973] flag: ------, lock: 0, len=21 col 0; len 2; (2): c1 04 col 1; len 6; (6): 02 00 6b 0a 00 00 col 2; len 6; (6): 02 00 6b 0a 00 07 col 3; len 1; (1): 02 row#3[7952] flag: ------, lock: 0, len=21 col 0; len 2; (2): c1 05 col 1; len 6; (6): 02 00 6b 0a 00 00 col 2; len 6; (6): 02 00 6b 0a 00 07 col 3; len 1; (1): 03 row#4[7931] flag: ------, lock: 0, len=21 col 0; len 2; (2): c1 06 col 1; len 6; (6): 02 00 6b 0a 00 00 col 2; len 6; (6): 02 00 6b 0a 00 07 col 3; len 1; (1): 04
The index entry structure is now somewhat different. We now have an index that has not 1 column (as in the Unique index) or 2 columns (as in the Non-unique index) but 4 columns in the index entry. As before, we still have the index column value of 2 bytes but we now have the following overheads per index entry:
2 bytes for flags and locking (as before)
1 byte for the indexed column length (as before)
6 bytes for a rowid index column (col 1) stating the start of a range of rowids that are covered by the particular index entry
1 byte for the length of this start rowid index column
6 bytes for a rowid index column (col 2) stating the end of a range of rowids that are covered by the particular index entry
1 byte for the length of this end rowid index column
1 byte for the bitmap bit sequence column (col 3) required for all the bits referencing rows within the above rowid ranges
1 byte for the length of this bitmap column
So the total overhead for each of the 5 index entries listed above is now 19 bytes, not 9 or 10 bytes as for the equivalent b-tree indexes. The length of an index entry is therefore 21 bytes in total, not 11 or 12 bytes as for the equivalent b-tree indexes.
A few important points to note.
As the columns are effectively unique, the number of index entries are the same for both b-tree and bitmap indexes. A key advantage of a bitmap index over a b-tree index is that for each distinct value, a single index entry is sufficient to cater for a range of rowids, potentially covering the whole table. For example, a specific column value with 100 duplicates may only need the one index entry for the column value within a bitmap index, but would require 100 different index entries within a (non-compressed) b-tree. However, as the column in the above example is unique, there are no duplicate values and so this potential saving is not possible in this bitmap index.
Notice also the size of the bitmap string for each index entry is actually tiny, just a single byte, even though there are 1 million rows in the table. It doesn’t even look like it’s using a million bits to store the necessary bitmap string information for each index entry. This is because for each index entry, only one bit is ever set to 1 (“true”), all other occurrences are logically false as only 1 row in the table ever has the specific index value. Remember, the column values are effectively unique.
Therefore, Oracle can use a very narrow range of rowid ranges for each index entry and effectively not bother storing details for the vast majority of the possible rowid ranges within the table as there’s only one bit that’s of interest and it only corresponds to a specific part of the table. Even in cases where there might just be a duplicate here or there, most values would be zeros (false) regardless and can be compressed extremely efficiently (topic for another day).
Although many folks commonly think otherwise (see original Burleson article for a perfect example of the following misperception), if a column which is unique or is approaching uniqueness is indexed via a bitmap index, the overheads associated with the bitmap string in the index entry is usually very minimal as by definition most bit values are logically “false” (0), with only the rare “true” (1) bit value needing to be stored and reference.
The issue is not necessarily with the overheads associated with just the bitmap string per se but also with the other overhead components, namely the additional rowid and column length bytes.
In short, the bitmap index can’t be any more efficient that use just 1 byte to store the necessary bitmap string information (plus 1 byte for the bitmap string length), however 19 bytes of overhead is the minimum required, mainly because of the requirement to store 2 rowids instead of 1 rowid and for all the additional index column length bytes. If the bitmap index needs to cater for a wider range of rowids and for more occurrences of 1s (true) values, then the overheads associated with the bitmap sequence can of course be much more considerable than the 1 byte (again, a topic for another day).
Therefore, the bitmap index is going to be significantly less efficient if the indexed values are unique or near unique as there’s all this additional overhead per index entry without the subsequent savings by not having to store separate index entries for duplicates column values. There needs to be at least some measure of duplication within a column for a bitmap index to have some chance of being the more efficient when compared to an equivalent b-tree index.
However, how many duplicate values within a column are actually necessary for a bitmap index to be considered and be viable alternative ? The answer is far fewer than many may think (again see original Burleson article for a common misunderstanding in this respect), although this question will be addressed in a future post on the subject.
Indexes And Small Tables Part VI (Loaded) May 19, 2009
Posted by Richard Foote in Constraints, Oracle Indexes, Small Indexes, Unique Indexes.9 comments
Thanks to comments by PdV, I need yet another Part before I can look at completing this series
OK, we’ve reached the stage in Part V of accessing this small, one block table with a Unique Index. This has reduced the number of consistent gets to 2, with both consistent get operations being the “pinless”, one latch consistent get examinations.
We basically need one consistent get to read the index and the other consistent get to read the row from the table block.
Not bad.
However, if we could somehow just store all the columns of interest within the index structure, we could potentially do even better because then we wouldn’t need to visit the table segment at all. A single consistent get from the index and bingo, we can retrieve all the required data directly from the index.
Overloading an index in this way is actually quite a common tuning trick. Simply add additional columns within the select list so that all the required columns can be found within the index, thereby elliminating the need to access the table at all and so potentially improve performance.
However, when it comes to overloading a unique index designed to specifically police a PK or Unique constraint we have a slight problem. Oracle will not allow a unique constraint to be policed by a unique index that does not have the same column list. It’s not a problem for a non-unique index (providing the leading columns match the constraint columns), but it’s an issue for a unique index.
Therefore, in our little example, we can’t simply create a single concatenated unique index on both the ID and NAME columns and use it to police a unique constraint on just the ID column. We must either use a unique index on the ID column or use a non-unique index on both the ID and NAME columns. If we want to create a unique index on both the ID and NAME columns, we would need to create an additional index on the ID column to police the PK on the ID column or change our business rules to make both the ID and NAME the PK (which is not likely something we would want to do). Note also by doing creating 2 unique indexes, we would effectively be storing the ID column in three separate places, within the table, within the ID index and also within the ID and NAME index. Again, not something we’re likely going to want to do.
To illustrate the point, drop any existing indexes on the SMALL table. If we attempt to create a unique index on both the ID and NAME columns while making the ID column only the PK, Oracle is going to complain:
SQL> alter table small add primary key(id) using index (create unique index small_uk_i on small(id, name));
alter table small add primary key(id) using index (create unique index small_uk_i on small(id, name))
*
ERROR at line 1:
ORA-14196: Specified index cannot be used to enforce the constraint.try and cr
If we create a unique index first on both the ID and NAME columns:
SQL> create unique index small_uk_i on small(id, name);
Index created.
And then hope Oracle will simply use this index to police a PK constraint on just the ID column, we’ll be sadly disappointed as Oracle will actually created another unique index on the ID column behind the scenes:
SQL> alter table small add primary key(id);
Table altered.
SQL> select c.index_name, c.column_name from user_indexes i, user_ind_columns cwhere i.index_name = c.index_name and i.table_name = c.table_name and i.table_name = ‘SMALL’;
INDEX_NAME COLUMN_NAME ------------ ------------ SMALL_UK_I ID SMALL_UK_I NAME SYS_C009759 ID
The CBO will of course potentially look at using our SMALL_UK_I concatenated unique index to perform the select statement of our demo, but the efficiency results are not quite what we’re after:
SQL> select id, name from small where id = 42;
ID NAME ---------- ----- 42 BOWIE ----------------------------------- |Id| Operation | Name | ----------------------------------- | 0| SELECT STATEMENT | | |*1| INDEX RANGE SCAN| SMALL_UK_I| ----------------------------------- Statistics ------------------------------------------- 0 recursive calls 0 db block gets 2 consistent gets 0 physical reads 0 redo size 465 bytes sent via SQL*Net to client 396 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
We first note Oracle is indeed using the concatenated, unique SMALL_UK_I index as it can retrieve all the necessary data from the query directly from the index.
However, also note the CBO is performing an index range scan, not a unique scan as only a portion of the index (the ID) is specified in the WHERE clause. Oracle doesn’t pick up the fact the select operation is actually unique as not all columns within the SMALL_UK_I unique index used by the CBO are specified in the WHERE clause. This despite the fact the ID is actually the defined PK of the table.
Therefore, Oracle is still performing 2 consistent get operations as there may be more rows to retrieve after performing the first fetch within the SQL*PLUS session. Also, if we examined the types of consistent reads being performed, we would note that neither of them are consistent get – examinations.
So we’re really not that far ahead of just using the unique index on the ID column as we did in Part V of this series. We still require 2 consistent gets (although neither of them are now examinations) and we’re having to store the ID in three separate locations for our trouble, rather than two.
Wouldn’t it be nice if we could have a PK index on just the ID column, but somehow add the NAME column (or any other columns of interest) to the index structure so that we only need to visit the index structure, thereby storing the ID in only the one index. Then we could potentially access the data with just one consistent get and with it being a unique index, it would be a consistent get examination requiring only the one latch access.
Hell, wouldn’t it be nice if we didn’t even bother with the table segment at all as all queries of interest would never actually need to access and use the table segment anyways, thereby storing the PK in possibly just the one location.
Such a solution has of course been possible for a long time …
Indexes And Small Tables Part V (It’s No Game) May 13, 2009
Posted by Richard Foote in Index Internals, Oracle Indexes, Small Indexes, Unique Indexes.10 comments
So far in our little example, we’ve looked at how accessing a row of a one block table via a FTS required 4 consistent gets while accessing this same table via a Non-unique index reduced the consistent gets down to 3.
Time to take the next step and improve the efficiency yet further of accessing this small one block table.
We’re now going to replace the Non-unique index with a Unique Index instead. We can obviously do this because the values on the indexed ID column are indeed unique.
Now it’s always a good idea of course to document these table business rules (such as a column being unique) inside the database, however it’s somewhat alarming just how many application just don’t this. I’ve also previously discussed how a PK or Unique constraint can actually be policed via a Non-Unique index so there are many reasons why a small table might not have an associated Unique index.
Not least of course the incorrect perception that an index is not going to be much use on a small table anyways …
So let’s now replace the Non-Unique index with a Unique index instead:
SQL> drop index small_id_i;
Index dropped.
SQL> alter table small add primary key (id) using index (create unique index small_id_i on small(id));
Table altered.
OK, so now we have our Unique index in place. Let’s now run the same query again to see how our consistent gets related statistics might change:
SQL> select * from small where id = 42;
ID NAME ---------- ----- 42 BOWIE -------------------------------------------- |Id |Operation |Name | -------------------------------------------- | 1|TABLE ACCESS BY INDEX ROWID|SMALL | |* 2| INDEX UNIQUE SCAN |SMALL_ID_I| -------------------------------------------- Statistics -------------------------------------------- 0 recursive calls 0 db block gets 2 consistent gets 0 physical reads 0 redo size 465 bytes sent via SQL*Net to client 396 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
We notice our first significant change. The number of consistent gets has reduced further down to just 2.
Why ?
Because with a Unique index, there can only be a maximum of 1 row returned. It’s simply not possible to return 2 or more rows.
Therefore, when selecting this one row, Oracle doesn’t have to perform the second fetch operation to confirm there are indeed no more rows to return. The first fetch will either return the one row of interest or none at all when resolving an equality predicate. That’s it, there are no other possibilities. We return the row of interest by simply accessing the index block (1 consistent get) followed by the table block (the second consistent get).
As we have previously, if we look at the actual consistent gets statistics of interest by running the following query in another session before/after the select statement:
SQL> SELECT n.name, s.value FROM v$sesstat s, v$statname n WHERE s.statistic# =n.statistic# AND s.sid = 141 AND n.name LIKE ‘consistent%’;
NAME VALUE ------------------------------ ----- consistent gets 31236 consistent gets - examination 5084
And again afterwards ..
SQL> SELECT n.name, s.value FROM v$sesstat s, v$statname n WHERE s.statistic# =n.statistic# AND s.sid = 141 AND n.name LIKE ‘consistent%’;
NAME VALUE ------------------------------ ----- consistent gets 31238 (+2) consistent gets - examination 5086 (+2)
We notice we have indeed only performed the 2 consistent gets. But we also notice another significant difference, that being both consistent gets are now the “cheaper” consistent gets – examination.
This means that the latches required to now perform this select statement via the Unique index is just 2, down from 6 for the Non-unique index and 8 from the FTS.
Generally during a consistent get, Oracle needs to grab the cache buffers chain latch so it can pin the specific block in memory and then grab the latch again so that it can subsequently unpin the block once it’s finished processing the block. Each of these accesses to the latch and the subsequently pin/unpinning of the block requires CPU and is a possible source of contention within the database.
For some operations that only require a very very quick “read and get out of there” type operation and/or on blocks that are unlikely to change within a given point of time, Oracle uses a cheaper consistent get operation which doesn’t actually require the block to be pinned. There’s no point in pinning the block as it’s only going to be read and accessed for a short time (shorter than might otherwise be required when processing a block in memory) and the block is unlikely to change anyways.
So for these operations, Oracle uses a cheaper consistent get called a consistent gets – examination. These consistent gets examinations only need grab the cache buffers chain latch before quickly reading the block and releasing the latch once the read operation is complete. Therefore it only needs to grab and release the cache buffer chains latch the once without having to pin/unpin the block, which means less CPU and less latch contention overall.
Now this isn’t particularly well documented. Often discussions mention reads of undo blocks as being candidates for consistent gets examinations as these reads are likely to be both relatively quick and a specific undo block is unlikely to change as only one transaction can actually update an undo block at a given time.
Getting back to indexes, reads of index root blocks are another candidate mentioned as again a read of an index root block is going to be very quick and an index root block is unlikely to change at a given point of time.
However, what is not well documented at all is the fact that any block accessed during an index Unique Scan is accessed via a consistent get – examination, including the consistent get associated with reading the table block as well. This is because again, any such read operation is going to be relatively quick as the most that ever needs to be read is the one index related entry and the one table row.
The net result is that now accessing a row from a small table via a Unique index requires only 2 latch accesses vs. the initial FTS example which required 8 latch gets as none of the FTS consistent gets are examinations.
Now you might say that these are all very small numbers, that 4 consistent reads isn’t that much, that 8 latches isn’t really that huge a number and reducing 8 latches down to 2 latches isn’t really going to be that noticeable. Yes it is effectively a 75% reduction but it’s a 75% reduction of not very much.
And if you’re discussing a single read of a single small lookup table you would likely be right.
But what if the small table is accessed frequently by the application, perhaps many 1,000s of times per minute. What if you have many such small tables, often used in small join operations by your OLTP applications. What if you have large numbers of users in a large application with many many such small table accesses. This effectively 75% saving can potentially become very significant, both in terms of the reduction in CPU load and also in the reduction of latch contention, which in turn can further reduce CPU loads.
A small improvement multiplied by a large amount can indeed make a difference …
However, I have one more step to go yet in further improving the efficiency of these small table lookups via an index.
One which can reduce the overall overheads by yet another 50% …
Differences Between Unique and Non-Unique Indexes Part 4.5 (Fix You) March 30, 2009
Posted by Richard Foote in Fragmented Indexes, Index Internals, Non-Unique Indexes, Oracle Indexes, Unique Indexes.5 comments
In my last post, Part IV in this series, we looked at how a Unique Index can reuse space deleted within the same logical transaction, whereas a Non-Unique Index can not. Deleted space within a Non-Unique index can only be reused by subsequent transactions.
It’s sometimes important to appreciate this distinction because as discussed in the various OTN and Ask Tom threads mentioned in Part IV, there are times when this can make a significant difference to the manageability and efficiency of the resultant index.
Now, it’s not everyday someone might for example delete all rows in a table and repopulate it again within a single transaction (the TRUNCATE command was of course developed for a reason). However, perhaps an application was developed without your involvement, perhaps a large proportion but not all of the data is being deleted or as someone mentioned on OTN, perhaps the table in question is a Materialized View being fully refreshed within a refresh group. There could therefore be occasions when a single transaction might indeed perform a large delete followed by a similarly sized insert.
In which case, whether an index is defined as Unique or Non-Unique might make a difference …
To begin with, let’s populate a table with 1M rows and create an associated Unique index:
SQL> insert into bowie select rownum, ‘BOWIE’ from dual connect by level <=1000000;
1000000 rows created.
SQL> commit;
Commit complete.
SQL> create unique index bowie_idx on bowie(id);
Index created.
Let’s look at the size of this newly created Unique index:
SQL> analyze index bowie_idx validate structure;
Index analyzed.
SQL> select blocks, lf_blks, del_lf_rows from index_stats;
BLOCKS LF_BLKS DEL_LF_ROWS ---------- ---------- ----------- 2176 2087 0
OK, let’s now delete the entire table and repopulate it again, within the same logical transaction …
SQL> delete bowie;
1000000 rows deleted.
SQL> insert into bowie select rownum, ‘PINK FLOYD’ from dual connect by level <= 1000000;
1000000 rows created.
SQL> commit;
Commit complete.
Let’s look at the size difference for the Unique Index and see how many deleted index entries we have as a result:
SQL> analyze index bowie_idx validate structure;
Index analyzed.
SQL> select blocks, lf_blks, del_lf_rows from index_stats;
BLOCKS LF_BLKS DEL_LF_ROWS ---------- ---------- ----------- 2176 2087 0
OK good, the index is actually identical in size and we have no deleted entries, not a one. All the deleted entries as a result of the delete command have been reused by the subsequent insert statement. This means of course that the index is just as efficient now after all this DML activity, as it was when the index was first created.
Let’s perform exactly the same demo, but this time with a Non-Unique index and see any differences …
SQL> drop table bowie;
Table dropped.
SQL> create table bowie (id number, name varchar2(20));
Table created.
SQL> insert into bowie select rownum, ‘BOWIE’ from dual connect by level <=1000000;
1000000 rows created.
SQL> commit;
Commit complete.
SQL> create index bowie_idx on bowie(id);
Index created.
SQL> analyze index bowie_idx validate structure;
Index analyzed.
SQL> select blocks, lf_rows, lf_blks, del_lf_rows from index_stats;
BLOCKS LF_BLKS DEL_LF_ROWS ---------- ---------- ----------- 2304 2226 0
The first difference we notice is that the Non-Unique index after it has just been created is somewhat larger than the equvalent Unique index (2226 leaf blocks vs. 2087 leaf blocks). This is a direct result of the Non-Unique index having to store an extra byte for the length byte associated with the rowid being an additional index column for each and every one of the 1M index entries.
SQL> delete bowie;
1000000 rows deleted.
SQL> insert into bowie select rownum, ‘PINK FLOYD’ from dual connect by level <= 1000000;
1000000 rows created.
SQL> commit;
Commit complete.
SQL> analyze index bowie_idx validate structure;
Index analyzed.
SQL> select blocks, lf_blks, del_lf_rows from index_stats;
BLOCKS LF_BLKS DEL_LF_ROWS ---------- ---------- ----------- 4608 4518 1000000
OK not quite so good, big difference here. Previously, the Unique Index remained unchanged and had no deleted index entries. However, the Non-Unique index is now effectively double the size it was previously and has 1M deleted index entries still within the index structure. Not a one was recycled and reused within the logical transaction.
This index is now potentially problematic, especially if there are going to be no or few subsequent inserts until it next gets refreshed, where the deleted entries can be reused but the current entries may again remain in the index after they’ve been deleted.
Again, it’s important to understand what is going on here so one can take the appropriate adminstration steps. Perhaps it might be better to drop the index and recreate it after the transaction (if permitted). Perhaps the truncate command isn’t such a bad idea after all (if permitted). Perhaps it might be better to police the Unique constraint with a Unique rather than a Non-Unique index after all.
Perhaps, it might be better to not perform the above within a single transaction and issue an intermediate commit after all (if permitted) …
SQL> drop table bowie;
Table dropped.
SQL> create table bowie (id number, name varchar2(20));
Table created.
SQL> insert into bowie select rownum, ‘BOWIE’ from dual connect by level <=1000000;
1000000 rows created.
SQL> commit;
Commit complete.
SQL> create index bowie_idx on bowie(id);
Index created.
SQL> analyze index bowie_idx validate structure;
Index analyzed.
SQL> select blocks, lf_rows, lf_blks, del_lf_rows from index_stats;
BLOCKS LF_BLKS DEL_LF_ROWS ---------- ---------- ----------- 2304 2226 0
SQL> delete bowie;
1000000 rows deleted.
Because if we just issue the commit at this point in the process …
SQL> commit;
Commit complete.
SQL> insert into bowie select rownum, ‘PINK FLOYD’ from dual connect by level <= 1000000;
1000000 rows created.
SQL> commit;
Commit complete.
SQL> analyze index bowie_idx validate structure;
Index analyzed.
SQL> select blocks, lf_rows, lf_blks, del_lf_rows from index_stats;
BLOCKS LF_BLKS DEL_LF_ROWS ---------- ---------- ----------- 2304 2226 0
We would not have this problem as the subsequent transaction that performs the insert can reused all the deleted space associated with the first delete transaction.
If one understands how indexes work and understands how deleted space can be reused, one can prevent many potential issues and unnecessary maintenance tasks.
Prevention is always the best cure …
Differences Between Unique and Non-Unique Indexes Part IV (Take It Back) March 25, 2009
Posted by Richard Foote in Index Internals, Non-Unique Indexes, Oracle Indexes, Unique Indexes.11 comments
I’ve previously discussed various differences between Unique and Non-Unique indexes (Part I, Part II and Part III) and why I have a preference to implement Unique indexes whenever possible and practical.
Various recent discussions on the OTN forums and on Ask Tom reminded me that I hadn’t yet discussed on this blog another subtle, but potentially significant difference between Unique and Non-Unique indexes.
As I’ve previously discussed, there’s actually no such thing as a Non-Unique index entry as such as Oracle ensures all index entries are effectively unique by adding the rowid to the index key for all Non-Unique indexes. Fundamentally, this is essential because Oracle needs some way of efficiently finding the precise index entry associated with an update or delete operation. Without having the rowid as part of the index key, Oracle would be forced to navigate to the first occurrence of an index value and search through all occurrences of the index value until it finds the specific entry containing the rowid of interest. This could potentially result in visiting many leaf blocks if the index value spans multiple leaf blocks. By including the rowid as the last index key column, non-unique index values are further ordered based on the corresponding rowid within the same indexed values. Oracle can therefore always navigate directly to the leaf block containing the exact index entry of interest as the rowid can be included in the branch blocks to determine both the index entry and rowid ranges found in specific leaf blocks.
If we look at a simple example by creating a one row table with a Non-unique index:
SQL> create table bowie (id number, name varchar2(20));
Table created.
SQL> insert into bowie values (1, ‘BOWIE’);
1 row created.
SQL> commit;
Commit complete.
SQL> create index bowie_idx on bowie(id);
Index created.
SQL> select header_file, header_block from dba_segments where segment_name = ‘BOWIE_IDX’;
HEADER_FILE HEADER_BLOCK ----------- ------------ 7 124937
Let’s dump the index block …
SQL> alter system dump datafile 7 block 124938;
System altered.
Leaf block dump
===============
header address 425713756=0x195fe05c
kdxcolev 0
KDXCOLEV Flags = – - -
kdxcolok 0
kdxcoopc 0×80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 1
kdxcofbo 38=0×26
kdxcofeo 8024=0x1f58
kdxcoavs 7986
kdxlespl 0
kdxlende 0
kdxlenxt 0=0×0
kdxleprv 0=0×0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: ——, lock: 0, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 c1 e7 8a 00 00
Notice how the rowid is an additional index column within the index entry for the Non-Unique index.
Now if we were to delete and subsequently re-insert a row in the table with same index value within a single transaction, note the rowid of the new row by definition will differ from the deleted row. Therefore, we would need a different index entry for the new index row because if the rowids differ, then the associated index entries must differ as well. Note also (and this is critical) that because we would have a different rowid, if we had multiple index entries with the same key, this new index entry might not be in the same logical order as that of the deleted index entry. In fact, it’s quite possible that the new index entry might actually need to be stored in a totally different leaf block if this specific index value spanned multiple index leaf blocks because the index entries, including the rowids must always be logically ordered.
Therefore, if we were to delete and re-insert the same index value within a single transaction, Oracle is forced to create a new index entry and will not reuse the existing, deleted index entry.
So continuing with the demo, let’s delete the row:
SQL> delete bowie;
1 row deleted.
and now re-insert a row with the same indexed value within the same transaction:
SQL> insert into bowie values (1, ‘THIN WHITE DUKE’);
1 row created.
SQL> commit;
Commit complete.
If we look at a block dump now …
Leaf block dump
===============
header address 425713756=0x195fe05c
kdxcolev 0
KDXCOLEV Flags = – - -
kdxcolok 0
kdxcoopc 0×80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0×28
kdxcofeo 8012=0x1f4c
kdxcoavs 7972
kdxlespl 0
kdxlende 1
kdxlenxt 0=0×0
kdxleprv 0=0×0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: —D–, lock: 2, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 c1 e7 8a 00 00
row#1[8012] flag: ——, lock: 2, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 c1 e7 8a 00 01
—– end of leaf block dump —–
We notice the previous index entry has been logically deleted and Oracle has created a new index entry with the new associated rowid.
Let’s now run exactly the same demo again, but this time with a Unique index instead of the Non-Unique index …
SQL> drop table bowie;
Table dropped.
SQL> create table bowie (id number, name varchar2(20));
Table created.
SQL> insert into bowie values (1, ‘BOWIE’);
1 row created.
SQL> commit;
Commit complete.
SQL> create unique index bowie_idx on bowie(id);
Index created.
SQL> select header_file, header_block from dba_segments where segment_name = ‘BOWIE_IDX’;
HEADER_FILE HEADER_BLOCK ----------- ------------ 7 125193
SQL> alter system dump datafile 7 block 125194;
System altered.
Leaf block dump
===============
header address 371859548=0x162a205c
kdxcolev 0
KDXCOLEV Flags = – - -
kdxcolok 0
kdxcoopc 0×80: opcode=0: iot flags=— is converted=Y
kdxconco 1
kdxcosdc 0
kdxconro 1
kdxcofbo 38=0×26
kdxcofeo 8025=0x1f59
kdxcoavs 7987
kdxlespl 0
kdxlende 0
kdxlenxt 0=0×0
kdxleprv 0=0×0
kdxledsz 6
kdxlebksz 8036
row#0[8025] flag: ——, lock: 0, len=11, data:(6): 01 c1 e8 8a 00 00
col 0; len 2; (2): c1 02
—– end of leaf block dump —–
Notice the big difference here. Because the index has been defined as Unique, all the associated index entries must be unique. It’s simply not possible to have duplicate index entries within a Unique index structure. Therefore, it’s not necessary to have the rowid as a separate column of the index entry as the index values themselves are sufficient to uniquely identify each and every index entry. The rowid is basically just another piece of overhead associated with the index entry rather than a separate index column. The length of this unique index entry is just 11 bytes, where it was previously 12 bytes, because we no longer need to store the length byte associated with the second index column necessary in the Non-unique index for the rowid.
And now comes the subtle difference …
If we were to now delete and re-insert the same index value within a single transaction, Oracle can now reuse the same, deleted index entry, because the index entry is effectively identical to the deleted one. The only possible difference is the rowid but the rowid is no longer a part of the index column list and so can just be updated as necessary. Note also (and this is the critical bit for Unique indexes), because the actual index value remains the same, the order of the index entry within the index must also remain the same. There is no need to move the re-inserted index entry to another part of the index structure because deleting and re-inserting the same index entry does not logically alter the order of where the index entry must reside.
Therefore, if we were to delete and re-insert the same index value within a single transaction, Oracle does not need to create a new index entry and can simply reuse the existing, deleted index entry.
So continuing with the demo, let’s delete the row:
SQL> delete bowie;
1 row deleted.
and now re-insert a row with the same indexed value within the same transaction:
SQL> insert into bowie values (1, ‘THIN WHITE DUKE’);
1 row created.
SQL> commit;
Commit complete.
If we look at a block dump now …
Leaf block dump
===============
header address 371859548=0x162a205c
kdxcolev 0
KDXCOLEV Flags = – - -
kdxcolok 0
kdxcoopc 0×80: opcode=0: iot flags=— is converted=Y
kdxconco 1
kdxcosdc 0
kdxconro 1
kdxcofbo 38=0×26
kdxcofeo 8025=0x1f59
kdxcoavs 7987
kdxlespl 0
kdxlende 0
kdxlenxt 0=0×0
kdxleprv 0=0×0
kdxledsz 6
kdxlebksz 8036
row#0[8025] flag: ——, lock: 2, len=11, data:(6): 01 c1 e8 8a 00 01
col 0; len 2; (2): c1 02
—– end of leaf block dump —–
We note that Oracle has indeed reused the previously deleted index entry and has simply updated the rowid with the new rowid value. There is no deleted index entry, Oracle has simply changed the associated rowid to that of the new row in the table for the existing Unique index entry.
Where an Update of an index entry is actually effectively a delete and an insert of an index entry, notice that by contrast for Unique indexes, a delete and a re-insert operation is effectively an update of an index entry !!
In my next post I’ll highlight how this difference can be critical to the behaviour and efficiency of an index and why it’s important to understand how indexes work to avoid and prevent potential issues.
NOVALIDATE Constraints Part II – Does It Matter ? July 30, 2008
Posted by Richard Foote in Constraints, Novalidate Constraints, Oracle Cost Based Optimizer, Oracle Indexes, Primary Key, Unique Indexes.13 comments
As promised, more on NOVALIDATE constraints.
As previously discussed, a Primary Key or a Unique Key constraint that’s not validated is by default policed by a Unique index regardless. If there are no duplicate values, then no worries (yes, I’m Australian), Oracle will create the Unique index and enable the constraint in a NOVALIDATE state. If there are duplicate values, Oracle will complain about creating the Unique Index to police the constraint and you must either explicitly create a Non-Unique index when creating the constraint or use an existing Non-Unique index.
So what are the implications, if any, of having a Primary key constraint in a NOVALIDATE state, especially if a Unqiue index is used to police the constraint ? The data must really be unique else the Unique Index could not have been created, right ? Also Oracle can take advantage of all the benefits associated with having a Unique index such as less consistent reads and latching overheads as previously discussed.
Following on from the demo in Part I, if we have a table with a Primary Key in a NOVALIDATE state, policed by a Unique Index:
SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) ENABLE NOVALIDATE;
Table altered.
SQL> SELECT constraint_name, validated, uniqueness
FROM user_constraints c, user_indexes i
WHERE c.constraint_name = i.index_name AND c.table_name= ‘ZIGGY’;
CONSTRAINT_NAME VALIDATED UNIQUENES --------------- ------------- --------- ZIGGY_PK NOT VALIDATED UNIQUE
Oracle will be able to use the Unique index to perform a nice, efficient, low latching Unique Scan with the index:
SQL> SELECT * FROM ziggy WHERE id = 42;
------------------------------------------ |Id|Operation |Name | ------------------------------------------ | 0|SELECT STATEMENT | | | 1| TABLE ACCESS BY INDEX ROWID|ZIGGY | |*2| INDEX UNIQUE SCAN |ZIGGY_PK| ------------------------------------------
Everything’s perfect regardless of the PK constraint not being validated, right ?
Well, not exactly.
Remember, a PK constraint requires the data to be Unique AND Not Null. Now the Unique Index guarantees the data is indeed unique but it does nothing to protect us from having possible NULL values in our data. The index will simply ignore and not index any index entries that are fully NULL, therefore the PK column(s) could potentially, just maybe, contain NULLS. Brian Tkatch in a comment in Part I has a nice example of how this is possible.
This mean Oracle can not guarantee the index has index entries for every row in the table as any rows with a NULL PK will not be indexed. This can have serious reprecussions for the CBO when deciding an optimal execution plan.
For example, a query such as the following COUNT(*) query which could potentially be serviced via a “smaller” PK index segment can not use the Unique index and is forced to use either another index or a Full Table Scan:
SQL> select count(*) from ziggy;
--------------------------------- | Id| Operation | Name | --------------------------------- | 0| SELECT STATEMENT | | | 1| SORT AGGREGATE | | | 2| TABLE ACCESS FULL| ZIGGY| ---------------------------------
Another example, this query with an ORDER BY clause could potentially use the Unique index to retrieve the data and so avoid the sort operation as the Clustering Factor of the index is very good. However, it can’t as again, the CBO can’t guarantee all data will be retrieved via the index:
SQL> select * from ziggy order by id;
10000 rows selected.
--------------------------------- | Id| Operation | Name | --------------------------------- | 0| SELECT STATEMENT | | | 1| SORT ORDER BY | | | 2| TABLE ACCESS FULL| ZIGGY| ---------------------------------
However, if only we just validate the constraint, everything changes:
SQL> ALTER TABLE ziggy ENABLE VALIDATE PRIMARY KEY;
Table altered.
The COUNT(*) query suddenly starts using the index as a cheaper alternative as now, there can’t be any null values and so the index must reference all possible rows:
SQL> select count(*) from ziggy;
------------------------------------- |Id|Operation | Name | ------------------------------------- | 0|SELECT STATEMENT | | | 1| SORT AGGREGATE | | | 2| INDEX FAST FULL SCAN| ZIGGY_PK| -------------------------------------
The ORDER BY query suddenly starts using the index and avoids performing the sort operation as again, the index will now guarantee all rows are returned in a sorted order:
SQL> select * from ziggy order by id;
10000 rows selected.
------------------------------------------ |Id|Operation |Name | ------------------------------------------ | 0|SELECT STATEMENT | | | 1| TABLE ACCESS BY INDEX ROWID|ZIGGY | | 2| INDEX FULL SCAN |ZIGGY_PK| ------------------------------------------
The moral of the story. Provide the CBO with as much information as possible, as it can potentially use the information to determine a more optimal execution plan. Having a NOVALIDATE constraint possibly hides valuable information from the CBO and so needs to be used with caution.
NOVALIDATE Constraints – No really … July 28, 2008
Posted by Richard Foote in Constraints, Indexing Tricks, Novalidate Constraints, Oracle Indexes, Primary Key, Unique Indexes.18 comments
There have been a number of posts recently on the OTN database forum regarding the whole topic of NOVALIDATE of constraints and the associated indexes so I thought it might be worth going over a couple of interesting little quirks with all this.
A NOVALIDATE constraint is basically a constraint which can be enabled but for which Oracle will not check the existing data to determine whether there might be data that currently violates the constraint.
This is useful if we know there’s data that violates the constraint but we want to quickly put on a constraint to prevent further violations, with the intention to clean up any possible violations at some future point in time.
It’s also potentially useful if we know the data is clean and so want to prevent the potentially significant overheads of Oracle having to check all the data to ensure there are indeed no violations.
I previously discussed the use of Non-Unique Indexes for manageing Primary and Unique Key Constraints but there are a few little traps one can easily fall into if one doesn’t understand these two very important fundamentals:
- By default, Oracle will attempt to create a Unique Index to police a PK or UK constraint
- A NOVALIDATE constraint requires a Non-Unique Index for the constraint to really be “Novalidated”
Get these two concepts confused and things can easily get a little difficult to follow …
Here’s a little example of how things can start to get confusing. First, let’s create a simple little table and populate it with a few rows.
SQL> CREATE TABLE ZIGGY (id NUMBER, text VARCHAR2(20));
Table created.
SQL> INSERT INTO ziggy SELECT rownum , ‘Ziggy’ FROM dual CONNECT BY LEVEL <= 10000;
10000 rows created.
Note that the ID column is populated with unique values. However, let’s now introduce a duplicate value, 42:
SQL> INSERT INTO ziggy VALUES (42, ‘DUPLICATE’);
1 row created.
SQL> COMMIT;
Commit complete.
OK, we now want to add a Primary Key to this table but because we suspect there might be some duplicate values which we intend to clean up at some future point in time, we want to create the constraint with NOVALIDATE:
SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) ENABLE NOVALIDATE;
ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) ENABLE NOVALIDATE
*
ERROR at line 1:
ORA-02437: cannot validate (BOWIE.ZIGGY_PK) – primary key violated
Now what the hell is going on here ?
We clearly stated we want to create a NOVALIDATE constraint but Oracle appears to be ignoring this and is validating the constraint regardless and so generating an error because of the duplicate entry.
Why ?
Because by default Oracle will attempt to create a Unique index when creating a PK constraint. A Unique index MUST always contain unique values and so complains when it stumbles across our duplicate 42 ID value. The constraint is being effectively validated because the unique index will only be created providing there are indeed no duplicate values.
Not how I would have designed things but there you go …
However, if we either have an existing Non-Unique index which Oracle can use or we explicitly create a Non-Unique index, then we can proceed with creating the NOVALIDATE constraint as required:
SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id)
USING INDEX(CREATE INDEX ziggy_pk ON ziggy(id)) ENABLE NOVALIDATE;
Table altered.
If we look at the status of the constraint and the type of index used to police the constraint, we notice that the index is indeed a Non-Unique index and the constraint has not been validated:
SQL> SELECT constraint_name, validated, uniqueness
FROM user_constraints c, user_indexes i
WHERE c.constraint_name = i.index_name AND c.table_name= ‘ZIGGY’;
CONSTRAINT_NAME VALIDATED UNIQUENES --------------- ------------- --------- ZIGGY_PK NOT VALIDATED NONUNIQUE
We have a PK constraint even though there are currently duplicate values of the PK column in the data.
OK, let’s now drop and the constraint, the Unique Index and delete the duplicate row:
SQL> ALTER TABLE ziggy DROP PRIMARY KEY;
Table altered.
SQL> DROP INDEX ZIGGY_PK;
Index dropped.
SQL> DELETE ziggy WHERE id = 42 and rownum <= 1;
1 row deleted.
SQL> COMMIT;
Commit complete.
The data is now clean and we have no existing constraint or index on the ID column:
SQL> SELECT constraint_name, validated FROM user_constraints WHERE table_name= ‘ZIGGY’;
no rows selected
Let’s now do something that based on our understanding might appear to be a little odd, let’s try and recreate the constraint in a NOVALIDATE state but with a Unique index. This of course should now work as there are indeed no duplicates within the data:
SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) ENABLE NOVALIDATE;
Table altered.
Success !! Let’s now look at the state of the constraint and the type of index created:
SQL> SELECT constraint_name, validated, uniqueness
FROM user_constraints c, user_indexes i
WHERE c.constraint_name = i.index_name AND c.table_name= ‘ZIGGY’;
CONSTRAINT_NAME VALIDATED UNIQUENES --------------- ------------- --------- ZIGGY_PK NOT VALIDATED UNIQUE
As expected, we have a constraint that’s policed by a Unique index that has not been validated.
This might appear be a little odd, because the question you might well now ask is why bother no validating a constraint that has effectively been validated anyways as the use of the Unique index has guaranteed there can not possibly be any duplicate values else the creation of the Unique index would have failed ?
We effectively have a validated constraint which Oracle is still classifying as being not validated
Then again, maybe not …
More later.
Differences between Unique and Non-Unique Indexes (Part III) December 30, 2007
Posted by Richard Foote in Constraints, Index Internals, Oracle Indexes, Performance Tuning, Unique Indexes.5 comments
A comment by Robert in Part II of this series reminded me of another subtle difference between Unique and Non-Unique Indexes. Now this difference is likely to be of minimal consequence to most applications as most applications don’t generally have problems with Primary Key (PK) or Unique Key (UK) constraint violations (and if they do, this is likely to be the least of their worries). But it’s a interesting difference nonetheless, something to keep in the back of your mind and a little tit-bit to end the year on.
When a row is inserted into a table or when a PK or UK is modified, Oracle of course needs to ensure that either the PK or UK constraint is not violated. If the constraint is policed via a Unique index, as previously discussed, Oracle knows the value must and can only ever be unique and so performs the constraint validation before the Unique index is actually modified. If the PK or UK is violated, the Unique index can not possibly have been changed as all the associated index entries must always be unique and so only the undo (and redo) of the changes associated with the table data blocks are actually generated and need to be subsequently rolled back.
However, if the PK or UK constraint is policed via a Non-Unique index, the mechanism for applying the changes differs somewhat. As the index is Non-Unique, as previously discussed, Oracle is not quite so certain as to the state of play and performs the constraint validation after the associated changes are made to the Non Unique index. If the PK or UK constraint is violated, both undo and redo of the Non-Unique index has been generated and both changes to the table data blocks and the index blocks need to be rolled back.
This means there’s an extra cost associated with violating a constraint if the constraint is policed via a Non-Unique Index vs. a Unique index. When performing media recovery, it also means that there’s an additional cost associated with performing the recovery. Obviously the more frequent the constraint violations, the greater the overall penalties. Also, the larger the PK or UK values, the greater the penalties.
See this little demo to illustrate the differences between a Unique and a Non-Unique index in the redo and undo generated when a constraint is violated: Difference in redo and undo between a Unique and a Non-Unique Index.
As mentioned, this difference in behaviour between Unique and Non-Unique Indexes is unlikely to be an issue. However, in applications or environments where there may be a significant number of such violations, it may be something to keep in the back of your mind.
For a more detailed discussion and where it could be an issue, see Eric Emrick’s presentation.
Differences between Unique and Non-Unique Indexes (Part II) December 21, 2007
Posted by Richard Foote in Index Access Path, Index Internals, Indexing Tricks, Oracle Cost Based Optimizer, Oracle Indexes, Primary Key, Unique Indexes.18 comments
The most significant difference between a Unique and a Non-Unique index is of course the simple fact that in one index, all index entries MUST be unique and in the other index there can be duplicates of an index entry.
Although an obvious distinction between the two, it’s also a crucial difference as well.
When Oracle uses a Unique Index to scan for a specific value (via an equality predicate on all indexed columns or when policing a constraint ), there can only be one of two possible results. The value can exist returning at the very most one value or the value doesn’t exist returning 0 values. That’s it, 1 row or none. The value either exists or it doesn’t.
This fact means Oracle doesn’t have to worry about a whole bunch of things when dealing with Unique indexes during equality or unique checking processes. It doesn’t have to check the next index entry just in case there’s a second or more entries with the same value. It doesn’t have to worry about the potential of having to skip across to the next leaf page if the specific value it reads happens to be the maximum value in the current leaf page. It doesn’t have to worry about pointers to these “adjacent” leaf blocks changing on it due to block splits. It doesn’t have to concern itself with potentially visiting more than the one table data block during the index access operation.
Life is simple, it’s either 1 row or none.
Not so for Non-Unique indexes. With a Non-Unique index, there are no such guarantees. With a Non-Unique index, there are 3 categories of possibilities. An index scan could return 0 rows, it could return 1 row or it could return more than one row. It could potentially need to go and visit more than the current leaf block to return all the matching rows. It could potentially need to go and visit more than one table block.
Life’s not quite so “simple” for a Non-Unique index.
Note also and most importantly that life gets no easier for a Non-Unique index that polices a PK or Unique key constraint.
Even though there’s a PK or Unique constraint on a column, to Oracle, it’s just another Non-Unique index with the same “vague” possibilities. Remember that PK and Unique constraints can be enabled with NOVALIDATE meaning that even with a PK or Unique constraint, there’s still the possibility of duplicate index entries in the Non-Unique index. Remember constraints can be DEFERRABLE, meaning that even with a PK or Unique constraint, there’s still the possibility of duplicate index entries in the Non-Unique index.
This means that Oracle has to concern itself with a number of additional overheads, including having to “check” the next index entry, “just in case” it matches the required index value. It has to concern itself even with the possibility of having to visit the next index leaf block, “just in case”.
You will note when Oracle performs an equality search using a Unique Index, Oracle will perform an “INDEX UNIQUE SCAN” because the index entries MUST be unique.
You will note however when Oracle performs an equality search using a Non-Unique index, even if there’s a PK or Unique constraint of the column(s), Oracle will perform an INDEX RANGE SCAN, because it needs to scan multiple index entries “just in case”.
So are there any actual implications as a result of any of this ?
Yes.
When Oracle actually reads an index and processes the associated blocks in the buffer cache(s), Oracle uses a number of latches. These latches are used primarily to “protect” memory structures from concurrent activity. Very simplistically, by grabbing a latch, Oracle effectively performs a “lock” on the associated memory structure, perform whatever activity needs to be performed and releases the latch. These latches get grabbed and released (hopefully) extremely quickly (order of 1/10s of ms), but it’s a non zero value.
The issue with latches is that they’re a point of serialisation. If two (or more) processes want a specific latch, one (or more) has to wait. Latches also burn CPU. Only a teensy weeny bit at a time but some CPU nonetheless. They burn CPU while acquiring the latch and if fail due to latch contention, while attempting again and again to acquire the latch. They also burn CPU while performing the specific operation necessary of the latch.
Basically, the more latches, the greater the potential for contention, the greater the potential for latch related wait activity and perhaps most important of all, more CPU is required. In busy systems, there can be massive numbers of latch events and the best way to tune these events is to reduce where possible the number of latches required by the database environment. It’s one of the key reasons we try and reduce LIOs in a database as much as possible, to reduce the latch and CPU load on the system.
Because of the differences highlighted between Unique and Non-Unique indexes, the number and manner of latches required between the two indexes differs. And it differs significantly …
In this little demo, Latch Differences Between Unique and Non-Unique Indexes Demo, we compare the latches required to read an identical table, using a 2 level index. The differences between the latch overheads of a Unique and a Non-Unique index are most interesting.
When using a Unique Index, Oracle required 3 consistent gets (one for the index root block, one for the leaf block and one for the table block). BUT, each consistent get was a consistent gets – examination, a special type of consistent get which only requires 1 latch (rather than the standard 2 latches).
So that’s a sum of 3 latches.
However, when using a Non-Unique index, Oracle required 4 consistent gets (one for the index root block, one for the leaf block, one for the table block and an additional one to recheck the leaf block for any duplicate index entries). BUT, only the 1 consistent read (for the index root block) was actually the “cheaper” consistent gets – examination, the other 3 were the more costly 2 latch variety.
So that’s a sum of 7 latches.
3 latches for the Unique index and 7 latches for the Non-Unique index.
That’s an increase of 133.3% in latches between the two types of indexes.
Now, the height of the index will change the ratio of latch difference between the two indexes. Also, in a busy system, there could potentially be differences in the types of latches used due to the current state or additional activity in a block.
However, the potential difference in latch requirements between a Unique or Non-Unique index can be very significant. But does a few additional latches here and there really make much of a difference ?
Well, of course it depends. On small scale systems with smaller loads, fewer indexes, fewer users and excess resources, the noticeable differences may be negligible.
However, in larger scale (especially OLTP) environments, a particular index may be accessed 100s or maybe 1000s of times a second. There may be 1000s of tables with 1000s of corresponding PK and Unique constraints policed by 1000s of Unique (or Non-Unique) indexes. It’s therefore not really of question of a few latches here or there. It’s a question of potentially a very significant proportion of overall latch related overheads.
Potentially when accessed, Non-Unique indexes could be generating double the latch related overheads for equality unique scan or unique checking index activity. Remember, the best way to tune latches and reduce latch contention is to simply reduce the requirement and load for latches.
The overall reduction in CPU and latch related wait activity could be significant between Unique and Non-Unique indexes because by using Non-Unique indexes you in the order of double the latches required for such activities.
Note also this doesn’t require any special parameters to be set or special tuning or monitoring by the DBA. It simply requires using Unique indexes to police PK or Unique constraints when there are no requirements of Non-Unique indexes. You then potentially gain a benefit each and every time the index is used for unique scan accesses.
Guess what type of access is extremely common in large scale OLTP environments …
The next time you complain about high CPU consumption or high latch contention and you’re tuned the application to death, just ask yourself how many Non-unique indexes are policing your PK or Unique Key constraints …
Local Index Issue With Partitioned PK and Unique Key Constraints December 20, 2007
Posted by Richard Foote in Constraints, Index Access Path, Local Indexes, Oracle Indexes, Partitioning, Performance Tuning, Unique Indexes.5 comments
Nuno Souto (Noons) also asked a really interesting question on my Differences between Unique and Non-Unique Indexes blog entry (comment 4) that I thought it worthy of a separate blog entry to do the answer justice. The question was:
“Isn’t it still the case that unique indexes cannot be locally partitioned unless the partition key is part of the index key? Not sure if 11g removes this. If still so, that would weigh heavily in favour of non-unique indexing for PK on a table potentially requiring local index partitions.”
Simplistically, the answer to the first part is Yes it is still the case, even in 11g and the answer to the second part is No, it wouldn’t weigh heavily in favour of non-unique indexing for PK on a table requiring local index partitions. It wouldn’t actually be a consideration at all.
Let me explain why.
Firstly, there is a really really good reason why Oracle doesn’t allow us to create a Unique Index in which the Partition key is not part of a Local Index. It’s called protecting us from ourselves !!
Let’s start by mentioning constraints again.
Remember, the main reason we have indexes policing PK and Unique constraints is so that Oracle can very quickly and efficiently determine whether or not a new value already exists. Do a quick index look-up, is the value there, yes or no, allow the insert (or update), yes or no.
Just imagine for one moment what would happen if Oracle actually allowed us to create a Unique Local index in which the index didn’t include the partitioned column(s).
Lets say a table is Range Partitioned on column ’A’ and we try and create a Unique Local index on just column ‘B’. Let’s assume we have (say) 500 table partitions meaning we must therefore have 500 local index partitions as well. When we insert a new value for our unique index for value B, it will attempt to do so in the corresponding local index partition as governed by the value A for the new row. However Oracle can’t just check this one index partition for uniqueness to ensure value of column B doesn’t already exist, Oracle would need to check all 500 index partitions because it would be possible for our new value of column B to potentially have previously been inserted into any of the other 499 partitions !!
Each and every insert into our partitioned table (partitioned by column A) therefore would require Oracle to check all (say)500 index partitions each and every time to check for duplicates of column B. Again, it’s important to understand that any given value of column B could potentially be in any of the 500 partitions, IF Oracle allowed us to create a Local Partitioned Index just on column B.
Checking all 500 index partitions looking for a specific value of column B would obviously be impractical, inefficient and totally un-scalable. Therefore Oracle doesn’t allow us to do this. It doesn’t allow us to create a Local index in which the indexed columns does’t include the partitioning columns as well.
This is actually a good thing.
If you want to create a Unique index in a partitioned table, you MUST either add all the partitioned columns and make it part of the LOCAL unique index (so that way each and every insert would only have to check the one local partition as this value is known now it’s part of the index) or you must create it as a GLOBAL index (in which again, Oracle only has to check the one index structure).
It actually makes a lot of sense to do this.
Moving onto the second part of the question. Let’s just use a Local Non-Unique index to police our PK constraints then.
Fortunately this isn’t allowed either for exactly the same reasons. You can’t create a Local Non-unique index to police a PK (or Unique) constraint if the Constraint does not also include the partitioned columns. Otherwise again, Oracle would need to check each and every index partition to determine whether the constraint has been violated or not.
If you attempt to use an existing Local Non-Unique index to police a PK or Unique constraint that does not contain the partitioned columns, you will get an error saying it can’t create the (by default Global index) because the useless Local Non-Unique index (from a policing the constraint point of view) already exists.
Again if you want to create a Non-Unique index to police a PK or Unique constraint you must either ensure the constraint includes all the partitioned columns in which case it can be Local or you must use a Global Non-Unique index.
In other words, the rules apply equally to both Unique and Non-Unique indexes.
So it’s not really a case of Oracle not allowing one to create a Local Unique index without including the partitioned columns (although that’s of course true) but really a case of Oracle not allowing a PK or Unique *constraint* to be policed via *any* Local index (whether Unique or Non-Unique), unless the partitioned columns are also included.
Little demo to illustrate: Local Index Issue With Partitioned PK and Unique Key Constraints
Differences between Unique and Non-Unique Indexes (Part I) December 18, 2007
Posted by Richard Foote in Constraints, Deferrable Constraints, Index Internals, Indexing Tricks, Novalidate Constraints, Oracle Indexes, Primary Key, Unique Indexes.27 comments
I’ve had a number of comments regarding my earlier blog entry where I recommended avoiding Deferrable and Novalidate constraints unless you need them and consider using Unique Indexes rather than Non-Unique Indexes where possible.
Why such a recommendation, aren’t Unique and Non-Unique indexes practically the same thing when it comes to policing constraints ?
Sure one index explicitly prevents the insertion of duplicates while the other doesn’t. Yes, dropping/disabling a constraint policed by an automatically created Unique index causes the index to be dropped if you forget the KEEP INDEX clause.
But that’s about it, right ?
Well, if you need a constraint to be deferrable, then you must create (either implicitly or explicitly) a Non-Unique index. If you want to enable a constraint with novalidate, then again you can only do so with a Non-Unique index in place policing the constraint.
It does all rather sound like Non-Unique indexes have all the advantages and allows for all the flexibility one could want. Non-Unique indexes allows for both deferrable and novalidate constraints, they don’t get dropped when the associated constraint is dropped / disabled and they can actually police both PK and Unique constraints.
What possible benefits are there in Unique Indexes ?
Well, providing you don’t need your constraints to be deferrable, you validate your constraints when they get created/enabled and you don’t go around dropping PK and/or Unique constraints on too regular a basis (or remember the KEEP INDEX clause if you don’t want your index dropped when you do), then there are a number of reasons why you may just want to consider using Unique indexes over Non-Unique indexes.
There are actually a number of key differences between Unique and Non-Unique indexes, both in the manner in which they’re stored by Oracle and in the manner in which they get processed.
In Part I, I’m just going to focus on the differences in how Oracle physically stores index entries.
In actual fact, there’s really no such thing as a Non-Unique index in Oracle. In order for Oracle to be able to determine the location of any specific index row entry and for Oracle to be able to determine an appropriate “order” for each index row entry, internally, Oracle coverts all Non-Unique indexes into a Unique index. It does this by using the associated ROWID of the index row entry as an additional “column”. As each ROWID is unique, this effectively makes all index entries in a Non-Unique index unique as well. Oracle uses the unique combination of the Non-Unique index value and the associated ROWID to then determine the appropriate order and hence appropriate location within the index structure in which to store the index row entry.
By Oracle making the ROWID an additional column, it also has to allocate an additional byte per index row entry in order to store the length of this column. That’s one teeny weeny little byte extra for each and every index row entry.
So what ?
Well, for indexes that don’t have a particularly large index key length, that one byte can be a significant proportion of the overall key length. Now Oracle needs to allocate 2 byes per row entry for various flags and locking information, it requires 6 bytes for the rowid and 1 byte for each column entry. That’s 9 bytes minimum plus the length of the indexed value itself.
Well how large is a typical unique index entry? Well that of course all depends and some PK / (and especially) Unique values can be quite large. But many many PK values are simply sequenced based numerical values, created nice and small so as to reduce overheads when stored in dependent child tables.
But can it really make any noticeable difference ?
Well, this little demo shows two tables with 1 million numeric PK values: Compare internal index storage between Unique and Non-Unique Indexes
Table test1 is created with a Non-Unique Index, table test2 is created with a Unique Index. The demo shows a partial block dump of a leaf block from each index, highlighting how the Non-Unique index requires an additional byte per index row entry.
The Unique index manages to hold 533 leaf entries in the block while the Non-Unique index could only hold 500. Comparing the total sizes of the two indexes, the Unique index required 1875 leaf blocks while the Non-Unique index required 1999 leaf blocks.
That’s an increase of approximately 6.6% in leaf blocks required for the Non-Unique index to store exactly the same number of index entries as the Unique Index (in this particular example).
That’s 6.6% less storage, that’s a reduction of 6.6% in block splitting and block allocations, that’s a reduction of 6.6% in the cost of full index scans, that’s 6.6% less memory required to cache the index, etc. etc.
The point here is that these savings don’t require any expensive, periodic rebuilding of indexes. They doesn’t require any additional fancy scripts or additional monitoring and processing. The DBA doesn’t have to calculate irrelevant statistics or demand scheduled outages to claim these savings.
This a getting more “dollars for your buck” freebie from Oracle purely and simply by using a Unique index instead of an Non-Unique index.
Note also that not one or two but ALL of your numeric based PKs have the potential to get these types of savings. Obviously the larger the actual PK or Unique key values, the lesser a byte is in proportion to the overall key length and the less percentage savings.
But it’s not a bad payback for many many of your indexes, purely and simply by using Unique indexes instead of Non-unique indexes where possible …
This is but one of the benefits of using Unique Indexes. More (potentially significant) advantages to follow …
