jump to navigation

12c Indexing Extended Data Types Part I (A Big Hurt) September 12, 2013

Posted by Richard Foote in 12c, Extended Data Types, Function Based Indexes, Oracle Indexes, Unique Indexes.
10 comments

The maximum size for VARCHAR2, NVARCHAR and RAW columns has been extended to 32767 bytes with the Oracle 12c Database. However, indexing such columns with standard indexes comes with some challenges.

These extended data types are not enabled by default within the database but can easily be done so by following these steps:

  1. Restart the database in UPGRADE mode
  2. Change the setting of MAX_STRING_SIZE to EXTENDED
  3. Run the rdbms/admin/utl32k.sql script as sysdba
  4. Restart the database

We can now create a table with a larger than 4000 byte VARCHAR2 column (Note such larger column values are actually stored out of line from the rest of the table, I might discuss this in another post) :

SQL> create table bowie (id number, text varchar2(32000));
Table created.

However, if we try now to create an index on such a column:

SQL> create index bowie_text_i on bowie(text);
create index bowie_text_i on bowie(text)
 *
ERROR at line 1:
ORA-01450: maximum key length (6398) exceeded

We find Oracle complains that the possible index length is going to be too large for my (8K) block sized index. So, is it possible to index such extended columns ?

Let’s populate this table with some data:

SQL> insert into bowie (id, text) values (1, 'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa');
1 row created.

SQL> commit;

Commit complete.

SQL> select length(text) from bowie;

LENGTH(TEXT)
------------
        1110

SQL> insert into bowie (id, text)
     select 2, text||text||text||text||text||text||text||text||text||text
     from bowie;

1 row created.

SQL> commit;

Commit complete.

SQL> select length(text) from bowie;

LENGTH(TEXT)
------------
        1110
       11100

SQL> insert into bowie (id, text)
     select rownum+2, to_char(rownum)||'BOWIE'
     from dual connect by level<=99998;

99998 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

So yes, we definitely have at least one very large Text value (some 11100 bytes) in our table. How cool. One method of creating a valid index on this extended column is to use a function-based index based on a hash value of this column. For example:

SQL> create index bowie_hash_text_i on bowie(standard_hash(text));

Index created.

SQL> select index_name, num_rows, leaf_blocks from dba_indexes where index_name = 'BOWIE_HASH_TEXT_I';

INDEX_NAME             NUM_ROWS LEAF_BLOCKS
-------------------- ---------- -----------
BOWIE_HASH_TEXT_I        100000         447

This index can now be used effectively for subsequent equality based predicates, for example:

SQL> select * from bowie where text = '42BOWIE';

Execution Plan
----------------------------------------------------------

Plan hash value: 1900956348
---------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name              | Rows  | Bytes| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                   |     1 |    16|   203   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE             |     1 |    16|   203   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | BOWIE_HASH_TEXT_I |   400 |      |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(INTERNAL_FUNCTION("TEXT") AND "TEXT"='42BOWIE')
2 - access(STANDARD_HASH("TEXT")=HEXTORAW('A2C98939EDB479BC3EB0CDC560DDCD1575D47F62'))

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
4  consistent gets
0  physical reads
0  redo size
610  bytes sent via SQL*Net to client
544  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)

1  rows processed

So the index has been used to very efficiently retrieve data based on an equality predicate on the extended TEXT column.

However, range based predicates are problematic as Oracle has no easy way to find and retrieve all such data via the index when the data in the index is effectively randomised hashed values. For example:

SQL> select * from bowie where text like 'aaaaaaaaaaaaaaaaaaaaaa%';

Execution Plan
----------------------------------------------------------

Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     1 |    16 |   208   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BOWIE |     1 |    16 |   208   (2)| 00:00:01 |
---------------------------------------------------------------------------

SQL> select * from bowie where text between '4299BOWIE' and '42BOWIE';

Execution Plan
----------------------------------------------------------

Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     2 |    32 |   208   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BOWIE |     2 |    32 |   208   (2)| 00:00:01 |
---------------------------------------------------------------------------

SQL> select * from bowie where text > 'zzz';

Execution Plan
----------------------------------------------------------

Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     1 |    17 |   219   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BOWIE |     1 |    17 |   219   (2)| 00:00:01 |
---------------------------------------------------------------------------

The above are all examples of predicates that can’t use our hash based function-based index, even though the CBO is estimating very few rows to be returned.

If we try now to make this extended column unique via a constraint:

SQL> alter table bowie add constraint bowie_text_unq unique (text);
alter table bowie add constraint bowie_text_unq unique (text)
*
ERROR at line 1:

ORA-01450: maximum key length (6398) exceeded

We hit our problem again. Oracle tries to make a unique index on the Text column, but it can’t because the extended column definition could potentially exceed the maximum allowable key length.

We can get around this in a similar fashion, but by adding a virtual hash column to the table and basing the Unique constraint on this column instead:

SQL> drop index bowie_hash_text_i;

Index dropped.

SQL> alter table bowie add (text_hash as (standard_hash(text)));

Table altered.

SQL> alter table bowie add constraint bowie_text_unq unique (text_hash);

Table altered.

This can now be used to effectively protect the uniqueness of the original Text column:

SQL> insert into bowie (id, text) values (1000001, '42BOWIE');
insert into bowie (id, text) values (1000001, '42BOWIE')
*
ERROR at line 1:

ORA-00001: unique constraint (BOWIE.BOWIE_TEXT_UNQ) violated

This index can now be used in a similar manner as before for equality based predicates:

SQL> select * from bowie where text = '42BOWIE';

ID  TEXT       TEXT_HASH
--- ---------- ----------------------------------------
44     42BOWIE A2C98939EDB479BC3EB0CDC560DDCD1575D47F62

Execution Plan
----------------------------------------------------------

Plan hash value: 2691947611
----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |     1 |    16 |     2   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| BOWIE          |     1 |    16 |     2   (0)| 00:00:01 |
|*  2 |   INDEX UNIQUE SCAN         | BOWIE_TEXT_UNQ |     1 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(INTERNAL_FUNCTION("TEXT") AND "TEXT"='42BOWIE')
2 - access("BOWIE"."TEXT_HASH"=HEXTORAW('A2C98939EDB479BC3EB0CDC560DDCD1575D47F62'))

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
3  consistent gets
0  physical reads
0  redo size
702  bytes sent via SQL*Net to client
544  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
1  rows processed

But with the same restrictions with range based predicates:

SQL> select * from bowie where text between '429BOWIE' and '42BOWIE';

ID  TEXT       TEXT_HASH
--- ---------- ----------------------------------------
44     42BOWIE A2C98939EDB479BC3EB0CDC560DDCD1575D47F62
431   429BOWIE A7E2B59E1429DB4964225E7A98A19998BC3D2AFD

Execution Plan
----------------------------------------------------------

Plan hash value: 1845943507
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |     2 |    32 |   208   (2)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BOWIE |     2 |    32 |   208   (2)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(INTERNAL_FUNCTION("TEXT") AND "TEXT"<='42BOWIE' AND "TEXT">='429BOWIE')

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
758  consistent gets
0  physical reads
0  redo size
787  bytes sent via SQL*Net to client
544  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
2  rows processed

I’ll look at other indexing options with these new extended columns in Part II.

12c Asynchronous Global Index Maintenance Part III (Re-Make/Re-Model) August 7, 2013

Posted by Richard Foote in 12c, Asynchronous Global Index Maintenance, Global Indexes, Oracle Indexes, Partitioning, Unique Indexes.
3 comments

As I discussed previously in Part I, the space occupied by orphaned row entries associated with asynchronously maintained global indexes is not automatically reclaimed by subsequent DML operations within the index. Hence the need to clean out these orphaned index entries via the various options discussed in Part II.

However, a good question by Jason Bucata asked what about Unique indexes. “If the index entries aren’t marked deleted but are truly still “there” in the structure, does that mean you can’t use this feature if any global indexes are unique” ?

So now I need a Part III to answer this question :)

So same demo setup as before but this time with a Unique index on the ID column:

SQL> create table muse (id number, code number, name varchar2(30)) partition by range (id)
(partition muse1 values less than (1000001), partition muse2 values less than (2000001), partition muse3 values less than (maxvalue));

Table created.

SQL> insert into muse select rownum, mod(rownum,100000), 'DAVID BOWIE' from dual connect by level <= 3000000;

3000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'MUSE', estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> create unique index muse_id_i on muse(id);

Index created.

Let’s now drop a table partition and confirm we indeed do have orphaned unique index entries:

SQL> alter table muse drop partition muse1 update global indexes;

Table altered.

SQL> select index_name, num_rows, s.blocks, leaf_blocks, status, orphaned_entries
from dba_indexes i, dba_segments s where i.index_name = s.segment_name and table_name='MUSE';

INDEX_NAME       NUM_ROWS     BLOCKS LEAF_BLOCKS STATUS   ORP
-------------- ---------- ---------- ----------- -------- ---
MUSE_ID_I         3000000      11264        8216 VALID    YES

SQL> analyze index muse_id_i validate structure;

Index analyzed.

SQL> select name, blocks, lf_rows, del_lf_rows from index_stats;

NAME                     BLOCKS    LF_ROWS DEL_LF_ROWS
-------------------- ---------- ---------- -----------
MUSE_ID_I                  9216    3000000     1000000

If we take a look at a partial block dump of the first (left-most) index leaf block at this stage:

Block header dump: 0x018076dc
Object id on Block? Y
seg/obj: 0x16c11 csc: 0x00.29239b itc: 2 flg: E typ: 2 – INDEX
brn: 0 bdba: 0x18076d8 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0000.000.00000000 0x00000000.0000.00 —- 0 fsc 0x0000.00000000
0x02 0xffff.000.00000000 0x00000000.0000.00 C— 0 scn 0x0000.0029239b
Leaf block dump
===============
header address 360728164=0x15804664
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 1
kdxcosdc 0
kdxconro 404
kdxcofbo 844=0x34c
kdxcofeo 1675=0x68b
kdxcoavs 831
kdxlespl 0
kdxlende 0
kdxlenxt 25196253=0x18076dd
kdxleprv 0=0x0
kdxledsz 10
kdxlebksz 8036
row#0[8021] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 00
col 0; len 2; (2): c1 02
row#1[8006] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 01
col 0; len 2; (2): c1 03
row#2[7991] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 02
col 0; len 2; (2): c1 04
row#3[7976] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 03
col 0; len 2; (2): c1 05
row#4[7961] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 04
col 0; len 2; (2): c1 06
row#5[7946] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 05
col 0; len 2; (2): c1 07
row#6[7931] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 06
col 0; len 2; (2): c1 08
row#7[7916] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 07
col 0; len 2; (2): c1 09
row#8[7901] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 08
col 0; len 2; (2): c1 0a
row#9[7886] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 09
col 0; len 2; (2): c1 0b
row#10[7871] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0a
col 0; len 2; (2): c1 0c
row#11[7856] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0b
col 0; len 2; (2): c1 0d
row#12[7841] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0c
col 0; len 2; (2): c1 0e
row#13[7826] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0d
col 0; len 2; (2): c1 0f
row#14[7811] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0e
col 0; len 2; (2): c1 10
row#15[7796] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0f
col 0; len 2; (2): c1 11

As discussed previously, the index leaf block remains “untouched” after the drop table partition operation and has no index entries actually marked as “deleted”. However, just take a note of the rowid values of the first 10 rows. I’m now going to reinsert new rows with an ID between 1 and 10 that were previously deleted as part of dropping the table partition …

SQL> insert into muse select rownum, 42, 'ZIGGY STARDUST' from dual connect by level <= 10;

10 rows created.

SQL> commit;

Commit complete.

SQL> analyze index muse_id_i validate structure;

Index analyzed.

SQL> select name, blocks, lf_rows, del_lf_rows from index_stats;

NAME                     BLOCKS    LF_ROWS DEL_LF_ROWS
-------------------- ---------- ---------- -----------
MUSE_ID_I                  9216    3000000      999990

We can see that the number of so-called deleted leaf rows is now only 999990 and has decreased by the 10 rows we’ve inserted.

If we take a look now at the first index leaf block again:

Block header dump: 0x018076dc
Object id on Block? Y
seg/obj: 0x16c11 csc: 0x00.29239b itc: 2 flg: E typ: 2 – INDEX
brn: 0 bdba: 0x18076d8 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0000.000.00000000 0x00000000.0000.00 —- 0 fsc 0x0000.00000000
0x02 0x0004.012.0000079d 0x014045c7.0122.44 –U- 10 fsc 0x0000.00292503
Leaf block dump
===============
header address 360612452=0x157e8264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 1
kdxcosdc 0
kdxconro 404
kdxcofbo 844=0x34c
kdxcofeo 1675=0x68b
kdxcoavs 831
kdxlespl 0
kdxlende 0
kdxlenxt 25196253=0x18076dd
kdxleprv 0=0x0
kdxledsz 10
kdxlebksz 8036
row#0[8021] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 86
col 0; len 2; (2): c1 02
row#1[8006] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 87
col 0; len 2; (2): c1 03
row#2[7991] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 88
col 0; len 2; (2): c1 04
row#3[7976] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 89
col 0; len 2; (2): c1 05
row#4[7961] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 8a
col 0; len 2; (2): c1 06
row#5[7946] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 8b
col 0; len 2; (2): c1 07
row#6[7931] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 8c
col 0; len 2; (2): c1 08
row#7[7916] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 8d
col 0; len 2; (2): c1 09
row#8[7901] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 8e
col 0; len 2; (2): c1 0a
row#9[7886] flag: ——-, lock: 2, len=15, data:(10): 00 01 6c 0f 01 80 91 2a 00 8f
col 0; len 2; (2): c1 0b
row#10[7871] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0a
col 0; len 2; (2): c1 0c
row#11[7856] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0b
col 0; len 2; (2): c1 0d
row#12[7841] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0c
col 0; len 2; (2): c1 0e
row#13[7826] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0d
col 0; len 2; (2): c1 0f
row#14[7811] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0e
col 0; len 2; (2): c1 10
row#15[7796] flag: ——-, lock: 0, len=15, data:(10): 00 01 6c 0e 01 80 11 30 00 0f
col 0; len 2; (2): c1 11

We notice that the first 10 index entries now have different rowids from the previous block dump.

So this is an exception to the rule. With a Unique index, Oracle will indeed reuse the storage occupied by the orphaned Unique index entry if the same unique value as an orphaned value is subsequently re-inserted. This is not the case with Non-Unique indexes, even if such Non-Unique indexes are used to police either a PK or Unique Key constraint.

So the new valid index entries and any existing orphaned entries can be read and/or ignored as necessary:

SQL> select * from muse where id between 1 and 100;

ID       CODE NAME
---------- ---------- --------------------
1         42 ZIGGY STARDUST
2         42 ZIGGY STARDUST
3         42 ZIGGY STARDUST
4         42 ZIGGY STARDUST
5         42 ZIGGY STARDUST
6         42 ZIGGY STARDUST
7         42 ZIGGY STARDUST
8         42 ZIGGY STARDUST
9         42 ZIGGY STARDUST
10        42 ZIGGY STARDUST

10 rows selected.

Execution Plan
----------------------------------------------------------

Plan hash value: 2515419874

------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name      | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |           |     1 |    23 |     4   (0)| 00:00:01 |       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| MUSE      |     1 |    23 |     4   (0)| 00:00:01 |     1 |     1 |
|*  2 |   INDEX RANGE SCAN                         | MUSE_ID_I |   100 |       |     3   (0)| 00:00:01 |       |       |
------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ID">=1 AND "ID"<=100)

filter(TBL$OR$IDX$PART$NUM("MUSE",0,8,0,"MUSE".ROWID)=1)

Statistics
----------------------------------------------------------

5  recursive calls
0  db block gets
7  consistent gets
0  physical reads
0  redo size
974  bytes sent via SQL*Net to client
544  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
1  sorts (memory)
0  sorts (disk)
10  rows processed

However, if new unique values are inserted into the table but with ID values that didn’t previously exist:

SQL> insert into muse select rownum+3000000, 42, 'ZIGGY STARDUST'
from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> analyze index muse_id_i validate structure;

Index analyzed.

SQL> select name, blocks, lf_rows, del_lf_rows from index_stats;

NAME                     BLOCKS    LF_ROWS DEL_LF_ROWS
-------------------- ---------- ---------- -----------
MUSE_ID_I                 11264    4000000      999990

We notice that the number of so-called deleted leaf entries remains the same after inserting the 1M new rows.

So in this scenario, the effectively “empty” leaf blocks containing nothing but orphaned unique index entries are not re-cycled and reused by subsequent index block splits as they would have been had they contained nothing but deleted index entries.

So Unique indexes in the unlikely event that such unique values are subsequently reinserted are an exception to the general rule of orphaned global index entries having to be “cleaned out”.

Unique Bitmap Indexes Part II (You Can’t Do That) March 30, 2010

Posted by Richard Foote in Bitmap Indexes, Oracle Indexes, Unique Indexes.
6 comments

As discussed in the previous post, a Bitmap index on a unique column will be larger than a corresponding Btree index due to the additional overheads associated with each index entry (such as the additional rowid, the additional column length bytes and the bitmap column itself). Oracle therefore attempts to protect you from explicitly creating such a “Unique Bitmap” index. 

For example, you can not specify both UNIQUE and BITMAP when creating an index. To do so would make little sense.
  
A bitmap index must therefore be Non-Unique by definition. Any attempt to explicitly create a Unique Bitmap index will fail.
 
SQL> drop index bowie_bitmap_i;
 
Index dropped.
 
SQL> create unique bitmap index bowie_bitmap_i on bowie(id) pctfree 0;
create unique bitmap index bowie_bitmap_i on bowie(id) pctfree 0
              *
ERROR at line 1:
ORA-00968: missing INDEX keyword
 

SQL> create bitmap unique index bowie_bitmap_i on bowie(id) pctfree 0;
create bitmap unique index bowie_bitmap_i on bowie(id) pctfree 0
              *
ERROR at line 1:
ORA-00968: missing INDEX keyword
 

The CREATE INDEX syntax only caters for either the BITMAP or the UNIQUE option.
 
Although Oracle permits the use of a Non-Unique index to police either a Primary Key (PK) or Unique Key (UK) constraint, a bitmap index is not permitted to police such constraints. Again, it makes little sense having a bitmap index police such constraints as an equivalent Btree index is going to be more efficient.
 
If an existing bitmap index exists on a column, Oracle can not use it to police the constraint:
 
SQL> create bitmap index bowie_bitmap_i on bowie(id) pctfree 0;
 
Index created.
 
SQL> alter table bowie add primary key (id);
alter table bowie add primary key (id)
*
ERROR at line 1:
ORA-01408: such column list already indexed
 

Oracle is attempting to create a Btree index to police the new PK constraint but it can’t create it as an existing bitmap index already exists. Oracle will not create a Btree index if the same column list is already indexed.
 
It makes no difference if we if declare the constraint as deferrable (or invalidate) where a Non-Unique index is required:
 

SQL> alter table bowie add primary key (id) novalidate;
alter table bowie add primary key (id) novalidate
*
ERROR at line 1:
ORA-01408: such column list already indexed
 

SQL> alter table bowie add primary key (id) deferrable;
alter table bowie add primary key (id) deferrable
*
ERROR at line 1:
ORA-01408: such column list already indexed
 

Attempting to create a Bitmap index at the same time as the constraint is equally fruitless:

SQL> alter table bowie add primary key (id) using index (create bitmap index bowie_bitmap_i on bowie(id));
alter table bowie add primary key (id) using index (create bitmap index bowie_bitmap_i on bowie(id))
                                                           *
ERROR at line 1:
ORA-00968: missing INDEX keyword
 

SQL> alter table bowie add primary key (id) using index (create bitmap index bowie_bitmap_i on bowie(id) deferrable);
alter table bowie add primary key (id) using index (create bitmap index bowie_bitmap_i on bowie(id) deferrable)
                                                           *
ERROR at line 1:
ORA-00968: missing INDEX keyword
 

So definitely, looking at creating a Bitmap index on a unique column is not a sensible thing to attempt both because the resultant bitmap index would be larger than a corresponding Btree index if permitted and because in many scenarios as discussed, Oracle simply won’t let you do it anyways.

OK, so a unique column is not suitable for a Bitmap index. The question remains at what point does it make sense to create a bitmap index ? The answer is reasonably obvious when one understands the structure of both types of indexes although the answer may surprise some folks. I’ll look at this question next …

Unique Bitmap Indexes Part I (Unnatural Selection) March 24, 2010

Posted by Richard Foote in Bitmap Indexes, Index Internals, Oracle Indexes, Unique Indexes.
17 comments

As I’ve discussed previously, a Bitmap index can be considered over a B-tree index (where concurrent DML is not an issue) even if there are potentially tens of millions of distinct values, in a table that has say hundreds of millions of rows.
 
However, if a column is unique or “approaches” uniqueness, then one has gone too far and the bitmap index is going to be larger and less efficient than an equivalent b-tree index. So you wouldn’t consider a bitmap index on a column with a million distinct values if the table itself only has in the vicinity of a million rows as well.
 
To understand why a column approaching uniqueness shouldn’t be considered as a bitmap index, one only needs to understand the structure and corresponding differences of index entries in both bitmap and b-tree indexes.
 
I’ll begin by creating a simple table and populating it with a million rows.


SQL> create table bowie (id number, name varchar2(20));
 
Table created.
 
SQL> insert into bowie select rownum, 'Ziggy Stardust' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> commit;
 
Commit complete.

   

Note that the ID column is unique. We can therefore create a Unique b-tree index:
 


SQL> create unique index bowie_unique_i on bowie(id) pctfree 0;

Index created.

SQL> select index_name, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name = 'BOWIE_UNIQUE_I';

INDEX_NAME          BLEVEL LEAF_BLOCKS DISTINCT_KEYS
--------------- ---------- ----------- -------------
BOWIE_UNIQUE_I           2        1875       1000000

 

Note that the unique index has 1875 leaf blocks. If we dump a leaf block and look at some (say 5) of the index entries:
 

row#0[8025] flag: ------, lock: 0, len=11, data:(6):  02 00 6b 0a 00 00
col 0; len 2; (2):  c1 02
row#1[8014] flag: ------, lock: 0, len=11, data:(6):  02 00 6b 0a 00 01
col 0; len 2; (2):  c1 03
row#2[8003] flag: ------, lock: 0, len=11, data:(6):  02 00 6b 0a 00 02
col 0; len 2; (2):  c1 04
row#3[7992] flag: ------, lock: 0, len=11, data:(6):  02 00 6b 0a 00 03
col 0; len 2; (2):  c1 05
row#4[7981] flag: ------, lock: 0, len=11, data:(6):  02 00 6b 0a 00 04

 

We notice the length of these first 5 index entries are all 11 bytes (len=11).
 
An index entry from this Unique index basically consists of the indexed value (col 0) which is 2 bytes in length in the above sample plus the following overhead:
 
2 bytes for flags and locks
6 bytes for the rowid
1 byte for the index column length
 
So there’s a total of 9 bytes of overhead per index entry in this index in addition to the index value itself. Note also there’s an index entry for each and every indexed value. This is always the case for a non-compressed b-tree index.
 
If we now compare this with an equivalent  Non-Unique index on the same column:

 
 
SQL> drop index bowie_unique_i;
 
Index dropped.
 
SQL> create index bowie_nonunique_i on bowie(id) pctfree 0;
 
Index created.
 
SQL> select index_name, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name = 'BOWIE_NONUNIQUE_I';
 
INDEX_NAME            BLEVEL LEAF_BLOCKS DISTINCT_KEYS
----------------- ---------- ----------- -------------
BOWIE_NONUNIQUE_I          2        1999       1000000
 

 

We notice the index is now somewhat larger than the equivalent Unique index, with there now being 1999 leaf blocks, an increase of 124 leaf blocks. A block dump of a leaf block reveals the key difference:
 

 
row#0[8024] flag: ------, lock: 0, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  02 00 6b 0a 00 00
row#1[8012] flag: ------, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  02 00 6b 0a 00 01
row#2[8000] flag: ------, lock: 0, len=12
col 0; len 2; (2):  c1 04
col 1; len 6; (6):  02 00 6b 0a 00 02
row#3[7988] flag: ------, lock: 0, len=12
col 0; len 2; (2):  c1 05
col 1; len 6; (6):  02 00 6b 0a 00 03
row#4[7976] flag: ------, lock: 0, len=12
col 0; len 2; (2):  c1 06
col 1; len 6; (6):  02 00 6b 0a 00 04

 

As I’ve discussed previously, Oracle makes the Non-Unique index effectively unique by adding the rowid as an additional indexed column within the index entry (col 1 is this additional index column comprising the rowid). There are therefore 2 columns in the index entry, not just the one (denoted by col 0 and col 1). This ensures all duplicate indexed values are subsequently sorted in rowid order within the index and can be efficiently accessed as necessary.
 
The consequence of this subtle difference is that an additional byte is now required to store the length of the rowid column and so the total overhead increases from 9 bytes to 10 bytes per index entry. The overall length of an index entry has therefore increased from 11 to 12 byes (len=12) and this results in the overall increase of 124 leaf blocks in the index, required to effectively store these additional 1 million bytes.
 
If we now create the index as an equivalent bitmap index:
 

  
 
SQL> drop index bowie_nonunique_i;
 
Index dropped.
 
SQL> create bitmap index bowie_bitmap_i on bowie(id) pctfree 0;
 
Index created.
 
SQL> select index_name, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name = 'BOWIE_BITMAP_I';
 
INDEX_NAME            BLEVEL LEAF_BLOCKS DISTINCT_KEYS
----------------- ---------- ----------- -------------
BOWIE_BITMAP_I             2        3124       1000000

  

We now notice the index has increased substantially from 1999 leaf blocks for the Non-Unique index to 3124 leaf blocks.
 
Again, a dump of an index leaf block highlights the reason for the increase:
 

 
row#0[8015] flag: ------, lock: 0, len=21
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  02 00 6b 0a 00 00
col 2; len 6; (6):  02 00 6b 0a 00 07
col 3; len 1; (1):  00
row#1[7994] flag: ------, lock: 0, len=21
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  02 00 6b 0a 00 00
col 2; len 6; (6):  02 00 6b 0a 00 07
col 3; len 1; (1):  01
row#2[7973] flag: ------, lock: 0, len=21
col 0; len 2; (2):  c1 04
col 1; len 6; (6):  02 00 6b 0a 00 00
col 2; len 6; (6):  02 00 6b 0a 00 07
col 3; len 1; (1):  02
row#3[7952] flag: ------, lock: 0, len=21
col 0; len 2; (2):  c1 05
col 1; len 6; (6):  02 00 6b 0a 00 00
col 2; len 6; (6):  02 00 6b 0a 00 07
col 3; len 1; (1):  03
row#4[7931] flag: ------, lock: 0, len=21
col 0; len 2; (2):  c1 06
col 1; len 6; (6):  02 00 6b 0a 00 00
col 2; len 6; (6):  02 00 6b 0a 00 07
col 3; len 1; (1):  04
 

 

The index entry structure is now somewhat different. We now have an index that has not 1 column (as in the Unique index) or 2 columns (as in the Non-unique index) but 4 columns in the index entry. As before, we still have the index column value of 2 bytes but we now have the following overheads per index entry:
 
2 bytes for flags and locking (as before)
1 byte for the indexed column length (as before)
6 bytes for a rowid index column (col 1) stating the start of a range of rowids that are covered by the particular index entry
1 byte for the length of this start rowid index column
6 bytes for a rowid index column (col 2) stating the end of a range of rowids that are covered by the particular index entry
1 byte for the length of this end rowid index column
1 byte for the bitmap bit sequence column (col 3) required for all the bits referencing rows within the above rowid ranges
1 byte for the length of this bitmap column
 
So the total overhead for each of the 5 index entries listed above is now 19 bytes, not 9 or 10 bytes as for the equivalent b-tree indexes. The length of an index entry is therefore 21 bytes in total, not 11 or 12 bytes as for the equivalent b-tree indexes.

A few important points to note.
 
As the columns are effectively unique, the number of index entries are the same for both b-tree and bitmap indexes. A key advantage of a bitmap index over a b-tree index is that for each distinct value, a single index entry is sufficient to cater for a range of rowids, potentially covering the whole table. For example, a specific column value with 100 duplicates may only need the one index entry for the column value within a bitmap index, but would require 100 different index entries within a (non-compressed) b-tree. However, as the column in the above example is unique, there are no duplicate values and so this potential saving is not possible in this bitmap index.
 
Notice also the size of the bitmap string for each index entry is actually tiny, just a single byte, even though there are 1 million rows in the table. It doesn’t even look like it’s using a million bits to store the necessary bitmap string information for each index entry. This is because for each index entry, only one bit is ever set to 1 (“true”), all other occurrences are logically false as only 1 row in the table ever has the specific index value. Remember, the column values are effectively unique.

Therefore, Oracle can use a very narrow range of rowid ranges for each index entry and effectively not bother storing details for the vast majority of the possible rowid ranges within the table as there’s only one bit that’s of interest and it only corresponds to a specific part of the table. Even in cases where there might just be a duplicate here or there, most values would be zeros (false) regardless and can be compressed extremely efficiently (topic for another day).

Although many folks commonly think otherwise (see original Burleson article for a perfect example of the following misperception), if a column which is unique or is approaching uniqueness is indexed via a bitmap index, the overheads associated with the bitmap string in the index entry is usually very minimal as by definition most bit values are logically “false” (0), with only the rare “true” (1) bit value needing to be stored and reference.

The issue is not necessarily with the overheads associated with just the bitmap string per se but also with the other overhead components, namely the additional rowid and column length bytes.
 
In short, the bitmap index can’t be any more efficient that use just 1 byte to store the necessary bitmap string information (plus 1 byte for the bitmap string length), however 19 bytes of overhead is the minimum required, mainly because of the requirement to store 2 rowids instead of 1 rowid and for all the additional index column length bytes. If the bitmap index needs to cater for a wider range of rowids and for more occurrences of 1s (true) values, then the overheads associated with the bitmap sequence can of course be much more considerable than the 1 byte (again, a topic for another day).
 
Therefore, the bitmap index is going to be significantly less efficient if the indexed values are unique or near unique as there’s all this additional overhead per index entry without the subsequent savings by not having to store separate index entries for duplicates column values. There needs to be at least some measure of duplication within a column for a bitmap index to have some chance of being the more efficient when compared to an equivalent b-tree index.
 
However, how many duplicate values within a column are actually necessary for a bitmap index to be considered and be viable alternative ? The answer is far fewer than many may think (again see original Burleson article for a common misunderstanding in this respect), although this question will be addressed in a future post on the subject.

Indexes And Small Tables Part VI (Loaded) May 19, 2009

Posted by Richard Foote in Constraints, Oracle Indexes, Small Indexes, Unique Indexes.
9 comments

Thanks to comments by PdV, I need yet another Part before I can look at completing this series ;)

OK, we’ve reached the stage in Part V of accessing this small, one block table with a Unique Index. This has reduced the number of consistent gets to 2, with both consistent get operations being the “pinless”, one latch consistent get examinations.

We basically need one consistent get to read the index and the other consistent get to read the row from the table block.

Not bad.

However, if we could somehow just store all the columns of interest within the index structure, we could potentially do even better because then we wouldn’t need to visit the table segment at all. A single consistent get from the index and bingo, we can retrieve all the required data directly from the index.

Overloading an index in this way is actually quite a common tuning trick. Simply add additional columns within the select list so that all the required columns can be found within the index, thereby elliminating the need to access the table at all and so potentially improve performance.

However, when it comes to overloading a unique index designed to specifically police a PK or Unique constraint we have a slight problem. Oracle will not allow a unique constraint to be policed by a unique index that does not have the same column list. It’s not a problem for a non-unique index (providing the leading columns match the constraint columns), but it’s an issue for a unique index.

Therefore, in our little example, we can’t simply create a single concatenated unique index on both the ID and NAME columns and use it to police a unique constraint on just the ID column. We must either use a unique index on the ID column or use a non-unique index on both the ID and NAME columns. If we want to create a unique index on both the ID and NAME columns, we would need to create an additional index on the ID column to police the PK on the ID column or change our business rules to make both the ID and NAME the PK (which is not likely something we would want to do). Note also by doing creating 2 unique indexes, we would effectively be storing the ID column in three separate places, within the table, within the ID index and also within the ID and NAME index. Again, not something we’re likely going to want to do.

To illustrate the point, drop any existing indexes on the SMALL table. If we attempt to create a unique index on both the ID and NAME columns while making the ID column only the PK, Oracle is going to complain:

SQL> alter table small add primary key(id) using index (create unique index small_uk_i on small(id, name));
alter table small add primary key(id) using index (create unique index small_uk_i on small(id, name))
*
ERROR at line 1:
ORA-14196: Specified index cannot be used to enforce the constraint.try and cr

If we create a unique index first on both the ID and NAME columns:

SQL> create unique index small_uk_i on small(id, name);

Index created.

And then hope Oracle will simply use this index to police a PK constraint on just the ID column, we’ll be sadly disappointed as Oracle will actually created another unique index on the ID column behind the scenes:

SQL> alter table small add primary key(id);

Table altered.

SQL> select c.index_name, c.column_name from user_indexes i, user_ind_columns cwhere i.index_name = c.index_name and i.table_name = c.table_name and i.table_name = ‘SMALL';

INDEX_NAME   COLUMN_NAME
------------ ------------
SMALL_UK_I   ID
SMALL_UK_I   NAME
SYS_C009759  ID

 

The CBO will of course potentially look at using our SMALL_UK_I concatenated unique index to perform the select statement of our demo, but the efficiency results are not quite what we’re after:

SQL> select id, name from small where id = 42;

        ID NAME
---------- -----
        42 BOWIE
-----------------------------------
|Id| Operation        | Name      |
-----------------------------------
| 0| SELECT STATEMENT |           |
|*1|  INDEX RANGE SCAN| SMALL_UK_I|
-----------------------------------
Statistics
-------------------------------------------
  0  recursive calls
  0  db block gets
  2  consistent gets
  0  physical reads
  0  redo size
465  bytes sent via SQL*Net to client
396  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  0  sorts (memory)
  0  sorts (disk)
  1  rows processed

 

We first note Oracle is indeed using the concatenated, unique SMALL_UK_I index as it can retrieve all the necessary data from the query directly from the index.

However, also note the CBO is performing an index range scan, not a unique scan as only a portion of the index (the ID) is specified in the WHERE clause. Oracle doesn’t pick up the fact the select operation is actually unique as not all columns within the SMALL_UK_I unique index used by the CBO are specified in the WHERE clause. This despite the fact the ID is actually the defined PK of the table.

Therefore, Oracle is still performing 2 consistent get operations as there may be more rows to retrieve after performing the first fetch within the SQL*PLUS session. Also, if we examined the types of consistent reads being performed, we would note that neither of them are consistent get – examinations.

So we’re really not that far ahead of just using the unique index on the ID column as we did in Part V of this series. We still require 2 consistent gets (although neither of them are now examinations) and we’re having to store the ID in three separate locations for our trouble, rather than two.

Wouldn’t it be nice if we could have a PK index on just the ID column, but somehow add the NAME column (or any other columns of interest) to the index structure so that we only need to visit the index structure, thereby storing the ID in only the one index. Then we could potentially access the data with just one consistent get and with it being a unique index, it would be a consistent get examination requiring only the one latch access.

Hell, wouldn’t it be nice if we didn’t even bother with the table segment at all as all queries of interest would never actually need to access and use the table segment anyways, thereby storing the PK in possibly just the one location.

Such a solution has of course been possible for a long time …

;)

Indexes And Small Tables Part V (It’s No Game) May 13, 2009

Posted by Richard Foote in Index Internals, Oracle Indexes, Small Indexes, Unique Indexes.
12 comments

So far in our little example, we’ve looked at how accessing a row of a one block table via a FTS required 4 consistent gets while accessing this same table via a Non-unique index reduced the consistent gets down to 3.

Time to take the next step and improve the efficiency yet further of accessing this small one block table.

We’re now going to replace the Non-unique index with a Unique Index instead. We can obviously do this because the values on the indexed ID column are indeed unique.

Now it’s always a good idea of course to document these table business rules (such as a column being unique) inside the database, however it’s somewhat alarming just how many application just don’t this. I’ve also previously discussed how a PK or Unique constraint can actually be policed via a Non-Unique index so there are many reasons why a small table might not have an associated Unique index.

Not least of course the incorrect perception that an index is not going to be much use on a small table anyways …

So let’s now replace the Non-Unique index with a Unique index instead:

SQL> drop index small_id_i;

Index dropped.

SQL> alter table small add primary key (id) using index (create unique index small_id_i on small(id));

Table altered.

 
OK, so now we have our Unique index in place. Let’s now run the same query again to see how our consistent gets related statistics might change:

 
SQL> select * from small where id = 42;

 

        ID NAME
---------- -----
        42 BOWIE

--------------------------------------------
|Id |Operation                  |Name      |
--------------------------------------------
|  1|TABLE ACCESS BY INDEX ROWID|SMALL     |
|* 2| INDEX UNIQUE SCAN         |SMALL_ID_I|
--------------------------------------------

Statistics
--------------------------------------------
  0  recursive calls
  0  db block gets
  2  consistent gets
  0  physical reads
  0  redo size
465  bytes sent via SQL*Net to client
396  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  0  sorts (memory)
  0  sorts (disk)
  1  rows processed

 

We notice our first significant change. The number of consistent gets has reduced further down to just 2.

Why ?

Because with a Unique index, there can only be a maximum of 1 row returned. It’s simply not possible to return 2 or more rows.

Therefore, when selecting this one row, Oracle doesn’t have to perform the second fetch operation to confirm there are indeed no more rows to return. The first fetch will either return the one row of interest or none at all when resolving an equality predicate. That’s it, there are no other possibilities. We return the row of interest by simply accessing the index block (1 consistent get) followed by the table block (the second consistent get).

As we have previously, if we look at the actual consistent gets statistics of interest by running the following query in another session before/after the select statement:

SQL> SELECT n.name, s.value FROM v$sesstat s, v$statname n WHERE s.statistic# =n.statistic# AND s.sid = 141 AND n.name LIKE ‘consistent%';

NAME                           VALUE
------------------------------ -----
consistent gets                31236
consistent gets - examination   5084

 
And again afterwards ..

 
SQL> SELECT n.name, s.value FROM v$sesstat s, v$statname n WHERE s.statistic# =n.statistic# AND s.sid = 141 AND n.name LIKE ‘consistent%';

NAME                           VALUE
------------------------------ -----
consistent gets                31238 (+2)
consistent gets - examination   5086 (+2)

We notice we have indeed only performed the 2 consistent gets. But we also notice another significant difference, that being both consistent gets are now the “cheaper” consistent gets – examination.

This means that the latches required to now perform this select statement via the Unique index is just 2, down from 6 for the Non-unique index and 8 from the FTS.

Generally during a consistent get, Oracle needs to grab the cache buffers chain latch so it can pin the specific block in memory and then grab the latch again so that it can subsequently unpin the block once it’s finished processing the block. Each of these accesses to the latch and the subsequently pin/unpinning of the block requires CPU and is a possible source of contention within the database.

For some operations that only require a very very quick “read and get out of there” type operation and/or on blocks that are unlikely to change within a given point of time, Oracle uses a cheaper consistent get operation which doesn’t actually require the block to be pinned. There’s no point in pinning the block as it’s only going to be read and accessed for a short time (shorter than might otherwise be required when processing a block in memory) and the block is unlikely to change anyways.

So for these operations, Oracle uses a cheaper consistent get called a consistent gets – examination. These consistent gets examinations only need grab the cache buffers chain latch before quickly reading the block and releasing the latch once the read operation is complete. Therefore it only needs to grab and release the cache buffer chains latch the once without having to pin/unpin the block, which means less CPU and less latch contention overall.

Now this isn’t particularly well documented. Often discussions mention reads of undo blocks as being candidates for consistent gets examinations as these reads are likely to be both relatively quick and a specific undo block is unlikely to change as only one transaction can actually update an undo block at a given time.

Getting back to indexes, reads of index root blocks are another candidate mentioned as again a read of an index root block is going to be very quick and an index root block is unlikely to change at a given point of time.

However, what is not well documented at all is the fact that any block accessed during an index Unique Scan is accessed via a consistent get – examination, including the consistent get associated with reading the table block as well. This is because again, any such read operation is going to be relatively quick as the most that ever needs to be read is the one index related entry and the one table row.

The net result is that now accessing a row from a small table via a Unique index requires only 2 latch accesses vs. the initial FTS example which required 8 latch gets as none of the FTS consistent gets are examinations.

Now you might say that these are all very small numbers, that 4 consistent reads isn’t that much, that 8 latches isn’t really that huge a number and reducing 8 latches down to 2 latches isn’t really going to be that noticeable. Yes it is effectively a 75% reduction but it’s a 75% reduction of not very much.

And if you’re discussing a single read of a single small lookup table you would likely be right.

But what if the small table is accessed frequently by the application, perhaps many 1,000s of times per minute. What if you have many such small tables, often used in small join operations by your OLTP applications. What if you have large numbers of users in a large application with many many such small table accesses. This effectively 75% saving can potentially become very significant, both in terms of the reduction in CPU load and also in the reduction of latch contention, which in turn can further reduce CPU loads.

A small improvement multiplied by a large amount can indeed make a difference …

However, I have one more step to go yet in further improving the efficiency of these small table lookups via an index.

One which can reduce the overall overheads by yet another 50% …

Differences Between Unique and Non-Unique Indexes Part 4.5 (Fix You) March 30, 2009

Posted by Richard Foote in Fragmented Indexes, Index Internals, Non-Unique Indexes, Oracle Indexes, Unique Indexes.
5 comments

In my last post, Part IV in this series, we looked at how a Unique Index can reuse space deleted within the same logical transaction, whereas a Non-Unique Index can not.  Deleted space within a Non-Unique index can only be reused by subsequent transactions.

It’s sometimes important to appreciate this distinction because as discussed in the various OTN and Ask Tom threads mentioned in Part IV, there are times when this can make a significant difference to the manageability and efficiency of the resultant index.

Now, it’s not everyday someone might for example delete all rows in a table and repopulate it again within a single transaction (the TRUNCATE command was of course developed for a reason). However, perhaps an application was developed without your involvement, perhaps a large proportion but not all of the data is being deleted or as someone mentioned on OTN, perhaps the table in question is a Materialized View being fully refreshed within a refresh group. There could therefore be occasions when a single transaction might indeed perform a large delete followed by a similarly sized insert.

In which case, whether an index is defined as Unique or Non-Unique might make a difference …

To begin with, let’s populate a table with 1M rows and create an associated Unique index:

SQL> insert into bowie select rownum, ‘BOWIE’ from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create unique index bowie_idx on bowie(id);

Index created.

 

Let’s look at the size of  this newly created Unique index:

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select blocks, lf_blks, del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      2176       2087           0

 

OK, let’s now delete the entire table and repopulate it again, within the same logical transaction

SQL> delete bowie;

1000000 rows deleted.

SQL> insert into bowie select rownum, ‘PINK FLOYD’ from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

 

Let’s look at the size difference for the Unique Index and see how many deleted index entries we have as a result:

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select blocks, lf_blks, del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      2176       2087           0

 

OK good, the index is actually identical in size and we have no deleted entries, not a one. All the deleted entries as a result of the delete command have been reused by the subsequent insert statement. This means of course that the index is just as efficient now after all this DML activity, as it was when the index was first created.

 

Let’s perform exactly the same demo, but this time with a Non-Unique index and see any differences …

SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number, name varchar2(20));

Table created.

SQL> insert into bowie select rownum, ‘BOWIE’ from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(id);

Index created.

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select blocks, lf_rows, lf_blks, del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      2304       2226           0

 

The first difference we notice is that the Non-Unique index after it has just been created is somewhat larger than the equvalent Unique index (2226 leaf blocks vs. 2087 leaf blocks). This is a direct result of the Non-Unique index having to store an extra byte for the length byte associated with the rowid being an additional index column for each and every one of the 1M index entries.

SQL> delete bowie;

1000000 rows deleted.

SQL> insert into bowie select rownum, ‘PINK FLOYD’ from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select blocks, lf_blks, del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      4608       4518     1000000

OK not quite so good, big difference here. Previously, the Unique Index remained unchanged and had no deleted index entries. However, the Non-Unique index is now effectively double the size it was previously and has 1M deleted index entries still within the index structure. Not a one was recycled and reused within the logical transaction.

This index is now potentially problematic, especially if there are going to be no or few subsequent inserts until it next gets refreshed, where the deleted entries can be reused but the current entries may again remain in the index after they’ve been deleted.

Again, it’s important to understand what is going on here so one can take the appropriate adminstration steps. Perhaps it might be better to drop the index and recreate it after the transaction (if permitted). Perhaps the truncate command isn’t such a bad idea after all (if permitted). Perhaps it might be better to police the Unique constraint with a Unique rather than a Non-Unique index after all.

Perhaps, it might be better to not perform the above within a single transaction and issue an intermediate commit after all (if permitted) …

 

SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number, name varchar2(20));

Table created.

SQL> insert into bowie select rownum, ‘BOWIE’ from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(id);

Index created.

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select blocks, lf_rows, lf_blks, del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      2304       2226           0

SQL> delete bowie;

1000000 rows deleted.

Because if we just issue the commit at this point in the process …

SQL> commit;

Commit complete.

SQL> insert into bowie select rownum, ‘PINK FLOYD’ from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select blocks, lf_rows, lf_blks, del_lf_rows from index_stats;

    BLOCKS    LF_BLKS DEL_LF_ROWS
---------- ---------- -----------
      2304       2226           0

 

We would not have this problem as the subsequent transaction that performs the insert can reused all the deleted space associated with the first delete transaction. 

If one understands how indexes work and understands how deleted space can be reused, one can prevent many potential issues and unnecessary maintenance tasks.

Prevention is always the best cure …

Differences Between Unique and Non-Unique Indexes Part IV (Take It Back) March 25, 2009

Posted by Richard Foote in Index Internals, Non-Unique Indexes, Oracle Indexes, Unique Indexes.
11 comments

I’ve previously discussed various differences between Unique and Non-Unique indexes (Part I, Part II and Part III) and why I have a preference to implement Unique indexes whenever possible and practical.

Various recent discussions on the OTN forums and on Ask Tom reminded me that I hadn’t yet discussed on this blog another subtle, but potentially significant difference between Unique and Non-Unique indexes.

As I’ve previously discussed, there’s actually no such thing as a Non-Unique index entry as such as Oracle ensures all index entries are effectively unique by adding the rowid to the index key for all Non-Unique indexes. Fundamentally, this is essential because Oracle needs some way of efficiently finding the precise index entry associated with an update or delete operation. Without having the rowid as part of the index key, Oracle would be forced to navigate to the first occurrence of an index value and search through all occurrences of the index value until it finds the specific entry containing the rowid of interest. This could potentially result in visiting many leaf blocks if the index value spans multiple leaf blocks. By including the rowid as the last index key column, non-unique index values are further ordered based on the corresponding rowid within the same indexed values. Oracle can therefore always navigate directly to the leaf block containing the exact index entry of interest as the rowid can be included in the branch blocks to determine both the index entry and rowid ranges found in specific leaf blocks.

If we look at a simple example by creating a one row table with a Non-unique index:

SQL> create table bowie (id number, name varchar2(20));

Table created.

SQL> insert into bowie values (1, ‘BOWIE’);

1 row created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(id);

Index created.

SQL> select header_file, header_block from dba_segments where segment_name = ‘BOWIE_IDX';

HEADER_FILE HEADER_BLOCK
----------- ------------
          7       124937

Let’s dump the index block …

SQL> alter system dump datafile 7 block 124938;

System altered.
Leaf block dump
===============
header address 425713756=0x195fe05c
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 1
kdxcofbo 38=0x26
kdxcofeo 8024=0x1f58
kdxcoavs 7986
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: ——, lock: 0, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  01 c1 e7 8a 00 00

Notice how the rowid is an additional index column within the index entry for the Non-Unique index.

Now if we were to delete and subsequently re-insert a row in the table with same index value within a single transaction, note the rowid of the new row by definition will differ from the deleted row. Therefore, we would need a different index entry for the new index row because if the rowids differ, then the associated index entries must differ as well. Note also (and this is critical) that because we would have a different rowid, if we had multiple index entries with the same key, this new index entry might not be in the same logical order as that of the deleted index entry. In fact, it’s quite possible that the new index entry might actually need to be stored in a totally different leaf block if this specific index value spanned multiple index leaf blocks because the index entries, including the rowids must always be logically ordered.

Therefore, if we were to delete and re-insert the same index value within a single transaction, Oracle is forced to create a new index entry and will not reuse the existing, deleted index entry.

So continuing with the demo, let’s delete the row:

SQL> delete bowie;

1 row deleted.

and now re-insert a row with the same indexed value within the same transaction:

SQL> insert into bowie values (1, ‘THIN WHITE DUKE’);

1 row created.

SQL> commit;

Commit complete.

If we look at a block dump now …

Leaf block dump
===============
header address 425713756=0x195fe05c
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0x28
kdxcofeo 8012=0x1f4c
kdxcoavs 7972
kdxlespl 0
kdxlende 1
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: —D–, lock: 2, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  01 c1 e7 8a 00 00
row#1[8012] flag: ——, lock: 2, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  01 c1 e7 8a 00 01
—– end of leaf block dump —–

We notice the previous index entry has been logically deleted and Oracle has created a new index entry with the new associated rowid.

 

Let’s now run exactly the same demo again, but this time with a Unique index instead of the Non-Unique index …

SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number, name varchar2(20));

Table created.

SQL> insert into bowie values (1, ‘BOWIE’);

1 row created.

SQL> commit;

Commit complete.

SQL> create unique index bowie_idx on bowie(id);

Index created.

SQL> select header_file, header_block from dba_segments where segment_name = ‘BOWIE_IDX';

HEADER_FILE HEADER_BLOCK
----------- ------------
          7       125193

SQL> alter system dump datafile 7 block 125194;

System altered.

Leaf block dump
===============
header address 371859548=0x162a205c
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 1
kdxcosdc 0
kdxconro 1
kdxcofbo 38=0x26
kdxcofeo 8025=0x1f59
kdxcoavs 7987
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 6
kdxlebksz 8036
row#0[8025] flag: ——, lock: 0, len=11, data:(6):  01 c1 e8 8a 00 00
col 0; len 2; (2):  c1 02
—– end of leaf block dump —–

 

Notice the big difference here. Because the index has been defined as Unique, all the associated index entries must be unique. It’s simply not possible to have duplicate index entries within a Unique index structure. Therefore, it’s not necessary to have the rowid as a separate column of the index entry as the index values themselves are sufficient to uniquely identify each and every index entry. The rowid is basically just another piece of overhead associated with the index entry rather than a separate index column. The length of this unique index entry is just 11 bytes, where it was previously 12 bytes, because we no longer need to store the length byte associated with the second index column necessary in the Non-unique index for the rowid.

And now comes the subtle difference …

If we were to now delete and re-insert the same index value within a single transaction, Oracle can now reuse the same, deleted index entry, because the index entry is effectively identical to the deleted one. The only possible difference is the rowid but the rowid is no longer a part of the index column list and so can just be updated as necessary. Note also (and this is the critical bit for Unique indexes), because the actual index value remains the same, the order of the index entry within the index must also remain the same. There is no need to move the re-inserted index entry to another part of the index structure because deleting and re-inserting the same index entry does not logically alter the order of where the index entry must reside.

Therefore, if we were to delete and re-insert the same index value within a single transaction, Oracle does not need to create a new index entry and can simply reuse the existing, deleted index entry.

So continuing with the demo, let’s delete the row:

SQL> delete bowie;

1 row deleted.

and now re-insert a row with the same indexed value within the same transaction:

SQL> insert into bowie values (1, ‘THIN WHITE DUKE’);

1 row created.

SQL> commit;

Commit complete.

If we look at a block dump now …

Leaf block dump
===============
header address 371859548=0x162a205c
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 1
kdxcosdc 0
kdxconro 1
kdxcofbo 38=0x26
kdxcofeo 8025=0x1f59
kdxcoavs 7987
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 6
kdxlebksz 8036
row#0[8025] flag: ——, lock: 2, len=11, data:(6):  01 c1 e8 8a 00 01
col 0; len 2; (2):  c1 02
—– end of leaf block dump —–

We note that Oracle has indeed reused the previously deleted index entry and has simply updated the rowid with the new rowid value. There is no deleted index entry, Oracle has simply changed the associated rowid to that of the new row in the table for the existing Unique index entry.

Where an Update of an index entry is actually effectively a delete and an insert of an index entry, notice that by contrast for Unique indexes, a delete and a re-insert operation is effectively an update of an index entry !!

In my next post I’ll highlight how this difference can be critical to the behaviour and efficiency of an index and why it’s important to understand how indexes work to avoid and prevent potential issues.

NOVALIDATE Constraints Part II – Does It Matter ? July 30, 2008

Posted by Richard Foote in Constraints, Novalidate Constraints, Oracle Cost Based Optimizer, Oracle Indexes, Primary Key, Unique Indexes.
13 comments

As promised, more on NOVALIDATE constraints.

As previously discussed, a Primary Key or a Unique Key constraint that’s not validated is by default policed by a Unique index regardless. If there are no duplicate values, then no worries (yes, I’m Australian), Oracle will create the Unique index and enable the constraint in a NOVALIDATE state. If there are duplicate values, Oracle will complain about creating the Unique Index to police the constraint and you must either explicitly create a Non-Unique index when creating the constraint or use an existing Non-Unique index.

So what are the implications, if any, of having a Primary key constraint in a NOVALIDATE state, especially if a Unqiue index is used to police the constraint ? The data must really be unique else the Unique Index could not have been created, right ? Also Oracle can take advantage of all the benefits associated with having a Unique index such as less consistent reads and latching overheads as previously discussed.

Following on from the demo in Part I, if we have a table with a Primary Key in a NOVALIDATE state, policed by a Unique Index:

SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) ENABLE NOVALIDATE;

Table altered.

SQL> SELECT constraint_name, validated, uniqueness
FROM user_constraints c, user_indexes i
WHERE c.constraint_name = i.index_name AND c.table_name= ‘ZIGGY';

CONSTRAINT_NAME VALIDATED     UNIQUENES
--------------- ------------- ---------
ZIGGY_PK        NOT VALIDATED UNIQUE

Oracle will be able to use the Unique index to perform a nice, efficient, low latching Unique Scan with the index:

SQL> SELECT * FROM ziggy WHERE id = 42;

------------------------------------------
|Id|Operation                   |Name    |
------------------------------------------
| 0|SELECT STATEMENT            |        |
| 1| TABLE ACCESS BY INDEX ROWID|ZIGGY   |
|*2|  INDEX UNIQUE SCAN         |ZIGGY_PK|
------------------------------------------

Everything’s perfect regardless of the PK constraint not being validated, right ?

Well, not exactly.

Remember, a PK constraint requires the data to be Unique AND Not Null. Now the Unique Index guarantees the data is indeed unique but it does nothing to protect us from having possible NULL values in our data. The index will simply ignore and not index any index entries that are fully NULL, therefore the PK column(s) could potentially, just maybe, contain NULLS. Brian Tkatch in a comment in Part I has a nice example of how this is possible.

This mean Oracle can not guarantee the index has index entries for every row in the table as any rows with a NULL PK will not be indexed. This can have serious reprecussions for the CBO when deciding an optimal execution plan.

For example, a query such as the following COUNT(*) query which could potentially be serviced via a “smaller” PK index segment can not use the Unique index and is forced to use either another index or a Full Table Scan:

SQL> select count(*) from ziggy;

---------------------------------
| Id| Operation          | Name |
---------------------------------
|  0| SELECT STATEMENT   |      |
|  1|  SORT AGGREGATE    |      |
|  2|   TABLE ACCESS FULL| ZIGGY|
---------------------------------

Another example, this query with an ORDER BY clause could potentially use the Unique index to retrieve the data and so avoid the sort operation as the Clustering Factor of the index is very good. However, it can’t as again, the CBO can’t guarantee all data will be retrieved via the index:

SQL> select * from ziggy order by id;

10000 rows selected.

---------------------------------
| Id| Operation          | Name |
---------------------------------
|  0| SELECT STATEMENT   |      |
|  1|  SORT ORDER BY     |      |
|  2|   TABLE ACCESS FULL| ZIGGY|
---------------------------------

However, if only we just validate the constraint, everything changes:

SQL> ALTER TABLE ziggy ENABLE VALIDATE PRIMARY KEY;

Table altered.

The COUNT(*) query suddenly starts using the index as a cheaper alternative as now, there can’t be any null values and so the index must reference all possible rows:

SQL> select count(*) from ziggy;

-------------------------------------
|Id|Operation             | Name    |
-------------------------------------
| 0|SELECT STATEMENT      |         |
| 1| SORT AGGREGATE       |         |
| 2|  INDEX FAST FULL SCAN| ZIGGY_PK|
-------------------------------------

The ORDER BY query suddenly starts using the index and avoids performing the sort operation as again, the index will now guarantee all rows are returned in a sorted order:

SQL> select * from ziggy order by id;

10000 rows selected.

------------------------------------------
|Id|Operation                   |Name    |
------------------------------------------
| 0|SELECT STATEMENT            |        |
| 1| TABLE ACCESS BY INDEX ROWID|ZIGGY   |
| 2|  INDEX FULL SCAN           |ZIGGY_PK|
------------------------------------------

The moral of the story. Provide the CBO with as much information as possible, as it can potentially use the information to determine a more optimal execution plan. Having a NOVALIDATE constraint possibly hides valuable information from the CBO and so needs to be used with caution.

NOVALIDATE Constraints – No really … July 28, 2008

Posted by Richard Foote in Constraints, Indexing Tricks, Novalidate Constraints, Oracle Indexes, Primary Key, Unique Indexes.
22 comments

There have been a number of posts recently on the OTN database forum regarding the whole topic of NOVALIDATE of constraints and the associated indexes so I thought it might be worth going over a couple of interesting little quirks with all this.

A NOVALIDATE constraint is basically a constraint which can be enabled but for which Oracle will not check the existing data to determine whether there might be data that currently violates the constraint.

This is useful if we know there’s data that violates the constraint but we want to quickly put on a constraint to prevent further violations, with the intention to clean up any possible violations at some future point in time.

It’s also potentially useful if we know the data is clean and so want to prevent the potentially significant overheads of Oracle having to check all the data to ensure there are indeed no violations.

I previously discussed the use of Non-Unique Indexes for manageing Primary and Unique Key Constraints but there are a few little traps one can easily fall into if one doesn’t understand these two very important fundamentals:

  1. By default, Oracle will attempt to create a Unique Index to police a PK or UK constraint
  2. A NOVALIDATE constraint requires a Non-Unique Index for the constraint to really be “Novalidated”

Get these two concepts confused and things can easily get a little difficult to follow …

Here’s a little example of how things can start to get confusing. First, let’s create a simple little table and populate it with a few rows.

SQL> CREATE TABLE ZIGGY (id NUMBER, text VARCHAR2(20));

Table created.

SQL> INSERT INTO ziggy SELECT rownum , ‘Ziggy’ FROM dual CONNECT BY LEVEL <= 10000;

10000 rows created.

Note that the ID column is populated with unique values. However, let’s now introduce a duplicate value, 42:

SQL> INSERT INTO ziggy VALUES (42, ‘DUPLICATE’);

1 row created.

SQL> COMMIT;

Commit complete.

OK, we now want to add a Primary Key to this table but because we suspect there might be some duplicate values which we intend to clean up at some future point in time, we want to create the constraint with NOVALIDATE:

SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) ENABLE NOVALIDATE;
ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) ENABLE NOVALIDATE
                                 *
ERROR at line 1:
ORA-02437: cannot validate (BOWIE.ZIGGY_PK) – primary key violated

Now what the hell is going on here ?

We clearly stated we want to create a NOVALIDATE constraint but Oracle appears to be ignoring this and is validating the constraint regardless and so generating an error because of the duplicate entry.

Why ?

Because by default Oracle will attempt to create a Unique index when creating a PK constraint. A Unique index MUST always contain unique values and so complains when it stumbles across our duplicate 42 ID value. The constraint is being effectively validated because the unique index will only be created providing there are indeed no duplicate values.

Not how I would have designed things but there you go …

However, if we either have an existing Non-Unique index which Oracle can use or we explicitly create a Non-Unique index, then we can proceed with creating the NOVALIDATE constraint as required:

SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id)
USING INDEX(CREATE INDEX ziggy_pk ON ziggy(id)) ENABLE NOVALIDATE;

Table altered.

If we look at the status of the constraint and the type of index used to police the constraint, we notice that the index is indeed a Non-Unique index and the constraint has not been validated:

SQL> SELECT constraint_name, validated, uniqueness
FROM user_constraints c, user_indexes i
WHERE c.constraint_name = i.index_name AND c.table_name= ‘ZIGGY';

CONSTRAINT_NAME VALIDATED     UNIQUENES
--------------- ------------- ---------
ZIGGY_PK        NOT VALIDATED NONUNIQUE

We have a PK constraint even though there are currently duplicate values of the PK column in the data.

OK, let’s now drop and the constraint, the Unique Index and delete the duplicate row:

SQL> ALTER TABLE ziggy DROP PRIMARY KEY;

Table altered.

SQL> DROP INDEX ZIGGY_PK;

Index dropped.

SQL> DELETE ziggy WHERE id = 42 and rownum <= 1;

1 row deleted.

SQL> COMMIT;

Commit complete.

The data is now clean and we have no existing constraint or index on the ID column:

SQL> SELECT constraint_name, validated FROM user_constraints WHERE table_name= ‘ZIGGY';

no rows selected

Let’s now do something that based on our understanding might appear to be a little odd, let’s try and recreate the constraint in a NOVALIDATE state but with a Unique index. This of course should now work as there are indeed no duplicates within the data:

SQL> ALTER TABLE ziggy ADD CONSTRAINT ziggy_pk PRIMARY KEY(id) ENABLE NOVALIDATE;

Table altered.

Success !! Let’s now look at the state of the constraint and the type of index created:

SQL> SELECT constraint_name, validated, uniqueness
FROM user_constraints c, user_indexes i
WHERE c.constraint_name = i.index_name AND c.table_name= ‘ZIGGY';

CONSTRAINT_NAME VALIDATED     UNIQUENES
--------------- ------------- ---------
ZIGGY_PK        NOT VALIDATED UNIQUE 

As expected, we have a constraint that’s policed by a Unique index that has not been validated.

This might appear be a little odd, because the question you might well now ask is why bother no validating a constraint that has effectively been validated anyways as the use of the Unique index has guaranteed there can not possibly be any duplicate values else the creation of the Unique index would have failed ?

We effectively have a validated constraint which Oracle is still classifying as being not validated :)

Then again, maybe not …

More later.

Differences between Unique and Non-Unique Indexes (Part III) December 30, 2007

Posted by Richard Foote in Constraints, Index Internals, Oracle Indexes, Performance Tuning, Unique Indexes.
5 comments

A comment by Robert in Part II of this series reminded me of another subtle difference between Unique and Non-Unique Indexes. Now this difference is likely to be of minimal consequence to most applications as most applications don’t generally have problems with Primary Key (PK) or Unique Key (UK) constraint violations (and if they do, this is likely to be the least of their worries). But it’s a interesting difference nonetheless, something to keep in the back of your mind and a little tit-bit to end the year on.

When a row is inserted into a table or when a PK or UK is modified, Oracle of course needs to ensure that either the PK or UK constraint is not violated. If the constraint is policed via a Unique index, as previously discussed, Oracle knows the value must and can only ever be unique and so performs the constraint validation before the Unique index is actually modified. If the PK or UK is violated, the Unique index can not possibly have been changed as all the associated index entries must always be unique and so only the undo (and redo) of the changes associated with the table data blocks are actually generated and need to be subsequently rolled back.

However, if the PK or UK constraint is policed via a Non-Unique index, the mechanism for applying the changes differs somewhat. As the index is Non-Unique, as previously discussed, Oracle is not quite so certain as to the state of play and performs the constraint validation after the associated changes are made to the Non Unique index. If the PK or UK constraint is violated, both undo and redo of the Non-Unique index has been generated and both changes to the table data blocks and the index blocks need to be rolled back.

This means there’s an extra cost associated with violating a constraint if the constraint is policed via a Non-Unique Index vs. a Unique index. When performing media recovery, it also means that there’s an additional cost associated with performing the recovery. Obviously the more frequent the constraint violations, the greater the overall penalties. Also, the larger the PK or UK values, the greater the penalties.

See this little demo to illustrate the differences between a Unique and a Non-Unique index in the redo and undo generated when a constraint is violated: Difference in redo and undo between a Unique and a Non-Unique Index.

As mentioned, this difference in behaviour between Unique and Non-Unique Indexes is unlikely to be an issue. However, in applications or environments where there may be a significant number of such violations, it may be something to keep in the back of your mind.

For a more detailed discussion and where it could be an issue, see Eric Emrick’s presentation.

Differences between Unique and Non-Unique Indexes (Part II) December 21, 2007

Posted by Richard Foote in Index Access Path, Index Internals, Indexing Tricks, Oracle Cost Based Optimizer, Oracle Indexes, Primary Key, Unique Indexes.
22 comments

The most significant difference between a Unique and a Non-Unique index is of course the simple fact that in one index, all index entries MUST be unique and in the other index there can be duplicates of an index entry.

Although an obvious distinction between the two, it’s also a crucial difference as well.

When Oracle uses a Unique Index to scan for a specific value (via an equality predicate on all indexed columns or when policing a constraint ), there can only be one of two possible results. The value can exist returning at the very most one value or the value doesn’t exist returning 0 values. That’s it, 1 row or none. The value either exists or it doesn’t.

This fact means Oracle doesn’t have to worry about a whole bunch of things when dealing with Unique indexes during equality or unique checking processes. It doesn’t have to check the next index entry just in case there’s a second or more entries with the same value. It doesn’t have to worry about the potential of having to skip across to the next leaf page if the specific value it reads happens to be the maximum value in the current leaf page. It doesn’t have to worry about pointers to these “adjacent” leaf blocks changing on it due to block splits. It doesn’t have to concern itself with potentially visiting more than the one table data block during the index access operation.

Life is simple, it’s either 1 row or none.

Not so for Non-Unique indexes. With a Non-Unique index, there are no such guarantees. With a Non-Unique index, there are 3 categories of possibilities. An index scan could return 0 rows, it could return 1 row or it could return more than one row. It could potentially need to go and visit more than the current leaf block to return all the matching rows. It could potentially need to go and visit more than one table block.

Life’s not quite so “simple” for a Non-Unique index.

Note also and most importantly that life gets no easier for a Non-Unique index that polices a PK or Unique key constraint.

Even though there’s a PK or Unique constraint on a column, to Oracle, it’s just another Non-Unique index with the same “vague” possibilities. Remember that PK and Unique constraints can be enabled with NOVALIDATE meaning that even with a PK or Unique constraint, there’s still the possibility of duplicate index entries in the Non-Unique index. Remember constraints can be DEFERRABLE, meaning that even with a PK or Unique constraint, there’s still the possibility of duplicate index entries in the Non-Unique index.

This means that Oracle has to concern itself with a number of additional overheads, including having to “check” the next index entry, “just in case” it matches the required index value. It has to concern itself even with the possibility of having to visit the next index leaf block, “just in case”.

You will note when Oracle performs an equality search using a Unique Index, Oracle will perform an “INDEX UNIQUE SCAN” because the index entries MUST be unique.

You will note however when Oracle performs an equality search using a Non-Unique index, even if there’s a PK or Unique constraint of the column(s), Oracle will perform an INDEX RANGE SCAN, because it needs to scan multiple index entries “just in case”.

So are there any actual implications as a result of any of this ?

Yes.

When Oracle actually reads an index and processes the associated blocks in the buffer cache(s), Oracle uses a number of latches. These latches are used primarily to “protect” memory structures from concurrent activity. Very simplistically, by grabbing a latch, Oracle effectively performs a “lock” on the associated memory structure, perform whatever activity needs to be performed and releases the latch. These latches get grabbed and released (hopefully) extremely quickly (order of 1/10s of ms), but it’s a non zero value.

The issue with latches is that they’re a point of serialisation. If two (or more) processes want a specific latch, one (or more) has to wait. Latches also burn CPU. Only a teensy weeny bit at a time but some CPU nonetheless. They burn CPU while acquiring the latch and if fail due to latch contention, while attempting again and again to acquire the latch. They also burn CPU while performing the specific operation necessary of the latch.

Basically, the more latches, the greater the potential for contention, the greater the potential for latch related wait activity and perhaps most important of all, more CPU is required. In busy systems, there can be massive numbers of latch events and the best way to tune these events is to reduce where possible the number of latches required by the database environment. It’s one of the key reasons we try and reduce LIOs in a database as much as possible, to reduce the latch and CPU load on the system.

Because of the differences highlighted between Unique and Non-Unique indexes, the number and manner of latches required between the two indexes differs. And it differs significantly …

In this little demo, Latch Differences Between Unique and Non-Unique Indexes Demo, we compare the latches required to read an identical table, using a 2 level index. The  differences between the latch overheads of a Unique and a Non-Unique index are most interesting.

When using a Unique Index, Oracle required 3 consistent gets (one for the index root block, one for the leaf block and one for the table block). BUT, each consistent get was a consistent gets – examination, a special type of consistent get which only requires 1 latch (rather than the standard 2 latches).

So that’s a sum of 3 latches.

However, when using a Non-Unique index, Oracle required 4 consistent gets (one for the index root block, one for the leaf block, one for the table block and an additional one to recheck the leaf block for any duplicate index entries). BUT, only the 1 consistent read (for the index root block) was actually the “cheaper” consistent gets – examination, the other 3 were the more costly 2 latch variety.

So that’s a sum of 7 latches.

3 latches for the Unique index and 7 latches for the Non-Unique index.

That’s an increase of 133.3% in latches between the two types of indexes.

Now, the height of the index will change the ratio of latch difference between the two indexes. Also, in a busy system, there could potentially be differences in the types of latches used due to the current state or additional activity in a block.

However, the potential difference in latch requirements between a Unique or Non-Unique index can be very significant. But does a few additional latches here and there really make much of a difference ?

Well, of course it depends. On small scale systems with smaller loads, fewer indexes, fewer users and excess resources, the noticeable differences may be negligible.

However, in larger scale (especially OLTP) environments, a particular index may be accessed 100s or maybe 1000s of times a second. There may be 1000s of tables with 1000s of corresponding PK and Unique constraints policed by 1000s of Unique (or Non-Unique) indexes. It’s therefore not really of question of a few latches here or there. It’s a question of potentially a very significant proportion of overall latch related overheads.

Potentially when accessed, Non-Unique indexes could be generating double the latch related overheads for equality unique scan or unique checking index activity. Remember, the best way to tune latches and reduce latch contention is to simply reduce the requirement and load for latches.

The overall reduction in CPU and latch related wait activity could be significant between Unique and Non-Unique indexes because by using Non-Unique indexes you in the order of double the latches required for such activities.

Note also this doesn’t require any special parameters to be set or special tuning or monitoring by the DBA. It simply requires using Unique indexes to police PK or Unique constraints when there are no requirements of Non-Unique indexes. You then potentially gain a benefit each and every time the index is used for unique scan accesses.

Guess what type of access is extremely common in large scale OLTP environments …

The next time you complain about high CPU consumption or high latch contention and you’re tuned the application to death, just ask yourself how many Non-unique indexes are policing your PK or Unique Key constraints …

Local Index Issue With Partitioned PK and Unique Key Constraints December 20, 2007

Posted by Richard Foote in Constraints, Index Access Path, Local Indexes, Oracle Indexes, Partitioning, Performance Tuning, Unique Indexes.
11 comments

Nuno Souto (Noons) also asked a really interesting question on my Differences between Unique and Non-Unique Indexes blog entry (comment 4) that I thought it worthy of a separate blog entry to do the answer justice. The question was:

“Isn’t it still the case that unique indexes cannot be locally partitioned unless the partition key is part of the index key? Not sure if 11g removes this. If still so, that would weigh heavily in favour of non-unique indexing for PK on a table potentially requiring local index partitions.”

Simplistically, the answer to the first part is Yes it is still the case, even in 11g and the answer to the second part is No, it wouldn’t weigh heavily in favour of non-unique indexing for PK on a table requiring local index partitions. It wouldn’t actually be a consideration at all.

Let me explain why.

Firstly, there is a really really good reason why Oracle doesn’t allow us to create a Unique Index in which the Partition key is not part of a Local Index. It’s called protecting us from ourselves !!

Let’s start by mentioning constraints again.

Remember, the main reason we have indexes policing PK and Unique constraints is so that Oracle can very quickly and efficiently determine whether or not a new value already exists. Do a quick index look-up, is the value there, yes or no, allow the insert (or update), yes or no.

Just imagine for one moment what would happen if Oracle actually allowed us to create a Unique Local index in which the index didn’t include the partitioned column(s).

Lets say a table is Range Partitioned on column ‘A’ and we try and create a Unique Local index on just column ‘B’. Let’s assume we have (say) 500 table partitions meaning we must therefore have 500 local index partitions as well. When we insert a new value for our unique index for value B, it will attempt to do so in the corresponding local index partition as governed by the value A for the new row. However Oracle can’t just check this one index partition for uniqueness to ensure value of column B doesn’t already exist, Oracle would need to check all 500 index partitions because it would be possible for our new value of column B to potentially have previously been inserted into any of the other 499 partitions !!

Each and every insert into our partitioned table (partitioned by column A) therefore would require Oracle to check all (say)500 index partitions each and every time to check for duplicates of column B. Again, it’s important to understand that any given value of column B could potentially be in any of the 500 partitions, IF Oracle allowed us to create a Local Partitioned Index just on column B.

Checking all 500 index partitions looking for a specific value of column B would obviously be impractical, inefficient and totally un-scalable. Therefore Oracle doesn’t allow us to do this. It doesn’t allow us to create a Local index in which the indexed columns does’t include the partitioning columns as well.

This is actually a good thing.

If you want to create a Unique index in a partitioned table, you MUST either add all the partitioned columns and make it part of the LOCAL unique index (so that way each and every insert would only have to check the one local partition as this value is known now it’s part of the index) or you must create it as a GLOBAL index (in which again, Oracle only has to check the one index structure).

It actually makes a lot of sense to do this.

Moving onto the second part of the question. Let’s just use a Local Non-Unique index to police our PK constraints then.

Fortunately this isn’t allowed either for exactly the same reasons. You can’t create a Local Non-unique index to police a PK (or Unique) constraint if the Constraint does not also include the partitioned columns. Otherwise again, Oracle would need to check each and every index partition to determine whether the constraint has been violated or not.

If you attempt to use an existing Local Non-Unique index to police a PK or Unique constraint that does not contain the partitioned columns, you will get an error saying it can’t create the (by default Global index) because the useless Local Non-Unique index (from a policing the constraint point of view) already exists.

Again if you want to create a Non-Unique index to police a PK or Unique constraint you must either ensure the constraint includes all the partitioned columns in which case it can be Local or you must use a Global Non-Unique index.

In other words, the rules apply equally to both Unique and Non-Unique indexes.

So it’s not really a case of Oracle not allowing one to create a Local Unique index without including the partitioned columns (although that’s of course true) but really a case of Oracle not allowing a PK or Unique *constraint*  to be policed via *any* Local index (whether Unique or Non-Unique), unless the partitioned columns are also included.

Little demo to illustrate: Local Index Issue With Partitioned PK and Unique Key Constraints

Differences between Unique and Non-Unique Indexes (Part I) December 18, 2007

Posted by Richard Foote in Constraints, Deferrable Constraints, Index Internals, Indexing Tricks, Novalidate Constraints, Oracle Indexes, Primary Key, Unique Indexes.
32 comments

I’ve had a number of comments regarding my earlier blog entry where I recommended avoiding Deferrable and Novalidate constraints unless you need them and consider using Unique Indexes rather than Non-Unique Indexes where possible.

Why such a recommendation, aren’t Unique and Non-Unique indexes practically the same thing when it comes to policing constraints ?

Sure one index explicitly prevents the insertion of duplicates while the other doesn’t. Yes, dropping/disabling  a constraint policed by an automatically created Unique index causes the index to be dropped if you forget the KEEP INDEX clause.

But that’s about it, right ?

Well, if you need a constraint to be deferrable, then you must create (either implicitly or explicitly) a Non-Unique index. If you want to enable a constraint with novalidate, then again you can only do so with a Non-Unique index in place policing the constraint.

It does all rather sound like Non-Unique indexes have all the advantages and allows for all the flexibility one could want. Non-Unique indexes allows for both deferrable and novalidate constraints, they don’t get dropped when the associated constraint is dropped / disabled and they can actually police both PK and Unique constraints.

What possible benefits are there in Unique Indexes ?

Well, providing you don’t need your constraints to be deferrable, you validate your constraints when they get created/enabled and you don’t go around dropping PK and/or Unique constraints on too regular a basis (or remember the KEEP INDEX clause if you don’t want your index dropped when you do), then there are a number of reasons why you may just want to consider using Unique indexes over Non-Unique indexes.

There are actually a number of key differences between Unique and Non-Unique indexes, both in the manner in which they’re stored by Oracle and in the manner in which they get processed.

In Part I, I’m just going to focus on the differences in how Oracle physically stores index entries.

In actual fact, there’s really no such thing as a Non-Unique index in Oracle. In order for Oracle to be able to determine the location of any specific index row entry and for Oracle to be able to determine an appropriate “order” for each index row entry, internally, Oracle coverts all Non-Unique indexes into a Unique index. It does this by using the associated ROWID of the index row entry as an additional “column”. As each ROWID is unique, this effectively makes all index entries in a Non-Unique index unique as well. Oracle uses the unique combination of the Non-Unique index value and the associated ROWID to then determine the appropriate order and hence appropriate location within the index structure in which to store the index row entry.

By Oracle making the ROWID an additional column, it also has to allocate an additional byte per index row entry in order to store the length of this column. That’s one teeny weeny little byte extra for each and every index row entry.

So what ?

Well, for indexes that don’t have a particularly large index key length, that one byte can be a significant proportion of the overall key length. Now Oracle needs to allocate 2 byes per row entry for various flags and locking information, it requires 6 bytes for the rowid and 1 byte for each column entry. That’s 9 bytes minimum plus the length of the indexed value itself.

Well how large is a typical unique index entry? Well that of course all depends and some PK  / (and especially) Unique values can be quite large. But many many PK values are simply sequenced based numerical values, created nice and small so as to reduce overheads when stored in dependent child tables.

But can it really make any noticeable difference ?

Well, this little demo shows two tables with 1 million numeric PK values: Compare internal index storage between Unique and Non-Unique Indexes

Table test1 is created with a Non-Unique Index, table test2 is created with a Unique Index. The demo shows a partial block dump of a leaf block from each index, highlighting how the Non-Unique index requires an additional byte per index row entry.

The Unique index manages to hold 533 leaf entries in the block while the Non-Unique index could only hold 500. Comparing the total sizes of the two indexes, the Unique index required 1875 leaf blocks while the Non-Unique index required 1999 leaf blocks.

That’s an increase of approximately 6.6% in leaf blocks required for the Non-Unique index to store exactly the same number of index entries as the Unique Index (in this particular example).

That’s 6.6% less storage, that’s a reduction of 6.6% in block splitting and block allocations, that’s a reduction of 6.6% in the cost of full index scans, that’s 6.6% less memory required to cache the index, etc. etc.

The point here is that these savings don’t require any expensive, periodic rebuilding of indexes. They doesn’t require any additional fancy scripts or additional monitoring and processing. The DBA doesn’t have to calculate irrelevant statistics or demand scheduled outages to claim these savings.

This a getting more “dollars for your buck”  freebie from Oracle purely and simply by using a Unique index instead of an Non-Unique index.

Note also that not one or two but ALL of your numeric based PKs have the potential to get these types of savings. Obviously the larger the actual PK or Unique key values, the lesser a byte is in proportion to the overall key length and the less percentage savings.

But it’s not a bad payback for many many of your indexes, purely and simply by using Unique indexes instead of Non-unique indexes where possible …

This is but one of the benefits of using Unique Indexes. More (potentially significant) advantages to follow …

Follow

Get every new post delivered to your Inbox.

Join 1,918 other followers