jump to navigation

Differences between Unique and Non-Unique Indexes (Part I) December 18, 2007

Posted by Richard Foote in Constraints, Deferrable Constraints, Index Internals, Indexing Tricks, Novalidate Constraints, Oracle Indexes, Primary Key, Unique Indexes.
trackback

I’ve had a number of comments regarding my earlier blog entry where I recommended avoiding Deferrable and Novalidate constraints unless you need them and consider using Unique Indexes rather than Non-Unique Indexes where possible.

Why such a recommendation, aren’t Unique and Non-Unique indexes practically the same thing when it comes to policing constraints ?

Sure one index explicitly prevents the insertion of duplicates while the other doesn’t. Yes, dropping/disabling  a constraint policed by an automatically created Unique index causes the index to be dropped if you forget the KEEP INDEX clause.

But that’s about it, right ?

Well, if you need a constraint to be deferrable, then you must create (either implicitly or explicitly) a Non-Unique index. If you want to enable a constraint with novalidate, then again you can only do so with a Non-Unique index in place policing the constraint.

It does all rather sound like Non-Unique indexes have all the advantages and allows for all the flexibility one could want. Non-Unique indexes allows for both deferrable and novalidate constraints, they don’t get dropped when the associated constraint is dropped / disabled and they can actually police both PK and Unique constraints.

What possible benefits are there in Unique Indexes ?

Well, providing you don’t need your constraints to be deferrable, you validate your constraints when they get created/enabled and you don’t go around dropping PK and/or Unique constraints on too regular a basis (or remember the KEEP INDEX clause if you don’t want your index dropped when you do), then there are a number of reasons why you may just want to consider using Unique indexes over Non-Unique indexes.

There are actually a number of key differences between Unique and Non-Unique indexes, both in the manner in which they’re stored by Oracle and in the manner in which they get processed.

In Part I, I’m just going to focus on the differences in how Oracle physically stores index entries.

In actual fact, there’s really no such thing as a Non-Unique index in Oracle. In order for Oracle to be able to determine the location of any specific index row entry and for Oracle to be able to determine an appropriate “order” for each index row entry, internally, Oracle coverts all Non-Unique indexes into a Unique index. It does this by using the associated ROWID of the index row entry as an additional “column”. As each ROWID is unique, this effectively makes all index entries in a Non-Unique index unique as well. Oracle uses the unique combination of the Non-Unique index value and the associated ROWID to then determine the appropriate order and hence appropriate location within the index structure in which to store the index row entry.

By Oracle making the ROWID an additional column, it also has to allocate an additional byte per index row entry in order to store the length of this column. That’s one teeny weeny little byte extra for each and every index row entry.

So what ?

Well, for indexes that don’t have a particularly large index key length, that one byte can be a significant proportion of the overall key length. Now Oracle needs to allocate 2 byes per row entry for various flags and locking information, it requires 6 bytes for the rowid and 1 byte for each column entry. That’s 9 bytes minimum plus the length of the indexed value itself.

Well how large is a typical unique index entry? Well that of course all depends and some PK  / (and especially) Unique values can be quite large. But many many PK values are simply sequenced based numerical values, created nice and small so as to reduce overheads when stored in dependent child tables.

But can it really make any noticeable difference ?

Well, this little demo shows two tables with 1 million numeric PK values: Compare internal index storage between Unique and Non-Unique Indexes

Table test1 is created with a Non-Unique Index, table test2 is created with a Unique Index. The demo shows a partial block dump of a leaf block from each index, highlighting how the Non-Unique index requires an additional byte per index row entry.

The Unique index manages to hold 533 leaf entries in the block while the Non-Unique index could only hold 500. Comparing the total sizes of the two indexes, the Unique index required 1875 leaf blocks while the Non-Unique index required 1999 leaf blocks.

That’s an increase of approximately 6.6% in leaf blocks required for the Non-Unique index to store exactly the same number of index entries as the Unique Index (in this particular example).

That’s 6.6% less storage, that’s a reduction of 6.6% in block splitting and block allocations, that’s a reduction of 6.6% in the cost of full index scans, that’s 6.6% less memory required to cache the index, etc. etc.

The point here is that these savings don’t require any expensive, periodic rebuilding of indexes. They doesn’t require any additional fancy scripts or additional monitoring and processing. The DBA doesn’t have to calculate irrelevant statistics or demand scheduled outages to claim these savings.

This a getting more “dollars for your buck”  freebie from Oracle purely and simply by using a Unique index instead of an Non-Unique index.

Note also that not one or two but ALL of your numeric based PKs have the potential to get these types of savings. Obviously the larger the actual PK or Unique key values, the lesser a byte is in proportion to the overall key length and the less percentage savings.

But it’s not a bad payback for many many of your indexes, purely and simply by using Unique indexes instead of Non-unique indexes where possible …

This is but one of the benefits of using Unique Indexes. More (potentially significant) advantages to follow …

About these ads

Comments»

1. dizwell - December 18, 2007

6% of trivial is still trivial!

Meanwhile, being locked out of a large production table for a couple of hours as a largeish index you never realised got dropped in the first place gets re-created isn’t so trivial.

It’s a peculiar line of argument, anyway. If you just dropped your index altogether, you’d be able to save 100% of block splits, 100% of leaf nodes and 100% of the memory needed to cache the index!

But of course, the index is there for a reason, and incurs costs by its very existence.

Well, non-uniqueness is an attribute that exists for a reason, and incurs costs by its very existence. You are right to point out those costs, but wrong (I think) to conclude that they are excessive to the point of warranting the cheer-leader performance you’re doing on behalf of uniqueness!

2. Alberto Dell'Era - December 19, 2007

I’m wondering whether a non-unique index might deteriorate the 90/10 block split optimization – since now the rowid is part of the key, hence the 90/10 might trigger only if you insert a new row that has the indexed columns AND the rowid greater than the “current” max value.

It’s probably a question of academic interest only though (since if the data is unique and you choose a non-unique index, the indexed columns are going to be always different anyway; and if the data is non-unique, you don’t have many options ;)

3. Richard Foote - December 19, 2007

Hi Howard

The fact you’ve accidentally dropped an index in production without realising it I think highlights a fundamental difference between us.

You might want to consider moving to 11g ASAP as index builds /rebuilds incur no DML-blocking locks at all …

4. Noons - December 19, 2007

Isn’t it still the case that unique indexes cannot be locally partitioned unless the partition key is part of the index key? Not sure if 11g removes this. If still so, that would weigh heavily in favour of non-unique indexing for PK on a table potentially requiring local index partitions.
I must admit I tend to use non-unique and constraint-based uniqueness. Mostly because I deal with various release levels of db and the “KEEP INDEX” option is relatively recent: IIRC it only became available in 10g?
Interesting though that there is such a difference in sizing for unique indexes. Had never realised that.

5. dizwell - December 19, 2007

Come on, Richard. Treat the subject as it deserves. **Most** people won’t be using 11g for quite some time (I give it a couple of years minimum).

And please at least acknowledge that we’re not talking about *me* dropping an index without realising it. That would indeed be dumb. We’re talking about the *database* dropping it without telling anyone.

Or perhaps it doesn’t! The real issue I have here is that you issue one command (alter table X disable constraint Y) and you may or may not end up dropping an index, depending on all sorts of issues! It’s the non-deterministic nature of things I hate…

Still, when someone has to start responding with personally-directed comments, I guess the technical argument is more or less over.

6. Richard Foote - December 19, 2007

I’ve decided to just say I strongly disagree with Howard’s view and leave it at that …

7. Richard Foote - December 19, 2007

Hi Alberto

Yes, rowid does have a part to play with regard to how index block splits occur.

Quick demo that 90-10 splits still do occur:

SQL> create table in_rainbows (thom varchar2(20));

Table created.

SQL> create index in_rainbows_idx on in_rainbows(thom);

Index created.

SQL> insert into in_rainbows select ‘Radiohead’ from dual connect by level <= 100000;

SQL> commit;

Commit complete.

SQL> analyze index in_rainbows_idx validate structure;

Index analyzed.

SQL> select pct_used from index_stats;

PCT_USED
———-
100

So yes, block splits are based on the contents of the rowid. We still performed 90-10 splits and the index was nice and full.

However, if we insert lots and lots of the same value, and it isn’t the maximum value, then we potentially have a problem. Because generally, rows get insert in rowid order. At least the block id and index row entry portions generally keep increasing (unless Oracle reuses a previously deallocated extent or deleted space within a table block, or to some extent if using ASSM or lots of freelists). The file number can flip flop across if a tablespace has multiple datafiles but that simply means we have two or however many of hot portions of an index that get hit for that specific value.

When we perform the 50-50 split (as it’s not the maximum value), the 1/2 empty portions of the leaf block are not going to be used, because the rowid usally increments. Therefore it’s quite possible to get fragmentation.

Same example as before but this time, I’ll insert a max value first:

SQL> drop table in_rainbows;

Table dropped.

SQL> create table in_rainbows (thom varchar2(20));

Table created.

SQL> create index in_rainbows_idx on in_rainbows(thom);

Index created.

SQL> insert into in_rainbows values (‘ZZZ’);

1 row created.

SQL> commit;

Commit complete.

SQL> insert into in_rainbows select ‘Radiohead’ from dual connect by level <= 100000;

SQL> commit;

Commit complete.

SQL> analyze index in_rainbows_idx validate structure;

Index analyzed.

SQL> select pct_used from index_stats;

PCT_USED
———-
51

Now the index is only 1/2 full instead of the previous 100% full.

Hopes it makes sense.

8. The Rumble Down Under « die Seilerwerks - December 19, 2007

[...] Don Seiler Looks like Richard Foote won’t be on Howard Rogers’ Christmas card list for much longer (see the [...]

9. Alberto Dell'Era - December 19, 2007

Yes, my question was a bit more subtle actually ;)

Could be simplified as follows: consider a single-column table that contains only one row, and the singleton row contains ‘Radiohead’ and its rowid is ‘+infinite’ (the maximum rowid that can be generated, say on the last block of the [say, single-datafile] tablespace where the table resides).
A non-unique index is defined on the column.

If I insert gazillions of rows, all with the same value ‘Radiohead’, will the presence of the ‘+infinite’ rowid prevent the 90/10 optimization from taking place, since the pair (‘Radiohead’, rowid) to be inserted is always going to be less than the current max value (‘Radiohead’, +infinite), even when, by some magic incantation, i can insert the rowid as an ordered ascending sequence ?

I’ve tried myself, but it’s very hard to find the “magic incantation” – so I hope you already know the answer :)

10. Richard Foote - December 19, 2007

And the answer is yes.

I don’t have time at the moment but one way (maybe not the easiest way) would be to create a tablespace with 1 datafile (say No 10) and filled it with table A. Then add another datafile to the tablespace (say No 11) and create table B (and the index although it could go anywhere) in here. Then shrink the datafile No 11 so there’s no free space. Now drop table A, freeing up space in datafile No 10. Now increase the size of table B so that it starts using the space in datafile 10.

You will find that as it not using the max rowid anymore, it will start to perform 50-50 splits within the index …

11. Alberto Dell'Era - December 19, 2007

Thanks Richard, most appreciated and very interesting !

12. Richard Foote - December 20, 2007

Hi Noons

I’m sorry, for some reason your comment was initially treated as spam !!

Please , if you don’t see your comment get through, just contact me directly.

To make up for ignoring you for so long and because I thought your comments on local indexes was so interesting, I though I’ll write a separate Blog entry just for you. How special :)

Yes, I believe keep indexes came with 10g.

But like I said previously, if you never disable PK/Unique constraints or you managed it properly when you do and you don’t need to novalidate or defer a constraint, then Unique indexes should be seriously considered.

13. Richard Foote - December 20, 2007

Hi Alberto

A Blog entry just for you as well !!

14. dizwell - December 21, 2007

Keep indexes made it in 9i, actually. See http://tinyurl.com/2l7mfq

15. coskan - January 14, 2008

Hi Mr Foote,

For the example on 90/10 split.

How does it implemented on ASSM tablespaces.

when I run the first one on ASSM tablespace it is not 100 but 64. What is the difference

16. Richard Foote - January 14, 2008

Hi Coskan

It implements it the same. Do you have a demo ? Database version and Platform ?

SQL> create table ziggy (id number, name varchar2(20));

Table created.

SQL> create index ziggy_idx on ziggy(id) tablespace index_ts;

Index created.

SQL> insert into ziggy select rownum, ‘David Bowie’ from dual connect by level <= 100000;

SQL> commit;

Commit complete.

SQL> analyze index ziggy_idx validate structure;

Index analyzed.

SQL> select pct_used from index_stats;

PCT_USED
———-
100

SQL> select tablespace_name, segment_space_management from dba_tablespaces where tablespace_name = ‘INDEX_TS';

TABLESPACE_NAME SEGMEN
—————————— ——
INDEX_TS AUTO

17. coskan - January 14, 2008

Hi Again

The Demo is below

SQL>
SQL> select * from v$version;

BANNER
—————————————————————-
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 – 64bi
PL/SQL Release 10.2.0.2.0 – Production
CORE 10.2.0.2.0 Production
TNS for HPUX: Version 10.2.0.2.0 – Production
NLSRTL Version 10.2.0.2.0 – Production

SQL> select default_tablespace from user_users;

DEFAULT_TABLESPACE
——————————
UTILS

SQL> select segment_space_management from dba_tablespaces where tablespace_name=’UTILS';

SEGMEN
——
AUTO

SQL> drop table in_rainbows purge;

Table dropped.

SQL> create table in_rainbows (thom varchar2(20));

Table created.

SQL> create index in_rainbows_idx on in_rainbows(thom);

Index created.

SQL> insert into in_rainbows select ‘Radiohead’ from dual connect by level commit;

Commit complete.

SQL> analyze index in_rainbows_idx validate structure;

Index analyzed.

SQL> select pct_used from index_stats;

PCT_USED
———-
64

SQL>

18. Richard Foote - January 14, 2008

Hi Coskan

Yes, the problem here is due to the fact the index is non-unique and it uses the rowid as part of the index key. With ASSM, Oracle effectively selects blocks at “random” from within blocks currently referenced by the segment bitmaps when inserting into the table.

Therefore rows aren’t necessarily being inserted into the table in block id order. Therefore the rowid for a newly inserted row isn’t necessarily the maximum rowid that has ever been inserted. Therefore if an index block fills and it wasn’t the result of the max rowid (which in many cases it won’t because it might be due to row being inserted into a table block physically prior to the last table block that’s ever been used), it won’t perform the 90-10 split and will perform 50-50 splits.

So it’s the table being in an ASSM TS that’s the problem, not the index.

I talked about how non-unique indexes are really unique in this blog entry http://richardfoote.wordpress.com/2007/12/20/do-rowid-index-row-entry-columns-impact-index-block-splits/.

Hope it makes sense.

19. Richard Foote - February 22, 2008

“Then I ran across a monster who was sleeping by a tree.
And I looked and frowned as the monstor was me.
Well I said hello and I said hello.
And I asked why not and I replied I don’t know”

20. Henish - November 25, 2009

Respected Sir,

In your demo above you said

“Well, for indexes that don’t have a particularly large index key length,
that one byte can be a significant proportion of the overall key length.
Now Oracle needs to allocate 2 byes per row entry for various flags and locking information,
it requires 6 bytes for the rowid and 1 byte for each column entry.
That’s 9 bytes minimum plus the length of the indexed value itself ”

So will non-unique index require 9 bytes more per index entry then unique index Or just one byte ?

Also is in non-unique index is rowid store twice i.e. as key column and as rowid?

Thanks

Richard Foote - November 25, 2009

Hi Henish

Just one byte extra for using a non-unique index per index entry. Not a lot I know but it adds up and it’s a “free” saving you get by implementing a Unique index. The above example creates a unique index that’s about 6.5% smaller, without ever having to rebuild the index.

No, the rowid is just stored the once. But it’s stored now as part of the index column list and so the index entry needs this extra byte to store the index column length.

So it’s just 1 byte extra for the column length, per index entry, that’s all.

21. Unique Bitmap Indexes Part I (Unnatural Selection) « Richard Foote’s Oracle Blog - March 24, 2010

[...] I’ve discussed previously, Oracle makes the Non-Unique index effectively unique by adding the rowid as an additional indexed [...]

22. [the] index and a kind of a kind of difference between index come - August 23, 2011

[...] article introduced the well the similarities and differences between the two kinds of index:Differences Between Unique and Non-Unique IndexesIn the final analysis, is that the two types of index of different structure.Quoted the this article [...]

23. daxeshpatelia - September 19, 2011

Hi
I am new to this blog and I have started reading this blog but when I click on link it gives following error. I already created an account with wordpress.com but still getting error. Can anyone help me how to solve this error.

Regards
Dax

“— 403: Access Denied —

This file requires authorization:

You must both be a user of this blog as well as be currently logged into WordPress.com”

24. Raj - December 17, 2012

Hi Richard,

It is a nice blog. Thanks for sharing the knowledge with us.

25. Raj - December 17, 2012

Hi Richard,

I am new to DBA line. i have few questions on index.

1. How the disk i/os depends on the B-tree level when the table is structured with a B-tree index
2.How to use an existing index to enforce a new constraint you are creating?

26. Richard Foote - December 18, 2012

Hi Raj

I don’t quite follow your first question. If not already in the buffer cache, when performing an index ranger scan, Oracle will have to perform a disk IO to read an index branch block at each level of the index in order to get to the specific leaf block(s) of interest. If the table is an Index Organised one, then each scan will need to read a block at each index level.

Oracle automatically uses an exisitng index to police a new constraint, if a suitable index already exists (which simply needs the leading columns of the index to match that of the constraint).

Hope this answers your questions.

27. sachinpawarid - December 7, 2013

Richard i am not able to see the content of the file “http://richardfoote.files.wordpress.com/2007/12/comparison-between-unique-and-non-unique-indexes.txt” I am getting that i should be member of the blog. How can i do that to i am ble to see the contents of this file? Thanks Sachin

28. Richard Foote - January 3, 2014

Hi Sachin

Yes, many of the older articles have links that no longer work. Sloooooowly getting them fixed.

29. vipankumarsharma - July 2, 2014

Hi Richard,

Can you please review if this right assessment?

There is another difference between non unique and unique indexes that non unique indexes might cause more buffer gets that unique indexes.

SQL> create index t1_id_ix on t1(id);

Index created.

SQL> select * /*+ gather_plan_statistics */ from t1 where id=102;

ID N1 V1 PADDING
———- ———- ———- —————————————————————————————————-
102 51 xxxxxx 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

SQL> select * from table(dbms_xplan.display_cursor(null,null,’allstats last’));

PLAN_TABLE_OUTPUT
———————————————————————————————————————————————–
SQL_ID brqkj67g3ty0u, child number 0
————————————-
select * /*+ gather_plan_statistics */ from t1 where id=:”SYS_B_0″

Plan hash value: 666863167

————————————————————————————————–
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 3 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.01 | 3 |
|* 2 | INDEX RANGE SCAN | T1_ID_IX | 1 | 1 | 1 |00:00:00.01 | 2 |
————————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

2 – access(“ID”=:SYS_B_0)

19 rows selected.

SQL> drop index t1_id_ix;

Index dropped.

SQL> create unique index t1_id_ix on t1(id);

Index created.

SQL> select * /*+ gather_plan_statistics */ from t1 where id=102;

ID N1 V1 PADDING
———- ———- ———- —————————————————————————————————-
102 51 xxxxxx 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000

SQL> select * from table(dbms_xplan.display_cursor(null,null,’allstats last’));

PLAN_TABLE_OUTPUT
———————————————————————————————————————————————–
SQL_ID brqkj67g3ty0u, child number 0
————————————-
select * /*+ gather_plan_statistics */ from t1 where id=:”SYS_B_0″

Plan hash value: 1600258760

————————————————————————————————–
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
————————————————————————————————–
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 2 |
| 1 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.01 | 2 |
|* 2 | INDEX UNIQUE SCAN | T1_ID_IX | 1 | 1 | 1 |00:00:00.01 | 1 |
————————————————————————————————–

Predicate Information (identified by operation id):
—————————————————

2 – access(“ID”=:SYS_B_0)

19 rows selected.

SQL> set line 80
SQL> desc t1
Name Null? Type
—————————————– ——– —————————-
ID NUMBER
N1 NUMBER
V1 VARCHAR2(10)
PADDING VARCHAR2(100)

SQL>

30. About Andalusian Horses - September 13, 2014

I every time spent my half an hour to read this blog’s content all the time
along with a mug of coffee.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 1,861 other followers

%d bloggers like this: