jump to navigation

Rebuilding Indexes and the Clustering Factor Quiz (One Of The Few) September 20, 2011

Posted by Richard Foote in Clustering Factor, Index Rebuild, Oracle Indexes, Quiz.
trackback

Today’s question has been prompted by various recent comments regarding the Clustering Factor (CF) of an index and how to change the CF requires a reorg of the underlining table.

It used to be quite a common myth that if the CF of an index was greater that “X” or based on some nonsensical formula the CF was greater than “Y”, then rebuilding the index somehow made everything better again. I believe it’s now much more commonly accepted that rebuilding an index does not change the CF of an index in any way. By rebuilding an index, the actual order of the index entries remains unchanged as does the order of the rows within the table and so the resultant CF can’t possibly change.

Pick any index, no matter how bad the CF or badly fragmented the index or table, take fresh statistics and after rebuilding the index (and fresh statistics if compute statistics on the index isn’t performed), the CF will remain the same.

However, there are nearly always exceptions …

Give an example of when rebuilding an index will significantly change the actual CF of an index, assuming 100% accurate statistics are gathered before/after the index rebuild on a table with no DML.

There are actually two such examples that spring to mind 🙂

Comments»

1. David Aldridge - September 21, 2011

Rebuilding and reversing?

drop table t1;

create table t1 (col1)
as
select rownum
from dual
connect by level <= 100000;

create index i1 on t1 (col1);

select clustering_factor
from user_indexes
where index_name = 'I1';

alter index i1 rebuild reverse;

select clustering_factor
from user_indexes
where index_name = 'I1';

table T1 dropped.
table T1 created.
index I1 created.
index I1 altered.
table T1 dropped.
table T1 created.
index I1 created.
CLUSTERING_FACTOR
———————-
153

index I1 altered.
CLUSTERING_FACTOR
———————-
99988

… and NOREVERSE'ing of course.

Like

Richard Foote - September 21, 2011

Hi David

Interesting …

I’ll have more to say later 😉

Like

2. Flado - September 21, 2011

A secondary index on an IOT: I don’t even need to rebuild it to change its CF 🙂
(Oracle 11.2)


drop table t purge;
create table t (l primary key, f ) organization index as select level l, rpad(level,100) f from dual connect by leveltrue, estimate_percent=>100); end;
/
select index_name, clustering_factor
from user_indexes where index_name='TF';
update t set l=l-1e6 where dbms_random.value(1,f)true, estimate_percent=>100); end;
/
select index_name, clustering_factor
from user_indexes where index_name='TF';
alter index tf rebuild;
begin dbms_stats.gather_table_stats(null, 'T', cascade=>true, estimate_percent=>100); end;
/
select index_name, clustering_factor
from user_indexes where index_name='TF';

Results:

drop table t succeeded.
create table succeeded.
create index succeeded.
anonymous block completed
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ----------------------
TF 20280

5613 rows updated
anonymous block completed
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ----------------------
TF 24288

alter index tf succeeded.
anonymous block completed
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ----------------------
TF 24288

Like

Flado - September 21, 2011

Mangled… sorry about that.
Another try:

drop table t purge;
create table t (l primary key, f ) organization index as select level l, rpad(level,100) f from dual connect by level<=1e5;
create index tf on t(f);
begin dbms_stats.gather_table_stats(null, 'T', cascade=>true, estimate_percent=>100); end;
/
select index_name, clustering_factor
from user_indexes where index_name='TF';
update t set l=l-1e6 where dbms_random.value(1,f)<1e3;
begin dbms_stats.gather_table_stats(null, 'T', cascade=>true, estimate_percent=>100); end;
/
select index_name, clustering_factor
from user_indexes where index_name='TF';
alter index tf rebuild;
begin dbms_stats.gather_table_stats(null, 'T', cascade=>true, estimate_percent=>100); end;
/
select index_name, clustering_factor
from user_indexes where index_name='TF';

Results...

drop table t succeeded.
create table succeeded.
create index succeeded.
anonymous block completed
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ----------------------
TF 20280

5613 rows updated
anonymous block completed
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ----------------------
TF 24288

alter index tf succeeded.
anonymous block completed
INDEX_NAME CLUSTERING_FACTOR
------------------------------ ----------------------
TF 24288

Like

Flado - September 21, 2011

Oops… I’m talking utter nonsence. Sorry about that.

Here are my (sensible this time, I hope) thoughts:
Since th CF depends only on two things – the order of the key values and the order of the rowids, a rebuild must change one of these in order to affect the CF.
We’re not allowed to move the table rows around, so that leaves only the order of the keys.
A rebuild can change that order if:
a) it reverses it, as David showed, or
b) a parameter has changed that affects the ordering of the keys – NLS parameters come to mind, probably in conjunction with a function-based index
In any case, we would end up with e different index, so the rebuild would not be safe.

For non-unique indexes the rowids are part of the key, but we can affect their order.
For bitmap indexes, two rowids per entry are part of the key and can be changed via a rebuild, but the CF has no meaning there.

Sorry, there is no Oracle for Maemo, so I cannot test any of this right now.

Cheers,
Flado

Like

3. Flado - September 21, 2011

“For non-unique indexes the rowids are part of the key, but we can NOT affect their order.”
No more comments – promise.

Like

Richard Foote - September 21, 2011

Hi Flado

A few comments for now.

Non-Unique indexes indeed does include the rowid as part of the index entry, but will simply rebuilding an index impact this.

Indeed the CF for a bitmap index is basically the number of index entries and so is effectively meaningless.

Like

David Aldridge - September 21, 2011

I’ve never seen a convincing explanation for why bitmap indexes should be treated like this. It is still provably (and intuitively) more efficient to physically cluster data segment rows by bitmap indexed columns, and that ought to make a difference to the cost calculation for bitmap index plans.

Like

4. vishaldesai - September 21, 2011

Index rebuild with different DOP will change clustering factor with no DML on table.

Demo 1: Block size 4k
=====================

VDESAI@db1 > create table tab1 (column1 number);

Table created.

VDESAI@db1 >
VDESAI@db1 > insert into tab1 select rownum from dual connect by level
VDESAI@db1 > create index ind1 on tab1(column1) ;

Index created.

VDESAI@db1 >
VDESAI@db1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, method_opt=>’FOR TABLE FOR
ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@db1 >
VDESAI@db1 > select blevel,leaf_blocks,clustering_factor
2 from user_indexes
3 where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 4643 6296

VDESAI@db1 >
VDESAI@db1 >
VDESAI@db1 > alter index ind1 rebuild parallel 16 ;

Index altered.

VDESAI@db1 >
VDESAI@db1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, method_opt=>’FOR TABLE FOR
ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@db1 >
VDESAI@db1 > select blevel,leaf_blocks,clustering_factor
2 from user_indexes
3 where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 4591 3282

Demo 2: Block size 8k
=====================

VDESAI@db2 > drop table tab1;

Table dropped.

VDESAI@db2 >
VDESAI@db2 > create table tab1 (column1 number);

Table created.

VDESAI@db2 >
VDESAI@db2 > insert into tab1 select rownum from dual connect by level
VDESAI@db2 > create index ind1 on tab1(column1) ;

Index created.

VDESAI@db2 >
VDESAI@db2 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, method_opt=>’FOR TABLE FO
R ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@db2 >
VDESAI@db2 > select blevel,leaf_blocks,clustering_factor
2 from user_indexes
3 where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 2226 1516

VDESAI@db2 >
VDESAI@db2 >
VDESAI@db2 > alter index ind1 rebuild parallel 16 ;

Index altered.

VDESAI@db2 >
VDESAI@db2 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, method_opt=>’FOR TABLE FO
R ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@db2 >
VDESAI@db2 > select blevel,leaf_blocks,clustering_factor
2 from user_indexes
3 where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 2233 1933

Demo 3: Block size 16k
======================

–DOP 16

VDESAI@db3 > drop table tab1;

Table dropped.

VDESAI@db3 >
VDESAI@db3 > create table tab1 (column1 number);

Table created.

VDESAI@db3 >
VDESAI@db3 > insert into tab1 select rownum from dual connect by level
VDESAI@db3 > create index ind1 on tab1(column1) ;

Index created.

VDESAI@db3 >
VDESAI@db3 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, method_opt=>’FOR TABLE FO
R ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@db3 >
VDESAI@db3 > select blevel,leaf_blocks,clustering_factor
2 from user_indexes
3 where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
1 1099 752

VDESAI@db3 >
VDESAI@db3 >
VDESAI@db3 > alter index ind1 rebuild parallel 16 ;

Index altered.

VDESAI@db3 >
VDESAI@db3 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, method_opt=>’FOR TABLE FO
R ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@db3 >
VDESAI@db3 > select blevel,leaf_blocks,clustering_factor
2 from user_indexes
3 where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
1 1107 941

–DOP 4

VDESAI@db3 > drop table tab1;

Table dropped.

VDESAI@db3 >
VDESAI@db3 > create table tab1 (column1 number);

Table created.

VDESAI@db3 >
VDESAI@db3 > insert into tab1 select rownum from dual connect by level
VDESAI@db3 > create index ind1 on tab1(column1) ;

Index created.

VDESAI@db3 >
VDESAI@db3 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, method_opt=>’FOR TABLE FO
R ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@db3 >
VDESAI@db3 > select blevel,leaf_blocks,clustering_factor
2 from user_indexes
3 where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
1 1099 752

VDESAI@db3 >
VDESAI@db3 >
VDESAI@db3 > alter index ind1 rebuild parallel 4 ;

Index altered.

VDESAI@db3 >
VDESAI@db3 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, method_opt=>’FOR TABLE FO
R ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@db3 >
VDESAI@db3 > select blevel,leaf_blocks,clustering_factor
2 from user_indexes
3 where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
1 1101 797

Producer/Consumer rows processed by Slave processes
===================================================

VDESAI@db3> alter index ind1 rebuild parallel 16;

Index altered.

VDESAI@db3> @tq
Show PX Table Queue statistics from last Parallel Execution in this session…

TQ_ID
(DFO,SET) DFO_NUMBER TQ_ID SERVER_TYP NUM_ROWS BYTES OPEN_TIME AVG_LATENCY WAITS T
———- ———- ———- ———- ———- ———- ———- ———– ———-
:TQ10000 1 0 Ranger 240 9404 0 0 16 1 QC
Producer 65025 1170725 0 0 14 3 P065
Producer 74495 1342681 0 0 20 7 P076
Producer 15975 288451 0 0 3 1 P075
Producer 37255 671757 0 0 7 2 P072
Producer 111735 2009575 0 0 24 7 P066
Producer 59865 1078418 0 0 15 5 P071
Producer 58535 1055058 0 0 15 5 P073
Producer 61195 1103014 0 0 15 7 P068
Producer 58535 1053659 0 0 12 3 P069
Producer 47895 863381 0 0 9 2 P067
Producer 53215 957929 0 0 9 2 P070
Producer 69175 1246837 0 0 12 3 P078
Producer 58535 1055105 0 0 12 4 P077
Producer 42575 767604 0 0 11 4 P079
Producer 133015 2395605 0 0 29 7 P074
Producer 53215 959311 0 0 8 3 P064
Consumer 108073 1947294 0 0 28 25 P063
Consumer 68251 1229918 0 0 28 25 P055
Consumer 67332 1213335 0 0 28 25 P056
Consumer 59850 1078495 0 0 28 25 P050
Consumer 50706 913756 0 0 28 25 P049
Consumer 48704 877692 0 0 28 25 P054
Consumer 49546 892815 0 0 28 25 P053
Consumer 57188 1030510 0 0 28 25 P052
Consumer 52866 952663 0 0 28 25 P051
Consumer 53449 963175 0 0 28 25 P057
Consumer 60679 1093385 0 0 28 25 P061
Consumer 49541 892774 0 0 28 25 P058
Consumer 69756 1256943 0 0 28 25 P060
Consumer 59021 1063534 0 0 28 25 P062
Consumer 63077 1136620 0 0 28 25 P059
Consumer 81961 1466797 0 0 28 25 P048

:TQ10001 1 1 Producer 1 294 0 0 0 0 P061
Producer 1 294 0 0 0 0 P062
Producer 1 294 0 0 0 0 P057
Producer 1 294 0 0 0 0 P053
Producer 1 294 0 0 0 0 P058
Producer 1 294 0 0 0 0 P049
Producer 1 294 0 0 0 0 P060
Producer 1 294 0 0 0 0 P052
Producer 1 294 0 0 0 0 P063
Producer 1 294 0 0 0 0 P050
Producer 1 294 0 0 0 0 P056
Producer 1 294 0 0 0 0 P055
Producer 1 294 0 0 0 0 P059
Producer 1 294 0 0 0 0 P051
Producer 1 294 0 0 0 0 P048
Producer 1 294 0 0 0 0 P054
Consumer 16 4704 0 0 7 3 QC

50 rows selected.

VDESAI@db3> alter index ind1 rebuild parallel 4;

Index altered.

VDESAI@db3> @tq
Show PX Table Queue statistics from last Parallel Execution in this session…

TQ_ID
(DFO,SET) DFO_NUMBER TQ_ID SERVER_TYP NUM_ROWS BYTES OPEN_TIME AVG_LATENCY WAITS T
———- ———- ———- ———- ———- ———- ———- ———– ———-
:TQ10000 1 0 Ranger 12 624 0 0 2 0 QC
Producer 277813 5004077 0 0 6 2 P052
Producer 235413 4240195 0 0 7 1 P055
Producer 231423 4168217 0 0 8 1 P053
Producer 255363 4594913 0 0 10 1 P054
Consumer 126350 2276456 0 0 6 3 P049
Consumer 116214 2093845 0 0 6 3 P051
Consumer 633081 11395983 0 0 6 3 P048
Consumer 124355 2240494 0 0 6 3 P050

:TQ10001 1 1 Producer 1 294 0 0 0 0 P049
Producer 1 294 0 0 0 0 P048
Producer 1 294 0 0 0 0 P051
Producer 1 294 0 0 0 0 P050
Consumer 4 1176 0 0 2 0 QC

14 rows selected.

Like

Richard Foote - September 21, 2011

Hi Vishaldesai

Thanks for all your efforts with the demos 🙂

Unfortunately however, the answer is incorrect and does not qualify for a number of reasons.

Basically, the DOP does not in fact change the CF, it only changes the manner in which the stats are collected, which in turn can result in inaccurate statistics, rather than a change of the CF as such.

The wording in my question was quite clear, “Assuming 100% accurate statistics”, you simply have forced Oracle to sample 100% of data but in a more efficient manner that doesn’t result in 100% accurate stats.

It might all be worth a separate blog piece to discuss, but to prove my point, simply look at the CF after you perform the rebuild as the stats are automatically recalculated since 10g and before you use dbms_stats. You will see then that the CF is totally unchanged. When you actually collect the stats, the degree has now changed to 16, which means Oracle has no accurate method to merge the CF from the parallel processes which results in a less accurate figure.

So it’s not the rebuild that changed the CF it was recalculating stats with a degree greater than 1. You didn’t need to rebuild the index to do this, a simple alter index blah parallel 16 would have sufficed.

So, no good. If I accept this answer, than I would have to accept any answer that didn’t collect 100% accurate stats.

It’s similar to your PCTFREE demo …

Like

vishaldesai - September 22, 2011

My bad. I should have checked 10046 trace which shows sample applied to sys_op_countchg.

Like

5. vishaldesai - September 21, 2011

Another case where clustering factor of index rebuild can change with no dml activity.

–pctfree 50 estimate_percent auto

VDESAI@ipspr1_dv1 > alter index ind1 rebuild pctfree 50;

Index altered.

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null,method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 4096 1516

VDESAI@ipspr1_dv1 > alter index ind1 rebuild pctfree 70;

Index altered.

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null,method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 7055 5912

–pctfree 70 estimate_percent 100

VDESAI@ipspr1_dv1 > alter index ind1 rebuild pctfree 50;

Index altered.

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, estimate_percent=>100, method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 4096 1516

VDESAI@ipspr1_dv1 > alter index ind1 rebuild pctfree 70;

Index altered.

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, estimate_percent=>100, method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 7038 1516

Like

vishaldesai - September 21, 2011

Ignore comment 5 regarding pctfree changes. It has simply to do with estimate_percent.

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, estimate_percent=>10, method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 2270 2460

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, estimate_percent=>20, method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 2232 2454

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, estimate_percent=>30, method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 2232 2422

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, estimate_percent=>40, method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 2254 2462

VDESAI@ipspr1_dv1 > exec dbms_stats.gather_table_stats(user, ‘TAB1′, null, estimate_percent=>50, method_opt=>’FOR TABLE FOR ALL COLUMNS SIZE REPEAT’, cascade=>true);

PL/SQL procedure successfully completed.

VDESAI@ipspr1_dv1 > select blevel,leaf_blocks,clustering_factor from user_indexes where index_name = ‘IND1’;

BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
———- ———– —————–
2 2111 2361

Like

Richard Foote - September 21, 2011

Hi Vishaldesai

No worries, the stats must be 100% accurate to qualify 😉

Like

6. Martin Preiss - September 21, 2011

Richard,

it seems that there are already two correct answers (rebuild reverse, rebuild parallel) so I want to add a wrong one: blocksize. Or more precisely: the blocksize of the index segment. A change of the blocksize of the table segment would change the clustering factor (that would be a physical reorganization) – but the blocksize of the index segment has no impact on the clustering factor.

drop table test_blocksize;

create table test_blocksize
as
select rownum id
     , mod(rownum,10) col2
	 , lpad('*', 100, '*') filler
  from dual
connect by level <= 10000
 order by 2;

select tablespace_name
     , block_size
  from dba_tablespaces;

TABLESPACE_NAME                BLOCK_SIZE
------------------------------ ----------
TEST_16K                            16384
TEST_2K                              2048  
 
 
create index test_2k_idx tablespace test_2k on test_blocksize(col2); 
 
select index_name
     , clustering_factor 
  from user_indexes
 where index_name = 'TEST_2K_IDX';

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
TEST_2K_IDX                                  157

alter index test_2k_idx rebuild tablespace test_16k;

select index_name
     , clustering_factor 
  from user_indexes
 where index_name = 'TEST_2K_IDX';

INDEX_NAME                     CLUSTERING_FACTOR
------------------------------ -----------------
TEST_2K_IDX                                  157

Seems to be some conditioned reflex that I begin to think about blocksizes when I see “clustering factor” and “rebuilding indexes” …

Regards

Martin

Like

Richard Foote - September 21, 2011

Hi Martin

No, rebuilding an index with a different DOP doesn’t count as a valid answer. The index entries still end up in the same order and so you end up with the same CF. You just collect less accurate stats when you use dbms_stats but you would end up with the same CF if you looked at the stats right after the rebuild.

And indeed, changing the blocksize won’t work as well 😉

Thanks for the demo to illustrate !!

Like

Martin Preiss - September 21, 2011

Hi Richard,

thank you for the clarification.

Like

7. Dax - September 21, 2011

Hi

I am getting following error while accessing any demo. Can some one help me how to resolve this?

“— 403: Access Denied —

This file requires authorization:

You must both be a user of this blog as well as be currently logged into WordPress.com”

Regards
Dax

Like

Richard Foote - September 21, 2011

Hi Dax,

Yeak I know, my earlier notepad text files are no good and I need to convert them sometime to a different format.

When I have the time 😦

Like

8. John Brady - September 21, 2011

Assuming that the Clustering Factor can be changed by a simple index rebuild under specific circumstances, then I would guess at a table with many deleted rows in it. Prior to the rebuild but after the deletion of many rows, the index could be quite large and sparse, with most leaf blocks having very few valid entries in them. This might lead to a high (poor) Clustering Factor (a guess on my part).

Following an index rebuild, the leaf pages would now be dense and fully populated with entries, which might decrease the value of the Clustering Factor (better clustering), assuming the data in the table was physically ordered by the columns in the index.

I can also see that if “statistics were up to date” before the rebuild, then the index would still be in order with regard to the data in the table, and so the Clustering Factor might be “correct” anyway, and not change. So just a guess from me.

What might change things more is if the index allowed duplicates and there were a great many duplicate values in the table before the deletion. And the deletion deleted all the duplicates but one. Is it possible that the index with many duplicates would not have them in any particular order within the same values in the leaf blocks, and so Clustering Factor could be poor (high value). Post deletion there are no duplicates, and the index rebuild would be perfect with a good Clustering Factor (low value).

John

Like

David Aldridge - September 21, 2011

John,

I think you’re off the mark on this, as the algorithm for clustering factor would not depend on the index being sparse — as I understand it the index is processed in a range scan from low to high and the clustering factor is incremented by 1 for every rowid that represents a different data segment block to the prior rowid.

The number of index leaf blocks scanned would not be a consideration for the clustering factor, although it could well affect the cost determination for an index range scan (and maybe a single value lookup as well, albeit indirectly through affecting the b-level).

Like

Richard Foote - September 21, 2011

Hi David

Spot on 🙂

Like

Richard Foote - September 21, 2011

Hi John

No deletes have no impact as the CF is only based on non-deleted index entries and all these remain and remain in the same order after a rebuild.

So no matter how fragmented, the rebuild will not change, except for 2 special scenarios.

Note also that there really isn’t a non-unique index in Oracle, all indexes are effectively unique because Oracle will add the rowid if necessary to ensure this is always the case.

So no, whether the index is unique/non-unique doesn’t describe these scenarios.

Like

John Brady - September 21, 2011

As I said, it was a total guess on my part. “Clutching at straws” as it were, trying to quickly think up anything that might have an impact on an index’s structure when rebuilt. Good to have it confirmed that deleting rows has no impact on the index statistics, and hence Oracle’s likelihood of choosing it.

John

Like

9. Martin Preiss - September 21, 2011

John,

in my understanding deleted rows in the index structure should have no impact on the clustering factor because the factor only tells the cbo “how likely contiguous rows from an index range scan will point to the same table block” (Randolf Geist: http://oracle-randolf.blogspot.com/2010/01/clusteringfactor-what-if-analysis.html). In Randolf’s Blog there is a query using some dbms_rowid functions to determine the clustering factor for an index without the need to actually create the index (and there’s a similar query in Christian Antognini’s Performance Tuning book too). So the clustering factor is only a change count for table block access when range scanning the index.

Regards

Martin

Like

Richard Foote - September 21, 2011

Hi Martin

Spot on 🙂

Like

10. albertofrosi - September 21, 2011

Hi Richard,
I tryed to do some tests for create a couple of situation in 10.2.0.5.0.
In the first example I tryed to alter index with rebuild reverse and CF is changed.
In the second I tryed drop and recreate the table but in this case CF remains the same.
Therefore the CF can be change but in my first example in worst.
I’ve for this table 37771 blocks and a CF for the index 108557, after rebuild it’s worsened 114889

First:
select blocks from user_tables where table_name=’F123′;

blocks
37771

select clustering_factor from user_indexes where index_name=’F123_1′;

CF
108557

begin
dbms_stats.gather_table_stats(null, ‘F123’, cascade=>true, estimate_percent=>100);
end;
/

alter index F123_1 rebuild reverse;

begin
dbms_stats.gather_table_stats(null, ‘F123’, cascade=>true, estimate_percent=>100);
end;
/

CF
114889

Second:

drop table F123;

create table F123 as select * from F123_old;

CREATE UNIQUE INDEX “CRPDTA”.”F123_1bck” ON “CRPDTA”.”F123″
(
“SDDOCO”, “A”, “B”, “C”
)
REVERSE PCTFREE 10 INITRANS 2 MAXTRANS 255 COMPUTE STATISTICS STORAGE
(
INITIAL 57344 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645 PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT
)
TABLESPACE “misterx” ;

begin
dbms_stats.gather_table_stats(null, ‘F123′, cascade=>true, estimate_percent=>100);
end;
/

select blocks from user_tables where table_name=’F123′;

blocks
38005

select clustering_factor from user_indexes where index_name=’F123_1bck’;

CF
114889

alter index F123_1bck rebuild reverse;

begin
dbms_stats.gather_table_stats(null, ‘F123’, cascade=>true, estimate_percent=>100);
end;
/

CF
114889

In the previous post the CF should be about like blocks table to optimize or help CBO choice but for me in both examples don’t so.

Ciao
Alberto

Like

Richard Foote - September 21, 2011

Hi Alberto

Interesting …

Thanks 🙂

Like

11. Richard Foote - September 21, 2011

@David

It all comes down to Oracle not being able to look at a single rowid in the index entry to see if the block id of the rowid in the next index entry changes. There are the 2 rowids that cover a whole range of possible rowids and so it’s impossible to determine the CF by simply looking at the index itself.

So Oracle basically goes here’s an index entry that covers a whole range of rowids, increment CF by 1, here’s another index entry that covers a whole range of rowids, increment CF by 1, and so on for all index entries.

The processing would have to be totally different and Oracle just hasn’t ever bothered with it, because it would be quite a tricky thing to do no doubt with some accuracy. It can’t just look at all the 1 bits and see how clustered they are because the 0s in between could be other column values or phamton rows that don’t really exist. So it would have to pop down to the table itself all the time which would be really expensive.

But I agree, it would be a useful thing to have for bitmap indexes as the “real” CF is so critical to the efficiency of using the index.

Like

David Aldridge - September 21, 2011

Hmmm, I’m not convinced that it would be that difficult — the internal processing is more complex of course, but the statistics gathering process doesn’t have to deal with that because it just runs SQL statements. The clustering factor can be calculated from a simple query optimised to use the index regardless of it’s internal organisation, as Randolf Geist demonstrated.

Arguably there might be a more useful way of calculating the clustering factor for bitmap indexes that takes account of a reduced likelihood of encountering range-based predicates, though, as the current method would probably be a little prejudiced against high cardinality bitmap indexes.

Here’s a little script …

drop table bm_test;

create table bm_test(col1 not null, col2 not null)
as
select trunc(rownum-1,-3),mod(rownum,10)*1000
from dual
connect by level <= 10000;

select col1,count(*) from bm_test group by col1 order by 1;
select col2,count(*) from bm_test group by col2 order by 1;

create index bm_idx1 on bm_test (col1);
create index bm_idx2 on bm_test (col2);

select index_name,clustering_factor as btree_clustering_factor
from user_indexes
where table_name = 'BM_TEST'
/

drop index bm_idx1;
drop index bm_idx2;

create bitmap index bm_idx1 on bm_test (col1);
create bitmap index bm_idx2 on bm_test (col2);

select index_name,clustering_factor as bitmap_clustering_factor
from user_indexes
where table_name = 'BM_TEST'
order by index_name
/

select
sys_op_countchg(substrb(row_id,1,15), 1) as col1_clustering_factor
from
(
select /*+ no_merge no_eliminate_oby index_combine(bm_test bm_idx1) */
rowid as row_id
from
bm_test
where
col1 is not null
order by
col1,
rowid
);

select
sys_op_countchg(substrb(row_id,1,15), 1) as col2_clustering_factor
from
(
select /*+ no_merge no_eliminate_oby index_combine(bm_test bm_idx2) */
rowid as row_id
from
bm_test
where
col2 is not null
order by
col2,
rowid
);

table BM_TEST dropped.
table BM_TEST created.
COL1 COUNT(*)
———————- ———————-
0 1000
1000 1000
2000 1000
3000 1000
4000 1000
5000 1000
6000 1000
7000 1000
8000 1000
9000 1000

10 rows selected

COL2 COUNT(*)
———————- ———————-
0 1000
1000 1000
2000 1000
3000 1000
4000 1000
5000 1000
6000 1000
7000 1000
8000 1000
9000 1000

10 rows selected

index BM_IDX1 created.
index BM_IDX2 created.
INDEX_NAME BTREE_CLUSTERING_FACTOR
—————————— ———————–
BM_IDX2 160
BM_IDX1 16

index BM_IDX1 dropped.
index BM_IDX2 dropped.
bitmap index BM_IDX1 created.
bitmap index BM_IDX2 created.
INDEX_NAME BITMAP_CLUSTERING_FACTOR
—————————— ————————
BM_IDX1 10
BM_IDX2 10

COL1_CLUSTERING_FACTOR
———————-
16

COL2_CLUSTERING_FACTOR
———————-
160

Like

12. Heitor Kirsten - September 22, 2011

Well,

I would say that the clustering factor is going to change a lot after the rebuild when we do an “alter table tablename compress move”… Indeed, a rebuild will become mandatory after this.

Not sure if it qualifies to the quiz.

Heitor

Like

13. Heitor Kirsten - September 22, 2011

Just another thought…

If the data in the table is kind of sparse and we issue an “alter table tablename shrink space” before rebuild the index, then the clustering factor could have a big change.

In this situation the rebuild will not be mandatory…

Anyway, waiting for the answer. 🙂

Heitor

Like

14. vishaldesai - September 24, 2011

Hi Richard,

CBO fundamentals page 71 says “You can often reduce the number of leaf_blocks (and, very occasionally, the blevel) of an index by
rebuilding the index; but rebuilding an index has no effect on the clustering_factor”.

Page 68 describes how clustering factor is determined and based on that I am really curios to know how clustering factor can change with index rebuild and no DML activity.

Thanks,
Vishal

Like

15. Gary Myers (@syd_oracle) - September 25, 2011

I think you could do it with a Function Based Index on TO_CHAR(date_column) if, prior to the index rebuild, you changed the nls_date_language of the session.

http://blog.sydoracle.com/2009/12/happy-new-year-and-fun-with-dates.html

That fits a more generic model of rebuilding an FBI where the underlying logic has changed.

Rebuilding an UNUSABLE index could change the CF too as the old index wouldn’t necessarily have been in step with the actual table data.

Like

16. Rebuilding Indexes and the Clustering Factor Solution (Move On) « Richard Foote’s Oracle Blog - September 25, 2011

[…] a nice little demo, see David Alridge’s comment or my previous discussion on Reverse Key […]

Like


Leave a comment