jump to navigation

Important !! Clustering Factor Calculation Improvement (Fix You) May 8, 2013

Posted by Richard Foote in 11g, ASSM, CBO, Clustering Factor, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.
trackback

Believe me, this article is worth reading :)

I’m currently not allowed to discuss Oracle 12c Database goodies but I am allowed to discuss things perhaps initially intended for 12c that are currently available and already back-ported to 11g. This includes a wonderful improvement in the manageability of how the Clustering Factor (CF) of an index can now be calculated. Many thanks to Martin Decker for pointing this out to me.

As anyone who has attended my Index Seminars will know, the CF of an index is one of the most important statistics used by the Cost Based Optimizer (CBO) in determining the most efficient execution plan. As such, it has always been an issue for me that the manner in which the CF is calculated has been so flawed.

Basically, the CF is calculated by performing a Full Index Scan and looking at the rowid of each index entry. If the table block being referenced differs from that of the previous index entry, the CF is incremented. If the table block being referenced is the same as the previous index entry, the CF is not incremented. So the CF gives an indication of how well ordered the data in the table is in relation to the index entries (which are always sorted and stored in the order of the index entries). The better (lower) the CF, the more efficient it would be to use the index as less table blocks would need to be accessed to retrieve the necessary data via the index.

However, there’s a basic flaw here. The CF calculation doesn’t take into consideration the fact the referenced table block, although maybe different from the previous one index entry, might already have recently been accessed. As such, during an index scan, the table block being accessed is almost certainly still cached in the buffer cache from the previous access, thereby not reducing the effectiveness of the index in any appreciable manner. A classic example of this would be a table with a few freelists. Although the data being inserted is not ordered precisely within the same data blocks, the data might actually be very well clustered within only a few blocks of each other.

Picture a table with 100 rows being inserted by 2 sessions simultaneously, each inserting 50 rows based on an ordered sequence. With one freelist, the data is basically inserted in one block first and then once full a second table block. The data is therefore perfectly ordered/clustered and the CF will evaluate to a value of 2 on such an indexed column. But with 2 freelists, one session could insert data into one block while the other session inserts into a second block, with the ordered sequenced values being randomly distributed among the 2 blocks.  The CF could now potentially evaluate to a value of 100 as the rows are jumbled or “toggled” across the two blocks. This is a much much worse value (2 vs. 100) that can adversely impact the CBO calculations, although the efficiency of such an index is really almost identical as both table blocks are certain to be cached during an index scan regardless.

This is also a very common scenario with Automatic Segment Space Management (ASSM) tablespaces as I’ve discussed previously, which of course is now the default these days.

OK, let’s look at an example scenario. I’ll begin by creating a simple little table, an ordered sequence and a procedure that inserts 100,000 rows into the table:


SQL> create table bowie (id number, text varchar2(30));

Table created.

SQL> create sequence bowie_seq order;

Sequence created.

SQL> CREATE OR REPLACE PROCEDURE bowie_proc AS

2  BEGIN

3     FOR i IN 1..100000 LOOP

4         INSERT INTO bowie VALUES (bowie_seq.NEXTVAL, 'ZIGGY STARDUST');

5         COMMIT;

6     END LOOP;

7  END;

8  /

Procedure created.

We note the table lives in an ASSM tablespace:


SQL> select table_name, i.tablespace_name, segment_space_management

from dba_tables i, dba_tablespaces t   where i.tablespace_name = t.tablespace_name and table_name='BOWIE';

TABLE_NAME   TABLESPACE_NAME                SEGMEN

------------ ------------------------------ ------

BOWIE        USERS                          AUTO

We next have 3 different sessions that simultaneously run the procedure to load the table. Note that an ordered sequence is used which means the 3 sessions are randomly grabbing the next sequenced value to insert. The data though is basically being inserted in order of the ID column, it’s just that the data is being distributed across a few blocks as we go along the table, rather than strictly one block after the other.


SQL> exec bowie_proc

PL/SQL procedure successfully completed.

Let’s create an index on the ID (sequenced) column and collect fresh statistics:


SQL> create index bowie_id_i on bowie(id);

Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE',      estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor

2  FROM user_tables t, user_indexes i

3  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';

TABLE_NAME   INDEX_NAME       BLOCKS   NUM_ROWS CLUSTERING_FACTOR

------------ ------------ ---------- ---------- -----------------

BOWIE        BOWIE_ID_I         1126     300000            241465

We notice that although the data in the table in reality is actually quite well clustered/ordered on the ID column, the actual CF of the index is not reflecting this. At a massive 241,465 it’s an extremely high (bad) CF, much closer in value to rows in the table than the number of table blocks, as the CF calculation keeps flipping back and forth between differing blocks. With such a high CF, the CBO is therefore going to cost an index scan accordingly:


SQL> select * from bowie where id between 42 and 429;

388 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1845943507

---------------------------------------------------------------------------

| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |       |   389 |  7780 |   310   (1)| 00:00:04 |

|*  1 |  TABLE ACCESS FULL| BOWIE |   389 |  7780 |   310   (1)| 00:00:04 |

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("ID"<=429 AND "ID">=42)

Statistics

----------------------------------------------------------

0  recursive calls

1  db block gets

1093  consistent gets

0  physical reads

0  redo size

4084  bytes sent via SQL*Net to client

519  bytes received via SQL*Net from client

2  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

388  rows processed

Even though only approx. 0.13% of rows are being accessed and more importantly a similar low percentage of table blocks, the CBO has determined that a Full Table Scan (FTS) is the cheaper alternative. This is an all too familiar scenario, all down to the fact the CF is not accurately reflecting the true clustering of the data and subsequent efficiency of the index.

Finally, at long last, there’s now an official fix for this !!

Bug 13262857 Enh: provide some control over DBMS_STATS index clustering factor computation INDEX describes this scenario and currently has available patches that can be applied on both Exadata databases and Oracle versions 11.1.0.7, 11.2.0.2 and 11.2.0.3. The patches (eg. Patch ID 15830250) describe the fix as addressing “Index Clustering Factor Computation Is Pessimistic“. I couldn’t have described it better myself :)

Once applied (the following demo is on a patched 11.2.0.3 database), there is a new statistics collection preference that can be defined, called TABLE_CACHED_BLOCKS. This basically sets the number of table blocks we can assume would already be cached when performing an index scan and can be ignored when incrementing the CF during statistics gathering. The default is 1 (i.e. as performed presently) but can be set up to be a value between 1 and 255, meaning during the collection of index statistics, it will not increment the CF if the table block being referenced by the current index entry has already been referenced by any of the prior 255 index entries (if set to 255). It basically sets the appropriate parameter in the sys_op_countchg function used to calculate the CF value during statistic gathering to not increment the CF if the current table block has already been accessed “x” index entries previously.

The TABLE_CACHED_BLOCKS preference can be set by either the DBMS_STATS.SET_TABLE_PREFS, DBMS_STATS.SET_SCHEMA_PREFS or DBMS_STATS.SET_DATABASE_PREFS procedures.

So let’s now change the TABLE_CACHED_BLOCKS preference for this table and re-calculate the index statistics:


SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',

pname=>'TABLE_CACHED_BLOCKS', pvalue=>42);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor

2  FROM user_tables t, user_indexes i

3  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';

TABLE_NAME   INDEX_NAME       BLOCKS   NUM_ROWS CLUSTERING_FACTOR

------------ ------------ ---------- ---------- -----------------

BOWIE        BOWIE_ID_I         1126     300000              1035

We notice that the CF has now been significantly reduced (down from 241465 to just 1035), reflecting far more accurately the true clustering of the data when considering the actual effectiveness of using the index.

If we now run the same query as before:


SQL> select * from bowie where id between 42 and 429;

388 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 3472402785

------------------------------------------------------------------------------------------

| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)|Time     |

------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |            |   389 |  7780 |     4   (0)|00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE      |   389 |  7780 |     4   (0)|00:00:01 |

|*  2 |   INDEX RANGE SCAN          | BOWIE_ID_I |   389 |       |     2   (0)|00:00:01 |

------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("ID">=42 AND "ID"<=429)

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

6  consistent gets

0  physical reads

0  redo size

9882  bytes sent via SQL*Net to client

519  bytes received via SQL*Net from client

2  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

388  rows processed

We notice the index is now being selected by the CBO. At a cost of 4 (previously the cost was somewhat greater than the 310 cost of the FTS), this much more accurately reflects the true cost of using the index (notice only 6 consistent gets are performed).

Being able to now set the TABLE_CACHED_BLOCKS preference during statistics collection finally gives us a fully supported and easy method to collect more accurate CF statistics. This in turn can only lead to more informed and accurate decisions by the CBO and ultimately better performing applications. Although available right now via the back ported patches, this will no doubt all be fully documented once the 12c database is finally released.

I can’t recommend enough the use of this new capability :)

About these ads

Comments»

1. vijaysehgal - May 9, 2013

Dear Richard,
this is definitely a good feature and very well explained with e.g. will try it out as soon as I get a chance.

Thanks as always for sharing the information.

Warm Regards,
Vijay Sehgal.

Richard Foote - May 10, 2013

Thanks Vijay :)

2. Clustering_factor | Oracle Scratchpad - May 9, 2013

[...] I’ve been meaning to write this post for two or three months, ever since Sean Molloy sent me an email about short blog note from Martin Decker describing Bug 13262857  Enh: provide some control over DBMS_STATS index clustering factor computation. Unfortunately I’ve not yet had time to investigate the patch, but I don’t think I need to any more because Richard Foote has written it up in his latest blog post. [...]

3. Ron - May 9, 2013

Richard where did you get the value 42 from for this line ?
pvalue=>42

Regards
Ron

Richard Foote - May 10, 2013

Hi Ron

I have to admit the reasoning behind the value 42 was not very scientific in this case :)

http://en.wikipedia.org/wiki/Phrases_from_The_Hitchhiker's_Guide_to_the_Galaxy

Amit Saraswat - May 16, 2013

Hi Richard,

Thanks for such a nice explanation on new feature. Just little doubtful about the line where you have mentioned CF as 200. I am assuming 100 inserts in to an empty table.
How could CF > number of records in table ?

Could you please explain?

==================================
Picture a table with 100 rows being inserted by 2 sessions simultaneously, each inserting 50 rows based on an ordered sequence.
……………………………………….
But with 2 freelists, one session could insert data into one block while the other session inserts into a second block, with the ordered sequenced values being randomly distributed among the 2 blocks. The CF could now potentially evaluate to a value of 200 as the rows are jumbled or “toggled” across the two blocks.
==================================

Thanks
Amit S.

4. Timur Akhmadeev - May 10, 2013

Hi Richard

thanks for the info. Two things to note:

1. What will be the default value of TABLE_CACHED_BLOCKS in 12c. I hope it’s not 42 ;)
2. Be careful with DBMS_STATS.SET_SCHEMA_PREFS: in 11g it doesn’t set schema preference, but rather uses a loop through existing schema tables and a call to the SET_TABLE_PREFS, which means new tables will not use this setting.

Richard Foote - May 10, 2013

Hi Timur

The default is 1 to keep with current behaviour, although I would suggest that 42 would probably be a better default :). Note there is also an “AUTO” option which will take 1% of table blocks or 0.1% of the buffer cache, whichever is less.

Thanks for the heads-up regarding setting the schema prefs.

5. Patch for index clustering factor | DBA n00b - May 10, 2013
6. Interesting post on clustering factor | Bobby Durrett's DBA Blog - May 10, 2013
7. Jared - May 10, 2013

Very cool, thanks Richard.

I think this can be put to use fairly soon for at least one client.

Richard Foote - May 10, 2013

Hi Jared

I think this will be put to use in many many databases once word gets around. Doing my best in that regard :)

Hope life is treating you well !!

8. Ali - May 10, 2013

How can we calculate a good value for TABLE_CACHED_BLOCKS parameter (pvalue)?

Richard Foote - May 10, 2013

Hi Ali

Good question.

I would perhaps look at the number of freelists (ASSM is equivalent to about 16) as an initial guide.

However, that said, note what the impact is here. It’s Oracle determining the likelihood of a table block already being cached when being accessed during an index scan. For most databases and for most queries that access a reasonable number of rows (where the decision to just go for the table scan becomes more of an option), if an index has just accessed a block say 42 IOs ago, it’s still very likely the block will be cached. So the cost of performing this other LIO is minimised here.

By default, the CBO considers the access of every block to effectively be a physical I/O and factors this into it’s cost calculations. In actual fact, a table block once accessed via an index in most cases is unlikely to be re-read from disk again in the course of the query, unless the cache is very small and active and/or the index scan very huge and expensive.

So although a larger value (like 42 or 142) might sound unrealistic, it’s probably a lot more realistic than the default of 1 has ever been.

The danger of course is for smaller queries that access less than this value of rows looking particularly cheap, but they’re likely to be accessed by an index anyways, with the CF having less overall impact, although the cost ramifications of them being performed in inner joins needs to be considered.

As I said, the freelist/ASSM value is a minimum guideline but I still like 42 :)

9. Noons - May 10, 2013

Hmmm…. looks like YABAP (yet another “black arts” parameter)…
Agreed entirely, it can make a difference. But what is the guidance for setting its value? Between 1 and 255 is, I dare say, a little on the vague side.
I wonder if we should look at rows/block and use a low multiple of that as a starting point, in the absence of other almost impossible to obtain data, like historic locality of reference in queries. Or else I’d go with Timur: 42 ;)

Richard Foote - May 10, 2013

Hi Noons

Nice to hear from you :)

See my reply above to Ali. I don’t really consider it a dark art as any value greater than 1 is likely to be an improvement, even 255. The very nature of index scans means the index costs are likely to be better represented now in the vast majority of scenarios and can be made no cheaper than the number of table blocks that contain rows anyways !!

Note for larger tables, greater than approx. 200M, the auto setting will be the maximum 255 value.

That said, I might start a movement to change the default to 42 :)

10. Horia Berca - May 10, 2013

Hi Richard,

Thanks for your post. I did test your scenario and played a bit with the value of 42. Apparently, this is the highlight!
Here are the results, quite interesting, I would say:

pvalue = 1 ->>> CF = 226348
pvalue = 2 ->>> CF = 98337
pvalue = 3 ->>> CF = 1129
pvalue = 4 ->>> CF = 1040
pvalue = 5 ->>> CF = 1036
pvalue = 10 ->>> CF = 1036
pvalue = 42 ->>> CF = 1035
pvalue = 255 ->>> CF = 1035

So we see that when jumping from a value of 2 -> 3 the CF computation becomes realistic.
Would be nice to find an explanation to this.

Regards,
Horia Berca

Richard Foote - May 10, 2013

Hi Horia

Cool, love it when I see people having a play and explore :)

The explanation is as follows. If you ran my scenario, you had 3 sessions concurrently inserting into the table. So basically, there were 3 blocks being “toggled” during the insert process at a given time. With 3 sessions, effectively 3 freelists were used so the clustering was only “out” by 3 blocks at a given time.

Therefore, by setting the TABLE_CACHED_BLOCKS to 3 or so, any indexed value you’re sitting at would have been accessed roughly within the last 3 or so blocks. That’s all that’s required therefore to get the “prefect” CF because all the IDs were indeed ordered within 3 blocks of each other.

So a very modest increase in this case was all that was necessary to get the desired results and a very large value can only get you to the number of table blocks containing rows anyways and so do no harm.

You can hopefully see why I’m so damn excited by this :)

Horia Berca - May 10, 2013

Richard,

Thanks for the explanation.
Have extended the test to 6 concurrent sessions, the results below.

Just to keep the this small, we have
num_rows = 600000
blocks = 2260

pvalue = 3 ->>> CF = 174168
pvalue = 4 ->>> CF = 31840
pvalue = 5 ->>> CF = 8256
pvalue = 6 ->>> CF = 2299
pvalue = 7 ->>> CF = 2127
pvalue = 10 ->>> CF = 2105
pvalue = 42 ->>> CF = 2101

And NOW!!!!!, which may be the secret souce, we can input for pvalue —>>> dbms_stats.auto_table_cached_blocks,
which apparently gives the expected (realistic computation for CF).

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>’BOWIE’,pname=>’TAB
LE_CACHED_BLOCKS’, pvalue=>DBMS_STATS.AUTO_TABLE_CACHED_BLOCKS);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>’BOWIE_ID_I’, es
timate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_facto
r FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.ind
ex_name=’BOWIE_ID_I';

TABLE_NAME INDEX_NAME BLOCKS
—————————— —————————— ———-
NUM_ROWS CLUSTERING_FACTOR
———- —————–
BOWIE BOWIE_ID_I 2260
600000 2102

SQL> select dbms_stats.get_param(‘AUTO_TABLE_CACHED_BLOCKS’) FROM DUAL;
select dbms_stats.get_param(‘AUTO_TABLE_CACHED_BLOCKS’) FROM DUAL
*
ERROR at line 1:
ORA-20001: Invalid input values for pname
ORA-06512: at “SYS.DBMS_STATS”, line 30915

SQL> select dbms_stats.get_param(‘TABLE_CACHED_BLOCKS’) FROM DUAL;

DBMS_STATS.GET_PARAM(‘TABLE_CACHED_BLOCKS’)
——————————————————————————–

1

Not very sure if when pvalue is set this way it picks 1 or different value, but we’ve seen that explicitly setting pvalue = 1 does not give good computation to CF,
and when set to DBMS_STATS.AUTO_TABLE_CACHED_BLOCKS this works fine.

Did not find on MOS anything about DBMS_STATS.AUTO_TABLE_CACHED_BLOCKS.

Cheers!
Horia

P.S. So in a real world, hardguessing the pvalue for each table would be a Sisyphus work, and setting it to DBMS_STATS.AUTO_TABLE_CACHED_BLOCKS might be the choice to go for.
We shall keep an eye on it.

David Aldridge - May 10, 2013

The levelling off in the clustering factor at a particular table_cached_blocks seems to be very significant factor.

Let’s say that it follows a pattern of:

1-> 1,000,000
2-> 500,000
4-> 100,000
8 -> 40,000
16 -> 2,000
32 -> 1,800
64 -> 1,700

If a query is going to retrieve 5 rows from the table then the clustering factor of 2,000 for a table_cached_blocks value of 16 is going to be unrepresentative for the cost estimation in just the same way that a value of 1,000,000 would be, don’t you think? The fifth row to be retrieved is much less likely to be in a cached block than the clustering factor of 2,000 would suggest, as the most appropriate clustering factor for such a query would be 100,000

It suggests to me that a “more perfect” cost estimation would be based on varying the table_cached_blocks value according to the average number of rows that will be efficiently retrieved via the index, as the higher clustering factors for values less than 16 are still very meaningful for some queries.

For a simple case with a table of 1,000,000 rows and an index on a column with 50,000 distinct values then you might say that the average number of rows to be retrieved on an equality predicate is 20 — hence the most representative value for table_cached_blocks would be that for around 20 (unless the values have levelled off “earlier”). The “most perfectist” implementation would be to store clustering factors for a sample of table_cached_blocks values from 2 up to the “flattening out of the curve”, so that the optimiser can estimate the number of rows to be retrieved and then use the most appropriate clustering factor. I suspect that collecting multiple clustering factors for a range of table_cached_blocks values would not necessarily be much more expensive than collecting a single clustering factor — the curve could certainly be estimated with query based on an analytic function.

For low cardinality queries getting this cost estimation spot on is probably not so important, as the correct decision is more likely to be that the index should be used, of course, and choosing a high value of table_cached_blocks is likely to promote index-based access anyway.

Anyway, that’s all nit picking. Like you say, a high value is probably the best thing as it’s the queries that retrieve many rows that are the significant ones in this respect, and unless there’s a high performance overhead to high values of table_cached_blocks then setting it high is probably the best thing.

11. Latest data Industry news round up, Log Buffer #319 - May 10, 2013

[…] !! Clustering Factor Calculation […]

Richard Foote - May 11, 2013

You know you’ve hit the big time when you feature on the Pythian weekly highlights show :)

12. Richard Foote - May 11, 2013

Hi Horia

So you now have 6 simulataneous sessions getting in each others way, effectively using 6 freelists. For a column value that is basically being inserted in order such as the ID, by now incrementing the CF only if the table block is not one of the last 6 or 7 accessed, we effectively have a perfect CF. As indeed should be the case if the last 6 or so table blocks accessed would be in the cache regardless.

Auto is likely to be the “easiest” option to set but hey, I’m not giving up on 42 :)

Horia Berca - May 11, 2013

Richard, 42 is you trademark already!

Cheers,
Horia

13. Richard Foote - May 11, 2013

Hi David

I’m not entirely sure of your point. The CF “levels off” if indeed the data is reasonably clustered and the data block visited was already accessed “x” index entries ago. So it’s not really in relation to the number of rows being accessed by the query but in relation to how closely the data is really clustered. If the table is large and the data really is randomised all over the place and the data block being referenced by an index entry has NOT been accessed by the preceeding 255 (or whatever value of) index entries, then the CF will not level off and will remain poor. The CF will only level off if the data is generally clustered within x number of data block from each other.

I think I might write a part II to this and highlight other examples such as a genuinely poorly clustered example which won’t benefit from this feature, as indeed it shouldn’t because the CF really is poor.

14. VMCD.ORG » Blog Archive » How to control CF calculation - May 14, 2013

[…] Ref:Clustering Factor Calculation Improvement and provide some control over DBMS_STATS index clustering factor computation […]

15. How to control CF calculation | louis liu 的镜像blog - May 14, 2013

[…] Ref:Clustering Factor Calculation Improvement and provide some control over DBMS_STATS index clustering factor computation […]

16. Clustering Factor Calculation Improvement Part II (Blocks On Blocks) | Richard Foote's Oracle Blog - May 14, 2013

[…] previous post on the new TABLE_CACHED_BLOCKS statistics gathering preference certainly generated some interest My blog hits for the week have gone off the charts […]

17. Clustering Factor Calculation Improvement - How to Oracle - May 16, 2013

[…] of an index can now […] Important !! Clustering Factor Calculation Improvement (Fix You) Richard Foote’s Oracle Blog google_ad_client = "pub-4472060875810546"; google_ad_width = 468; google_ad_height = 60; […]

18. Новое в статистике | Oracle mechanics - May 16, 2013

[…] и тестами патча можно ознакомиться в блоге Richard Foote Important !! Clustering Factor Calculation Improvement (Fix You) и Clustering Factor Calculation Improvement Part II (Blocks […]

Richard Foote - May 16, 2013

I’m not very good with my Russian, I hope it’s all good !!

19. Richard Foote - May 16, 2013

Hi Amit

50+50=200, oh wait a minute ..

Thanks for pointing out the error, all fixed :)

20. Uwe M. Küchler - June 5, 2013

Hi Richard,
the document for Bug 13262857 states that fixes are available for MS Windows platforms only. Can that be true? Or have you applied a fix on a different platform? If so, could you provide a link to the according patch document (I searched for the mentioned Patch ID 15830250, but to no avail)?
Thanks a lot,
Uwe

21. Rajesh Aialavajjala - June 5, 2013

Uwe,
The patch # is 13262857

Description INDEX CLUSTERING FACTOR COMPUTATION IS PESSIMISTIC
Product Optimizer
Select a Release

Platform or Language Click for more information about this option
Last Updated 09-MAR-2012
Size 177K (181418 bytes)
Entitlement Class Software
Classification General

Interestingly enough MOS – also lists these

Patch Mar 5, 2012
INDEX CLUSTERING FACTOR COMPUTATION IS PESSIMISTIC [Patch ID 14620718]

Patch Feb 8, 2012
INDEX CLUSTERING FACTOR COMPUTATION IS PESSIMISTIC [Patch ID 14496848]

Patch Mar 9, 2012
INDEX CLUSTERING FACTOR COMPUTATION IS PESSIMISTIC [Patch ID 14564276]

Richard Foote - June 6, 2013

Thanks Rajesh :)

22. Neues vom Clustering Factor « Oraculix - June 5, 2013

[…] Erster Artikel: Ein Patch für Oracle 11.2 führt eine neue Option “TABLE_CACHED_BLOCKS” für die Berechnung mit DBMS_STATS ein. Dieser kann von 0-255 oder auf AUTO gesetzt werden und sorgt für eine Berücksichtigung von Caching-Effekten, die bislang nicht in die Berechnung des Clustering Factors eingegangen sind. Dadurch können unberechtigt zu hohe CFs deutlich reduziert werden, wenn die indizierten Rows innerhalb der nächsten n Blöcke der Zieltabelle liegen. n wird bei AUTO auf 1% der Tabellengröße in Blöcken gesetzt. […]

23. Rajesh Aialavajjala - June 26, 2013

For E-Business Suite Customers on 11.2.0.3 — this might be of interest…

Problem = “Patch 16306748 conflicts with 13262857 and 14013094″

Patch 16306748 was applied vide MOS Doc ID:1392633.1

Patch 14013094 – is recommended vide Doc ID: 244040.1 and Doc ID: 1542138.1

Patch 13262857 – is desired owing to the enhancement introduced for Clustering Factor – Bug 13262857 – Enh: provide some control over DBMS_STATS index clustering factor computation [ID 13262857.8]

Fix is

Patch 16956349 “MERGE REQUEST ON TOP OF 11.2.0.3.0 FOR BUGS 16306748 13262857 14013094″

Richard Foote - June 26, 2013

Thanks for the info Rajesh :)

24. CLUSTERING FACTOR DEMYSTIFIED : PART - III | ORACLE IN ACTION - February 25, 2014
25. Marius - March 30, 2014

Hello Richard,

Why is this different from hacking CF by using set_index_stats?

Thanks,
Marius

Richard Foote - April 16, 2014

Hi Marius

Well the difference is that statistics gathering will automatically set the CF to a reasonably accurate value whereas hacking has to be performed manually and could be set to a totally inappropriate value.

So automatic and acurrate vs. manual and maybe inaccurate :)

26. Hemant K Chitale - October 1, 2014

This is good. How did I not notice this ? Thanks !

27. Richard Foote - October 8, 2014

Hi Hemant

Very easily, considering how much the Oracle database keeps changing and improving over the years :)


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 1,894 other followers

%d bloggers like this: