Oracle Database 12c Released – Lots of Indexing Stuff To Talk About June 26, 2013
Posted by Richard Foote in 12c, Oracle Indexes.10 comments
Finally, Oracle have released the long awaited 12c Database. You can now download it from OTN and other usual places. This is good news as my tongue was getting rather sore from having to bite on it for so long 🙂
From an indexing point of view, there are lots of fantastic new features, especially from a manageability perspective. The ability to create multiple indexes with the same column list, to partially index tables, with enhanced capabilities regarding online operations, with enhanced index monitoring capabilities, etc. etc. there’s plenty that I can now discuss here.
Unfortunately, all my presentations have been rejected for Oracle Openworld this year so I won’t be able to present them over there 😦
Fortunately I’ll be able to discuss them all here in lots of detail in the coming weeks, so stay tuned 🙂
Clustering Factor Calculation Improvement Part III (Too Much Rope) June 4, 2013
Posted by Richard Foote in 11g, CBO, Clustering Factor, Index statistics, TABLE_CACHED_BLOCKS.9 comments
In my previous post, I discussed how for larger tables that really do have randomised data, not even setting the new TABLE_CACHED_BLOCKS to its 255 maximum value is going to make any real difference to the resultant Clustering Factor (CF) of an index. The 255 maximum value here protects us somewhat from abusing this capability and setting unrealistic CF values.
However, for smaller tables in particular, we do need to exercise some caution.
In the following example, we’re only creating a relatively small table and associated index with a CODE column that is randomly distributed throughout the table:
SQL> create table bowie (id number, code number, text varchar2(30)); Table created. SQL> insert into bowie select rownum, trunc(dbms_random.value(0, 100)), 'DAVID BOWIE' from dual connect by level <= 70000; 70000 rows created. SQL> commit; Commit complete. SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> create index bowie_code_i on bowie(code); Index created.
If we look at the CF of this index:
SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.index_name='BOWIE_CODE_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ ------------ ---------- ---------- ----------------- BOWIE BOWIE_CODE_I 244 70000 22711
We notice that at 22,711, the CF it’s pretty average. The table though is quite small at only 244 blocks.
If we run a simple query:
SQL> select * from bowie where code between 42 and 44; 2050 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1845943507 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2814 | 56280 | 65 (4)| 00:00:01 | |* 1 | TABLE ACCESS FULL| BOWIE | 2814 | 56280 | 65 (4)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter('CODE'<=44 AND 'CODE'>=42) Statistics ---------------------------------------------------------- 0 recursive calls 1 db block gets 254 consistent gets 0 physical reads 0 redo size 25044 bytes sent via SQL*Net to client 519 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2050 rows processed
The CBO goes for a Full Table Scan. This is not unexpected as we likely have to visit all 244 blocks anyways to fetch the required 2050 rows due to the CODE data being so randomly distributed throughout the table. In a naive attempt to improve things, we decide to improve the CF by setting the TABLE_CACHED_BLOCKS to the maximum 255 value:
SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255); PL/SQL procedure successfully completed. SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.index_name='BOWIE_CODE_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ -------------- ---------- ---------- ----------------- BOWIE BOWIE_CODE_I 244 70000 240
Indeed, we have improved the CF, dramatically reducing it down to just 240 from the previous 22711. Even though the column value for the CODE really is randomly distributed throughout the table, the CF now suggests the data is perfectly clustered. This is because with a table with only 244 blocks, incrementing the CF if the current index entry references a table block more than 255 blocks ago is now impossible. The CF is now guaranteed to be “perfect” as each index entry can only reference one of the 244 table blocks and so is incremented only when each table block is referenced the first time.
This dramatic reduction in the CF will certainly make the index more attractive to the CBO. But is this really a good thing:
SQL> select * from bowie where code between 42 and 44; 2050 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1602289932 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2814 | 56280 | 17 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| BOWIE | 2814 | 56280 | 17 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | BOWIE_CODE_I | 2814 | | 7 (0)| 00:00:01 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access('CODE'>=42 AND 'CODE'<=44) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 681 consistent gets 0 physical reads 0 redo size 20895 bytes sent via SQL*Net to client 519 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2050 rows processed
The CBO is now using the index, but is now performing significantly more consistent gets, 681 rather than the previous 254. This execution plan is actually much less efficient than the previous FTS execution plan. The CBO is getting this wrong now as the CF isn’t really anywhere near as good as it’s now being lead to believe.
Caution setting TABLE_CACHED_BLOCKS to a value that is anywhere close to the number of blocks in the table. This is one of the reasons for Oracle having a 1% of table blocks default value for this setting.