Clustering Factor Calculation Improvement Part III (Too Much Rope) June 4, 2013
Posted by Richard Foote in 11g, CBO, Clustering Factor, Index statistics, TABLE_CACHED_BLOCKS.9 comments
In my previous post, I discussed how for larger tables that really do have randomised data, not even setting the new TABLE_CACHED_BLOCKS to its 255 maximum value is going to make any real difference to the resultant Clustering Factor (CF) of an index. The 255 maximum value here protects us somewhat from abusing this capability and setting unrealistic CF values.
However, for smaller tables in particular, we do need to exercise some caution.
In the following example, we’re only creating a relatively small table and associated index with a CODE column that is randomly distributed throughout the table:
SQL> create table bowie (id number, code number, text varchar2(30)); Table created. SQL> insert into bowie select rownum, trunc(dbms_random.value(0, 100)), 'DAVID BOWIE' from dual connect by level <= 70000; 70000 rows created. SQL> commit; Commit complete. SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> create index bowie_code_i on bowie(code); Index created.
If we look at the CF of this index:
SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.index_name='BOWIE_CODE_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ ------------ ---------- ---------- ----------------- BOWIE BOWIE_CODE_I 244 70000 22711
We notice that at 22,711, the CF it’s pretty average. The table though is quite small at only 244 blocks.
If we run a simple query:
SQL> select * from bowie where code between 42 and 44; 2050 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1845943507 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2814 | 56280 | 65 (4)| 00:00:01 | |* 1 | TABLE ACCESS FULL| BOWIE | 2814 | 56280 | 65 (4)| 00:00:01 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter('CODE'<=44 AND 'CODE'>=42) Statistics ---------------------------------------------------------- 0 recursive calls 1 db block gets 254 consistent gets 0 physical reads 0 redo size 25044 bytes sent via SQL*Net to client 519 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2050 rows processed
The CBO goes for a Full Table Scan. This is not unexpected as we likely have to visit all 244 blocks anyways to fetch the required 2050 rows due to the CODE data being so randomly distributed throughout the table. In a naive attempt to improve things, we decide to improve the CF by setting the TABLE_CACHED_BLOCKS to the maximum 255 value:
SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255); PL/SQL procedure successfully completed. SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.index_name='BOWIE_CODE_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ -------------- ---------- ---------- ----------------- BOWIE BOWIE_CODE_I 244 70000 240
Indeed, we have improved the CF, dramatically reducing it down to just 240 from the previous 22711. Even though the column value for the CODE really is randomly distributed throughout the table, the CF now suggests the data is perfectly clustered. This is because with a table with only 244 blocks, incrementing the CF if the current index entry references a table block more than 255 blocks ago is now impossible. The CF is now guaranteed to be “perfect” as each index entry can only reference one of the 244 table blocks and so is incremented only when each table block is referenced the first time.
This dramatic reduction in the CF will certainly make the index more attractive to the CBO. But is this really a good thing:
SQL> select * from bowie where code between 42 and 44; 2050 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1602289932 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 2814 | 56280 | 17 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| BOWIE | 2814 | 56280 | 17 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | BOWIE_CODE_I | 2814 | | 7 (0)| 00:00:01 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access('CODE'>=42 AND 'CODE'<=44) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 681 consistent gets 0 physical reads 0 redo size 20895 bytes sent via SQL*Net to client 519 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2050 rows processed
The CBO is now using the index, but is now performing significantly more consistent gets, 681 rather than the previous 254. This execution plan is actually much less efficient than the previous FTS execution plan. The CBO is getting this wrong now as the CF isn’t really anywhere near as good as it’s now being lead to believe.
Caution setting TABLE_CACHED_BLOCKS to a value that is anywhere close to the number of blocks in the table. This is one of the reasons for Oracle having a 1% of table blocks default value for this setting.
Clustering Factor Calculation Improvement Part II (Blocks On Blocks) May 14, 2013
Posted by Richard Foote in 11g, Clustering Factor, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.6 comments
My previous post on the new TABLE_CACHED_BLOCKS statistics gathering preference certainly generated some interest 🙂 My blog hits for the week have gone off the charts !!
One of the concerns raised by this new capability was that setting such a preference might result in really unrealistic and inaccurate Clustering Factor (CF) values, especially for those tables that truly have appalling CFs. Although there are certainly some dangers, Oracle has limited the possible “abuse” by ensuring TABLE_CACHED_BLOCKS can only be set to a maximum of 255. This means Oracle will only ignore a maximum of 255 table blocks that have recently been accessed during the CF calculation. For larger tables with truly randomised data patterns, not even the maximum 255 setting if utilised will make an appreciable difference to the final CF.
A couple of examples to demonstrate.
The first table is a relatively “large” table that has a DOB column that is effectively randomised throughout the table. There are approximately 20,000 different DOB values in a 2 million row table (so each DOB occurs approximately 100 times, give or take).
SQL> create table major_tom (id number, DOB date, text varchar2(30)); Table created. SQL> insert into major_tom select rownum, sysdate-trunc(dbms_random.value(0, 20000)), 'DAVID BOWIE' from dual connectby level <= 2000000; 2000000 rows created. SQL> commit; Commit complete.
Let’s now create an index on this DOB column and have a look at the CF:
SQL> create index major_tom_dob_i on major_tom(dob); Index created. SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'MAJOR_TOM', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor 2 FROM user_tables t, user_indexes i 3 WHERE t.table_name = i.table_name AND i.index_name='MAJOR_TOM_DOB_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ --------------- ---------- ---------- ----------------- MAJOR_TOM MAJOR_TOM_DOB_I 9077 2000000 1988164
So at 1,988,164, the CF is terrible. This is as expected as the DOB values are all randomised throughout the table. The index is not being used as we had hope (naively) so let’s use the new TABLE_CACHED_BLOCKS preference to now improve the calculated CF by setting it to the maximum 255 setting and recalculate the index statistics:
SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'MAJOR_TOM', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255); PL/SQL procedure successfully completed. SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'MAJOR_TOM_DOB_I', estimate_percent=> null); PL/SQL procedure successfully completed. SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor 2 FROM user_tables t, user_indexes i 3 WHERE t.table_name = i.table_name AND i.index_name='MAJOR_TOM_DOB_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ --------------- ---------- ---------- ----------------- MAJOR_TOM MAJOR_TOM_DOB_I 9077 2000000 1941946
We notice that although the CF has improved marginally, at whopping 1,941,946 it’s still terrible and has made no real appreciable difference. Why ?
Well let’s do some basic maths here. There are 9077 blocks in the table and the next DOB referenced in the index can potentially be in any one of them. Therefore, the chances of the next DOB being in one of the 255 previously accessed table blocks is only 255/9077 x 100 = approximately 2.8%. So in only 2.8% of the time is the CF likely to not be incremented and so the CF is only likely to drop by around this 2.8% amount.
Let’s check. (1988164 – 1941946)/1988164 x 100 indeed does equal approximately 2.8%.
So statistically with such a poor CF on such a “large” table, to limit the CF calculation if any of the last 255 table blocks are referenced is only going to improve things by 2.8% on average. Effectively of no real use at all.
Another example now, but this time with a CODE column with just 100 distinct values that are randomly distributed throughout another reasonable “large” 2 million row table. For those mathematically challenged, that means each value occurs approximately 20,000 times, give or take:
SQL> create table ziggy (id number, code number, text varchar2(30)); Table created. SQL> insert into ziggy select rownum, trunc(dbms_random.value(0, 100)), 'DAVID BOWIE' from dual connect by level <= 2000000; 2000000 rows created. SQL> commit; Commit complete. SQL> create index ziggy_code_i on ziggy(code); Index created. SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor 2 FROM user_tables t, user_indexes i 3 WHERE t.table_name = i.table_name AND i.index_name='ZIGGY_CODE_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ --------------- ---------- ---------- ----------------- ZIGGY ZIGGY_CODE_I 7048 2000000 662962
So at 662,962 it’s what I would describe as a “poor to average” CF. It’s not particularly great with there being just 7,048 table blocks but it’s still some distance from the 2,000,000 row value.
The index is not being used in SQL statements as we (naively) wish, so let’s try and improve things by lowering the index CF by setting the new TABLE_CACHED_BLOCKS preference to the maximum 255 setting:
SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'ZIGGY', pname=>'TABLE_CACHED_BLOCKS', pvalue=>255); PL/SQL procedure successfully completed. SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'ZIGGY_CODE_I', estimate_percent=>null); PL/SQL procedure successfully completed. SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor 2 FROM user_tables t, user_indexes i 3 WHERE t.table_name = i.table_name AND i.index_name='ZIGGY_CODE_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ --------------- ---------- ---------- ----------------- ZIGGY ZIGGY_CODE_I 7048 2000000 662962
We notice to our great disappointment (well, not really) that the CF remains completely unchanged at 662,962 !! Why ?
Again, let’s do some basic maths and consider the data distribution.
The table has some 7048 blocks but each distinct CODE value has some 20,000 occurrences on average. Therefore, each value is going to be found 20000/7048 = roughly 2 to 3 times per block. As the index is in CODE order and for each CODE in rowid order, the CF is going to increment for each CODE value for each distinct block we visit. We will therefore only go back to a previously visited table block (except for the 2 to 3 visits to the current block) when the CODE value changes but this will take us all the way back to the first block which is always going to be some 7047 blocks away from the current one. As 7047 is much greater than the 255 the CF calculation will only cater for, the CF is going to remain unchanged from the default calculation as a result.
And this is all as it should be, as the fundamental CF is indeed poor for these scenarios and even going back the maximum 255 data blocks will not reduce appreciably the manner in which the CF is calculated.
Of course, if there was no limit, then a setting of TABLE_CACHED_BLOCKS of say 7100 would enable the CF to be recalculated as being perfect in the above scenario, which would indeed be a concern. But 255 is the limit and so limits the potential “damaged” that can be done.
More on all this to come 🙂
Important !! Clustering Factor Calculation Improvement (Fix You) May 8, 2013
Posted by Richard Foote in 11g, ASSM, CBO, Clustering Factor, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.57 comments
Believe me, this article is worth reading 🙂
I’m currently not allowed to discuss Oracle 12c Database goodies but I am allowed to discuss things perhaps initially intended for 12c that are currently available and already back-ported to 11g. This includes a wonderful improvement in the manageability of how the Clustering Factor (CF) of an index can now be calculated. Many thanks to Martin Decker for pointing this out to me.
As anyone who has attended my Index Seminars will know, the CF of an index is one of the most important statistics used by the Cost Based Optimizer (CBO) in determining the most efficient execution plan. As such, it has always been an issue for me that the manner in which the CF is calculated has been so flawed.
Basically, the CF is calculated by performing a Full Index Scan and looking at the rowid of each index entry. If the table block being referenced differs from that of the previous index entry, the CF is incremented. If the table block being referenced is the same as the previous index entry, the CF is not incremented. So the CF gives an indication of how well ordered the data in the table is in relation to the index entries (which are always sorted and stored in the order of the index entries). The better (lower) the CF, the more efficient it would be to use the index as less table blocks would need to be accessed to retrieve the necessary data via the index.
However, there’s a basic flaw here. The CF calculation doesn’t take into consideration the fact the referenced table block, although maybe different from the previous one index entry, might already have recently been accessed. As such, during an index scan, the table block being accessed is almost certainly still cached in the buffer cache from the previous access, thereby not reducing the effectiveness of the index in any appreciable manner. A classic example of this would be a table with a few freelists. Although the data being inserted is not ordered precisely within the same data blocks, the data might actually be very well clustered within only a few blocks of each other.
Picture a table with 100 rows being inserted by 2 sessions simultaneously, each inserting 50 rows based on an ordered sequence. With one freelist, the data is basically inserted in one block first and then once full a second table block. The data is therefore perfectly ordered/clustered and the CF will evaluate to a value of 2 on such an indexed column. But with 2 freelists, one session could insert data into one block while the other session inserts into a second block, with the ordered sequenced values being randomly distributed among the 2 blocks. The CF could now potentially evaluate to a value of 100 as the rows are jumbled or “toggled” across the two blocks. This is a much much worse value (2 vs. 100) that can adversely impact the CBO calculations, although the efficiency of such an index is really almost identical as both table blocks are certain to be cached during an index scan regardless.
This is also a very common scenario with Automatic Segment Space Management (ASSM) tablespaces as I’ve discussed previously, which of course is now the default these days.
OK, let’s look at an example scenario. I’ll begin by creating a simple little table, an ordered sequence and a procedure that inserts 100,000 rows into the table:
SQL> create table bowie (id number, text varchar2(30)); Table created. SQL> create sequence bowie_seq order; Sequence created. SQL> CREATE OR REPLACE PROCEDURE bowie_proc AS 2 BEGIN 3 FOR i IN 1..100000 LOOP 4 INSERT INTO bowie VALUES (bowie_seq.NEXTVAL, 'ZIGGY STARDUST'); 5 COMMIT; 6 END LOOP; 7 END; 8 / Procedure created.
We note the table lives in an ASSM tablespace:
SQL> select table_name, i.tablespace_name, segment_space_management from dba_tables i, dba_tablespaces t where i.tablespace_name = t.tablespace_name and table_name='BOWIE'; TABLE_NAME TABLESPACE_NAME SEGMEN ------------ ------------------------------ ------ BOWIE USERS AUTO
We next have 3 different sessions that simultaneously run the procedure to load the table. Note that an ordered sequence is used which means the 3 sessions are randomly grabbing the next sequenced value to insert. The data though is basically being inserted in order of the ID column, it’s just that the data is being distributed across a few blocks as we go along the table, rather than strictly one block after the other.
SQL> exec bowie_proc PL/SQL procedure successfully completed.
Let’s create an index on the ID (sequenced) column and collect fresh statistics:
SQL> create index bowie_id_i on bowie(id); Index created. SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor 2 FROM user_tables t, user_indexes i 3 WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ ------------ ---------- ---------- ----------------- BOWIE BOWIE_ID_I 1126 300000 241465
We notice that although the data in the table in reality is actually quite well clustered/ordered on the ID column, the actual CF of the index is not reflecting this. At a massive 241,465 it’s an extremely high (bad) CF, much closer in value to rows in the table than the number of table blocks, as the CF calculation keeps flipping back and forth between differing blocks. With such a high CF, the CBO is therefore going to cost an index scan accordingly:
SQL> select * from bowie where id between 42 and 429; 388 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1845943507 --------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 389 | 7780 | 310 (1)| 00:00:04 | |* 1 | TABLE ACCESS FULL| BOWIE | 389 | 7780 | 310 (1)| 00:00:04 | --------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("ID"<=429 AND "ID">=42) Statistics ---------------------------------------------------------- 0 recursive calls 1 db block gets 1093 consistent gets 0 physical reads 0 redo size 4084 bytes sent via SQL*Net to client 519 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 388 rows processed
Even though only approx. 0.13% of rows are being accessed and more importantly a similar low percentage of table blocks, the CBO has determined that a Full Table Scan (FTS) is the cheaper alternative. This is an all too familiar scenario, all down to the fact the CF is not accurately reflecting the true clustering of the data and subsequent efficiency of the index.
Finally, at long last, there’s now an official fix for this !!
Bug 13262857 Enh: provide some control over DBMS_STATS index clustering factor computation INDEX describes this scenario and currently has available patches that can be applied on both Exadata databases and Oracle versions 11.1.0.7, 11.2.0.2 and 11.2.0.3. The patches (eg. Patch ID 15830250) describe the fix as addressing “Index Clustering Factor Computation Is Pessimistic“. I couldn’t have described it better myself 🙂
Once applied (the following demo is on a patched 11.2.0.3 database), there is a new statistics collection preference that can be defined, called TABLE_CACHED_BLOCKS. This basically sets the number of table blocks we can assume would already be cached when performing an index scan and can be ignored when incrementing the CF during statistics gathering. The default is 1 (i.e. as performed presently) but can be set up to be a value between 1 and 255, meaning during the collection of index statistics, it will not increment the CF if the table block being referenced by the current index entry has already been referenced by any of the prior 255 index entries (if set to 255). It basically sets the appropriate parameter in the sys_op_countchg function used to calculate the CF value during statistic gathering to not increment the CF if the current table block has already been accessed “x” index entries previously.
The TABLE_CACHED_BLOCKS preference can be set by either the DBMS_STATS.SET_TABLE_PREFS, DBMS_STATS.SET_SCHEMA_PREFS or DBMS_STATS.SET_DATABASE_PREFS procedures.
So let’s now change the TABLE_CACHED_BLOCKS preference for this table and re-calculate the index statistics:
SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE', pname=>'TABLE_CACHED_BLOCKS', pvalue=>42); PL/SQL procedure successfully completed. SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null); PL/SQL procedure successfully completed. SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor 2 FROM user_tables t, user_indexes i 3 WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I'; TABLE_NAME INDEX_NAME BLOCKS NUM_ROWS CLUSTERING_FACTOR ------------ ------------ ---------- ---------- ----------------- BOWIE BOWIE_ID_I 1126 300000 1035
We notice that the CF has now been significantly reduced (down from 241465 to just 1035), reflecting far more accurately the true clustering of the data when considering the actual effectiveness of using the index.
If we now run the same query as before:
SQL> select * from bowie where id between 42 and 429; 388 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3472402785 ------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|Time | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 389 | 7780 | 4 (0)|00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| BOWIE | 389 | 7780 | 4 (0)|00:00:01 | |* 2 | INDEX RANGE SCAN | BOWIE_ID_I | 389 | | 2 (0)|00:00:01 | ------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID">=42 AND "ID"<=429) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 6 consistent gets 0 physical reads 0 redo size 9882 bytes sent via SQL*Net to client 519 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 388 rows processed
We notice the index is now being selected by the CBO. At a cost of 4 (previously the cost was somewhat greater than the 310 cost of the FTS), this much more accurately reflects the true cost of using the index (notice only 6 consistent gets are performed).
Being able to now set the TABLE_CACHED_BLOCKS preference during statistics collection finally gives us a fully supported and easy method to collect more accurate CF statistics. This in turn can only lead to more informed and accurate decisions by the CBO and ultimately better performing applications. Although available right now via the back ported patches, this will no doubt all be fully documented once the 12c database is finally released.
I can’t recommend enough the use of this new capability 🙂
Indexes: Oracle11g New Features Presentation (Get Back) September 19, 2012
Posted by Richard Foote in 11g, 11g New features, Oracle Indexes.5 comments
I’m in the early stages of compiling an Oracle12c Indexes New Features presentation so I thought I might make available the 11g version I’ve presented previously at Oracle OpenWorld and InSync conferences:
Enjoy 🙂
Oracle11g: Analyze Table Validate Structure Cascade “FAST” (Slow Burn) March 31, 2011
Posted by Richard Foote in 11g, 11g New features, Oracle Indexes, Validate Structure.9 comments
I always take notice when Oracle introduces a new “FAST” option, so it was with some excitement when I first noticed in Oracle 11g Rel 1 there was a new FAST option when running the ANALYZE TABLE CASCADE VALIDATE STRUCTURE command.
This was described in the manuals as introducing a hashing scheme that was significantly faster than the traditional cascade validation method.
However, when I tested this new feature, the results were rather disappointing to say the least (I’ve tested this on various 11g versions, both R1 and R2 and on various platforms, AIX and Windows). In the example below, the PERSONS table is a rather large table that has a number of associated indexes:
SQL> set timing on SQL> analyze table persons validate structure cascade; Table analyzed. Elapsed: 00:06:01.84 SQL> analyze table persons validate structure cascade fast; Table analyzed. Elapsed: 00:15:20.27 SQL> analyze table persons validate structure cascade; Table analyzed. Elapsed: 00:02:22.27 SQL> analyze table persons validate structure cascade fast; Table analyzed. Elapsed: 00:15:46.28 SQL> analyze table persons validate structure cascade; Table analyzed. Elapsed: 00:02:23.78 SQL> analyze table persons validate structure cascade fast; Table analyzed. Elapsed: 00:14:58.00
When using the so-called “FAST” option, the performance was consistently significantly slower, not faster, when compared to using the default method. Perhaps the default option is “FASTER-STILL” ? (the default is actually “COMPLETE”). Additionally, the results of subsequent executions of the default method often resulted in improved times (eg: elapsed times reduced from 6 mins to around 2.5 mins), whereas the FAST option resulted in uniformly slower elapsed times.
I thought this warranted further investigation and so decided to trace what was actually going on behind the scenes and see where the extra time was being spent.
With the Complete default method, a significant number of I/Os were being performing as Oracle had to effectively perform an Index Full Scan on each index, reading the table primarily using single block reads. However, CPU related overheads were relatively low, with most of the elapsed times attributed to the related I/Os. Following is an extract from a formatted trace file:
analyze table promis.persons validate structure cascade call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.01 0.11 17 915 0 0 Execute 1 145.62 352.22 134061 58166955 3 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 145.63 352.33 134078 58167870 3 0
The total results for the trace were as follows:
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.01 0.11 17 915 0 0 Execute 1 145.62 352.22 134061 58166955 3 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 145.63 352.33 134078 58167870 3 0 OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 87 0.03 0.02 0 0 0 0 Execute 243 0.04 0.07 0 0 0 100 Fetch 577 0.10 0.31 110 1114 0 766 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 907 0.17 0.41 110 1114 0 866
We’ll compare these results later but for now note that elapsed times were 352 seconds in total, with the CPU only contributing about 165 seconds of that time.
If we now look behind the scenes of a “FAST” VALIDATE STRUCTURE CASCADE, we notice that statements such as the following are now recursively executed:
select /*+ full(PROMIS.PERSONS) */ ORA_HASH(ID || rowid) from PROMIS.PERSONS MINUS select /*+ index_ffs(PROMIS.PER_PK) */ ORA_HASH(ID || rowid) from PROMIS.PERSONS call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.00 0.00 0 0 0 0 Execute 1 0.00 0.00 0 0 0 0 Fetch 1 35.23 37.83 7055 10402 6 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 3 35.23 37.83 7055 10402 6 0
We notice that Oracle now uses a new method, basically performing a Full Table Scan of the table and processing the indexed column and associated rowid through the ORA_HASH function which in turn returns a computed hash value. The full table scan is a very efficient method of reading all related values from a table but putting each column value and rowid through the ORA-HASH function is a relatively expensive CPU intensive operation. Oracle then does the same for the index by performing an Index Fast Full Scan of the index and processing again the index column value and associated rowid through the ORA-HASH function. Again, this is a very efficient method of reading all values from an index but again relatively CPU intensive putting every indexed value and associated rowid through the ORA-HASH function.
Having performed this for both table and index, in theory if the data structures are indeed valid and non-corrupt, they should both return exactly the same results. So by performing a MINUS of both row sets, if no data is returned, we can be sure indeed all is well with both table/index. If however differences are detected, then there’s some disparity between the data in the index and in the table and hence something is wrong. We’ll have no idea exactly what the differences might be as we only have the resultant hash values to go by, but the analyze validate structure command can at least raise an error and say something is indeed wrong when validating the table and its associated indexes.
We would actually now need to perform an Analyze Validate Structure again (without the FAST option this time) to determine the exact cause of the issue, but assuming detecting an error is a very rare event, it’s not an additional step we would ordinarily need to perform. So if by reading both table and index via multi-block Full Scans and processing the index data via the ORA-HASH function is “faster” and more efficient to determine nothing is actually wrong, then it’s a reasonable strategy to take.
This processing is then repeated for every index associated with the table we’re analyzing. So in theory, because we’re only performing Full Table and Fast Full Index Scans, we end up performing far fewer multi-block I/Os calls and so can complete this whole Validate Structure processing “Faster”.
However, Oracle needs to perform the ORA-HASH function operation for every column in every index, both within the table and associated indexes. So although we end up doing less I/Os, we end up burning much more CPU due to having to call the ORA-HASH function so often. If we look at the total resources when performing the “FAST” validate structure on exactly the same table:
OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 1 0.01 0.01 1 145 0 0 Execute 1 0.03 0.02 190294 3086489 2 0 Fetch 0 0.00 0.00 0 0 0 0 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 2 0.04 0.03 190295 3086634 2 0 OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS call count cpu elapsed disk query current rows ------- ------ -------- ---------- ---------- ---------- ---------- ---------- Parse 49 0.06 0.07 0 0 0 0 Execute 49 0.00 0.00 0 0 0 2 Fetch 46 830.63 878.62 190295 3086514 138 21 ------- ------ -------- ---------- ---------- ---------- ---------- ---------- total 144 830.69 878.69 190295 3086514 138 23
We notice the query values are significantly reduced (down to just 3M from 58M). This of course is a good thing as this translates to reduced I/O related overheads.
However, if we look at the corresponding CPU related costs, they have increased significantly (up to 830 seconds from just 146 seconds). It’s really expensive putting all the indexed column values through the ORA-HASH function. Additionally, because multi-block Full Scans which aren’t generally cached are being performed during the Fast option, the results remain consistently poor with subsequent executions.
The net result is that the final elapsed times are consistently greater with the FAST option than without (878 seconds up from 352 seconds). Using the FAST option has resulted in slower response times, not faster ones.
Now your mileage may vary based on your specific environments, but my results have not been particularly impressive to say the least. Substantially increased elapsed times (during which the structures remain locked let’s not forget) is not exactly my idea of a “FAST” option.
Oracle11g: Zero Sized Unusable Indexes Part II (Nathan Adler) February 27, 2011
Posted by Richard Foote in 11g, 11g New features, Oracle Indexes, Unusable Indexes.12 comments
In my previous post, I discussed how Oracle from 11g R2 onwards will automatically drop the segment and associated storage from unusable index objects. Mohamend Houri asked in the comments section the excellent question of just how useful this feature will be in real life cases when typically indexes are not left in an unusuable state for a long time, perhaps only when performing large data loads when such indexes would ordinarily be rebuilt anyways.
Thought the question was worth a seperate blog entry to provide a worthy answer.
The first point I would make is that we need to think a little outside the box and consider how such change in behaviour can open up new possibilities and flexibilities in how we index our tables.
For example, previously a Local Partitioned Index must have the same number of index partitions as the parent table. But what if an index is only useful for the “current” partition, where accessing newish data makes sense via an index. However, historical data in “older” partitions might only be accessed in batch processes via full partition scans. Why have a local index for older partitions when such indexes are never used. Previously, we had no choice, it was a case of if one or some of the partitions needed an index, then all the partitions needed to be indexed. If we made such unnecessary partitioned indexes unusable, we still needed to allocate storage for the index segment. Now, we can make any unnecessary index partition unusable and no storage at all is allocated to such index partitions.
Taking this a step further, we now have a really nice method of potentially indexing only portions of a table that need indexing, values which don’t have any benefit of being indexed (perhaps because the values are too numerous to ever be accessed efficiently via an index) no longer need to be indexed at all.
Here’s a classic example. Following is a table with a flag in which the vast number of rows in the data have been “processed”. However, we have a few rows, those current rows which are of interest to us, which have not yet been processed (they may have a status of another value). We need an index in order to find the few rows which have not yet been processed but the index needs to also include all the values which are not of interest and have been processed.
SQL> create table bowie_stuff (id number, processed varchar2(10)); Table created. SQL> insert into bowie_stuff select rownum, 'YES' from dual connect by level <= 1000000; 1000000 rows created. SQL> commit; Commit complete. SQL> update bowie_stuff set processed = ‘NO’ where id in (999990, 999992, 999994, 999996, 999998); 5 rows updated. SQL> commit; Commit complete. SQL> create index bowie_stuff_i on bowie_stuff(processed) pctfree 0; Index created. SQL> select index_name, leaf_blocks from dba_indexes where index_name = 'BOWIE_STUFF_I'; INDEX_NAME LEAF_BLOCKS ------------------------------ ----------- BOWIE_STUFF_I 1877 SQL> select segment_name, blocks from dba_segments where segment_name = 'BOWIE_STUFF_I'; SEGMENT_NAME BLOCKS -------------------- ---------- BOWIE_STUFF_I 1920
Notice how the index is quite large (1,877 leaf blocks) as it needs to hold values for all 1M rows, even though only a relative handful of values within the index are ultimately of any use.
If we now gather stats (note we need to collect histograms as the column value distribution is very skewed) and run a query to select just the 5 rows that have not actually been processed:
SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE_STUFF', estimate_percent=>null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE_STUFF', estimate_percent=>null, method_opt=> 'FOR COLUMNS PROCESSED SIZE 5'); PL/SQL procedure successfully completed. SQL> select * from bowie_stuff where processed = 'NO'; Execution Plan --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 40 | 4 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID| BOWIE_STUFF | 5 | 40 | 4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | BOWIE_STUFF_I | 5 | | 3 (0)| 00:00:01 | --------------------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 6 consistent gets 0 physical reads 0 redo size 540 bytes sent via SQL*Net to client 396 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 5 rows processed
Note the CBO uses the index but it requires a total of 6 consistent reads.
Not bad but we can do somewhat better and perform less I/O , significantly reduce storage overheads and significantly reduce index maintenance operations, if only we didn’t store the unnecessary index values within the index.
One method could be to create a function-based index based on the decode function and only store non-null values that are of interest. However, this requires the application to likewise use the decode function in order to make use of the index.
Another method is to use a partitioned index and now with this new Oracle11g feature of zero sized unusable indexes, we don’t need any storage at all for the unwanted indexed values.
Let’s now re-create the index as a globally partitioned index, with one partition defined to contain all values of interest and another partition defined to contain the vast number of processed values. Initially, the index is created in an unusable state so no segments and no storage is allocated to any of the partitions:
SQL> drop index bowie_stuff_i; Index dropped. SQL> create index bowie_stuff_i on bowie_stuff(processed) 2 global partition by range (processed) 3 (partition not_processed_part values less than ('YES'), 4 partition processed_part values less than (MAXVALUE)) 5 unusable; Index created.
Next, we’re only going to rebuild the partition containing just the relatively few rows of interest. The partition containing the values that are not of interest is left in an unusable state and so continues to occupy no storage at all:
SQL> alter index bowie_stuff_i rebuild partition not_processed_part; Index altered. SQL> select index_name, partition_name, leaf_blocks from dba_ind_partitions where index_name = 'BOWIE_STUFF_I'; INDEX_NAME PARTITION_NAME LEAF_BLOCKS -------------------- -------------------- ----------- BOWIE_STUFF_I PROCESSED_PART 0 BOWIE_STUFF_I NOT_PROCESSED_PART 1 SQL> select segment_name, partition_name, blocks from dba_segments where segment_name = 'BOWIE_STUFF_I'; SEGMENT_NAME PARTITION_NAME BLOCKS -------------------- -------------------- ---------- BOWIE_STUFF_I NOT_PROCESSED_PART 8
Note how the index is now tiny (reduced from 1,877 leaf blocks to just 1) as it is only now just storing the index entries that are of interest. We have just saved ourselves heaps of storage as the other partition remains unusable and uses no storage at all.
If we now run our query again:
SQL> select * from bowie_stuff where processed = 'NO'; Execution Plan -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 5 | 45 | 1 (0)| 00:00:01 | | | | 1 | PARTITION RANGE SINGLE | | 5 | 45 | 1 (0)| 00:00:01 | 1 | 1 | | 2 | TABLE ACCESS BY INDEX ROWID| BOWIE_STUFF | 5 | 45 | 1 (0)| 00:00:01 | | | |* 3 | INDEX RANGE SCAN | BOWIE_STUFF_I | 5 | | 1 (0)| 00:00:01 | 1 | 1 | -------------------------------------------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 542 bytes sent via SQL*Net to client 395 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 5 rows processed
We notice that the execution plan is just using the tiny index partition and as a result we have reduced our consistent gets down from 6 to just 4. Additionally, we have not had to change our application at all to use the improved index, it was the exact same query as executed previously.
This method can of course be used in Oracle prior to 11g R2 but now with zero sized unusable indexes, we do not have to allocate any storage at all to those indexes that we may wish to remain in an unusable state for extended or indefinite periods of time. So yes, zero sized unusable indexes can be extremely useful in many real life scenarios 🙂
Oracle11g: Zero Sized Unusable Indexes (Zeroes) February 25, 2011
Posted by Richard Foote in 11g, 11g New features, Oracle Indexes, Unusable Indexes.2 comments
Following on from my previous discussion on “Create On Demand” segments, Oracle 11g R2 has also introduced storage saving initiatives in relation to useable indexes. Starting with a simple Oracle 10g example, we create a table and associated index:
SQL> create table bowie as select rownum id, 'BOWIE' name from dual connect by level <= 1000000; Table created. SQL> create index bowie_id_i on bowie(id); Index created. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed.
If we now make the index unusable:
SQL> alter index bowie_id_i unusable; Index altered. SQL> select index_name, blevel, leaf_blocks, num_rows, status, dropped from dba_indexes where index_name = 'BOWIE_ID_I'; INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS STATUS DRO ---------- ---------- ----------- ---------- -------- --- BOWIE_ID_I 2 2226 1000000 UNUSABLE NO SQL> select segment_name, bytes, blocks, extents from dba_segments where segment_name = 'BOWIE_ID_I'; SEGMENT_NAME BYTES BLOCKS EXTENTS ------------ ---------- ---------- ---------- BOWIE_ID_I 18874368 2304 18
We notice that the storage associated with the segment remains, even though the data within the index is now totally useless to us now. The index definition is of course vital but why bother continuing to assign 18 extents of storage (in this example) to the index ? Oracle 11g Release 2 has now by default changed this behaviour.
Using the same demo as before but running Oracle11g R2:
SQL> create table bowie as select rownum id, 'BOWIE' name from dual connect by level <= 1000000; Table created. SQL> create index bowie_id_i on bowie(id); Index created. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> alter index bowie_id_i unusable; Index altered. SQL> select index_name, blevel, leaf_blocks, num_rows, status, dropped from dba_indexes where index_name = 'BOWIE_ID_I'; INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS STATUS DRO ---------- ---------- ----------- ---------- -------- --- BOWIE_ID_I 2 2226 1000000 UNUSABLE NO SQL> select segment_name, bytes, blocks, extents from dba_segments where segment_name = 'BOWIE_ID_I'; no rows selected
We notice that the storage associated with the object is all gone. The index object remains but the underling segment and storage have been automatically dropped.
If we now look at a partitioning example, where we create 3 types of indexes:
SQL> CREATE TABLE big_album_sales(id number, album_id number, country_id number, release_date date, total_sales number) PARTITION BY RANGE (release_date) (PARTITION ALBUMS_2007 VALUES LESS THAN (TO_DATE('01-JAN-2008', 'DD-MON-YYYY')), PARTITION ALBUMS_2008 VALUES LESS THAN (TO_DATE('01-JAN-2009', 'DD-MON-YYYY')), PARTITION ALBUMS_2009 VALUES LESS THAN (TO_DATE('01-JAN-2010', 'DD-MON-YYYY')), PARTITION ALBUMS_2010 VALUES LESS THAN (MAXVALUE)); Table created. SQL> INSERT INTO big_album_sales SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2000), ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 1000000; 1000000 rows created. SQL> commit; Commit complete.
We first create a Non-Partitioned Index:
SQL> CREATE INDEX big_album_tot_sales_i ON big_album_sales(total_sales); Index created.
Next a Global Partitioned Index:
SQL> CREATE INDEX big_album_country_id_i ON big_album_sales(country_id) GLOBAL PARTITION BY RANGE (country_id) (PARTITION TS1 VALUES LESS THAN (26), PARTITION TS2 VALUES LESS THAN (51), PARTITION TS3 VALUES LESS THAN (76), PARTITION TS4 VALUES LESS THAN (MAXVALUE)); Index created.
Finally, a Local Partitioned index:
SQL> CREATE INDEX big_album_album_id_i ON big_album_sales(album_id) local; Index created. SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'BIG_ALBUM_SALES', estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed.
If we now split the last table partition, this will effectively make the:
1) Non-Partitioned Unusable
2) All partitions of the Global Partitioned index unusable
3) Just the last 2 partitions of the Local Partitioned Index unusable
SQL> ALTER TABLE big_album_sales SPLIT PARTITION ALBUMS_2010 AT (TO_DATE('01-JAN-2011', 'DD-MON-YYYY')) INTO (PARTITION ALBUMS_2010, PARTITION ALBUMS_2011); Table altered. SQL> select index_name, status from dba_indexes where table_name = 'BIG_ALBUM_SALES'; INDEX_NAME STATUS ------------------------ -------- BIG_ALBUM_TOT_SALES_I UNUSABLE BIG_ALBUM_COUNTRY_ID_I N/A BIG_ALBUM_ALBUM_ID_I N/A SQL> select index_name, partition_name, status, leaf_blocks from dba_ind_partitions where index_name like 'BIG_ALBUM_%'; INDEX_NAME PARTITION_NAME STATUS LEAF_BLOCKS ----------------------- -------------- -------- ----------- BIG_ALBUM_ALBUM_ID_I ALBUMS_2007 USABLE 807 BIG_ALBUM_ALBUM_ID_I ALBUMS_2008 USABLE 381 BIG_ALBUM_ALBUM_ID_I ALBUMS_2009 USABLE 383 BIG_ALBUM_ALBUM_ID_I ALBUMS_2010 UNUSABLE BIG_ALBUM_ALBUM_ID_I ALBUMS_2011 UNUSABLE BIG_ALBUM_COUNTRY_ID_I TS1 UNUSABLE 629 BIG_ALBUM_COUNTRY_ID_I TS2 UNUSABLE 629 BIG_ALBUM_COUNTRY_ID_I TS3 UNUSABLE 629 BIG_ALBUM_COUNTRY_ID_I TS4 UNUSABLE 629 SQL> select segment_name, partition_name, bytes, blocks from dba_segments where segment_name like 'BIG_ALBUM_%' and segment_type like 'INDEX%'; SEGMENT_NAME PARTITION_NAME BYTES BLOCKS --------------------- -------------- -------- ------ BIG_ALBUM_ALBUM_ID_I ALBUMS_2007 7340032 896 BIG_ALBUM_ALBUM_ID_I ALBUMS_2008 3145728 384 BIG_ALBUM_ALBUM_ID_I ALBUMS_2009 4194304 512 BIG_ALBUM_TOT_SALES_I 23068672 2816
We notice that all segments associated with the Global Partitioned index which are now unusable have been dropped. As have both unusable partitions from the Local Partitioned Index. However, the segment and storage associated with the unusable Non-Partitioned index still remains. Perhaps a missing feature for another time …
It’s a nice little touch that the unusable and somewhat useless index segments now get automatically cleaned out in Oracle11g R2, although they did previously act as “placeholders” in that nothing else within the tablespace could come along and use the occupied storage.
Oracle11g Creation On Demand Indexes (Invisible Touch) February 17, 2011
Posted by Richard Foote in 11g, 11g New features, Oracle Indexes.11 comments
Prior to Oracle11g Release 2, the default and minimum size of a segment is one extent. So in the below example, where we create a table and five associated indexes:
SQL> create table empty (a number, b number, c number, d number, e number); Table created. SQL> create index empty_a_i on empty(a); Index created. SQL> create index empty_b_i on empty(b); Index created. SQL> create index empty_c_i on empty(c); Index created. SQL> create index empty_d_i on empty(d); Index created. SQL> create index empty_e_i on empty(e); Index created. SQL> select segment_name, blocks, bytes, extents from dba_segments where segment_name like 'EMPTY%'; SEGMENT_NAME BLOCKS BYTES EXTENTS ------------ ---------- ---------- ---------- EMPTY 128 1048576 1 EMPTY_A_I 128 1048576 1 EMPTY_B_I 128 1048576 1 EMPTY_C_I 128 1048576 1 EMPTY_D_I 128 1048576 1 EMPTY_E_I 128 1048576 1 6 rows selected.
Each of the segments has been allocated an extent, including each of the indexes.
However, since Oracle11g Release 2, this default behaviour has changed. Running exactly the same demo:
SQL> create table empty (a number, b number, c number, d number, e number); Table created. SQL> create index empty_a_i on empty(a); Index created. SQL> create index empty_b_i on empty(b); Index created. SQL> create index empty_c_i on empty(c); Index created. SQL> create index empty_d_i on empty(d); Index created. SQL> create index empty_e_i on empty(e); Index created. SQL> select segment_name, blocks, bytes, extents from dba_segments where segment_name like 'EMPTY%'; no rows selected
We can see that no actual segments have been allocated. The default number of extents when creating an object is now effectively zero. Oracle now defers the creation of the segment and the actual allocation of extents and storage until the point in time when the first row gets inserted.
This means for those packaged applications where a large number of objects get created of which relatively few are actually ever used by the specific deployment of the application (eg. SAP) , a substantial amount of storage could potentially be saved. It also can save a significant amount of time deploying such applications as the overheads associated with actually creating the never to be used segments can be avoided.
There also some subtle performance implications when the application attempts to access some of these “empty” tables. I’m just going to create another table, but this one populated with a bunch of rows and run a query that joins this with the “empty” table:
SQL> create table bowie as select rownum id, 'BOWIE' name from dual connect by level <= 1000000; Table created. SQL> create index bowie_id_i on bowie(id); Index created. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', cascade=>true, estimate_percent=>null, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'EMPTY', cascade=>true, estimate_percent=>null, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> select * from bowie, empty where bowie.id=empty.a and bowie.id = 42; no rows selected Execution Plan ---------------------------------------------------------- Plan hash value: 1098778158 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 76 | 2 (0)| 00:00:01 | | 1 | MERGE JOIN CARTESIAN | | 1 | 76 | 2 (0)| 00:00:01 | | 2 | TABLE ACCESS BY INDEX ROWID | EMPTY | 1 | 65 | 1 (0)| 00:00:01 | |* 3 | INDEX RANGE SCAN | EMPTY_A_I | 1 | | 1 (0)| 00:00:01 | | 4 | BUFFER SORT | | 1 | 11 | 1 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID| BOWIE | 1 | 11 | 1 (0)| 00:00:01 | |* 6 | INDEX RANGE SCAN | BOWIE_ID_I | 1 | | 0 (0)| 00:00:01 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("EMPTY"."A"=42) 6 - access("BOWIE"."ID"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 0 consistent gets 0 physical reads 0 redo size 303 bytes sent via SQL*Net to client 239 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed
With a table which has yet to be populated, the CBO knows there can be no rows associated with this table (and hence no returns return from this query). You can’t get any less than 0 consistent gets. Note previously, the CBO would be forced to perform at the very least one read (if not more) as it can’t “know” there are indeed no rows, even with the table statistics set at zero (as they may no longer be accurate) until it actually physically accesses a segment.
The segment is not actually created and storage allocated to the object until the first row is inserted into the table. This means that this first insert will be a relatively expensive operation as the work associated with creating the segment and allocating the initial extent needs to be implicitly performed in the background at this time. Not only for the table itself, but for any dependant object as well (eg. indexes).
SQL> insert into empty (a, b) values (1,1); 1 row created. Execution Plan ---------------------------------------------------------- -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | 1 (0)| 00:00:01 | | 1 | LOAD TABLE CONVENTIONAL | EMPTY | | | | -------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 3535 recursive calls 178 db block gets 830 consistent gets 26 physical reads 27272 redo size 367 bytes sent via SQL*Net to client 321 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 59 sorts (memory) 0 sorts (disk) 1 rows processed SQL> commit; Commit complete.
Although it’s only a single row insert, note the high number of recursive calls and logical I/Os due in large part to the segment(s) being implicitly created in the background at this time. Subsequent inserts are nowhere near as expensive:
SQL> insert into empty (a, b) values (1,1); 1 row created. Execution Plan ---------------------------------------------------------- -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | INSERT STATEMENT | | 1 | 1 (0)| 00:00:01 | | 1 | LOAD TABLE CONVENTIONAL | EMPTY | | | | -------------------------------------------------------------------------- Statistics ---------------------------------------------------------- 0 recursive calls 5 db block gets 1 consistent gets 0 physical reads 692 redo size 381 bytes sent via SQL*Net to client 321 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 1 rows processed
Although, the table was populated with values that only directly impacted 2 of the indexes, once the table segment is created, all dependent segments are created at the same time.
SQL> select segment_name, blocks, bytes, extents from dba_segments where segment_name like 'EMPTY%'; SEGMENT_NAME BLOCKS BYTES EXTENTS ------------ ---------- ---------- ---------- EMPTY 128 1048576 1 EMPTY_A_I 128 1048576 1 EMPTY_B_I 128 1048576 1 EMPTY_C_I 128 1048576 1 EMPTY_D_I 128 1048576 1 EMPTY_E_I 128 1048576 1 SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'EMPTY', cascade=>true, estimate_percent=>null, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> select index_name, blevel, leaf_blocks, status from dba_indexes where index_name like 'EMPTY%'; INDEX_NAME BLEVEL LEAF_BLOCKS STATUS ---------- ---------- ----------- -------- EMPTY_A_I 0 1 VALID EMPTY_B_I 0 1 VALID EMPTY_C_I 0 0 VALID EMPTY_D_I 0 0 VALID EMPTY_E_I 0 0 VALID
Note all the index segments have been created although only 2 of them actually contain data.
Finally, there are some implications with regard to how quotas are enforced that potentially may cause issues. Prior to Oracle11g Release 2, if a user tried to create a segment in a tablespace for which they don’t have sufficient privileges, the operation will fail during the creation of the object:
SQL> create user muse identified by muse default tablespace user_data temporary tablespace temp; User created. SQL> grant create session, create table to muse; Grant succeeded. SQL> connect muse/muse; Connected. SQL> create table fred (id number primary key using index (create index fred_pk on fred(id) tablespace user_data), name varchar2(20)); create table fred (id number primary key using index (create index fred_pk on fred(id) tablespace user_data), name varchar2(20)) * ERROR at line 1: ORA-01950: no privileges on tablespace 'USER_DATA'
However with Oracle11g Release 2, as no segment is actually created and hence storage allocated when an object is created, such creation of objects will now initially succeed. No quotas can be violated if no storage is actually used during the creation of the object:
SQL> create user muse identified by muse default tablespace user_data temporary tablespace temp; User created. SQL> grant create session, create table to muse; Grant succeeded. SQL> connect muse/muse Connected. SQL> create table fred (id number primary key using index (create index fred_pk on fred(id) tablespace user_data), name varchar2(20)); Table created.
It’s only when the table is first populated and hence when the segment is actually created and storage allocated, will quota related issues generate errors:
SQL> insert into fred values (1, 'BOWIE'); insert into fred values (1, 'BOWIE') * ERROR at line 1: ORA-01950: no privileges on tablespace 'USER_DATA'
An interesting change in default behaviour introduced in Oracle11g Release 2 that can potentially be quite beneficial but perhaps also a little dangerous for the unwary.
Oracle11g Bitmap-Join IOTs (Us and Them) January 25, 2011
Posted by Richard Foote in 11g, Bitmap Indexes, Index Organized Tables, Oracle Indexes.6 comments
With each new database release, nice little improvements and enhanced options continually get added. Since 11g R1, two index related features can finally be used in combination with each other.
To demonstrate, I’m first going to create and populate a so-called “large” Data Warehouse table.
SQL> CREATE TABLE big_dwh_table (id NUMBER PRIMARY KEY, album_id NUMBER, artist_id NUMBER, country_id NUMBER, format_id NUMBER, release_date DATE, total_sales NUMBER); Table created. SQL> CREATE SEQUENCE dwh_seq; Sequence created. SQL> create or replace procedure pop_big_dwh_table as 2 v_id number; 3 v_artist_id number; 4 begin 5 for v_album_id in 1..10000 loop 6 v_artist_id:= ceil(dbms_random.value(0,100)); 7 for v_country_id in 1..100 loop 8 select dwh_seq.nextval into v_id from dual; 9 insert into big_dwh_table values (v_id, v_album_id, v_artist_id, v_country_id, ceil(dbms_random.value(0,4)), trunc(sysdate-mod(v_id,ceil(dbms_random.value(0,1000)))), ceil(dbms_random.value(0,500000))); 10 end loop; 11 end loop; 12 commit; 13 end; 14 / Procedure created. SQL> exec pop_big_dwh_table PL/SQL procedure successfully completed.
I’ll next create a standard bitmap index on the ALBUM_ID column and collect a few statistics:
SQL> create bitmap index big_dwh_table_album_id_i on big_dwh_table(album_id); Index created. SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'BIG_DWH_TABLE', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed.
OK, I’m now going to create and populate a “smaller” dimension/detail heap table and a few associated indexes:
SQL> CREATE TABLE albums (album_id number, album_details varchar2(30)); Table created. SQL> INSERT INTO albums SELECT rownum, substr(object_name,1,30) FROM dba_objects WHERE rownum <= 10000; 10000 rows created. SQL> commit; Commit complete. SQL> alter table albums add primary key(album_id); Table altered. SQL> create index albums_details_i on albums(album_details); Index created. SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'ALBUMS', estimate_percent=> null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed.
If we now run a little query that joins the two tables together:
SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$'; 100 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1936297994 ---------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 125 | 4250 | 25 (0)| 00:00:01 | | 1 | NESTED LOOPS | | | | | | | 2 | NESTED LOOPS | | 125 | 4250 | 25 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| ALBUMS | 1 | 22 | 2 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | ALBUMS_DETAILS_I | 1 | | 1 (0)| 00:00:01 | | 5 | BITMAP CONVERSION TO ROWIDS| | | | | | |* 6 | BITMAP INDEX SINGLE VALUE | BIG_DWH_TABLE_ALBUM_ID_I | | | | | | 7 | TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE | 100 | 1200 | 25 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("A"."ALBUM_DETAILS"='TAB$') 6 - access("B"."ALBUM_ID"="A"."ALBUM_ID") Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 10 consistent gets 0 physical reads 0 redo size 1648 bytes sent via SQL*Net to client 396 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
The resultant execution plan is pretty good and efficient and what we would expect. It performs a nested loop join to join the tables together which based on the relatively small number of rows returned makes sense and uses the b-tree index to get the specific album details from the dimension table and the bitmap index to find the matching albums details from the larger table.
However, as this is a very frequently executed join condition, we can potentially improve things and reduce the 10 consistent gets by introducing a bitmap-join index. A bitmap-join index performs the “join” operation once, when the index is created and during subsequent DML operations by creating an index based on column(s) on the smaller dimension tables that directly references rows in the larger fact table.
SQL> drop index albums_details_i; Index dropped. SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details) FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id; Index created.
So the bitmap-join index is based on the ALBUM_DETAILS column from the smaller ALBUMS table, but it references and has rowids associated with the larger BIG_DWH_TABLE table, with the bitmap-join definition containing details on how the join between the two tables needs to be performed. It if want to know what rows within the larger table have ALBUM_DETAILS of interest, the corresponding bitmap-join index will find all such rows without having to access the smaller ALBUMS table that contains this column.
If we now run the same query as before:
SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$'; 100 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 950886520 -------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 125 | 1500 | 26 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE | 125 | 1500 | 26 (0)| 00:00:01 | | 2 | BITMAP CONVERSION TO ROWIDS| | | | | | |* 3 | BITMAP INDEX SINGLE VALUE | BIG_DWH_ALBUM_DETAILS_I | | | | | -------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("B"."SYS_NC00008$"='TAB$') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 6 consistent gets 0 physical reads 0 redo size 1648 bytes sent via SQL*Net to client 396 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
We notice the nested loop operation is no longer required. In fact, we don’t need to reference the smaller ALBUMS table at all as all the required information can now be obtained by using the bitmap-join index and direct accesses to the larger table. The number of consistent gets has therefore reduced from 10 down to just 6.
Note in our example, there is no actual Foreign Key (FK) constraint in the larger table (in a Data Warehouse, such constraints may not be necessary and/or get in the way). The bitmap-join index doesn’t require a FK constraint to be in place however it’s necessary that the column in the join condition referencing the detail table be Unique else there could be a many-to-many join condition which wouldn’t make sense when attempting to populate the bitmap-join index.
However, make one of the tables in the Bitmap-Join index an Index Organized Table (IOT), in this case the smaller detail table …
SQL> drop table albums; Table dropped. SQL> CREATE TABLE albums (album_id number primary key, album_details varchar2(30)) organization index; Table created. SQL> INSERT INTO albums SELECT rownum, substr(object_name,1,30) FROM dba_objects WHERE rownum <= 10000; 10000 rows created. SQL> commit; Commit complete. SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'ALBUMS', estimate_percent=> null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed. SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details) 2 FROM big_dwh_table b, albums a 3 WHERE b.album_id = a.album_id; CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details) * ERROR at line 1: ORA-25966: join index cannot be based on an index organized table
and we get the above error as prior to 11g R1, there was a restriction that no table within a Bitmap-Join index could be an Index Organized Table.
Now, if we run exactly the same demo but in an Oracle11g database:
SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details) FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id; Index created. SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$'; 100 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 950886520 -------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 125 | 1500 | 26 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE | 125 | 1500 | 26 (0)| 00:00:01 | | 2 | BITMAP CONVERSION TO ROWIDS| | | | | | |* 3 | BITMAP INDEX SINGLE VALUE | BIG_DWH_ALBUM_DETAILS_I | | | | | -------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("B"."SYS_NC00008$"='TAB$') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 6 consistent gets 0 physical reads 0 redo size 1648 bytes sent via SQL*Net to client 396 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
It all now works fine.
So since Oracle 11g R1, there’s one less reason not use Index Organized Tables and/or Bitmap-Join indexes in your Data Warehouse 🙂
Oracle11g IGNORE_ROW_ON_DUPKEY_INDEX Hint (Micro Cuts) December 20, 2010
Posted by Richard Foote in 11g, 11g New features, Oracle Indexes.19 comments
An interesting new hint was introduced in Oracle11g which provides an alternative approach when inserting data where duplicate values might be an issue.
To illustrate, I’m going to create a little table with just the 10 rows with a unique ID column containing values 1 – 10 policed by a Unique index:
SQL> create table radiohead (id number constraint radiohead_pk_i primary key using index (create unique index radiohead_pk_i on radiohead(id)), name varchar2(20)); Table created. SQL> select index_name, uniqueness, table_name from dba_indexes where index_name='RADIOHEAD_PK_I'; INDEX_NAME UNIQUENES TABLE_NAME ------------------------------ --------- ------------------------------ RADIOHEAD_PK_I UNIQUE RADIOHEAD SQL> insert into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 10; 10 rows created. SQL> commit; Commit complete.
If we now attempt to add 12 more rows, but including the values 1 – 10:
SQL> insert into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 12; insert into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 12 * ERROR at line 1: ORA-00001: unique constraint (BOWIE.RADIOHEAD_PK_I) violated
We obviously get a unique constraint violation error.
However, Oracle11g allows us to use the IGNORE_ROW_ON_DUPKEY_INDEX hint, which will silently deal with the unique constraint violation errors by simply ignoring and not inserting any row in which the unique values already exist in the table.
The hint comes in 2 formats, the first allows us to specify the unique index which contains the unique values to be ignored:
SQL> insert /*+ ignore_row_on_dupkey_index(radiohead,radiohead_pk_i) */ into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 12; 2 rows created. SQL> commit; Commit complete.
Note the 10 duplicate values (1 – 10) have been ignored and have not been inserted into the table but the values 11 and 12 which didn’t previously exist have successfully been inserted into the table.
Here we attempt to insert values 1 – 13 into the table, although now values 1 – 12 currently already exist. This time, we’ll use the second format of the hint which allows us to stipulate the column which contains unique values which are to be ignored if they already exist:
SQL> insert /*+ ignore_row_on_dupkey_index(radiohead(id)) */ into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 13; 1 row created. SQL> commit; Commit complete. SQL> select * from radiohead; ID NAME ---------- -------------------- 1 OK COMPUTER 2 OK COMPUTER 3 OK COMPUTER 4 OK COMPUTER 5 OK COMPUTER 6 OK COMPUTER 7 OK COMPUTER 8 OK COMPUTER 9 OK COMPUTER 10 OK COMPUTER 11 OK COMPUTER 12 OK COMPUTER 13 OK COMPUTER 13 rows selected.
Note in this case, the values 1 – 12 have all been silently ignored with just the value 13 inserted this time into the table.
You can’t however use this hint within an update statement …
SQL> update /*+ ignore_row_on_dupkey_index(radiohead,radiohead_pk_i) */ radiohead set id = 13 where id = 3; update /*+ ignore_row_on_dupkey_index(radiohead,radiohead_pk_i) */ radiohead set id = 13 where id = 3 * ERROR at line 1: ORA-38917: IGNORE_ROW_ON_DUPKEY_INDEX hint disallowed for this operation
Interesting difference in the behaviour of this hint. Usually “invalid” hints are just ignored and treated as comments but here if an illegal operation is attempted with the use of this “hint”, an error is invoked.
Going to now set up the same demo again, but this time police the Primary Key constraint via a Non-Unique index:
SQL> drop table radiohead; Table dropped. SQL> create table radiohead (id number constraint radiohead_pk_i primary key using index (create index radiohead_pk_i on radiohead(id)), name varchar2(20)); Table created. SQL> insert into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 10; 10 rows created. SQL> commit; Commit complete. SQL> select index_name, uniqueness, table_name from dba_indexes where index_name='RADIOHEAD_PK_I'; INDEX_NAME UNIQUENES TABLE_NAME ------------------------------ --------- ------------------------------ RADIOHEAD_PK_I NONUNIQUE RADIOHEAD SQL> insert /*+ ignore_row_on_dupkey_index(radiohead,radiohead_pk_i) */ into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 12; insert /*+ ignore_row_on_dupkey_index(radiohead,radiohead_pk_i) */ into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 12 * ERROR at line 1: ORA-38913: Index specified in the index hint is invalid SQL> insert /*+ ignore_row_on_dupkey_index(radiohead(id)) */ into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 13; insert /*+ ignore_row_on_dupkey_index(radiohead(id)) */ into radiohead select rownum, 'OK COMPUTER' from dual connect by level <= 13 * ERROR at line 1: ORA-38913: Index specified in the index hint is invalid
Note again an error is invoked here as well as this hint can only be applied via a constraint policed via a Unique index. A Non-Unique index, even with a Unique or PK constraint in place is not sufficient and will generate the above error.
Yet another reason to use Unique indexes to police constraints whenever possible and practical …
11g Virtual Columns and Fast Refreshable Materialized Views (What In The World) November 24, 2010
Posted by Richard Foote in 11g, 11g New features, Function Based Indexes, Oracle Bugs, Virtual Columns.15 comments
Previous to Oracle 11g Rel 2, two very common and useful features previously worked well together, they being fast refreshable materialized views and the introduction of virtual columns due to the creation of function-based indexes.
To illustrate, we create and populate a little demo table:
SQL> create table bowie (a number, b number, c number); Table created. SQL> insert into bowie select rownum, mod(rownum,10), mod(rownum,100) from dual connect by level <= 100000; 100000 rows created. SQL> commit; Commit complete. SQL> alter table bowie add primary key (a); Table altered.
We now create a simple little function-based index:
SQL> create index bowie_func_i on bowie(b+c); Index created.
If we look at the columns in the table via DBA_TAB_COLS:
SQL> select column_name, data_default, virtual_column, hidden_column from dba_tab_cols where table_name = 'BOWIE'; COLUMN_NAME DATA_DEFAULT VIR HID ------------ ------------ --- --- SYS_NC00004$ "B"+"C" YES YES C NO NO B NO NO A NO NO
We notice Oracle has introduced a new, hidden virtual column (SYS_NC00004$), required to store statistics for use by the Cost Based Optimizer.
Next we create a materialized view log on this table and a fast refreshable materialized view:
SQL> create materialized view log on bowie WITH PRIMARY KEY,SEQUENCE, ROWID (b,c) INCLUDING NEW VALUES; Materialized view log created. SQL> create materialized view bowie_mv 2 build immediate 3 refresh fast 4 with primary key 5 enable query rewrite 6 as 7 select b, count(*) from bowie group by b; Materialized view created.
Collect a few statistics and we note the Materialized View does indeed get used during a query rewrite scenario:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', estimate_percent=>null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1') PL/SQL procedure successfully completed. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_MV', estimate_percent=>null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1') PL/SQL procedure successfully completed. SQL> select b, count(*) from bowie having b > 3 group by b; B COUNT(*) ---------- ---------- 6 10000 4 10000 5 10000 8 10000 7 10000 9 10000 6 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 593592962 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7 | 42 | 2 (0)| 00:00:01 | |* 1 | MAT_VIEW REWRITE ACCESS FULL| BOWIE_MV | 7 | 42 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("BOWIE_MV"."B">3) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 538 bytes sent via SQL*Net to client 395 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 6 rows processed
And indeed the materialized view is fast refreshable:
SQL> insert into bowie values (100001, 5, 42); 1 row created. SQL> commit; Commit complete. SQL> exec dbms_mview.refresh('BOWIE_MV', 'F'); PL/SQL procedure successfully completed. SQL> select b, count(*) from bowie having b > 3 group by b; B COUNT(*) ---------- ---------- 6 10000 4 10000 5 10001 8 10000 7 10000 9 10000 6 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 593592962 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 7 | 42 | 2 (0)| 00:00:01 | |* 1 | MAT_VIEW REWRITE ACCESS FULL| BOWIE_MV | 7 | 42 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("BOWIE_MV"."B">3) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 546 bytes sent via SQL*Net to client 395 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 6 rows processed
Notice how the materialized view does indeed displayed the correct updated information via the query rewrite operation . So the materialized view behaved and worked as expected even though the underlining master table has a virtual column due to the creation of the function-based index (note that QUERY_REWRITE_INTEGRITY is set to STALE_TOLERATED)
Unfortunately, things go off the rails somewhat since Oracle 11g Rel 2 when a virtual column is introduced due to one of the 11g new features. For example, I now collect some Extended Statistics on this table:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', method_opt=> 'FOR COLUMNS (A,B,C) SIZE 254'); PL/SQL procedure successfully completed. SQL> select column_name, data_default, virtual_column, hidden_column from dba_tab_cols where table_name = 'BOWIE'; COLUMN_NAME DATA_DEFAULT VIR HID ------------------------------ --------------------------------- --- --- SYS_STUM4KJU$CCICS9C1UJ6UWC4YP SYS_OP_COMBINED_HASH("A","B","C") YES YES SYS_NC00004$ "B"+"C" YES YES C NO NO B NO NO A NO NO
Notice how extended statistics has resulted in another hidden virtual column (SYS_STUM4KJU$CCICS9C1UJ6UWC4YP) being created to store the resultant statistics.
However, if now attempt to perform a fast refresh on the Materialized View:
SQL> insert into bowie values (100002, 5, 42); 1 row created. SQL> commit; Commit complete. SQL> exec dbms_mview.refresh('BOWIE_MV', 'F'); BEGIN dbms_mview.refresh('BOWIE_MV', 'F'); END; * ERROR at line 1: ORA-12008: error in materialized view refresh path ORA-00904: "MAS$"."SYS_STUM4KJU$CCICS9C1UJ6UWC4YP": invalid identifier ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2558 ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2771 ORA-06512: at "SYS.DBMS_SNAPSHOT", line 2740 ORA-06512: at line 1
We get an error, complaining about the existence of this new virtual column.
If we attempted to drop and re-create the materialized view:
SQL> drop materialized view bowie_mv; Materialized view dropped. SQL> create materialized view bowie_mv 2 build immediate 3 refresh fast 4 with primary key 5 enable query rewrite 6 as 7 select b, count(*) from bowie group by b; select b, count(*) from bowie group by b * ERROR at line 7: ORA-12033: cannot use filter columns from materialized view log on "BOWIE"."BOWIE"
It fails, complaining that the materialized view log is somehow missing a filter column (which it isn’t). We get exactly the same set of issues if we add a visible virtual column via this new 11g capability:
SQL> create table bowie2 (a number, b number, c number, d as (a+b+c)); Table created. SQL> select column_name, data_default, virtual_column, hidden_column from dba_tab_cols where table_name = 'BOWIE2'; COLUMN_NAME DATA_DEFAULT VIR HID ------------ ------------ --- --- D "A"+"B"+"C" YES NO C NO NO B NO NO A NO NO SQL> insert into bowie2 (a,b,c) select rownum, mod(rownum,10), mod(rownum,100) from dual connect by level <= 100000; 100000 rows created. SQL> commit; Commit complete. SQL> alter table bowie2 add primary key (a); Table altered. SQL> create materialized view log on bowie2 WITH PRIMARY KEY,SEQUENCE, ROWID (b,c) INCLUDING NEW VALUES; Materialized view log created. SQL> create materialized view bowie2_mv 2 build immediate 3 refresh fast 4 with primary key 5 enable query rewrite 6 as 7 select b, count(*) from bowie2 group by b; select b, count(*) from bowie2 group by b * ERROR at line 7: ORA-12033: cannot use filter columns from materialized view log on "BOWIE"."BOWIE2"
Extended statistics and visible virtual columns are both potentially extremely useful new features introduced in 11g but unfortunately both can not be implemented on any table that needs to be fast refreshable within a complex materialized view.
I raised this issue with Oracle Support who have raised bug 10281402 as a result as it occurs in both 11.2.0.1 and 11.2.0.2 on various platforms I’ve tested.
Oracle11g: New Locking Modes When Policing FK Constraints (A Wolf at the Door) November 10, 2010
Posted by Richard Foote in 11g, Foreign Keys, Locking Issues, Oracle Indexes.18 comments
As I’ve been focusing mainly with Oracle 11g at work these days, thought I might look at a number of Oracle 11g related topics in the coming weeks.
To start with, there’s been a subtle but potentially significant change introduced in Oracle 11g (since 11.1.0.6) with regard to the manner in which locks are held in relation to policing Foreign Key constraints. The following has been tested on both 11.2.0.1 and 11.2.0.2.
To set the scene and replicate the issue we hit at work, I’m just going to create a little table (ALBUMS) that has 2 FK constraints pointing to two parent tables (ARTISTS and FORMATS) and populate them with a few rows.
SQL> CREATE TABLE artists (id NUMBER PRIMARY KEY, artist_name VARCHAR2(30)); Table created. SQL> CREATE TABLE formats (id NUMBER PRIMARY KEY, format_name varchar2(30)); Table created. SQL> CREATE TABLE albums (id NUMBER, album_name VARCHAR2(30), artist_id NUMBER CONSTRAINT artist_fk REFERENCES artists(id), format_id number CONSTRAINT format_fk REFERENCES formats(id)); Table created. SQL> INSERT INTO artists VALUES (1, 'DAVID BOWIE'); 1 row created. SQL> INSERT INTO artists VALUES (2, 'PINK FLOYD'); 1 row created. SQL> INSERT INTO formats VALUES (1, 'CD'); 1 row created. SQL> INSERT INTO formats VALUES (2, 'DVD'); 1 row created. SQL> INSERT INTO albums VALUES (1, 'LOW', 1, 1); 1 row created. SQL> INSERT INTO albums VALUES (2, 'DIAMOND DOGS', 1, 1); 1 row created. SQL> COMMIT; Commit complete.
OK, when running the following insert statement on the ARTISTS table in 10.2.0.3:
SQL> insert into artists values (3, 'MUSE'); 1 row created.
A check in the v$lock view will show the transaction holds a TM (DML Enqueue) lock in row-S (SS) mode 2 on the child ALBUMS table due to the FK relationship between these tables.
If another session were to either say delete a row or update the PK from the other parent FORMATS table:
SQL> update formats set id = 2 where id = 2; 1 row updated.
It will succeed with no problem for when it temporarily requires a TM share (S) mode 4 lock on the ALBUMS table, it can successfully grab it as the concurrent SS lock does not prevent this from occurring. It requires access to this mode 4 Share lock to ensure there are no transactions currently impacting the ALBUMS table that could potentially violate the constraint following the DML operations on the parent FORMATS table.
However, repeating the same exercise in Oracle 11g and we hit a subtle difference. When running the insert statement again in the ARTISTS table:
SQL> insert into artists values (3, 'MUSE'); 1 row created.
A check in the v$lock view will now show the transaction holds a TM (DML Enqueue) lock in row-X (SX) LMODE 3 on the child ALBUMS table, not a LMODE 2 SS level lock as it did in 10g. This is a “higher” level lock mode which has the following consequence on the other session now attempting to either delete or update the PK in the FORMATS table:
SQL> update formats set id = 2 where id = 2;
The session now hangs as it has to wait for the other session to release the DML Enqueue LMODE 3 SX lock before it can in turn grab the required TM mode 4 Share table lock it’s requesting. This is precisely the issue we hit with a somewhat poorly written application trying to perform something akin to the above series of updates from within two different sessions.
This change was introduced by Oracle to eliminate an ORA-600 issue that could occur when deleting a row from a table with a PK while rebuilding an associated FK index that referenced the PK.
However, introducing a more restrictive level of lock in this manner has the side-effect of increasing the likelihood of encountering new locking issues such as this, increasing the likelihood of hitting deadlock scenarios (as discussed here previously by Charles Hooper) and can therefore potentially reduce the overall concurrency capabilities of an application.
The “fix” in this case is to simply create an index on the formats_id FK column (which probably should exist anyways in this case to prevent locking issues on the child table when updating the parent FORMAT table):
SQL> CREATE INDEX albums_format_i on albums(format_id); Index created. SQL> insert into artists values (3, 'MUSE'); 1 row created.
In which case the table share lock is no longer required on the ALBUMS table (as Oracle can now use the associated index to effectively police the integrity of the child table following such an operation on a parent table) and the statement no longer hangs in the other session:
SQL> update formats set id = 2 where id = 2; 1 row updated.
This change in the locking behaviour of policing FK constraints is certainly something to be aware of when migrating to Oracle 11g if you potentially have FK constraints that don’t have associated indexes.
Index Block Dump: Block Header Part I (Editions Of You) July 20, 2010
Posted by Richard Foote in 11g, Block Dumps, Index Internals, Oracle Indexes.1 comment so far
I’ve previously looked at how to generate an Oracle block dump, time to now go into a little more detail.
As I mentioned, a block dump is a formatted representation of the actual contents of an Oracle block. Producing strategic block dumps can be an extremely useful method of determining what might be going on in Oracle under the covers and over the years I’ve found them to be an invaluable aid in helping me understand Oracle behaviour, troubleshoot issues, investigate the contents of corrupted blocks, etc. The focus in this series will be Oracle block dumps from the perspective of indexes, although the contents any Oracle block can be dumped and investigated.
To setup the demo, I’m going to initially create a simple little table and associated index that has only 3 rows to begin with. As the index is so tiny, all the contents can fit within the one index leaf block resulting in an index with a blevel of 0 (height of 1). Note I’m using a 11.1.0.6.0 database running on Windows for this specific demo. The actual format and content of a block dump differs between releases and continually changes. However, much of the useful content which I’ll focus on remains relatively consistent.
SQL> create table bowie (id number, name varchar2(20)); Table created. SQL> insert into bowie values (1, 'BOWIE'); 1 row created. SQL> insert into bowie values (2, 'ZIGGY'); 1 row created. SQL> insert into bowie values (3, 'MAJOR TOM'); 1 row created. SQL> commit; Commit complete. SQL> create index bowie_name_i on bowie(name); Index created. SQL> select index_name, blevel, leaf_blocks from dba_indexes where index_name = 'BOWIE_NAME_I'; INDEX_NAME BLEVEL LEAF_BLOCKS --------------- ---------- ----------- BOWIE_NAME_I 0 1
OK, as this is a simple blevel 0 index, all the contents of the index can be found in the block immediately following the index segment header.
SQL> select header_file, header_block from dba_segments where segment_name='BOWIE_NAME_I'; HEADER_FILE HEADER_BLOCK ----------- ------------ 8 84233
So the block we want to dump is the block in datafile 8, block 84234 (1 more than the segment header block). Before we dump the block though, let’s just note a few other points about this index. Firstly, given we know the datafile and block of interest, we can determine its relative data block address (rdba) by plugging in these details into the dbms_utility.make_data_block_address function:
SQL> select dbms_utility.make_data_block_address(8, 84234) from dual; DBMS_UTILITY.MAKE_DATA_BLOCK_ADDRESS(8,84234) --------------------------------------------- 33638666
We’ll note the rdba of 33638666 for future reference. The other thing we’ll just record is the Object ID of the index:
SQL> select object_id, data_object_id from dba_objects where object_name = 'BOWIE_NAME_I'; OBJECT_ID DATA_OBJECT_ID ---------- -------------- 115205 115205
Again, we’ll note an Object_Id and Data_Object_Id of 115205 for future reference. Finally, I’m just going to flush the buffer cache to ensure the current contents of the block is written to disk. This is useful in 11g when the block dump differentiates between the block contents in the buffer cache and on disk.
SQL> alter system flush buffer_cache; System altered.
OK, let’s now dump the block associated with this index leaf block:
SQL> alter system dump datafile 8 block 84234; System altered.
Below is just the block header portion of the resultant block dump:
Start dump data blocks tsn: 8 file#:8 minblk 84234 maxblk 84234
Block dump from cache:
Dump of buffer cache at level 4 for tsn=8, rdba=33638666
BH (0x17BF3D8C) file#: 8 rdba: 0x0201490a (8/84234) class: 1 ba: 0x17A70000
set: 3 bsz: 8192 bsi: 0 sflg: 0 pwc: 0 lid: 0x00000000,0x00000000
dbwrid: 0 obj: 115205 objn: 115205 tsn: 8 afn: 8
hash: [0x292CCE8C,0x15BFB42C] lru: [0x167EA1EC,0x15BFB48C]
lru-flags: on_auxiliary_list
ckptq: [NULL] fileq: [NULL] objq: [NULL]
st: FREE md: NULL tch: 0
flags:
cr pin refcnt: 0 sh pin refcnt: 0
Buffer contents not dumped
Block dump from disk:
buffer tsn: 8 rdba: 0x0201490a (8/84234)
scn: 0x0000.02d11215 seq: 0x01 flg: 0x04 tail: 0x12150601
frmt: 0x02 chkval: 0x9ae6 type: 0x06=trans data
Hex dump of block: st=0, typ_found=1
Dump of memory from 0x0BF39A00 to 0x0BF3BA00
BF39A00 0000A206 0201490A 02D11215 04010000 […..I……….]
BF39A10 00009AE6 001A0002 0001C205 02D11214 […………….]
BF39A20 1FE80000 00021F02 00000000 00000000 […………….]
BF39A30 00000000 00000000 00000000 00000000 […………….]
BF39A40 00000000 0000FFFF 00000000 00000000 […………….]
… Snip rest of memory dump …
As I mentioned, the actual format and content of a block dump differs between releases and continually changes so your formatted block dump may differ somewhat. However, the main points which I’ll discuss should be found in most currently supported versions of Oracle. This post should only be considered as a basic introduction on the subject. More depth and details to come.
As this is an 11g block dump, the block header consists of three distinct sections:
1) Dump of the buffer cache details associated with the index block
2) Dump of index block from disk
3) Full raw hex dump of the associated block
The first thing to check is that one is looking at the correct block dump and that the correct segment block was indeed dumped. The start of the formated block dump states the details of the dumped block(s):
file#:8 minblk 84234 maxblk 84234
so indeed, we’re looking at the correct dump file. These details are also listed along with the hex representation of relative block address (rdba) information:
file#: 8 rdba: 0x0201490a (8/84234)
and we can also confirm that the rdba is also correct and consistent with the block we have dumped:
rdba=33638666
as this matches the results of the rdba from the rdba as previously displayed via the MAKE_DATA_BLOCK_ADDRESS function.
We can also confirm that the correct index segment was dumped as the object id of the index:
obj: 115205 objn: 115205
matches the data_object_id and object_id from dba_objects as listed previously.
Other details of interest found in block dumps in most Oracle versions include:
scn: 0x0000.02d11215 – The System Change Number (SCN) of the block when it was last modified (I’ll show you how this gets populated in a later post). This is the effective “timestamp” of the block and is used by Oracle to determine crucial information such as effectively “when” the block was last modified and by “what” transaction.
type: 0x06=trans data – denotes the type of block. 06 represents a transactional block (table/index/cluster)
seq: 0x01 – number of the block changes within the the current scn (we’ll see how this increments in a later post)
tail: 0x12150601 – which consists of the last 2 bytes of the SCN, type and seq
frmt: 0x02 – denotes the block format with 02 representing a post Oracle8 block format. An A2 block (as shown in the raw block dump) denotes a post 10g block format.
chkval: 0x9ae6 – checksum value of block as used by Oracle in part to check the consistency and validity of the block
In later versions of Oracle (10g and beyond), the block dump includes a full hex dump of the associated memory buffer. I’ve snipped most of this in the above dump extract. However, one can see where the details of the block I’ve listed above can be found within the memory dump (hopefully, the colour code will help to highlight where each distinct piece of information can be found). Note also in 11g and beyond, more details of the buffer cache are listed and detailed as defined in the buffer cache section.
I’ll look at the following portion of the index block header, the Interested Transaction Slots, in the next post. We’ve only just begun …
Visible Invisible Indexes (The Invisible Band) November 20, 2008
Posted by Richard Foote in 11g, Invisible Indexes.9 comments
After spending the last few weeks traveling throughout Europe, it’s about time I wrote a new post on the Blog !!
I’ve previously posted a simple demo on Invisible Indexes, a new 11g feature that allows you to quickly make an index “invisible” to the CBO.
Recently, a comment on the above post by Michael Sorensen mentions a nice post by Christian Antognini in which he demonstrates how so-called Invisible Indexes on Foreign Keys can still be used by Oracle to prevent locking and performance related issues when delete/update operations are performed on the parent records.
I previously posted how Index Monitoring on FK indexes doesn’t pick up the fact so-called “Unused Indexes” can actually be “used” by Oracle for exactly this reason, quickly determining whether there’s an existing FK record that needs to prevent the update or delete operation on the parent record from proceeding.
I also previously discussed how the 11g CBO can now use index statistics to determine the appropriate cardinality of a query and how Index Monitoring doesn’t detect an index being “used” in this context as well.
It should therefore come as no surprise that Invisible Indexes might not be so invisible after all when it comes to the CBO potentially using the index statistics of an Invisible Index to determine the correct cardinality of a query.
This demo on Invisible Indexes and Index Statistics clearly shows that Oracle can still use the statistics of an index, even after they’ve been made invisible, to determine a more accurate cardinality for a query.
This means you need to be very careful how you use and apply Invisible Indexes and have a clear understanding on how Invisible Indexes are implemented by Oracle and the purpose for their use.
As I explain in my Indexing Seminar, Invisible Indexes are primarily intended to be applied when an application has suddenly and inexplicably started to use an index inappropriately and the database has significant performance issues as a result. The scenario is such that the performance of the database or application is so bad, it has caused the drain of resources to such an extent that normal business functions can’t continue and drastic action is required to address the issue.
The drastic action is to make the problematic index invisible so that the CBO can no longer see and use the index within an execution plan. This is drastic because it means other queries that could currently be using the index appropriately will also be prevented from using the index but performance is so bad that such a consequence can’t make matters any worse. By making the index invisible at least the resource drain can be prevented and hopefully most of the other business activities can now continue within the database.
By making the index invisible rather than simply dropping it, it means some sessions can investigate what the problem might be that’s causing the inappropriate use of the index (by altering sessions to OPTIMIZER_USE_INVISIBLE_INDEXES = true). Once the issue has been solved and addressed, the index can quickly be made visible again, rather than having to recreate the index from scratch.
Invisible indexes could prove to be a handy option in such scenarios.
Invisible indexes are also often viewed as being a useful method of determining whether it’s safe to drop an existing index that may not actually be used by an application. An unused index is a waste of both storage and resources in maintaining the thing but it’s often difficult to determine what the ramifications might be if an index were to be dropped. By making an index invisible, it may be possible to “test the waters” so to speak and determine how an application may suddenly behave without the index.
However, making an index invisible may not necessarily provide an accurate determination of the ramifications of dropping such an index because as has already been demonstrated, there are a number of important and key areas in which invisible indexes are still being used by Oracle. Dropping such an index is not the same as simply making an index invisible and may result in the application suddenly behaving in an unexpected manner that could cause significant performance issues.
Invisible indexes are not entirely invisible if you see what I mean 😉
Constraints With Oracle Total Recall (Remember A Day) October 7, 2008
Posted by Richard Foote in 11g, Constraints, Flashback, Total Recall.15 comments
11g has many new and potentially useful features which I’m slowly investigating and evaluating.
One feature that has the potential to be particularly useful where I work is “Oracle Total Recall“. It provides the facility to automatically capture all changes to any tables you choose, in a “secure, tamper proof” manner. This enables all changes to be easily tracked and viewed for historical, auditing, security control, etc. purposes.
Thought I might “introduce” Total Recall to those of you that may not have seen it in action and to offer an initial word of warning of a little trap one can easily fall into …
One nice aspect with it all is that it’s very easy to setup and configure.
To begin with, you need to create a storage area called a “flashback data archive” where all the historical changes can be stored, e.g.:
SQL> CREATE FLASHBACK ARCHIVE bowie_archive
TABLESPACE total_recall_ts QUOTA 500m RETENTION 1 YEAR;
You simply specify the tablespace you wish to use, how much space within the tablespace you wish to allocate for archiving purposes and a retention period to specify how long to keep the archived data.
To create a table with the “Total Recall” change tracking capability, simply specify which flashback archive area to use:
SQL> CREATE TABLE bowie_recall (id number, text varchar2(20), hist_date date) FLASHBACK ARCHIVE bowie_archive;
Table created.
All changes to the table are now automatically tracked and archived and can easily be accessed as necessary. To illustrate, I’m just going to simply insert a row into the table.
SQL> insert into bowie_recall values (1, ‘Bowie’, sysdate);
1 row created.
SQL> commit;
Commit complete.
Sometime later, I perform an update on the row:
SQL> update bowie_recall
2 set text = ‘Ziggy’, hist_date = sysdate
3 where id = 1;
1 row updated.
SQL> commit;
Commit complete.
OK, so the current row basically looks like thus:
SQL> select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date
from bowie_recall;
ID TEXT HIST_DATE ---- -------- ------------------- 1 Ziggy 06-10-2008 10:09:55
However, if I want to see what the row looked like at a previous point in time, I can use the AS OF TIMESTAMP clause such as this:
SQL> select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date
from bowie_recall
as of timestamp to_timestamp(‘2008-10-06 10:07:00’,
‘yyyy-mm-dd hh24:mi:ss’)
where id = 1;
ID TEXT HIST_DATE ---- -------- ------------------- 1 Bowie 06-10-2008 10:06:15
This is what the row looked like when it was initially inserted.
If I want to look at all the different versions of the row between any specific times of interest, I can write something such as this using the VERSIONS BETWEEN clause:
SQL> select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date
from bowie_recall
versions between timestamp
to_timestamp(‘2008-10-06 10:07:00’, ‘yyyy-mm-dd hh24:mi:ss’)
and sysdate
where id = 1;
ID TEXT HIST_DATE ---- -------- ------------------- 1 Ziggy 06-10-2008 10:09:55 1 Bowie 06-10-2008 10:06:15
It lists all versions of the rows between the two points of time …
Like I said, very easy to setup and very easy to subsequently query.
One of the disadvantages however of enabling the tracking of changes to a table is that it prevents a number of subsequent operations on the table. For example, you can’t simply drop a column:
SQL> alter table bowie_recall drop column status;
alter table bowie_recall drop column status
*
ERROR at line 1:
ORA-55610: Invalid DDL statement on history-tracked table
You also for example can’t modify a column to say add a NOT NULL constraint:
SQL> alter table bowie_recall modify text not null;
alter table bowie_recall modify text not null
*
ERROR at line 1:
ORA-55610: Invalid DDL statement on history-tracked table
Now the trap …
Oracle does however allow you to add any other type of constraint, such as say a Primary Key, without complaint:
SQL> alter table bowie_recall add primary key (id);
Table altered.
All well and good, except for when you want to now retrieve some of the historical changes from the table:
SQL> select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date from bowie_recall versions between timestamp to_timestamp(‘2008-10-06 10:07:00’, ‘yyyy-mm-dd hh24:mi:ss’) and sysdate where id = 1;
select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date from bowie_recall versions between timestamp to_timestamp(‘2008-10-06 10:07:00’, ‘yyyy-mm-dd hh24:mi:ss’) and sysdate where id = 1
*
ERROR at line 1:
ORA-01466: unable to read data – table definition has changed
Remember, one of the reasons for implementing Total Recall on a table is to implement a “tamper proof” data tracking solution and yet the addition of any constraint on a table (Primary Key, Unique Key, Foreign Key or Check Constraint) will cause you to be unable to automatically read the historical data, without directly accessing the underlining SYS_FBA_ archive tables.
Be careful with constraint management and the historical tracking of tables …
Total Recall is of course all relatively new and these sorts of early glitches are to be expected. More on Total Recall later, including possible performance implications when both modifying and querying historical data.