jump to navigation

12.1.0.2 Introduction to Zone Maps Part II (Changes) October 30, 2014

Posted by Richard Foote in 12c, Exadata, Oracle Indexes, Zone Maps.
add a comment

In Part I, I discussed how Zone Maps are new index like structures, similar to Exadata Storage Indexes, that enables the “pruning” of disk blocks during accesses of the table by storing the min and max values of selected columns for each “zone” of a table. A Zone being a range of contiguous (8M) blocks.

I showed how a Zone Map was relatively tiny but very effective in reducing the number of consistent gets for a well clustered column (ALBUM_ID).

In this post, we’re going to continue with the demo and look at what happens when we update data in the table with a Zone Map in place.

So lets update the ALBUM_ID column (which currently has a Zone Map defined) for a few rows. The value of ALBUM_ID was previously 1 for all these rows (the full range of values is currently between 1 and 100) but we’re going to update them to 142:

SQL> update big_bowie set album_id=142 where id between 1 and 100;

100 rows updated.

SQL> commit;

Commit complete.

 

So the maximum value of ALBUM_ID is now 142, not 100. If we look at the maximum value as currently listed in the Zone Map:

 

SQL> select max(max_1_album_id) from  big_bowie_album_id_zm;

MAX(MAX_1_ALBUM_ID)
-------------------
100

 

We notice the maximum is still defined as being 100. So the update on the table has not actually updated the contents of the Zone Map. So this is a big difference between Zone Maps and conventional indexes, indexes are automatically updated during DML operations, Zone Maps are not (unless the REFRESH ON COMMIT option is specified).

If we look at the state of Zone Map entries that have a minimum of 1 (the previous values of ALBUM_ID before the update):

SQL> select * from big_bowie_album_id_zm where min_1_album_id = 1;

ZONE_ID$ MIN_1_ALBUM_ID MAX_1_ALBUM_ID ZONE_LEVEL$ ZONE_STATE$ ZONE_ROWS$
---------- -------------- -------------- ----------- ----------- ----------
3.8586E+11              1              2           0           0      66234
3.8586E+11              1              2           0           1      65787
3.8586E+11              1              2           0           0      66223

 

We notice that one of the entries has a status of 1, meaning that a specific zone has been marked as stale. However, all the other zones are still OK.

If we look at the status of the overall Zone Map:

SQL> select zonemap_name, pruning, refresh_mode, invalid, stale, unusable
from dba_zonemaps where zonemap_name='BIG_BOWIE_ALBUM_ID_ZM';

ZONEMAP_NAME              PRUNING  REFRESH_MODE      INVALID STALE   UNUSABLE
------------------------- -------- ----------------- ------- ------- --------
BIG_BOWIE_ALBUM_ID_ZM     ENABLED  LOAD DATAMOVEMENT NO      NO      NO

 

We notice that the Zone Map is still “hunky dory” after the update.

If we now re-run the query we ran in Part I:

 

SQL> select * from big_bowie where album_id = 42;

100000 rows selected.

Elapsed: 00:00:00.29

Execution Plan
----------------------------------------------------------
Plan hash value: 1980960934

----------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |           |   100K|  8984K|  3269  (12)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL WITH ZONEMAP| BIG_BOWIE |   100K|  8984K|  3269  (12)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ALBUM_ID"=42)
filter(SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN (zm."MIN_1_ALBUM_ID" > :1 OR
zm."MAX_1_ALBUM_ID" < :2) THEN 3 ELSE 2 END END FROM "BOWIE"."BIG_BOWIE_ALBUM_ID_ZM" zm
WHERE zm."ZONE_LEVEL$"=0 ORDER BY zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),42,42)<3 AND "ALBUM_ID"=42)
Statistics
----------------------------------------------------------
141  recursive calls
0  db block gets
3238  consistent gets
0  physical reads
0  redo size
3130019  bytes sent via SQL*Net to client
761  bytes received via SQL*Net from client
21  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
100000  rows processed

 

We see the Zone Map was still used by the CBO. The number of consistent gets has increased (up from 2364 to 3238) as we now have to additional access all the blocks associated with this stale zone, but it’s still more efficient that reading all the blocks from the entire table.

If we want to remove the stale zone entries, we can refresh the Zone Map or rebuild it (for ON DEMAND refresh):

 

SQL> alter materialized zonemap big_bowie_album_id_zm rebuild;

Materialized zonemap altered.

 

If we now look at the Zone Map entry:

 

SQL> select * from big_bowie_album_id_zm where min_1_album_id = 1;

ZONE_ID$ MIN_1_ALBUM_ID MAX_1_ALBUM_ID ZONE_LEVEL$ ZONE_STATE$ ZONE_ROWS$
---------- -------------- -------------- ----------- ----------- ----------
3.8586E+11              1              2           0           0      66234
3.8586E+11              1            142           0           0      65787
3.8586E+11              1              2           0           0      66223

 

We see that the entry is no longer stale and now correctly reflects the actual maximum value within the zone (142).

If we now re-run the query:

SQL> select * from big_bowie where album_id = 42;

100000 rows selected.

Elapsed: 00:00:00.30

Execution Plan
----------------------------------------------------------
Plan hash value: 1980960934

----------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |           |   100K|  8984K|  3269  (12)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL WITH ZONEMAP| BIG_BOWIE |   100K|  8984K|  3269  (12)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ALBUM_ID"=42)
filter(SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN (zm."MIN_1_ALBUM_ID" > :1 OR
zm."MAX_1_ALBUM_ID" < :2) THEN 3 ELSE 2 END END FROM "BOWIE"."BIG_BOWIE_ALBUM_ID_ZM" zm
WHERE zm."ZONE_LEVEL$"=0 ORDER BY zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),42,42)<3 AND "ALBUM_ID"=42)
Statistics
----------------------------------------------------------
141  recursive calls
0  db block gets
3238  consistent gets
0  physical reads
0  redo size
3130019  bytes sent via SQL*Net to client
761  bytes received via SQL*Net from client
21  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
100000  rows processed

 

We notice nothing has appreciably changed, the Zone Map is still being used but the number of consistent gets remains the same as before. Why haven’t we returned back to our previous 2364 consistent gets ?

Well, as the range of possible values within the updated zone is now between 1 and 142, the required value of 42 could potentially be found within this zone and so still needs to be accessed just in case. We know that the value of 42 doesn’t exist within this zone, but Oracle has no way of knowing this based on the possible 1 to 142 range.

Hence Zone Maps work best when the data is well clustered and the Min/Max ranges of each zone can be used to limit which zones need to be accessed. If the data was not well clustered and the values within each zone mostly had ranges between the min and max values, then Oracle wouldn’t be able to effectively prune many/any zone and the Zone Map would be useless.

As we’ll see in Part III :)

Index Compression Part VI: 12c Index Advanced Compression Block Dumps (Tumble and Twirl) October 9, 2014

Posted by Richard Foote in 12c, Advanced Index Compression, Block Dumps, Index Compression, Oracle Indexes.
4 comments

Sometimes, a few pictures (or in this case index block dumps) is better than a whole bunch of words :)

In my previous post, I introduced the new Advanced Index Compression feature, whereby Oracle automatically determines how to best compress an index. I showed a simple example of an indexed column that had sections of index entries that were basically unique (and so don’t benefit from compression) and other sections with index entries that had many duplicates (that do compress well). Advanced Index Compression enables Oracle to automatically just compress those index leaf blocks where compression is beneficial.

If we look at a couple of partial block dumps from this index, first a dump from a leaf block that did have duplicate index entries:

Leaf block dump
===============
header address 216542820=0xce82e64
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0xa0: opcode=0: iot flags=-C- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 651
kdxcofbo 1346=0x542
kdxcofeo 2172=0x87c
kdxcoavs 826
kdxlespl 0
kdxlende 0
kdxlenxt 25166046=0x18000de
kdxleprv 25166044=0x18000dc
kdxledsz 0
kdxlebksz 8036
kdxlepnro 1
kdxlepnco 1 (Adaptive)
prefix row#0[8031] flag: -P—–, lock: 0, len=5
col 0; len 2; (2):  c1 2b
prc 651
row#0[8022] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 5c
psno 0
row#1[8013] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 5d
psno 0
row#2[8004] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 5e
psno 0
row#3[7995] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 5f
psno 0
row#4[7986] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 86 00 60
psno 0

row#650[2172] flag: ——-, lock: 0, len=9
col 0; len 6; (6):  01 80 1e 8d 00 10
psno 0
—– end of leaf block Logical dump —–

The red section is a portion of the index header that determines the number of rows in the prefix table of the index (kdxlepnro 1). The prefix table basically lists all the distinct column values in the leaf blocks that are to be compressed. The value 1 denotes there is actually only just the 1 distinct column value in this specific leaf block (i.e. all index entries have the same indexed value). This section also denotes how many of the indexed columns are to be compressed (kdxlepnco 1). As this index only has the one column, it also has a value of 1. Note this value can potentially be anything between 0 (no columns compressed) up to the number of columns in the index.  The (Adaptive) reference tells us that Index Advanced Compression has been used and that the values here can change from leaf block to leaf block depending on the data characteristics of the index entries within each leaf block (a dump of a basic compressed index will not have the “Adaptive” reference).

The green section is the compression prefix table and details all the unique combinations of index entries to be compressed within the leaf block. As all indexed values are the same in this index (value 42, internally represented as c1 2b hex), the prefix table only has the one row. prc 651 denotes that all 651 index entries in this leaf block have this specific indexed value.

Next follows all the actual index entries, which now only consist of the rowid (the 6 byte col 0 column) as they all reference psno 0, which is the unique row id of the only row within the prefix table (row#0).

So rather than storing the indexed value 651 times, we can just store the index value (42) just the once within the prefix table and simply reference it from within the actual index entries. This is why index compression can save us storage, storing something once within a leaf block rather than multiple times.

If we now look at a partial block dump of another index leaf block within the index, that consists of many differing (basically unique) index entries:

Leaf block dump
===============
header address 216542820=0xce82e64
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0xa0: opcode=0: iot flags=-C- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 449
kdxcofbo 938=0x3aa
kdxcofeo 1754=0x6da
kdxcoavs 816
kdxlespl 0
kdxlende 0
kdxlenxt 25168667=0x1800b1b
kdxleprv 25168665=0x1800b19
kdxledsz 0
kdxlebksz 8036
kdxlepnro 0
kdxlepnco 0 (Adaptive)
row#0[8022] flag: ——-, lock: 0, len=14
col 0; len 4; (4):  c3 58 3d 2c
col 1; len 6; (6):  01 80 12 e6 00 41
row#1[8008] flag: ——-, lock: 0, len=14
col 0; len 4; (4):  c3 58 3d 2d
col 1; len 6; (6):  01 80 12 e6 00 42
row#2[7994] flag: ——-, lock: 0, len=14
col 0; len 4; (4):  c3 58 3d 2e
col 1; len 6; (6):  01 80 12 e6 00 43


row#448[1754] flag: ——-, lock: 0, len=14
col 0; len 4; (4):  c3 58 41 5c
col 1; len 6; (6):  01 80 12 ee 00 1d
—– end of leaf block Logical dump —–

We notice that in the red section, both kdxlepnro 0 and kdxlepnco 0 (Adaptive) have a value of 0, meaning we have no rows and no columns within the prefix table. As such, we have no prefix table at all here and that this leaf block has simply not been compressed.

If we look at the actual index entries, they all have an additional column now in blue, that being the actual indexed value as all the index values in this leaf block are different from each other. Without some form of index entry duplication, there would be no benefit from compression and Index Advanced Compression has automatically determined this and not bothered to compress this leaf block. An attempt to compress this block would have actually increased the necessary overall storage for these index entries, due to the additional overheads associated with the prefix table (note it has an additional 2 byes of overhead per row within the prefix table).

I’ll next look at an example of a multi-column index and how Index Advanced Compression handles which columns in the index to compress.

Index Compression Part V: 12c Advanced Index Compression (Little Wonder) October 2, 2014

Posted by Richard Foote in 12c, Advanced Index Compression, Index Compression, Oracle Indexes.
add a comment

I’ve finally managed to find some free time in the evening to write a new blog piece :)

This will have to be the record for the longest time between parts in a series, having written Part IV of this Index Compression series way way back in February 2008 !! Here are the links to the previous articles in the series:

Index Compression Part I (Low)

Index Compression Part II (Down Is The New Up)

Index Compression Part III (2+2=5)

Index Compression Part IV (Packt Like Sardines In a Crushd Tin Box)

 

As I’ve previously discussed, compressing an index can be an excellent way to permanently reduce the size of an index in a very cost effective manner. Index entries with many duplicate values (or duplicate leading columns within the index) can be “compressed” by Oracle to reduce both storage overheads and potentially access overheads for large index scans. Oracle basically de-duplicates repeated indexed column values within each individual leaf block by storing each unique occurrence in a prefix section within the block, as I explain in the above links.

But it’s important to compress the right indexes in the right manner. If indexes do not have enough repeated data, it’s quite possible to make certain indexes larger rather than smaller when using compression (as the overheads of having the prefix section in the index block outweighs the benefits of limited reduction of repeated values). So one needs to be very selective on which indexes to compress and take care to compress the correct number of columns within the index. Oracle will only protect you from yourself if you attempt to compress all columns in a unique index, as in this scenario there can be no duplicate values to compress. This is all discussed in Part II and Part III of the series.

So, wouldn’t it be nice if Oracle made it all a lot easier for us and automatically decided which indexes to compress, which columns within the index to compress and which indexes to simply not bother compressing at all. Additionally, rather than an all or nothing approach in which all index leaf blocks are compressed in the same manner, wouldn’t it be nice if Oracle decided for each and every individual leaf block within the index how to best compress it. For those index leaf block that have no duplicate entries, do nothing, for those with some repeated columns just compress them and for those leaf blocks with lots of repeated columns and values to compress all of them as efficiently as possible.

Well, wish no more :)

With the recent release of Oracle Database 12.1.0.2, one of the really cool new features that got introduced was Advanced Index Compression. Now a warning from the get-go. The use of Advanced Index Compression requires the Advanced Compression Option and this option is automatically enabled with Enterprise Edition. So only use this feature if you are licensed to do so :)

The best way as always to see this new feature in action is via a simple little demo.

To begin, I’ll create a table with a CODE column that is populated with unique values:

SQL> create table bowie (id number, code number, name varchar2(30));

Table created.

SQL> insert into bowie select rownum, rownum, 'ZIGGY STARDUST' from dual connect by level <= 1000000;

1000000 rows created.

I’ll now create a section of data within the table in which we have many repeated values:

SQL> update bowie set code = 42 where id between 250000 and 499999;

250000 rows updated.

SQL> commit;

Commit complete.

So I’ve fabricated the data such that the values in the CODE column are effectively unique within 75% of the table but the other 25% consists of repeated values.

From an index compression perspective, this index really isn’t a good candidate for normal compression as most of the CODE data contains unique data that doesn’t compress. However, it’s a shame that we can’t easily just compress the 25% of the index that would benefit from compression (without using partitioning or some such).

If we create a normal B-Tree index on the CODE column without compression:

SQL> create index bowie_code_i on bowie(code);

Index created.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME           LEAF_BLOCKS COMPRESSION
-------------------- ----------- -------------
BOWIE_CODE_I                2157 DISABLED

 

We notice the index consists of 2157 leaf blocks.

If we now try to use normal compression on the index:

SQL> alter index bowie_code_i rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME           LEAF_BLOCKS COMPRESSION
-------------------- ----------- -------------
BOWIE_CODE_I                2684 ENABLED

 

We notice that the compressed index rather than decrease in size has actually increased in size, up to 2684 leaf blocks. So the index has grown by some 25% due to the fact the index predominately contains unique values which don’t compress at all and the resultant prefix section in the leaf blocks becomes nothing more than additional overhead. The 25% section of the index containing all the repeated values has indeed compressed effectively but these savings are more than offset by the increase in size associated with the other 75% of the index where the index entries had no duplication.

However, if we use the new advanced index compression capability via the COMPRESS ADVANCED LOW clause:

SQL> alter index bowie_code_i rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, compression from user_indexes where table_name='BOWIE';

INDEX_NAME           LEAF_BLOCKS COMPRESSION
-------------------- ----------- -------------
BOWIE_CODE_I                2054 ADVANCED LOW

 

We notice the index has now indeed decreased in size from the original 2157 leaf blocks down to 2054. Oracle has effectively ignored all those leaf blocks where compression wasn’t viable and compressed just the 25% of the index where compression was effective. Obviously, the larger the key values (remembering the rowids associated with the index entries can’t be compressed) and the larger the percentage of repeated data, the larger the overall compression returns.

With Advanced Index Compression, it’s viable to simply set it on for all your B-Tree indexes and Oracle will uniquely compress automatically each individual index leaf block for each and every index as effectively as it can for the life of the index.

12.1.0.2 Introduction to Zone Maps Part I (Map Of The Problematique) September 3, 2014

Posted by Richard Foote in Uncategorized.
7 comments

Zone Maps are new index-like structures that enables the “pruning” of disk blocks during accesses of the table by storing the min and max values of selected columns for each “zone” of a table. A zone is simply a range of contiguous blocks within a table. Zone Maps are similar in concept to Exadata storage indexes but are physical objects that can be explicitly controlled and maintained. Additionally, in a similar manner to Bitmap-Join indexes, they can also be defined on a table with an outer join to other tables, with the Zone Map attributes based on columns from these joined tables. They can also be created as part of the new Attribute Clustering capabilities I introduced in my last post.

Before getting too excited though, Zone Maps unfortunately requires the Partitioning Option AND either Exadata or SuperCluster. If you’re still interested, read on :)

I’ll start by creating and populating my BIG_BOWIE table. For this specific post, the key thing to note here is that the ALBUM_ID column is very well clustered.

SQL> create table big_bowie (id number not null, album_id number not null, artist_id number not null, format_id number,

release_date date, total_sales number, description varchar2(100));

Table created.

SQL> create sequence bowie_seq order;

Sequence created.

SQL> create or replace procedure pop_big_bowie as
  begin
    for v_album_id in 1..100 loop
        for v_artist_id in 1..100000 loop
            insert into big_bowie values (bowie_seq.nextval, v_album_id, v_artist_id, ceil(dbms_random.value(0,5)) * 2,
     trunc(sysdate-ceil(dbms_random.value(0,10000))), ceil(dbms_random.value(0,500000)), 'THE RISE AND FALL OF ZIGGY

STARDUST AND THE SPIDERS FROM MARS');
        end loop;
          commit;
    end loop;
    commit;
end;
/

Procedure created.

SQL> exec pop_big_bowie

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'BIG_BOWIE', method_opt=>'FOR ALL COLUMNS SIZE 1');

 

 

I’ll next create a standard b-tree index on the ALBUM_ID column:

SQL> create index dwh_bowie_album_id_i on dwh_bowie(album_id);

Index created.

If I run the following query looking for just specific value of ALBUM_ID (1% of the data):

SQL> select * from big_bowie where album_id=42;

100000 rows selected.

Elapsed: 00:00:00.29

Execution Plan
----------------------------------------------------------
Plan hash value: 1830705794

------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      |   100K|  8984K|  1554   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| BIG_BOWIE            |   100K|  8984K|  1554   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | BIG_BOWIE_ALBUM_ID_I |   100K|       |   201   (2)| 00:00:01 |
------------------------------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ALBUM_ID"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       1590  consistent gets
          0  physical reads
          0  redo size
    9689464  bytes sent via SQL*Net to client
        760  bytes received via SQL*Net from client
         21  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed

The CBO used the index because of the excellent clustering of the data and performed a relatively low 1590 consistent gets (note arraysize was set to 5000).

If I make the index invisible and re-run the query a number of times:

SQL> alter index big_bowie_album_id_i invisible;

Index altered.

SQL> select * from big_bowie where album_id=42;

100000 rows selected.

Elapsed: 00:00:00.29

Execution Plan
----------------------------------------------------------
Plan hash value: 469213804

---------------------------------------------------------------------------------------
| Id  | Operation                 | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |           |   100K|  8984K|  3269  (12)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL| BIG_BOWIE |   100K|  8984K|  3269  (12)| 00:00:01 |
---------------------------------------------------------------------------------------


Predicate Information (identified by operation id):
---------------------------------------------------

   1 - storage("ALBUM_ID"=42)
       filter("ALBUM_ID"=42)


Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
     135085  consistent gets
     135068  physical reads
          0  redo size
    3130019  bytes sent via SQL*Net to client
        760  bytes received via SQL*Net from client
         21  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
     100000  rows processed

 

We notice execution times are almost identical despite the Full Table Scan (FTS), as Exadata Storage Indexes (SI) have kicked in. With the data being so well clustered, the SI has been able to very effectively “prune” the physical blocks that need to be accessed. This is clearly highlighted by looking at the sessions statistics of a session that performed one of these queries:

SQL> select name , value/1024/1024 MB
from v$statname n,  v$mystat s
where n.statistic# = s.statistic# and
n.name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index'); 

NAME                                                                     MB
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index                    1038.32813
cell physical IO interconnect bytes returned by smart scan       9.56008911

The vast majority of the table has actually been skipped as a result of the storage indexes.

I’m now going to create a very basic version of a Zone Map, one based just on this ALBUM_ID column:

SQL> create materialized zonemap big_bowie_album_id_zm on big_bowie(album_id);

Materialized zonemap created.

Zone Maps are implemented in a similar manner to Materialized Views and are separate table structures that can be refreshed in a similar manner to how Materialized Views are refreshed. This is an important point and a key difference between Zone Maps and conventional indexes in that depending on the refresh property of the Zone Map, may not be updated during DML operations. The default refresh property is REFRESH ON LOAD DATA MOVEMENT, which means the Zone Map in part becomes “stale” after DMLs.

Zone Maps basically store the min and max of the corresponding column(s) within each “zone” of the table. If we have a look at the contents of the Zone Map (below listing in part):

SQL> select zone_id$, min_1_album_id, max_1_album_id, zone_rows$ 

from big_bowie_album_id_zm;

       ZONE_ID$ MIN_1_ALBUM_ID MAX_1_ALBUM_ID ZONE_ROWS$
--------------- -------------- -------------- ----------
   385855025152              1              2      66234
   385855029250              5              6      56715
   385855029251              7              7      76562
   385855025155              7              8      76632
   385855004675              8              9      76633
   385855025161             21             22      75615
   385855004684             29             29      75582
   385855004685             31             32      75545
   385855004687             35             36      75617
   385855029267             43             44      75615
   385855029270             50             50      75481
   385855029275             61             62      75616
   385855025179             62             63      75530
   385855029284             81             82      75615
   385855029285             84             84      75480
   385855004710             87             88      75616
   385855004711             90             91      75484
   385855029293            100            100      75799
   385855029254             13             14      75615
   385855029255             16             16      75481
   385855004681             22             22      75480
   385855004682             24             25      75616


...

   385855025184             73             74      75615
   385855004705             76             77      75615
   385855029283             79             80      75615
   385855029286             86             87      75616
   385855029287             88             89      75618
   385855004714             97             97      75771
   385855029295            100            100      15871



134 rows selected.

A few things to note here. Firstly, the amount of data stored for the Zone Map is tiny. Each zone corresponds to approx. 8M of storage or in this case roughly 75,500 rows.  This is another key difference between Zone Maps and conventional B-Tree indexes. There’s an index entry per referenced row of the table whereas a Zone Map only has one entry per zone of the table. Note also, because the ALBUM_ID data is so well clustered, each zone has a very narrow min/max range of ALBUM_ID values. This means the Zone Map can be extremely effective in “pruning” storage accesses as Oracle can very easily eliminate the majority zones from containing data of interest.

To highlight just how tiny this Zone Map is:

SQL> select segment_name, segment_type, bytes from dba_segments where segment_name like 'BIG_BOWIE%';

SEGMENT_NAME              SEGMENT_TYPE     BLOCKS      BYTES
------------------------- ------------ ---------- ----------
BIG_BOWIE                 TABLE            139264 1140850688
BIG_BOWIE_ALBUM_ID_I      INDEX             20480  167772160
BIG_BOWIE_ALBUM_ID_ZM     TABLE                 8      65536

 

We only need one 64K extent for the Zone Map whereas the Table/Index are much larger. If fact, the 134 rows can actually all fit in just the one 8K data block !!

Two new data dictionary views provides details on our Zone Maps:

SQL> select zonemap_name, pruning, refresh_mode, invalid, stale, unusable  
from dba_zonemaps where zonemap_name='BIG_BOWIE_ALBUM_ID_ZM';

ZONEMAP_NAME              PRUNING  REFRESH_MODE      INVALID STALE   UNUSABLE
------------------------- -------- ----------------- ------- ------- --------
BIG_BOWIE_ALBUM_ID_ZM     ENABLED  LOAD DATAMOVEMENT NO      NO      NO

So the current status of the Zone Map can be determined in DBA_ZONEMAPS.

 

SQL> select measure, position_in_select, agg_function, agg_column_name
from dba_zonemap_measures where zonemap_name='BIG_BOWIE_ALBUM_ID_ZM';

MEASURE                        POSITION_IN_SELECT AGG_FUNCTION  AGG_COLUMN_NAME
------------------------------ ------------------ ------------- ---------------
"BOWIE"."BIG_BOWIE"."ALBUM_ID"                  3 MAX           MAX_1_ALBUM_ID
"BOWIE"."BIG_BOWIE"."ALBUM_ID"                  2 MIN           MIN_1_ALBUM_ID

Additionally, details of the columns within the Zone Map can be determined in DBA_ZONEMAP_MEASURES.

If we now re-run the same query on the BIG_BOWIE table:

SQL> select * from big_bowie where album_id=42;

100000 rows selected.

Elapsed: 00:00:00.28

Execution Plan
----------------------------------------------------------
Plan hash value: 1980960934

----------------------------------------------------------------------------------------------------
| Id  | Operation                              | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                       |           |   100K|  8984K|  3269  (12)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL WITH ZONEMAP| BIG_BOWIE |   100K|  8984K|  3269  (12)| 00:00:01 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("ALBUM_ID"=42)
filter(SYS_ZMAP_FILTER('/* ZM_PRUNING */ SELECT "ZONE_ID$", CASE WHEN
BITAND(zm."ZONE_STATE$",1)=1 THEN 1 ELSE CASE WHEN (zm."MIN_1_ALBUM_ID" > :1 OR
zm."MAX_1_ALBUM_ID" < :2) THEN 3 ELSE 2 END END FROM "BOWIE"."BIG_BOWIE_ALBUM_ID_ZM" zm
WHERE zm."ZONE_LEVEL$"=0 ORDER BY zm."ZONE_ID$"',SYS_OP_ZONE_ID(ROWID),42,42)<3 AND
"ALBUM_ID"=42)
Statistics
----------------------------------------------------------
141  recursive calls
0  db block gets
2364  consistent gets
0  physical reads
0  redo size
3130019  bytes sent via SQL*Net to client
760  bytes received via SQL*Net from client
21  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
100000  rows processed

We notice the CBO has used the Zone Map during the FTS (see the TABLE ACCESS STORAGE FULL WITH ZONEMAP execution step and filter predicate information). It’s performing some additional consistent gets over the database index (2364 vs. 1590) but it’s not too bad, nowhere near the 139,000 odd blocks in the table, with performance comparable to before. The extra consistent gets are necessary as any possible zones need to be read in their entirety, whereas a B-Tree index can point to just the specific table blocks that contain data of interest.

So the Zone Map has provided a “good enough” outcome here. Meanwhile, it has required very minimal storage, we can control how the Zone Map is maintained (to be discussed later) and we can guarantee that the Zone Map object will always exist and be present, even after bouncing either the database or Exadata storage servers.

In Part II, I’ll look at what happens when I perform some DML on the base table.

12.1.0.2 Introduction to Attribute Clustering (The Division Bell) August 26, 2014

Posted by Richard Foote in 12c, Attribute Clustering, Clustering Factor, Oracle Indexes.
3 comments

One of the really cool new features introduced in 12.1.0.2 is Attribute Clustering. This new table based attribute allows you to very easily cluster data in close physical proximity based on the content of specific columns.

As I’ve discussed many times, indexes love table data that is physically clustered in a similar manner to the index as it can significantly improve the efficiency of such indexes. A low Clustering Factor (CF) makes an index more viable and is one of the more important considerations in CBO calculations.

But not only database indexes benefit from well cluster data. Other index structures such as Exadata Storage Indexes and the new Zone Maps (to be discussed in future articles) all benefit from well clustered data. Additionally, compression is likely to be much more effective with data that is well clustered and this in turns also impacts the efficiency of In-memory data (again, to be discussed in future articles).

So having the capability to now easily cluster data in regular heap tables has potentially many benefits.

To illustrate, I’m first going to create a table with data that is not well clustered at all. The CODE column has data that is basically evenly distributed throughout the whole table structure:

SQL> create table ziggy (id number, code number, name varchar2(30));

Table created.

SQL> insert into ziggy select rownum, mod(rownum,100), 'DAVID BOWIE' from dual connect by level >= 2000000;

2000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY', estimate_percent=>null,
method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

I’ll next create an index on this CODE column and check out its default CF:

SQL> create index ziggy_code_i on ziggy(code);

Index created.

SQL> select index_name, clustering_factor, num_rows from user_indexes
where index_name='ZIGGY_CODE_I';

INDEX_NAME           CLUSTERING_FACTOR   NUM_ROWS
-------------------- ----------------- ----------
ZIGGY_CODE_I                    703133    2000000

For a table with 2 million rows, a CF of some 703,133 is very high and the index is going to be very inefficient when retrieving high numbers of rows.

Let’s run a query that returns a specific CODE value, approx. 1% of all the data (note I’ve set a large arraysize to minimize unnecessary fetches and resultant consistent  gets):

SQL> set arraysize 5000

SQL> select * from ziggy where code = 42;

20000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 2421001569

-----------------------------------------------------------------------------------
| Id  | Operation                 | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT          |       | 20000 |   390K|   383  (17)| 00:00:01 |
|*  1 |  TABLE ACCESS STORAGE FULL| ZIGGY | 20000 |   390K|   383  (17)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
filter("CODE"=42)

Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
15212  consistent gets
0  physical reads
0  redo size
211208  bytes sent via SQL*Net to client
585  bytes received via SQL*Net from client
5  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
20000  rows processed

The CBO has chosen a Full Table Scan and has decided to not use the index. If we hint the SQL:

SQL> select /*+ index (ziggy, ziggy_code_i) */ * from ziggy where code = 42;

20000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3294205578

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              | 20000 |   390K|  7081   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY        | 20000 |   390K|  7081   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY_CODE_I | 20000 |       |    43   (3)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42)

Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
7081  consistent gets
41  physical reads
0  redo size
511195  bytes sent via SQL*Net to client
585  bytes received via SQL*Net from client
5  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
20000  rows processed

At a cost of 7081, the index is way more expensive than the 383 cost for the FTS. The poor clustering of the CODE data within the table has made the index non viable.

Let’s now create another table, but this one with a clustering attribute set on the CODE column:

SQL> create table ziggy2 (id number, code number, name varchar2(30))

clustering by linear order (code) without materialized zonemap;

Table created.

The CLUSTERING BY LINEAR ORDER clause orders data in the table based on the specified columns, in this case the CODE column. Up to 10 columns can be included using this particular technique (there are other attribute clustering options which I’ll again cover in later articles, yes I’ll be writing quite a few new articles) :) WITHOUT MATERIALIZED ZONEMAP means I don’t want to create these new Zone Maps index structures at this stage which could potentially reduce the amount of table storage needed to be accessed (again, I’ll discuss these at another time).

You must use a direct path insert to make use of attribute clustering (or reorganize the table as we’ll see).

So lets insert the exact same data into this new ZIGGY2 table via a straight direct path sub-select:

SQL> insert /*+ append */ into ziggy2 select * from ziggy;

2000000 rows created.

SQL> SELECT * FROM TABLE(dbms_xplan.display_cursor);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
SQL_ID  0arqdyc9vznpg, child number 0
-------------------------------------
insert /*+ append */ into ziggy2 select * from ziggy

Plan hash value: 1975011999

--------------------------------------------------------------------------------------------------
| Id  | Operation                        | Name  | Rows  | Bytes |TempSpc| Cost(%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT                 |       |       |       |       | 10596 (100)|          |
|   1 |  LOAD AS SELECT                  |       |       |       |       |            |          |
|   2 |   OPTIMIZER STATISTICS GATHERING |       |  2000K|    38M|       | 10596   (3)| 00:00:01 |
|   3 |    SORT ORDER BY                 |       |  2000K|    38M|    61M| 10596   (3)| 00:00:01 |
|   4 |     TABLE ACCESS STORAGE FULL    | ZIGGY |  2000K|    38M|       |   376  (16)| 00:00:01 |
--------------------------------------------------------------------------------------------------

SQL> commit;

Commit complete.

Notice the SORT ORDER BY step in the insert execution plan. This implicitly sorts the incoming data in CODE order to satisfy the attribute clustering requirement.

If we create an index on this table and examine the CF:

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY2',
estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> select index_name, clustering_factor, num_rows
from user_indexes where index_name='ZIGGY2_CODE_I';

INDEX_NAME           CLUSTERING_FACTOR   NUM_ROWS
-------------------- ----------------- ----------
ZIGGY2_CODE_I                     7072    2000000

We notice the default CF is indeed significantly lower at just 7072 than the previous value of 703133.

If we now run the equivalent query as before on this table:

SQL> select * from ziggy2 where code=42;

20000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 16801974

-----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |               | 20000 |   390K|   114   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY2        | 20000 |   390K|   114   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY2_CODE_I | 20000 |       |    43   (3)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42)

Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
121  consistent gets
41  physical reads
0  redo size
511195  bytes sent via SQL*Net to client
585  bytes received via SQL*Net from client
5  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
20000  rows processed

We notice the CBO has now decided to use the index. This is due to the cost of the index based execution plan being just 114, significantly lower than the previous index cost of 7081 or the FTS at a cost of 383. Just as importantly, the resultant number of consistent gets has also significantly reduced to just 121, significantly less than the previous 7081 consistent gets when using the index. So the index is indeed much more efficient to use and the CBO costs for this is just about spot on. The end result is that performance has improved.

So how to now likewise improve the performance of the first table? Simple add the attribute clustering and reorganize the table:

SQL> alter table ziggy add clustering by linear order(code) without materialized zonemap;

Table altered.

SQL> alter table ziggy move;

Table altered.

SQL> alter index ziggy_code_i rebuild;

Index altered.

SQL> select index_name, clustering_factor, num_rows from user_indexes where index_name='ZIGGY_CODE_I';

INDEX_NAME      CLUSTERING_FACTOR   NUM_ROWS
--------------- ----------------- ----------
ZIGGY_CODE_I                 7134    2000000

So as expected, the CF has likewise reduced. So if we now run the query:

SQL> select * from ziggy where code=42;

20000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3294205578

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              | 20000 |   390K|   115   (1)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY        | 20000 |   390K|   115   (1)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY_CODE_I | 20000 |       |    43   (3)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42)

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
121  consistent gets
0  physical reads
0  redo size
511195  bytes sent via SQL*Net to client
585  bytes received via SQL*Net from client
5  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
20000  rows processed

The query likewise uses the index and with far less consistent gets and performance is significantly better.

So attribute clustering provides a nice mechanism by which data in a heap table (or importantly within a partition or sub-partition) can be physically clustered in a manner that can be potentially beneficial in various scenarios. Of course, the decision on how to actually cluster the data and on which columns is somewhat crucial :)

12.1.0.2 Released With Cool Indexing Features (Short Memory) July 25, 2014

Posted by Richard Foote in 12c, Advanced Index Compression, Attribute Clustering, Database In-Memory, Zone Maps.
2 comments

Oracle Database 12.1.0.2 has finally been released and it has a number of really exciting goodies from an indexing perspective which include:

  • Database In-Memory Option, which enables specific portions of the database to be in dual format, in both the existing row based format and additionally into an efficient memory only columnar based format. This in turn enables analytical based processing to access the real-time data in the In-Memory Store extremely fast, potentially faster and more effectively than via standard analytical based database indexes.
  • Advanced Index Compression, which allows Oracle to automatically choose the appropriate compression method for each individual leaf block, rather than having to manually select a single compression method across the whole index. This makes compressing an index a breeze and much more effective than previously possible.
  • Zone Maps, which enables Storage Index like capabilities to be manually configured and physically implemented inside the database, to eliminate unnecessary accesses of table storage via much smaller objects than conventional database indexes.
  • Attribute Clustering, a new table attribute which enables much better clustering of table data and we all know how both compression and index structures love table data to be well clustered.

These are all topics I’ll be covering in the coming weeks so stay tuned :)

Presenting at UKOUG Tech14 Conference (Ian Fish, U K Heir) July 15, 2014

Posted by Richard Foote in Richard Presentations, UKOUG Tech14.
add a comment

I’ve been lucky enough to present at various conferences, seminars and user group events over the years in some 20 odd countries. But somewhere I’ve never quite managed to present at before is the place of my birth, the UK. Well this year, I’ve decided end my drought and submitted a number of papers for the UKOUG Tech14 Conference and the parallel OakTable World UK14 Conference which were all thankfully accepted. So the land of David Bowie, Coronation Street, Bruce Forsyth, televised darts and a touch of drizzle, here I come.

The conference this year is being held in balmy Liverpool on 8-10 December 2014.

I’ll be presenting (times subject to change):

Oracle Indexes Q&A Session With Richard Foote (at OakTable World UK14)
(08/12/2014 09:00 – 09:50)

Oracle Database 12c New Indexing Features
(09/12/2014 12:00 – 12:50)

Indexing In Exadata
(10/12/2014 11:30 – 12:20)

 

Hopefully I’ll get to meet some of you there :)

12c Index Like Table Statistics Collection (Wearing The Inside Out) July 9, 2014

Posted by Richard Foote in 12c, Automatic Table Statistics, Index statistics, Oracle Indexes.
2 comments

This change introduced in 12c has caught me out on a number of occasions.

If you were to create a new table:

SQL> create table thin_white_duke1 (id number, code number, name varchar2(30));

Table created.

And then populate it with a conventional insert:

SQL> insert into thin_white_duke1 select rownum, mod(rownum,10), 'DAVID BOWIE'  from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

We find there are no statistics associated with the table until we explicitly collect them:

SQL> select table_name, num_rows, blocks, avg_space, avg_row_len, last_analyzed from dba_tables where table_name='THIN_WHITE_DUKE1';

TABLE_NAME         NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN LAST_ANAL
---------------- ---------- ---------- ---------- ----------- ---------
THIN_WHITE_DUKE1

But if we were to now create an index on this table:

SQL> create index thin_white_duke1_code_i on thin_white_duke1(code);

Index created.

We find that we now do indeed have index statistics collected by default (since 9i days anyways):

SQL> select index_name, num_rows, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name='THIN_WHITE_DUKE1_CODE_I';

INDEX_NAME                NUM_ROWS     BLEVEL LEAF_BLOCKS DISTINCT_KEYS
----------------------- ---------- ---------- ----------- -------------
THIN_WHITE_DUKE1_CODE_I    1000000          2        1936            10

OK, if we now create another table and index:

SQL> create table thin_white_duke2 (id number, code number, name varchar2(30));

Table created.

SQL> create index thin_white_duke2_code_i on thin_white_duke2(code);

Index created.

And populate it with a bulk load parallel, append insert:

SQL> insert /*+ append */ into thin_white_duke2 select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <=1000000;

1000000 rows created.

 SQL> commit;

Commit complete.

If we now look at the table statistics:

SQL> select table_name, num_rows, blocks, avg_space, avg_row_len, last_analyzed
from dba_tables where table_name='THIN_WHITE_DUKE2';

TABLE_NAME         NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN LAST_ANAL
---------------- ---------- ---------- ---------- ----------- ---------
THIN_WHITE_DUKE2    1000000       3511          0          20 09-JUL-14

SQL> select column_name, num_distinct, density, num_buckets from dba_tab_columns
 where table_name='THIN_WHITE_DUKE2';

COLUMN_NAME NUM_DISTINCT    DENSITY NUM_BUCKETS
----------- ------------ ---------- -----------
NAME                   1          1           1
CODE                  10         .1           1
ID               1000000    .000001           1

SQL> select column_name, num_distinct, density, histogram, notes
from dba_tab_col_statistics where table_name='THIN_WHITE_DUKE2';

COLUMN_NAME NUM_DISTINCT    DENSITY HISTOGRAM  NOTES
----------- ------------ ---------- ---------- -------------
NAME                   1          1 NONE       STATS_ON_LOAD
CODE                  10         .1 NONE       STATS_ON_LOAD
ID               1000000    .000001 NONE       STATS_ON_LOAD

 

We notice that in 12c, they’re automatically populated and accurate. The NOTES column in dba_tab_col_statistics highlights that the statistics were collected via STATS_ON_LOAD.

If however we look at the current state of the index statistics:

SQL> select index_name, num_rows, blevel, leaf_blocks, distinct_keys from dba_indexes where index_name='THIN_WHITE_DUKE2_CODE_I';

INDEX_NAME                NUM_ROWS     BLEVEL LEAF_BLOCKS DISTINCT_KEYS
----------------------- ---------- ---------- ----------- -------------
THIN_WHITE_DUKE2_CODE_I          0          0           0             0

We noticed they haven’t changed since the index was initially created. So by populating an empty table with an append insert, table statistics are now collected on the fly, but not the statistics on any existing indexes.

If we want to now collect just the index statistics and potentially any useful histograms on the table (as histograms are not collect during the bulk load operation), we can use the new GATHER_AUTO option with the dbms_stats.gather_table_stats procedure without having to re-gather base table statistics:

SQL> exec dbms_stats.gather_table_stats(ownname=>user,
tabname=>'THIN_WHITE_DUKE2', options=>'GATHER AUTO');

PL/SQL procedure successfully completed.

SQL> select index_name, num_rows, blevel, leaf_blocks, distinct_keys
from dba_indexes where index_name='THIN_WHITE_DUKE2_CODE_I';

INDEX_NAME                NUM_ROWS     BLEVEL LEAF_BLOCKS DISTINCT_KEYS
----------------------- ---------- ---------- ----------- -------------
THIN_WHITE_DUKE2_CODE_I    1000000          2        1739            10

 The same automatic table statistics gathering occurs when we create a table via a sub-select clause:

SQL> create table thin_white_duke3
as select rownum id, mod(rownum,1000) code, 'DAVID BOWIE' name
from dual connect by level <= 1000000;

Table created.

SQL> select table_name, num_rows, blocks, avg_space, avg_row_len, last_analyzed
from dba_tables where table_name='THIN_WHITE_DUKE3';

TABLE_NAME         NUM_ROWS     BLOCKS  AVG_SPACE AVG_ROW_LEN LAST_ANAL
---------------- ---------- ---------- ---------- ----------- ---------
THIN_WHITE_DUKE3    1000000       3787          0          22 09-JUL-14

Of course, sometimes when creating test tables as I regularly do, one sometimes forgets these statistics are now automatically collected !!

It was 12 years ago today… July 8, 2014

Posted by Richard Foote in Oracle Indexes.
4 comments

It was exactly 12 years ago today that I first presented my Index Internals – Rebuilding The Truth presentation at a local ACT Oracle User Group event.

And so my association with Oracle indexes started. It would be an interesting statistic to know how many people have subsequently read the presentation :) It would no doubt result in a few 50-50 block splits !!

How time flies :)

 

Richard Foote’s Art Work (Art Decade) May 5, 2014

Posted by Richard Foote in Richard's Art, Richard's Musings.
add a comment

Just a quick note to say I have a new tab on the blog called “Richard Foote’s Art” where I’ll periodically upload a new piece of my art work.

As the page says, they highlight both what’s in my mind (so beware) and how I like to relax in the evenings.

All pieces are for sale for $1 Million (or nearest offer) :) Actually, they’re not really, the one below has already gone to a good home.

Enjoy !!

Random Reads

“Random Reads” 2014

Estimate Index Size With Explain Plan (I Can’t Explain) April 24, 2014

Posted by Richard Foote in Estimate Index Size, Explain Plan For Index, Oracle Indexes.
9 comments

I discussed recently an updated MOS note that details the needs vs. the implications of rebuilding indexes.

Following is a neat little trick if you want to very quickly and cheaply estimate the size of an index if it were to be rebuilt or a new index before you actually create the thing. I meant to blog about this sometime ago but was re- reminded of it when I recently came across this entry in Connor McDonald’s excellent blog.

I’ll start by creating a table with a bunch of rows:

SQL> create table ziggy as select o.* from dba_objects o, dba_users;

Table created.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY');

PL/SQL procedure successfully completed.

SQL> select count(*) from ziggy;

  COUNT(*)
----------
   3939187

I’m thinking of creating an index on the OBJECT_NAME column, but I’m unsure if I’ll have enough free space in my tablespace. So let’s quickly get an estimate of the index size by simply generating the explain plan of the CREATE INDEX statement:

SQL> explain plan for create index ziggy_object_name_i on ziggy(object_name);

Explained.

Elapsed: 00:00:00.09

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
Plan hash value: 1219136602

----------------------------------------------------------------------------------------------

| Id  | Operation              | Name                | Rows  | Bytes | Cost (%CPU)| Time     |

----------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|   0 | CREATE INDEX STATEMENT |                     |  3939K|    93M| 22032   (3)| 00:00:01 |
|   1 |  INDEX BUILD NON UNIQUE| ZIGGY_OBJECT_NAME_I |       |       |            |          |
|   2 |   SORT CREATE INDEX    |                     |  3939K|    93M|            |          |
|   3 |    TABLE ACCESS FULL   | ZIGGY               |  3939K|    93M| 17199   (4)| 00:00:01 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
Note
-----
   - estimated index size: 159M bytes

 

Notice the little note below the execution plan. Oracle has estimated an index size of approximately 159M bytes and it’s only taken it 0.09 seconds to do so. A trace of the session highlights how Oracle simply uses the table statistics in its determination of the estimated index size.

Well, that’s OK I have sufficient space for an index of that size.  Let’s create the physical index and check out its actual size:

SQL> create index ziggy_object_name_i on ziggy(object_name);

Index created.

SQL> select bytes from dba_segments where segment_name='ZIGGY_OBJECT_NAME_I';

     BYTES
----------
 163577856

SQL> analyze index ziggy_object_name_i validate structure;

Index analyzed.

SQL> select btree_space from index_stats;

BTREE_SPACE
-----------
  157875040

 

Not bad at all, the estimate and actual index sizes are pretty well spot on.

There are some limitations however. Let’s pick another column, SUBOBJECT_NAME, which has a large number of NULL values:

SQL> select count(*) from ziggy where subobject_name is not null;

  COUNT(*)
----------
     33669

SQL> explain plan for create index ziggy_subobject_name_i on ziggy(subobject_name);

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

Plan hash value: 4065057084

-------------------------------------------------------------------------------------------------
| Id  | Operation              | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|   0 | CREATE INDEX STATEMENT |                        |  3939K|  7693K| 20132  (4)| 00:00:01 |
|   1 |  INDEX BUILD NON UNIQUE| ZIGGY_SUBOBJECT_NAME_I |       |       |     |          |
|   2 |   SORT CREATE INDEX    |                        |  3939K|  7693K|     |          |
|   3 |    TABLE ACCESS FULL   | ZIGGY                  |  3939K|  7693K| 17238  (4)| 00:00:01 |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

-------------------------------------------------------------------------------------------------
Note
-----
   - estimated index size: 100M bytes

The SUBOBJECT_NAME column only has a relatively few (33,669) values that are not null, but the explain plan is still estimating the index to have the full 3.9 million rows (remembering that fully null indexed values are not indexed in a B-Tree index). The estimated index size of 100M is therefore not going to be particularly accurate.

SQL> create index ziggy_subobject_name_i on ziggy(subobject_name);

Index created.

SQL> select bytes from dba_segments where segment_name='ZIGGY_SUBOBJECT_NAME_I';

      
     BYTES
----------
   1048576

SQL> analyze index ziggy_subobject_name_i validate structure;

Index analyzed.

SQL> select btree_space from index_stats;

BTREE_SPACE
-----------
     928032

So in this example, the estimated index size is indeed way off. This method doesn’t seem to cater for null index values and assumes the index to be fully populated.

However, if we simply take the known ratio of  not null values (in this example, 33669 not null rows /3939187 total rows =0.00855) and then apply it to the calculated estimate (100M x .00855 = 0.855M), where are now back into accurate ballpark territory again.

Of course, such estimates are based on the accuracy of the table statistics. If we have stale statistics, we’ll have stale index size estimates.

Let’s insert more rows and double the size of the table and associated index: 

SQL> insert into ziggy select * from ziggy;

3939187 rows created.

SQL> commit;

Commit complete.

If we re-run the index creation explain plan:

SQL> explain plan for create index ziggy_object_name_i on ziggy(object_name);

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

Plan hash value: 746589531

----------------------------------------------------------------------------------------------
| Id  | Operation              | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
|   0 | CREATE INDEX STATEMENT |                     |  3939K|    93M| 22032   (3)| 00:00:01 |
|   1 |  INDEX BUILD NON UNIQUE| ZIGGY_OBJECT_NAME_I |       |       |            |          |
|   2 |   SORT CREATE INDEX    |                     |  3939K|    93M|            |          |
|   3 |    INDEX FAST FULL SCAN| ZIGGY_OBJECT_NAME_I |       |       |            |          |

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
Note
-----
   - estimated index size: 159M bytes

We get the same estimate as before. We need to update the table statistics in order to get an updated and more accurate index size estimate:

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY');

PL/SQL procedure successfully completed.

SQL> explain plan for create index ziggy_object_name_i on ziggy(object_name);

Explained.

SQL> select * from table(dbms_xplan.display);

PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

Plan hash value: 746589531

----------------------------------------------------------------------------------------------
| Id  | Operation              | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------

|   0 | CREATE INDEX STATEMENT |                     |  7878K|   187M| 45811   (3)| 00:00:02 |
|   1 |  INDEX BUILD NON UNIQUE| ZIGGY_OBJECT_NAME_I |       |       |            |          |
|   2 |   SORT CREATE INDEX    |                     |  7878K|   187M|            |          |
|   3 |    INDEX FAST FULL SCAN| ZIGGY_OBJECT_NAME_I |       |       |            |          |


PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------
Note
-----
   - estimated index size: 318M bytes

Both the estimated index entries and  index size are now much more accurate.

The number of expected index entries is therefore a useful guide as to the potential accuracy of the size estimate.

So the next time you’re wondering whether an index is significantly larger than it should or whether you have sufficient space for a new index, this is a useful, simple technique to get a quick estimate.

Presenting at ODTUG Kscope14 Conference in Seattle June 22-26 2014 April 23, 2014

Posted by Richard Foote in ODTUG Kscope14.
add a comment

Kscope14

 

Just a short note to say I’ll be presenting at the Oracle Development Tools User Group (ODTUG) Kaleidoscope 14 Conference this year in beautiful Seattle, Washington on June 22-26 2014. I had a fantastic time when I attended this conference a few years ago when it was held in Monterey so I’m really looking forward to it. It has another great lineup of speakers this year including Tom Kyte, Jonathan Lewis, Cary Millsap, Tim Gorman, Alex Gorbachev, Kyle HaileyKellyn Pot’Vin, Yury Velikanov and Bryn Llewellyn to name but a very few so it should be another excellent event.

I’ll be presenting a couple of papers and perhaps be on the odd panel or two.

  • New Indexing Features Introduced In Oracle Database 12c (Monday June 23, Session 3 2:30 pm – 3:30 pm)
  • Indexing In Exadata (Tuesday June 24, Session 8 2 pm – 3 pm)

Hope to see some of you there although sadly it looks like Stanley The ACE Vest won’t be able to make it this time :)

Indexing Foreign Key Constraints With Invisible Indexes (Invisible People) April 22, 2014

Posted by Richard Foote in 12c, Block Dumps, Foreign Keys, Invisible Indexes, Oracle Indexes.
1 comment so far

In my previous post I discussed when deleting rows from parent tables, how Bitmap Indexes based on the FK constraint can prevent the expensive Full Tables Scans (FTS) on the child tables but not the associated exclusive table locks.

Last year, I discussed how it was possible in Oracle Database 12c to have multiple indexes on the same column list.

Quite some time ago, I discussed how so-called Invisible Indexes can indeed still be visible in various scenarios, including when policing FK constraints.

Well, lets put all these three topics together :)

First, let use the same basic setup as the last post:

SQL> create table bowie_dad (id number, dad_name varchar2(30)); 

Table created.

SQL> insert into bowie_dad values (1, 'DAVID BOWIE'); 

1 row created.

SQL> insert into bowie_dad values (2, 'ZIGGY STARDUST'); 

1 row created.

SQL> insert into bowie_dad values (3, 'MAJOR TOM');

1 row created.

SQL> insert into bowie_dad values (4, 'THIN WHITE DUKE');

1 row created.

SQL> commit;

Commit complete.

SQL> create table bowie_kid (id number, kid_name varchar2(30), dad_id number);

Table created.

SQL> insert into bowie_kid select rownum, 'ALADDIN SANE', mod(rownum,3)+2 from dual connect by level <=1000000; 

1000000 rows created.

SQL> commit;

Commit complete.

SQL> alter table bowie_dad add primary key(id);

Table altered.

SQL> alter table bowie_kid add constraint bowie_kid_fk foreign key(dad_id) references bowie_dad(id);

Table altered.

 

We’re now going to create two indexes concurrently on the FK constraint on the DAD_ID column, a Bitmap Index and an invisible B-Tree Index as is now possible since Oracle Database 12c:

SQL> create bitmap index bowie_kid_fk_i on bowie_kid(dad_id); 

Index created.

SQL> create index bowie_kid_fk2_i on bowie_kid(dad_id) invisible;

Index created.

Oracle Database 12c allows us to now create multiple indexes on the same column list, providing only one index is visible at a time.

Let’s look at a partial block dump of the first leaf block of each index. First the Bitmap Index:

Block header dump:  0x0180805c
 Object id on Block? Y
 seg/obj: 0x16f45  csc: 0x00.36bc54  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1808058 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C—    0  scn 0x0000.0036bc54
Leaf block dump
===============
header address 32801380=0x1f48264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 4
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0x28
kdxcofeo 959=0x3bf
kdxcoavs 919
kdxlespl 0
kdxlende 0
kdxlenxt 25198685=0x180805d
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[4499] flag: ——-, lock: 0, len=3537
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 7f d3 00 00
col 2; len 6; (6):  01 80 80 2c 00 3f
col 3; len 3516; (3516):
 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92
 24 49 cf 92 24 49 92 24 49 92 24 cc 49 92 24 49 02 ff 32 24 49 92 24 49 92
 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24
 49 92 24 49 cb 92 24 49 92 ff 33 24 49 92 24 49 92 24 49 cf 92 24 49 92 24
 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cc 92 24 49
 92 24 ff 32 24 49 92 24 49 92 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24
 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cb 92 24 49 92 ff 33 49 92 24 49

 

Note the indexed value is c1 03, denoting the lowest DAD_ID=2 currently in the table.

Now the  partial block dump of the invisible B-Tree Index:

 

Block header dump:  0x0181b724
 Object id on Block? Y
 seg/obj: 0x16f46  csc: 0x00.36bc78  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x181b720 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C—    0  scn 0x0000.0036bc78
Leaf block dump
===============
header address 32801380=0x1f48264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 513
kdxcofbo 1062=0x426
kdxcofeo 1880=0x758
kdxcoavs 818
kdxlespl 0
kdxlende 0
kdxlenxt 25278245=0x181b725
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 7f d3 00 01
row#1[8012] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 7f d3 00 04
row#2[8000] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 7f d3 00 07

Again as expected the first index entry is C1 03.

With only a visible Bitmap Index in place, does that mean we’ll have table locking issues if we delete a parent row with current transactions in place ? Let’s check it out.

In one session, we have a current transaction on the child table:

SQL> insert into bowie_kid values (1000001, 'LOW', 4); 

1 row created.

 

In another session, we attempt to delete a parent row (with an ID = 1 which doesn’t currently exist with the child table):

SQL> delete bowie_dad where id = 1; 

1 row deleted.

We note the DML was successful and didn’t hang. This means the B-Tree index is clearly being used to police this constraint, even though it’s currently invisible.

In a third session, we now attempt to insert a child row using a FK value that’s in the process of being deleted:

SQL> insert into bowie_kid values (1000003, 'HEROES', 1);

 

As expected, it hangs as it’s currently effectively waiting on the row level lock made possible by the index entry in the B-Tree index as invisible indexes are still maintained behind the scenes. If we look at a fresh block dump of both indexes, beginning with the Bitmap Index:

Block header dump:  0x0180805c
 Object id on Block? Y
 seg/obj: 0x16f45  csc: 0x00.36bc54  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1808058 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C—    0  scn 0x0000.0036bc54
Leaf block dump
===============
header address 402948708=0x18048264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 4
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0x28
kdxcofeo 959=0x3bf
kdxcoavs 919
kdxlespl 0
kdxlende 0
kdxlenxt 25198685=0x180805d
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[4499] flag: ——-, lock: 0, len=3537
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 7f d3 00 00
col 2; len 6; (6):  01 80 80 2c 00 3f
col 3; len 3516; (3516):
 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92
 24 49 cf 92 24 49 92 24 49 92 24 cc 49 92 24 49 02 ff 32 24 49 92 24 49 92
 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24
 49 92 24 49 cb 92 24 49 92 ff 33 24 49 92 24 49 92 24 49 cf 92 24 49 92 24
 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cc 92 24 49
 92 24 ff 32 24 49 92 24 49 92 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24
 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cb 92 24 49 92 ff 33 49 92 24 49

 

We note the Bitmap Index has not been updated. It still lists the C1 03 value as the minimum indexed value.

However, if we look at the invisible B-Tree index:

Block header dump:  0x0181b724
 Object id on Block? Y
 seg/obj: 0x16f46  csc: 0x00.36bc78  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x181b720 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0x0008.015.00000b86  0x014316ab.01c5.42  —-    1  fsc 0x0000.00000000
Leaf block dump
===============
header address 402948708=0x18048264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 514
kdxcofbo 1064=0x428
kdxcofeo 1868=0x74c
kdxcoavs 804
kdxlespl 0
kdxlende 0
kdxlenxt 25278245=0x181b725
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[1868] flag: ——-, lock: 2, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  01 81 b6 f3 00 00
row#1[8024] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 7f d3 00 01
row#2[8012] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 7f d3 00 04
row#3[8000] flag: ——-, lock: 0, len=12

It has been updated and lists a new index entry C1 02 as the minimum value now in the index.

So the B-Tree index can be used to successfully police the FK index and prevent the possible table level locking issues associated with deleting parent rows, even though it’s invisible and there is an equivalent visible Bitmap index in place. Invisible indexes are simply not considered as viable execution paths by the Cost Based Optimizer, but may still be “visible” in a variety of scenarios such as quietly policing constraints behind the scenes.

Do I recommend creating two such indexes in Oracle Database 12c. Well, no as the costs of maintaining both indexes need to be considered. But I certainly do caution simply making indexes invisible and expecting the database to behave in exactly the same manner if the index were to be subsequently dropped.

Because rolling back all the above and then dropping the invisible index:

SQL> drop index bowie_kid_fk2_i;

Index dropped.

SQL> insert into bowie_kid values (1000001, 'LOW', 4);

1 row created.

 

Means in another session the parent delete operation will now hang without the B-Tree index being in place:

SQL> delete bowie_dad where id = 1;

 

 

 

 

 

 

Indexing Foreign Key Constraints With Bitmap Indexes (Locked Out) April 17, 2014

Posted by Richard Foote in Bitmap Indexes, Block Dumps, Foreign Keys, Index Internals, Oracle Indexes.
4 comments

Franck Pachot made a very valid comment in my previous entry on Indexing Foreign Keys (FK) that the use of a Bitmap Index on the FK columns does not avoid the table locks associated with deleting rows from the parent table. Thought I might discuss why this is the case and why only a B-Tree index does the trick.

Let’s first setup some very simple Parent-Child tables:

SQL> create table bowie_dad (id number, dad_name varchar2(30));

Table created.

SQL> insert into bowie_dad values (1, 'DAVID BOWIE');

1 row created.

SQL> insert into bowie_dad values (2, 'ZIGGY STARDUST');

1 row created.

SQL> insert into bowie_dad values (3, 'MAJOR TOM');

1 row created.

SQL> insert into bowie_dad values (4, 'THIN WHITE DUKE');

1 row created.

SQL> commit;

Commit complete.

SQL> create table bowie_kid (id number, kid_name varchar2(30), dad_id number);

Table created.

SQL> insert into bowie_kid select rownum, 'ALADDIN SANE', mod(rownum,3)+2 from dual connect by level >=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> alter table bowie_dad add primary key(id);

Table altered.

SQL> alter table bowie_kid add constraint bowie_kid_fk foreign key(dad_id) references bowie_dad(id);

Table altered.

OK, so we have a small parent table (BOWIE_DAD) and a much larger child table (BOWIE_KID) with all the necessary constraints in place. Note we don’t actually have a child row with a  FK DAD_ID = 1. So we can potentially delete this row from the BOWIE_DAD table (where ID = 1).

Let’s begin by creating a B-Tree  index on the FK column (DAD_ID) and have a look a partial block dump of the first leaf block in the index:

SQL> create index bowie_kid_fk_i on bowie_kid(dad_id);

Index created.

 

Block header dump:  0x01806efc
 Object id on Block? Y
 seg/obj: 0x16f0b  csc: 0x00.35f861  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1806ef8 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C—    0  scn 0x0000.0035f861
Leaf block dump
===============
header address 360809060=0x15818264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 513
kdxcofbo 1062=0x426
kdxcofeo 1880=0x758
kdxcoavs 818
kdxlespl 0
kdxlende 0
kdxlenxt 25194237=0x1806efd
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 00
row#1[8012] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 03
row#2[8000] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 06
…..

 

We’ll compare future block dumps with this one but for now just note that the first index entry has a value of (hex) C1 03, which corresponds to the minimum value for DAD_ID = 2 we currently have in this table/index.

If we insert a new child record in one session (but not yet commit);

SQL> insert into bowie_kid values (1000001, 'LOW', 4);

1 row created.

In a second session, we can delete (but not yet commit) the unwanted parent row without any locking implications thanks to this index on the FK column:

SQL> delete bowie_dad where id = 1;

1 row deleted.

In a third session, we can insert another child record again with no locking implications, providing we don’t attempt to use the parent value the second session is in the process of deleting:

SQL> insert into bowie_kid values (1000002, 'LOW', 3);

1 row created.

But if we do try to insert a new child row with a FK value for which the parent is in the process of being deleted:

SQL> insert into bowie_kid values (1000003, 'HEROES', 1);

The statement hangs and it will do so until the transaction deleting the parent record commits (in which case it will receive an ORA-02291 integrity constraint error) or the transaction rolls back (in which case the insert will succeed).

If we take a fresh dump of the first leaf block (which must contain the associated index entry as it’s the minimum value now in the table):

 Block header dump:  0x01806efc
 Object id on Block? Y
 seg/obj: 0x16f0b  csc: 0x00.35f861  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1806ef8 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0x0008.004.00000b8a  0x01431602.01c5.14  —-    1  fsc 0x0000.00000000
Leaf block dump
===============
header address 225280612=0xd6d8264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 514
kdxcofbo 1064=0x428
kdxcofeo 1868=0x74c
kdxcoavs 804
kdxlespl 0
kdxlende 0
kdxlenxt 25194237=0x1806efd
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[1868] flag: ——-, lock: 2, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  01 80 7f 38 00 00
row#1[8024] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 00
row#2[8012] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 03

 

We notice we indeed do have a new index entry (highlighted above), with all the associated locking information in ITL slot 2 for the new row in which the session is locked. So the key point here is that the index is indeed updated and Oracle can proceed or not depending on what happens with the transaction on the parent table. The overhead of this new index entry is minimal and locking can be easily policed and restricted to just the index entries with this specific value (hex) C1 02 which corresponds to DAD_ID = 1.

If we do indeed proceed with the delete on the parent table:

SQL> commit;

Commit complete.

 

The session attempting to insert the now deleted parent FK value indeed fails:

 

SQL> insert into bowie_kid values (1000002, 'HEROES', 1);
insert into bowie_kid values (1000002, 'HEROES', 1)
*
ERROR at line 1:
ORA-02291: integrity constraint (BOWIE.BOWIE_KID_FK) violated - parent key not
found

 

And we notice with a fresh block dump that the index entry has been removed by the now unlocked session:

 

Block header dump:  0x01806efc
 Object id on Block? Y
 seg/obj: 0x16f0b  csc: 0x00.35f861  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1806ef8 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C—    0  scn 0x0000.0035f861
Leaf block dump
===============
header address 225280612=0xd6d8264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 513
kdxcofbo 1062=0x426
kdxcofeo 1880=0x758
kdxcoavs 818
kdxlespl 0
kdxlende 0
kdxlenxt 25194237=0x1806efd
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 00
row#1[8012] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 03
row#2[8000] flag: ——-, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 06

Everything is back to the way it was previously.

 

OK, let’s now re-insert the parent row, drop the FK index and replace it with a Bitmap Index instead:

 

SQL> insert into bowie_dad values (1, 'DAVID BOWIE');

1 row created.

SQL> commit;

Commit complete.

SQL> drop index bowie_kid_fk_i;

Index dropped.

SQL> create bitmap index bowie_kid_fk_i on bowie_kid(dad_id);

Index created.

 

If we take a look at a partial block dump of the first leaf block of this Bitmap Index:

 

Block header dump:  0x01806efc
 Object id on Block? Y
 seg/obj: 0x16f14  csc: 0x00.3602fc  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1806ef8 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C—    0  scn 0x0000.003602fc
Leaf block dump
===============
header address 360809060=0x15818264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 4
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0x28
kdxcofeo 958=0x3be
kdxcoavs 918
kdxlespl 0
kdxlende 0
kdxlenxt 25194237=0x1806efd
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[4498] flag: ——-, lock: 0, len=3538
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 00
col 2; len 6; (6):  01 80 6e cc 00 3f
col 3; len 3517; (3517):
 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cf 92 24 49 92 24 49
 92 24 cf 49 92 24 49 92 24 49 92 cc 24 49 92 24 01 ff 32 92 24 49 92 24 49
 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cf 92 24 49 92
 24 49 92 24 cc 49 92 24 49 02 ff 32 24 49 92 24 49 92 24 49 cf 92 24 49 92
 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cb 92 24
 49 92 ff 33 24 49 92 24 49 92 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24
 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cc 92 24 49 92 24 ff 32 24 49 92
 24 49 92 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24
 49 92 24 49 92 24 49 cb 92 24 49 92 ff 33 92 24 49 92 24 49 92 24 cf 49 92
 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cf 92 24 49 92 24 49 92 24 cc
 49 92 24 49 02 ff 32 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cf
 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cc 24 49 92 24 01 ff 32
 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24
 49 cf 92 24 49 92 24 49 92 24 cc 49 92 24 49 02 ff 32 24 49 92 24 49 92 24
 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49
 92 24 49 cb 92 24 49 92 ff 33 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92
 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cc 24 49 92 24
 01 ff 32 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24
 49 92 24 49 cf 92 24 49 92 24 49 92 24 cc 49 92 24 49 02 ff 32 24 49 92 24

….

 

We notice the first key difference here in that these Bitmap Index entries are potentially HUGE, with just the 2 index entries in this block. The other thing to note is the combination of Bitmap indexes and DMLs can result in locking hell because if an index entry needs to be modified (resulting in a change in the compressed bitmap string), all rows between the rowid ranges specified within the Bitmap Index entry are effectively locked. So Bitmap Indexes introduce severe locking issues, regardless of the Parent/Child update issue highlighted above.

If we insert a child row in one session:

SQL> insert into bowie_kid values (1000001, 'LOW', 4);

1 row created.

And in another session insert another row with the same FK value:

SQL> insert into bowie_kid values (1000002, 'HEROES', 4);

The session hangs until the transaction in the first session completes because of the locking implications introduced with the Bitmap Index.

 

Therefore, with a Bitmap Index in place, the last of our worries will be locking issues associated with deleting a parent row. After rolling back the above, we attempt the following. In one session, we insert a child record:

SQL> insert into bowie_kid values (1000001, 'LOW', 4);

1 row created.

In a second session, we delete the unwanted parent row:

SQL> delete bowie_dad where id = 1;

and it hangs. The Bitmap Index is not effective in preventing this lock as it was with the B-Tree Index.

In a third session, we attempt to insert a child row with the soon to be deleted parent key:

SQL> insert into bowie_kid values (1000002, 'HEROES', 1);

and it hangs as well. So the Bitmap Index on the FK does not prevent the locking hell such parent deletes can introduce into our environments.

If we roll all this back and simply have one session delete a parent row:

SQL> delete bowie_dad where id = 1;

1 row deleted.

And in another session insert a child row with the FK about to be deleted, the insert hangs as expected with an exclusive transaction lock:

SQL> insert into bowie_kid values (1000001, 'BOWIE', 1);

 

However, if we look at a fresh partial block dump of the first Bitmap Index leaf block:

Block header dump:  0x01806efc
 Object id on Block? Y
 seg/obj: 0x16f14  csc: 0x00.3602fc  itc: 2  flg: E  typ: 2 – INDEX
     brn: 0  bdba: 0x1806ef8 ver: 0x01 opc: 0
     inc: 0  exflg: 0
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0000.000.00000000  0x00000000.0000.00  —-    0  fsc 0x0000.00000000
0x02   0xffff.000.00000000  0x00000000.0000.00  C—    0  scn 0x0000.003602fc
Leaf block dump
===============
header address 225280612=0xd6d8264
kdxcolev 0
KDXCOLEV Flags = – – -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 4
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0x28
kdxcofeo 958=0x3be
kdxcoavs 918
kdxlespl 0
kdxlende 0
kdxlenxt 25194237=0x1806efd
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[4498] flag: ——-, lock: 0, len=3538
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 52 73 00 00
col 2; len 6; (6):  01 80 6e cc 00 3f
col 3; len 3517; (3517):
 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cf 92 24 49 92 24 49
 92 24 cf 49 92 24 49 92 24 49 92 cc 24 49 92 24 01 ff 32 92 24 49 92 24 49
 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cf 92 24 49 92
 24 49 92 24 cc 49 92 24 49 02 ff 32 24 49 92 24 49 92 24 49 cf 92 24 49 92
 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cb 92 24
 49 92 ff 33 24 49 92 24 49 92 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24
 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cc 92 24 49 92 24 ff 32 24 49 92
 24 49 92 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24
 49 92 24 49 92 24 49 cb 92 24 49 92 ff 33 92 24 49 92 24 49 92 24 cf 49 92
 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cf 92 24 49 92 24 49 92 24 cc
 49 92 24 49 02 ff 32 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24 49 cf
 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cc 24 49 92 24 01 ff 32
 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92 24
 49 cf 92 24 49 92 24 49 92 24 cc 49 92 24 49 02 ff 32 24 49 92 24 49 92 24
 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24 49
 92 24 49 cb 92 24 49 92 ff 33 49 92 24 49 92 24 49 92 cf 24 49 92 24 49 92
 24 49 cf 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cc 24 49 92 24
 01 ff 32 92 24 49 92 24 49 92 24 cf 49 92 24 49 92 24 49 92 cf 24 49 92 24
 49 92 24 49 cf 92 24 49 92 24 49 92 24 cc 49 92 24 49 02 ff 32 24 49 92 24

…..

 

Unlike the B-Tree index which was updated, the Bitmap index has remained unchanged. No attempt was made by Oracle at this stage to insert the index entry as such a new Bitmap Index entry would likely generate too much overheads and not appreciably reduce the locking implications of these DML statements with these Bitmap Indexes in place anyways. The actual index update is delayed until such as change is possible with the rollback of the parent deletion.

However, in a third session, an insert into the child table with a FK that’s not to be deleted is successful:

SQL> insert into bowie_kid values (1000002, 'BOWIE', 4);

1 row created.

Bitmap indexes are simply not designed with concurrency in mind and have efficiencies that make it easier for single sessions to load data in Data Warehouses environments where they are indeed suitable.

One advantage of the Bitmap index is that at least Oracle doesn’t have to perform a FTS on the (potentially huge) child table when checking for the existence of any associated child FK values. Oracle can quickly use the index to determine whether the parent delete can proceed or not. If we roll everything back and just attempt to delete a parent row:

SQL> delete bowie_dad where id = 1;

1 row deleted.

       
Execution Plan
----------------------------------------------------------
Plan hash value: 2571176721

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |              |     1 |    13 |     0   (0)| 00:00:01 |
|   1 |  DELETE            | BOWIE_DAD    |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| SYS_C0010356 |     1 |    13 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access('ID'=1)

    
Statistics
----------------------------------------------------------
          7  recursive calls
          8  db block gets
          3  consistent gets
          0  physical reads
        676  redo size
        862  bytes sent via SQL*Net to client
        830  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

 

We notice at just 3 consistent gets, the potentially expensive FTS on the child table has been avoided. Drop the Bitmap index and the FTS must be performed to ensure no current FK values would violate the constraint when the parent row is deleted:

SQL> drop index bowie_kid_fk_i;

Index dropped.

   
SQL> delete bowie_dad where id = 1;

1 row deleted.

    
Execution Plan
----------------------------------------------------------
Plan hash value: 2571176721

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |              |     1 |    13 |     0   (0)| 00:00:01 |
|   1 |  DELETE            | BOWIE_DAD    |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| SYS_C0010356 |     1 |    13 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access('ID'=1)

    
Statistics
----------------------------------------------------------
          7  recursive calls
          8  db block gets
       3629  consistent gets
          0  physical reads
        676  redo size
        863  bytes sent via SQL*Net to client
        830  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

 

We notice without the Bitmap Index in place, we are now performing many more (3629) consistent gets due to the necessary FTS.

So using a Bitmap Index to police a FK constraint doesn’t reduce the locking implications associated with deleting parent rows (with Bitmap indexes, we have locking hell regardless if there’s much DML) but it does at least reduce the overheads of checking the associated child table.

Indexing Foreign Keys (Helden) April 2, 2014

Posted by Richard Foote in Foreign Keys, Oracle Indexes.
6 comments

A recent question on an internal forum asked whether an index on a Foreign Key (FK) constraint designed to avoid locking issues associated with deletes on the parent tables needs to precisely match the columns in the FK. Could the columns in the index potentially be a different order or be appended with additional columns ?

The answer is basically the same as when using an index to police a Primary Key or Unique Key constraint. An index can be used providing the leading columns match those of the constraint (in any order). The index can indeed potentially have additional columns appended (or overloaded) to it.

Often the easiest way to find out these sorts of things is of course to just test it :) The point of this blog is not only to show candidate FK based indexes but also to highlight how easy it is to create simple test cases.

First, let’s create a couple of tables:

SQL> CREATE TABLE artists (id NUMBER,
                           code number,
                           artist_name VARCHAR2(30));

Table created.

SQL> CREATE TABLE albums (id NUMBER,
                          album_name VARCHAR2(30),
                          artist_id NUMBER ,
                          artist_code number,
                          format_id number);

Table created.

We populate the ARTISTS parent table with a few rows:

SQL> INSERT INTO artists VALUES (1, 1, 'DAVID BOWIE'); 

1 row created.

SQL> INSERT INTO artists VALUES (1, 2, 'ZIGGY STARDUST'); 

1 row created.

SQL> INSERT INTO artists VALUES (2, 1, 'MAJOR TOM');

1 row created.

SQL> INSERT INTO artists VALUES (2, 2, 'THIN WHITE DUKE');

1 row created.

We now populate the much larger ALBUMS child table with lots of rows:

SQL> insert into albums select rownum, 'BLAH', 1,  mod(rownum,2)+1, mod(rownum,100)
from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

Now the tables are populated, we can add the necessary constraints. A concatenated PK based on the ID and CODE columns on the ARTISTS table and an associated FK constraint on the ALBUMS table:

SQL> alter table artists add primary key (id, code); 

Table altered.

SQL> alter table albums add constraint artists_fk foreign key (artist_id, artist_code)
references artists(id, code);

Table altered.

OK, note at this point there is no index based on the FK constraint columns on the ALBUMS table. Let’s look at the number of consistent gets generated when we try to delete just a single row from the tiny ARTISTS table:

SQL> delete artists where id=2 and code = 1; 

1 row deleted.

Execution Plan
----------------------------------------------------------
Plan hash value: 898601404

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |              |     1 |    26 |     0   (0)| 00:00:01 |
|   1 |  DELETE            | ARTISTS      |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| SYS_C0010352 |     1 |    26 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=2 AND "CODE"=1)

Statistics
----------------------------------------------------------
          8  recursive calls
          7  db block gets
       3358  consistent gets
          0  physical reads
        640  redo size
        864  bytes sent via SQL*Net to client
        839  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          1  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> rollback;

Rollback complete.

We notice our first issue. In order to delete just one row from the table via a Unique Index scan, we performed a massive 3358 consistent gets. Why? Because we can only successfully delete this row if there are no corresponding FKs based on this parent row. Without an associated index, the only way Oracle can perform this check on the large child table is to perform an expensive, slow, Full Table Scan (FTS).

Let’s rollback and this time start with an insert into the child, ALBUMS table (but not yet commit): 

SQL> insert into albums values (1000001, 'HEATHEN', 1, 1, 1); 

1 row created.

In a second session, let’s now attempt to delete the a parent row from the ARTISTS table:

SQL> delete artists where id = 2;

 

We notice, this session now hangs while it waits for all current transactions on the ALBUMS table complete.

In a third session, we attempt to insert another row into the child, ALBUMS table:

SQL> insert into albums values (1000002, 'THE NEXT DAY', 1,2,3);

 

And we notice it hangs as well, due to the previous locks on the table. With lots of other transactions trying to make changes to the ALBUMS table getting locked as well, we effectively have locking hell …

Why ? Because Oracle needs some way to ensure while it runs the FTS looking for any FKs associated with the deleted parent row, no-one else comes in and inserts or updates a row with this FK value. And as Oracle is performing a slow FTS, and values could potentially be inserted or updated in an area of the table already checked, the only way to effectively achieve this is to exclusively lock the table. And exclusive table locks are not really that great from a concurrency point of view …

So, introducing the index on the FK column(s). By having an index in place, we can effectively address both of the above issues. When searching for a corresponding FK value, the index will very quickly direct us to the leaf block that will either:

  • find a value of the parent key being deleted (in which case the delete of the parent row will fail with an ORA-02292 that a child record has been found) or
  • not find the value being deleted, in which case the delete on the parent row can be successful

Additionally, as it’s a very fast index scan being performed, there is no need to exclusively lock the table. Oracle in fact effectively “locks” the location within the index where the index value would reside if it existed or were to be subsequently inserted. Only an attempt to insert/update a row into the child table with the specific deleted FK value would be locked until the point when the parent delete is either committed (in which case the child insert will fail with an ORA-02291 parent key not found) or rolled back (in which case the child insert will be successful). The FK index can effectively detect when such a child insert has taken place because unlike the table where a such a new row could potentially be anywhere within the table, such an insert can only occur in a specific location within the index.

So if you do potentially delete a parent record (or update the PK value, a rare thing to do which is logically equivalent to a delete/insert of the PK value), then it would be a good idea to create an appropriate index on the FK of the child tables.

So going back to our demo:

SQL> create index albums_fk_i on albums(artist_id, artist_code);

Index created.

If we now delete a parent row:

SQL> delete artists where id=2 and code = 1;

1 row deleted.

Execution Plan
----------------------------------------------------------
Plan hash value: 898601404

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |              |     1 |    26 |     0   (0)| 00:00:01 |
|   1 |  DELETE            | ARTISTS      |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| SYS_C0010352 |     1 |    26 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=2 AND "CODE"=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          8  db block gets
          1  consistent gets
          0  physical reads
          0  redo size
        864  bytes sent via SQL*Net to client
        839  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

We notice the number of consistent gets has dropped dramatically from the previous 3358. So no expensive FTS of the child table and none of the locking issues previously experienced.

But what if the index had the columns in a different order to that specified in the constraints:

SQL> rollback;

Rollback complete.

SQL> drop index albums_fk_i;

Index dropped.

SQL> create index albums_fk_i on albums(artist_code, artist_id);

Index created.

SQL> delete artists where id=2 and code = 1;

1 row deleted.

Execution Plan
----------------------------------------------------------
Plan hash value: 898601404

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |              |     1 |    26 |     0   (0)| 00:00:01 |
|   1 |  DELETE            | ARTISTS      |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| SYS_C0010352 |     1 |    26 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=2 AND "CODE"=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          8  db block gets
          1  consistent gets
          0  physical reads
          0  redo size
        864  bytes sent via SQL*Net to client
        839  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

Not a problem. The index contains all the columns of interest and can still be effectively used to quickly check for the existence of the deleted parent value.

What if the index had additional columns defined ?

SQL> rollback;

Rollback complete.

SQL> drop index albums_fk_i;

Index dropped.

SQL> create index albums_fk_i on albums(artist_code, artist_id, album_name);

Index created.

SQL> delete artists where id=2 and code = 1;

1 row deleted.

Execution Plan
----------------------------------------------------------
Plan hash value: 898601404

-----------------------------------------------------------------------------------
| Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | DELETE STATEMENT   |              |     1 |    26 |     0   (0)| 00:00:01 |
|   1 |  DELETE            | ARTISTS      |       |       |            |          |
|*  2 |   INDEX UNIQUE SCAN| SYS_C0010352 |     1 |    26 |     0   (0)| 00:00:01 |
-----------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ID"=2 AND "CODE"=1)

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          1  consistent gets
          0  physical reads
          0  redo size
        864  bytes sent via SQL*Net to client
        839  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          2  sorts (memory)
          0  sorts (disk)
          1  rows processed

Again not a problem. As the leading columns contain the FK columns of interest, Oracle can still effectively find the location within the index where the deleted value would be found if it existed in the child table.

So any index in which all the FK columns match the leading columns of the index would suffice.

And it’s really quite easy to create a quick demo to test this all out :)

Follow

Get every new post delivered to your Inbox.

Join 1,894 other followers