
Index Organized Tables – An Introduction to IOT Secondary Indexes (A Second Face) March 19, 2012

Posted by Richard Foote in Block Dumps, Index Internals, Index Organized Tables, IOT, Oracle Indexes, Secondary Indexes.

Man, it’s been ages since I had free time to update the blog, what with birthday parties to organise, Roger Waters concerts to attend and Radiohead concerts in the planning !! OK, time to take an initial look at Secondary Indexes for Index Organized Tables (IOTs).

If the IOT needs to be accessed via the Primary Key (PK) column(s), then no problem. The IOT structure must have a PK defined and the logical structure of the IOT ensures that data within the IOT is ordered based on the PK. Therefore, the IOT can be navigated like any conventional PK index and the necessary data can be efficiently accessed.

But what if we want to access the data efficiently via non-PK columns or without specifying the leading column of the PK ? Can we create secondary indexes on an IOT ?

When IOTs were first introduced way back in Oracle8, secondary indexes weren’t supported (they came later in 8i). That’s likely because Oracle first had to resolve a tricky issue in relation to indexing an IOT structure, namely what to do when indexing rows that potentially move around all the time ?

With a conventional Heap table, once a row is inserted into the table, it doesn’t generally subsequently move. There are relatively few examples of when this occurs, for example updating the partition key column of a row such that it needs to be stored in another partition. This is recognised as a rather expensive thing to do, as not only do at least two blocks need to be accessed and modified, but it also requires associated indexes to be updated as well. As such, it generally requires explicitly allowing such activities to occur (by enabling row movement and such). Note, when rows migrate to another block due to an increase in row size, indexes are not impacted and still reference the original block and the remaining stub of the row which points to the new block/location of the row.
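
As a quick illustration of what enabling row movement looks like (a sketch only, using a hypothetical range-partitioned table album_sales_part with a hypothetical sales_year partition key, neither of which are part of the demos below), an update that would move a row to another partition fails with ORA-14402 unless row movement has been explicitly enabled:

-- Hypothetical partitioned table; without this, an update that changes
-- the partition a row belongs to raises ORA-14402
ALTER TABLE album_sales_part ENABLE ROW MOVEMENT;

-- Oracle can now physically move the row into the correct partition,
-- updating all associated indexes along the way
UPDATE album_sales_part SET sales_year = 2012 WHERE album_id = 42;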

But with IOTs, the story can be very different. When a 50-50 index block split occurs, roughly half the rows in the leaf block move to a new block. A relatively expensive operation would be even more expensive if Oracle had to also update the index entries of all secondary indexes that referenced all these moved rows. Although rare with Heap tables, rows moving to new locations could be relatively common in an IOT due to associated 50-50 block split operations.

To deal with the difficulties of frequently moving rows within an IOT, Oracle created the IOT Secondary Index structure. It has three main components:

  • The indexed column values
  • The PK columns of the associated IOT
  • A “guess” that points to the physical location of the row within the IOT, correct as at the time the index entry was created

So the IOT Secondary Index is used in the following fashion. During an index scan, Oracle attempts to use the “guess” to access the block that was the last known physical location of the row within the IOT. If it finds the required row in the IOT, great. The index performs in a similar manner to using a rowid with a conventional secondary index. However, if the required row is nowhere to be seen within the referenced block, Oracle tries again, this time using the PK value contained within the IOT Secondary Index to perform a Unique Scan of the IOT. This is a little more expensive to perform as it requires navigating down the branch structures of the IOT, but it is at least guaranteed to find the row this time in its current location.

So in the best case scenario, the index performs similarly to a normal secondary index. In the worst case scenario where the row has moved, the index is forced to perform an additional Unique Scan of the IOT using the PK, but at least this has the potential to be much more efficient than a Fast Full Scan of the IOT in order to find the necessary row.

The key point to note here is that the secondary index is not updated when a block split on the parent IOT occurs. The “guess” via the physical pointer reference simply becomes stale and the PK which is also stored within the secondary index is used as a backup method of accessing the required row.
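
As an aside for those playing along at home, the accuracy of these guesses can be both monitored and refreshed without rebuilding the index. A quick sketch, using the secondary index created in the demo below (note PCT_DIRECT_ACCESS is only populated once statistics have been collected on the index):

-- Percentage of rows still directly accessible via a valid "guess"
SELECT index_name, pct_direct_access
FROM user_indexes
WHERE index_name = 'ALBUM_SALES_IOT_TOTAL_SALES_I';

-- Refresh the stale guesses in place, without a full index rebuild
ALTER INDEX album_sales_iot_total_sales_i UPDATE BLOCK REFERENCES;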

To start with a traditional simple little demo, let’s first create and populate an IOT:

SQL> CREATE TABLE album_sales_IOT(album_id number, country_id number, total_sales number, album_colour varchar2(20), CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX;

Table created.

SQL> begin
  2  for i in 1..5000 loop
  3    for c in 1..100 loop
  4      insert into album_sales_iot values (i, c, ceil(dbms_random.value(1,5000000)), 'GOLD');
  5    end loop;
  6  end loop;
  7  commit;
  8  end;
  9  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_IOT', cascade=> true, estimate_percent=> null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

If we now run a query based on the non-PK TOTAL_SALES column:

SQL> select * from album_sales_iot where total_sales = 2000;

  ALBUM_ID COUNTRY_ID TOTAL_SALES ALBUM_COLOUR
---------- ---------- ----------- --------------------
      1764         56        2000 GOLD

 
Execution Plan
----------------------------------------------------------
Plan hash value: 1789589470

-------------------------------------------------------------------------------------------
| Id  | Operation            | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT     |                    |     1 |    18 |   425   (1)| 00:00:06 |
|*  1 |  INDEX FAST FULL SCAN| ALBUM_SALES_IOT_PK |     1 |    18 |   425   (1)| 00:00:06 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("TOTAL_SALES"=2000)

 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       1586  consistent gets
          0  physical reads
          0  redo size
        757  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

We see that Oracle has no real choice (the PK is of no direct use here) but to perform an expensive INDEX FAST FULL SCAN, even though it correctly knows relatively few rows are to be retrieved.

If we create a secondary index on the IOT however:

SQL> create index album_sales_IOT_total_sales_i on album_sales_iot(total_sales);

Index created.

SQL> select * from album_sales_iot where total_sales = 2000;

  ALBUM_ID COUNTRY_ID TOTAL_SALES ALBUM_COLOUR
---------- ---------- ----------- --------------------
      1764         56        2000 GOLD

 
Execution Plan
----------------------------------------------------------
Plan hash value: 1433198708

---------------------------------------------------------------------------------------------------
| Id  | Operation         | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                               |     1 |    18 |     4   (0)| 00:00:01 |
|*  1 |  INDEX UNIQUE SCAN| ALBUM_SALES_IOT_PK            |     1 |    18 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN| ALBUM_SALES_IOT_TOTAL_SALES_I |     1 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TOTAL_SALES"=2000)
   2 - access("TOTAL_SALES"=2000)

 
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
          5  consistent gets
          5  physical reads
          0  redo size
        757  bytes sent via SQL*Net to client
        523  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

 
We notice that the index is used as expected and the number of consistent gets has dropped significantly. Notice also that the IOT is subsequently accessed not via index ROWIDs but by an INDEX UNIQUE SCAN via the IOT PK. More on this later …

If we look at a partial block dump of an index entry within the IOT Secondary index:

row#0[8015] flag: K-----, lock: 0, len=21
col 0; len 3; (3):  c2 1f 28
col 1; len 3; (3):  c2 15 37
col 2; len 2; (2):  c1 1b
tl: 8 fb: --H-FL-- lb: 0x0  cc: 1
col  0: [ 4]  01 01 41 da

col 0 represents the indexed value (TOTAL_SALES)

col 1 and col 2 represent the PK columns (ALBUM_ID and COUNTRY_ID)

Following the 3 byte table header overhead required for the “guess”, we have the second col 0, which represents the 4 byte “guess” to the last known physical location of the row.
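
As a quick sanity check, the dumped bytes can be decoded back into the actual numbers with UTL_RAW (a sketch; the hex strings below are simply the dumped byte values concatenated together):

SELECT utl_raw.cast_to_number('C21F28') AS total_sales,
       utl_raw.cast_to_number('C21537') AS album_id,
       utl_raw.cast_to_number('C11B')   AS country_id
FROM dual;

If I’ve read the exponent and digit bytes correctly, these should decode to 3039, 2054 and 26 respectively.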

Much more to follow shortly …

Index Organized Tables – PCTTHRESHOLD (The Wedding Song) February 8, 2012

Posted by Richard Foote in Block Dumps, Index Internals, Index Organized Tables, IOT, Oracle Indexes, Overflow Segment, PCTTHRESHOLD.

I’ve recently returned from a great two-week holiday, firstly at the Australian Open Tennis (what a final !!) and then up at the Gold Coast in not quite so sunny Queensland. Time now to get back to my blog :)

In my previous IOT examples, we had a very large column called Description which we didn’t really want to store within the Index Organized Table as it would cause the resultant index structure to get very inflated and inefficient. All the rows contained a very large Description value so it never made sense to include the Description column within the IOT.

In the following example, the Description column has values of varying lengths. Some of the values remain very large, however many of the Description values are quite moderate in size and wouldn’t be problematic to store within the IOT. Indeed, it would be quite beneficial as it wouldn’t be necessary to perform additional I/Os to the Overflow segment in cases where the Description was quite small in size and required by the application.

PCTTHRESHOLD gives us more flexibility in what is actually stored within the IOT index structure. The non-PK columns up to the INCLUDING clause are stored within the IOT, but only if the resultant row length is below a specified percentage threshold of the block size. So with a PCTTHRESHOLD of (say) 5, the non-PK columns up to the INCLUDING clause will be included within the IOT, but only if the resultant row size is less than 5% of the blocksize. If a row size were to be greater than the specified percentage threshold of the block size, then any non-PK columns that would violate this length threshold would not be included within the IOT and would instead be stored within the Overflow segment.
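
As a rough worked example, with the 8K blocksize used in these demos, a PCTTHRESHOLD of 5 equates to a row length threshold of approximately 5% x 8192 = 410 bytes. The blocksize in play can easily be confirmed:

-- 5% of an 8K block is roughly 410 bytes
SELECT value AS block_size FROM v$parameter WHERE name = 'db_block_size';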

In the following example, every other row is actually quite small and we would want these rows to have the Description value stored within the IOT. Therefore, we have modified the IOT table definition to include the Description column if the resultant row is less than 5% of the (8K in this case) blocksize:

SQL> CREATE TABLE album_sales_iot(album_id NUMBER, country_id NUMBER, total_sales NUMBER, description VARCHAR2(1000), CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX INCLUDING description PCTTHRESHOLD 5 OVERFLOW TABLESPACE bowie2;

Table created.

SQL> BEGIN
  2    FOR i in 1..5000 LOOP
  3      FOR c in 1..100 LOOP
  4         if mod(c,2) = 1 then
  5              INSERT INTO album_sales_iot VALUES(i, c, ceil(dbms_random.value(1,5000000)), 'A really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really long description');
  6         else INSERT INTO album_sales_iot VALUES(i, c, ceil(dbms_random.value(1,5000000)), 'A short description');
  7         end if;
  8      END LOOP;
  9    END LOOP;
 10    COMMIT;
 11  END;
 12  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_IOT', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

If we look at the size of the resultant IOT:

SQL> ANALYZE INDEX album_sales_iot_pk VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT blocks, height, br_blks, lf_blks FROM index_stats;

    BLOCKS     HEIGHT    BR_BLKS    LF_BLKS
---------- ---------- ---------- ----------
      2176          3          5       2052

The IOT is only of a moderate size, with 5 branch blocks and 2,052 leaf blocks.

If we look at the size of the Overflow segment:

SQL> SELECT object_id FROM user_objects WHERE object_name = 'ALBUM_SALES_IOT';

 OBJECT_ID
----------
     74209

SQL> SELECT table_name, iot_name, iot_type, blocks FROM user_tables WHERE table_name = 'SYS_IOT_OVER_74209';

TABLE_NAME         IOT_NAME         IOT_TYPE         BLOCKS
------------------ ---------------- ------------ ----------
SYS_IOT_OVER_74209 ALBUM_SALES_IOT  IOT_OVERFLOW      35715

We see that the vast majority of the storage is still allocated to the Overflow segment, at 35,715 blocks in size.

If we look at a partial block dump of an IOT leaf block:

Leaf block dump
===============
header address 461972060=0x1b89225c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 1
kdxcoopc 0x97: opcode=7: iot flags=I-- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 251
kdxcofbo 538=0x21a
kdxcofeo 561=0x231
kdxcoavs 23
kdxlespl 0
kdxlende 0
kdxlenxt 21053971=0x1414213
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[561] flag: K---S-, lock: 2, len=23
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 02
tl: 15 fb: --H-F--- lb: 0x0  cc: 1
nrid:  0x01811901.0
col  0: [ 5]  c4 04 57 1d 44
row#1[584] flag: K---S-, lock: 2, len=36
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 03
tl: 28 fb: --H-FL-- lb: 0x0  cc: 2
col  0: [ 4]  c3 1d 2a 2e
col  1: [19]  41 20 73 68 6f 72 74 20 64 65 73 63 72 69 70 74 69 6f 6e
row#2[620] flag: K---S-, lock: 2, len=23
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 04
tl: 15 fb: --H-F--- lb: 0x0  cc: 1
nrid:  0x01811901.1
col  0: [ 5]  c4 04 22 2d 07
row#3[643] flag: K---S-, lock: 2, len=37
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 05
tl: 29 fb: --H-FL-- lb: 0x0  cc: 2
col  0: [ 5]  c4 04 36 17 52
col  1: [19]  41 20 73 68 6f 72 74 20 64 65 73 63 72 69 70 74 69 6f 6e

We notice the leaf block contains 251 row entries. Half the rows, those with a Description of just 19 bytes, have the Description value stored within the IOT leaf block, while the other half, those with the larger Description values, contain an nrid that refers to the corresponding Description within the Overflow segment.
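
As a quick aside, the col 1 bytes above can be converted straight back to text with UTL_RAW if you want to confirm what’s actually stored (a sketch; the hex string is just the 19 dumped bytes concatenated together and should return ‘A short description’):

SELECT utl_raw.cast_to_varchar2('412073686F7274206465736372697074696F6E') AS description
FROM dual;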

If we analyze the table:

SQL> ANALYZE TABLE album_sales_iot COMPUTE STATISTICS;

Table analyzed.

SQL> SELECT table_name, num_rows, chain_cnt, blocks from user_tables WHERE table_name = 'ALBUM_SALES_IOT';

TABLE_NAME                       NUM_ROWS  CHAIN_CNT     BLOCKS
------------------------------ ---------- ---------- ----------
ALBUM_SALES_IOT                    500000     250000

We notice that only half the rows are now “chained rows”.
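
If memory serves, the standard chained row analysis can be used to identify exactly which rows these are (a sketch only; it assumes a CHAINED_ROWS table has already been created and, as an IOT uses logical rowids, the HEAD_ROWID column of that table needs to be of type UROWID rather than the ROWID created by the stock utlchain.sql script):

-- Populate the chained rows table, then see which rows spill over
ANALYZE TABLE album_sales_iot LIST CHAINED ROWS INTO chained_rows;

SELECT count(*) FROM chained_rows WHERE table_name = 'ALBUM_SALES_IOT';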

If we run a query that only references the rows with a small Description that are stored within the IOT structure:

SQL> SELECT * FROM album_sales_iot WHERE album_id = 42 and mod(country_id,2)=0;

50 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1834499174

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |     1 |   510 |     5   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_IOT_PK |     1 |   510 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
       filter(MOD("COUNTRY_ID",2)=0)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          7  consistent gets
          0  physical reads
          0  redo size
       2211  bytes sent via SQL*Net to client
        557  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         50  rows processed

The query at 7 consistent gets is relatively efficient as all the required data can be found within the IOT.

If however we run a query that references the larger Description rows:

SQL> SELECT * FROM album_sales_iot WHERE album_id = 42 and mod(country_id,2)=1;

50 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1834499174

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |     1 |   510 |     5   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_IOT_PK |     1 |   510 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
       filter(MOD("COUNTRY_ID",2)=1)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         18  consistent gets
          0  physical reads
          0  redo size
       4147  bytes sent via SQL*Net to client
        557  bytes received via SQL*Net from client
          5  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         50  rows processed

We see that it’s less efficient at 18 consistent gets, as it obviously needs to access a larger volume of data and requires additional I/Os to visit the corresponding Overflow segment.

So, with a combination of the INCLUDING and PCTTHRESHOLD clauses, one can control what data is and is not included within the IOT index structure.
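
As a final side note that I’ll hedge with a “check the docs first”, I believe PCTTHRESHOLD can also be altered after creation, although it would only apply to subsequently inserted or updated rows, with existing rows left where they are:

-- Assumed ALTER syntax; applies to subsequent DML only
ALTER TABLE album_sales_iot PCTTHRESHOLD 10;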

Index Organized Tables – Overflow Segment Part II (The Loneliest Guy) January 18, 2012

Posted by Richard Foote in Block Dumps, Index Internals, Index Organized Tables, IOT, Oracle Indexes, Overflow Segment, Primary Key.

In my previous post on Index Organized Tables (IOT), I introduced the concept of the IOT Overflow Segment, where we can store columns that we may not want to include within the actual IOT index structure. Before we move on, I just wanted to cover off a few additional points that could be a trap for the unwary …

In my experience, the Primary Key (PK) columns of a table are typically the first columns defined in the table. This has certainly been standard practice in most environments I’ve seen. This makes sense in that the PK columns are in many ways the “key” columns in the table and are identified as such by having the prestigious honour of being the first columns defined within the table. Most people look at and intuitively expect the first columns in the table to be the PK columns and for that reason alone, it’s probably good practice to consistently define the PK columns in this manner.

However, there’s also a good argument why having the PK columns as the leading columns in the table is precisely the wrong location for them. As many tables are “primarily” accessed via the PK columns and so accessed directly through the associated PK index, the application already knows the PK values of the row in question. It’s therefore somewhat inefficient to have the PK columns defined as the first columns in the table, as these generally have to be read through and ignored before we get to the non-PK columns that are of direct interest and the reason for visiting the table block in the first place. By placing the PK columns after the most accessed non-PK columns, we avoid having to unnecessarily read through these PK columns again when accessing the table via the PK index.

I personally prefer to define the PK columns first in a standardised manner, with the advantages of avoiding possible confusion and misunderstandings outweighing any possible performance improvements. However, I can at least see the logic and merit of not following this standard with Heap tables.

The same however cannot really be said for IOTs and I would strongly recommend defining the PK columns first in an IOT …

I’m going to run the same demo as I did in my last post on the Overflow Segment, but with one subtle change. I’m not going to define the two PK columns first but rather have them defined after my heavily accessed non-PK column:

SQL> CREATE TABLE album_sales_iot(total_sales NUMBER, album_id NUMBER, country_id NUMBER, description VARCHAR2(1000), CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX INCLUDING country_id OVERFLOW TABLESPACE bowie2;

Table created.

So in this example, my leading column is the non-PK total_sales column, followed then by the two PK columns. I still only want these 3 columns to be included in the actual IOT structure, so I have my INCLUDING clause only including columns up to the country_id column. I want the remaining large description column to be stored separately in an Overflow segment.

OK, let’s populate this table with the same data we used previously:

SQL> BEGIN
  2    FOR i in 1..5000 LOOP
  3      FOR c in 1..100 LOOP
  4         INSERT INTO album_sales_iot VALUES(ceil(dbms_random.value(1,5000000)), i, c, 'A really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really long description');
  5      END LOOP;
  6    END LOOP;
  7    COMMIT;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_IOT', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

If we describe this table, we get the expected listing:


SQL> desc album_sales_iot
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 TOTAL_SALES                                        NUMBER
 ALBUM_ID                                  NOT NULL NUMBER
 COUNTRY_ID                                NOT NULL NUMBER
 DESCRIPTION                                        VARCHAR2(1000)

With the columns listed in the order as we defined them in the table.

If we query the column details from dba_tab_columns:

SQL> select column_id, column_name from dba_tab_columns where table_name = 'ALBUM_SALES_IOT' order by column_id;

 COLUMN_ID COLUMN_NAME
---------- ------------------------------
         1 TOTAL_SALES
         2 ALBUM_ID
         3 COUNTRY_ID
         4 DESCRIPTION

We again find the column order is as we defined them in the table.

When we run the same query we ran last time that returned the data with 5 consistent gets:

SQL> set arraysize 100
SQL> select album_id, country_id, total_sales from album_sales_iot where album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1834499174

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |   100 |  1300 |    18   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_IOT_PK |   100 |  1300 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         20  consistent gets
          0  physical reads
          0  redo size
       2394  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

We notice that performance is a lot worse, with 20 consistent gets now required. Obviously, something has changed unexpectedly ???

The first clue on what’s going on here can be found by looking at dba_tab_cols:

SQL> select column_id, segment_column_id, column_name from dba_tab_cols where table_name = 'ALBUM_SALES_IOT' order by column_id;

 COLUMN_ID SEGMENT_COLUMN_ID COLUMN_NAME
---------- ----------------- ------------------------------
         1                 3 TOTAL_SALES
         2                 1 ALBUM_ID
         3                 2 COUNTRY_ID
         4                 4 DESCRIPTION

The SEGMENT_COLUMN_ID column determines the order of the columns as they’re actually stored within the segment and we notice the column order is different. The two PK columns are listed first, with the total_sales column only listed in the 3rd position.

As discussed in the IOT Introduction post, the structure of an index entry in an IOT has the PK columns as the leading columns, followed by the non-PK columns in the table portion. This is critical because the PK columns determine the location within the IOT table where new rows need to be inserted and the subsequent ordering of the rows in the table. As such, the PK columns must always be the leading columns of an IOT, despite how the table is actually defined at creation time. If the PK columns are not listed first in the table creation DDL statement, Oracle will automatically re-order the columns and place the PK columns first regardless.

This now has consequences for the INCLUDING clause if specified. In the above table creation statement, the INCLUDING clause specified the country_id column. Although defined as the third column, as it’s a PK column, Oracle has automatically re-ordered the columns such that it’s physically listed as the second column within the IOT segment. Unfortunately, the INCLUDING clause is only applied after this re-ordering of the columns. As such, the total_sales column, which is now logically listed third (after the country_id column), is not actually included in the IOT index structure as (perhaps) intended.

A partial block dump of an IOT leaf block will confirm this:

Leaf block dump
===============
header address 298590812=0x11cc225c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x90: opcode=0: iot flags=I-- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 421
kdxcofbo 878=0x36e
kdxcofeo 879=0x36f
kdxcoavs 1
kdxlespl 0
kdxlende 0
kdxlenxt 21052811=0x1413d8b
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[879] flag: K-----, lock: 0, len=17
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 02
tl: 9 fb: --H-F--- lb: 0x0  cc: 0
nrid:  0x01811911.0
row#1[896] flag: K-----, lock: 0, len=17
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 03
tl: 9 fb: --H-F--- lb: 0x0  cc: 0
nrid:  0x01811911.1

As we can see, the IOT row entries only consist of the PK columns and the row reference to the corresponding Overflow segment. None of the non-PK columns (such as total_sales) are co-located within the IOT segment as the table column count is 0 (cc: 0).

As a result, additional consistent gets are now required to fetch the total_sales column from the Overflow segment to satisfy the query. This explains why the query is now less efficient than it was previously.

My recommendation with regard to defining IOTs is to simply list the PK columns first. This will ensure the INCLUDING clause is applied as intended and will generally reduce confusion and misunderstandings. Otherwise, the INCLUDING clause needs to specify a Non-PK column to ensure more than just the PK columns are actually included in the IOT segment, the consequences of which may not be obvious to the casual observer of the DDL or describer of the table.
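
To make that concrete, either of the following definitions would have resulted in the total_sales column actually being stored within the IOT segment as intended (DDL sketches only, not re-run here):

-- Option 1: define the PK columns first (recommended)
CREATE TABLE album_sales_iot(album_id NUMBER, country_id NUMBER, total_sales NUMBER, description VARCHAR2(1000), CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX INCLUDING total_sales OVERFLOW TABLESPACE bowie2;

-- Option 2: keep the original column order but name the non-PK
-- total_sales column in the INCLUDING clause
CREATE TABLE album_sales_iot(total_sales NUMBER, album_id NUMBER, country_id NUMBER, description VARCHAR2(1000), CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX INCLUDING total_sales OVERFLOW TABLESPACE bowie2;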

Jonathan Lewis, a great source of information on indexes and Oracle in general, has previously discussed this same IOT Trap on his blog.

Index Organized Tables – Overflow Segment (Shadow Man) January 13, 2012

Posted by Richard Foote in Block Dumps, Index Internals, Index Organized Tables, IOT, Oracle Indexes, Overflow Segment.

In my previous introductory IOT post, I illustrated how an Index Organized Table (IOT) might be worth consideration if most or all columns in a table were to be included within an index.

I’m going to use a slightly different demo this time, replacing one of the columns with a much larger DESCRIPTION column, one which is rarely accessed by the application:

SQL> CREATE TABLE album_sales_details_iot(album_id NUMBER, country_id NUMBER, total_sales NUMBER, description VARCHAR2(1000), CONSTRAINT album_sales_det_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX;

Table created.

SQL> BEGIN
  2    FOR i in 1..5000 LOOP
  3      FOR c in 1..100 LOOP
  4         INSERT INTO album_sales_details_iot VALUES(i, c, ceil(dbms_random.value(1,5000000)), 'A really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really long description');
  5       END LOOP;
  6    END LOOP;
  7    COMMIT;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_DETAILS_IOT', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

Sorry for the unimaginative manner of loading the description field but you get the point :)

OK, let’s have a look at the size of the IOT:

SQL> ANALYZE INDEX album_sales_det_pk VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT blocks, height, br_blks, lf_blks FROM index_stats;

    BLOCKS     HEIGHT    BR_BLKS    LF_BLKS
---------- ---------- ---------- ----------
     71680          3        116      71429

As expected, the IOT is quite large as it has to accommodate the very large Description field within the IOT index structure. At 71,429 leaf blocks for the 500,000 rows in the table, that’s just 7 rows on average per leaf block.

The application doesn’t generally access the Description column, with the following query being typical (Note: to make fetching data as efficient as possible, I’ve set the arraysize to 100):

SQL> set arraysize 100
SQL> SELECT album_id, country_id, total_sales FROM album_sales_details_iot WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 521866300

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |   100 |  1300 |    17   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_DET_PK |   100 |  1300 |    17   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         19  consistent gets
          0  physical reads
          0  redo size
       2387  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

The query requires 19 consistent gets to retrieve the 100 rows because even though the data is extremely well clustered, there are very few rows per leaf block.

If we look at a partial block dump of one of these IOT leaf blocks:

Leaf block dump
===============
header address 548373084=0x20af825c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x90: opcode=0: iot flags=I-- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 7
kdxcofbo 50=0x32
kdxcofeo 1011=0x3f3
kdxcoavs 961
kdxlespl 0
kdxlende 0
kdxlenxt 20978307=0x1401a83
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[1011] flag: K-----, lock: 0, len=1004
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 02
tl: 996 fb: --H-FL-- lb: 0x0  cc: 2
col  0: [ 5]  c4 04 05 3b 03
col  1: [984]
 41 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 6c 6f 6e 67 20 64 65
 73 63 72 69 70 74 69 6f 6e

We can see the leaf block only has 7 rows, with the vast majority of space taken up by the very large Description column.

Considering the Description column is so large and/or so rarely accessed, wouldn’t it be nice if we didn’t have to store this column directly within the IOT index structure itself ?

Enter the IOT Overflow segment. The IOT Overflow segment enables us to store in another physical location those columns that we don’t necessarily want to store directly within the IOT index structure. So those columns that might be particularly large (or just the occurrences of those columns when the specific values might be too large to store within the IOT index structure) or those columns that are rarely accessed can be stored elsewhere. Effectively, we’re back to having a separate “table” like structure, except that unlike a normal Heap table, in which all columns are stored within the table segment, the Overflow segment will only hold those columns that we don’t necessarily want to store within the index structure.

There are a number of different methods we could use (to be explored further in future posts), for now I’ll use the INCLUDING clause:

SQL> CREATE TABLE album_sales_details_iot2(album_id NUMBER, country_id NUMBER, total_sales NUMBER, description VARCHAR2(1000), CONSTRAINT album_sales_det_pk2 PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX INCLUDING total_sales OVERFLOW TABLESPACE bowie2;

Table created.

So in the above example, all columns up to and “including” the total_sales column will be included in the IOT index structure. All the following columns listed in the table definition (in this case just the Description column) will be stored in the Overflow segment, which in the above example will be created within the BOWIE2 tablespace.

If we now populate this table with the identical data as before:

SQL> BEGIN
  2    FOR i in 1..5000 LOOP
  3      FOR c in 1..100 LOOP
  4         INSERT INTO album_sales_details_iot2 VALUES(i, c, ceil(dbms_random.value(1,5000000)), 'A really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really long description');
  5       END LOOP;
  6    END LOOP;
  7    COMMIT;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_DETAILS_IOT2', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> ANALYZE INDEX album_sales_det_pk2 VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT blocks, height, br_blks, lf_blks FROM index_stats;

    BLOCKS     HEIGHT    BR_BLKS    LF_BLKS
---------- ---------- ---------- ----------
      1664          3          4       1613

We notice the IOT index structure is now significantly smaller, down from 71,429 to just 1,613 leaf blocks. All the “clutter” has now been removed and is stored elsewhere.

If we now re-run our query:

SQL> SELECT album_id, country_id, total_sales FROM album_sales_details_iot2 WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2379894191

----------------------------------------------------------------------------------------
| Id  | Operation        | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                     |   100 |  1300 |    18   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_DET_PK2 |   100 |  1300 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
       2390  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

Things are now much more efficient, having reduced the required consistent gets down from 19 to just 5 consistent gets.

If we now look at a partial block dump of an IOT leaf block:

Leaf block dump
===============
header address 441197148=0x1a4c225c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x90: opcode=0: iot flags=I-- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 322
kdxcofbo 680=0x2a8
kdxcofeo 703=0x2bf
kdxcoavs 23
kdxlespl 0
kdxlende 0
kdxlenxt 21049987=0x1413283
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[703] flag: K-----, lock: 0, len=23
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 02
tl: 15 fb: --H-F--- lb: 0x0  cc: 1
nrid:  0x01800081.0
col  0: [ 5]  c4 02 5e 0d 25
row#1[726] flag: K-----, lock: 0, len=23
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 03
tl: 15 fb: --H-F--- lb: 0x0  cc: 1
nrid:  0x01800081.1
col  0: [ 5]  c4 04 41 13 43

We can see the number of index entries in the leaf block has increased from 7 to 322, with the size of the index entry decreasing from 1004 to just 23 bytes. Instead of the Description column being stored within the leaf block, we now have an nrid entry consisting of a 6 byte relative block address and row directory number (0x01800081.0), which effectively points to the actual location of the remaining portion of the row within the Overflow segment. The table column count is therefore only 1 (cc: 1).

To find out more about the corresponding Overflow segment, we first must determine the OBJECT_ID of the IOT:

SQL> SELECT object_id FROM user_objects WHERE object_name = 'ALBUM_SALES_DETAILS_IOT2';

 OBJECT_ID
----------
     74116

This OBJECT_ID is used to name the corresponding Overflow segment which we can determine from DBA_TABLES as it has a format of SYS_IOT_OVER_object_id:

SQL> SELECT table_name, iot_name, iot_type, blocks FROM user_tables WHERE table_name = 'SYS_IOT_OVER_74116';

TABLE_NAME         IOT_NAME                 IOT_TYPE      BLOCKS
------------------ ------------------------ ------------ -------
SYS_IOT_OVER_74116 ALBUM_SALES_DETAILS_IOT2 IOT_OVERFLOW   71430

We notice this Overflow segment (at 71,430 blocks) is where the majority of our storage has been allocated.

Although it’s listed as a table, the Overflow segment can’t be directly accessed or manipulated. Any attempt to do so will result in an error:

SQL> select * from SYS_IOT_OVER_74116;
select * from SYS_IOT_OVER_74116
              *
ERROR at line 1:
ORA-25191: cannot reference overflow table of an index-organized table

If we look at a partial block dump of the Overflow segment block referenced in the previous IOT block dump:

Block header dump:  0x01800081
 Object id on Block? Y
 seg/obj: 0x12185  csc: 0x00.17482cc  itc: 1  flg: -  typ: 1 - DATA
     fsl: 0  fnx: 0x0 ver: 0x01
 
 Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x000a.00b.0000a919  0x00c24a2e.03d2.2a  C---    0  scn 0x0000.01748279
bdba: 0x01800081
data_block_dump,data header at 0x1a4c2244
===============
tsiz: 0x1fb8
hsiz: 0x20
pbl: 0x1a4c2244
     76543210
flag=--------
ntab=1
nrow=7
frre=-1
fsbo=0x20
fseo=0x4a6
avsp=0x486
tosp=0x486
0xe:pti[0] nrow=7 offs=0
0x12:pri[0] offs=0x1bda
0x14:pri[1] offs=0x17fc
0x16:pri[2] offs=0x141e
0x18:pri[3] offs=0x1040
0x1a:pri[4] offs=0xc62
0x1c:pri[5] offs=0x884
0x1e:pri[6] offs=0x4a6
block_row_dump:
tab 0, row 0, @0x1bda
tl: 990 fb: -----L-- lb: 0x0  cc: 1
col  0: [984]
 41 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20
 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c
 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72
 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c
 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65
 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79
 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61
 6c 6c 79 20 72 65 61 6c 6c 79 20 72 65 61 6c 6c 79 20 6c 6f 6e 67 20 64 65
 73 63 72 69 70 74 69 6f 6e

We notice the Overflow block contains 7 rows as we would expect, as this was all the IOT segment could previously manage when it had to store the large Description column values.

The table row directory contains 7 rows, with the first row (#0) having an offset at address 0x1bda, which is the actual location of the first row within the Overflow block.

Therefore, in order to find a specific Description column value of interest from the IOT, Oracle references the (say) nrid: 0x01800081.0 within the IOT index entry for the row. This in turn points to the relative block address (0x01800081) of the Overflow block containing the description, with the corresponding row directory number (0) then specifying the offset (in this case 0x1bda) to the actual location of the Description value within the Overflow block. Easy !!
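
If you’d like to crack an nrid open yourself, DBMS_UTILITY can decode the data block address portion (a quick sketch using the value above, which if my arithmetic is right should report relative file 6, block 129):

SELECT dbms_utility.data_block_address_file(to_number('01800081', 'XXXXXXXX')) AS rel_fno,
       dbms_utility.data_block_address_block(to_number('01800081', 'XXXXXXXX')) AS block_no
FROM dual;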

If we Analyze the IOT table:

SQL> ANALYZE TABLE album_sales_details_iot2 COMPUTE STATISTICS;

Table analyzed.

SQL> SELECT table_name, num_rows, chain_cnt, blocks from user_tables WHERE table_name = 'ALBUM_SALES_DETAILS_IOT2';

TABLE_NAME                       NUM_ROWS  CHAIN_CNT     BLOCKS
------------------------------ ---------- ---------- ----------
ALBUM_SALES_DETAILS_IOT2           500000     500000

We notice all the rows are listed as “Chained Rows”. This is because all the rows have a corresponding Description value stored in the Overflow segment and so the rows are not stored within the one block. As the previous query illustrated, this is no bad thing if we don’t need to reference these additional columns stored in the Overflow segment. It makes the resultant IOT table more compact and efficient to access.

However, on those (hopefully) rarer occasions when we do need to access the columns in the Overflow segment, this will clearly require additional block accesses:

SQL> SELECT * FROM album_sales_details_iot2 WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2379894191

----------------------------------------------------------------------------------------
| Id  | Operation        | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                     |   100 | 99400 |    18   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_DET_PK2 |   100 | 99400 |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         32  consistent gets
          0  physical reads
          0  redo size
       5541  bytes sent via SQL*Net to client
        590  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

The above query which returns the Description column sees the consistent gets increase to 32, up from the 5 consistent gets when the Description wasn’t accessed and the 19 consistent gets when the Description column was co-located within the IOT segment. But this is a price we might be willing to pay if this query isn’t frequently executed, while the frequently executed queries which don’t access the Description column are more efficient.

The Overflow segment gives us, in a manner, “the best of both worlds”: the ability to store just those columns of interest within the IOT segment (although these must always include all the Primary Key columns), while those that are less often accessed or too large to be efficiently stored within the IOT can be stored elsewhere. Effectively, it’s an index and table relationship, except the table doesn’t have to store again the columns that are already stored within the index.

It’s all good news so far for IOTs …

Index Organized Tables – An Introduction Of Sorts (Pyramid Song) January 10, 2012

Posted by Richard Foote in Block Dumps, CBO, Index Internals, Index Organized Tables, IOT, Oracle Indexes, Primary Key.

Thought it was high time that I covered in a little detail the subject of Index Organized Tables (IOTs). When used appropriately, they can be an extremely useful method of storing and accessing data. Hopefully by the end of this series, you’ll have a better understanding of IOTs, their respective strengths and weaknesses and so perhaps be in a better position to take advantage of them when appropriate.

As I mentioned in a previous post, Martin Widlake has recently written an excellent series on IOTs, which I highly recommend. I’ll try to cover differing aspects of IOTs that will hopefully be of interest.

To start, let’s cover a very basic little example.

Let’s begin by creating and populating a simple Heap Table that holds information about musical albums (Note: using an 8K blocksize in an MSSM tablespace):

SQL> CREATE TABLE album_sales(album_id number, country_id number, total_sales number, album_colour varchar2(20),
  2  CONSTRAINT album_sales_pk PRIMARY KEY(album_id, country_id));

Table created.

SQL> BEGIN
  2    FOR i IN 1..5000 LOOP
  3      FOR c IN 1..100 LOOP
  4        INSERT INTO album_sales VALUES (i, c, ceil(dbms_random.value(1,5000000)), 'GOLD');
  5      END LOOP;
  6    END LOOP;
  7    COMMIT;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES', cascade=> true, estimate_percent=> null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

We have a natural Primary Key that consists of two columns and an additional two columns of information.

Let’s look at some basic sizing information on the table and associated Primary Key index:

SQL> SELECT blocks, empty_blocks, IOT_TYPE FROM dba_tables WHERE table_name = 'ALBUM_SALES';

    BLOCKS EMPTY_BLOCKS IOT_TYPE
---------- ------------ ------------
      1570            0

SQL> ANALYZE INDEX album_sales_pk VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT BLOCKS, BR_BLKS, LF_BLKS FROM index_stats;

    BLOCKS    BR_BLKS    LF_BLKS
---------- ---------- ----------
      1152          3       1062

So the table segment consists of 1570 blocks and the index segment 1152 blocks, of which 1062 are leaf blocks.

OK, let’s run a basic query looking for all albums with an album_id=42:

SQL> SELECT * FROM album_sales WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3244723662

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |   100 |  1800 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ALBUM_SALES    |   100 |  1800 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ALBUM_SALES_PK |   100 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         18  consistent gets
          0  physical reads
          0  redo size
       4084  bytes sent via SQL*Net to client
        589  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

As we can see, things are pretty good. 18 consistent gets in order to return 100 rows isn’t bad at all. Clearly, the index has a good Clustering Factor and can retrieve the 100 required rows in a relatively efficient manner.

However, this is a very frequently executed query and we want to do even better. One thing we notice is that there are only a couple of columns in the table which are not part of the index. Perhaps if we include these columns in the index as well, we can use the index to extract all the required data and thus eliminate the need to visit the table segment at all. Overloading an index in this manner is a common tuning technique and will hopefully reduce the number of logical I/Os required to run the query.

We can do this by dropping and recreating the index with all the columns, making sure the PK columns remain the leading columns. This will ensure the index can still be used to police the PK constraint:

SQL> ALTER TABLE album_sales DROP PRIMARY KEY;

Table altered.

SQL> CREATE INDEX album_sales_pk_i ON album_sales(album_id, country_id, total_sales, album_colour) COMPUTE STATISTICS;

Index created.

SQL> ALTER TABLE album_sales ADD constraint album_sales_pk PRIMARY KEY(album_id, country_id);

Table altered.

OK, so the index now contains all the columns in the table and is used to police the PK constraint:

SQL> select constraint_name, constraint_type, index_name from dba_constraints where constraint_name = 'ALBUM_SALES_PK';

CONSTRAINT_NAME                C INDEX_NAME
------------------------------ - ------------------------------
ALBUM_SALES_PK                 P ALBUM_SALES_PK_I

Let’s now look at the size of the index:

SQL> ANALYZE INDEX album_sales_pk_i VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT BLOCKS, BR_BLKS, LF_BLKS FROM index_stats;

    BLOCKS    BR_BLKS    LF_BLKS
---------- ---------- ----------
      2048          5       2006

OK, as expected the index is now somewhat larger as it needs to accommodate the extra columns. The overall number of blocks allocated to the index is now 2048, with leaf blocks increasing from 1062 to 2006.

If we now re-run the query:

SQL> SELECT * FROM album_sales WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1126128764

-------------------------------------------------------------------------------------
| Id  | Operation        | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                  |   100 |  1800 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_PK_I |   100 |  1800 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         11  consistent gets
          0  physical reads
          0  redo size
       3568  bytes sent via SQL*Net to client
        589  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

We notice things have indeed improved and we have reduced the number of consistent gets from 18 down to just 11. Not a bad improvement !!

If we look at a partial block dump of one of the index leaf blocks:

Leaf block dump
===============
header address 484409948=0x1cdf825c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 5
kdxcosdc 0
kdxconro 258
kdxcofbo 552=0x228
kdxcofeo 1373=0x55d
kdxcoavs 821
kdxlespl 0
kdxlende 0
kdxlenxt 20972941=0x140058d
kdxleprv 20972939=0x140058b
kdxledsz 0
kdxlebksz 8036
row#0[8010] flag: ------, lock: 0, len=26
col 0; len 2; (2):  c1 07
col 1; len 2; (2):  c1 12
col 2; len 5; (5):  c4 04 15 31 59
col 3; len 4; (4):  47 4f 4c 44
col 4; len 6; (6):  01 40 05 82 00 b7
row#1[7984] flag: ------, lock: 0, len=26
col 0; len 2; (2):  c1 07
col 1; len 2; (2):  c1 13
col 2; len 5; (5):  c4 03 19 2c 3d
col 3; len 4; (4):  47 4f 4c 44
col 4; len 6; (6):  01 40 05 82 00 b8

We notice that each leaf entry is 26 bytes in length. The lengths of the four columns add up to 13 bytes. The remaining 13 bytes are basically overhead required for each index entry:

2 bytes for flag and lock information in the index entry header

5 x 1 byte for each of the length bytes for each column

6 bytes for the 5th index column which is the index rowid

So that’s 13 bytes of overhead per index entry in this example index.

Well, everything is currently pretty good. We have the application now performing approximately 40% less work than previously. But we have one little issue. With the index now consisting of all the columns in the table and with the application using the index exclusively, what’s the point of having the table at all? It’s wasting storage and resources, maintained for no purpose other than to exist so that the index can in turn exist.

Wouldn’t it be nice if we could somehow just have the index, but without the underlying table. Enter the Index Organized Table (IOT), first introduced way back in Oracle 8.0. It’s basically an index structure that can exist without the need for an underlying table. The index structure itself is the table by which we can store and retrieve the necessary data.

OK, let’s now create a new version of this table with the same data, but this time as an IOT:

SQL> CREATE TABLE album_sales_IOT(album_id number, country_id number, total_sales number, album_colour varchar2(20),
     CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX;

Table created.

SQL> BEGIN
  2    FOR i IN 1..5000 LOOP
  3      FOR c in 1..100 LOOP
  4        INSERT INTO album_sales_IOT VALUES (i, c, ceil(dbms_random.value(1,5000000)), 'GOLD');
  5      END LOOP;
  6    END LOOP;
  7    COMMIT;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_IOT', cascade=> true, estimate_percent=> null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

The key clause here is ORGANIZATION INDEX. I’ll discuss other options and syntax in coming posts.

If we look now at the table segment:

SQL> SELECT blocks, empty_blocks, IOT_TYPE FROM dba_tables
  2  WHERE table_name = 'ALBUM_SALES_IOT';

    BLOCKS EMPTY_BLOCKS IOT_TYPE
---------- ------------ ------------
                        IOT

We see there is an IOT listed, but it consists of no blocks as the table segment doesn’t physically exist …
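If we want to confirm where the data is actually stored, a quick look in DBA_SEGMENTS (assuming the BOWIE schema used throughout) should reveal an index segment named after the PK constraint, but no corresponding table segment, as the IOT index structure effectively is the table:

SQL> SELECT segment_name, segment_type FROM dba_segments
  2  WHERE owner = 'BOWIE' AND segment_name LIKE 'ALBUM_SALES_IOT%';

The only segment returned should be the ALBUM_SALES_IOT_PK index segment.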

If we look at the size of the corresponding index:

SQL> SELECT index_name, table_name, blevel, leaf_blocks FROM dba_indexes
  2  WHERE table_name = 'ALBUM_SALES_IOT';

INDEX_NAME           TABLE_NAME       BLEVEL LEAF_BLOCKS
-------------------- --------------- ------- -----------
ALBUM_SALES_IOT_PK   ALBUM_SALES_IOT       2        1550

SQL> ANALYZE INDEX album_sales_iot_pk VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT BLOCKS, BR_BLKS, LF_BLKS FROM index_stats;

    BLOCKS    BR_BLKS    LF_BLKS
---------- ---------- ----------
      1664          4       1550

We notice it’s smaller than the corresponding overloaded index for the Heap Table. The previous index consisted of 2048 blocks and 2006 leaf blocks but this index is somewhat smaller at just 1664 blocks and 1550 leaf blocks.

If we take a look at a partial block dump of a leaf block from the IOT:

Leaf block dump
===============
header address 483926620=0x1cd8225c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 1
kdxcoopc 0x90: opcode=0: iot flags=I-- is converted=Y
kdxconco 2
kdxcosdc 2
kdxconro 336
kdxcofbo 708=0x2c4
kdxcofeo 710=0x2c6
kdxcoavs 2
kdxlespl 0
kdxlende 0
kdxlenxt 20976645=0x1401405
kdxleprv 20976643=0x1401403
kdxledsz 0
kdxlebksz 8036
row#0[710] flag: K---S-, lock: 2, len=22
col 0; len 2; (2):  c1 08
col 1; len 2; (2):  c1 49
tl: 14 fb: --H-FL-- lb: 0x0  cc: 2
col  0: [ 5]  c4 04 2f 10 59
col  1: [ 4]  47 4f 4c 44
row#1[732] flag: K---S-, lock: 2, len=22
col 0; len 2; (2):  c1 08
col 1; len 2; (2):  c1 4a
tl: 14 fb: --H-FL-- lb: 0x0  cc: 2
col  0: [ 5]  c4 03 01 03 46
col  1: [ 4]  47 4f 4c 44

Firstly, we notice it’s definitely an IOT block dump as the IOT flag is set.

The structure of the index entry is somewhat different here. It basically consists of:

2 bytes for lock and flag info in the index header as previously

Next come the two Primary Key columns with their corresponding length bytes. Note an IOT must have a PK defined.

Following are 3 bytes for the table header consisting of a lock byte, flag byte and a byte to denote the number of table (non PK) columns (in this case 2).

Followed finally by the 2 Non-PK columns and their corresponding length bytes.

Note the big missing component compared to the previous block dump: there is no rowid with its corresponding length byte. No need for a rowid if there’s no corresponding table to point down to …

So the overall overhead has been reduced to:

2 bytes for the index header

3 bytes for the table header

4 bytes for the 4 column lengths

for a total of 9 bytes, 4 less than the 13 bytes of overhead required in the previous example. So the total length of an index entry has reduced from 26 bytes down to just 22 bytes. Hence the overall reduction in the size of the corresponding IOT index.

So we have saved 1570 table blocks and 384 index blocks in total.

If we now re-run the same query:

SQL> SELECT * FROM album_sales_iot WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1834499174

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |   100 |  1800 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_IOT_PK |   100 |  1800 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)

 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
       3575  bytes sent via SQL*Net to client
        589  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

Not only have we saved ourselves some storage and having to maintain two physical segments, but things are a tad more efficient as well, reducing the number of consistent gets down from 11 to 10 as the corresponding index segment we need to access is smaller …

Enough to start with for now and yes the pun in the title is fully intended :)

Curious Case Of The Ever Increasing Index Solution (A Big Hurt) January 5, 2012

Posted by Richard Foote in ASSM, Indexing Myth, Oracle Indexes, Quiz.
6 comments

Based on the excellent comments in the Quiz post, we have some clever cookies out there :)

I guess the first thing to point out is that, based on the basic scenario provided, the index shouldn’t ordinarily be continually growing in this fashion. Although the index values are monotonically increasing, the deletions are leaving behind fully emptied leaf blocks which can generally be recycled in future block splits.

OK, so why is this index behaving in this fashion, continuing to increase in size while the number of rows in the table remains relatively constant ?

Well, there are a number of contributing factors.

As stated, the index values are indeed monotonically increasing, so all inserts into the index hit the right-most leaf block within the index structure, while the deletions, which all occur on the “left side” of the index, leave behind leaf blocks that contain nothing but deleted index entries. As Marcus and David mentioned in the comments, the use of a Reverse Key index would therefore alleviate this problem, as subsequent inserts would be performed within the same leaf blocks in the index structure, automatically cleaning out and reusing space used by previously deleted index entries. This solution though may create as many problems as it solves (if, say, range predicate statements relied on a Non-Reverse index).
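By way of illustration only, converting the existing index would simply be a matter of a rebuild (using the index name from the quiz demo):

SQL> ALTER INDEX bowie_pk REBUILD REVERSE;

with subsequent inserts then being scattered throughout the index structure rather than all hitting the right-most leaf block.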

Additionally, the processing is being performed within a PL/SQL block. Oracle has a whole lot of smarts to make PL/SQL as efficient as possible, so the manner in which code runs within PL/SQL compared to other native languages can vary significantly. Unfortunately, at times these smarts might not be quite so smart after all.

The tablespace used to store the index is a Locally Managed Tablespace (LMT) with Automatic Segment Space Management (ASSM). Instead of freelists (or freelist groups), Oracle uses a set of bitmap blocks within the index segment to determine the amount of free space available within its blocks and whether a block is available for inserts. As Vyacheslav and Alberto highlighted in the comments, there are a number of “issues” in the manner in which these bitmap blocks are maintained within PL/SQL processing. This effectively locks out the vast majority of these now effectively empty leaf blocks from being recycled and reconsidered for subsequent inserts. Rebuilding the index in a Manual Segment Space Management (MSSM) tablespace would also alleviate this issue.
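For example (the tablespace name, datafile and sizing are purely illustrative), moving the index into an MSSM tablespace might look something like:

SQL> CREATE TABLESPACE bowie_mssm DATAFILE '/u01/oradata/bowie_mssm01.dbf' SIZE 100M
  2  SEGMENT SPACE MANAGEMENT MANUAL;

SQL> ALTER INDEX bowie_pk REBUILD TABLESPACE bowie_mssm;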

The actual processing involved within the PL/SQL block can also have an impact. The procedure contained the following important statement:

select min(id),max(id) into n,m from bowie;

Removing this statement from the PL/SQL block, and either manually defining the values to be processed or passing them through to the procedure, can impact how PL/SQL manages the freespace bitmaps within the segment. For example, if one used something similar to the following:

SQL> declare
  2      n number:= 1;
  3      m number:= 200000;
  4  begin
  5      for i in 1..200000 loop
  6          delete from bowie where id=n+i-1;
  7          insert into bowie values(m+i,'David Bowie');
  8          if (i mod 1000=0) then
  9            commit;
 10          end if;
 11      end loop;
 12      commit;
 13  end;
 14  /

The number of leaf blocks allocated is nowhere near as significant as before, stabilising after a few runs at approximately:

SQL> select name, height, lf_rows, del_lf_rows, lf_blks, br_blks from index_stats;

NAME             HEIGHT    LF_ROWS DEL_LF_ROWS    LF_BLKS    BR_BLKS
------------ ---------- ---------- ----------- ---------- ----------
BOWIE_PK              3     289040       89040        744          4

Finally, the PL/SQL procedure only performed a Commit after 1000 iterations, meaning there were 1000 deletions performed during each logical transaction. As Greg mentioned in the comments, Unique Key index values can be reused within a single transaction, BUT only if the same actual values are reused. The procedure introduces new values for those that have been deleted, so the deleted entries can’t be reused during the same transaction. This means there will be at least 1000 deleted index entries that can’t be reused while the procedure runs, and sufficient additional leaf blocks to accommodate these 1000 index entries will need to be allocated, even if we use some of the solutions mentioned, such as Reverse Key indexes or MSSM tablespaces. By performing either all the necessary deletions within one transaction followed by the inserts, or a Commit for each delete/insert pairing, these additional blocks won’t be required. For example:

SQL> declare
  2        n number:= 1;
  3        m number:= 200000;
  4    begin
  5      for i in 1..200000 loop
  6           delete from bowie where id=n+i-1;
  7           commit;
  8           insert into bowie values(m+i,'David Bowie');
  9           commit;
 10      end loop;
 11  end;
 12  /

Although of course, the resultant inefficiencies in the processing, or the potential breaking of business rules, may mean the index savings aren’t worthwhile.

So in summary, there are a number of things we could do to fix this scenario, rather than simply periodically rebuilding the index all the time. Depending on applicability, we could convert the index to a Reverse Key index (or, say, a Hash Partitioned index), we could move the index to an MSSM tablespace, we could modify our procedure logic to remove the reference to fetching the MIN/MAX values, we could avoid PL/SQL altogether, or, to make the index as efficient as possible, we could change the transactional logic to not perform large numbers of inserts and deletes within the same transaction.
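For completeness, a rough sketch of the Hash Partitioned index option mentioned above (the partition count of 8 is purely illustrative):

SQL> ALTER TABLE bowie DROP PRIMARY KEY;

SQL> CREATE UNIQUE INDEX bowie_pk ON bowie(id) GLOBAL PARTITION BY HASH(id) PARTITIONS 8;

SQL> ALTER TABLE bowie ADD CONSTRAINT bowie_pk PRIMARY KEY(id) USING INDEX bowie_pk;

The monotonically increasing insert activity would then be spread across 8 separate right-most leaf blocks rather than just the one.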

So there’s quite a lot happening within what on the surface looks like a simple piece of PL/SQL :)

Curious Case Of The Ever Increasing Index Quiz (She’ll Drive The Big Car) January 4, 2012

Posted by Richard Foote in Index Internals, Indexing Myth, Oracle Indexes, Quiz.
22 comments

I received an email recently that had a nice example of what can potentially go wrong with an index.

Let’s first create a simple table with a unique index and populate it with 200,000 rows (following demo run on 11.2.0.1):

SQL> create table bowie (id number constraint bowie_pk primary key, name varchar2(100));

Table created.

SQL> insert into bowie select rownum, 'BOWIE' from dual connect by level <= 200000;

200000 rows created.

SQL> commit;

Commit complete.

SQL> analyze index bowie_pk validate structure;

Index analyzed.

SQL> select name, height, lf_rows, del_lf_rows, lf_blks, br_blks from index_stats;

NAME             HEIGHT    LF_ROWS DEL_LF_ROWS    LF_BLKS    BR_BLKS
------------ ---------- ---------- ----------- ---------- ----------
BOWIE_PK              2     200000           0        374          1

So far, everything is as expected. We have an index with 200,000 rows that currently has 374 leaf blocks.

OK, what we want to do is basically gradually delete the current set of rows and replace them with 200,000 new rows, with ever-increasing Primary Key values. To this end, we create the following procedure:

SQL> create or replace procedure delete_insert_rows
  2  as
  3       n number;
  4       m number;
  5  begin
  6       select min(id),max(id) into n,m from bowie;
  7       for i in 1..200000 loop
  8           delete from bowie where id=n+i-1;
  9           insert into bowie values(m+i,'David Bowie');
 10           if (i mod 1000=0) then
 11                commit;
 12           end if;
 13       end loop;
 14       commit;
 15  end;
 16  /

Procedure created.

So the procedure basically determines the current MIN and MAX values of our PK column and gradually deletes the current rows and then inserts new ones. Every 1000 iterations, we commit the changes. Nothing too complex here.

When we run this procedure for the first time:

SQL> exec delete_insert_rows

PL/SQL procedure successfully completed.

SQL> analyze index bowie_pk validate structure;

Index analyzed.

SQL> select name, height, lf_rows, del_lf_rows, lf_blks, br_blks from index_stats;

NAME             HEIGHT    LF_ROWS DEL_LF_ROWS    LF_BLKS    BR_BLKS
------------ ---------- ---------- ----------- ---------- ----------
BOWIE_PK              2     293820       93820        619          1

We notice we now have a whole bunch of deleted leaf entries and that the index has grown from 374 to 619 leaf blocks.

If we run the procedure again:

SQL> exec delete_insert_rows

PL/SQL procedure successfully completed.

SQL> analyze index bowie_pk validate structure;

Index analyzed.

SQL> select name, height, lf_rows, del_lf_rows, lf_blks, br_blks from index_stats;

NAME             HEIGHT    LF_ROWS DEL_LF_ROWS    LF_BLKS    BR_BLKS
------------ ---------- ---------- ----------- ---------- ----------
BOWIE_PK              3     347841      147841        994          3

Things have become even worse. We still only have 200,000 rows in the table, but the index now has an additional 147,841 deleted entries and the number of leaf blocks has again increased substantially, to 994.

If we have a look at a partial treedump of the index:

SQL> select object_id from dba_objects where object_name = 'BOWIE_PK';

 OBJECT_ID
----------
     74060

SQL> alter session set events 'immediate trace name treedump level 74060';

Session altered.

----- begin tree dump
branch: 0x100378b 16791435 (0: nrow: 2, level: 2)
   branch: 0x1003ce0 16792800 (-1: nrow: 733, level: 1)
      leaf: 0x100378e 16791438 (-1: nrow: 149 rrow: 0)
      leaf: 0x100378f 16791439 (0: nrow: 571 rrow: 0)
      leaf: 0x100378c 16791436 (1: nrow: 291 rrow: 0)
      leaf: 0x1003795 16791445 (2: nrow: 571 rrow: 0)
      leaf: 0x1003796 16791446 (3: nrow: 433 rrow: 0)
      leaf: 0x1003797 16791447 (4: nrow: 4 rrow: 0)
      leaf: 0x1003790 16791440 (5: nrow: 571 rrow: 0)
      leaf: 0x1003791 16791441 (6: nrow: 146 rrow: 0)
      leaf: 0x1003792 16791442 (7: nrow: 571 rrow: 0)
      leaf: 0x1003793 16791443 (8: nrow: 288 rrow: 0)
      leaf: 0x1003794 16791444 (9: nrow: 571 rrow: 0)
      leaf: 0x10037a9 16791465 (10: nrow: 430 rrow: 0)

… (most of the treedump has been cut out, following is the last portion of the dump)

      leaf: 0x1003e70 16793200 (248: nrow: 533 rrow: 533)
      leaf: 0x1003e74 16793204 (249: nrow: 533 rrow: 533)
      leaf: 0x1003e78 16793208 (250: nrow: 533 rrow: 533)
      leaf: 0x1003e7c 16793212 (251: nrow: 533 rrow: 533)
      leaf: 0x1003e41 16793153 (252: nrow: 533 rrow: 533)
      leaf: 0x1003e45 16793157 (253: nrow: 533 rrow: 533)
      leaf: 0x1003e49 16793161 (254: nrow: 533 rrow: 533)
      leaf: 0x1003e4d 16793165 (255: nrow: 533 rrow: 533)
      leaf: 0x1003e51 16793169 (256: nrow: 533 rrow: 533)
      leaf: 0x1003e3e 16793150 (257: nrow: 533 rrow: 533)
      leaf: 0x1003e03 16793091 (258: nrow: 533 rrow: 533)
      leaf: 0x1003e07 16793095 (259: nrow: 236 rrow: 236)
----- end tree dump

We notice that the first portion of the index contains leaf blocks with nothing but deleted index entries; the number of rrows is 0 for a vast number of leaf blocks. We also notice that the root block has an rba of 0x100378b 16791435, only a few values below some of the rba values of the left-most leaf blocks in the index structure (say 0x100378e 16791438). This highlights that even though these left-most blocks contain nothing but deleted index entries, Oracle is not recycling them as it should. Oracle is simply adding new blocks to the index structure rather than recycling empty leaf blocks, resulting in the index growing bigger and bigger.

The leaf blocks at the right-most end of the index structure (the second portion of the partial treedump), however, show a nice compact set of leaf blocks with lots of index entries per block (most with 533 per leaf block) and no deleted index entries (rrow matches the nrow value).

If we run the procedure 10 times in total, we get an index that looks like the following:

SQL> select name, height, lf_rows, del_lf_rows, lf_blks, br_blks from index_stats;

NAME             HEIGHT    LF_ROWS DEL_LF_ROWS    LF_BLKS    BR_BLKS
------------ ---------- ---------- ----------- ---------- ----------
BOWIE_PK              3    1325132     1125132       4136          7

We now have 1,125,132 deleted index entries and the index is now over 10 times the original size, up from 374 to a massive 4,136 leaf blocks, even though the table only contains 200,000 rows.

There are a number of contributing factors here :)

The question is: why is the index behaving in this fashion, and what can we do to ensure the index doesn’t grow in this manner and remains basically the same size as we delete and insert new rows into the table ?

Index Organized Tables – A Start (Star) January 3, 2012

Posted by Richard Foote in Index Organized Tables, Oracle Indexes.
8 comments

Happy New Year !!

As has been mentioned to me on quite a number of occasions, I haven’t really covered the area of Index Organized Tables (IOTs) on this blog, despite covering the subject in quite some detail in my Index Internals Seminar. Which I readily admit is a bit of a shame, as IOTs are often misunderstood little things and are generally way under-utilised in many environments, in part I suspect because they are so often misunderstood. The only reason I can really give for not discussing them more is that I simply haven’t gotten around to it :)

Well, the good news, as hinted at late last year, is that it’s about time I addressed this and started writing a few blog pieces on IOTs.

However, the even better news is that the subject of IOTs has already been covered by Martin Widlake in a truly excellent series of articles on his blog. I would strongly recommend giving them a read as they cover many aspects of IOTs in an interesting and informative manner.

I’ll recap some of the basic principles of IOTs here and then discuss various different aspects (such as the joys of Secondary Indexes on IOTs) in my own unique way :)

So, all that’s to come in the coming weeks.

However, I recently received an index question via email during the Christmas holidays which I thought was worthy of a blog piece or two. Although I often respond to emails, I much prefer to discuss index related “stuff” on the blog so everyone can share in the knowledge exchange. This is an interesting question because it relates to why an index keeps growing, even though the number of rows in the table remains constant. It’s an even more interesting question because there are a number of contributing factors for this impact on the index which I thought were worth sharing.

No more clues, I’ll post the question and the specific scenario in the next few days …

Merry Christmas (plus long lost “The Jean Genie”) December 24, 2011

Posted by Richard Foote in Christmas.
9 comments

I just want to wish you all a very Merry Christmas and a very happy, safe and successful New Year.

As always, thank you all so much for supporting the blog, with all your comments and nice feedback. Hopefully, you’ve found content here that has been of some use. Now that I’m back at Oracle Corporation, 2012 should hopefully be an exciting year with lots to talk and blog about.

Here’s my Christmas present …

Described as an early Christmas present for all David Bowie fans world-wide, footage of the great man performing his classic hit “The Jean Genie” on England’s “Top of the Pops” has recently re-surfaced. Thought long lost (as the BBC had a tendency to simply delete their tapes), it was originally recorded way back when Bowie was working on the “Aladdin Sane” album, on 3rd January 1973, and broadcast the next day on 4th January 1973. It basically hadn’t been seen since, until a cameraman on the show recently revealed he had a copy of the tapes in his shed !!

Now, nearly 40 years on, it’s still a great song, a great band and a great performance.

Enjoy :)

Back At Oracle Corporation (The Return Of The Thin White Duke) November 30, 2011

Posted by Richard Foote in Richard's Musings.
30 comments

I’ve been very quiet on the blogging front lately, so apologies for that.

One of the main reasons is that after a period of nearly 10 years, I’ve decided to take up a position back at Oracle Corporation as a “Principal Solutions Consultant” in sunny Canberra. So things have been rather hectic, finishing up in my previous job and starting up afresh at Oracle. Things have certainly changed a lot over this time, with new offices and a much larger workforce than worked in Canberra previously. Some things haven’t changed such as I’ve still got my old email address and the same employee number (2020) as before. I’ve got that feeling of being really excited again at work which I’ve missed recently so I’m really looking forward to my many new adventures to come at Oracle.

The good news is that I’ve been given the green light to continue with my blogging activities, so expect action on that front here again soon. I’ll still be keeping it technical (unless there’s a new David Bowie album or Pink Floyd box-set to discuss, of course) and I’ll still be talking mainly about indexes, as there’s heaps that I’d still like to cover, such as Secondary Indexes on IOTs as promised previously, as well as more “exotic” indexes such as those for Spatial, XML, Oracle Text etc. There are also a few “rumours” regarding not needing indexes with Exadata which might potentially become full-blown myths if I don’t do something about it :)

I’ll also get back to answering the various questions that have been building up over the past few weeks.

So stay tuned !!

Why Are My Indexes Still Valid Solution (A Second Face) October 20, 2011

Posted by Richard Foote in IOT, Oracle Indexes, Quiz, Secondary Indexes.
add a comment

I’ve been so busy lately, I just haven’t had any spare time to post.

For now, the quick answer to the last quiz is that the second table was indeed an Index Organized Table (IOT).

One of the nice benefits of an IOT is that, unlike a Heap Table, all its indexes remain valid when it’s re-organised, even the Secondary Indexes. I’ll explain why in my next post in the next few days. I’ll also explain why secondary indexes are one of the main disadvantages of IOTs.

Stay tuned !!

Why Are My Indexes Still Valid Quiz ? (Move On) October 11, 2011

Posted by Richard Foote in Index Organized Tables, Oracle Indexes, Quiz.
6 comments

OK, this quiz is a nice easy one, the lads at work got this without too much trouble.

Normally, when you MOVE (re-org) a table, all the associated indexes become Unusable. As below:

SQL> create table pink_floyd (id number constraint pf_pk primary key, code number, name varchar2(30));
 
Table created.
 
SQL> create index pf_code_i on pink_floyd(code);
 
Index created.
 
SQL> insert into pink_floyd select rownum, mod(rownum,100), 'The Dark Side Of The Moon' from dual connect by level <= 10000;
 
10000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'PINK_FLOYD', cascade=>true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> select index_name, num_rows, status from dba_indexes where table_name = 'PINK_FLOYD';
 
INDEX_NAME                       NUM_ROWS STATUS
------------------------------ ---------- --------
PF_PK                               10000 VALID
PF_CODE_I                           10000 VALID
 
SQL> alter table pink_floyd move;
 
Table altered.
 
SQL> select index_name, num_rows, status from dba_indexes where table_name = 'PINK_FLOYD';
 
INDEX_NAME                       NUM_ROWS STATUS
------------------------------ ---------- --------
PF_PK                               10000 UNUSABLE
PF_CODE_I                           10000 UNUSABLE

So the indexes are now all unusable …

However, I previously created another table called BOWIE that has exactly the same columns, indexes and data but when I MOVE this table:
 

SQL> select index_name, num_rows, status from dba_indexes where table_name = 'BOWIE';
 
INDEX_NAME                       NUM_ROWS STATUS
------------------------------ ---------- --------
BOWIE_PK                            10000 VALID
BOWIE_CODE_I                        10000 VALID
 
SQL> alter table bowie move;
 
Table altered.
 
SQL> select index_name, num_rows, status from dba_indexes where table_name = 'BOWIE';
 
INDEX_NAME                       NUM_ROWS STATUS
------------------------------ ---------- --------
BOWIE_PK                            10000 VALID
BOWIE_CODE_I                        10000 VALID

All the indexes remain VALID !!
 
What’s so different about this table and its indexes ??????

I plan to discuss this whole topic (finally) in more detail in the coming weeks …

Why Is My Index Not Being Used No. 2 Solution (The Narrow Way) October 6, 2011

Posted by Richard Foote in CBO, Oracle Indexes, Quiz.
3 comments

As many have identified, the first thing to point out is that the two queries are not exactly equivalent.

The BETWEEN clause is equivalent to a ‘>= and <=’ predicate, whereas the original query only had a ‘> and <‘ predicate. The additional equality conditions at each end are significant.

The selectivity of the original query is basically costed as (max condition – min condition)/(max column value – min column value). As this equates to 1 day/2000 days, the selectivity of the first query is therefore 0.0005.

1M rows multiplied by 0.0005 = 500 rows, the CBO cardinality estimate.

Note that 0.0005 is also the selectivity of 1 day.

The calculation for the BETWEEN clause is different. It’s the above formula PLUS the selectivity of 1 day for each of the two equality conditions (as the CBO assumes you not only want the values within the range but also the specific values on each side of the range).

So that’s a selectivity of 0.0005 for the date range as above, plus 0.0005 for each of the two equality conditions: 0.0005 + (2 x 0.0005) = 0.0015.

1M multiplied by 0.0015 = 1500, the CBO cardinality estimate of the second query.
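Putting the numbers together in one place:

sel(> and <)  = 1/2000                 = 0.0005  =>  1,000,000 x 0.0005 = 500 rows
sel(BETWEEN)  = 1/2000 + 2 x (1/2000)  = 0.0015  =>  1,000,000 x 0.0015 = 1500 rows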

As many have mentioned in the comments, the Clustering Factor of this index is awful, as the 500 occurrences of each distinct day are spread evenly throughout the whole table. It’s therefore very expensive to use this index to retrieve a “larger” number of rows.
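As a quick sanity check (assuming the demo schema), comparing the Clustering Factor with the table statistics makes the problem obvious. A Clustering Factor close to the number of table blocks is about as good as it gets, while one approaching the number of rows, as here, is about as bad:

SQL> SELECT i.index_name, i.clustering_factor, t.blocks, t.num_rows
  2  FROM user_indexes i, user_tables t
  3  WHERE i.table_name = t.table_name
  4  AND i.index_name = 'BOWIE_DATE_I';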

The cost of the first query that used the index is 505, about as bad as it gets when retrieving 500 rows. If we were to retrieve not 500 but 1500 rows, the cost would effectively triple to approximately 1500.

However, the cost of the FTS, as highlighted in the second query, is 933. This is less than 1500 and so the FTS is preferred by the CBO for the second query.

It all comes down to the relative costs, and these in turn depend on the estimated selectivity of the query and the associated segment (and system) statistics, of which the Clustering Factor of the index is one of the most significant factors of all (no pun intended of course). If we effectively triple the estimated cost of a query, as we do with the second query, this can obviously have an impact on the CBO calculations and the resultant execution plan.

If we were to rewrite the first query to be equivalent to using the BETWEEN:

SQL> select * from bowie where hist_date >= '01-JAN-2011' and hist_date <= '02-JAN-2011';
 
500 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507
 
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |  1500 | 28500 |   933   (2)| 00:00:12 |
|*  1 |  TABLE ACCESS FULL| BOWIE |  1500 | 28500 |   933   (2)| 00:00:12 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("HIST_DATE">=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "HIST_DATE"<=TO_DATE(' 2011-01-02 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3344  consistent gets
          0  physical reads
          0  redo size
       5563  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        500  rows processed

then we now get the same FTS costings and execution plan.

The moral of this story is that using a BETWEEN for a small range can significantly increase the cost of the query. Something to be aware of if perhaps close enough is good enough.
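If only the one day’s worth of data is actually required, one possible rewrite (a sketch only, using explicit DATE literals rather than relying on implicit string conversions) is a half-open range, which avoids the extra BETWEEN endpoint selectivity altogether:

SQL> SELECT * FROM bowie WHERE hist_date >= DATE '2011-01-01' AND hist_date < DATE '2011-01-02';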

Why Is My Index Not Being Used No. 2 Quiz (Quicksand) October 3, 2011

Posted by Richard Foote in CBO, Oracle Indexes, Quiz.
34 comments

I have a table that has 1M rows with dates that span 2000 days, all evenly distributed (so there are 500 rows per day for the mathematically challenged). All stats are 100% accurate and I have an index on the date column.
 

SQL> create table bowie (id number, hist_date date, text varchar2(30));
 
Table created.
 
SQL> insert into bowie select rownum, sysdate-mod(rownum, 2000), 'BOWIE' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> create index bowie_date_i on bowie(hist_date);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', cascade=>true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

OK, I now select 1 day’s worth of data:

SQL> select * from bowie where hist_date > '01-JAN-2011' and hist_date < '02-JAN-2011';
 
500 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 690852991
 
--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |   500 |  9500 |   505   (0)| 00:00:07 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE        |   500 |  9500 |   505   (0)| 00:00:07 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_DATE_I |   500 |       |     4   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("HIST_DATE">TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
              AND "HIST_DATE"<TO_DATE(' 2011-01-02 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        506  consistent gets
          0  physical reads
          0  redo size
       5563  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        500  rows processed

Everything is perfect. The index is used and the cardinality estimate is spot on, with the CBO correctly predicting that 500 rows will be returned.

OK, I now re-write the query with a BETWEEN clause:

SQL> select * from bowie where hist_date between '01-JAN-2011' and '02-JAN-2011';
 
500 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507
 
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |  1500 | 28500 |   933   (2)| 00:00:12 |
|*  1 |  TABLE ACCESS FULL| BOWIE |  1500 | 28500 |   933   (2)| 00:00:12 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("HIST_DATE">=TO_DATE(' 2011-01-01 00:00:00', 'syyyy-mm-dd
              hh24:mi:ss') AND "HIST_DATE"<=TO_DATE(' 2011-01-02 00:00:00',
              'syyyy-mm-dd hh24:mi:ss'))
 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3344  consistent gets
          0  physical reads
          0  redo size
       5563  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        500  rows processed

And now everything has gone wrong. I’m still getting the same 500 rows but the CBO is choosing an inefficient FTS. The estimates are now way, way off, with the CBO expecting 1500, not 500, rows to be returned.

QUESTION: Why has everything now gone so terribly wrong ?

Why Is My Index Not Being Used Solution (Eclipse) October 1, 2011

Posted by Richard Foote in ASSM, CBO, Clustering Factor, Oracle Indexes, Quiz.
1 comment so far

Well done to everyone that got the correct answer :)

Indeed, the subtle but significant difference between the two demos was that demo one created the table in a tablespace called USER_DATA with manual segment space management (with freelists/freelist groups set to 1), while demo two created the table in a tablespace called USER_DATA1 with automatic segment space management.
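As a rough sketch (datafile names and sizing are purely illustrative), the only difference between the two tablespaces would be the SEGMENT SPACE MANAGEMENT clause:

SQL> CREATE TABLESPACE user_data DATAFILE '/u01/oradata/user_data01.dbf' SIZE 100M
  2  SEGMENT SPACE MANAGEMENT MANUAL;

SQL> CREATE TABLESPACE user_data1 DATAFILE '/u01/oradata/user_data1_01.dbf' SIZE 100M
  2  SEGMENT SPACE MANAGEMENT AUTO;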

In the first demo, the 3 separate sessions all followed the same freelist and inserted their rows concurrently into the same table blocks, resulting in the table being effectively sorted in ID order.

If we look at the resultant Clustering Factor:

SQL> select num_rows, leaf_blocks, clustering_factor from dba_indexes where index_name = 'BOWIE_ID_I';

NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
    300000        1452              2171

We notice the Clustering Factor of 2171 is relatively low for an index with 300000 rows, as indeed the order of the rows in the table almost exactly matches the order of the index entries.

In the second demo, ASSM ensures the 3 separate transactions don’t cause contention and insert their rows in a different set of blocks from each other. This is good in that contention is reduced, but it has the nasty side-effect of the resultant rows now being scattered randomly between 3 different sets of blocks. The actual clustering of the data isn’t particularly bad, in that Oracle now has to visit 3 different blocks for a range of values that previously might have been co-located within the 1 block. But because of the manner in which the Clustering Factor is calculated, incrementing even when forced to revisit a block it had visited just a couple of I/O calls beforehand, the calculated Clustering Factor can be appalling.

If we look at the Clustering Factor of the index from the second demo:

SQL> select num_rows, leaf_blocks, clustering_factor from dba_indexes where index_name = 'BOWIE_ID_I';
 
NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------- ----------- -----------------
    300000        1573            271936

We notice the Clustering Factor is now a terrible 271936. It’s a classic example of a table with data that is relatively well clustered but has an appalling Clustering Factor. If Oracle didn’t increment the Clustering Factor for a block it had visited only a couple of index entries previously, it would likely have a similar Clustering Factor to the first demo.

But statistics collection doesn’t take this into consideration; it will increment the Clustering Factor even if the block has only just been visited (only if it’s the same table block as the previous index entry will the Clustering Factor not increment during stats collection). Hence the terrible Clustering Factor, and hence the dramatic difference in how the index is now considered, costed and used by the CBO.

The moral of this story is that if you use ASSM, or you use multiple Freelists/Freelist Groups to avoid contention, seriously consider the impact on the Clustering Factor of indexed columns that would ordinarily have a good Clustering Factor, and the impact this in turn may have on your resultant execution plans …
