jump to navigation

Index Organized Tables – Overflow Segment Part II (The Loneliest Guy) January 18, 2012

Posted by Richard Foote in Block Dumps, Index Internals, Index Organized Tables, IOT, Oracle Indexes, Overflow Segment, Primary Key.
trackback

In my previous post on Index Organized Tables (IOT), I introduced the concept of the IOT Overflow Segment, where we can store columns that we may not want to include within the actual IOT index structure. Before we move on, I just wanted to cover off a few additional points that could be a trap for the unwary …

In my experience, the Primary Key (PK) columns of a table are typically the first columns defined in the table. This has certainly been standard practice in most environments I’ve seen. This makes sense in that the PK are in many ways the “key” column(s) in the table and are identified as such by having the prestigious honour of being the first column(s) defined within the table. Most people look at and intuitively expect the first columns in the table to be the PK columns and for that reason alone, it’s probably good practice to consistently define the PK columns in this manner.

However, there’s also a good argument why having the PK columns as the leading columns in the table is precisely the wrong location for them. As many tables are “primarily” accessed via the PK columns and so accessed directly through the associated PK index, the application already knows the PK values of the row in question. Therefore, it’s somewhat inefficient to then have the PK columns the first columns defined in the table as these generally have to be read through and ignored before we get to the non-PK columns that are of direct interest and the reason for visiting the table block in the first place. By placing the PK columns after the most accessed non-PK columns, we avoid having to unnecessarily read through these PK columns again when accessing the table via the PK index.

I personally prefer to define the PK columns first in a standardised manner, with the advantages of avoiding possible confusion and misunderstandings outweighing any possible performance improvements. However, I can at least see the logic and merit of not following this standard with Heap tables.

The same however can not really be said for IOTs and I would strongly recommend defining the PK columns first in an IOT …

I’m going to run the same demo as I did in my last post on the Overflow Segment, but with one subtle change. I’m not going to define the two PK columns first but rather have them defined after my heavily accessed non-PK column:

SQL> CREATE TABLE album_sales_iot(total_sales NUMBER, album_id NUMBER, country_id NUMBER, description VARCHAR2(1000), CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX INCLUDING country_id OVERFLOW TABLESPACE bowie2;

Table created.

So in this example, my leading column is the non-PK total_sales column, followed then by the two PK columns. I still only want these 3 columns to be included in the actual IOT structure, so I have my INCLUDING clause only including columns up to the country_id column. I want the remaining large description column to be stored separately in an Overflow segment.

OK, let’s populate this table with the same data we used previously:

SQL> BEGIN
  2    FOR i in 1..5000 LOOP
  3      FOR c in 1..100 LOOP
  4         INSERT INTO album_sales_iot VALUES(ceil(dbms_random.value(1,5000000)), i, c, 'A really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really really long description');
  6      END LOOP;
  9    END LOOP;
 10    COMMIT;
 11  END;
 12  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_IOT', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

If we describe this table, we get the expected listing:


SQL> desc album_sales_iot
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------

 TOTAL_SALES                                        NUMBER
 ALBUM_ID                                  NOT NULL NUMBER
 COUNTRY_ID                                NOT NULL NUMBER
 DESCRIPTION                                        VARCHAR2(1000)

With the columns listed in the order as we defined them in the table.

If we query the column details from dba_tab_columns:

SQL> select column_id, column_name from dba_tab_columns where table_name = 'ALBUM_SALES_IOT' order by column_id;

 COLUMN_ID COLUMN_NAME
---------- ------------------------------
         1 TOTAL_SALES
         2 ALBUM_ID
         3 COUNTRY_ID
         4 DESCRIPTION

We again find the column order is as we defined them in the table.

When we run the same query we ran last time that returned the data with 5 consistent gets:

SQL> set arraysize 100
SQL> select album_id, country_id, total_sales from album_sales_iot where album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1834499174

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |   100 |  1300 |    18   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_IOT_PK |   100 |  1300 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         20  consistent gets
          0  physical reads
          0  redo size
       2394  bytes sent via SQL*Net to client
        524  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

We notice that performance is a lot worse, with 20 consistent gets now required. Obviously, something has changed unexpectedly ???

The first clue on what’s going on here can be found by looking at dba_tab_cols:

SQL> select column_id, segment_column_id, column_name from dba_tab_cols where table_name = 'ALBUM_SALES_IOT' order by column_id;

 COLUMN_ID SEGMENT_COLUMN_ID COLUMN_NAME
---------- ----------------- ------------------------------
         1                 3 TOTAL_SALES
         2                 1 ALBUM_ID
         3                 2 COUNTRY_ID
         4                 4 DESCRIPTION

The SEGMENT_COLUMN_ID column determines the order of the columns as they’re actually stored within the segment and we notice the column order is different. The two PK columns are listed first, with the total_sales column only listed in the 3rd position.

As discussed in the IOT Introduction post, the structure of an index entry in an IOT has the PK columns as the leading columns, following by the non-PK columns in the table portion. This is critical because the PK columns determine the location within the IOT table where new rows need to be inserted and the subsequent ordering of the rows in the table. As such, the PK columns must always be the leading columns of an IOT, despite how the table is actually defined at creation time. If the PK columns are not listed first in the table creation DDL statement, Oracle will automatically re-order the columns and place the PK columns first regardless.

This now has consequences on the INCLUDING clause if specified. In the above table creation statement, the INCLUDING clause specified the country_id column. Although defined as the third column, as it’s a PK column, Oracle has automatically re-ordered the columns such that it’s physically listed as the second column within the IOT segment. Unfortunately the INCLUDING clause is only applied after the re-ordering of the columns and as such, the total_sales column which is now logically listed third and now after the country_id column, is not therefore actually included in the IOT index structure as (perhaps) intended.

A partial block dump of an IOT leaf block will confirm his:

Leaf block dump
===============
header address 298590812=0x11cc225c
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0x90: opcode=0: iot flags=I– is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 421
kdxcofbo 878=0x36e
kdxcofeo 879=0x36f
kdxcoavs 1
kdxlespl 0
kdxlende 0
kdxlenxt 21052811=0x1413d8b
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[879] flag: K—–, lock: 0, len=17
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 02
tl: 9 fb: –H-F— lb: 0x0  cc: 0
nrid:  0x01811911.0
row#1[896] flag: K—–, lock: 0, len=17
col 0; len 2; (2):  c1 02
col 1; len 2; (2):  c1 03
tl: 9 fb: –H-F— lb: 0x0  cc: 0
nrid:  0x01811911.1

As we can see, the IOT row entries only consist of the PK columns and the row reference to the corresponding Overflow segment. None of the non-PK columns (such as total_sales) are co-located within the IOT segment as the table column count is 0 (cc: 0).

As a result, additional consistent gets are now required to fetch the total_sales column from the Overflow segment to satisfy the query. This explains why the query is now less efficient than it was previously.

My recommendation with regard to defining IOTs is to simply list the PK columns first. This will ensure the INCLUDING clause is applied as intended and will generally reduce confusion and misunderstandings. Otherwise, the INCLUDING clause needs to specify a Non-PK column to ensure more than just the PK columns are actually included in the IOT segment, the consequences of which may not be obvious to the casual observer of the DDL or describer of the table.

Jonathan Lewis, a great source of information on indexes and Oracle in general has previously discussed this same IOT Trap on his blog.

Comments»

1. Brian Tkatch - January 19, 2012

Wow. I never thought about PK ordering having any affect. I always put them first for clarity, and (without thinking) assumed that was a better idea, because it was better logically.

Jonathan’s summary note is perfect: If you use the including clause when defining an IOT, you have to reference a non-key column if you want any column other than the key columns in the index segment.

I’m curious what the performance difference actually is in non-IOTs . Please put that on your topics-to-write-about list. 🙂

Like

2. Richard Foote - January 19, 2012

Hi Brian

Yes, performance considerations will come into the discussion I promise 🙂

Like

3. IOTs | Oracle Scratchpad - February 10, 2014

[…] Overflow segments – pt. 2 […]

Like


Leave a comment