jump to navigation

Index Organized Tables – An Introduction Of Sorts (Pyramid Song) January 10, 2012

Posted by Richard Foote in Block Dumps, CBO, Index Internals, Index Organized Tables, IOT, Oracle Indexes, Primary Key.
trackback

Thought it was high time that I covered in a little detail the subject of Index Organized Tables (IOTs). When used appropriately, they can be an extremely useful method of storing and accessing data. Hopefully by the end of this series, you’ll have a better understanding of IOTs, their respective strengths and weaknesses and so perhaps be in a better position to take advantage of them when appropriate.

As I mentioned in a previous post, Martin Widlake has recently written an excellent series on IOTs, which I highly recommend. I’ll try to cover differing aspects of IOTs that will hopefully be of interest.

To start, let’s cover a very basic little example.

Let’s begin by creating and populating a simple Heap Table that holds information about musical albums (Note using an 8K blocksize in a MSSM tablespace):

SQL> CREATE TABLE album_sales(album_id number, country_id number, total_sales number, album_colour varchar2(20),
  2  CONSTRAINT album_sales_pk PRIMARY KEY(album_id, country_id));

Table created.

SQL> BEGIN
  2    FOR i IN 1..5000 LOOP
  3      FOR c IN 1..100 LOOP
  4        INSERT INTO album_sales VALUES (i, c, ceil(dbms_random.value(1,5000000)), 'GOLD');
  5      END LOOP;
  6    END LOOP;
  7    COMMIT;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES', cascade=> true, estimate_percent=> null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

We have a natural Primary Key that consists of two columns and an additional two columns of information.

Let’s look at some basic sizing information on the table and associated Primary Key index:

SQL> SELECT blocks, empty_blocks, IOT_TYPE FROM dba_tables WHERE table_name = 'ALBUM_SALES';

    BLOCKS EMPTY_BLOCKS IOT_TYPE
---------- ------------ ------------
      1570            0

SQL> ANALYZE INDEX album_sales_pk VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT BLOCKS, BR_BLKS, LF_BLKS FROM index_stats;

    BLOCKS    BR_BLKS    LF_BLKS
---------- ---------- ----------
      1152          3       1062

So the table segment consists of 1570 blocks and the index segment 1152, with a total of 1062 leaf blocks.

OK, let’s run a basic query looking for all albums with an album_id=42:

SQL> SELECT * FROM album_sales WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3244723662

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                |   100 |  1800 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| ALBUM_SALES    |   100 |  1800 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | ALBUM_SALES_PK |   100 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         18  consistent gets
          0  physical reads
          0  redo size
       4084  bytes sent via SQL*Net to client
        589  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

As we can see, things are pretty good. 18 consistent gets in order to return 100 rows isn’t bad at all. Clearly, the index has a good Clustering Factor and can retrieve the 100 required rows in a relatively efficient manner.

However, this is a very frequently executed query and we want to do even better. One thing we notice is that we only have a couple of columns in the table which are not part of the index. Perhaps if we included these columns in the index as well, we can then use the index to extract all the required data and thus eliminate the need to visit the table segment at all. Overloading an index in this manner is a common tuning technique and will hopefully reduce the number of required logical I/Os to run the query.

We can do this by dropping and recreating the index with all the columns, making sure the PK columns remain the leading columns. This will ensure the index can still be used to police the PK constraint:

SQL> ALTER TABLE album_sales DROP PRIMARY KEY;

Table altered.

SQL> CREATE INDEX album_sales_pk_i ON album_sales(album_id, country_id, total_sales, album_colour) COMPUTE STATISTICS;

Index created.

SQL> ALTER TABLE album_sales ADD constraint album_sales_pk PRIMARY KEY(album_id, country_id);

Table altered.

OK, so the index now contains all the columns in the table and is now used to police the PK constraint:

SQL> select constraint_name, constraint_type, index_name from dba_constraints where constraint_name = 'ALBUM_SALES_PK';

CONSTRAINT_NAME                C INDEX_NAME
------------------------------ - ------------------------------
ALBUM_SALES_PK                 P ALBUM_SALES_PK_I

Let’s now look at the size of the index:

SQL> ANALYZE INDEX album_sales_pk_i VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT BLOCKS, BR_BLKS, LF_BLKS FROM index_stats;

    BLOCKS    BR_BLKS    LF_BLKS
---------- ---------- ----------
      2048          5       2006

OK, as expected the index is now somewhat larger as it now needs to accommodate the extra columns. The number of overall blocks allocated to the index is 2048, with leaf blocks increasing from 1062  to 2006 leaf blocks.

If we now re-run the query:

SQL> SELECT * FROM album_sales WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1126128764

-------------------------------------------------------------------------------------
| Id  | Operation        | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                  |   100 |  1800 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_PK_I |   100 |  1800 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         11  consistent gets
          0  physical reads
          0  redo size
       3568  bytes sent via SQL*Net to client
        589  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

We notice things have indeed improved and we have reduced the number consistent gets from 18 down to just 11. Not a bad improvement !!

If look at a partial block dump of one of the index leaf blocks:

Leaf block dump
===============
header address 484409948=0x1cdf825c
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 5
kdxcosdc 0
kdxconro 258
kdxcofbo 552=0x228
kdxcofeo 1373=0x55d
kdxcoavs 821
kdxlespl 0
kdxlende 0
kdxlenxt 20972941=0x140058d
kdxleprv 20972939=0x140058b
kdxledsz 0
kdxlebksz 8036
row#0[8010] flag: ——, lock: 0, len=26
col 0; len 2; (2):  c1 07
col 1; len 2; (2):  c1 12
col 2; len 5; (5):  c4 04 15 31 59
col 3; len 4; (4):  47 4f 4c 44
col 4; len 6; (6):  01 40 05 82 00 b7
row#1[7984] flag: ——, lock: 0, len=26
col 0; len 2; (2):  c1 07
col 1; len 2; (2):  c1 13
col 2; len 5; (5):  c4 03 19 2c 3d
col 3; len 4; (4):  47 4f 4c 44
col 4; len 6; (6):  01 40 05 82 00 b8

We notice that each leaf entry is 26 bytes in length. The length of the four columns adds up to 13 bytes. The remaining 13 bytes is basically overhead required for each index entry:

2 bytes for flag and lock information in the index entry header

5 x 1 byte for each of the length bytes for each column

6 bytes for the 5th index column which is the index rowid

So that’s 13 bytes of overhead per index entry in this example index.

Well, everything is currently pretty good. We have the application now performing approximately 40% less work than it was previously. But we have one little issue. With the index now consisting of all the columns in the table and with the application using the index exclusively, what’s the point of now having the table? It’s wasting storage and wasting resources in having to be maintained for no purpose other than having to exist so that the index can in turn exist.

Wouldn’t it be nice if we can somehow just have the index, but without the underlining table. Enter the Index Organized Table (IOT), first introduced way back in Oracle 8.0. It’s basically an index structure that can exist without the need for an underlining table. The index structure itself is the table by which we can store and retrieve the necessary data.

OK, let’s now create a new version of this table with the same data, but this time as an IOT:

SQL> CREATE TABLE album_sales_IOT(album_id number, country_id number, total_sals number, album_colour varchar2(20),
     CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX;

Table created.

SQL> BEGIN
  2    FOR i IN 1..5000 LOOP
  3      FOR c in 1..100 LOOP
  4        INSERT INTO album_sales_IOT VALUES (i, c, ceil(dbms_random.value(1,5000000)), 'GOLD');
  5      END LOOP;
  6    END LOOP;
  7    COMMIT;
  8  END;
  9  /

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_IOT', cascade=> true, estimate_percent=> null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

The key clause is here ORGANIZATION INDEX. I’ll discuss other options and syntax in coming posts.

If we look now at the table segment:

SQL> SELECT blocks, empty_blocks, IOT_TYPE FROM dba_tables
  2  WHERE table_name = 'ALBUM_SALES_IOT';

    BLOCKS EMPTY_BLOCKS IOT_TYPE
---------- ------------ ------------
                        IOT

We see there is an IOT segment listed but consists of no blocks as it doesn’t physically exist …

If we look at the size of the corresponding index:

SQL> SELECT index_name, table_name, blevel, leaf_blocks FROM dba_indexes
  2  WHERE table_name = 'ALBUM_SALES_IOT';

INDEX_NAME           TABLE_NAME       BLEVEL LEAF_BLOCKS
-------------------- --------------- ------- -----------
ALBUM_SALES_IOT_PK   ALBUM_SALES_IOT       2        1550

SQL> ANALYZE INDEX album_sales_iot_pk VALIDATE STRUCTURE;

Index analyzed.

SQL> SELECT BLOCKS, BR_BLKS, LF_BLKS FROM index_stats;

    BLOCKS    BR_BLKS    LF_BLKS
---------- ---------- ----------
      1664          4       1550

We notice it’s smaller than the corresponding overloaded index for the Heap Table. The previous index consisted of 2048 blocks and 2006 leaf blocks but this index is somewhat smaller at just 1664 blocks and 1550 leaf blocks.

If we take a look at a partial block dump of a leaf block from the IOT:

Leaf block dump
===============
header address 483926620=0x1cd8225c
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 1
kdxcoopc 0x90: opcode=0: iot flags=I— is converted=Y
kdxconco 2
kdxcosdc 2
kdxconro 336
kdxcofbo 708=0x2c4
kdxcofeo 710=0x2c6
kdxcoavs 2
kdxlespl 0
kdxlende 0
kdxlenxt 20976645=0x1401405
kdxleprv 20976643=0x1401403
kdxledsz 0
kdxlebksz 8036
row#0[710] flag: K—S-, lock: 2, len=22
col 0; len 2; (2):  c1 08
col 1; len 2; (2):  c1 49
tl: 14 fb: –H-FL– lb: 0x0  cc: 2
col  0: [ 5]  c4 04 2f 10 59
col  1: [ 4]  47 4f 4c 44
row#1[732] flag: K—S-, lock: 2, len=22
col 0; len 2; (2):  c1 08
col 1; len 2; (2):  c1 4a
tl: 14 fb: –H-FL– lb: 0x0  cc: 2
col  0: [ 5]  c4 03 01 03 46
col  1: [ 4]  47 4f 4c 44

Firstly, we notice it’s definitely an IOT block dump as the IOT flag is set.

The structure of the index entry is somewhat different here. It basically consists of:

2 bytes for lock and flag info in the index header as previously

Next come the two Primary Key columns with their corresponding length bytes. Note an IOT must have a PK defined.

Following are 3 bytes for the table header consisting of a lock byte, flag byte and a byte to denote the number of table (non PK) columns (in this case 2).

Followed finally by the 2 Non-PK columns and their corresponding length bytes.

Note the big missing component here from the previous block dump is that there is no rowid defined with its corresponding length byte. No need for a rowid if there’s no corresponding table to point down to …

So the overall overhead has been reduced to:

2 byes for the index header

3 bytes for the table header

4 bytes for the 4 column lengths

for a total of 9 bytes, 4 less than the 13 bytes overhead required in the previous example. So the total length of an index entry has reduced down from 26 bytes to just 22 bytes. Hence, the overall reduction in the size of the corresponding IOT index.

So we have saved 1570 table blocks and 384 index blocks in total.

If we now re-run the same query:

SQL> SELECT * FROM album_sales_iot WHERE album_id = 42;

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1834499174

---------------------------------------------------------------------------------------
| Id  | Operation        | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT |                    |   100 |  1800 |     3   (0)| 00:00:01 |
|*  1 |  INDEX RANGE SCAN| ALBUM_SALES_IOT_PK |   100 |  1800 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("ALBUM_ID"=42)

 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
       3575  bytes sent via SQL*Net to client
        589  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

Not only have we saved ourselves some storage and having to maintain two physical segments, but things are a tad more efficient as well, reducing the number of consistent gets down from 11 to 10 as the corresponding index segment we need to access is smaller …

Enough to start with for now and yes the pun in the title is fully intended 🙂

Comments»

1. Uwe Hesse - January 10, 2012

Richard,
great post! I like your introduction to motivate IOTs – it is very similar to what I’ve told my students for years when they see the concept of an IOT for the first time: “Wouldn’t it be nice if we can somehow just have the index, but without the underlining table.” Exactly – just that I take scott.dept as an example 🙂

Like

Richard Foote - January 13, 2012

Hi Uwe

Thank-you !!

Nothing wrong with scott.dept, except perhaps the lack of Bowie references 🙂

Like

2. IOTs by the Oracle Indexing Expert « Martin Widlake's Yet Another Oracle Blog - January 10, 2012

[…] pleased to see that Richard Foote has started a series on Index Organized Tables. You can see his introductory post on the topic here. As ever with Richard, he puts in lots of detail and explanation and I’ve been a fan of hit […]

Like

3. Brian Tkatch - January 10, 2012

Wow, great intro!

Thanx Richard.

Like

Richard Foote - January 13, 2012

Hi Brian

Thank-you !!

Just hope the remaining posts are considered as good 🙂

Like

4. jgarry - January 12, 2012

Underlying table. We’re talking databases, not html 🙂

Like

Richard Foote - January 13, 2012

Ooops 🙂

Like

5. Index Organized Tables – Overflow Segment Part II (The Loneliest Guy) « Richard Foote’s Oracle Blog - January 18, 2012

[…] discussed in the IOT Introduction post, the structure of an index entry in an IOT has the PK columns as the leading columns, following by […]

Like

6. IOT Secondary Indexes – The Logical ROWID Guess Component Part I (Lucky) « Richard Foote’s Oracle Blog - April 26, 2012

[…] discussed previously, an index entry within a Secondary Index on an Index Organized Table (IOT) basically consists of […]

Like

7. Santosh - July 24, 2012

Really its gr8 notes to get introduced to IOT…
Thanks a lot

Santosh

Like

Richard Foote - September 19, 2012

Hi Santosh

My pleasure 🙂

Like

8. Cardio4life - August 26, 2012

When I initially left a comment I seem to have clicked on the -Notify me when new comments are added- checkbox and from now on every time a comment is added I receive four emails with the exact same comment. Is there a way you can remove me from that service? Many thanks!

Like

Richard Foote - September 19, 2012

Hi Cardio

I’ve had a look around but can find no way to turn things off at my end. Perhaps someone else on wordpress can give me a hint ?

Like

9. IOTs | Oracle Scratchpad - February 10, 2014

[…] An introduction […]

Like

10. sachinperfdba - May 1, 2017

Hi Richard,
A very helpful series of articles on IOTs.
I have one question. In the example you gave we get the space savings when using index organized tables.
But in some cases actually i see more space used when using IOTs. Example the one used by Martin in this link

IOT 2 – First examples and proofs

I see increased space usage (more than the combined space used by heap table + its index) for one of our application table as well.

Can you please suggest why more space used in these cases.

Regds.

Like

Richard Foote - May 12, 2017

It’s usually due to the additional free space generated by index block splits.

In the link you reference, notice how the index entries are basically randomised with the leading column randomly generated. Therefore, Oracle performs 50-50 block splits as the IOT grows. Notice also that there are quite some extra columns in addition to the PK columns, so the IOT is enlarged with these columns, making the impact of these splits larger.

In the heap example, the PK index is much smaller as it only has the 2 columns, so when it splits it has less overall impact on overall storage. With the table, it just inserted as normal with little free space.

So yes, IOT that are randomly inserted, that have larger number of columns in addition to the PK can quite easily consume more storage than equivalent heap tables. This difference can increase further if you also have secondary indexes as well.

Liked by 1 person


Leave a reply to Santosh Cancel reply