Indexed Organized Tables – An Introduction to IOT Secondary Indexes (A Second Face) March 19, 2012
Posted by Richard Foote in Block Dumps, Index Internals, Index Organized Tables, IOT, Oracle Indexes, Secondary Indexes.14 comments
Man, its been ages since I had free time to update the blog, what with birthday parties to organise, Roger Water concerts to attend and Radiohead concerts in the planning !! OK, time to take an initial look at Secondary Indexes for Index Organized Tables (IOTs).
If the IOT needs to be accessed via the Primary Key (PK) column(s), then no problem, the IOT structure must have a PK defined and the logical structure of the IOT ensures that data within the IOT is ordered based on the PK. Therefore, the IOT can be navigated like any conventional PK and the necessary data can be efficiently accessed.
But what if we want to access the data efficiently via Non-PK columns or without specify the leading column of the PK ? Can we create secondary indexes on a IOT ?
When IOTs were first introduced way back in Oracle8, secondary indexes weren’t supported (they came later in 8i). That’s likely due to the fact Oracle had to resolve a tricky issue in relation to indexing an IOT structure, that being what to do when indexing rows that potentially move around all the time ?
With a conventional Heap table, once a row is inserted into the table, it doesn’t generally subsequently move. There are relatively few examples of when this occurs, for example updating the partitioned column of a row such that it needs to be stored in another partition. This is recognised as a rather expensive thing to do as not only do at least two blocks need to be accessed and modified but it also requires associated indexes to be updated as well. As such, it generally requires explicitly allowing such activities to occur (by enabling row movement and the such). Note, when rows migrate to another block due to an increase in row size, indexes are not impacted and still reference the original block and the remaining stub of the row which points to the new block/location of the row.
But with IOTs, the story can be very different. When a 50-50 index block split occurs, roughly half the rows in the leaf block move to a new block. A relatively expensive operation would be even more expensive if Oracle had to also update the index entries of all secondary indexes that referenced all these moved rows. Although rare with Heap tables, rows moving to new locations could be relatively common in an IOT due to associated 50-50 block split operations.
To deal with the difficulties of frequently moving rows within an IOT, Oracle created the IOT Secondary Index structure. It has three main components:
- The indexed column values
- The PK columns of the associated IOT
- A “guess” that points to the physical location of the rows within the IOT, initially at the time the index is created
So the IOT Secondary Index is used in the following fashion. During an index scan, Oracle attempts to use the “guess” to access the block that was the last known physical location of the row within the IOT. If it finds the required row in the IOT, great. The index performs in a similar manner to using a rowid with a conventional secondary index. However, if the required row is nowhere to be seen within the referenced block, Oracle tries again, this time using the PK value contained with the IOT Secondary Index to perform a Unique Scan of the IOT. This is a little more expensive to perform as it requires navigating down the branch structures of the IOT, but is at least guaranteed to find the row this time in its current location.
So in the best case scenario, the index performs similar to that of a normal secondary index. In the worst case scenario where the row has moved, the index is forced to perform an additional Unique Scan of the IOT using the PK but at least this has the potential to be much more efficient that a Fast Full Scan of the IOT in order to find the necessary row.
The key point to note here is that the secondary index is not updated when a block split on the parent IOT occurs. The “guess” via the physical pointer reference simply becomes stale and the PK which is also stored within the secondary index is used as a backup method of accessing the required row.
If we start with a traditionally simple little demo, let’s first create and populate an IOT:
SQL> CREATE TABLE album_sales_IOT(album_id number, country_id number, total_sales number, album_colour varchar2(20), CONSTRAINT album_sales_iot_pk PRIMARY KEY(album_id, country_id)) ORGANIZATION INDEX; Table created. SQL> begin 2 for i in 1..5000 loop 3 for c in 1..100 loop 4 insert into album_sales_iot values (i, c, ceil(dbms_random.value(1,5000000)), 'GOLD'); 5 end loop; 6 end loop; 7 commit; 8 end; 9 / PL/SQL procedure successfully completed. SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=> 'ALBUM_SALES_IOT', cascade=> true, estimate_percent=> null, method_opt=>'FOR ALL COLUMNS SIZE 1'); PL/SQL procedure successfully completed.
If we now run a query based on the non-PK TOTAL_SALES column:
SQL> select * from album_sales_iot where total_sales = 2000; ALBUM_ID COUNTRY_ID TOTAL_SALES ALBUM_COLOUR ---------- ---------- ----------- -------------------- 1764 56 2000 GOLD Execution Plan ---------------------------------------------------------- Plan hash value: 1789589470 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 18 | 425 (1)| 00:00:06 | |* 1 | INDEX FAST FULL SCAN| ALBUM_SALES_IOT_PK | 1 | 18 | 425 (1)| 00:00:06 | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("TOTAL_SALES"=2000) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 1586 consistent gets 0 physical reads 0 redo size 757 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
We see that Oracle has no real choice (the PK is of no direct use here) but to perform an expensive FAST FULL INDEX SCAN, even though it correctly knows relatively few rows are to be retrieved.
If we create a secondary index on the IOT however:
SQL> create index album_sales_IOT_total_sales_i on album_sales_iot(total_sales); Index created. SQL> select * from album_sales_iot where total_sales = 2000; ALBUM_ID COUNTRY_ID TOTAL_SALES ALBUM_COLOUR ---------- ---------- ----------- -------------------- 1764 56 2000 GOLD Execution Plan ---------------------------------------------------------- Plan hash value: 1433198708 --------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 18 |4 (0)| 00:00:01 | |* 1 | INDEX UNIQUE SCAN| ALBUM_SALES_IOT_PK | 1 | 18 |4 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN| ALBUM_SALES_IOT_TOTAL_SALES_I | 1 | |3 (0)| 00:00:01 | --------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("TOTAL_SALES"=2000) 2 - access("TOTAL_SALES"=2000) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 5 consistent gets 5 physical reads 0 redo size 757 bytes sent via SQL*Net to client 523 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
We notice that the index is used as expected and the number of consistent gets has dropped significantly. Notice also that the IOT is accessed subsequently not via Index ROWIDs but by a INDEX UNIQUE SCAN via the IOT PK. More on this later …
If we look at a partial block dump of an index entry within the IOT Secondary index:
row#0[8015] flag: K—–, lock: 0, len=21
col 0; len 3; (3): c2 1f 28
col 1; len 3; (3): c2 15 37
col 2; len 2; (2): c1 1b
tl: 8 fb: –H-FL– lb: 0x0 cc: 1
col 0: [ 4] 01 01 41 da
col 0 represents the indexed value (TOTAL_SALES)
col 1 and col 2 represent the PK columns (ALBUM_ID and COUNTRY_ID)
Following the 3 byte table header overhead required for the “guess”, we have the second col 0, which represents the 4 byte “guess” to the last known physical location of the row.
Much more to follow shortly …