jump to navigation

IOT Secondary Indexes – The Logical ROWID Guess Component Part I (Lucky) April 26, 2012

Posted by Richard Foote in Index Organized Tables, IOT, Oracle Indexes, Primary Key, ROWID, Secondary Indexes.
8 comments

As discussed previously, an index entry within a Secondary Index on an Index Organized Table (IOT) basically consists of the indexed column(s) and the Logical Rowid, the PK column(s) and a “guess” to the physical block in the IOT containing the corresponding row.

Let’s discuss this “guess” component in a bit more detail.

When the Secondary Index is created, this guess is spot on and will indeed point to the correct block within the IOT structure that contains the row being referenced by the indexed entry.

When I initially created the Secondary Index on the Total_Sales column, all the physical guesses were accurate and indeed pointed to the correct blocks within the IOT structure. This can be confirmed by the following query:

SQL> SELECT index_name, pct_direct_access, iot_redundant_pkey_elim
  2  FROM dba_indexes WHERE index_name = 'ALBUM_SALES_IOT_TOTAL_SALES_I';
 INDEX_NAME                     PCT_DIRECT_ACCESS IOT
------------------------------ ----------------- ---
ALBUM_SALES_IOT_TOTAL_SALES_I                100 NO

As we can see, the PCT_DIRECT_ACCESS value is 100, which means that 100% of all the guess components are correct. Therefore, the index behaves in a manner very similar to an ordinary Secondary Index with a rowid, in that all the initial accesses to the IOT are valid and there’s no need to subsequently re-access the IOT via the PK component. From the perspective of finding the required row entries with the IOT structure, the Secondary Index is as efficient as possible when all the guesses are valid.

If we run a little query to access a number of rows via this Secondary Index:

SQL> SELECT * FROM album_sales_iot
  2  WHERE total_sales BETWEEN 2742000 and 2743000;

99 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1433198708

---------------------------------------------------------------------------------------------------
| Id  | Operation         | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                               |   102 |  1836 |   105   (0)| 00:00:02 |
|*  1 |  INDEX UNIQUE SCAN| ALBUM_SALES_IOT_PK            |   102 |  1836 |   105   (0)| 00:00:02 |
|*  2 |   INDEX RANGE SCAN| ALBUM_SALES_IOT_TOTAL_SALES_I |   102 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TOTAL_SALES">=2742000 AND "TOTAL_SALES"<=2743000)
   2 - access("TOTAL_SALES">=2742000 AND "TOTAL_SALES"<=2743000)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        110  consistent gets
          0  physical reads
          0  redo size
       3657  bytes sent via SQL*Net to client
        590  bytes received via SQL*Net from client
          8  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         99  rows processed

Note we’re returning 99 rows which requires 110 consistent gets. So a touch over 1 consistent get per row being access. Note these numbers, we’ll reference them again later …

OK, we’re now going to add some more rows to the table. This will result in 50-50 block splits occurring which will in turn cause a whole bunch of rows to move to new physical blocks within the IOT.

SQL> BEGIN
  2    FOR i IN 1..5000 LOOP
  3       FOR c IN 101..200 LOOP
  4          INSERT INTO album_sales_iot
  5          VALUES(i,c,ceil(dbms_random.value(1,5000000)), 'Some new rows');
  6       END LOOP;
  7    END LOOP;
  8    COMMIT;
  9  END;
 10  /
 PL/SQL procedure successfully completed.

If we now collect fresh statistics and look at the index statistics again:

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'ALBUM_SALES_IOT', estimate_percent=> null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT index_name, pct_direct_access, iot_redundant_pkey_elim
  2  FROM dba_indexes WHERE index_name = 'ALBUM_SALES_IOT_TOTAL_SALES_I';

INDEX_NAME                     PCT_DIRECT_ACCESS IOT
------------------------------ ----------------- ---
ALBUM_SALES_IOT_TOTAL_SALES_I                 58 NO

We notice that the PCT_DIRECT_ACCESS value has dropped significantly to just 58%. This means that only 58% of the guesses are now accurate and that in the other 42% of cases, Oracle is forced to now re-access the IOT again via the PK component stored in the Secondary Indexes. This results in additional consistent gets now likely being required to access the IOT via the index, resulting in a less efficient index.

If we now re-run the original query again:

SQL> SELECT * FROM album_sales_iot
  2  WHERE total_sales BETWEEN 2742000 and 2743000;

184 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1433198708

---------------------------------------------------------------------------------------------------
| Id  | Operation         | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                               |   202 |  4646 |   376   (0)| 00:00:05 |
|*  1 |  INDEX UNIQUE SCAN| ALBUM_SALES_IOT_PK            |   202 |  4646 |   376   (0)| 00:00:05 |
|*  2 |   INDEX RANGE SCAN| ALBUM_SALES_IOT_TOTAL_SALES_I |   202 |       |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TOTAL_SALES">=2742000 AND "TOTAL_SALES"<=2743000)
   2 - access("TOTAL_SALES">=2742000 AND "TOTAL_SALES"<=2743000)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        463  consistent gets
          0  physical reads
          0  redo size
       7144  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
         14  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        184  rows processed

We can see that approximately doubled the number of rows are now returned (184 from 99 rows). However, the number of consistent gets has increased by approximately 4 fold (from 110 to 463). The index is now not as efficient in retrieving rows as it was previously, requiring now some 2.5 consistent gets per row being accessed.

To fix these guesses and make the index more efficient again, one can either ALTER the index with the REBUILD or the UPDATE BLOCK REFERENCES clause:

SQL> alter index album_sales_iot_total_sales_i UPDATE BLOCK REFERENCES;

Index altered.

If we now look at some fresh index statistics:

SQL> exec dbms_stats.gather_index_stats(ownname=> null, indname=> 'ALBUM_SALES_IOT_TOTAL_SALES_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT index_name, pct_direct_access, iot_redundant_pkey_elim
  2  FROM dba_indexes WHERE index_name = 'ALBUM_SALES_IOT_TOTAL_SALES_I';

INDEX_NAME                     PCT_DIRECT_ACCESS IOT
------------------------------ ----------------- ---
ALBUM_SALES_IOT_TOTAL_SALES_I                100 NO

We notice that the index now has the PCT_DIRECT_ACCESS back at a nice high 100%. If we re-run the same query again:

SQL> SELECT * FROM album_sales_iot
  2  WHERE total_sales BETWEEN 2742000 and 2743000;

184 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1433198708

---------------------------------------------------------------------------------------------------
| Id  | Operation         | Name                          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |                               |   202 |  4646 |   206   (0)| 00:00:03 |
|*  1 |  INDEX UNIQUE SCAN| ALBUM_SALES_IOT_PK            |   202 |  4646 |   206   (0)| 00:00:03 |
|*  2 |   INDEX RANGE SCAN| ALBUM_SALES_IOT_TOTAL_SALES_I |   202 |       |     4   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("TOTAL_SALES">=2742000 AND "TOTAL_SALES"<=2743000)
   2 - access("TOTAL_SALES">=2742000 AND "TOTAL_SALES"<=2743000)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
        202  consistent gets
          0  physical reads
          0  redo size
       7144  bytes sent via SQL*Net to client
        656  bytes received via SQL*Net from client
         14  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)

The consistent gets have now reduced substantially from 463 to just 202, back at a touch over 1 consistent get per row retrieved …

So, IOT Secondary Indexes can be as efficient as normal secondary indexes, but if the IOT is subject to 50-50 block splits, they’ll likely need to be maintained more regularly to ensure they stay nice and efficient. Another significant disadvantage associated with IOTs that have Secondary Indexes …

There’s a bit more I’ll like to say on the subject but I’ll leave it for a Part II :)

Follow

Get every new post delivered to your Inbox.

Join 1,918 other followers