jump to navigation

Index Skip Scan: Potential Use Case or Maybe Not ? (Shine On You Crazy Diamond) January 30, 2018

Posted by Richard Foote in Oracle Indexes.
5 comments

shine-on-you-crazy-diamond

While answering a recent question on a LinkedIn forum, it got me thinking whether there’s a potential use case for using an INDEX SKIP SCAN I hadn’t previously considered.

I’ve discussed Index Skip Scans previously (as I did here), a feature introduced around Oracle9i that allows an index to be considered by the CBO even if the leading column of the index is not included in a query predicate. However, the index is only going to be used by the CBO if there are relatively few distinct values in the missing leading column, as Oracle has to effectively scan the index multiple times for each potential leading column value. If there are too many distinct values, each scan might not result in sufficient index leaf blocks being “skipped” thereby making the INDEX SKIP SCAN access path too inefficient.

But it occurred to me that the strategy of using an index skip scan could also potentially be applied when performing a sort-based aggregate function. Or maybe not.

So a little test case to find out.

I begin by creating a table with 1M rows. The key here though is that the CODE column only has 3 distinct values, so any aggregation based on CODE will only return 3 or fewer rows.

SQL> create table bowie (id number not null, code number not null, sales number not null, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,3), ceil(dbms_random.value(0,10000)), 'David Bowie' from dual connect by level  commit;

Commit complete.

I now create an index based on the CODE and SALES columns, with CODE the leading column:

SQL> create index bowie_i on bowie(code, sales);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

As CODE only has very few distinct values, the index is a candidate for an INDEX SKIP SCAN if CODE is not specified in a predicate. For example:

SQL> select code, sales from bowie where sales=1;

100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3861840421

----------------------------------------------------------------------------
| Id | Operation        | Name    | Rows | Bytes | Cost (%CPU) | Time     |
----------------------------------------------------------------------------
| 0  | SELECT STATEMENT |         |  100 |   700 |       5 (0) | 00:00:01 |
|* 1 | INDEX SKIP SCAN  | BOWIE_I |  100 |   700 |       5 (0) | 00:00:01 |
----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("SALES"=1)
    filter("SALES"=1)

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
  31 consistent gets
   0 physical reads
   0 redo size
2305 bytes sent via SQL*Net to client
 674 bytes received via SQL*Net from client
   8 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
 100 rows processed

So an INDEX SKIP SCAN has indeed been used by the CBO and at just 31 consistent gets, not a bad result when fetching the 100 rows of interest. With so few distinct values of CODE, Oracle only has to perform a relatively few number of scans of the index to retrieve all possible SALES of interest across each og the (3) CODE values. At a cost of just 5, the CBO has estimated that relatively few index leaf blocks indeed need to be accessed here.

It’s also worth mentioning at this point that Oracle can also use the index very effectively to return just the MIN (or potentially MAX) SALES column for each CODE value of interest, as it only has to read the first CODE index entry to subsequently determine the associated minimum SALES value:

SQL> select min(sales) from bowie where code=1;

MIN(SALES)
----------
1

Execution Plan
----------------------------------------------------------
Plan hash value: 2634611566

----------------------------------------------------------------------------------------
| Id | Operation                  | Name    | Rows | Bytes | Cost (%CPU) | Time      |
----------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT           |         |    1 |     7 |       3 (0) |  00:00:01 |
| 1  | SORT AGGREGATE             |         |    1 |     7 |             |           |
| 2  | FIRST ROW                  |         |    1 |     7 |       3 (0) |  00:00:01 |
|* 3 | INDEX RANGE SCAN (MIN/MAX) | BOWIE_I |    1 |     7 |       3 (0) |  00:00:01 |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("CODE"=1)

Statistics
----------------------------------------------------------
  0 recursive calls
  0 db block gets
  3 consistent gets
  0 physical reads
  0 redo size
544 bytes sent via SQL*Net to client
608 bytes received via SQL*Net from client
  2 SQL*Net roundtrips to/from client
  0 sorts (memory)
  0 sorts (disk)
  1 rows processed

At just 3 consistent gets, Oracle has only had to read the first CODE=1 index entry to immediately determine the minimum associated SALES value.

In theory, Oracle could use the same strategies when processing a GROUP BY aggregate query to very quickly and efficiently determine the minimum SALES value for each distinct CODE in my data. Oracle knows there are only 3 (few) distinct key values and with a combination of using the index to quickly access the minimum (first) SALES of each CODE value and an INDEX SKIP SCAN to quickly re-scan the index to get to the next CODE index entry, an index could be used to very quickly and efficiently find and retrieve the necessary data set in the following query:

SQL> select code, min(sales) from bowie group by code;

CODE MIN(SALES)
---------- ----------
1 1
2 1
0 1

Execution Plan
----------------------------------------------------------
Plan hash value: 2387048003

---------------------------------------------------------------------------------
| Id | Operation            | Name    | Rows  | Bytes | Cost (%CPU) | Time     |
---------------------------------------------------------------------------------
| 0  | SELECT STATEMENT     |         |     3 |    21 |   1079 (43) | 00:00:01 |
| 1  | HASH GROUP BY        |         |     3 |    21 |   1079 (43) | 00:00:01 |
| 2  | INDEX FAST FULL SCAN | BOWIE_I | 1000K | 6835K |    677 (10) | 00:00:01 |
---------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
2477 consistent gets
   0 physical reads
   0 redo size
 677 bytes sent via SQL*Net to client
 608 bytes received via SQL*Net from client
   2 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   3 rows processed

However, only an INDEX FAST FULL SCAN is used at a relatively expensive 2477 consistent gets. Note that Oracle is using a HASH GROUP BY for its aggregation by default (and not some form of sort/group type aggregation). As a result, the index is not considered here and note also that the final result set is NOT in CODE order (the data is returned in 1,2,0 CODE order).

We could try to force the use of an INDEX SKIP SCAN via a hint:

SQL> select /*+ index_ss (bowie, bowie_i) */ code, min(sales) from bowie group by code;

CODE MIN(SALES)
---------- ----------
0 1
1 1
2 1

Execution Plan
----------------------------------------------------------
Plan hash value: 944722262

--------------------------------------------------------------------------------
| Id | Operation            | Name    | Rows  | Bytes | Cost (%CPU) | Time     |
--------------------------------------------------------------------------------
| 0  | SELECT STATEMENT     |         |     3 |    21 |    2558 (4) | 00:00:01 |
| 1  | SORT GROUP BY NOSORT |         |     3 |    21 |    2558 (4) | 00:00:01 |
| 2  | INDEX SKIP SCAN      | BOWIE_I | 1000K | 6835K |    2558 (4) | 00:00:01 |
--------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
2469 consistent gets
   0 physical reads
   0 redo size
 677 bytes sent via SQL*Net to client
 608 bytes received via SQL*Net from client
   2 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   3 rows processed

It’s now using an INDEX SKIP SCAN in the plan. But it’s not being “clever” in its use of the INDEX SKIP SCAN with 2469 consistent gets suggesting Oracle is not effectively making use of the (MIN/MAX) scan capability per access of distinct CODE and is unnecessarily reading most of the index leaf blocks.

Oracle however is now using a sort/group based process when performing it’s aggregation (as evidenced in the second step of the plan), resulting in the data now being returned in CODE order.

Even if we take the aggregation out of the equation with a simple MIN based query accessing more than one CODE value:

SQL> select min(sales) from bowie where code in (1,2);

Execution Plan
----------------------------------------------------------
Plan hash value: 1467404054

------------------------------------------------------------------------------
| Id | Operation        | Name    | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------
| 0  | SELECT STATEMENT |         |    1 |     7 |       3 (0) |  00:00:01 |
| 1  | SORT AGGREGATE   |         |    1 |     7 |             |           |
| 2  | INLIST ITERATOR  |         |      |       |             |           |
|* 3 | INDEX RANGE SCAN | BOWIE_I |    1 |     7 |       3 (0) |  00:00:01 |
------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("CODE"=1 OR "CODE"=2)

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
1676 consistent gets
   0 physical reads
   0 redo size
 544 bytes sent via SQL*Net to client
 608 bytes received via SQL*Net from client
   2 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   1 rows processed

The CBO still doesn’t consider INDEX SKIP SCAN type of processing in this scenario and is using a relatively inefficient INDEX RANGE SCAN instead.

So unfortunately, although I have an existing index in place that if only used effectively could potentially return the GROUP BY result set very efficiently, the CBO is not using the index. The CBO doesn’t appear to be able use an INDEX SKIP SCAN in combination with multiple MIN/MAX scans in scenarios when it perhaps could.

Announcement: Oracle Indexing Internals Seminars Coming to New Zealand in March 2018 (Shipyards of New Zealand) January 22, 2018

Posted by Richard Foote in Oracle Indexes.
add a comment

Richard Let's Talk Database Nov 2015

I’m very pleased to announce I’ve now finalised some dates in New Zealand for my popular and highly acclaimed “Oracle Indexing Internals and Best Practices” seminar. They are:

Wellington 12-13 March 2018 (Auldhouse: Level 8 Lumley House, 11 Hunter Street): Tickets and Registration Link

Auckland 15-16 March 2018 (Karstens: Level 4, 205 Queen Street): Tickets and Registration Link

As usual, numbers will be strictly limited due to the small class nature of the seminars, so if interested I advise booking early to avoid disappointment as seminars will obviously not be scheduled too regularly in NZ.

Early bird rates are available until 23 March or until the seminars are full. Venues have now been finalised and are listed above.

I hope to catch up with some of my many NZ friends at one of these events 🙂