jump to navigation

“Hidden” Efficiencies of Non-Partitioned Indexes on Partitioned Tables Part I (The Jean Genie) October 4, 2018

Posted by Richard Foote in Global Indexes, Local Indexes, Non-Partitioned Indexes, Oracle Indexes, Partitioned Indexes, Partitioning.
trackback

jean genie

When it comes to indexing a partitioned table, many automatically opt for Local Indexes, as it’s often assumed they’re simply easier to manage and more efficient than a corresponding Global Index.

Having smaller index structures that are aligned to how the table is partitioned certainly has various advantages. The focus in this little series is on the “more efficient” claim.

A key point that many miss is that a Non-Partitioned Index on a Non-Partitioned table is not exactly the same beast as a Non-Partitioned Index on a Partitioned Table. The purpose of this initial post is to illustrate this difference.

Let’s begin by creating a Non-Partitioned table that has a number of years worth of data:

SQL> CREATE TABLE big_ziggy (id number, album_id number, country_id number, release_date date, total_sales number);

Table created.

SQL> INSERT INTO big_ziggy
     SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
            mod(rownum,200000)+1
     FROM dual CONNECT BY LEVEL  2000000;

2000000 rows created.

SQL> commit;
      
Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_ZIGGY');

PL/SQL procedure successfully completed.

So we have a 2M row table with about 8 years worth of data (based on the RELEASE_DATE column) and a TOTAL_SALES column that has some 200,000 distinct columns throughout this period.

Let’s next create a standard Non-Partitioned index based on the TOTAL_SALES column:

SQL> create index big_ziggy_total_sales_i on big_ziggy(total_sales);

Index created.

If we now run a query to access the 10 rows with a value equal to 42:

SQL> SELECT * FROM big_ziggy WHERE total_sales = 42;

        ID   ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES
---------- ---------- ---------- --------- -----------
   1400041         42         42 24-AUG-18          42
    400041         42         42 28-JAN-12          42
   1000041         42         42 20-JUL-17          42
   1800041         42         42 28-JAN-12          42
    600041         42         42 15-JUN-16          42
    800041         42         42 03-MAR-13          42
   1200041         42         42 07-APR-14          42
   1600041         42         42 12-MAY-15          42
    200041         42         42 12-MAY-15          42
        41         42         42 24-AUG-18          42

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1252095634

---------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                    | Rows | Bytes | Cost (%CPU) | Time     |
---------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                         |   10 |   250 |      13 (0) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED | BIG_ZIGGY               |   10 |   250 |      13 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | BIG_ZIGGY_TOTAL_SALES_I |   10 |       |       3 (0) | 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("TOTAL_SALES"=42)

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
  14 consistent gets
   0 physical reads
   0 redo size
1184 bytes sent via SQL*Net to client
 624 bytes received via SQL*Net from client
   2 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
  10 rows processed

We notice we need 14 consistent gets to access these 10 rows, 4 gets for index block accesses and 10 gets to access the relevant rows from the table blocks (as we have a terrible clustering due to the relevant data being distributed throughout the table).

If we run a query where we’re only interested in accessing data only within a specific year:

SQL> SELECT * FROM big_ziggy WHERE total_sales = 42 and release_date
between '01-JAN-2017' and '31-JUL-2017';

        ID   ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES
---------- ---------- ---------- --------- -----------
   1000041         42         42 20-JUL-17          42

Execution Plan
----------------------------------------------------------
Plan hash value: 1252095634

---------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                    | Rows | Bytes | Cost (%CPU) | Time     |
---------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |                         |    1 |    25 |      13 (0) | 00:00:01 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED | BIG_ZIGGY               |    1 |    25 |      13 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | BIG_ZIGGY_TOTAL_SALES_I |   10 |       |       3 (0) | 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("RELEASE_DATE">=TO_DATE(' 2017-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "RELEASE_DATE"<=TO_DATE(' 2017-07-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

2 - access("TOTAL_SALES"=42)

Statistics
----------------------------------------------------------
  0 recursive calls
  0 db block gets
 14 consistent gets
  0 physical reads
  0 redo size
885 bytes sent via SQL*Net to client
624 bytes received via SQL*Net from client
  2 SQL*Net roundtrips to/from client
  0 sorts (memory)
  0 sorts (disk)
  1 rows processed

Even though we now only return the one row,  notice we still have to perform the same 14 consistent gets. That’s because the RELEASE_DATE column is NOT part of the index, so we still need to fetch all 10 matching rows with TOTAL_SALES=42 and then filter out those that don’t have a RELEASE_DATE of interest. The note above in the predicate information shows we now have this additional filtering taking place.

Let’s run the same queries on a table with identical data, but this time on a table that is partitioned based on the RELEASE_DATE column, with a partition for each years worth of data:

SQL> CREATE TABLE big_bowie(id number, album_id number, country_id number, release_date date, total_sales number)
2 PARTITION BY RANGE (release_date)
3 (PARTITION ALBUMS_2011 VALUES LESS THAN (TO_DATE('01-JAN-2012', 'DD-MON-YYYY')),
4 PARTITION ALBUMS_2012 VALUES LESS THAN (TO_DATE('01-JAN-2013', 'DD-MON-YYYY')),
5 PARTITION ALBUMS_2013 VALUES LESS THAN (TO_DATE('01-JAN-2014', 'DD-MON-YYYY')),
6 PARTITION ALBUMS_2014 VALUES LESS THAN (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')),
7 PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')),
8 PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')),
9 PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')),
10 PARTITION ALBUMS_2018 VALUES LESS THAN (MAXVALUE));

Table created.

SQL> INSERT INTO big_bowie
     SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
            mod(rownum,200000)+1
     FROM dual CONNECT BY LEVEL  2000000;

2000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_BOWIE');

PL/SQL procedure successfully completed.

Again, we create a standard, Non-Partitioned Index:

SQL> create index big_bowie_total_sales_i on big_bowie(total_sales);

Index created.

If we now run the equivalent of the first query:

SQL> SELECT * FROM big_bowie WHERE total_sales = 42;

        ID   ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES
---------- ---------- ---------- --------- -----------
    400041         42         42 28-JAN-12          42
   1800041         42         42 28-JAN-12          42
    800041         42         42 03-MAR-13          42
   1200041         42         42 07-APR-14          42
   1600041         42         42 12-MAY-15          42
    200041         42         42 12-MAY-15          42
    600041         42         42 15-JUN-16          42
   1000041         42         42 20-JUL-17          42
        41         42         42 24-AUG-18          42
   1400041         42         42 24-AUG-18          42

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1761527485

--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                                  | Name                    | Rows | Bytes | Cost (%CPU) | Time     | Pstart | Pstop |
--------------------------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                           |                         |   10 |   250 |      13 (0) | 00:00:01 |        |       |
|  1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE               |   10 |   250 |      13 (0) | 00:00:01 |  ROWID | ROWID |
|* 2 | INDEX RANGE SCAN                           | BIG_BOWIE_TOTAL_SALES_I |   10 |       |       3 (0) | 00:00:01 |        |       |
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("TOTAL_SALES"=42)

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
  14 consistent gets
   0 physical reads
   0 redo size
1184 bytes sent via SQL*Net to client
 624 bytes received via SQL*Net from client
   2 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
  10 rows processed

We get the exact same performance, with the same 14 consistent gets necessary to access the 10 rows of interest.

If we now run the equivalent of the second query:

SQL> SELECT * FROM big_bowie WHERE total_sales = 42 and release_date between '01-JAN-2017' and '31-JUL-2017';

        ID   ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES
---------- ---------- ---------- --------- -----------
   1000041         42         42 20-JUL-17          42

Execution Plan
----------------------------------------------------------
Plan hash value: 1081241859

--------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation                                  | Name                    | Rows | Bytes | Cost (%CPU) | Time     | Pstart | Pstop |
--------------------------------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                           |                         |    1 |    25 |      13 (0) | 00:00:01 |        |       |
|* 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE               |    1 |    25 |      13 (0) | 00:00:01 |      7 |     7 |
|* 2 | INDEX RANGE SCAN                           | BIG_BOWIE_TOTAL_SALES_I |   10 |       |       3 (0) | 00:00:01 |        |       |
--------------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("RELEASE_DATE"=TO_DATE(' 2017-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

2 - access("TOTAL_SALES"=42)

Statistics
----------------------------------------------------------
  0 recursive calls
  0 db block gets
  5 consistent gets
  0 physical reads
  0 redo size
885 bytes sent via SQL*Net to client
624 bytes received via SQL*Net from client
  2 SQL*Net roundtrips to/from client
  0 sorts (memory)
  0 sorts (disk)
  1 rows processed

We notice a key difference. Even though it’s equivalent to the same index as in the previous Non-Partitioned table and even though the index only contains just the TOTAL_SALES column, the number of consistent gets has dropped from 14 to just 5 consistent gets.

In this example, Oracle has clearly not had to fetch the rows from the table that do not match the RELEASE_DATE of interest. Even though the predicate information is listing the requirement for filtering to take place, this filtering has clearly been performed within the index, without having to actually fetch any of the rows that aren’t of interest.

The index is able to only access the row(s) of interest from the Partitioned Table…

This is the little “hidden efficiency” of Global Indexes on Partitioned Tables, which is what we effectively have here.

In Part II, I’ll discuss how Oracle does this additional filtering within the index and why understanding this is important in deciding which type of index to deploy, as from a “performance” perspective, Global Indexes are often the preferred option.

Advertisements

Comments»

1. Yusuf Nar - October 5, 2018

Is it because a global index on a non-partitioned table does include the partition column as part of the index so that the query didn’t need to fetch the data from the table?

Like

Richard Foote - October 5, 2018

Hi Yusuf

No, that’s not the explanation. Stay tuned 🙂

Cheers

Richard

Like

2. “Hidden” Efficiencies of Non-Partitioned Indexes on Partitioned Tables Part II (Aladdin Sane) | Richard Foote's Oracle Blog - October 9, 2018

[…] Part I of this series, I highlighted how a Non-Partitioned Global Index on a Partitioned Table is able to […]

Like

3. “Hidden” Efficiencies of Non-Partitioned Indexes on Partitioned Tables Part III” (Ricochet) | Richard Foote's Oracle Blog - October 25, 2018

[…] Part I and Part II of this series, we looked at how Global Indexes can effectively perform […]

Like

4. “Hidden” Efficiencies of Non-Partitioned Indexes on Partitioned Tables Part IV” (Hallo Spaceboy) | Richard Foote's Oracle Blog - October 31, 2018

[…] Part I, Part II and Part III we looked at some advantages of Global Indexes that may not be obvious to […]

Like


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: