jump to navigation

12c Partial Indexes For Partitioned Tables Part I (Ignoreland) July 8, 2013

Posted by Richard Foote in 12c, Oracle Indexes, Partial Indexes, Partitioning.
8 comments

In my opinion, one of the unsung “Heroes” of the new 12c Oracle database are improvements associated with Partitioning. There are lots of really nice capabilities introduced in the 12c database that significantly improves the manageability of partitioned environments.

One of those new capabilities is the ability to now create both local and (importantly) global  indexes on only a subset of partitions within a partitioned table. This provides us with the flexibility to say only create partitions with data that would make sense to index, to not index current partitions where perhaps data insert performance is paramount, etc. Additionally and just as importantly, the CBO is aware of the indexing characteristics of individual partitions and can access partitions in differing manners accordingly.

To illustrate, a simple little demo as usual :) Firstly, I’ll create a partitioned table with the new INDEXING clause:

SQL> create table pink_floyd (id number, status varchar2(6), name varchar2(30))
INDEXING OFF
partition by range (id)
(partition pf1 values less than (1000001),
partition pf2 values less than (2000001) INDEXING OFF,
partition pf3 values less than (maxvalue) INDEXING ON);
Table created.

The INDEXING clause determines whether or not the partition is to be indexed. It can be set at the table level and so set the default behaviour for the table or at the individual partition/subpartition level.

In the above example, I’ve set INDEXING OFF at the table level and so indexing by default is not be enabled for the table partitions. Therefore the PF1 partition is not indexed by default. The PF2 partition is explicitly set to also not be indexed but the PF3 index is explicitly set (INDEXING ON) to enable indexing and so override the table level default.

Let’s now populate the table with some basic data:

SQL> insert into pink_floyd select rownum, 'CLOSED', 'DAVID BOWIE' from dual connect by level <= 3000000;

3000000 rows created.

SQL> commit;

Commit complete.

SQL> update pink_floyd set status = 'OPEN' where id > 2000000 and mod(id,10000)=0;

100 rows updated.

SQL> commit;

Commit complete.

Most of the data has a STATUS column value of ‘CLOSED’ but I’ve updated a few rows within just the last partition with a STATUS set to ‘OPEN’.

Let’s now create an index on this STATUS column and collect table statistics:

SQL> create index pink_floyd_status_i on pink_floyd(status);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'PINK_FLOYD', estimate_percent=>null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1 FOR COLUMNS STATUS SIZE 5');

PL/SQL procedure successfully completed.

SQL> select index_name, num_rows, leaf_blocks, indexing from dba_indexes where index_name = 'PINK_FLOYD_STATUS_I';

INDEX_NAME             NUM_ROWS LEAF_BLOCKS INDEXIN
-------------------- ---------- ----------- -------
PINK_FLOYD_STATUS_I     3000000        9203 FULL

By default, an index will include all partitions in a table, regardless of the INDEXING table clause setting. So this index covers all 3M rows in the table and currently has 9203 leaf blocks. The new INDEXING column in DBA_INDEXES shows us that this index is a FULL (non-Partial) index.

We can of course get the data of interest (STATUS = ‘OPEN’) via this index now:

SQL> select * from pink_floyd where status = 'OPEN';

100 rows selected.

Execution Plan
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                     |   964 | 24100 |     4   (0)| 00:00:01 |       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD          |   964 | 24100 |     4   (0)| 00:00:01 | ROWID | ROWID |
|*  2 |   INDEX RANGE SCAN                         | PINK_FLOYD_STATUS_I |   100 |       |     3   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("STATUS"='OPEN')

Statistics
----------------------------------------------------------

0  recursive calls

0  db block gets

110  consistent gets

0  physical reads

0  redo size

4479  bytes sent via SQL*Net to client

609  bytes received via SQL*Net from client

8  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

100  rows processed

However, we can potentially also run a query based on just the last partition as all the ‘OPEN’ statuses of interest only reside in this last partition:

SQL> select * from pink_floyd where status = 'OPEN' and id > 2000001;

100 rows selected.

Execution Plan
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                     |    99 |  2475 |     4   (0)| 00:00:01 |       |       |
|*  1 |  TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD          |    99 |  2475 |     4   (0)| 00:00:01 |     3 |     3 |
|*  2 |   INDEX RANGE SCAN                         | PINK_FLOYD_STATUS_I |   100 |       |     3   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID">2000001)
2 - access("STATUS"='OPEN')

Statistics
----------------------------------------------------------

0  recursive calls

0  db block gets

110  consistent gets

0  physical reads

0  redo size

2787  bytes sent via SQL*Net to client

609  bytes received via SQL*Net from client

8  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

100  rows processed

Currently however, the index includes data from all table partitions, even though we’re only really interested in using the index to retrieve the less common ‘OPEN’ status that resides in only the last table partition. With 12c, there is now the capability to only index those partitions that are of interest to us, which with proper design can also be implemented such that only those column values of interest are included within an index.

I’m going to drop and create the index as a “Partial” Index:

SQL> drop index pink_floyd_status_i;

Index dropped.

SQL> create index pink_floyd_status_i on pink_floyd(status) indexing partial;

Index created.

SQL> select index_name, num_rows, leaf_blocks, indexing from dba_indexes where index_name = 'PINK_FLOYD_STATUS_I';

INDEX_NAME             NUM_ROWS LEAF_BLOCKS INDEXIN
-------------------- ---------- ----------- -------
PINK_FLOYD_STATUS_I     1000000        3068 PARTIAL

The new INDEXING PARTIAL clause means only those table partitions with INDEXING ON are to be included within the index.

Notice how the index, which is a Global, Non-Partitioned Index, now only has 1M entries (not all 3M as previously) and with 3068 leaf blocks is only 1/3 of what it was previously. The INDEXING column now denotes this as a “Partial” index.

If we run the query again that only explicitly references the last “active” table partition:

SQL> select * from pink_floyd where status = 'OPEN' and id >= 2000001;

100 rows selected.

Execution Plan
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                     |   100 |  2500 |     4   (0)| 00:00:01 |       |       |
|*  1 |  TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD          |   100 |  2500 |     4   (0)| 00:00:01 |     3 |     3 |
|*  2 |   INDEX RANGE SCAN                         | PINK_FLOYD_STATUS_I |    33 |       |     3   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("ID">=2000001)
2 - access("STATUS"='OPEN')

Statistics
----------------------------------------------------------

0  recursive calls

0  db block gets

110  consistent gets

0  physical reads

0  redo size

2787  bytes sent via SQL*Net to client

609  bytes received via SQL*Net from client

8  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

100  rows processed

We see that the index is used as it was previously. By stating with the ID > 2000001 predicate we’re only interested in data that can only reside in the last table partition, the partition with INDEXING ON, the CBO knows the index can be used to retrieve all the rows of interest. If we know the application will only extract data in this manner, all is well with our smaller, Partial index.

However, if it’s possible within the application to perhaps search for STATUS values from other partitions, that have INDEXING OFF:

SQL> select * from pink_floyd where status = 'OPEN';

100 rows selected.

Execution Plan

----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                    | Name                | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                             |                     |   100 |  2500 |  2474   (1)| 00:00:01 |       |
|   1 |  VIEW                                        | VW_TE_2             |    99 |  3465 |  2474   (1)| 00:00:01 |       |
|   2 |   UNION-ALL                                  |                     |       |       |            |          |       |
|*  3 |    TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD          |    33 |   825 |     4   (0)| 00:00:01 | ROWID | ROWID
|*  4 |     INDEX RANGE SCAN                         | PINK_FLOYD_STATUS_I |   100 |       |     3   (0)| 00:00:01 |       |
|   5 |    PARTITION RANGE ITERATOR                  |                     |    66 |  1650 |  2470   (1)| 00:00:01 |     1 |     2
|*  6 |     TABLE ACCESS FULL                        | PINK_FLOYD          |    66 |  1650 |  2470   (1)| 00:00:01 |     1 |     2
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - filter("PINK_FLOYD"."ID">=2000001 OR "PINK_FLOYD"."ID" IS NULL)
4 - access("STATUS"='OPEN')
6 - filter("STATUS"='OPEN')

Statistics
----------------------------------------------------------

0  recursive calls

0  db block gets

16341  consistent gets

8204  physical reads

0  redo size

2787  bytes sent via SQL*Net to client

609  bytes received via SQL*Net from client

8  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

100  rows processed

We notice the index is still used to efficiently find those rows of interest from the last partition, but a Full Table (Partition) Scan is performed to search for data from the other two partitions, for which with INDEXING OFF means the index does not contain entries that reference these partitions. As a result, this query is now much more expensive than it was previously as the index can not be used to exclusively find the rows of interest. The CBO within the one execution plan uses the index where it can and a full scan of the other partitions where it can’t use the index.

If however we were a little cleverer in how we designed our table and also created table subpartitions based on the STATUS column so that rows of interest resided in their own subpartitions and only set INDEXING ON for these subpartitions:

SQL> create table pink_floyd (id number, status varchar2(6), name varchar2(30))
indexing off
partition by range (id) subpartition by list(status)
subpartition template
(subpartition closed values ('CLOSED') indexing off, subpartition open values ('OPEN') indexing on)
(partition pf1 values less than (1000001),
partition pf2 values less than (2000001),
partition pf3 values less than (maxvalue))
enable row movement;

Table created.

Notice how only the subpartitions with a STATUS of ‘OPEN’ are now to be indexed. If we populate the table with the exact same data as before, we find the table and partial index have the follow characteristics:

SQL> select subpartition_position, subpartition_name, num_rows, indexing from dba_tab_subpartitions where table_name = 'PINK_FLOYD';

SUBPARTITION_POSITION SUBPARTITION_NAME      NUM_ROWS IND
--------------------- -------------------- ---------- ---
                    1 PF1_CLOSED              1000000 OFF
                    2 PF1_OPEN                      0 ON
                    1 PF2_CLOSED              1000000 OFF
                    2 PF2_OPEN                      0 ON
                    1 PF3_CLOSED               999900 OFF
                    2 PF3_OPEN                    100 ON

SQL> select index_name, num_rows, leaf_blocks, indexing from dba_indexes where index_name = 'PINK_FLOYD_STATUS_I';

INDEX_NAME             NUM_ROWS LEAF_BLOCKS INDEXIN
-------------------- ---------- ----------- -------
PINK_FLOYD_STATUS_I         100           1 PARTIAL

We can see that only the subpartitions with data of interest are now indexed. The resultant Partial global non-partitioned index is now tiny, with just the 100 index entries of interest residing in a single leaf block.

Just as importantly, a query searching for this data across the whole table is now extremely efficient and can be fully serviced by this tiny Partial index:

SQL> select * from pink_floyd where status = 'OPEN';

100 rows selected.

Execution Plan
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                  | Name                | Rows | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                           |                     |   100 |  2500 |     2   (0)| 00:00:01 |       |       |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD          |   100 |  2500 |     2   (0)| 00:00:01 | ROWID | ROWID |
|*  2 |   INDEX RANGE SCAN                         | PINK_FLOYD_STATUS_I |   100 |       |     1   (0)| 00:00:01 |       |       |
----------------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("STATUS"='OPEN')

Statistics
----------------------------------------------------------

0  recursive calls

0  db block gets

16  consistent gets

0  physical reads

0  redo size

4479  bytes sent via SQL*Net to client

609  bytes received via SQL*Net from client

8  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

100  rows processed

The query can now be fully serviced by the Partial index as all subpartitions that could contain data of interest are indexed and because the data of interest is all neatly clustered within the table subpartitions, can be retrieved with far fewer consistent gets than previously.

If we ever wanted to access those STATUS values of ‘CLOSED’, the CBO can only do so via a Full Table Scan as such values are not indexed. However, as these values represent the vast majority of rows in the table, the Full Table Scan would be the most appropriate and efficient manner to access these rows any-ways.

The new Partial Index capabilities introduced in Oracle 12c enables us to easily use global (and local) indexes to just index data of interest without having to change the application. Such a capability has many potential uses.

More on Partial Indexes to come soon in Part II.

12c: Intro To Multiple Indexes On Same Column List (Repetition) July 2, 2013

Posted by Richard Foote in 12c, Multiple Indexes, Oracle Indexes.
6 comments

From an indexing perspective, one of the bigger ticket items introduced with Oracle Database 12c is the new capability to create multiple indexes on the same column list. It’s even a feature listed in the New Features Guide :) This can be particularly useful when you want to change the index definition in some way but you want to reduce the time in which the corresponding columns might not indexed.

Previous to 12c, you could not create an index if the same column list is already indexed and would generate an ORA-01408: such column list already indexed error. So, if you wanted to change an index from being say a B-Tree index to a Bitmap index, or from being Unique to Non-Unique or from being Non-Partitioned to Partitioned in same manner, etc. then you had to first drop the index and re-create it again as required. This means for the period in which the index is being re-created (which could be a considerable period for a larger index), the column list is not covered by an index, which might prove to be problematic.

Here, we create a partitioned table and create a Non-Unique, Non-Partitioned index on the ID column:

SQL> create table ziggy (id number, name varchar2(30)) partition by range (id) (partition ziggy1 values less than (1000), partition ziggy2 values less than (2000), partition ziggy3 values less than (maxvalue));

Table created.

SQL> insert into ziggy select rownum, 'DAVID BOWIE' from dual connect by level <=5000;

5000 rows created.

SQL> commit;

Commit complete.

SQL> create index ziggy_id_i1 on ziggy(id);              

Index created.

SQL> alter table ziggy add constraint ziggy_pk primary key(id);

Table altered.

If we decide for whatever reason we want to have say a Unique index instead:

SQL> create unique index ziggy_id_i2 on ziggy(id);
create unique index ziggy_id_i2 on ziggy(id)
                                         *
ERROR at line 1:
ORA-01408: such column list already indexed

No good, we can’t, even if we make it initially INVISIBLE:

SQL> create unique index ziggy_id_i2 on ziggy(id) invisible;
create unique index ziggy_id_i2 on ziggy(id) invisible
                                         *
ERROR at line 1:
ORA-01408: such column list already indexed

Because the index is used to police a PK constraint, we can’t even just drop the index:

SQL> drop index ziggy_id_i1;
drop index ziggy_id_i1
           *
ERROR at line 1:
ORA-02429: cannot drop index used for enforcement of unique/primary key

We have to first drop or disable the PK constraint, then drop the index, then re-create the index.

SQL> alter table ziggy drop primary key;

Table altered.

SQL> drop index ziggy_id_i1;

Index dropped.

SQL> alter table ziggy add primary key(id) using index (create unique index ziggy_id_i2 on ziggy(id));

Table altered.

This means the constraint is not automatically enforced (unless we disable it with validate, thus locking the table) and the ID column is not available via an index during the entire duration of creating the new index.

The 12c database has given us more flexibility in this regard.

Providing we create the index as INVISIBLE (meaning there is only ever the one Visible index on the column list), we can now create multiple indexes on the same column list, providing the new index has a different characteristic to existing indexes. So with a 12c database:

SQL> create unique index ziggy_id_i2 on ziggy(id) invisible;

Index created.

We now have two indexes on the same column list (the ID column). An attempt however to create or alter an index such that two visible indexes have the same column list will fail:

SQL> alter index ziggy_id_i2 visible;
alter index ziggy_id_i2 visible
*
ERROR at line 1:
ORA-14147: There is an existing VISIBLE index defined on the same set of
columns.

This means we can now (say) replace the index policing the PK constraint quicker (or convert a B-Tree to a Bitmap index or convert a Non-Partitioned index to a Partitioned index, etc.) as we don’t now have to wait for the new index to be created first:

SQL> alter table ziggy drop primary key;

Table altered.

SQL> drop index ziggy_id_i1;

Index dropped.

SQL> alter index ziggy_id_i2 visible;

Index altered.

SQL> alter table ziggy add constraint ziggy_pk primary key(id);

Table altered.

We can create as many indexes as we like on the ID, providing only one index is ever visible at a time and providing all the indexes have some form of differing characteristic. For example, all these indexes could potentially all co-exist:

SQL> create index ziggy_id_i3 on ziggy(id) local invisible;

Index created.

SQL> create bitmap index ziggy_id_i4 on ziggy(id) local invisible;

Index created.

SQL> create index ziggy_id_i5 on ziggy(id) reverse invisible;

Index created.

Of course, having many multiple indexes on the same column list in this manner is very likely a very bad idea …

Oracle Database 12c Released – Lots of Indexing Stuff To Talk About June 26, 2013

Posted by Richard Foote in 12c, Oracle Indexes.
10 comments

Finally, Oracle have released the long awaited 12c Database. You can now download it from OTN and other usual places. This is good news as my tongue was getting rather sore from having to bite on it for so long :)

From an indexing point of view, there are lots of fantastic new features, especially from a manageability perspective. The ability to create multiple indexes with the same column list, to partially index tables, with enhanced capabilities regarding online operations, with enhanced index monitoring capabilities, etc. etc. there’s plenty that I can now discuss here.

Unfortunately, all my presentations have been rejected for Oracle Openworld this year so I won’t be able to present them over there :(

Fortunately I’ll be able to discuss them all here in lots of detail in the coming weeks, so stay tuned :)

Clustering Factor Calculation Improvement Part III (Too Much Rope) June 4, 2013

Posted by Richard Foote in 11g, CBO, Clustering Factor, Index statistics, TABLE_CACHED_BLOCKS.
9 comments

In my previous post, I discussed how for larger tables that really do have randomised data, not even setting the new TABLE_CACHED_BLOCKS to its 255 maximum value is going to make any real difference to the resultant Clustering Factor (CF) of an index. The 255 maximum value here protects us somewhat from abusing this capability and setting unrealistic CF values.

However, for smaller tables in particular, we do need to exercise some caution.

In the following example, we’re only creating a relatively small table and associated index with a CODE column that is randomly distributed throughout the table:

SQL> create table bowie (id number, code number, text varchar2(30));
Table created.

SQL> insert into bowie select rownum,  trunc(dbms_random.value(0, 100)), 'DAVID BOWIE'
from dual connect by level <= 70000;

70000 rows created.

SQL> commit;

Commit complete.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true,
     method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> create index bowie_code_i on bowie(code);

Index created.

If we look at the CF of this index:

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.index_name='BOWIE_CODE_I';

TABLE_NAME   INDEX_NAME       BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ ------------ ---------- ---------- -----------------
BOWIE        BOWIE_CODE_I        244      70000             22711

We notice that at 22,711, the CF it’s pretty average. The table though is quite small at only 244 blocks.

If we run a simple query:

SQL> select * from bowie where code between 42 and 44;
2050 rows selected.

Execution Plan
----------------------------------------------------------

Plan hash value: 1845943507
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |  2814 | 56280 |    65   (4)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| BOWIE |  2814 | 56280 |    65   (4)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter('CODE'<=44 AND 'CODE'>=42)

Statistics
----------------------------------------------------------

0  recursive calls
1  db block gets
254  consistent gets
0  physical reads
0  redo size
25044  bytes sent via SQL*Net to client
519  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
2050  rows processed

The CBO goes for a Full Table Scan. This is not unexpected as we likely have to visit all 244 blocks anyways to fetch the required 2050 rows due to the CODE data being so randomly distributed throughout the table. In a naive attempt to improve things, we decide to improve the CF by setting the TABLE_CACHED_BLOCKS to the maximum 255 value:

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',
pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE', estimate_percent=> null, cascade=> true,
method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
FROM user_tables t, user_indexes i WHERE t.table_name = i.table_name AND i.index_name='BOWIE_CODE_I';

TABLE_NAME   INDEX_NAME         BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ -------------- ---------- ---------- -----------------
BOWIE        BOWIE_CODE_I          244      70000               240

Indeed, we have improved the CF, dramatically reducing it down to just 240 from the previous 22711. Even though the column value for the CODE really is randomly distributed throughout the table, the CF now suggests the data is perfectly clustered. This is because with a table with only 244 blocks, incrementing the CF if the current index entry references a table block more than 255 blocks ago is now impossible. The CF is now guaranteed to be “perfect” as each index entry can only reference one of the 244 table blocks and so is incremented only when each table block is referenced the first time.

This dramatic reduction in the CF will certainly make the index more attractive to the CBO. But is this really a good thing:

SQL> select * from bowie where code between 42 and 44;
2050 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1602289932
--------------------------------------------------------------------------------------------
| Id  | Operation                   | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |              |  2814 | 56280 |    17   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE        |  2814 | 56280 |    17   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_CODE_I |  2814 |       |     7   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access('CODE'>=42 AND 'CODE'<=44)

Statistics
----------------------------------------------------------

0  recursive calls
0  db block gets
681  consistent gets
0  physical reads
0  redo size
20895  bytes sent via SQL*Net to client
519  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
2050  rows processed

The CBO is now using the index, but is now performing significantly more consistent gets, 681 rather than the previous 254. This execution plan is actually much less efficient than the previous FTS execution plan. The CBO is getting this wrong now as the CF isn’t really anywhere near as good as it’s now being lead to believe.

Caution setting TABLE_CACHED_BLOCKS to a value that is anywhere close to the number of blocks in the table. This is one of the reasons for Oracle having a 1% of table blocks default value for this setting.

Clustering Factor Calculation Improvement Part II (Blocks On Blocks) May 14, 2013

Posted by Richard Foote in 11g, Clustering Factor, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.
6 comments

My previous post on the new TABLE_CACHED_BLOCKS statistics gathering preference certainly generated some interest :) My blog hits for the week have gone off the charts !!

One of the concerns raised by this new capability was that setting such a preference might result in really unrealistic and inaccurate Clustering Factor (CF) values, especially for those tables that truly have appalling CFs. Although there are certainly some dangers, Oracle has limited the possible “abuse” by ensuring TABLE_CACHED_BLOCKS can only be set to a maximum of 255. This means Oracle will only ignore a maximum of 255 table blocks that have recently been accessed during the CF calculation. For larger tables with truly randomised data patterns, not even the maximum 255 setting if utilised will make an appreciable difference to the final CF.

A couple of examples to demonstrate.

The first table is a relatively “large” table that has a DOB column that is effectively randomised throughout the table. There are approximately 20,000 different DOB values in a 2 million row table (so each DOB occurs approximately 100 times, give or take).

SQL> create table major_tom (id number, DOB date, text varchar2(30));

Table created.

SQL> insert into major_tom select rownum,  sysdate-trunc(dbms_random.value(0, 20000)), 'DAVID BOWIE' from dual connectby level <= 2000000;

2000000 rows created.

SQL> commit;

Commit complete.

Let’s now create an index on this DOB column and have a look at the CF:

SQL> create index major_tom_dob_i on major_tom(dob);

Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'MAJOR_TOM', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
2  FROM user_tables t, user_indexes i
3  WHERE t.table_name = i.table_name AND i.index_name='MAJOR_TOM_DOB_I';

TABLE_NAME   INDEX_NAME          BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ --------------- ---------- ---------- -----------------
MAJOR_TOM    MAJOR_TOM_DOB_I       9077    2000000           1988164

So at 1,988,164, the CF is terrible. This is as expected as the DOB values are all randomised throughout the table. The index is not being used as we had hope (naively) so let’s use the new TABLE_CACHED_BLOCKS preference to now improve the calculated CF by setting it to the maximum 255 setting and recalculate the index statistics:

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'MAJOR_TOM',
pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'MAJOR_TOM_DOB_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
2  FROM user_tables t, user_indexes i
3  WHERE t.table_name = i.table_name AND i.index_name='MAJOR_TOM_DOB_I';

TABLE_NAME   INDEX_NAME          BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ --------------- ---------- ---------- -----------------
MAJOR_TOM    MAJOR_TOM_DOB_I       9077    2000000           1941946

We notice that although the CF has improved marginally, at whopping 1,941,946 it’s still terrible and has made no real appreciable difference. Why ?

Well let’s do some basic maths here. There are 9077 blocks in the table and the next DOB referenced in the index can potentially be in any one of them. Therefore, the chances of the next DOB being in one of the 255 previously accessed table blocks is only 255/9077 x 100 = approximately 2.8%. So in only 2.8% of the time is the CF likely to not be incremented and so the CF is only likely to drop by around this 2.8% amount.

Let’s check. (1988164 – 1941946)/1988164 x 100  indeed does equal approximately 2.8%.

So statistically with such a poor CF on such a “large” table, to limit the CF calculation if any of the last 255 table blocks are referenced is only going to improve things by 2.8% on average. Effectively of no real use at all.

Another example now, but this time with a CODE column with just 100 distinct values that are randomly distributed throughout another reasonable “large” 2 million row table. For those mathematically challenged, that means each value occurs approximately 20,000 times, give or take:

SQL> create table ziggy (id number, code number, text varchar2(30));

Table created.

SQL> insert into ziggy select rownum,  trunc(dbms_random.value(0, 100)), 'DAVID
BOWIE' from dual connect by level <= 2000000;

2000000 rows created.

SQL> commit;

Commit complete.

SQL> create index ziggy_code_i on ziggy(code);

Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'ZIGGY', estimate_percent=> null, cascade=> true,
method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
2  FROM user_tables t, user_indexes i
3  WHERE t.table_name = i.table_name AND i.index_name='ZIGGY_CODE_I';

TABLE_NAME   INDEX_NAME          BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ --------------- ---------- ---------- -----------------
ZIGGY        ZIGGY_CODE_I          7048    2000000            662962

So at 662,962 it’s what I would describe as a “poor to average” CF. It’s not particularly great with there being just  7,048 table blocks but it’s still some distance from the 2,000,000 row value.

The index is not being used in SQL statements as we (naively) wish, so let’s try and improve things by lowering the index CF by setting the new TABLE_CACHED_BLOCKS preference to the maximum 255 setting:

SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'ZIGGY',
pname=>'TABLE_CACHED_BLOCKS', pvalue=>255);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'ZIGGY_CODE_I',
estimate_percent=>null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor
2  FROM user_tables t, user_indexes i
3  WHERE t.table_name = i.table_name AND i.index_name='ZIGGY_CODE_I';

TABLE_NAME   INDEX_NAME          BLOCKS   NUM_ROWS CLUSTERING_FACTOR
------------ --------------- ---------- ---------- -----------------
ZIGGY        ZIGGY_CODE_I          7048    2000000            662962

We notice to our great disappointment (well, not really) that the CF remains completely unchanged at 662,962 !! Why ?

Again, let’s do some basic maths and consider the data distribution.

The table has some 7048 blocks but each distinct CODE value has some 20,000 occurrences on average. Therefore, each value is going to be found 20000/7048 = roughly 2 to 3 times per block. As the index is in CODE order and for each CODE in rowid order, the CF is going to increment for each CODE value for each distinct block we visit. We will therefore only go back to a previously visited table block (except for the 2 to 3 visits to the current block) when the CODE value changes but this will take us all the way back to the first block which is always going to be some 7047 blocks away from the current one. As 7047 is much greater than the 255 the CF calculation will only cater for, the CF is going to remain unchanged from the default calculation as a result.

And this is all as it should be, as the fundamental CF is indeed poor for these scenarios and even going back the maximum 255 data blocks will not reduce appreciably the manner in which the CF is calculated.

Of course, if there was no limit, then a setting of TABLE_CACHED_BLOCKS  of say 7100 would enable the CF to be recalculated as being perfect in the above scenario, which would indeed be a concern. But 255 is the limit and so limits the potential “damaged” that can be done.

More on all this to come :)

Important !! Clustering Factor Calculation Improvement (Fix You) May 8, 2013

Posted by Richard Foote in 11g, ASSM, CBO, Clustering Factor, Index statistics, Oracle Cost Based Optimizer, Oracle Indexes.
41 comments

Believe me, this article is worth reading :)

I’m currently not allowed to discuss Oracle 12c Database goodies but I am allowed to discuss things perhaps initially intended for 12c that are currently available and already back-ported to 11g. This includes a wonderful improvement in the manageability of how the Clustering Factor (CF) of an index can now be calculated. Many thanks to Martin Decker for pointing this out to me.

As anyone who has attended my Index Seminars will know, the CF of an index is one of the most important statistics used by the Cost Based Optimizer (CBO) in determining the most efficient execution plan. As such, it has always been an issue for me that the manner in which the CF is calculated has been so flawed.

Basically, the CF is calculated by performing a Full Index Scan and looking at the rowid of each index entry. If the table block being referenced differs from that of the previous index entry, the CF is incremented. If the table block being referenced is the same as the previous index entry, the CF is not incremented. So the CF gives an indication of how well ordered the data in the table is in relation to the index entries (which are always sorted and stored in the order of the index entries). The better (lower) the CF, the more efficient it would be to use the index as less table blocks would need to be accessed to retrieve the necessary data via the index.

However, there’s a basic flaw here. The CF calculation doesn’t take into consideration the fact the referenced table block, although maybe different from the previous one index entry, might already have recently been accessed. As such, during an index scan, the table block being accessed is almost certainly still cached in the buffer cache from the previous access, thereby not reducing the effectiveness of the index in any appreciable manner. A classic example of this would be a table with a few freelists. Although the data being inserted is not ordered precisely within the same data blocks, the data might actually be very well clustered within only a few blocks of each other.

Picture a table with 100 rows being inserted by 2 sessions simultaneously, each inserting 50 rows based on an ordered sequence. With one freelist, the data is basically inserted in one block first and then once full a second table block. The data is therefore perfectly ordered/clustered and the CF will evaluate to a value of 2 on such an indexed column. But with 2 freelists, one session could insert data into one block while the other session inserts into a second block, with the ordered sequenced values being randomly distributed among the 2 blocks.  The CF could now potentially evaluate to a value of 100 as the rows are jumbled or “toggled” across the two blocks. This is a much much worse value (2 vs. 100) that can adversely impact the CBO calculations, although the efficiency of such an index is really almost identical as both table blocks are certain to be cached during an index scan regardless.

This is also a very common scenario with Automatic Segment Space Management (ASSM) tablespaces as I’ve discussed previously, which of course is now the default these days.

OK, let’s look at an example scenario. I’ll begin by creating a simple little table, an ordered sequence and a procedure that inserts 100,000 rows into the table:


SQL> create table bowie (id number, text varchar2(30));

Table created.

SQL> create sequence bowie_seq order;

Sequence created.

SQL> CREATE OR REPLACE PROCEDURE bowie_proc AS

2  BEGIN

3     FOR i IN 1..100000 LOOP

4         INSERT INTO bowie VALUES (bowie_seq.NEXTVAL, 'ZIGGY STARDUST');

5         COMMIT;

6     END LOOP;

7  END;

8  /

Procedure created.

We note the table lives in an ASSM tablespace:


SQL> select table_name, i.tablespace_name, segment_space_management

from dba_tables i, dba_tablespaces t   where i.tablespace_name = t.tablespace_name and table_name='BOWIE';

TABLE_NAME   TABLESPACE_NAME                SEGMEN

------------ ------------------------------ ------

BOWIE        USERS                          AUTO

We next have 3 different sessions that simultaneously run the procedure to load the table. Note that an ordered sequence is used which means the 3 sessions are randomly grabbing the next sequenced value to insert. The data though is basically being inserted in order of the ID column, it’s just that the data is being distributed across a few blocks as we go along the table, rather than strictly one block after the other.


SQL> exec bowie_proc

PL/SQL procedure successfully completed.

Let’s create an index on the ID (sequenced) column and collect fresh statistics:


SQL> create index bowie_id_i on bowie(id);

Index created.

SQL> EXEC dbms_stats.gather_table_stats(ownname=>user, tabname=>'BOWIE',      estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor

2  FROM user_tables t, user_indexes i

3  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';

TABLE_NAME   INDEX_NAME       BLOCKS   NUM_ROWS CLUSTERING_FACTOR

------------ ------------ ---------- ---------- -----------------

BOWIE        BOWIE_ID_I         1126     300000            241465

We notice that although the data in the table in reality is actually quite well clustered/ordered on the ID column, the actual CF of the index is not reflecting this. At a massive 241,465 it’s an extremely high (bad) CF, much closer in value to rows in the table than the number of table blocks, as the CF calculation keeps flipping back and forth between differing blocks. With such a high CF, the CBO is therefore going to cost an index scan accordingly:


SQL> select * from bowie where id between 42 and 429;

388 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 1845943507

---------------------------------------------------------------------------

| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------

|   0 | SELECT STATEMENT  |       |   389 |  7780 |   310   (1)| 00:00:04 |

|*  1 |  TABLE ACCESS FULL| BOWIE |   389 |  7780 |   310   (1)| 00:00:04 |

---------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

1 - filter("ID"<=429 AND "ID">=42)

Statistics

----------------------------------------------------------

0  recursive calls

1  db block gets

1093  consistent gets

0  physical reads

0  redo size

4084  bytes sent via SQL*Net to client

519  bytes received via SQL*Net from client

2  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

388  rows processed

Even though only approx. 0.13% of rows are being accessed and more importantly a similar low percentage of table blocks, the CBO has determined that a Full Table Scan (FTS) is the cheaper alternative. This is an all too familiar scenario, all down to the fact the CF is not accurately reflecting the true clustering of the data and subsequent efficiency of the index.

Finally, at long last, there’s now an official fix for this !!

Bug 13262857 Enh: provide some control over DBMS_STATS index clustering factor computation INDEX describes this scenario and currently has available patches that can be applied on both Exadata databases and Oracle versions 11.1.0.7, 11.2.0.2 and 11.2.0.3. The patches (eg. Patch ID 15830250) describe the fix as addressing “Index Clustering Factor Computation Is Pessimistic“. I couldn’t have described it better myself :)

Once applied (the following demo is on a patched 11.2.0.3 database), there is a new statistics collection preference that can be defined, called TABLE_CACHED_BLOCKS. This basically sets the number of table blocks we can assume would already be cached when performing an index scan and can be ignored when incrementing the CF during statistics gathering. The default is 1 (i.e. as performed presently) but can be set up to be a value between 1 and 255, meaning during the collection of index statistics, it will not increment the CF if the table block being referenced by the current index entry has already been referenced by any of the prior 255 index entries (if set to 255). It basically sets the appropriate parameter in the sys_op_countchg function used to calculate the CF value during statistic gathering to not increment the CF if the current table block has already been accessed “x” index entries previously.

The TABLE_CACHED_BLOCKS preference can be set by either the DBMS_STATS.SET_TABLE_PREFS, DBMS_STATS.SET_SCHEMA_PREFS or DBMS_STATS.SET_DATABASE_PREFS procedures.

So let’s now change the TABLE_CACHED_BLOCKS preference for this table and re-calculate the index statistics:


SQL> exec dbms_stats.set_table_prefs(ownname=>user, tabname=>'BOWIE',

pname=>'TABLE_CACHED_BLOCKS', pvalue=>42);

PL/SQL procedure successfully completed.

SQL> EXEC dbms_stats.gather_index_stats(ownname=>user, indname=>'BOWIE_ID_I', estimate_percent=> null);

PL/SQL procedure successfully completed.

SQL> SELECT t.table_name, i.index_name, t.blocks, t.num_rows, i.clustering_factor

2  FROM user_tables t, user_indexes i

3  WHERE t.table_name = i.table_name AND i.index_name='BOWIE_ID_I';

TABLE_NAME   INDEX_NAME       BLOCKS   NUM_ROWS CLUSTERING_FACTOR

------------ ------------ ---------- ---------- -----------------

BOWIE        BOWIE_ID_I         1126     300000              1035

We notice that the CF has now been significantly reduced (down from 241465 to just 1035), reflecting far more accurately the true clustering of the data when considering the actual effectiveness of using the index.

If we now run the same query as before:


SQL> select * from bowie where id between 42 and 429;

388 rows selected.

Execution Plan

----------------------------------------------------------

Plan hash value: 3472402785

------------------------------------------------------------------------------------------

| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)|Time     |

------------------------------------------------------------------------------------------

|   0 | SELECT STATEMENT            |            |   389 |  7780 |     4   (0)|00:00:01 |

|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE      |   389 |  7780 |     4   (0)|00:00:01 |

|*  2 |   INDEX RANGE SCAN          | BOWIE_ID_I |   389 |       |     2   (0)|00:00:01 |

------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):

---------------------------------------------------

2 - access("ID">=42 AND "ID"<=429)

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

6  consistent gets

0  physical reads

0  redo size

9882  bytes sent via SQL*Net to client

519  bytes received via SQL*Net from client

2  SQL*Net roundtrips to/from client

0  sorts (memory)

0  sorts (disk)

388  rows processed

We notice the index is now being selected by the CBO. At a cost of 4 (previously the cost was somewhat greater than the 310 cost of the FTS), this much more accurately reflects the true cost of using the index (notice only 6 consistent gets are performed).

Being able to now set the TABLE_CACHED_BLOCKS preference during statistics collection finally gives us a fully supported and easy method to collect more accurate CF statistics. This in turn can only lead to more informed and accurate decisions by the CBO and ultimately better performing applications. Although available right now via the back ported patches, this will no doubt all be fully documented once the 12c database is finally released.

I can’t recommend enough the use of this new capability :)

Storage Indexes vs Database Indexes IV: 8 Column Limit (Eight Line Poem) May 1, 2013

Posted by Richard Foote in Exadata, Oracle Indexes, Smart Scans, Storage Indexes.
2 comments

As Exadata Storage Indexes (SI) are purely memory only structures located on the Exadata storage servers, care needs to be taken in how much memory they can potentially consume. As a result, there is a limit of 8 columns (or 8 SIs) that can be defined for a given 1M storage region at any point in time, even though SIs are much smaller structures than equivalent database indexes. As database indexes are physical constructs however, there’s no such limit on the number of indexes that can be defined for a table. This can become another key difference between SIs and database indexes.

The following table has more than 8 columns, each with differing numbers of distinct values or distributions of data:

SQL> create table radiohead (id number, col1 number, col2 number, col3 number, col4 number, col5 number, col6 number, col7 number, col8 number, col9 number, col10 number, col11 number, col12 number, some_text varchar2(50));
Table created.

SQL> insert into radiohead select rownum, mod(rownum,10), mod(rownum,100), mod(rownum,1000), mod(rownum,10000), mod (rownum,100000), mod(rownum,1000000), ceil(dbms_random.value(0,10)), ceil(dbms_random.value(0,100)), ceil
 (dbms_random.value(0,1000)), ceil(dbms_random.value(0,10000)), ceil(dbms_random.value(0,100000)), ceil(dbms_random.value (0,1000000)), 'OK COMPUTER' from dual connect by level <=2000000; 2000000 rows created. SQL> commit;

Commit complete.

SQL> insert/*+ append */ into radiohead select * from radiohead;

2000000 rows created.

SQL> commit;

Commit complete.

SQL> insert/*+ append */ into radiohead select * from radiohead;

4000000 rows created.

SQL> commit;

Commit complete.

SQL> insert/*+ append */ into radiohead select * from radiohead;

8000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'RADIOHEAD', estimate_percent=>null, method_opt=> 'FOR ALL COLUMNS SIZE 1');

Let’s start by running a highly selective query of the ID column:

SQL> select * from radiohead where id = 42;
8 rows selected.

Elapsed: 00:00:00.05

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 8 | 416 | 42425 (1)| 00:08:30 |
 |* 1 | TABLE ACCESS STORAGE FULL| RADIOHEAD | 8 | 416 | 42425 (1)| 00:08:30 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("ID"=121212)
 filter("ID"=121212)

If we look at the session statistics, we’ll notice that a SI has been created and saved us physical IOs. Note: If you follow the demo, you’ll need to keep track of these statistics after each query or simply reconnect as a new session to ensure a SI has or has not been used.

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 1333.99219
 cell physical IO interconnect bytes returned by smart scan .164878845

OK, let’s now run a selective query on the COL1 column (there are no values 42 in this case and so no rows are returned):

SQL> select * from radiohead where col1=42;
no rows selected

Elapsed: 00:00:00.01

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1 | 52 | 42440 (1)| 00:08:30 |
 |* 1 | TABLE ACCESS STORAGE FULL| RADIOHEAD | 1 | 52 | 42440 (1)| 00:08:30 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("COL1"=42)
 filter("COL1"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 2546.90625
 cell physical IO interconnect bytes returned by smart scan .341438293

Again, a SI has been created and used here. The SI in this case has been extremely beneficial because no data exists for 42 (only the values 1 – 10 exist). However, if an existing value were to be selected, the SI would be next to useless as such a value would exist throughout all 1M storage regions. With just 10 distinct randomly distributed values, this SI has the potential to be a waste of time. But while we search for values that don’t exist, it serves a very useful purpose. An important consideration in what’s to come.

If we now run a query now using column COL4:

SQL> select * from radiohead where col4=42;
1600 rows selected.

Elapsed: 00:00:00.68

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1600 | 83200 | 42486 (1)| 00:08:30 |
 |* 1 | TABLE ACCESS STORAGE FULL| RADIOHEAD | 1600 | 83200 | 42486 (1)| 00:08:30 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("COL4"=42)
 filter("COL4"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 2612.64063 <= + 66MB
 cell physical IO interconnect bytes returned by smart scan 32.3048401

OK, this is the really important one to note. Firstly, yes again a SI has been created and used. Note though that this has not been the first SI to be created or used on this table; SIs have previously been created and used in the previous two queries on different columns. This will also not be the most recent SI to be created, more will soon follow. However, this is by far the least effective use of a SI, because of both the selectivity of the query and distribution of data. Here, only some 66MB or so of physical IOs have been saved.

We now repeat this process, running a query against a different column, being very selective and ensuring a SI is created and used. We finally reach a point when 8 SIs have been created on the table:

SQL> select * from radiohead where col9=0;
no rows selected

Elapsed: 00:00:00.02

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 15984 | 811K| 42561 (1)| 00:08:31 |
 |* 1 | TABLE ACCESS STORAGE FULL| RADIOHEAD | 15984 | 811K| 42561 (1)| 00:08:31 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("COL9"=0)
 filter("COL9"=0)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 8792.94531
 cell physical IO interconnect bytes returned by smart scan 45.3872757

OK, we’ve now reached the point where  8 SIs have definitely been created on this table.

If we now run a query that could potentially use a SI but isn’t particularly effective, basically on a par to the one we saw created previously on COL4:

SQL> select * from radiohead where col10=42;
1536 rows selected.

Elapsed: 00:00:00.73

Execution Plan
----------------------------------------------------------
Plan hash value: 2516349655

---------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1600 | 83200 | 42577 (1)| 00:08:31 |
|* 1 | TABLE ACCESS STORAGE FULL| RADIOHEAD | 1600 | 83200 | 42577 (1)| 00:08:31 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("COL10"=42)
 filter("COL10"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
---------------------------------------------------------------- ----------
cell physical IO bytes saved by storage index 8792.94531
cell physical IO interconnect bytes returned by smart scan 45.9288864

We notice that no bytes have been saved here via a SI. We can run this query repeatedly and the results will be the same. No SI is created and no bytes are saved. Although Oracle could potentially create a SI and save some work, the fact we already have 8 SIs created for this table means we have already reached the limit on the number of SIs that can be created for this table. 8 is it.

Let’s run another query now using yet another different column (COL12), but this time it’s again a very selective query, much more selective and efficient than the previous query based on COL4 in which a SI had been created:

SQL> select * from radiohead where col12=42;
8 rows selected.

Elapsed: 00:00:00.39

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 19 | 988 | 42607 (1)| 00:08:32 |
 |* 1 | TABLE ACCESS STORAGE FULL| RADIOHEAD | 19 | 988 | 42607 (1)| 00:08:32 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("COL12"=42)
 filter("COL12"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 9526.01563
 cell physical IO interconnect bytes returned by smart scan 53.5218124

This time however the number of bytes saved has again gone up from previously meaning that indeed a new SI has been created and used for column COL12. But this makes 9 SIs in total now for this table where the limit should be a maximum of 8 ?

Does this mean that a previously created SI has been dropped and replaced by this new one. If so, which SI is now gone ?

Well, let’s go back to the first SI we created and the one that hasn’t been used for the longest period of time. If we re-run the first query again:

SQL> select * from radiohead where id=42;
8 rows selected.

Elapsed: 00:00:00.05

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 8 | 416 | 42425 (1)| 00:08:30 |
 |* 1 | TABLE ACCESS STORAGE FULL| RADIOHEAD | 8 | 416 | 42425 (1)| 00:08:30 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("ID"=42)
 filter("ID"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 10726.9844
 cell physical IO interconnect bytes returned by smart scan 53.5264816

We notice that not only has it used a SI, but it has saved the maximum amount of IO bytes possible here meaning there was no “warming up” processes happening here indicating a newly created SI. So not only did it use a SI, it clearly used a previously created one. So this SI was not obviously impacted by the creation of this “9th” SI.

However, if we run the query again that used column COL4, the query that previously used a SI but was by far the least effective in saving physical IOs:

SQL> select * from radiohead where col4 = 4242;
1600 rows selected.

Elapsed: 00:00:00.73

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1600 | 83200 | 42486 (1)| 00:08:30 |
 |* 1 | TABLE ACCESS STORAGE FULL| RADIOHEAD | 1600 | 83200 | 42486 (1)| 00:08:30 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("COL4"=4242)
 filter("COL4"=4242)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 10726.9844 No Change !!
 cell physical IO interconnect bytes returned by smart scan 54.1522598

This time we notice there’s no change to the number of bytes saved by a SI. No matter how often we run this query now, no SI is used. So the SI that was created previously is now gone as a result of creating the more effective SI on the COL12 column. Indeed there are still just the 8 SIs on this table.

So Oracle will indeed limit the number of SIs to 8 for each table/storage region. However, it’s not simply a case of “first in first served” or some such, with Oracle using (undocumented) performance metrics to determine which 8 SIs/columns to choose. This means it might be possible in some rarer scenarios where more than 8 columns get referenced in SQL statements for SIs to come and go depending on changing workloads.  Another example of where database indexes may yet play a role in Exadata environments, where currently tables have more than 8+ indexed columns.

1,000,000 Blog Visits (How I Made My Millions) April 30, 2013

Posted by Richard Foote in Richard's Musings.
add a comment

Things have certainly been busy lately so I haven’t had much of a chance to get back here for a while. Busy at work, busy at home, busy at play and even busy with a new David Bowie album to enjoy !! And yes, “The Next Day” is an absolutely stunning album by the great man :)

While answering a backlog of questions today however, I noticed the blog has recently just passed a rather significant milestone. A little like driving the car and being too busy listening to David Bowie on the stereo to notice that the odometer has just passed 200,000 kms (yes my car is getting on a bit), I missed out on seeing the blog visit counter go past 1,000,000 visits. I kinda get excited by these milestones, see how I was when I passed 10,000, so I’m a little sad to have missed the party. That said, 1 million is quite a number for a humble little blog that doesn’t discuss Lady Gaga or religion (very often).

So thank-you to everyone that has visited and contributed to all the various discussions over the years.

The good news, I have a couple of new articles that will hopefully make an appearance in the next few days so the drought is nearly over :) But first, I have a new episode of Game of Thrones Season III to get through :)

I’m Back !! Oracle Index Internals Seminar Scheduled in Europe For June. April 3, 2013

Posted by Richard Foote in Oracle Index Seminar, Richard's Musings.
5 comments

It’s been quite a while but I’ve recently been asked to present my Indexing Seminar in Europe for Oracle University and at this stage, all the planets seem to be aligning themselves to make this a distinct possibility.

Seminar details are as follows:

Location: Bucharest, Romania

When: 17-18 June 2013

Cost: 2000 RON (approx. 450 EUR)

Enrolment: Follow this link

This will be a unique opportunity as I just don’t get the chance in my schedule to present this any more, to learn some rather interesting and useful information on all things to do with Oracle indexes. Note there are currently no plans to schedule other events in the foreseeable future, this is it.

Hopefully you’ll get the opportunity to attend and I’ll get to meet you in beautiful Bucharest :)

Storage Indexes vs Database Indexes Part III: Hidden Values (Hide Away) February 7, 2013

Posted by Richard Foote in Exadata, Oracle Indexes, Smart Scans, Storage Indexes.
3 comments

OK, my holiday to Hawaii is now slowing fading away into distant memory. Time for a new post :)

In my previous post on differences between Exadata Storage Indexes and Database Indexes Part II, I discussed how the clustering of data within the data is an important factor (pun fully intended !!) in the performance and efficiency of both types of indexes. In this post, I’ll expand a little more on this point and highlight scenarios where a normal database index can significantly improve performance but be totally ineffective as a Storage Index.

Storage Indexes can only determine where required data can’t exist within a (1M default) storage region via the associated Min/Max values of the index. If a required value can’t possibly exist within the Min/Max range boundary, the specific storage region can be ignored and not be accessed during the full scan operation. However, in some scenarios, there can be limitations in the usefulness of only having Min/Max values to work with if both Min and Max values are relatively common and randomly distributed throughout the table. This is not an uncommon scenario when dealing with columns that have data such as code status values.

In the BIG_BOWIE table I’ve been using in this series, I have a FORMAT_ID column that was initially populated with just values 2,4,6,8 and 10, evenly distributed throughout the table. I then updated a relatively small number of these values in the following manner:

SQL> update big_bowie set format_id = 3 where mod(id,10000)=0;
1000 rows updated.

SQL> commit;

Commit complete.

SQL> update big_bowie set format_id = 5 where id between 424242 and 425241;

1000 rows updated.

SQL> commit;

Commit complete.

So I have relatively few FORMAT_ID = 3 (1000 in 10M rows, 0.01% of data) littered throughout the table and a few FORMAT_ID = 5 located in a specific portion/location of the table.

I’ll now create a normal database index on this FORMAT_ID column and a histogram to let the CBO know there are indeed relatively few occurances of values 3 and 5 within the table:

SQL> create index big_bowie_format_id_i on big_bowie(format_id);
Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'BIG_BOWIE', method_opt=> 'FOR COLUMNS FORMAT_ID SIZE 10');

PL/SQL procedure successfully completed.

OK, if we now run a query looking for all the FORMAT_ID = 3:

SQL> select album_id, total_sales from big_bowie where format_id = 3;
1000 rows selected.

Elapsed: 00:00:00.01

Execution Plan
 -----------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 -----------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1000 | 11000 | 72 (0)| 00:00:01 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 1000 | 11000 | 72 (0)| 00:00:01 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_FORMAT_ID_I | 1000 | | 4 (0)| 00:00:01 |
 -----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("FORMAT_ID"=3)

Statistics
 ----------------------------------------------------------
 0 recursive calls
 0 db block gets
 1072 consistent gets
 0 physical reads
 0 redo size
 22984 bytes sent via SQL*Net to client
 1250 bytes received via SQL*Net from client
 68 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 1000 rows processed

We notice  the CBO has used the index to efficiently retrieve just the 0.01% of required rows because of the high selectivity of the query, even though the overall Clustering Factor of the index is appalling.

It does an even better job when accessing FORMAT_ID = 5 as these specific values are all well clustered and  found within in a very tight portion of the table:

SQL> select album_id, total_sales from big_bowie where format_id = 5;
1000 rows selected.

Elapsed: 00:00:00.00

Execution Plan
 -----------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 -----------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1000 | 11000 | 72 (0)| 00:00:01 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 1000 | 11000 | 72 (0)| 00:00:01 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_FORMAT_ID_I | 1000 | | 4 (0)| 00:00:01 |
 -----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("FORMAT_ID"=5)

Statistics
 ----------------------------------------------------------
 0 recursive calls
 0 db block gets
 154 consistent gets
 0 physical reads
 0 redo size
 22685 bytes sent via SQL*Net to client
 1250 bytes received via SQL*Net from client
 68 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 1000 rows processed

At just 154 consistent gets in this example, the results return almost instantaneously. The database index is extremely effective in these examples because it can point just to the locations within the table where these relatively few rows of interest are located.

Let’s see how Storage Indexes cope in this scenario. First, we make the database index invisible:

SQL> alter index big_bowie_format_id_i invisible;
Index altered.

We now run the same query again (this can be repeated as many times as you like):

SQL> select album_id, total_sales from big_bowie where format_id = 3;
1000 rows selected.

Elapsed: 00:00:00.80

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1000 | 11000 | 36682 (1)| 00:07:21 |
 |* 1 | TABLE ACCESS STORAGE FULL| BIG_BOWIE | 1000 | 11000 | 36682 (1)| 00:07:21 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("FORMAT_ID"=3)
 filter("FORMAT_ID"=3)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 0
 cell physical IO interconnect bytes returned by smart scan .397476196

We notice no matter how often we try, that Storage Indexes have managed to save us precisely nothing. The entire table has had to be scanned and read during the Full Table Scan operation.

SQL> select album_id, total_sales from big_bowie where format_id = 5;
1000 rows selected.

Elapsed: 00:00:00.89

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1000 | 11000 | 36682 (1)| 00:07:21 |
 |* 1 | TABLE ACCESS STORAGE FULL| BIG_BOWIE | 1000 | 11000 | 36682 (1)| 00:07:21 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("FORMAT_ID"=5)
 filter("FORMAT_ID"=5)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 0
 cell physical IO interconnect bytes returned by smart scan .565643311

Same results when accessing the FORMAT_ID = 5. Any possible Storage Indexes have been totally ineffective and the Full Table Scan has had to read the entire table.

The result being performance is significantly slower than with the database indexes visible and in place.

Why ?

The data consists primarily of randomly distributed values 2,4,6,8 and 10, with the odd value 3 and 5 littered around. This means that if created, the Storage Index Min/Max values will effectively be 2 and 10 for all 1M storage regions throughout the entire table.

Therefore, both values 3 and 5 could possibly exist within all/any of the table storage regions as they sit inside the 2 – 10 Min/Max range boundaries. These 3 and 5 values are effectively “hidden” within the Storage Index Min/Max values making Storage Indexes totally ineffective in this scenario, as they can’t skip a thing.

A final point here. If one of these rarer values of interest were to be say value “12”, then a Storage Index could be useful when searching for this particular value as any Min/Max ranges of 2-10 that don’t therefore contain a value of 12 can be skipped. However, if values 3 and 5 were also of interest, the Storage Index would still only be useful in a subset of required cases. A database index would therefore still be required to cater all necessarily cases.

As Database indexes can directly point to indexed values of interest, this is another example of where we would likely not want to just drop a database index in Exadata.

Merry Christmas and A Happy New Year !! (The Snowman) December 24, 2012

Posted by Richard Foote in Christmas.
1 comment so far

I just like to wish you all a very Merry Christmas and a Happy, Index-Rebuild Free New Year !!

I have a feeling next year will very exciting one for Oracle Database professionals:)

My little “gift” this year is a lovely little animation called “The Snowman” that dates all the way back to 1982 when I arrived in Australia.

Introduced by the one and only David Bowie, it’s based on the famous children’s book by Raymond Biggs (who also did “When The Wind Blows” that Bowie also worked on).

Enjoy !!

5 Year Anniversary (5 Years) December 20, 2012

Posted by Richard Foote in Richard's Musings.
13 comments

It’s just occurred to me that I’ve just recently past the 5 year anniversary of when I started this humble little blog !!

My first real post on Invisible Indexes was written back on 11 December 2007. Some 256 posts later and I’m still here, posting occasionally on Oracle Indexes with as many David Bowie references as I can manage :)

There have been roughly 923,000 page views in that time and more importantly some 3,000+ comments in total, some of them driving shall we say “interesting and lively” debates. The entry with the most comments is this one on how execution plans can suddenly when the statistics remain unchanged, although this one on rebuilding indexes every Sunday afternoon was up there as well.

The most views on any given day was back on March 13, 2010 with 3606 views, when this entry on block and tree dumps was particularly hammered. The most popular blog entry of all time however goes to this one on the change of behaviour of dbms_stats with 10g, with this one on the differences between an index rebuild, coalesce and shrink a very close second.

My all-time favourite blog entry however is this April Fool’s one on my retirement announcement and plans to work in the adult film industry. Some of the comments on this one are just classic :)

A very sincere thank-you to all my loyal readers and to everyone who has contributed to this blog over the years, I hope you’ve learnt something new along the way. I certainly have !! With the Oracle 12c database due for release sometime in 2013, I’m sure there will be lots to write about for a while longer yet :)

Pushing through the market square, so many mothers sighing, news had just come over we had 5 years left to cry in

Storage Indexes vs Database Indexes Part II: Clustering Factor (Fast Track) December 19, 2012

Posted by Richard Foote in Clustering Factor, Exadata, Oracle Indexes, Storage Indexes.
2 comments

Two posts in two days !! Well, with Christmas just around the corner, I thought I better finish off a couple of blog posts before I get fully immersed in the festive season :)

The Clustering Factor (CF) is the most important index related statistic, with the efficiency of an index performing multi-row range scans very much dependent on the CF of the index. If the data in the table is relatively well clustered in relation to the index (i.e. it has a “low” CF), then an index range scan can visit relatively few table blocks to obtain the necessary data. If the data is effectively randomised and not well clustered in relation to the index (i.e. has a “high” CF), then an index range scan has to visit many more table blocks and not be as efficient/effective as a result. The CBO will be less inclined to use such an index as a result, depending on the overall selectivity of the query.

It’s something I’ve discussed here many times before.

It’s a very similar story for Exadata Storage Indexes (SI) as well. The better the table data is clustered in relation to the SIs, the more efficient and effective the SIs are likely to be in relation to being able to eliminate accessing storage regions that can’t possibly contain data of interest. By having the data more clustered (or ordered) in relation to a specific SI, the Min/Max ranges associated with the SI are more likely to be able to determine areas of the table where data can’t exist.

For the data I’ll be using in the following examples, I refer you to the previous post where I setup the necessary data.

The following query is on a 10 million row table, based on the ALBUM_ID column that has an excellent CF:

SQL> select * from big_bowie where album_id = 42;
100000 rows selected.

Elapsed: 00:00:01.07

Execution Plan
 ----------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ----------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 100K| 8984K| 1550 (1)| 00:00:19 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 100K| 8984K| 1550 (1)| 00:00:19 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_ALBUM_ID_I | 100K| | 199 (1)| 00:00:03 |
 ----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("ALBUM_ID"=42)

Statistics
 ----------------------------------------------------------
 0 recursive calls
 0 db block gets
 1590 consistent gets
 1550 physical reads
 0 redo size
 9689267 bytes sent via SQL*Net to client
 733 bytes received via SQL*Net from client
 21 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 100000 rows processed

An index range scan access path is selected by the CBO to retrieve the 100,000 rows. At just 1590 consistent gets, with such an excellent CF, the index can very efficiently access just the necessary 100,000 rows of data. Notice that most of these consistent gets are still physical reads (1550). If we re-run the query several times and are able to cache the corresponding index/table blocks in the database buffer cache:

SQL> select * from big_bowie where album_id = 42;
100000 rows selected.

Elapsed: 00:00:00.27

Execution Plan
 ----------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ----------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 100K| 8984K| 1550 (1)| 00:00:19 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 100K| 8984K| 1550 (1)| 00:00:19 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_ALBUM_ID_I | 100K| | 199 (1)| 00:00:03 |
 ----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("ALBUM_ID"=42)

Statistics
 ----------------------------------------------------------
 0 recursive calls
 0 db block gets
 1590 consistent gets
 0 physical reads
 0 redo size
 9689267 bytes sent via SQL*Net to client
 733 bytes received via SQL*Net from client
 21 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 100000 rows processed

The overall execution times reduce down from 1.07 to just 0.27 seconds. Not bad at all considering we’re returning 100,000 rows.

However, if we run the same query on the same data with all the smarts turned on in Exadata (with the index made Invisible so that it doesn’t get used by the CBO):

SQL> select * from big_bowie where album_id = 42;
100000 rows selected.

Elapsed: 00:00:00.27
 Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 100K| 8984K| 36663 (1)| 00:07:20 |
 |* 1 | TABLE ACCESS STORAGE FULL| BIG_BOWIE | 100K| 8984K| 36663 (1)| 00:07:20 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("ALBUM_ID"=42)
 filter("ALBUM_ID"=42)

Statistics
 ----------------------------------------------------------
 1 recursive calls
 0 db block gets
 134834 consistent gets
 134809 physical reads
 0 redo size
 4345496 bytes sent via SQL*Net to client
 73850 bytes received via SQL*Net from client
 6668 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 100000 rows processed

 SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 1042.24219
 cell physical IO interconnect bytes returned by smart scan 9.56161499

We notice that although a Full Table Scan is being performed, the overall performance of the query is practically identical to that of using the index. That’s because the SIs are kicking in here and by saving 1042 MB (approximately 99% of the table), Oracle only has to actually physically access a tiny 1% of the table (basically, the 1% selectivity of the query itself). SIs based on the well clustered ALBUM_ID column are therefore very effective at eliminating the access of unnecessary data.

If we now run a query based on the TOTAL_SALES column in which the data is randomly distributed all over the place and so the associated index has a very poor CF:

SQL> select album_id, artist_id from big_bowie where total_sales between 42 and 142;
2009 rows selected.

Elapsed: 00:00:01.45

Execution Plan
 -------------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes |Cost (%CPU)| Time |
 -------------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 2040 | 26520 | 2048 (1)| 00:00:25 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 2040 | 26520 | 2048 (1)| 00:00:25 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_I | 2040 | | 7 (0)| 00:00:01 |
 -------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("TOTAL_SALES">=42 AND "TOTAL_SALES"<=142)

Statistics
 ----------------------------------------------------------
 1 recursive calls
 0 db block gets
 2150 consistent gets
 2005 physical reads
 0 redo size
 43311 bytes sent via SQL*Net to client
 1987 bytes received via SQL*Net from client
 135 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 2009 rows processed

We notice that although only 2009 rows are retrieved with this query, 2150 consistent gets have been performed (practically 1 for each row returned) . This is somewhat more than the 1590 consistent gets of the previous example when a full 100,000 rows were returned. Using this index therefore is nowhere near as efficient/effective in retrieving data as was the index in the previous example.

If all this data can be cached in the buffer cache however, we can again improve overall execution times:

SQL> select album_id, artist_id from big_bowie where total_sales between 42 and 142;
2009 rows selected.

Elapsed: 00:00:00.02

Execution Plan
 ------------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes |Cost (%CPU)| Time |
 ------------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 2040 | 26520 | 2048 (1)| 00:00:25 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 2040 | 26520 | 2048 (1)| 00:00:25 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_I | 2040 | | 7 (0)| 00:00:01 |
 ------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("TOTAL_SALES">=42 AND "TOTAL_SALES"<=142)

Statistics
 ----------------------------------------------------------
 0 recursive calls
 0 db block gets
 2150 consistent gets
 0 physical reads
 0 redo size
 43308 bytes sent via SQL*Net to client
 1987 bytes received via SQL*Net from client
 135 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 2009 rows processed

So with all the index/table data now cached, we can return the 2000 odd rows in just 0.02 seconds.

If we now run the same query with the same data in Exadata:

SQL> select album_id, artist_id from big_bowie where total_sales between 42 and 142;
2009 rows selected.

Elapsed: 00:00:01.25

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 2040 | 26520 | 36700 (1)| 00:07:21 |
 |* 1 | TABLE ACCESS STORAGE FULL| BIG_BOWIE | 2040 | 26520 | 36700 (1)| 00:07:21 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("TOTAL_SALES"<=142 AND "TOTAL_SALES">=42)
 filter("TOTAL_SALES"<=142 AND "TOTAL_SALES">=42)

Statistics
 ----------------------------------------------------------
 1 recursive calls
 0 db block gets
 134834 consistent gets
 134809 physical reads
 0 redo size
 47506 bytes sent via SQL*Net to client
 1987 bytes received via SQL*Net from client
 135 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 2009 rows processed

 SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 72.65625
 cell physical IO interconnect bytes returned by smart scan .383415222

We noticed that the physical IO bytes saved by the SIs has significantly reduced from the previous example (just 72 MBs down from 1042 MBs), even though at just 2000 odd rows we require much less data than before. In this example, only approximately 7% of table storage need not be accessed, meaning we still have to access a significant 93% of the table as the required data could potentially exist throughout the majority of the table. The poor clustering of the data in relation the TOTAL_SALES column has effectively neutralised the effectiveness of the associated SIs on the TOTAL_SALES column.

Note also that the 1.25 seconds is as good as it gets when performing a FTS with the fully generated SIs in place. In this case, using a fully cached database index access path can outperform the FTS/SI combination and provide a more efficient and ultimately more scalable method of accessing this data. As the required selectively on this column is low enough to warrant the use of a database index despite the poor CF, this is again another example of an index we may not necessarily want to automatically drop when moving to Exadata.

Storage Indexes vs Database Indexes Part I MIN/MAX (Maxwell’s Silver Hammer) December 18, 2012

Posted by Richard Foote in Exadata, Oracle Indexes, Storage Indexes.
add a comment

It’s often stated that in Exadata, you don’t need conventional database indexes anymore as everything runs so damn fast that indexes are simply a waste of time and space. Simply drop all database indexes and things will run just as fast.

Well, not quite …

There are many many scenarios where database indexes are still critical for optimal performance and scalability in an Exadata environment.

Prompted by a question in an earlier post (thanks Paul), I thought I might start looking at some scenarios where dropping a database index in Exadata would not be the best of ideas.

The first example is the scenario where one wants either the minimum/maximum of a column value. As database index entries are always ordered, this can very efficiently be found by traversing down to either the first or last index leaf block within an appropriate index. There’s an execution the INDEX FULL SCAN MIN/MAX execution path by which the CBO can decide to access an index in this manner.

For example, if I create an index on the ID column of my large DWH_BOWIE table and select the MIN(ID):

SQL> create index dwh_bowie_id_i on dwh_bowie(id);
Index created.

SQL> select min(id) from dwh_bowie;

MIN(ID)
 ----------
 1

Elapsed: 00:00:00.01

Execution Plan
 ---------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1 | 6 | 4 (0)| 00:00:01 |
 | 1 | SORT AGGREGATE | | 1 | 6 | | |
 | 2 | INDEX FULL SCAN (MIN/MAX)| DWH_BOWIE_ID_I | 1 | 6 | 4 (0)| 00:00:01 |
 ---------------------------------------------------------------------------------------------

Statistics
 ----------------------------------------------------------
 1 recursive calls
 0 db block gets
 4 consistent gets
 3 physical reads
 0 redo size
 525 bytes sent via SQL*Net to client
 524 bytes received via SQL*Net from client
 2 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 1 rows processed

With only a handful of consistent gets, we have managed to get the minimum ID value extremely quickly and efficiently.

However, if we drop this index and re-run the same query in Exadata numerous times:

SQL> select min(id) from dwh_bowie;
MIN(ID)
 ----------
 1

Elapsed: 00:00:41.31

Execution Plan
 ----------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ----------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1 | 6 | 2345K (1)| 07:49:12 |
 | 1 | SORT AGGREGATE | | 1 | 6 | | |
 | 2 | TABLE ACCESS STORAGE FULL| DWH_BOWIE | 640M| 3662M| 2345K (1)| 07:49:12 |
 ----------------------------------------------------------------------------------------

Statist5ics
 ----------------------------------------------------------
 0 recursive calls
 0 db block gets
 8626963 consistent gets
 8625479 physical reads
 0 redo size
 525 bytes sent via SQL*Net to client
 524 bytes received via SQL*Net from client
 2 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 1 rows processed

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 0
 cell physical IO interconnect bytes returned by smart scan 7644.38194

We notice the query takes considerably longer and that the Storage Indexes have been unable to be of any help with not a single byte saved.

This is because all aggregation type operations are performed at the database level, not within the storage servers. Even though in theory Storage Indexes could be seen as perhaps being of value here, they are totally ignored by Oracle when retrieving either the minimum/maximum of a column value. Remember also that a key difference between a Storage Index and a Database Index is that a Storage Index does not necessarily have to exist fully for all regions of a given table.

Drop database indexes used in this manner at your peril …

Many more examples to come :)

Exadata Storage Indexes Part V: Warming Up (Here Come The Warm Jets) December 3, 2012

Posted by Richard Foote in Exadata, Oracle Indexes, Storage Indexes.
add a comment

As I mentioned in a previous post, there are a number of Similarities between Storage Indexes and Database Indexes.

One of these similarities is the “warming up” process that needs to take place before indexes become “optimal” after either the Storage Server (in the case of Storage Indexes) or the Database Server (in the case of Database Indexes) is restarted.

When a database server is restarted, the contents associated with the database buffer cache is lost and has to be reloaded (as well as other components of the SGA of course). This means for a period of time while the instance “warms up”, there is a spike in physical IO (PIO) activity. Index range scans, which generally greatly benefit from re-using cached data therefore need to perform additional PIOs for both the index blocks and the table blocks they reference. The overheads associated with these additional PIOs in turn slows the overall performance of (especially) these index related execution plans.

As a simple illustration, the following query is executed after a flushed buffer cache:

SQL> select * from big_bowie where album_id = 42;
1000 rows selected.

Elapsed: 00:00:01.99

Execution Plan
 ----------------------------------------------------------
 Plan hash value: 2703824528

----------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ----------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1000 | 75000 | 101 (0)| 00:00:01 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 1000 | 75000 | 101 (0)| 00:00:01 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_ALBUM_ID_I | 1000 | | 1 (0)| 00:00:01 |
 ----------------------------------------------------------------------------------------------------
 Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("ALBUM_ID"=42)
 Statistics
 ----------------------------------------------------------
 1 recursive calls
 0 db block gets
 1072 consistent gets
 1005 physical reads
 0 redo size
 91231 bytes sent via SQL*Net to client
 1245 bytes received via SQL*Net from client
 68 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 1000 rows processed

There is a considerable amount of physical reads associated with having to re-load the contents of the database buffer cache. If we now re-run the query:

SQL> select * from big_bowie where album_id = 42;
1000 rows selected.

Elapsed: 00:00:00.01

Execution Plan
 ----------------------------------------------------------
 Plan hash value: 2703824528

----------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ----------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1000 | 75000 | 101 (0)| 00:00:01 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 1000 | 75000 | 101 (0)| 00:00:01 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_ALBUM_ID_I | 1000 | | 1 (0)| 00:00:01 |
 ----------------------------------------------------------------------------------------------------
 Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("ALBUM_ID"=42)
 Statistics
 ----------------------------------------------------------
 0 recursive calls
 0 db block gets
 1072 consistent gets
 0 physical reads
 0 redo size
 91231 bytes sent via SQL*Net to client
 1245 bytes received via SQL*Net from client
 68 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 1000 rows processed

All the data is now cached and the overall query response time improves considerably. Once a database instance has warmed up and much of the database’s hot, data-set of interest has been cached, PIOs are minimised and performance improves accordingly.

Much has been made of the fact Storage Indexes (SIs) need to be created on the fly but in a similar manner to database indexes, they just need a period of time to warm up and be created/stored in memory to become optimal. The difference being that SIs are actually created on the fly when cached in memory unlike database indexes which physically preexist on disk.

So when an Exadata storage server is re-started, the performance associated with the SIs on the server is initially sluggish as well, while Oracle performs the additional PIOs off disk to create the necessary SIs in memory. Additionally, new queries using new column predicates will also be initially sluggish while any necessary SIs are created and cached in memory. Note that SIs are created on a 1M storage region basis so it could well take a period of time for a SI to be created across all the associated storage regions of the table. There is also a limit of 8 SIs per 1M storage region which might limit whether a SI is actually created or preserved (more on this point in a later post).

The following query is executed on a relatively new table and in a new session to capture fresh session statistics:

SQL> select artist_id, total_sales from big_bowie where album_id = 42;
100000 rows selected.

Elapsed: 00:00:00.89

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 100K| 1269K| 36663 (1)| 00:07:20 |
 |* 1 | TABLE ACCESS STORAGE FULL| BIG_BOWIE | 100K| 1269K| 36663 (1)| 00:07:20 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("ALBUM_ID"=42)
 filter("ALBUM_ID"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 114.601563
 cell physical IO interconnect bytes returned by smart scan 1.77637482

The amount of physical IO bytes saved by a SI is initially minimal, about 114 MB.

However, when the query is re-executed and the associated SI on the ALBUM_ID column has been fully created:

SQL> select artist_id, total_sales from big_bowie where album_id = 42;
100000 rows selected.

Elapsed: 00:00:00.27

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 100K| 1269K| 36663 (1)| 00:07:20 |
 |* 1 | TABLE ACCESS STORAGE FULL| BIG_BOWIE | 100K| 1269K| 36663 (1)| 00:07:20 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("ALBUM_ID"=42)
 filter("ALBUM_ID"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 1156.84375
 cell physical IO interconnect bytes returned by smart scan 3.41764832

The amount of physical IO bytes saved by SIs increases considerably and overall performance improves as a result.

Restarting a database server will result in sluggish performance for a period until most of the database hot data is cached. This is also the case when an Exadata Storage server is re-started, in part while the necessary SIs are recreated and cached.

I’ll next discuss another similarity, that being the importance of the clustering of data within the table to the efficiency of these indexes.

Follow

Get every new post delivered to your Inbox.

Join 1,852 other followers