jump to navigation

Exadata Storage Indexes Part V: Warming Up (Here Come The Warm Jets) December 3, 2012

Posted by Richard Foote in Exadata, Oracle Indexes, Storage Indexes.
trackback

As I mentioned in a previous post, there are a number of Similarities between Storage Indexes and Database Indexes.

One of these similarities is the “warming up” process that needs to take place before indexes become “optimal” after either the Storage Server (in the case of Storage Indexes) or the Database Server (in the case of Database Indexes) is restarted.

When a database server is restarted, the contents associated with the database buffer cache is lost and has to be reloaded (as well as other components of the SGA of course). This means for a period of time while the instance “warms up”, there is a spike in physical IO (PIO) activity. Index range scans, which generally greatly benefit from re-using cached data therefore need to perform additional PIOs for both the index blocks and the table blocks they reference. The overheads associated with these additional PIOs in turn slows the overall performance of (especially) these index related execution plans.

As a simple illustration, the following query is executed after a flushed buffer cache:

SQL> select * from big_bowie where album_id = 42;
1000 rows selected.

Elapsed: 00:00:01.99

Execution Plan
 ----------------------------------------------------------
 Plan hash value: 2703824528

----------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ----------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1000 | 75000 | 101 (0)| 00:00:01 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 1000 | 75000 | 101 (0)| 00:00:01 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_ALBUM_ID_I | 1000 | | 1 (0)| 00:00:01 |
 ----------------------------------------------------------------------------------------------------
 Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("ALBUM_ID"=42)
 Statistics
 ----------------------------------------------------------
 1 recursive calls
 0 db block gets
 1072 consistent gets
 1005 physical reads
 0 redo size
 91231 bytes sent via SQL*Net to client
 1245 bytes received via SQL*Net from client
 68 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 1000 rows processed

There is a considerable amount of physical reads associated with having to re-load the contents of the database buffer cache. If we now re-run the query:

SQL> select * from big_bowie where album_id = 42;
1000 rows selected.

Elapsed: 00:00:00.01

Execution Plan
 ----------------------------------------------------------
 Plan hash value: 2703824528

----------------------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ----------------------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 1000 | 75000 | 101 (0)| 00:00:01 |
 | 1 | TABLE ACCESS BY INDEX ROWID| BIG_BOWIE | 1000 | 75000 | 101 (0)| 00:00:01 |
 |* 2 | INDEX RANGE SCAN | BIG_BOWIE_ALBUM_ID_I | 1000 | | 1 (0)| 00:00:01 |
 ----------------------------------------------------------------------------------------------------
 Predicate Information (identified by operation id):
 ---------------------------------------------------

2 - access("ALBUM_ID"=42)
 Statistics
 ----------------------------------------------------------
 0 recursive calls
 0 db block gets
 1072 consistent gets
 0 physical reads
 0 redo size
 91231 bytes sent via SQL*Net to client
 1245 bytes received via SQL*Net from client
 68 SQL*Net roundtrips to/from client
 0 sorts (memory)
 0 sorts (disk)
 1000 rows processed

All the data is now cached and the overall query response time improves considerably. Once a database instance has warmed up and much of the database’s hot, data-set of interest has been cached, PIOs are minimised and performance improves accordingly.

Much has been made of the fact Storage Indexes (SIs) need to be created on the fly but in a similar manner to database indexes, they just need a period of time to warm up and be created/stored in memory to become optimal. The difference being that SIs are actually created on the fly when cached in memory unlike database indexes which physically preexist on disk.

So when an Exadata storage server is re-started, the performance associated with the SIs on the server is initially sluggish as well, while Oracle performs the additional PIOs off disk to create the necessary SIs in memory. Additionally, new queries using new column predicates will also be initially sluggish while any necessary SIs are created and cached in memory. Note that SIs are created on a 1M storage region basis so it could well take a period of time for a SI to be created across all the associated storage regions of the table. There is also a limit of 8 SIs per 1M storage region which might limit whether a SI is actually created or preserved (more on this point in a later post).

The following query is executed on a relatively new table and in a new session to capture fresh session statistics:

SQL> select artist_id, total_sales from big_bowie where album_id = 42;
100000 rows selected.

Elapsed: 00:00:00.89

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 100K| 1269K| 36663 (1)| 00:07:20 |
 |* 1 | TABLE ACCESS STORAGE FULL| BIG_BOWIE | 100K| 1269K| 36663 (1)| 00:07:20 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("ALBUM_ID"=42)
 filter("ALBUM_ID"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 114.601563
 cell physical IO interconnect bytes returned by smart scan 1.77637482

The amount of physical IO bytes saved by a SI is initially minimal, about 114 MB.

However, when the query is re-executed and the associated SI on the ALBUM_ID column has been fully created:

SQL> select artist_id, total_sales from big_bowie where album_id = 42;
100000 rows selected.

Elapsed: 00:00:00.27

Execution Plan
 ---------------------------------------------------------------------------------------
 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
 ---------------------------------------------------------------------------------------
 | 0 | SELECT STATEMENT | | 100K| 1269K| 36663 (1)| 00:07:20 |
 |* 1 | TABLE ACCESS STORAGE FULL| BIG_BOWIE | 100K| 1269K| 36663 (1)| 00:07:20 |
 ---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
 ---------------------------------------------------

1 - storage("ALBUM_ID"=42)
 filter("ALBUM_ID"=42)

SQL> select name , value/1024/1024 MB from v$statname n, v$mystat s where n.statistic# = s.statistic# and n.name in
 ('cell physical IO interconnect bytes returned by smart scan', 'cell physical IO bytes saved by storage index');

NAME MB
 ---------------------------------------------------------------- ----------
 cell physical IO bytes saved by storage index 1156.84375
 cell physical IO interconnect bytes returned by smart scan 3.41764832

The amount of physical IO bytes saved by SIs increases considerably and overall performance improves as a result.

Restarting a database server will result in sluggish performance for a period until most of the database hot data is cached. This is also the case when an Exadata Storage server is re-started, in part while the necessary SIs are recreated and cached.

I’ll next discuss another similarity, that being the importance of the clustering of data within the table to the efficiency of these indexes.

About these ads

Comments»

No comments yet — be the first.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 1,853 other followers

%d bloggers like this: