jump to navigation

Separate Indexes From Tables, Some Thoughts Part 1.5 (Think For Yourself) April 23, 2008

Posted by Richard Foote in Oracle General, Oracle Indexes, Oracle Myths, Tablespace Management.
5 comments

Based on the discussions on whether there’s any performance benefit in storing indexes separately from tables, I’ve had a number of questions on how can one actually determine exactly how an index range scan is performed. How for example can one determine whether an index I/O is following by a table I/O in a serial manner.

I thought therefore it might be worth going through an example of the process one can use to find these things out for oneself. This is the same basic process I use to determine much of Oracle’s internal behaviour. Simply take an extended trace of the specific operation and study the resultant trace file …

First step is to create a sample table. The table and index doesn’t need to be particularly sophisticated, simple is usually best:

SQL> create table bowie_test (id number, value number, name varchar(20)) tablespace bowie_ts parallel 1;

Table created.

SQL> create index bowie_test_id_i on bowie_test(id) tablespace users parallel 1;

Index created.

SQL> create index bowie_test_value_i on bowie_test(value) tablespace users parallel 1;

Index created.

The first index will be an index on a monotonically increasing value, the second index is on a randomly generated value. Notice the indexes have been created in a separate tablespace to the table so we can easily determine the associated data files.

We next need to just check which data files belong to which tablespace:

SQL> select tablespace_name, file_id from dba_data_files where tablespace_name in (‘BOWIE_TS’, ‘USERS’);

TABLESPACE_NAME                   FILE_ID
------------------------------ ----------
USERS                                   4
BOWIE_TS                                7
BOWIE_TS                                8 

Notice how the USERS tablespace which stores the indexes has a data file ID of 4 …

Next we populate the table, the ID column with a monotonically increasing value and the VALUE column with a randomly generated number:

SQL> insert into bowie_test select rownum, dbms_random.value(0,1000000), ‘David Bowie’ from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’BOWIE_TEST’, cascade=>true, estimate_percent=>null, method_opt=>’FOR ALL COLUMNS SIZE 1′);

PL/SQL procedure successfully completed.

Next, we flush the buffer cache to ensure all subsequent I/Os at least start off by being physical I/Os:

SQL> alter system flush buffer_cache;

System altered.

We now start an extended trace of the subsequent operations in the session:

SQL> set autotrace traceonly
SQL> alter session set events ’10046 trace name context forever, level 12′;

Session altered.

Now, let’s perform a large index range scan, using the random column index. Note we use a hint to ensure the execution plan uses the index:

SQL> SELECT /*+ index(b bowie_test_value_i) */ * FROM bowie_test b WHERE value BETWEEN 100000 and 200000;

99611 rows selected.

Finally, we end the extended trace of the session:

SQL> alter session set events ’10046 trace name context off’;

Session altered.

Following are sample lines from the generated trace file. Let’s begin by looking at the trace file, searching for the query we’ve just executed:

PARSING IN CURSOR #1 len=99 dep=0 uid=88 oct=3 lid=88 tim=27706931411 hv=2642872055 ad=’23a7d548′ sqlid=’7aynpuafsf1rr’
SELECT /*+ index(b bowie_test_value_i) */ * FROM bowie_test b WHERE value BETWEEN 100000 and 200000
END OF STMT
PARSE #1:c=0,e=12923,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=27706931405
BINDS #1:
EXEC #1:c=0,e=56,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=27706931577
WAIT #1: nam=’SQL*Net message to client’ ela= 5 driver id=1413697536 #bytes=1 p3=0 obj#=12446 tim=27706931615
WAIT #1: nam=’db file sequential read’ ela= 14972 file#=4 block#=23148 blocks=1obj#=80423 tim=27706946845
WAIT #1: nam=’db file sequential read’ ela= 10979 file#=4 block#=30204 blocks=1obj#=80423 tim=27706957960
WAIT #1: nam=’db file sequential read’ ela= 33756 file#=4 block#=24802 blocks=1obj#=80423 tim=27706991842
WAIT #1: nam=’db file sequential read’ ela= 16408 file#=7 block#=121331 blocks=1obj#=80422 tim=27707008593
FETCH #1:c=0,e=77078,p=4,cr=4,cu=0,mis=0,r=1,dep=0,og=1,tim=27707008734
WAIT #1: nam=’SQL*Net message from client’ ela= 2099 driver id=1413697536 #bytes=1 p3=0 obj#=80422 tim=27707010906
WAIT #1: nam=’db file sequential read’ ela= 1661 file#=7 block#=123430 blocks=1obj#=80422 tim=27707012707
WAIT #1: nam=’SQL*Net message to client’ ela= 5 driver id=1413697536 #bytes=1 p3=0 obj#=80422 tim=27707012765
WAIT #1: nam=’db file sequential read’ ela= 10203 file#=7 block#=120965 blocks=1obj#=80422 tim=27707023006
WAIT #1: nam=’db file sequential read’ ela= 13488 file#=7 block#=122222 blocks=1obj#=80422 tim=27707036715
WAIT #1: nam=’db file sequential read’ ela= 16737 file#=7 block#=124918 blocks=1obj#=80422 tim=27707053604
WAIT #1: nam=’db file sequential read’ ela= 15888 file#=7 block#=120896 blocks=1obj#=80422 tim=27707069597
WAIT #1: nam=’db file sequential read’ ela= 11684 file#=7 block#=125491 blocks=1obj#=80422 tim=27707081569

Notice how the first 3 physical I/Os are related to the index, the root block, the intermediate branch block and the first leaf block of interest. Note the file number tells us this, as does the obj#. Also notice how each I/O only reads the one block and how each block is random and not consecutive to the previously read blocks (as illustrated by the block# value).

Notice how we next have a whole series of I/Os that correspond to the table. Again we can tell by the different file number and by the new obj#. Notice how they’re all also 1 block I/Os and how they’re all random, non consecutive physical I/Os. There are actually approximately 150 such table related I/Os until we finally reach the next index related I/O:

WAIT #1: nam=’db file sequential read’ ela= 8374 file#=4 block#=30216 blocks=1obj#=80423 tim=27708496278
WAIT #1: nam=’db file sequential read’ ela= 13180 file#=7 block#=122271 blocks=1 obj#=80422 tim=27708509565
WAIT #1: nam=’db file sequential read’ ela= 13942 file#=7 block#=124230 blocks=1 obj#=80422 tim=27708523576
WAIT #1: nam=’db file sequential read’ ela= 10766 file#=7 block#=122566 blocks=1 obj#=80422 tim=27708534450
WAIT #1: nam=’db file sequential read’ ela= 9414 file#=7 block#=124035 blocks=1 obj#=80422 tim=27708543976
WAIT #1: nam=’db file sequential read’ ela= 10800 file#=7 block#=125604 blocks=1 obj#=80422 tim=27708554874
WAIT #1: nam=’db file sequential read’ ela= 13215 file#=7 block#=121903 blocks=1 obj#=80422 tim=27708568145
WAIT #1: nam=’db file sequential read’ ela= 12977 file#=7 block#=124409 blocks=1 obj#=80422 tim=27708581221
WAIT #1: nam=’db file sequential read’ ela= 13834 file#=7 block#=121260 blocks=1 obj#=80422 tim=27708595150

Notice how this next index block which relates to the next leaf block is only a single block I/O that again is not at all consecutive to the previously read index block. In short, it’s just another random I/O.

This is then again followed by a whole series of randomly accessed table related blocks.

This whole sequence is then repeated many times for a whole series of different index leaf block, none of which are consecutive to the previously read index blocks. Here are a few of the following index related blocks reads, in between each we had a whole series of table related I/Os:

WAIT #1: nam=’db file sequential read’ ela= 16580 file#=4 block#=26300 blocks=1 obj#=80423 tim=27709706388

WAIT #1: nam=’db file sequential read’ ela= 13769 file#=4 block#=30113 blocks=1 obj#=80423 tim=27711127979

WAIT #1: nam=’db file sequential read’ ela= 12267 file#=4 block#=16476 blocks=1 obj#=80423 tim=27712662102

WAIT #1: nam=’db file sequential read’ ela= 7692 file#=4 block#=30603 blocks=1 obj#=80423 tim=27713801853

Note that the next “logical” block in an index structure is not necessarily the next “physical” block in the index segment. None of these are consecutive, they’re all effectively random Index related I/Os …

The fact all these I/Os are performed in this specific, serial sequence and the fact all these I/Os are effectively random is one of the main reasons why separating indexes from tables makes no real difference to performance.

That and the fact of course that while this select was being performed, there were concurrently a whole series of other I/Os being requested at the same time by all the other concurrent sessions in this database environment that potentially conflicted with the I/O requests in my session.

I’ll leave it to the reader to perform the same tests again with the monotonically increasing index to see how things can be somewhat different for well clustered indexes. However, contention between sessions and different processes still makes separating indexes in these more promising scenarios ultimately futile as well.

Follow

Get every new post delivered to your Inbox.

Join 1,713 other followers