jump to navigation

Indexes On Small Tables Part III (Another Brick In The Wall Part III) April 29, 2009

Posted by Richard Foote in Non-Unique Indexes, Oracle Indexes, Small Indexes.

So far, in Part I and Part II of this little series, we’ve looked at how Oracle performs a Full Table Scan (FTS) when accessing a small table. With very small tables, Oracle needs to still access the table segment header and perform a number of fetch operations, even if the table is as small as you can get with all the rows in the table residing in one table block and even if we’re only interested in accessing the one row. In the little example we went through, Oracle still needs to access 2 distinct table blocks and perform 4 consistent get operations to read this one row of interest.

OK, let’s now finally introduce an index into the discussion.  To begin with, we’ll just create a standard, Non-Unique index on the ID column.

SQL> create index small_id_i on small(id);

Index created.

Now surely, there would be no real benefit in creating such an index on such a tiny table. This would be a separate index segment from the existing table segment so that if Oracle were to ever use the index to retrieve rows, it would need to visit both the index and table segments.

How can that possibly be more efficient than reading directly from the only table block in the table segment that contains rows ?

Well, let’s see what happens:

SQL> select * from small where id = 42;

        ID NAME
---------- -----
        42 BOWIE

|Id|Operation                   |Name      |
| 0|SELECT STATEMENT            |          |
|*2|  INDEX RANGE SCAN          |SMALL_ID_I|

  0  recursive calls
  0  db block gets
  3  consistent gets
  0  physical reads
  0  redo size
465  bytes sent via SQL*Net to client
396  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  0  sorts (memory)
  0  sorts (disk)
  1  rows processed

Well, we notice 2 very interesting points.

Firstly, the CBO does indeed decide to use the index, even though the table is so tiny.

Secondly, we notice that the actual number of consistent gets has actually reduced from 4 with the FTS down to just 3.  The index scan does indeed require less consistent gets than the equivalent FTS.

If we look at the consistent get stats via another session before and after the SELECT staement as we did in Part I, we can again confirm only 3 consistent gets were necessary:

SQL> SELECT n.name, s.value FROM v$sesstat s, v$statname n
WHERE s.statistic# =n.statistic# AND s.sid = 134 AND n.name LIKE ‘consistent%’;

NAME                            VALUE
----------------------------- -------
consistent gets                   110
consistent gets - examination      25

And again after the SELECT on the small table:

SQL> SELECT n.name, s.value FROM v$sesstat s, v$statname n
WHERE s.statistic# =n.statistic# AND s.sid = 134 AND n.name LIKE ‘consistent%’;

NAME                            VALUE
----------------------------- -------
consistent gets                   113 (+3)
consistent gets - examination      25

The consistent gets indeed increased by 3 rather than 4 but again as in the FTS example, none of these consistent gets were the cheaper “consistent gets – examination” (to be discussed later).

But why did the index scan only require 3 consistent gets ? Again, by flushing the buffer cache and tracing the session, we get the necessary information to answer the question:

SQL> alter system flush buffer_cache;

System altered.

SQL> exec dbms_monitor.session_trace_enable(waits=> true);

PL/SQL procedure successfully completed.

SQL> select * from small where id = 42;

        ID NAME
---------- -----
        42 BOWIE

SQL> exec dbms_monitor.session_trace_disable();

PL/SQL procedure successfully completed.

If we now look at an example trace file:
PARSING IN CURSOR #2 len=33 dep=0 uid=88 oct=3 lid=88 tim=24452815463 hv=2225904586 ad=’23bdd320′ sqlid=’7pjs08q2at6ya’
select * from small where id = 42
PARSE #2:c=0,e=7613,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=24452815457
EXEC #2:c=0,e=42,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=24452815593
WAIT #2: nam=’SQL*Net message to client’ ela= 5 driver id=1111838976 #bytes=1 p3=0 obj#=12446 tim=24452815631
WAIT #2: nam=’db file sequential read’ ela= 17035 file#=7 block#=118026 blocks=1 obj#=99587 tim=24452832933
WAIT #2: nam=’db file sequential read’ ela= 5592 file#=7 block#=117898 blocks=1 obj#=99586 tim=24452838643

FETCH #2:c=0,e=23057,p=2,cr=2,cu=0,mis=0,r=1,dep=0,og=1,tim=24452838728
WAIT #2: nam=’SQL*Net message from client’ ela= 394 driver id=1111838976 #bytes=1 p3=0 obj#=99586 tim=24452839197
FETCH #2:c=0,e=22,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,tim=24452839275
STAT #2 id=1 cnt=1 pid=0 pos=1 obj=99586 op=’TABLE ACCESS BY INDEX ROWID SMALL (cr=3 pr=2 pw=2 time=0 us cost=2 size=9 card=1)’
STAT #2 id=2 cnt=1 pid=1 pos=1 obj=99587 op=’INDEX RANGE SCAN SMALL_ID_I (cr=2 pr=1 pw=1 time=0 us cost=1 size=0 card=1)’
WAIT #2: nam=’SQL*Net message to client’ ela= 3 driver id=1111838976 #bytes=1 p3=0 obj#=99586 tim=24452839399

The critical part here is the first “db file sequential read” wait event. By looking at the obj#=99587, we can determine this refers to the SMALL_ID_I index segment. By looking at the associated file#=7 block#=118026, a block dump reveals this to be the index leaf block containing the index row entries. Notice how we do not actually access the index segment header at all prior to accessing the leaf block of the index.

The following “db file sequential read” wait event on obj#=99586 corresponds to the table segment and file#=7 blockblock#=117898 corresponds to the table block containing the rows. So the first FETCH uses 2 consistent reads (cr=2) to return the first (and in this case only) row of interest (r=1) by first reading the index leaf block and then reading the table data block containing the row of interest.

Oracle then performs another FETCH to read the leaf block again to determine whether there are any other rows of interest (cr=1) of which there are none (r=0).

There is no PK or Unique constraint on the ID column in table and the index is Non-Unique so there is no possible way for Oracle to know there is indeed only one row of interest hence why Oracle needs to again visit the leaf block just in case there are multiple rows that have an ID = 42.

So that’s our 3 consistent gets, 1 to read the leaf block, 1 to read the table block and 1 to again read the index block again due to a SQL*PLUS quirk to perform a secondary fetch operation for non-unique scans as the first fetch only checks and fetches for one row.

Now in our specific example, that’s a saving of 1 consistent get and effectively 2 latch gets by using the Non-Unique index vs. performing the FTS on this effectively 1 block table. The savings would of course be more substantial if the small table had more than just the one data block containing rows.

Now you might well say, is it really worth the effort, what’s 1 consistent get in the grand scheme of things. Well, that’s 1 consistent get each and every time such a query needs to access this specific table. And this is just one table, you may have many such lookup tables in your databases. In heavily used applications, such lookup tables can potentially be accessed many millions of times per hour. Now all this can add up to potentially many millions of reduced consistent gets, many millions of reduced latch requests, which all in turn can add up to considerable CPU savings as well as the significantly reduced contention issues, increasing the general scalability of your applications.

And as we’ll see, we can actually do a lot better than saving just 1 consistent get by using an index to access such small tables …

However, the key question at this point is why doesn’t Oracle have to visit the index segment header as it does with the table segment during a FTS and how does Oracle know where to find the index block it needs ?

That will be answered in the next instalment coming soon …


1. John - April 30, 2009

Interesting topic.
I perhaps don’t have the patience or expertise to decipher the lower level details, but what about the cache? Can you comment how things are affected if the data IS already in mem?

Richard Foote - April 30, 2009

Hi John

There are 2 physical blocks that need to be accessed in both cases but with a Non-Unique index, we only need to perform 3 consistent get operations rather than 4 with the FTS.

If all the blocks are already in memory which is highly likely if the table is heavily accessed, then it’s still 3 vs. 4 consistent gets. Which means we save 1 attempt at getting the cache buffers chain latch to pin the block and another attempt at getting the cache buffers chain latch again to unpin the block.

Getting the latch twice and pinning / unpinning a block all takes up CPU and is a source of possible contention even if the blocks are already in memory so if we can save doing so potentially millions of times in a given period, it’s worth doing.

2. Nigel Thomas - April 30, 2009

Another question is the cpu load; how does it compare having to trail through all the rows in the data block (full table scan) to test if BOWIE = 42 vs skimming through index keys in the leaf block (index range scan).

Richard Foote - April 30, 2009

Hi Nigel

Yes it would require less work reading the data from the table block via an index. The given row could be anyway within the table block and Oracle has to check each row (and all table blocks) when performing a FTS. However the index determines the exact row location within the block and burns less CPU in finding the specific row.

If you do this operation millions of times per hour, it all adds up to potentially useful savings.

3. Vyacheslav Kryuchkov - May 1, 2009

Hello Richard,

Thank you for your great articles!
I know, sqlplus is the best and ‘default’ tool to show demos and I understand that show demonstrations on java smells bad. But there are many-many applications written in java, and it makes some things differently from sqlplus.
I tried demo with FTS and IRS in java and got 3 and 2 consistent gets. Jdbc doesn’t do second fetch to check more rows of interest, it already has this information after the first fetch, so we have 1 consistent get less. Exception is if we set fetch size to 1, in this case java will show the same result as sqlplus.
Trace from java (sorry, if bad-formatted, there is no preview here):
PARSING IN CURSOR #1 len=33 dep=0 uid=36 oct=3 lid=36 tim=29906358753 hv=2225904586 ad=’1f2d9e44′
select * from small where id = 42
PARSE #1:c=0,e=51,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=29906358748
EXEC #1:c=0,e=442,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=29906361179
WAIT #1: nam=’db file scattered read’ ela= 10927 file#=10 block#=1937 blocks=8 obj#=23826 tim=29906457080
WAIT #1: nam=’db file scattered read’ ela= 5352 file#=10 block#=1929 blocks=8 obj#=23816 tim=29906463225
FETCH #1:c=0,e=40651,p=16,cr=2,cu=0,mis=0,r=1,dep=0,og=1,tim=29906464255
STAT #1 id=1 cnt=1 pid=0 pos=1 obj=23816 op=’TABLE ACCESS BY INDEX ROWID SMALL (cr=2 pr=16 pw=0 time=40231 us)’
STAT #1 id=2 cnt=1 pid=1 pos=1 obj=23826 op=’INDEX RANGE SCAN SMALL_ID_I (cr=1 pr=8 pw=0 time=34219 us)’
Best regards,
Vyacheslav Kryuchkov

Richard Foote - May 1, 2009

Hi Vyacheslav

Thank you for your feedback.

The best tools and environments to test things out are of course the tools and environments most applicable to your situation. Testing with Java is perfect if you deal with java (or pl/sql or whatever) so well done on your tests.

What I try and do is set out a basic set of tests which one can modify and adjust to one particular situation so thank you for doing just that and coming back with variations.

I also try and explain the reason for various results so that results such as the extra fetch operation can be identified and perhaps addressed. The extra fetch is strictly not necessary and indeed is not performed in other scenarios / with other tools.

However, from the perspective of comparing a FTS vs. a non-unique index, the extra fetch is not necessary in both cases and so the “advantage” of the index approach is still there.

The other thing I’ll mention is that I’m not finished yet with index advantages and will cover other cases where the advantage is even greater still and more significant when comparing a FTS vs. an index on a small table.

Again, thanks for your comments and posting your results, very much appreciated.

4. Vyacheslav Kryuchkov - May 1, 2009

Hello Richard,

I realized that not only java, but cursor in pl/sql also doesn’t need the second fetch:

set serveroutput on;
cursor c is select name from small where id = 42;
type array is table of c%rowtype index by binary_integer;
l_data array;
vl number;
vl2 number;
SELECT value into vl
FROM v$statname sn, v$mystat ms
WHERE sn.STATISTIC#=ms.STATISTIC# and name = ‘consistent gets’;
dbms_output.put_line(‘consistent gets = ‘||vl);
open c;
fetch c bulk collect into l_data LIMIT 10;
close c;
SELECT value into vl2
FROM v$statname sn, v$mystat ms
WHERE sn.STATISTIC#=ms.STATISTIC# and name = ‘consistent gets’;
dbms_output.put_line(‘consistent gets = ‘||vl2||’ (+’||(vl2-vl)||’)’);
consistent gets = 51
consistent gets = 53 (+2)

Best regards,
Vyacheslav Kryuchkov

5. Caglar - January 9, 2014

Hello Richard,

First of all, I would like to thank you in order to share your remarkable and invaluable thoughts. I learn lots of things from you.

My question is about the following quote that you wrote

“So that’s our 3 consistent gets, 1 to read the leaf block, 1 to read the table block and 1 to again read the leaf block just in case there are other rows of interest.”

As far as I know from OTN discussion, (https://community.oracle.com/message/10711767) there is fetch and there is prefetch. SQL*PLUS is written in OCI and in OCI prefetch defaults to 1. The default arraysize in SQL*PLUS is 15. So, number of fetches into SQL*PLUS buffer will be 1,15,15,15…per round-trip. In our case, there is only 1 row in order to be fetched. Since the first fetch always contain 1 row (in SQL*PLUS). It has to be known that there is nothing more to be fetched. So, the redundant fetch should be performed on the table block again. As you shown in the trace file, the second fetch contains 0 rows.

FETCH #2:c=0,e=22,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,tim=24452839275

The thing that I disagree is the last consistent gets did not perform on the leaf block. On the cantrary, it performed on the table block.

ALso, Vyacheslav Kryuchkov’s comment is another proof, that the last consistent get should have been on the table block rather than leaf block. He tried the same test on java, please correct me if I am wrong, the default arraysize in java is 10. In addition there is no prefect in java like SQL*PLUS. So, if the last consistent gets were performed on leaf block in your example, it should have performed one more consistent get on leaf block again in Vyacheslav’s example.

I hope I expressed myself. I look forward your response.
Thanks in advance.

Richard Foote - January 9, 2014

Hi Caglar

Thanks for your kind comments, glad you’ve found some useful info here.

Now I’ve re-read my article again (it helps ;) ) I’m confident the 3rd CR is indeed actually on the leaf block :)

This is a quirk with SQL*PLUS in which it performs the additional fetch call (as the first fetch operation only returns the one row) and this needs to “start again” with the index.

You can test and see this by simply running queries with differing row sets and array sizes to see how the CR count increments.

In Java, there is no subsequent call because the first fetch looks for greater than one row and therefore “knows” within the single fetch there is nothing else to do.

6. Caglar - January 10, 2014

Hi Richard,

I tought the leaf block was pinned.


Richard Foote - January 14, 2014

Hi Caglar

The leaf block is indeed pinned in the case of a “normal” consistent get. But this is only held during the course of the consistent get and needs to be re-pinned by following consistent gets (and so the memory latches to do this and subsequently unpin the block need to be reacquired).

The story is a little different with unique indexes and consistent get examinations when the blocks are not actually pinned, which I discuss here:


7. Rathi - October 22, 2015

Hi Richard,

The meaning of ‘consistent get’ is getting the data from the buffer cache. But, in the demonstrations you have showed , you have flushed the cache & then executed the examples. The data is being read from disk. So, how is the reading of data from disk counted as ‘consistent get’?

I’m sure tat I am missing something here. Please do let me know


Richard Foote - October 22, 2015

Hi Rathi

What you’re missing is that the data is read from disk and then loaded into the buffer cache and then accessed and read from there.

So a consistent get is indeed a read from the buffer cache that might also require a physical read beforehand to load it into the buffer cache if it’s not there already.

Rathi - October 22, 2015

Thank you Richard

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Get every new post delivered to your Inbox.

Join 2,158 other followers

%d bloggers like this: