jump to navigation

Indexes On Small Tables Part II (The Mysteries) April 24, 2009

Posted by Richard Foote in Oracle Indexes, Small Indexes.
trackback

I’m a bit pushed for time at the moment but I thought I might quickly just expand a little on the observations in Part I of this no doubt soon to be epic series (there’s at least another 3 parts to come) !!

In Part I we saw how even with a tiny table that consists of just one block containing 100 rows of application data, Oracle requires 4 consistent get operations to retrieve just the one row via a Full Table Scan (FTS). This surprises some folks as they expect Oracle to perhaps only need the 1 or maybe 2 consistent gets to access this one block containing application related row data.

If we were to flush the buffer cache before running the SELECT statement and trace the associated session, the resultant trace file shows how Oracle needs to first visit the Segment Header of the table before it can read the actual table block containing the row of interest. Oracle needs to read the table segment header in order to determine what blocks need to be accessed to begin the FTS operation. In this specific case, there’s only the one data block but Oracle also has to check to ensure there indeed are no more blocks it needs to visit. Oracle also has to perform another fetch operation to confirm there are indeed no more rows it needs to return after it fetches the first (and in this case only) row.

The following should help to show what’s going on during the FTS of our little 100 row table where all 100 rows nicely resides in the one data block.

We first flush the buffer cache in order to force Oracle to perform physical I/Os (note the following example was run on 11.1.0.6).

SQL> alter system flush buffer_cache;

System altered.

 

Next, we trace our session …

SQL> exec dbms_monitor.session_trace_enable(waits=> true);

PL/SQL procedure successfully completed.

 

Before running the SELECT statement that returns our one row of interest …

SQL> select * from small where id = 42;

 
        ID NAME
---------- -----
        42 BOWIE

 

SQL> exec dbms_monitor.session_trace_disable();

 

If we look at an example trace file:

=====================
PARSING IN CURSOR #19 len=33 dep=0 uid=88 oct=3 lid=88 tim=24079367090 hv=2225904586 ad=’23bdd320′ sqlid=’7pjs08q2at6ya’
select * from small where id = 42
END OF STMT
PARSE #19:c=0,e=18142,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=24079367080
EXEC #19:c=0,e=35,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=24079367262
WAIT #19: nam=’SQL*Net message to client’ ela= 5 driver id=1111838976 #bytes=1 p3=0 obj#=62986 tim=24079367301
WAIT #19: nam=’db file sequential read’ ela= 16020 file#=7 block#=117897blocks=1 obj#=99586 tim=24079383391
WAIT #19: nam=’db file sequential read’ ela= 16820 file#=7 block#=117898blocks=1 obj#=99586 tim=24079416967
FETCH #19:c=15625,e=49760,p=2,cr=3,cu=0,mis=0,r=1,dep=0,og=1,tim=24079417101

WAIT #19: nam=’SQL*Net message from client’ ela= 445 driver id=1111838976 #bytes=1 p3=0 obj#=99586 tim=24079417615
FETCH #19:c=0,e=27,p=0,cr=1,cu=0,mis=0,r=0,dep=0,og=1,tim=24079417699
STAT #19 id=1 cnt=1 pid=0 pos=1 obj=99586 op=’TABLE ACCESS FULL SMALL (cr=4 pr=2 pw=2 time=0 us cost=2 size=9 card=1)’
WAIT #19: nam=’SQL*Net message to client’ ela= 6 driver id=1111838976 #bytes=1 p3=0 obj#=99586 tim=24079439168

 

Notice how the first highlighted wait event performs a ‘db file sequential read’ to first  access the segment header as denoted by file#=7 block#=117897 in order to determine which blocks need to be read during the FTS operation. You can easily confirm the file# and block # corresponds to the table segment header by querying DBA_SEGMENTS.

This is then immediately followed by another ‘db file sequential read’ wait event to access the only data block of interest as denoted by file#=7 block#=117898. Notice how this block is simply the block that follows the segment header, as all 100 rows were inserted at one time by the one transaction. Note this is the only data block in the table that contains rows and is the only table block that needs to be accessed during the FTS operation.

Notice how the first FETCH operation resulted in 3 consistent gets (cr=3), 2 consistent gets that correspond to the 2 physical I/O waits events already identified plus an extra consistent read to confirm there were no more table blocks of interest.  This FETCH returns the first and (in this case) only row of interest (r=1).

A second FETCH was required resulting in an additional consistent get (cr=1) to confirm to the client that there are indeed no more rows of interest to be returned after the first row was fetched (r=0). We might know there’s only one row but Oracle doesn’t until it performs this second fetch.

Note BTW that if the query returned no rows at all, this second fetch would not have been required as the first empty fetch would have confirmed to Oracle there were no more rows to come. The total CR count would have been just 3 in this case (but would still have been bettered by an index if present).

This is a small lookup table and we’re generally interested in just the one row. As discussed in Part 1 and now expanded upon here, a FTS requires at least 4 CR behind the scenes when retrieving just the one row of interest, even if the table is tiny and can potentially store all its rows in just the one data block.

You can’t really get a table smaller than one block and yet as we’ll see, an index can beat the 4 CR overhead of reading a row from this tiny table via a FTS.

Next installment coming soon … :)

About these ads

Comments»

1. Dan Mamur - April 25, 2009

Very good information. I have a question (a very basic one):

SQL> exec show_space(‘SMALL’, ‘TEST’);
Free Blocks………………………..0
Total Blocks……………………….8
Total Bytes………………………..65536
Total MBytes……………………….0
Unused Blocks………………………6
Unused Bytes……………………….49152
Last Used Ext FileId………………..6
Last Used Ext BlockId……………….9
Last Used Block…………………….2

When only One block was used to insert all the records, why Oracle is reporting only 6 blocks as Unused.

If 2 blocks are used then Orace should visit the second (empty) block which is below HWM increasing the CR count.

Richard Foote - April 25, 2009

Hi Dan

Because the other “missing” block is the segment header which I’ve been mentioning.

This is the first and only extent of the table and being a non-ASSM tablespace, the first block in the extent is the segment header. The second block is the block containing the rows and the remaining 6 blocks are unused.

2. Dan Mamur - April 26, 2009

Thanks Richard for answer my simple question.

Similarly, if we use ASSM, does Oracle need 3 blocks to store header information?

SQL> exec show_space(‘SMALL’, ‘TEST’);
Unformatted Blocks …………………0
FS1 Blocks (0-25) …………………0
FS2 Blocks (25-50) …………………0
FS3 Blocks (50-75) …………………0
FS4 Blocks (75-100)…………………0
Full Blocks …………………1
Total Blocks……………………….8
Total Bytes………………………..65536
Total MBytes……………………….0
Unused Blocks………………………4
Unused Bytes……………………….32768
Last Used Ext FileId………………..5
Last Used Ext BlockId……………….156633
Last Used Block…………………….4

PL/SQL procedure successfully completed.

Richard Foote - April 26, 2009

Hi Dan

That’s right, if you refer to the segment space management bitmaps as being part of header information.

In your above example, which must be a segment in a ASSM tablespace:

The first block is the first level bitmap block for the ASSM storage management.

The second block is the second level bitmap block for the ASSM storage management (ASSM bitmap must have at least 2 levels).

The third block is the segment header.

The fourth block contains all the 100 rows (although this might not be the case if you don’t follow the demo exactly and insert all 100 rows within the same transaction).

So for the purposes of the demo, it doesn’t matter if the tablespace is within an ASSM tablespace or not, the results will be the same as all the data will be within 1 data block.

And his is what I’m going to highlight, that an index can be more efficient when accessing data from a table even when the table is so small as to store all it’s rows effectively within just one data block, because there has to be other block the FTS has to deal with.

3. Dan Mamur - April 26, 2009

Richard,

You guys are great. You not only share your knowledge but also take time to answer simple (and stupid) questions.

Richard Foote - April 27, 2009

Hi Dan

There’s nothing wrong with your questions, no doubt others will find them helpful as well.

Although I sometimes hear strange voices in head, there’s only the one guy here I assure you ;)

4. Statistique - May 2, 2009

I hear strange voices too ! Happy too heard than I’m “normal” !

Great series of posts btw !

Richard Foote - May 2, 2009

Hi Statistique

Who ever said I was normal ;)

5. B. Polarski - May 25, 2009

I don’t understand : You state that the First Fetch result in 3 CR (including an extra to confirm that there is no more block of interrest). Thus I translate that the First fetch is performed during the Second CR since I must place an extra CR that return null block and it becomes the third CR. : We should then represent this like:

-First CR : Read table header (Real/IO)
-Scnd CR: Read the 1st table data block (Real/IO) + Fetch first row
-Thrd CR : Return null to confirm there is no more BLOCK of interrest
sometime later:
-Forth CR : Return nulll to confirm there is no more ROW of interrest

But I don’t see this: The 10046 reports the first Fetch as third CR that return a row. In this case these is a separate CR to Write the IO and a CR to exploit the written block. In fact I don’t see the extra null block fetch unless it is the Fourth CR and then it mix the null block and the null row. Our schema becomes:

-First CR : Read table header (Real/IO)
-Scnd CR: Read the 1st table data block (Real/IO)
-Thrd CR : Fetch first row
-Forth CR : Return nulll to confirm there is no more BLOCK and ROW of interrest

What’s wrong?

Richard Foote - May 25, 2009

In summary, 2 CRs are performed on the table header and 1 CR on the table block during the first fetch while 1 CR is performed on the table block during the 2nd redundent fetch.

I “think” your missunderstanding is in thinking that the first fetch is performed by the 3rd CR whereas what the trace is detailing is that the first fetch performed a total of 3 CRs,

6. B. Polarski - May 25, 2009

Intuitively, I would have say that the first CR is for the data dictionnary to give location the table header, but I don’t see any recursive sys query in the 10046, so I will conclude the location of the table header is part of the parsing.

From this article :
-“Oracle needs to read the table segment header” that’s 1 CR on the header
-“in order to determine what blocks need to be accessed to begin the FTS operation.” that’s one CR on the data

Then comes the fetch and Oracle notes 3 CR and honestly I don’t see any rationale to go back visiting the header before or after the fetch. On the other hand, 2 CR on the data block, also puzzle me: it would means one CR while loading from the IO into the block but this is the less ‘unprobable’. Does oracle takes a latch before inserting the data from the IO and count it as a CR then a real CR when it reads the block to extract row 42.

As of the CR to determine no more block to read, I could reproduce “Vyacheslav Kryuchkov” desmonstration that it is a pure SQL*PLus feature and it poses much new questions on its own.

7. Richard Foote - May 27, 2009

That’s correct, the Oracle code path doesn’t execute recursive SQL to determine the segment header (or index root block) location, the information is not stored within a buffer within the buffer cache (in shared pool).

The additional visit to the data block is required with SQL*PLUS as it always returns just the one row (if it exists) with the first fetch, regardless of the number of possible rows it could return (or the arraysize) and if successful, only checks to see if more rows need to be returned with the second fetch. I’ve described the second fetch as “redundant” as it could possibly determine there are no more rows of interest while performing the first fetch. Consider it a feature ;)

I’m not sure what you mean by a real CR ? A consistent get is simply a block being accessed within the buffer cache which requires a latch if it’s just an examination or two latches if it needs to pinned as part of this processing.

I don’t particularly have an issue with Oracle accessing the segment header twice, it needs 2 key pieces of information, that being where to start (via the extent list) and where to stop (via the HWM). Remember, this is a FTS we’re talking about so an unnecessary or extra CR here or there for perhaps a more simplistic code path generally wouldn’t make much difference with more “conventional” larger FTS. But these smaller ones, performed millions of times in a period, any potential inefficiencies can all add up.

I would recommend having a look at the 10200 and 10202 trace events if you’re interesting in learning more on what blocks are being accessed during a consistent read. It doesn’t always include all the details (segment headers being a common example) but it can provide useful information.

8. Naseer - March 15, 2013

Hi Richard,

This post is really good. Thanks for spending lot of time and sharing your knowledge.

I have one question regarding this post:
why the oracle is performing the event ‘db file sequential read’ ? As there are no indexes and it is going to full table scan then the event should be “db file scattered read”.

Regards,
Naseer.

Richard Foote - April 30, 2013

Hi Naseer

Oracle only performs the db file scattered read when it might read multiple blocks. In this example, such as reading the segment header, Oracle is only going to read one block and one block only and so uses the db file sequential read. So a FTS on a tiny table such as this doesn’t perform scattered reads.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 1,822 other followers

%d bloggers like this: