jump to navigation

Indexes And Small Tables Part IV (Treefingers) May 5, 2009

Posted by Richard Foote in Index Block Splits, Index Internals, Oracle Indexes, Root Index Block, Small Indexes.

As I asked in my previous post, the key question when comparing the associated costs of accessing a small table via a Full Table Scan (FTS) vs. an index scan is why does Oracle visit the segment header during a FTS but not during an index scan ?

The answer all comes down to understanding why Oracle must visit the table segment header during a FTS and how Oracle can avoid visiting the index segment header during an index scan.

Oracle must visit the table segment header during a FTS because it contains vital information necessary to perform the FTS, namely the extent map and the High Water Mark (HWM) associated with the table. Even with a table that only contains 1 data block worth of rows as in the SMALL table in my examples, Oracle has no way of automatically determining there’s actually only just the one block worth of data. It has to somehow look that up and determine exactly what table blocks Oracle needs to access during the FTS operation and the table segment header contains this necessary meta data. During “most” FTS operations, which are generally speaking larger, “expensive” operations, these accesses to the table segment header constitute a relatively small overhead. However, for FTS operations on “small” tables, accessing the table segment header can actually be a significant proportion of the overall associated costs.

During an index scan operation, there’s nothing of interest within the index segment header. The critical index block, the index block by which all index scans must start is the root block of the index (except Fast Full Index Scans which are basically the FTS equivalent for indexes). There’s no need to access the index segment header because it’s the root block that actually contains all the necessary information by which to start the index scan operation. The root blocks contains the pointers to subsequent index blocks (be it a branch or leaf blocks) that Oracle needs to follow in order to find the index entry of interest. The key to starting an index scan therefore is in determining the location of the index root block.

But how can Oracle determine the location of the index root block ?

Well Oracle implements a little “trick”, a golden rule with regard to indexes that doesn’t change regardless of the Oracle version, regardless of the O/S version, regardless of the type of tablespace or tablespace option of the index and regardless of how the index is created or grows and block splits over time.

The index root block is always, always, always the block immediately after the index segment header.


Therefore, when the Oracle code issues the associated function calls to perform an index scan, the first index block that Oracle assesses is the index segment header plus an offset of 1. Whereas a FTS accesses the table segment header, an index scan accesses the index segment block id plus 1.

With a tiny index that only has a level of 0 (or height of 1), note there is not “root” block as such as all the index entries can fit within one index leaf block. However, this block, this one and only leaf block within the index structure is also always the block immediately after the index segment header.


When we add more index entries into this one and only leaf block, we’ll eventually reach a point when it’s full and Oracle must perform an index block split operation. Oracle will then allocate 2 new blocks to the index. Assuming a 50-50 block split, one of these new blocks is assigned the lower 1/2 of all the current index entry values and the other new block is assigned the other upper 1/2 of the current index entry values. The original index leaf block content is then cleaned out and reassigned with just relative block address pointers and value boundaries associated with the 2 new leaf blocks.

The original index leaf block has been “reborn” as the root block of the index.

It’s quite easy to demonstrate how the original index block in a level 0 index or the root block of an index never changes and is always the block that follows the index segment header.

First, just create a little table and associated index:

SQL> CREATE TABLE same_root (id NUMBER, name VARCHAR2(30));

Table created.

SQL> INSERT INTO same_root VALUES (1, ‘The Thin White Duke’);

1 row created.


Commit complete.

SQL> CREATE INDEX same_root_i ON same_root(name);

Index created.

If we dump the block immediately following the index segment header, we can confirm it’s our index block of interest, containing our one and only index entry:

SQL> select header_file, header_block from dba_segments where segment_name=’SAME_ROOT_I’;

----------- ------------
          5       107441

SQL> alter system dump datafile 5 block 107442;

System altered.

Following is an extract of the index block dump:

Leaf block dump
header address 98959964=0x5e6025c
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 1
kdxcofbo 38=0x26
kdxcofeo 8007=0x1f47
kdxcoavs 7969
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8007] flag: ——, lock: 0, len=29
col 0; len 19; (19):  54 68 65 20 54 68 69 6e 20 57 68 69 74 65 20 44 75 6b 65
col 1; len 6; (6):  01 41 a3 aa 00 00
—– end of leaf block dump —–

Note that it is indeed our one and only index leaf block.

If we now take a treedump of the index:

SQL> SELECT object_id FROM dba_objects where object_name = ‘SAME_ROOT_I’;


SQL> ALTER SESSION SET EVENTS ‘immediate trace name treedump level 67721’;

Session altered.


Following is the treedump output:

—– begin tree dump
leaf: 0x141a3b2 21078962(0: nrow: 1 rrow: 1)
—– end tree dump

We note that the relative block address of our one and only index leaf block is 21078962.

If we now add a whole bunch of new rows to the table so that the leaf block can no longer hold all the index entries, thereby forcing the index to block split and grow:

SQL> insert into same_root select rownum+1, ‘David Bowie’ from dual connect by level <=100000;

100000 rows created.

SQL> commit;

Commit complete.

And now take a block dump of the same index block:

SQL> alter system dump datafile 5 block 107442;

System altered.

Branch block dump
header address 98959940=0x5e60244
kdxcolev 2
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 2
kdxconro 2
kdxcofbo 32=0x20
kdxcofeo 8016=0x1f50
kdxcoavs 7984
kdxbrlmc 21238606=0x144134e
kdxbrsno 1
kdxbrbksz 8060
kdxbr2urrc 0
row#0[8038] dba: 21238607=0x144134f
col 0; len 11; (11):  44 61 76 69 64 20 42 6f 77 69 65
col 1; len 5; (5):  01 41 be 5f 01
row#1[8016] dba: 21238769=0x14413f1
col 0; len 11; (11):  44 61 76 69 64 20 42 6f 77 69 65
col 1; len 5; (5):  01 44 12 ba 01
—– end of branch block dump —–

We notice that the block is no longer a leaf block but has changed itself into an index branch block.

But not just any branch block. If we now take a new treedump:

SQL> ALTER SESSION SET EVENTS ‘immediate trace name treedump level 67721’;

Session altered.

branch: 0x141a3b2 21078962(0: nrow: 3, level: 2)
   branch: 0x144134e 21238606 (-1: nrow: 161, level: 1)
      leaf: 0x141a3b3 21078963 (-1: nrow: 179 rrow: 179)
      leaf: 0x141a3b4 21078964 (0: nrow: 179 rrow: 179)
      leaf: 0x141a3b5 21078965 (1: nrow: 179 rrow: 179)
      leaf: 0x141a3b6 21078966 (2: nrow: 179 rrow: 179)
      leaf: 0x141a3b7 21078967 (3: nrow: 179 rrow: 179)


The above partial listing of the treedump clearly shows that the index has grown from a level 0 index to a level 2 index and that the root block is the very same index block as the original leaf block listed as it has the same relative block address as before (21078962).

Indeed the original leaf block is now the index root block which is still the same block that immediately follows the index segment header.

Because Oracle doesn’t have to visit the index segment header and can simply directly access the block following the index segment header as this block is always the first index block of interest when performing an index scan, the index scan has that little advantage over the FTS. And it’s this little advantage that can give the index scan the edge over a FTS, even if we’re potentially accessing data from a very small table.

And you can’t get much smaller than a table that has all it’s rows in the one table block.

So far though, the example I’ve shown has been a “normal”, everyday, non-unique index that has a 1 consistent get advantage over the FTS when accessing a row of interest. I’ll next discuss how indexes can having an even bigger edge and more significant advantage over a FTS of a tiny table,  than just the 1 consistent get …


1. Brian Tkatch - May 5, 2009


Escpeicallt the point, that is the difference of where the header block is, quite clear. Thanx!


Richard Foote - May 5, 2009

No worries, glad you find it all useful !!


2. visitor - May 5, 2009

I really feel amazed by the amount of R&D you guys do.By you guys i mean you,Jonathon,Tanel poder etc.I mean is it just for interest or are you get paid for exploring things?In contrast,Tom Kyte believes in knowing the necessary things only and leave the rest to support.


Richard Foote - May 5, 2009

Hi Visitor

Understanding Oracle is basically my job, it’s my responsibilty, so in a manner yes I get paid for ensuring I have the necessary skills and knowledge to know what I need to know to keep my databases running effectively, securely and in a protected manner.

I don’t know the context of Tom’s comment but how does one know and define “the necessary things”.

For example, as I’m trying to explain with these posts, if you don’t know and understand that indexes can be helpful with small tables, that heavily hit small lookup tables accessed via indexes rather than through FTS you can save significant resources, reduce significant contention and increase the overall performance and scalabilty of your applications, then how do you know you’re missing vital understanding and something that is “necessary” to do your job properly.

The problem with knowing just the necessary things is knowing what is necessary …

Jonathan, Tanel and the other “guys” do the research so that they’re in the best position possible to do the best job possible, perhaps even knowing things that others don’t deem “necessary” but is actually necessary after all when the shit hits the fan.

They actually use their “skills” and knowledge and expertise on a daily basis because in the end, you never know what’s actually necessary until the time comes 😉


3. visitor - May 5, 2009

Let me clarify that I am not against anything.I really appreciate the amount of effort you guys put to explore new things out of the box.
By necessary I mean whatever documented…?
I only feel little hesitant in knowing things that are practically less feasible.
For example,How would it help me as a DBA by knowing that
how oracle determines the location of the index root block?
Ultimately it has to determine.
But I am not against anyone.I really admire you guys but I like Tom Kyte approach better.If you read his books and articles,he never go deep and limits himself to the necessary basics.


Richard Foote - May 5, 2009

Thanks for your comments.

I’m not too sure that Tom’s approach is any different. He can go into a lot of detail and depth and does so as necessary. There’s no doubt that Tom’s depth of Oracle knowledge is far deeper than mine will ever be 😉

Does Tom know how Oracle determines the location of an index root block. Very likely. By knowing, it helps him understand why an index can be more efficient in some cases over a FTS. By knowing how an index can be more efficient, one can ensure indexes are created and used when appropriate. By creating and using indexes appropriately, one can ensure an application performs as required and is scalable.

All this can actually ultimately stem back to knowing that little piece of information regarding how Oracle can directly access an index root block (or whatever).

Having met Tom and having read his books and articles, Tom has an very very deep understanding of Oracle. It’s how he can explain so many things so easily and so accurately 😉


Jonathan Lewis - May 6, 2009

Just to add my perspective to this question: most of the arcane information I know is stuff I have researched because I needed to know it to address a client problem, or to work out the most efficient implementation of a critical feature of a client system.

Some of the information I just happened to learn by accident as I was investigating something else – for example the root block position is something I learnt because I was investigating some other feature of index behaviour and then happened to notice that the root block seemed to be the first block after the segment header … so I spent a few extra minutes checking that this was not a coincidence.

Finally, one of the reasons for looking into this information, or bothering to remember it and record how to prove it, is that sometimes you see people making stupid suggestions about how to fix a problem or use a feature because they DON’T know how things work – and it’s nice to be able to help them avoid a mistake.

Jonathan Lewis


Richard Foote - May 7, 2009

Hi Jonathan

Indeed, arcane information has a habit of being useful now and again.

It’s similar to my Dad’s garage which is just full of the most amazing tools, bit’s of wood, wires, boxes of things, computer parts, books and books and books and the most arcane “stuff” imaginable.

He keeps it all because it was all useful one day and might be again.

He doesn’t have to go to “support” for anything really, he knows how pretty well everything works (from computers to cars to nuclear physics to whatever) and knows and has the tools to fix things when necessary. He’s actually a scientist by profession and a genius, it’s just such a shame it all skips a generation 😦

The point being knowing how things work and having the skills and tools to fix things generally involves storing and using lots of so-called arcane information along the way !!


4. Hans-Peter Sloot - May 9, 2009

Hello Richard,

In part 1 you show that the FTS does 4 consistent gets.
How does this relate to the multiblock read count?
I see a db file scattered read.

I have created a table with a single row
WAIT #2: nam=’db file sequential read’ ela= 8387 file#=5 block#=1098499 blocks=1 obj#=76164 tim=175627665001
WAIT #2: nam=’db file scattered read’ ela= 5517 file#=5 block#=1098500 blocks=5 obj#=76164 tim=175627670986

Tablespace is autoallocate.

regards Hans-Peter


5. Richard Foote - May 9, 2009

Hi Hans-Peter

I’m guessing here that you created the table first and then inserted some rows (hence slightly different than my demo). This makes Oracle use the bitmaps to decide which block to use, incrementing the HWM by 5 for your troubles as it may not necessarily decide to just use the first block of the table to reduce possible insert contention.

Therefore, you now need to read 5, not 1 block for this fetch and being a FTS, Oracle uses a multiblock read to do so. It’s a single I/O read request, but requires 5 consistent gets not 1 to read and process the block in memory.

I would guess that the total number of consistent gets in your case will now be 7, 2 to read the segment header info and 5 to read the various data blocks, even though potentially only one of them actually contains a row (Oracle has no way of knowing otherwise).

Note the index will be an even better option in this case as it’s cost will remain the same regardless !! For a non-unique index, it will be 3 consistent gets vs. 7 for the FTS of the maybe only one row table.


Hans-Peter Sloot - May 9, 2009

Hi Riechard,

Your guesses are correct.
Thank you for the clarification.

Regards Hans-Peter


Caglar - January 9, 2014

Hello Richard

First of all, I have to say that I am going to addicted to this blog. I learn lots of new things from you. I am wondering what are the differences between create table as select and after create table insertion? Why CTAS increments the HWM?



Richard Foote - January 9, 2014

Hi Caglar

Asking lots of questions 🙂

There are a number of differences between the two such as the undo volume that’s generated, although it does depend somewhat on how you perform the insert operation into the existing table (ie. use of APPEND or not).

Regarding the HWM, during a CTAS once whatever allocated extent has been filled, Oracle is forced to allocate another extent from the tablespace and increment the HWM. It’s not incremented in the bit by bit method of a conventional insert which uses space below the HWM, but it still needs to be incremented periodically regardless.


6. gabriel - May 13, 2009

i was at a seminar of yours in Perth on Monday and you said never say ‘always’ with Oracle. you said it about half a dozen times in this blog 🙂


Richard Foote - May 13, 2009

Hi Gabriel

LOL !!

I also said never say never so I was wrong on both counts !!

Hope you enjoyed the presentations.


7. Caglar - January 9, 2014

Hello again,

I need to ask you very simple questions.

“Whereas a FTS accesses the table segment header, an index scan accesses the index segment block id plus 1”

First, how come Oracle find out the location of the index segment block id plus 1?

My second question is I thought Oracle use data dictionary in order to locate the index segment block id. So, doesn’t this increament the consistent gets count?

Thanks in advance.


Richard Foote - January 9, 2014

Hi Caglar

In exactly the same way as Oracle finds the location of the table segment header, via info stored in the DD cache. The key difference is that with an index and the known offset to the root block, the first access to the index segment (ie. the all important root block) is directly accessing “useful” information rather than just metadata.

This is designed to make index scans as efficient as possible and avoid any redundant accesses, not so necessary for table scans which are generally expensive operations anyways.


8. Кластерные ожидания внутри одного инстанса RAC и количество LMS-процессов | Oracle mechanics - May 6, 2016

[…] являющегося index root block первичного ключа OLTP […]


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: