
Larger Block Tablespace For Indexes Revisited Part III (Prove Yourself) March 2, 2009

Posted by Richard Foote in Index Block Size, Index Height, Index Internals, Index Rebuild, Oracle Indexes, Oracle Myths.

Time to look a little at an Index Fast Full Scan (IFFS) with respect to moving indexes into a larger block tablespace.

To start, a few points about an IFFS. When performing an IFFS, Oracle will read each and every block in the index structure, including all branch and leaf blocks (up to the HWM), using multi-block read operations. So although an IFFS has its place, it's a relatively expensive operation, especially if the index is large, as it needs to read every block in the current index structure. It's therefore not nearly as "common" an operation as a traditional index range scan, especially in OLTP environments. Effectively, an IFFS is the index equivalent of a Full Table Scan (FTS) and is performed in a similar fashion. During an IFFS, Oracle basically treats the index as if it were a smaller version of the table from which it can extract the necessary data.
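As a quick aside, if you want to confirm the CBO really is performing an IFFS, a minimal sketch (assuming the same BOWIE table and BOWIE_IDX index used in the demos below) would be something like:

SQL> explain plan for select /*+ index_ffs(bowie) */ count(*) from bowie;

Explained.

SQL> select * from table(dbms_xplan.display);

The resulting plan should contain an INDEX FAST FULL SCAN of BOWIE_IDX, rather than a TABLE ACCESS FULL of BOWIE or a (single block read based) INDEX FULL SCAN.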

So if an IFFS were indeed very common and could be performed so much more effectively in a larger block tablespace, this begs the question of why not move tables into a larger block tablespace as well, so an FTS can enjoy the same benefits as the index. And if you move both tables and indexes into a larger block tablespace, why not just create the whole database with the larger block size and be done with it? Which brings us around to the question of selecting an appropriate block size for the database, which is a different discussion to the so-called benefits of just moving indexes into a larger block tablespace. For another day perhaps …

Now, although Robin's demo clearly shows consistent gets are basically halved when an index is rebuilt with double the block size (which all sounds very impressive), what it doesn't highlight is that during an IFFS, approximately the same amount of data is still actually being read. Although there are certainly some efficiencies in having fewer logical reads, if each consistent get is more expensive while the overall data being read is similar, you're simply not going to get the enormous performance boost that halving the consistent gets might suggest. Claims that things will suddenly run twice as fast (or 120 times as fast) purely by moving indexes into a larger block tablespace are, shall we say, "exaggerated" to say the least. However, as all index blocks are being read with no subsequent table accesses, an IFFS is going to be as good as it gets when we talk about any possible performance improvements from moving indexes into a larger block tablespace.

Again, it’s all very simple to “give it a go” and see for yourself exactly what improvements one might expect. Move an index into a larger block tablespace and see those IFFS fly. Using exactly the same setup from Robin’s demo as in Parts I and II, let’s see the actual differences …
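For those without Robin's original script to hand, a rough sketch of a comparable setup (the exact column definitions, row count and tablespace names here are my assumptions, not necessarily Robin's) would look something like this:

SQL> create table bowie (id number, name varchar2(30));

Table created.

SQL> insert into bowie select rownum, 'ZIGGY STARDUST' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(id) tablespace ts_8k;

Index created.

SQL> exec dbms_stats.gather_table_stats(user, 'BOWIE', cascade=>true);

PL/SQL procedure successfully completed.

The key point is simply that BOWIE_IDX starts off in a standard 8K block tablespace before later being rebuilt into the 16K block tablespace (TS_16K).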

First, get some current session stats:

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = 'CPU used by this session' or n.name = 'consistent gets' or n.name = 'physical reads' or n.name = 'index fast full scans (full)');

NAME                             VALUE
---------------------------- ---------
CPU used by this session         35856
consistent gets               10685006
physical reads                    2582
index fast full scans (full)     33003

Now execute 1000 IFFS using the index in an 8K block tablespace …
SQL> declare
  2  v_count number;
  3  begin
  4    for i in 1..1000 loop
  5       select /*+ index_ffs(bowie) */ count(*) into v_count from bowie;
  6    end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:11.23

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = 'CPU used by this session' or n.name = 'consistent gets' or n.name = 'physical reads' or n.name = 'index fast full scans (full)');

NAME                             VALUE
---------------------------- ---------
CPU used by this session         36952
consistent gets               11106006
physical reads                    2582
index fast full scans (full)     34003

 
Note the response time is roughly 11.23 secs, mostly consisting of CPU at 10.96 secs (the CPU statistic, recorded in centiseconds, increased by 36952 - 35856 = 1096). Note also that consistent gets increased by 421,000, or 1000 x 421 per IFFS, exactly as one would expect from Robin's demo. Now let's see how things differ when we move the index into a 16K block tablespace.

 
SQL> alter index bowie_idx rebuild tablespace ts_16k;

Index altered.

 
SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = 'CPU used by this session' or n.name = 'consistent gets' or n.name = 'physical reads' or n.name = 'index fast full scans (full)');

NAME                             VALUE
---------------------------- ---------
CPU used by this session         43192
consistent gets               12372513
physical reads                    2789
index fast full scans (full)     40003

SQL> declare
  2  v_count number;
  3  begin
  4    for i in 1..1000 loop
  5       select /*+ index_ffs(bowie) */ count(*) into v_count from bowie;
  6    end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:11.14

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = 'CPU used by this session' or n.name = 'consistent gets' or n.name = 'physical reads' or n.name = 'index fast full scans (full)');

NAME                               VALUE
----------------------------- ----------
CPU used by this session           44260
consistent gets                 12583513
physical reads                      2789
index fast full scans (full)       41003

 
Note there's only a slight improvement in response time (11.14 vs 11.23 secs) and CPU usage of a couple of percentage points, even though the number of consistent gets has halved. Now your mileage may vary a tad, but what you will not see is things suddenly using half the resources or running twice as fast (let alone 120 times as fast), because at the end of the day both sets of IFFS are reading and processing approximately the same amount of data.

Note also that any such minor improvements in IFFS performance are likely to be cancelled out, or more likely overtaken, by the increased costs associated with more typical index range scans due to the index not actually reducing in height, as discussed in Part II. Again, what's more common, smaller index range scans or IFFS?

Robin's demo and the demo above, however, assume the index is fully cached, with no physical I/O (PIO) operations being performed. Now, as an IFFS needs to read the entire index, it's quite likely that an IFFS would require some physical I/O, especially if the index is large. Surely then, if an index is in a larger block tablespace, such physical I/Os would be more efficient, because Oracle would be able to read more data with a multi-block read based on a larger block size?

The answer is unfortunately no.  Another (rather poor) analogy to get the point across.

You go to the bank to take some money out. You’re only allowed to take out a maximum of $1000 per day from your account. You ask for this in $50 notes and you get 20 x $50 = $1000 in total. Your mate wants to take out more money and asks for the money in $100 notes. However, as the limit is set to $1000 a day, regardless of the denomination of the bank notes, he simply gets 10 x $100 = $1000. You both end up with exactly the same amount of money, it’s just that you have more bank notes.

An Oracle multi-block read works in exactly the same manner. There's a maximum amount of data you can read in a single multi-block read, which is calculated as the default block size x db_file_multiblock_read_count. If the default block size is 8K and db_file_multiblock_read_count = 8, then the maximum size of a multi-block read is 8K x 8 = 64K.

If you attempt to perform a multi-block read using a non-default block size, Oracle simply divides this maximum read size (64K here) by the non-default block size to determine how many blocks to read. So if you now attempt to perform a multi-block read from a 16K tablespace, Oracle will only read 4 blocks x 16K = 64K. The I/O size is the same regardless of the block size.
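To spell out the arithmetic being used here (based on the 8K default block size and the db_file_multiblock_read_count of 8 used in this demo):

maximum multi-block read = 8192 bytes x 8            = 65536 bytes (64K)
default 8K tablespace    = 65536 bytes / 8192 bytes  = 8 blocks per read
16K tablespace           = 65536 bytes / 16384 bytes = 4 blocks per read

which is exactly what the blocks=8 and blocks=4 figures in the trace files below will show.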

And this makes perfect sense. If you've tuned the size of a multi-block read perfectly with the default block size in mind, why would you want to (say) suddenly double this perfect size just because you double the block size in another tablespace? If doubling the multi-block read size were to suddenly improve things, why not also double the multi-block read size for the default block size as well …

Again, it's very easy and simple to test for yourself that changing the block size of a specific tablespace makes no difference to the behaviour of an associated multi-block read.

First, flush the buffer cache and trace the session while performing a multi-block read via an IFFS, initially with the default 8K block size index:

SQL> show parameter db_file_m

NAME                             TYPE VALUE
----------------------------- ------- -----
db_file_multiblock_read_count integer     8

SQL> alter system flush buffer_cache;

System altered.

SQL> alter session set events '10046 trace name context forever, level 12';

Session altered.

SQL> select /*+ index_ffs(bowie) */ count(*) from bowie;
 
The trace file will show that 8 blocks at a time (blocks=8) are being read during the IFFS multi-block read operations, as the following extract from the trace file highlights:

 
WAIT #2: nam='db file scattered read' ela= 22897 file#=7 block#=38929 blocks=8 obj#=95742 tim=1217521768059
WAIT #2: nam='db file scattered read' ela= 47158 file#=7 block#=38937 blocks=8 obj#=95742 tim=1217521816003
WAIT #2: nam='db file scattered read' ela= 37776 file#=7 block#=38945 blocks=8 obj#=95742 tim=1217521854530

However, when you trace the session as it performs the IFFS with the index in the 16K block tablespace:

WAIT #1: nam='db file scattered read' ela= 55181 file#=6 block#=1030 blocks=4 obj#=95742 tim=867071895617
WAIT #1: nam='db file scattered read' ela= 68932 file#=6 block#=1034 blocks=4 obj#=95742 tim=867071965238
WAIT #1: nam='db file scattered read' ela= 67136 file#=6 block#=1038 blocks=4 obj#=95742 tim=867072142114
 

We notice that Oracle is now only reading 4 blocks at a time, ensuring that the same 64K is read each time. So increasing the blocksize of an index doesn’t suddenly impact the manner or efficiency in which multi-block PIOs are performed.
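As an aside, if you want to repeat these traces yourself and aren't sure where to find the resultant trace file, one simple approach (assuming 10g style defaults, where trace files are written to the user dump destination) is to tag the file first:

SQL> alter session set tracefile_identifier = 'IFFS_TEST';

Session altered.

SQL> show parameter user_dump_dest

The trace file written to that directory will then contain IFFS_TEST in its name, making it easy to spot.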

Next, we'll perform a similar test to the one before, comparing the 8K and 16K block index during an IFFS, but this time with PIOs introduced. There are various ways one could do this: make the buffer cache too small to fit the index, or make the index too big for the buffer cache. Keeping with Robin's demo, I'll use exactly the same index definitions, but this time run the following procedure in another session to constantly flush the buffer cache. So in one session:

SQL> begin
  2  for i in 1..10000 loop
  3    execute immediate ('alter system flush buffer_cache');
  4  end loop;
  5  end;
  6  /

PL/SQL procedure successfully completed.

 
Meanwhile in another session, run a number of IFFS (200 this time) with an 8K block size index:

 
SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = 'CPU used by this session' or n.name = 'consistent gets' or n.name = 'physical reads' or n.name = 'index fast full scans (full)');

NAME                             VALUE
---------------------------- ---------
CPU used by this session          4851
consistent gets                1536024
physical reads                  423992
index fast full scans (full)      3205

 
SQL> declare
  2  v_count number;
  3  begin
  4    for i in 1..200 loop
  5     select /*+ index_ffs(bowie) */ count(*) into v_count from bowie;
  6   end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:35.92

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = 'CPU used by this session' or n.name = 'consistent gets' or n.name = 'physical reads' or n.name = 'index fast full scans (full)');

NAME                             VALUE
---------------------------- ---------
CPU used by this session          5248
consistent gets                1620224
physical reads                  507651
index fast full scans (full)      3405

 

Note the response time is now 35.92 secs, of which only approximately 3.97 secs is CPU. This time we have performed a whole bunch of PIO operations.

Same thing, this time with the 16K index:

 
SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = 'CPU used by this session' or n.name = 'consistent gets' or n.name = 'physical reads' or n.name = 'index fast full scans (full)');

NAME                             VALUE
---------------------------- ---------
CPU used by this session          6695
consistent gets                1789531
physical reads                  633216
index fast full scans (full)      4205

 
SQL> declare
  2  v_count number;
  3  begin
  4    for i in 1..200 loop
  5     select /*+ index_ffs(bowie) */ count(*) into v_count from bowie;
  6   end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

Elapsed: 00:00:36.59

 
SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 137 and (n.name = 'CPU used by this session' or n.name = 'consistent gets' or n.name = 'physical reads' or n.name = 'index fast full scans (full)');

NAME                             VALUE
---------------------------- ---------
CPU used by this session          7118
consistent gets                1831731
physical reads                  674861
index fast full scans (full)      4405

 
We now note that the 16K block index has a slightly slower response time at 36.59 secs (vs 35.92 secs) and uses a tad more CPU at 4.23 secs (vs 3.97 secs). As soon as we introduce PIOs, they contribute significantly to the overall costs and may well result in the larger block index being more costly, depending on how the PIO is performed. So even in Robin's example, depending on the nature of the PIOs, even an IFFS can potentially run slower with an index in a larger block tablespace. Remember of course that larger indexes are less likely to be fully cached and are therefore more likely to incur PIOs during an IFFS.

Here’s a demo I’ve used before that uses a different approach to highlight the same issue (which doesn’t rely on a different session “interfering” with things). Another excellent example is this one by Greg Rahn on this OTN thread where he shows an IFFS performing a tad slower in a larger block tablespace. As I’ve been saying, just give it a go and see for yourself.

Perhaps, simply halving the consistent gets when doubling the index block size doesn’t tell the whole story …

The next time you see someone reference Robin's demo as some sort of "proof" that by moving indexes into a larger block tablespace and (perhaps) reducing the number of consistent reads, the index will somehow be flatter and/or more efficient, just remember that the index may not necessarily be flatter (it might actually have the same height) and that subsequent index operations, including index range scans and IFFS, could very well be more expensive, not less expensive, to perform.

The next time you see someone claim that by moving indexes into a larger block tablespace, performance has suddenly improved by 100% or that things suddenly run 120 times faster, just ask one simple question: why? Ask what else has changed and what else might be contributing to the incredible performance improvements, because when you actually test things out for yourself, you'll notice the results are nowhere near as "impressive".

Comments

1. Brian Tkatch - March 3, 2009

Good examples and nice summation. Thanx Richard.

2. Vivek Yalla - March 5, 2009

Good info

3. Norman Dunbar - March 5, 2009

Morning Richard,

in a word, Brilliant!

Thanks.

Cheers,
Norm.

4. Nataraj - March 9, 2009

Really nice article,

Do you have a formula to calculate the index fast full scan cost, along with OICA & OIC, for Global & Local Partitioned Indexes?

Vers – 10.2.0.3 / No histograms / Bind variables/ 2 Table Joins.

Thanks in adv.

5. Naresh Bhandare - March 11, 2009

clear demo. Thanks.

6. Richard Foote - March 26, 2009

Hi Nataraj

The cost calculations of an index fast full scan depend on whether you're using the I/O or CPU costing model. It's very basically derived from the number of blocks in the index partition segments, divided by the effective multi-block read count as determined by the CBO. Although under the CPU costing model other factors, such as the average time for a multi-block read, the average time for a single block read and CPU related costs, can all come into play, neither the OICA nor the OIC impact the costings.

7. Neeraj Bhatia - September 15, 2010

Hi Richard,

May I know the link to Robin's demo which is referred to in this post? It seems to be in Part I of this series.

Thanks,

8. Neeraj Bhatia - September 15, 2010

Got it. Sorry for the trouble. Actually, related posts were not popping up previously 🙂

Richard Foote - September 17, 2010

Hi Neeraj

No worries, glad it’s all sorted.

9. Book Review: Oracle Tuning: The Definitive Reference Second Edition « Charles Hooper's Oracle Notes - November 10, 2010

[…] Page 149 states, “The amount of logical reads has been cut in half simply by using the new 16K tablespace and accompanying 16K data cache.”  It has been demonstrated to the book author multiple times why this test case is flawed (reference reference2 reference3 reference4). […]
