jump to navigation

Larger Block Tablespace For Indexes Revisted: Part II (Money) February 23, 2009

Posted by Richard Foote in Index Block Size, Index Height, Index Rebuild, Oracle Indexes, Oracle Myths.
trackback

In Part I I looked at Robin Schumacher’s “classic” example of the “so-called” benefits of rebuilding an index into a larger block tablespace. I started by highlighting that by simply rebuilding an index in a larger block tablespace and (say) halving the number of associated index blocks, it doesn’t necessarily mean the index will result in a “flatter structure” , that the index height will reduce. Robin’s demo is in fact a perfect example of this.

Let me start Part II by following this up with a little story …

“I went with a friend of mine to get some cash from the bank the other day in order to buy the latest David Bowie Box-Set. I got my cash in $50 notes but being an efficient, cost saving sort of bloke, my mate got out the same amount of cash in $100 notes. Although we both had the same amount of cash, his wallet was that little bit more compact and “efficient” than mine as he had less actual bank notes to carry.

However, when we got to the record store, we were surprised to discover that the actual “cost” of the David Bowie Box-Set was 2 bank notes of any domination, but with no change being given. Therefore, it cost me 2 x $50 notes to make my purchase. Unfortunately for my mate, it cost him 2 x $100 for the same thing as $100 notes were the smallest denomination he could use. Yes, I was a little bit mean not lending him some of my $50 notes but I was a little disappointed myself for not having any $5 notes on me at the time 😉 ”

OK, it’s not a perfect analogy but you perhaps get the point …

If you have to pay with a quantity of bank notes and you only have larger bank notes, you end up paying more than you would if you could only have paid with the same number of smaller denomination bank notes.

If you have to pay for an index scan in database blocks and you’re forced to use larger index blocks, you actually end up paying more if you have to read the same number of index blocks anyways.

This is one of the potential dangers with rebuilding indexes in a larger block tablespace. And like I said, Robin’s little demo is a perfect example of this. You might indeed reduce and halve the number of index blocks but this might not be sufficient to actually flatten the index structure and reduce the actual height of the index. The height of the index after rebuilding the index can remain the same, so the minimum cost of performing a small range scan increases. Even if you reduce the index height, you can still end up paying more if the savings do not compensate you enough for the additional overhead associated with now having to read and process larger index blocks. 

Therefore, you potentially start paying more for smaller index range scans because you might not actually reduce the number of index blocks you visit for these types of index scans.

Taking exactly the same table/index definitions and data used to replicate Robin’s example in Part I, let’s see if there’s any difference in the costs associated with performing a number of small index range scans after rebuilding the index in a 16K block tablespace.

Again, it’s a simple case of just giving it a go and see for yourself. What resources are used if you perform a series of small index scans ? Do things really run faster ? Do we really use less resources ?  Having rebuilt such an index in a larger block tablespace and halved the number of associated leaf blocks, are we really better off ?

Let’s begin by setting up the same example as Robin’s demo as before …

SQL> create table bowie (id number not null, value varchar2(10));

Table created.

SQL> insert into bowie select rownum, ‘BOWIE’ from dual connect by level <=187200;

187200 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>’BOWIE’, cascade=>true, estimate_percent=>null, method_opt=> ‘FOR ALL COLUMNS SIZE 1’);

PL/SQL procedure successfully completed.

SQL> select index_name, blevel from user_indexes where index_name = ‘BOWIE_IDX’;

INDEX_NAME BLEVEL
---------- ------
BOWIE_IDX       1

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select height, btree_space, used_space from index_stats;

HEIGHT BTREE_SPACE USED_SPACE
------ ----------- ----------
     2     3336032    2988168

 

Note the index height when built in an 8K block tablespace is 2 …

Let’s now capture the current amount of CPU used by the session:

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 134 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’);

NAME                          VALUE
------------------------ ----------
CPU used by this session      36555
consistent gets            22520369
physical reads                 3750

 

Now we run a series of small index range scans …

SQL> set timing on

SQL> declare
  2  v_id    number;
  3  v_value varchar2(10);
  4  begin
  5   for o in 1..10 loop
  6    for i in 1..187200 loop
  7    select id, value into v_id, v_value from bowie where id = i;
  8    end loop;
  9  end loop;
 10  end;
 11  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:28.42

 

Let’s see how our resource stats have changed …

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 134 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’);

NAME                          VALUE
------------------------ ----------
CPU used by this session      45346
consistent gets            28140519
physical reads                 3750

 

We note we have used approximately 87.91 CPU seconds. (Note: You can run this a number of times and determine an average figure).

Let’s now rebuild the index again in a 16K block tablespace:

SQL> alter index bowie_idx rebuild tablespace ts_16k;

Index altered.

SQL> select index_name, blevel from user_indexes where index_name = ‘BOWIE_IDX’;

INDEX_NAME BLEVEL
---------- ------
BOWIE_IDX       1

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select height, btree_space, used_space from index_stats;

    HEIGHT BTREE_SPACE USED_SPACE
---------- ----------- ----------
         2     3351776    2985662 

Note the index height remains at 2 …

If we run the same series of small index scans:

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 134 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’);

NAME                          VALUE
------------------------ ----------
CPU used by this session      45381
consistent gets            28142640
physical reads                 3957

SQL> declare
  2  v_id    number;
  3  v_value varchar2(10);
  4  begin
  5   for o in 1..10 loop
  6    for i in 1..187200 loop
  7    select id, value into v_id, v_value from bowie where id = i;
  8    end loop;
  9  end loop;
 10  end;
 11  /

PL/SQL procedure successfully completed.

Elapsed: 00:01:42.44

SQL> select n.name, s.value from v$sesstat s, v$statname n where s.statistic# = n.statistic# and s.sid = 134 and (n.name = ‘CPU used by this session’ or n.name = ‘consistent gets’ or n.name = ‘physical reads’);

NAME                          VALUE
------------------------ ----------
CPU used by this session      54484
consistent gets            33760690
physical reads                 3957

We note that elapsed times have increased and we have now increased our overall CPU consumption to 91.03 CPU seconds as well.

As we can see, there has been no advantage with rebuilding the in index in the 16K block tablespace for these smaller index scans. In fact, there’s actually been an increase in the overall elapsed times and an increase in the overall CPU. Performance has not improved but has in fact worsened overall for these queries after rebuilding the index in a larger block tablespace.

You begin to get the point …

And of course, indexes are generally far more typically to be used in small index range scan operations than they are in performing Index Fast Full Index Scans. Just compare the numbers of index fast full scans vs. index fetch by key operations in your databases if you want some indication. 

However, Robin’s specific example used a SQL statement that performed an Index Fast Full Scan. Surely, such operations would improve dramatically if indexes were only rebuilt in a larger block tablespace ? Surely such operations would be able to read data more quickly, especially if we have to go to disk, as we would be able to read more data with each associated multiblock read if the index were built in a larger block tablespace ? Surely it would be worth all the extra effort and management considerations ?

Things will run at least 2 times faster, maybe even 150 times faster, right  😉

We’ll see how the results can be a tad “disappointing” in the next post …

Comments»

1. chris_c - February 25, 2009

Interesting as well is that the 16K block size did only 2100 less consistant gets than the 8k, and pulled data out at 0.84GB/s rather than 0.48GB/s does this mean the 16k example would get significantly worse if we could really load the system up as we would hit memory bandwidth issue sooner?
It seems the only way anyone ever justifies this is get a fullsize copy of your database, load it up on identical hardware rebuild everything and test it, has anyone come up with a set of criteria where they can say if you see this type of behaviour and measure x,y and z then its worth testing because you will see an improvement in a speficic type of process?
I’ve had a hard time in th epast justifying the requirement to have a production sized test/dev enviroment without then spending a week or two perfomrance testing something I don’t know why its going to work.

Like

2. Richard Foote - February 25, 2009

Hi Chris

A point I want to emphasis is that these posts relate specifically to the issue of moving indexes into a larger block tablespace, rather than setting the entire database to a larger block size, which is a somewhat different (albeit related) discussion.

The 16K block size did 2100 less CRs because that’s approximately 1/2 the CRs of the 8K block size. Double the block size, you roughly 1/2 the CRs.

However, that doesn’t means you “pull data out” any quicker, because the manner and the amount of data that is pulled out is not actually determined by the block size.

So even though you double the block size in a tablespace, you don’t necessarily pull data out any quicker. I’ll expand on this point in a future post.

Like

3. Robert Klemme - February 25, 2009

For me the fact that the non default block sized index goes into its own buffer cache is a big warning sign that this might not be a too good idea in itself. In my experience whenever there is a mechanism that will reserve parts of a resource for a particular task (in this case buffer cache memory) results are most of the time disappointing. In other words: systems built for automatic resource allocation (e.g. via LRU algorithms) usually do this better than we humans. Human intervention also has the dramatic downside that it’s static: if load changes for whatever reasons, you must check and adjust the manual allocation as well. This is one more manual task you have to do – and probably more often than not it’s forgotten.

The only area where this seems reasonable is when there is a task that must have certain amount of resources available whenever it is executed. Even then prioritizing mechanisms usually work better because they do not require part of the resources to sit there idle most of the time while still allowing the high prio task to get to work as soon as possible.

My 0.03 EUR

Like

4. chris_c - February 25, 2009

Sorry my comment wasn’t too clear,
One thing I said the 16K index only did 2100 less CRs, but looking at the whole process (unless I’m being thick) it did 5,620,150 CR’s with the 8k block (28140519-22520369) and 5,618,050 CR’s in the 16k block size (33760690-28142640) so saving 2100 logical IO’s a massive saving of 0.04%.

So moving to the 16k blocksize means we are forced to read far more data from memory to get the same result I’m a little supprised that the 16k index wasn’t even worse as you had to retrive 85.72 GB worth of blocks compared to 42.88 GB in 8k to get exactly the same result effectivly doubling the Logical IO but throwing much more away.

With this specific workload would you see a better response by putting the index in a smaller blocksize? and isn’t that the whole problem with the idea of multiple block size databases you need to analyse every query and index and then test every possible combination of blocksize, then re-analyse and test every time you make a change (possibly even you would find diferent times of the day required changes as workload changes through the day), maybee we need to schedule regular rebuilds of indexes in different blocksizes (2k at 9am, 16k at 5pm) to fit different workloads that would keep a few consultants in business.

Like

5. Richard Foote - February 25, 2009

Hi Robert

Agreed. Even assuming minor possible benefits in specific scenarios, having a separate cache means risking wasting memory by caching objects unnecessarily and/or additional I/Os for aging objects out of cache prematurely, which could counter any such benefits.

You make a very valid point, even though at current currency conversion rates, it’s a little expensive in AUS$ 😉

Like

6. Richard Foote - February 25, 2009

Hi Chris

Sorry, yes I see your point (I’m the thick one) !!

You’re quite correct with your general comments. What you might save by placing indexes in a larger block in one specific scenario, you would lose in another, while most times it would make little to no difference at all. So a specifc Index fast Full Scan which might run a touch faster in one instance while a specific set of range scans run slower.

Add to that the additional complexity of such environment, the issues with the CBO trying to cost things accurately, the various bugs and issues you hit and the fact not much actually changes, the whole exercise soon becomes pointless or worse.

The key points I want to make is that many of the so-called benefits, many of the so-called reasons for moving indexes into a larger block tablespace are simply not valid in the first place.

Like

7. SeánMacGC - February 27, 2009

As ever Richard, you’re the devil in the detail… don’t ever cease! ;o)

Like


Leave a comment