Separate Indexes From Tables, Some Thoughts Part II (There There)
April 18, 2008
Posted by Richard Foote in Oracle General, Oracle Indexes, Oracle Myths, Oracle Opinion, Tablespace Management.
In Part I, I discussed how separating indexes from tables won’t likely improve performance as:
- Oracle moves from reading index blocks to table blocks in a sequential manner
- Most of the associated I/Os are likely to be random anyways
- Multi-User environments would result in disk contention regardless
That being said, why is it then that some sites claim performance improvements after separating indexes from tables ? Previously, performance was sluggish; however, after moving indexes into a separate tablespace, performance appears to have picked up. Clearly then, moving indexes into a separate tablespace does improve performance, even if common sense might suggest otherwise.
Well, not quite. Here’s a scenario that’s not entirely uncommon …
Currently, an application has both tables and indexes in the same tablespace. The tablespace consists of various datafiles distributed across (say) 4 physical devices. Most database waits are I/O related with both db file sequential reads and db file scattered reads featuring heavily in performance metric reports. However, I/O performance is somewhat average with slow I/O related wait times and performance is generally suffering as a result.
Maybe, just maybe, the problem is due to having tables and indexes in the same tablespace. Perhaps if we separate the indexes away from the tables, contention will reduce, I/O wait times will decrease and database performance might improve as a result.
So we create a shiny new, index only tablespace spread across (say) 4 additional physical disks and rebuild all our indexes in this new tablespace. To our relief, thankfully, performance has indeed improved. Average I/O wait times have been reduced and overall database performance has improved as a result. Despite what folks like that Richard Foote dude claim, here is clear proof and evidence of performance indeed improving, purely and simply by just separating indexes from their tables.
All we did was pull the wings off the fly and now it won’t take off after clapping our hands. Clear proof that flies go deaf when you pull off their wings …
There are of course two additional, potentially significant events that have also occurred other than just the indexes being separated from the tables.
The first one is that not only have all the indexes been moved to another tablespace, but all indexes have also been rebuilt as a consequence. Now, I’m the last person to get all excited about indexes being rebuilt, however as I’ve gone to great lengths to document, there are rare scenarios when indexes can get fragmented and may benefit from a rebuild. By moving indexes into a new tablespace, we’ve effectively rebuilt all the indexes, the (say) 99% where it wouldn’t have mattered but also the (say) 1% where it may have improved things. We have also rebuilt those indexes where there may be some temporary improvement until the index starts to flesh itself out again.
As a result, there could be all manner of related changes to execution plans and performance generally, especially related to larger index range scans and index fast full scans.
It’s not the indexes being separate from the tables that’s making some difference here, it’s the fact all the indexes have been rebuilt (especially those that were badly fragmented and accessed by large index scans).
The fly isn’t really deaf …
However, the far more significant difference we’ve also made is that we have of course just introduced 4 new physical devices into our database infrastructure. As a result, we may have significantly enhanced our I/O bandwidth and possibly reduced I/O related contention issues. All the general I/O activity related to indexes that was occurring on our initial 4 disk table/index tablespace has been removed and is now occurring on our new, separate 4 disk index only tablespace.
But that’s a good thing right, that’s what we wanted to achieve ?
In the index range scan scenario I discussed in Part I, just note how few of the overall I/Os were related to the index. In larger index range scans where in theory separating indexes might improve performance, very little of the related physical I/O activity is actually attributed to indexes. The index would have to have an extremely low (and rare) clustering factor for index costs to be significant. In most “randomly” distributed index scans, there’s significantly more table related physical I/O activity than index activity.
By moving just the indexes into these new physical devices, we’ve just moved a whole bunch of segments that as a group incur relatively low physical I/O related activity while leaving together all those that result in the majority of physical I/Os.
Yes, we’ve reduced contention and I/O demands on the initial tablespace but as a whole, we haven’t done it very well at all. Yes, we’ve reduced contention and perhaps improved performance, but we could have done it so much better. Yes, it appears separating indexes from tables has improved performance but has it really …
It’s not the separating of indexes from tables that’s improved performance, it’s the fact we’ve introduced 4 new disks and we’ve shifted some of the I/O activity away from the initial tablespace.
The fly isn’t deaf after all …
As an example, previously we had 100% of related I/O activity in the initial table/index, 4 disk tablespace. However, only (say) 20% of the activity was actually related to the indexes, 80% was attributed to all the tables. By moving all the indexes into the new, 4 disk index only tablespace, we therefore reduce the load on the initial tablespace by 20%. We now have 80% of the I/O load on 4 disks and just 20% on the other 4 disks. Yes, performance might improve as a result but we could do so much better. Currently, each of the first 4 disks carries 20% of all segment related load (80% / 4), while each of the other 4 disks carries just 5% (20% / 4).
Instead, if only we either added the 4 disks to the other 4 disk set and striped both tables and indexes across all 8 disks or moved and distributed both indexes and tables into the new 4 disk set, we might have been able to distribute load much more evenly across all 8 disks with approximately 12.5% load across each one.
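The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration of the (say) 80/20 split used above; the function name and the assumption that load spreads perfectly evenly within each disk group are mine, not a model of any real storage layout.

```python
def per_disk_load(groups):
    """Return the percentage of total I/O carried by each disk.

    groups is a list of (share_of_total_io_percent, number_of_disks) pairs.
    Assumes (hypothetically) that load is spread evenly within a group.
    """
    loads = []
    for share, disks in groups:
        loads.extend([share / disks] * disks)
    return loads


# Indexes separated: 80% (tables) on 4 disks, 20% (indexes) on 4 other disks.
separated = per_disk_load([(80.0, 4), (20.0, 4)])

# Tables and indexes striped together across all 8 disks.
striped = per_disk_load([(100.0, 8)])

print("separated:", separated)
print("striped:  ", striped)
print("busiest disk, separated vs striped:", max(separated), max(striped))
```

The number to watch is the busiest disk: 20% of the total load when indexes are separated versus 12.5% when everything is spread across all 8 disks.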
By doing so, we may have improved performance by an even better and more significant amount. Conversely, by separating indexes into their own tablespace, we may actually be hurting general database performance because database performance is not optimal due to the uneven distribution of I/O related activities.
Of course, there’s a very easy way to confirm this. Look at the statistics in V$FILESTAT or look at a statspack report and carefully study the physical I/O activity in the table only and index only tablespaces and compare the results. Just how evenly distributed are the I/O related workloads …
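As a rough sketch of that comparison, suppose you have already pulled the per-datafile physical read and write counts (the PHYRDS and PHYWRTS columns of V$FILESTAT) along with each file’s tablespace name. The Python below simply totals them per tablespace and reports each tablespace’s share. The tablespace names and all the numbers are made up purely for illustration, and the function name is hypothetical.

```python
from collections import defaultdict


def io_share_by_tablespace(rows):
    """Return each tablespace's percentage of total physical I/O requests.

    rows is an iterable of (tablespace_name, phyrds, phywrts),
    one entry per datafile.
    """
    totals = defaultdict(int)
    for tablespace, phyrds, phywrts in rows:
        totals[tablespace] += phyrds + phywrts
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {name: 0.0 for name in totals}
    return {name: 100.0 * count / grand_total for name, count in totals.items()}


# Made-up sample figures, NOT taken from any real system.
sample_rows = [
    ("APP_DATA", 200_000, 40_000),
    ("APP_DATA", 180_000, 60_000),
    ("APP_INDEX", 50_000, 10_000),
    ("APP_INDEX", 45_000, 15_000),
]

shares = io_share_by_tablespace(sample_rows)
for name, pct in sorted(shares.items()):
    print(f"{name}: {pct:.1f}% of physical I/O requests")
```

If the index only tablespace shows a small share of the total while sitting on half the disks, the workload is not evenly distributed.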
Yes, there are scenarios where distributing individual segments here or there may be beneficial but the overall objective is generally to try and even out disk/spindle workloads as much as possible. Separating all indexes blindly is typically a very poor method of trying to achieve this.
If an individual query is not likely to improve by having an index in a separate tablespace and if separating indexes results in a non-uniform distribution of physical I/O activity, then you may want to start questioning whether it’s all really worth it.
Of course, database recoveries will be simplified by having indexes in their own tablespace, right ? Ummm, I’ll tackle that myth next …