Partition Pruning – Some Examples (Knives Out) September 3, 2008Posted by Richard Foote in Oracle Cost Based Optimizer, Partitioning.
Following on from my last post where I looked at the use of partition pruning vs. the use of an index, I thought it might be worthwhile to just go through a few additional basic examples of where partition pruning can come into play.
I’ll also include a little example of how execution plans may not favour indexes as much with partitioned tables as the resultant cost of a Full Table Scan can be reduced significantly due to partition pruning.
All the examples use exactly the same tables as created in the previous post. For a full listing of the examples and resultant executions plans, look at this Partition Pruning Demo. You may want to open up the demo in a different browser window to make it easier to follow.
In Example 1:
SQL> select * from big_album_sales where release_date between ’01-JUN-2003′ and ’13-APR-2004′;
the CBO can determine that all the data can only belong in Partitions 3 and 4 (as only only years that need to be referenced are 2003 and 2004) and so not have to visit any of the other partitions. The PARTITION RANGE ITERATOR step clearly shows that it only has to visit the partitions in the range 3 – 4.
In Example 2:
SQL> select * from big_album_sales where release_date > ’01-JUN-2006′;
All the possible values must exist from partition 6 which stores all the 2006 data and onwards through to the last partition in the table which is partition 8. The PARTITION RANGE ITERATOR step clearly shows that it only has to visit the partitions in the range 6 – 8.
In Example 3:
SQL> select * from big_album_sales where release_date between ’01-JUN-2006′ and ’30-NOV-2006′ or release_date > ’14-JUN-2008′;
The only possible partitions the data can reside in are partition 6 or partition 8 (for the years 2006 or 2008). In this case, the CBO performs the PARTITION RANGE OR step which visits only these specific partitions.
In Example 4:
SQL> select * from big_album_sales where release_date is null;
All NULL values must reside in the last partition, as the table was created with the last partition having a MAXVALUE boundary. That being the case, the CBO need only visit the last partition (8) within the table to find all possible NULL values of RELEASE_DATE.
Because partition pruning can make performing a so-called Full Table Scan much more efficient as only those partitions with possible data need only be accessed, in some rare cases this can make the CBO appear to ignore reasonable indexes that may be used and accessed in an equivalent non-partitioned table.
Note that the data includes a series of RELEASE_DATE values but many of the dates actually have the same timestamp component as all the row values were determined and loaded in a very quick manner. As a result, the CBO considers there are actually 718 rows per distinct occurrence of a RELEASE_DATE value. This fact is significant when costing an equality predicate.
In Example 5:
SQL> select * from big_album_sales where release_date = ’01-JUN-2006′;
we’re only interested in the one specific date value but Oracle still favours the Full Table Scan (FTS). Note this value can only possibly exist in partition 6 associated with data from 2006 and so the FTS need only actually access partition 6. Oracle assumes it will retrieve approximately 718 rows which is too costly via the index as partition pruning can make the FTS relatively efficient. Note that the actual cost of the FTS is 303.
In Example 6:
SQL> select /*+ index(b) */ * from big_album_sales b where release_date = ’01-JUN-2006′;
a hint is used to force the use of the associated index on the RELEASE_DATE column. Note in this example, it’s actually a much cheaper option (there are far fewer consistent gets) as no rows are actually being retrieved but the cost of 722 is significantly greater than that of the FTS (303) as Oracle is expecting to retrieve and access 718 rows.
In Example 7:
SQL> select * from big_album_sales2 where release_date = ’01-JUN-2006′;
we run exactly the same statement but this time on the non-partitioned table and notice that the execution plan is totally different. Oracle is indeed using the index because in this example the FTS is so much more expensive as it needs to read the entire table and not just the one partition (or approximately 1/8 of the table) as it does with the partitioned table example.
Lots more on partitioning to come in the future …
Regular Index vs. Partitions (The Best Of Both Worlds) August 28, 2008Posted by Richard Foote in Oracle Cost Based Optimizer, Oracle Indexes, Partitioning.
When asking for requests and suggestions on what I could cover in the blog, Brian Tkatch asked “When is a partition better than just a regular index ?”
As it’s a very good question, I thought I might spend a few moments attempting to provide some kind of meaningful answer.
The first point I would make however, is that indexes and partitions are not at all mutually exclusive. A table of course could very well benefit from being both partitioned and indexed. Most partitioned tables have indexes and indeed indexes can in turn be partitioned.
The second point I would make is that there are two key advantages or reasons why it might be beneficial to partition a segment. They are:
- Better Manageability
- Better Performance
This post can be viewed as being but an introduction to the potential performance benefits of partitioned segments. I will leave the topic of better manageability for another time.
The point of Brian’s question is when can the performance benefits available to partitioned segments out perform and be a better option than that of a possible associated index.
As I’ve discussed a number of times, there are no magic numbers or ratios as to when an index is the most appropriate and cost effective access path. If fact, the link shows a theoretical example of when a Full Table Scan (FTS) is by far the cheaper option when selecting just 1% of data. Generally speaking though, the higher the percentage of returned rows, the more likely a FTS is going to be the more efficient option.
However, a Full Table Scan is just that, a FULL Table Scan. Oracle needs to read the entire table, each and every block below the segment High Water Mark. So if a query needed to retrieve just 5% of the data, it’s quite likely it would perform a FTS because it’s possibly a cheaper and less costly alternative than using an associated index (the above link explains why) but it would need to access 100% of the table to do so. That means it would need to effectively access 95% of the data and simply just throw away the rows so it can access the 5% of data it actually requires in a more efficient manner than possible using an index access path.
That’s potentially (say) 95% of the work and resources for no real reason other than Oracle has no choice but to access all the data when performing a FTS. That’s potentially 95% of wasted resources, 95% of wasted effort. Yes, a TFS might be more efficient than using an index but accessing unnecessarily say 95% of data sounds rather costly nonetheless.
Wouldn’t it be nice if somehow we could get the benefits of performing a FTS (multiblock reads, reading a specific block just the once, etc.) but only read the say 5% of the data that we were actually interested in (or something significantly less than the 100%)? Somehow combine the benefits of a FTS with the benefits provided by an index, that being only needing to access data that is of direct interest.
Partitioning provides Oracle with another level of granularity (or levels of granularity with composite partitions) when accessing a table. It’s no longer an question of using an index or reading the entire table. Oracle can now either use an index, access the entire table or with partitioning access just those partitions (or bits) of a table that can only possibly contain data of interest.
By partitioning a table, Oracle knows that data belonging to the partitioning keys must belong in a specific partition. The logical table is effectively broken down into multiple physical segments with each partition segment containing only data associated with the partitioning keys or columns.
If the say 5% of data that’s of interest only logically resides in the one or whatever number of partitions, then there’s no need for Oracle to access the data in all the other partitions, containing the other say 95% of unwanted data. The Cost based Optimizer (CBO) can automatically take advantage of “partition pruning” and only needs to access the partition or those partitions that only can contain the data of interest.
Oracle can now potentially perform a so-called “FTS” of the table, but only actually accesses and reads the 5% (or significantly less than the 100%) of data that’s potentially of direct interest by accessing just those partitions it needs to.
At a simplistic level, indexes are best when accessing a relatively “small” amount of data. A FTS is best when accessing a relatively ”large” amount of the data and partitions can be extremely useful and beneficial when accessing an amount of data somewhere between the two. As such, partitioning is more likely to be beneficial in a Data Warehouse or Reporting environment where accessing a relatively high percentage of data is common.
This demo of indexes vs. a FTS vs. partition pruning shows how a FTS outperforms an index when accessing approximately 1/8 of data in a table but an appropriately partitioned table out performs both options.
More on partitioning in the future.
Local Index Issue With Partitioned PK and Unique Key Constraints December 20, 2007Posted by Richard Foote in Constraints, Index Access Path, Local Indexes, Oracle Indexes, Partitioning, Performance Tuning, Unique Indexes.
Nuno Souto (Noons) also asked a really interesting question on my Differences between Unique and Non-Unique Indexes blog entry (comment 4) that I thought it worthy of a separate blog entry to do the answer justice. The question was:
“Isn’t it still the case that unique indexes cannot be locally partitioned unless the partition key is part of the index key? Not sure if 11g removes this. If still so, that would weigh heavily in favour of non-unique indexing for PK on a table potentially requiring local index partitions.”
Simplistically, the answer to the first part is Yes it is still the case, even in 11g and the answer to the second part is No, it wouldn’t weigh heavily in favour of non-unique indexing for PK on a table requiring local index partitions. It wouldn’t actually be a consideration at all.
Let me explain why.
Firstly, there is a really really good reason why Oracle doesn’t allow us to create a Unique Index in which the Partition key is not part of a Local Index. It’s called protecting us from ourselves !!
Let’s start by mentioning constraints again.
Remember, the main reason we have indexes policing PK and Unique constraints is so that Oracle can very quickly and efficiently determine whether or not a new value already exists. Do a quick index look-up, is the value there, yes or no, allow the insert (or update), yes or no.
Just imagine for one moment what would happen if Oracle actually allowed us to create a Unique Local index in which the index didn’t include the partitioned column(s).
Lets say a table is Range Partitioned on column ’A’ and we try and create a Unique Local index on just column ‘B’. Let’s assume we have (say) 500 table partitions meaning we must therefore have 500 local index partitions as well. When we insert a new value for our unique index for value B, it will attempt to do so in the corresponding local index partition as governed by the value A for the new row. However Oracle can’t just check this one index partition for uniqueness to ensure value of column B doesn’t already exist, Oracle would need to check all 500 index partitions because it would be possible for our new value of column B to potentially have previously been inserted into any of the other 499 partitions !!
Each and every insert into our partitioned table (partitioned by column A) therefore would require Oracle to check all (say)500 index partitions each and every time to check for duplicates of column B. Again, it’s important to understand that any given value of column B could potentially be in any of the 500 partitions, IF Oracle allowed us to create a Local Partitioned Index just on column B.
Checking all 500 index partitions looking for a specific value of column B would obviously be impractical, inefficient and totally un-scalable. Therefore Oracle doesn’t allow us to do this. It doesn’t allow us to create a Local index in which the indexed columns does’t include the partitioning columns as well.
This is actually a good thing.
If you want to create a Unique index in a partitioned table, you MUST either add all the partitioned columns and make it part of the LOCAL unique index (so that way each and every insert would only have to check the one local partition as this value is known now it’s part of the index) or you must create it as a GLOBAL index (in which again, Oracle only has to check the one index structure).
It actually makes a lot of sense to do this.
Moving onto the second part of the question. Let’s just use a Local Non-Unique index to police our PK constraints then.
Fortunately this isn’t allowed either for exactly the same reasons. You can’t create a Local Non-unique index to police a PK (or Unique) constraint if the Constraint does not also include the partitioned columns. Otherwise again, Oracle would need to check each and every index partition to determine whether the constraint has been violated or not.
If you attempt to use an existing Local Non-Unique index to police a PK or Unique constraint that does not contain the partitioned columns, you will get an error saying it can’t create the (by default Global index) because the useless Local Non-Unique index (from a policing the constraint point of view) already exists.
Again if you want to create a Non-Unique index to police a PK or Unique constraint you must either ensure the constraint includes all the partitioned columns in which case it can be Local or you must use a Global Non-Unique index.
In other words, the rules apply equally to both Unique and Non-Unique indexes.
So it’s not really a case of Oracle not allowing one to create a Local Unique index without including the partitioned columns (although that’s of course true) but really a case of Oracle not allowing a PK or Unique *constraint* to be policed via *any* Local index (whether Unique or Non-Unique), unless the partitioned columns are also included.
Little demo to illustrate: Local Index Issue With Partitioned PK and Unique Key Constraints