Bitmap Indexes and Not Equal Part II (Sheep)

Bitmap Indexes and Not Equal Part II (Sheep) July 7, 2011

Posted by Richard Foote in Bitmap Indexes, NOT Equal, Oracle Indexes.
trackback

An excellent comment/question by mdinh made me realise my demos in Part I might be a little extreme in returning 0 rows and perhaps give the false impression that Not Equal conditions are only considered or applicable if no rows are returned. This is not the case and with the bitmap index now considered with Not Equal conditions, the choice of whether or not to actually use the index as usual comes down to the comparative costs associated with the available plans.

So, I’ll expand on my demo a tab by introducing a new value for the FLAG column:

SQL> update radiohead
  2  set flag = 1
  3  where rownum < 101;

 100 rows updated.

 SQL> commit;

 Commit complete.

 SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'RADIOHEAD',cascade=>true, estimate_percent=>null, method_opt=> 'FOR COLUMNS FLAG SIZE 5');

 PL/SQL procedure successfully completed.

OK, so now we have some 100 rows which have a value of FLAG which are not equal to 42, which are evenly distributed among all 5 CODE values. I’ve created a histogram however on the FLAG column as the 2 values (1 and 42) are not evenly distributed.

Let’s run the query now:

SQL> select * from radiohead where code = 1 and flag <> 42;
20 rows selected.

 
Execution Plan
----------------------------------------------------------
Plan hash value: 2786215024

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    20 |   300 |     46  (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    20 |   300 |     46  (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |            |          |
|   3 |    BITMAP MINUS              |                  |       |       |            |          |
|*  4 |     BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |            |          |
|*  5 |     BITMAP INDEX SINGLE VALUE| RADIOHEAD_FLAG_I |       |       |            |          |
-------------------------------------------------------------------------------------------------

 
Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CODE"=1)
   5 - access("FLAG"=42)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         47  consistent gets
          0  physical reads
          0  redo size
        775  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

We notice a few key points. Firstly, as we have a histogram on the FLAG column and the data is perfectly evenly distributed among the CODE values, the CBO has got the estimated cardinality of 20 rows spot on. So all things being equal, we can have some confidence the CBO has done the right thing and selected the most efficient execution plan.

We also notice that the cost has now gone up considerably to 46 (up from 3) but it’s still significantly less than the cost of 761 associated with a Full Table Scan. Therefore, the CBO has still chosen the same execution plan with the two bitmap indexes returning the 20 rows, as it did when it returned none in the previous example.

In answer to another comment/question by SJ12345, regarding the use of unbounded predicates, if we now try the following:

SQL> select * from radiohead where code = 1 and flag > 42;

no rows selected
Execution Plan
----------------------------------------------------------
Plan hash value: 2939001425

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    10 |   150 |      6  (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    10 |   150 |      6  (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |            |          |
|   3 |    BITMAP AND                |                  |       |       |            |          |
|   4 |     BITMAP MERGE             |                  |       |       |            |          |
|*  5 |      BITMAP INDEX RANGE SCAN | RADIOHEAD_FLAG_I |       |       |            |          |
|*  6 |     BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |            |          |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("FLAG">42)
       filter("FLAG">42)
   6 - access("CODE"=1)

 
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          2  consistent gets
          0  physical reads
          0  redo size
        435  bytes sent via SQL*Net to client
        384  bytes received via SQL*Net from client
          1  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          0  rows processed

Yep, cheap with Bitmap Indexes says the CBO. Note the difference here though is that the CBO uses a BITMAP MERGE to first get all possible rowid values of FLAG that are > 42 and then uses a BITMAP AND operation in combination with the CODE Bitmap index to get all rowids that match from both Bitmap indexes. However, as it evaluates the Bitmap Index on the FLAG index first and there are no index entries with a value > 42, it doesn’t have to actually worry about the CODE condition as no rows can possibly be returned. Therefore a very tiny 2 consistent gets are all that are necessary.

The following will looking for anything < than 42, remembering we now have 20 rows that meet this condition:

SQL> select * from radiohead where code = 1 and flag < 42;

20 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2939001425

-------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name             | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                  |    20 |   300 |      8  (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    20 |   300 |      8  (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                  |       |       |            |          |
|   3 |    BITMAP AND                |                  |       |       |            |          |
|   4 |     BITMAP MERGE             |                  |       |       |            |          |
|*  5 |      BITMAP INDEX RANGE SCAN | RADIOHEAD_FLAG_I |       |       |            |          |
|*  6 |     BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |            |          |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("FLAG"<42)
       filter("FLAG"<42)
   6 - access("CODE"=1)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        775  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

Yep, also cheap with Bitmap Indexes, using the same plan as the previous > than example but using more consistent gets as there are a number of rows that need to be accessed this time (although all in the same data block).

To now complete the picture:

SQL> select * from radiohead where code = 1 and (flag < 42 or flag > 42);

20 rows selected.

 
Execution Plan
----------------------------------------------------------
Plan hash value: 3720408756

--------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name             | Rows  | Bytes | Cost(%CPU)| Time     |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                  |    30 |   450 |   14   (0)| 00:00:01 |
|   1 |  CONCATENATION                |                  |       |       |           |          |
|   2 |   TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    10 |   150 |    6   (0)| 00:00:01 |
|   3 |    BITMAP CONVERSION TO ROWIDS|                  |       |       |           |          |
|   4 |     BITMAP AND                |                  |       |       |           |          |
|   5 |      BITMAP MERGE             |                  |       |       |           |          |
|*  6 |       BITMAP INDEX RANGE SCAN | RADIOHEAD_FLAG_I |       |       |           |          |
|*  7 |      BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |           |          |
|   8 |   TABLE ACCESS BY INDEX ROWID | RADIOHEAD        |    20 |   300 |    8   (0)| 00:00:01 |
|   9 |    BITMAP CONVERSION TO ROWIDS|                  |       |       |           |          |
|  10 |     BITMAP AND                |                  |       |       |           |          |
|  11 |      BITMAP MERGE             |                  |       |       |           |          |
|* 12 |       BITMAP INDEX RANGE SCAN | RADIOHEAD_FLAG_I |       |       |           |          |
|* 13 |      BITMAP INDEX SINGLE VALUE| RADIOHEAD_CODE_I |       |       |           |          |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   6 - access("FLAG">42)
       filter("FLAG">42)
   7 - access("CODE"=1)
  12 - access("FLAG"<42)
       filter(LNNVL("FLAG">42) AND "FLAG"<42)
  13 - access("CODE"=1)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          8  consistent gets
          0  physical reads
          0  redo size
        775  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         20  rows processed

We now get a combination of both previous plans, concatenated together. Note in this case, it’s actually a cheaper and more efficient alternative to the first Not Equal example. Having got the setup for this demo, you can of course create the same demo yourselves and have a play and experiment. Me, I’m now off to watch Cadel Evans win the Tour De France 🙂

Comments»

1. mdinhdinh - July 7, 2011: It looks like I have some experiments to play with.
I’ve created a histogram however on the FLAG column as the. method_opt=> ‘FOR COLUMNS FLAG SIZE 5’

Am I missing something or should that be SIZE 2.

Thanks.

LikeLike

Reply
2. Richard Foote - July 8, 2011: With 2 distinct values, it doesn’t matter whether you specify 2 or 5 as Oracle will only use as many buckets as there are distinct values and ignore the rest. Personally, I think it’s actually good practice to specify a few more than necessary in case an extra value or two sneaks in someday and you suddenly start using height based histograms that could change things unexpectedly.

LikeLike

Reply
3. Jonathan Lewis - July 8, 2011: One option your readers should play with is changing the columns to allow null values (your create table included not null coustraints). The consequences can be quite entertaining.

LikeLike

Reply
Richard Foote - July 8, 2011: Hi Jonathan

Yes indeed, although not as entertaining as the Tour De France 🙂

I made the columns not null as didn’t I want to give the B-tree any semblance of an excuse for not being used and I wanted to make the bitmap plans as simplistic as possible.

To the readers out there, note that nulls are indexed within a bitmap index and so it’s something else the CBO needs to consider in the example.

LikeLike

Reply
4. Mark Brady - July 13, 2011: In the first post you mentioned:

Oracle [determines] the maximum number of rows that could potentially fit within a block and assigns a bit for every possible rowid (I’ll expand on this point in my next post).

Will that explanation be in the next, next post? I’m very interested in understanding that mechanism. With variable length columns being the norm, it seems like the difference between the actual number of rows/block and the theoretical rows/block would be quite different. Right? I know you mentioned a lot of the consecutive zeros would get compressed but why would they all be consecutive? Wouldn’t the theoretical rowids be evenly interspersed between existing rowids?

Anxiously awaiting your next post.

😉

LikeLike

Reply
Mark Brady - July 13, 2011: Ok, so I thought that the “row” portion of the rowid was an offset and therefore the “unused” rowids could exist between actual rowids but if that portion is a counter then of course they will all be at the “end”. i.e. a block COULD hold 100 rows but actually holds 25, there will be zeros for potential rowids 26-100. Is that right?

LikeLike

Reply
Richard Foote - July 19, 2011: Hi Mark

Yes, that’s right. There will logically be a group of 0s that will corresponding to potential rowids for each block.

LikeLike

Reply
Richard Foote - July 19, 2011: Hi Mark

Wait no more 🙂 Hope the new post helps to explain things.

LikeLike

Reply
5. Brian Tkatch - July 20, 2011: I’ve been away; i miss your posts.

You have got to write a book already. An outline of a possible book, if nothing else.

LikeLike

Reply
Richard Foote - July 27, 2011: Hi Brian

Welcome back !!

And I’ve missed your comments 🙂

One day, I’ll print off all my blog articles, create a table of contents and an index (no pun intended of course) and voila, there’s my book !!

LikeLike

Reply
Brian Tkatch - July 27, 2011: Heh. I’m not letting you get away with that!

You are going to write a nice outline, you are going to give a nice introduction, and you are going to discuss each topic before giving your wonderful blog examples.

It would be a waste to do it otherwise. I have learnt so much from your blog already

And no more of this “one day…” business. 🙂 I want an outline by New Year’s! Of course, i’m willing to help.

LikeLike

Reply

	PP on Costing Concatenated Indexes W…
	Zeeshan Moin on The End Of Richard Foote…
	Richard Foote on The End Of Richard Foote…
	PP on The End Of Richard Foote…
	Ram on The Fake Index Trap (“No…
	Richard Foote on The End Of Richard Foote…
	NormanDunbar on The End Of Richard Foote…
	NormanDunbar on The End Of Richard Foote…
	Richard Foote on The End Of Richard Foote…
	Richard Foote on The End Of Richard Foote…
	Richard Foote on The End Of Richard Foote…
	Richard Foote on The End Of Richard Foote…
	Richard Foote on The End Of Richard Foote…
	Richard Foote on The End Of Richard Foote…
	Sayan Malakshinov on The End Of Richard Foote…

Richard Foote's Oracle Blog