jump to navigation

Oracle11g Bitmap-Join IOTs (Us and Them) January 25, 2011

Posted by Richard Foote in 11g, Bitmap Indexes, Index Organized Tables, Oracle Indexes.
trackback

With each new database release, nice little improvements and enhanced options continually get added. Since 11g R1, two index related features can finally be used in combination with each other.
 
To demonstrate, I’m first going to create and populate a so-called “large” Data Warehouse table. 
 

 
SQL> CREATE TABLE big_dwh_table (id NUMBER PRIMARY KEY, album_id NUMBER, artist_id NUMBER, country_id NUMBER, format_id NUMBER, release_date DATE, total_sales NUMBER);
 
Table created.
 
SQL> CREATE SEQUENCE dwh_seq;
 
Sequence created.
 
SQL> create or replace procedure pop_big_dwh_table as
  2  v_id          number;
  3  v_artist_id   number;
  4  begin
  5    for v_album_id in 1..10000 loop
  6        v_artist_id:= ceil(dbms_random.value(0,100));
  7        for v_country_id in 1..100 loop
  8          select dwh_seq.nextval into v_id from dual;
  9          insert into big_dwh_table values (v_id, v_album_id, v_artist_id, v_country_id, ceil(dbms_random.value(0,4)), trunc(sysdate-mod(v_id,ceil(dbms_random.value(0,1000)))), ceil(dbms_random.value(0,500000)));
 10       end loop;
 11    end loop;
 12 commit;
 13 end;
 14 /
 
Procedure created.
 
SQL> exec pop_big_dwh_table
 
PL/SQL procedure successfully completed.

 

I’ll next create a standard bitmap index on the ALBUM_ID column and collect a few statistics:
 

  
SQL> create bitmap index big_dwh_table_album_id_i on big_dwh_table(album_id);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'BIG_DWH_TABLE', estimate_percent=> null, cascade=> true, method_opt=>'FOR  ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

 

OK, I’m now going to create and populate a “smaller” dimension/detail heap table and a few associated indexes:

 
SQL> CREATE TABLE albums (album_id number, album_details varchar2(30));
 
Table created.
 
SQL> INSERT INTO albums SELECT rownum, substr(object_name,1,30) FROM dba_objects WHERE rownum <= 10000;
 
10000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> alter table albums add primary key(album_id);
 
Table altered.
 
SQL> create index albums_details_i on albums(album_details);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'ALBUMS', estimate_percent=> null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

 

If we now run a little query that joins the two tables together:
 

 
SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$';
 
100 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 1936297994
 
----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                          |   125 |  4250 |    25   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |                          |       |       |            |          |
|   2 |   NESTED LOOPS                |                          |   125 |  4250 |    25   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| ALBUMS                   |     1 |    22 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | ALBUMS_DETAILS_I         |     1 |       |     1   (0)| 00:00:01 |
|   5 |    BITMAP CONVERSION TO ROWIDS|                          |       |       |            |          |
|*  6 |     BITMAP INDEX SINGLE VALUE | BIG_DWH_TABLE_ALBUM_ID_I |       |       |            |          |
|   7 |   TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE            |   100 |  1200 |    25   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("A"."ALBUM_DETAILS"='TAB$')
   6 - access("B"."ALBUM_ID"="A"."ALBUM_ID")
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
       1648  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

 

The resultant execution plan is pretty good and efficient and what we would expect. It performs a nested loop join to join the tables together which based on the relatively small number of rows returned makes sense and uses the b-tree index to get the specific album details from the dimension table and the bitmap index to find the matching albums details from the larger table.
 
However, as this is a very frequently executed join condition, we can potentially improve things and reduce the 10 consistent gets by introducing a bitmap-join index. A bitmap-join index performs the “join” operation once, when the index is created and during subsequent DML operations by creating an index based on column(s) on the smaller dimension tables that directly references rows in the larger fact table.
    

 
SQL> drop index albums_details_i;
 
Index dropped.
 

SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details)
     FROM big_dwh_table b, albums a
     WHERE b.album_id = a.album_id;
 
Index created.

 

So the bitmap-join index is based on the ALBUM_DETAILS column from the smaller ALBUMS table, but it references and has rowids associated with the larger BIG_DWH_TABLE table, with the bitmap-join definition containing details on how the join between the two tables needs to be performed. It if want to know what rows within the larger table have ALBUM_DETAILS of interest, the corresponding bitmap-join index will find all such rows without having to access the smaller ALBUMS table that contains this column.
 
If we now run the same query as before:
  

 
SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$';
 
100 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 950886520
 
--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |   125 |  1500 |    26   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE           |   125 |  1500 |    26   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                         |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | BIG_DWH_ALBUM_DETAILS_I |       |       |            |          |
--------------------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("B"."SYS_NC00008$"='TAB$')
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1648  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

 
 
   

We notice the nested loop operation is no longer required. In fact, we don’t need to reference the smaller ALBUMS table at all as all the required information can now be obtained by using the bitmap-join index and direct accesses to the larger table. The number of consistent gets has therefore reduced from 10 down to just 6.
 
Note in our example, there is no actual Foreign Key (FK) constraint in the larger table (in a Data Warehouse, such constraints may not be necessary and/or get in the way). The bitmap-join index doesn’t require a FK constraint to be in place however it’s necessary that the column in the join condition referencing the detail table be Unique else there could be a many-to-many join condition which wouldn’t make sense when attempting to populate the bitmap-join index.
 
However, make one of the tables in the Bitmap-Join index an Index Organized Table (IOT), in this case the smaller detail table …
 

 
SQL> drop table albums;
 
Table dropped.
 
SQL> CREATE TABLE albums (album_id number primary key, album_details varchar2(30)) organization index;
 
Table created.
 
SQL> INSERT INTO albums SELECT rownum, substr(object_name,1,30) FROM dba_objects WHERE rownum <= 10000;
 
10000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'ALBUMS', estimate_percent=> null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 

SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details)
  2       FROM big_dwh_table b, albums a
  3       WHERE b.album_id = a.album_id;
CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details)
                                               *
ERROR at line 1:
ORA-25966: join index cannot be based on an index organized table

 

 
   
 
and we get the above error as prior to 11g R1, there was a restriction that no table within a Bitmap-Join index could be an Index Organized Table.
 

Now, if we run exactly the same demo but in an Oracle11g database:
 
 

 
SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details)
     FROM big_dwh_table b, albums a
     WHERE b.album_id = a.album_id;
 
Index created.
 

SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$';
 
100 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 950886520
 
--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |   125 |  1500 |    26   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE           |   125 |  1500 |    26   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                         |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | BIG_DWH_ALBUM_DETAILS_I |       |       |            |          |
--------------------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   3 - access("B"."SYS_NC00008$"='TAB$')
 

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1648  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

 

It all now works fine.
 
So since Oracle 11g R1, there’s one less reason not use Index Organized Tables and/or Bitmap-Join indexes in your Data Warehouse 🙂

Comments»

1. pellicle - January 25, 2011

very sweet

thanks for explaining this

Like

2. Statistique - January 26, 2011

Nice stuff.

I made a test adding a column to the album table.

How many “consistent gets” you get if the query had to “hit” the albums table. Let’s say we add a release_date column to the query.

With my test ( I made the tables way bigger), I get 122 gets with the “normal” bitmap index, 44 gets if I don’t hit the album table (using the bitmap join index) and 65 gets if I add the release_date to the query.

Now, 65 is still way better than 122. But I’m asking myself, should we consider creating a bitmap join index when we can avoid hitting a table only ? When we consider all the overhead that are inherent to index is it worth it to create bitmap join index at large ???

What does your experience teach you on this ?

P.S. Sorry if I’m unclear, french is my mother tongue !

Like

3. Statistique - January 26, 2011

When I write : “is it worth it to create bitmap join index at large (wide) ???”

I know the answer is know it depends on your need.

But I’d like to heard what your experience has to say on this ! ;p

Like

4. Venzi - January 28, 2011

Nice article Richard!

Speaking of nice little improvements and enhancements:
On of my favorite is that you don’t need the “select from dual” for sequences anymore!
Now you can assign a variable directly to the .nextval/.currval value. And by doing that you can use .nextval now directly within an insert:

create table test(MyId number);

create sequence id_seq;

begin
for n in 1..10 loop
insert into test values (id_seq.nextval);
end loop;
commit;
end;

select * from test;

Table created.
Sequence created.
PL/SQL procedure successfully completed.

MYID
———-
2
3
4
5
6
7
8
9
10
11

10 rows selected.

Like

5. Richard Foote - February 17, 2011

I’ll get back and answer the various questions and comments in the next day or two. Too many things to attend to at the moment !!

Like

6. Book Review: Oracle Database 11gR2 Performance Tuning Cookbook (Part 1) « Charles Hooper's Oracle Notes - March 4, 2012

[…] place of a materialized view.  The statement in the book needs additional clarification (reference reference2 page […]

Like


Leave a comment