Oracle11g Bitmap-Join IOTs (Us and Them)
January 25, 2011
Posted by Richard Foote in 11g, Bitmap Indexes, Index Organized Tables, Oracle Indexes.
With each new database release, nice little improvements and enhanced options continually get added. Since 11g R1, two index-related features, Bitmap-Join indexes and Index Organized Tables (IOTs), can finally be used in combination with each other.
To demonstrate, I’m first going to create and populate a so-called “large” Data Warehouse table.
SQL> CREATE TABLE big_dwh_table (id NUMBER PRIMARY KEY, album_id NUMBER, artist_id NUMBER, country_id NUMBER, format_id NUMBER, release_date DATE, total_sales NUMBER);

Table created.

SQL> CREATE SEQUENCE dwh_seq;

Sequence created.

SQL> create or replace procedure pop_big_dwh_table as
  v_id number;
  v_artist_id number;
begin
  for v_album_id in 1..10000 loop
    v_artist_id:= ceil(dbms_random.value(0,100));
    for v_country_id in 1..100 loop
      select dwh_seq.nextval into v_id from dual;
      insert into big_dwh_table values (v_id, v_album_id, v_artist_id, v_country_id, ceil(dbms_random.value(0,4)), trunc(sysdate-mod(v_id,ceil(dbms_random.value(0,1000)))), ceil(dbms_random.value(0,500000)));
    end loop;
  end loop;
  commit;
end;
/

Procedure created.

SQL> exec pop_big_dwh_table

PL/SQL procedure successfully completed.
I’ll next create a standard bitmap index on the ALBUM_ID column and collect a few statistics:
SQL> create bitmap index big_dwh_table_album_id_i on big_dwh_table(album_id);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'BIG_DWH_TABLE', estimate_percent=> null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.
OK, I’m now going to create and populate a “smaller” dimension/detail heap table and a few associated indexes:
SQL> CREATE TABLE albums (album_id number, album_details varchar2(30));

Table created.

SQL> INSERT INTO albums SELECT rownum, substr(object_name,1,30) FROM dba_objects WHERE rownum <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> alter table albums add primary key(album_id);

Table altered.

SQL> create index albums_details_i on albums(album_details);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'ALBUMS', estimate_percent=> null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.
If we now run a little query that joins the two tables together:
SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$';

100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1936297994

----------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                     | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                          |   125 |  4250 |    25   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                 |                          |       |       |            |          |
|   2 |   NESTED LOOPS                |                          |   125 |  4250 |    25   (0)| 00:00:01 |
|   3 |    TABLE ACCESS BY INDEX ROWID| ALBUMS                   |     1 |    22 |     2   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN          | ALBUMS_DETAILS_I         |     1 |       |     1   (0)| 00:00:01 |
|   5 |    BITMAP CONVERSION TO ROWIDS|                          |       |       |            |          |
|*  6 |     BITMAP INDEX SINGLE VALUE | BIG_DWH_TABLE_ALBUM_ID_I |       |       |            |          |
|   7 |   TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE            |   100 |  1200 |    25   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("A"."ALBUM_DETAILS"='TAB$')
   6 - access("B"."ALBUM_ID"="A"."ALBUM_ID")

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
         10  consistent gets
          0  physical reads
          0  redo size
       1648  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
The resultant execution plan is efficient and what we would expect. It performs a nested loop join to join the tables together, which makes sense given the relatively small number of rows returned. It uses the b-tree index to get the specific album details from the dimension table and the bitmap index to find the matching album rows in the larger table.
However, as this is a very frequently executed join condition, we can potentially improve things and reduce the 10 consistent gets by introducing a bitmap-join index. A bitmap-join index performs the “join” operation once, when the index is created (and maintains it during subsequent DML operations), by creating an index on column(s) of the smaller dimension table that directly references rows in the larger fact table.
SQL> drop index albums_details_i;

Index dropped.

SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details)
FROM big_dwh_table b, albums a
WHERE b.album_id = a.album_id;

Index created.
So the bitmap-join index is based on the ALBUM_DETAILS column from the smaller ALBUMS table, but it references and has rowids associated with the larger BIG_DWH_TABLE table, with the bitmap-join definition containing details on how the join between the two tables needs to be performed. If we want to know which rows within the larger table have ALBUM_DETAILS of interest, the corresponding bitmap-join index will find all such rows without having to access the smaller ALBUMS table that contains this column.
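One way to see how Oracle has recorded this join definition is to look in the data dictionary. The following is a minimal sketch, assuming the index lives in the current schema under the name used above; JOIN_INDEX in USER_INDEXES and the USER_JOIN_IND_COLUMNS view are standard dictionary objects, but what they return will depend on your own environment.

```sql
-- Confirm the index is a bitmap index that Oracle has flagged as a join index
SELECT index_name, index_type, join_index
FROM   user_indexes
WHERE  index_name = 'BIG_DWH_ALBUM_DETAILS_I';

-- Show the join condition stored as part of the bitmap-join index definition
SELECT index_name,
       inner_table_name, inner_table_column,
       outer_table_name, outer_table_column
FROM   user_join_ind_columns
WHERE  index_name = 'BIG_DWH_ALBUM_DETAILS_I';
```

The second query should list the ALBUM_ID columns of the two tables, which is the join that was effectively performed when the index was built.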
If we now run the same query as before:
SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$';

100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 950886520

--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |   125 |  1500 |    26   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE           |   125 |  1500 |    26   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                         |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | BIG_DWH_ALBUM_DETAILS_I |       |       |            |          |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B"."SYS_NC00008$"='TAB$')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1648  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
We notice the nested loop operation is no longer required. In fact, we don’t need to reference the smaller ALBUMS table at all as all the required information can now be obtained by using the bitmap-join index and direct accesses to the larger table. The number of consistent gets has therefore reduced from 10 down to just 6.
Note that in our example there is no actual Foreign Key (FK) constraint on the larger table (in a Data Warehouse, such constraints may not be necessary and/or may get in the way). The bitmap-join index doesn’t require an FK constraint to be in place. However, the join column on the detail table must be Unique (here, ALBUMS.ALBUM_ID is the primary key); otherwise there could be a many-to-many join, which wouldn’t make sense when attempting to populate the bitmap-join index.
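To illustrate the uniqueness requirement, here is a small sketch using a hypothetical copy of the dimension table, ALBUMS_NO_PK, which is not part of the original demo. The error number in the comment is what I would expect Oracle to raise in this situation, so treat it as an assumption and check it against your own release.

```sql
-- Hypothetical dimension table with NO primary key or unique constraint
CREATE TABLE albums_no_pk (album_id NUMBER, album_details VARCHAR2(30));

-- Expected to be rejected (ORA-25954: missing primary key or unique
-- constraint on dimension), as Oracle cannot guarantee that each fact
-- row joins to at most one dimension row
CREATE BITMAP INDEX big_dwh_album_no_pk_i
  ON big_dwh_table(a.album_details)
  FROM big_dwh_table b, albums_no_pk a
  WHERE b.album_id = a.album_id;

-- Once the join column is declared unique, re-running the same
-- CREATE BITMAP INDEX statement should be accepted
ALTER TABLE albums_no_pk ADD PRIMARY KEY (album_id);
```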
However, in a release prior to 11g, if we make one of the tables in the Bitmap-Join index an Index Organized Table (IOT), in this case the smaller detail table …
SQL> drop table albums;

Table dropped.

SQL> CREATE TABLE albums (album_id number primary key, album_details varchar2(30)) organization index;

Table created.

SQL> INSERT INTO albums SELECT rownum, substr(object_name,1,30) FROM dba_objects WHERE rownum <= 10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'ALBUMS', estimate_percent=> null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details)
  2  FROM big_dwh_table b, albums a
  3  WHERE b.album_id = a.album_id;
CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details)
*
ERROR at line 1:
ORA-25966: join index cannot be based on an index organized table
and we get the above error because, prior to 11g R1, there was a restriction that no table within a Bitmap-Join index could be an Index Organized Table.
Now, if we run exactly the same demo but in an Oracle11g database:
SQL> CREATE BITMAP INDEX big_dwh_album_details_i ON big_dwh_table(a.album_details)
FROM big_dwh_table b, albums a
WHERE b.album_id = a.album_id;

Index created.

SQL> SELECT b.id, b.album_id, b.format_id FROM big_dwh_table b, albums a WHERE b.album_id = a.album_id and a.album_details = 'TAB$';

100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 950886520

--------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                    | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                         |   125 |  1500 |    26   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID | BIG_DWH_TABLE           |   125 |  1500 |    26   (0)| 00:00:01 |
|   2 |   BITMAP CONVERSION TO ROWIDS|                         |       |       |            |          |
|*  3 |    BITMAP INDEX SINGLE VALUE | BIG_DWH_ALBUM_DETAILS_I |       |       |            |          |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("B"."SYS_NC00008$"='TAB$')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
       1648  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed
It all now works fine.
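If you want to double-check that ALBUMS really is an IOT and which release you are running the demo against, something like the following sketch could be used. It assumes the table is in the current schema and that your account can query V$VERSION, which may not be the case everywhere.

```sql
-- IOT_TYPE is 'IOT' for an index organized table and NULL for a heap table
SELECT table_name, iot_type
FROM   user_tables
WHERE  table_name = 'ALBUMS';

-- Database release the demo is running against
SELECT banner FROM v$version;
```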
So since Oracle 11g R1, there’s one less reason not to use Index Organized Tables and/or Bitmap-Join indexes in your Data Warehouse 🙂
New Indexing Seminars Scheduled For Vienna and Tallinn (New Europeans)
January 15, 2011
Posted by Richard Foote in Oracle Index Seminar.
I have two Oracle Index Internals and Best Practices seminars scheduled in the coming months, in Austria (Vienna) and Estonia (Tallinn). These could very well be my last seminars for quite a while as I’m unsure whether I’ll be able to travel again this year, so this could be your last opportunity to see me in action 🙂
Details are:
You need to book early to ensure your place and to ensure the seminars go ahead. Last year, a seminar in Prague was nearly cancelled due to lack of numbers only for 20 people to enroll in the last couple of weeks.
Hopefully, I’ll get to meet some of you at these events, assuming of course the Australian airports aren’t flooded, a distinct possibility considering the amount of rain that has been falling lately on this continent!!