ACT Oracle User Group Seminar Session: 2 June 2010 May 27, 2010
Posted by Richard Foote in 11g New features, Advert, Oracle Indexes.add a comment
Just a short note for anyone near the Canberra region that the next ACT Oracle User Group Seminar Session will be held next Wednesday, 2 June 2010 at the Oracle Offices in Turner.
I’ll be presenting “Indexing New Features and Improvements Introduced in Oracle 11g Release 1 & 2″. Follow the above link for the full agenda.
If that isn’t enough to make you want to attend, there will be free beer and pizza available afterwards 🙂
Bitmap Index Degradation After DML Prior To 10g (Beauty and the Beast) May 25, 2010
Posted by Richard Foote in Bitmap Indexes, Block Dumps, Index Internals, Oracle Indexes.5 comments
Bitmap Indexes have a bad reputation with regard to being problematic and suffering from severe degradation following heavy DML operations, especially larger amounts of insert activity. Bitmap indexes have been known to grow substantially and require periodic rebuilds following such insert activity.
While this was certainly the case in previous versions of Oracle, things have dramatically improved since version 10g. I thought it might be worthwhile explaining why Bitmap Indexes had such issues prior to 10g and how things are now so much better, in many cases making Bitmap index rebuilds unnecessary.
To start with, a demo on an Oracle 9.2.0.7 database. I’m first going to create a simple little table with 1M rows, with a Bitmap index on a CODE column with 1,000 distinct values. The values are effectively inserted into the table in an evenly distributed manner throughout the entire table.
SQL> create table bowie as select mod(rownum,1000)+1 id, mod(rownum,1000)+1 code,'BOWIE' name from dual connect by level <= 1000000; Table created. SQL> create bitmap index bowie_code_i on bowie(code) compute statistics; Index created. SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_CODE_I'; INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS ------------------------------ ---------- ----------- ---------- BOWIE_CODE_I 1 500 1000
We notice that the Bitmap index has 1000 index entries, one for each index value. There are 500 leaf blocks which means we can only fit 2 index entries in each leaf block on average. If we look at a partial block dump of the first leaf block:
row#0[5013] flag: -----, lock: 0 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 cc 0c 00 e0 col 2; len 6; (6): 02 02 d6 2d 01 7f col 3; len 2998; (2998): 03 c1 eb 01 c4 bd 01 c2 eb 01 c5 bd 01 c3 eb 01 c1 eb 01 c4 bd 01 c2 eb 01 c5 bd 01 c3 eb 01 c1 eb 01 c4 bd 01 c2 eb 01 c5 bd 01 c3 eb 01 c1 eb 01 c4 bd 01 c2 eb 01 c5 bd 01 c3 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c7 bd 01 c5 eb 01 c3 eb 01 c6 bd 01 c4 eb 01 c7 bd 01 c5 eb 01 c3 eb 01 c6 ... row#1[1994] flag: -----, lock: 0 col 0; len 2; (2): c1 03 col 1; len 6; (6): 02 02 cc 0a 00 00 col 2; len 6; (6): 02 02 d6 2b 00 9f col 3; len 2998; (2998): 00 c4 bd 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c3 eb 01 c6 bd 01 c4 eb 01 c7 bd 01 c5 eb 01 c3 eb 01 c6 bd 01 c4 eb 01 c7 bd 01 c5 eb 01 c3 eb 01 c6 bd 01 c4 eb 01 c0 be 01 c6 ea 01 c4 eb 01 c7 bd 01 c5 eb 01 c0 be 01 c6 ea 01 c4
I’ve listed just the first portions of each of the 2 index entries in the leaf block. Each indexed value is different (col 0) and each have a bitmap string of 2998 bytes in size (col3). Therefore in an 8K block, Oracle can’t indeed fit a 3rd index entry in a block, hence why there are only 2 index entries per block. The bitmap string column is quite large because in a 1M row table, there are 1000 occurences of each indexed value littered throughout the table that need to be referenced.
OK, next I’m going to insert just 1 additional row. It has a CODE value of 1, which is the minimum existing CODE value and so should be indexed in the first index leaf block. Let’s see what impact this has on the index statistics:
SQL> insert into bowie values (1, 1, 'ZIGGY'); 1 row created. SQL> commit; Commit complete. SQL> analyze index bowie_code_i compute statistics; Index analyzed. SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_CODE_I'; INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS ------------------------------ ---------- ----------- ---------- BOWIE_CODE_I 1 500 1001
The crucial point here is that although the value (1) already existed within the index, Oracle has nonetheless created an additional index entry (1001 NUM_ROWS up from 1000). If look at a partial dump of the leaf block now:
row#1[1973] flag: -----, lock: 2 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 d6 2e 00 00 col 2; len 6; (6): 02 02 d6 2e 00 07 col 3; len 1; (1): 00 row#2[1994] flag: -----, lock: 0 col 0; len 2; (2): c1 03 col 1; len 6; (6): 02 02 cc 0a 00 00 col 2; len 6; (6): 02 02 d6 2b 00 9f col 3; len 2998; (2998): 00 c4 bd 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c3 eb 01 c6 bd 01 c4 eb 01 c7 bd 01 c5 eb 01 c3 eb 01 c6 bd 01 c4 eb 01 c7 bd 01 c5 eb 01 c3 eb 01 c6 bd 01 c4 eb 01 c0 be 01 c6 ea 01 c4 eb 01 c7 bd 01 c5 eb 01 c0 be 01 c6 ea 01 c4
We notice there is indeed a new index entry (row#1) which only has a rowid range (col 1 and col 2) of just 8 rows (between 02 02 d6 2e 00 00 and 02 02 d6 2e 00 07). So rather than somehow modify the existing index entry, Oracle has created a new index entry with a rowid range of just 8 consecutive rowids.
This means we should now be able to add an additional 7 rows with a CODE value of 1, which providing they can all fit within the same table block, will be associated with this allocated rowid range.
SQL> begin 2 for i in 1..7 loop 3 insert into bowie values (1, 1, 'ZIGGY'); 4 commit; 5 end loop; 6 end; 7 / PL/SQL procedure successfully completed. SQL> analyze index bowie_code_i compute statistics; Index analyzed. SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_CODE_I'; INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS ------------------------------ ---------- ----------- ---------- BOWIE_CODE_I 1 500 1001
Indeed, after inserting these additional rows, no new index entries were required as all the new rows fall within the rowid range of the new index entry. But add one more row …
SQL> insert into bowie values (1, 1, 'ZIGGY'); 1 row created. SQL> commit; Commit complete. SQL> analyze index bowie_code_i compute statistics; Index analyzed. SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_CODE_I'; INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS ------------------------------ ---------- ----------- ---------- BOWIE_CODE_I 1 500 1002
Indeed, we now have yet another additional index entry (1002 NUM_ROWS up from 1001). If we now look at a partial block dump of the index leaf block:
row#1[1819] flag: -----, lock: 2 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 d6 2e 00 00 col 2; len 6; (6): 02 02 d6 2e 00 07 col 3; len 2; (2): c8 ff row#2[1798] flag: -----, lock: 2 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 d6 2e 00 08 col 2; len 6; (6): 02 02 d6 2e 00 0f col 3; len 1; (1): 00 row#3[1994] flag: -----, lock: 0 col 0; len 2; (2): c1 03 col 1; len 6; (6): 02 02 cc 0a 00 00 col 2; len 6; (6): 02 02 d6 2b 00 9f col 3; len 2998; (2998): 00 c4 bd 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c2 eb 01 c5 bd 01 c3 eb 01 c6 bd 01 c4 eb 01 c3 eb 01 c6 bd 01 c4 eb 01 c7 bd
We notice we have yet another index entry (row#2) covering another narrow 8 row range.
Basically prior to 10g, if a new index value didn’t fit within an existing bitmap index rowid range for the corresponding indexed value, a new index entry is added and all the overheads associated with it, with a default range of just 8 rowids. This is an extremely costly process as the new index entry not only has to store the index value itself again, but additionally 2 rowids and corresponding length bytes and the necessary bitmap string column as well. Additionally, in all likelihood, the rows in the specific rowid range are quite likely to contain differing values in the index columns and so will in turn require new index entries. I’ve just inserted a whole bunch of CODE values of 1 in the above example. In a randomly inserted column value, this might be quite a rare event as each subsequent indexed value is quite likely to differ for each new consecutive row.
If we create a table and index first and then populate the table with the 1 million rows, the number of index entries and size of index itself becomes huge …
SQL> create table radiohead (id number, code number, name char(5)); Table created. SQL> create bitmap index radiohead_code_i on radiohead(code); Index created. SQL> begin 2 for i in 1..1000 loop 3 for j in 1..1000 loop 4 insert into radiohead values (j, j, 'BOWIE'); 5 commit; 6 end loop; 7 end loop; 8 end; 9 / PL/SQL procedure successfully completed. SQL> analyze index radiohead_code_i compute statistics; Index analyzed. SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='RADIOHEAD_CODE_I'; INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS ------------------------------ ---------- ----------- ---------- RADIOHEAD_CODE_I 2 5347 1000000
Because the clustering is so bad, each new row for a given index value isn’t within an existing 8 rowid window and so each and every row in the table requires its own bitmap index entry. As a result, the Bitmap index is huge at 5,347 leaf blocks and has a full1M NUM_ROWS. A partital block dump shows just how inefficient this index is:
row#0[4273] flag: -----, lock: 0 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 d8 8a 00 00 col 2; len 6; (6): 02 02 d8 8a 00 07 col 3; len 1; (1): 00 row#1[4294] flag: -----, lock: 0 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 d8 8c 00 e0 col 2; len 6; (6): 02 02 d8 8c 00 e7 col 3; len 1; (1): 01 row#2[4315] flag: -----, lock: 0 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 d8 8f 00 40 col 2; len 6; (6): 02 02 d8 8f 00 47 col 3; len 1; (1): 04 row#3[4336] flag: -----, lock: 0 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 d8 91 01 20 col 2; len 6; (6): 02 02 d8 91 01 27 col 3; len 1; (1): 05 row#4[4357] flag: -----, lock: 0 col 0; len 2; (2): c1 02 col 1; len 6; (6): 02 02 d8 94 00 88 col 2; len 6; (6): 02 02 d8 94 00 8f col 3; len 1; (1): 00
We can see that each and every indexed value (col 1) in this partial leaf block dump is the same (hex c1 02) with each only having a rowid range (col 1 and col 2) that covers the bare minimum 8 row range. It’s about as inefficient a Bitmap index as you can get with 1000 distinct values in a 1M row table.
If we now however rebuild the Bitmap index:
SQL> alter index radiohead_code_i rebuild compute statistics; Index altered. SQL> select index_name, blevel, leaf_blocks, num_rows from dba_indexes where index_name='RADIOHEAD_CODE_I'; INDEX_NAME BLEVEL LEAF_BLOCKS NUM_ROWS ------------------------------ ---------- ----------- ---------- RADIOHEAD_CODE_I 1 500 1000
We get back to our nice, efficient Bitmap Index structure with just the bare minimum 1000 index entries, all fitting into a fraction of the index leaf blocks (just 500 down from 5,347).
Fortunately, Oracle has got a lot more cleverer since 10g with regard to how it maintains its Bitmap Index structures during DML as I’ll cover soon in Part II.
Concatenated Bitmap Indexes Part II (Everybody’s Got To Learn Sometime) May 12, 2010
Posted by Richard Foote in Bitmap Indexes, CBO, Concatenated Indexes, Oracle Cost Based Optimizer, Oracle Indexes.2 comments
A basic little post to conclude this discussion.
The issues regarding whether to go for single column indexes vs. concatenated indexes are similar for Bitmap indexes as they are for B-Tree indexes.
It’s generally more efficient to access a concatenated index as it’s only the one index with less processing and less throwaway rowids/rows to contend with. However it’s more flexible to have single column indexes, especially for Bitmap indexes that are kinda designed to be used concurrently, as concatenated indexes are heavily dependant on the leading column being known in queries.
If we look at the second table from Part I which had the concatenated index being significantly larger than the sum of the single column indexes, we notice that it can still have a part to play with the CBO. When we run a query that references both columns in predicates:
SQL> select * from bowie2 where id = 42 and code = 42; 100 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 4165488265 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100 | 1200 | 21 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | BOWIE2 | 100 | 1200 | 21 (0)| 00:00:01 | | 2 | BITMAP CONVERSION TO ROWIDS| | | | | | |* 3 | BITMAP INDEX SINGLE VALUE | BOWIE2_3_I | | | | | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("ID"=42 AND "CODE"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 103 consistent gets 26 physical reads 0 redo size 3030 bytes sent via SQL*Net to client 482 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
The CBO favours the concatenated index with the total number of consistent gets at 103. This despite the fact the concatenated index has some 10,000 distinct entries and is somewhat larger than the sum of the single column indexes. If we now drop the concatenated index and re-run the same query:
SQL> drop index bowie2_3_i; Index dropped. SQL> select * from bowie2 where id = 42 and code = 42; 100 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2338088592 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100 | 1200 | 22 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | BOWIE2 | 100 | 1200 | 22 (0)| 00:00:01 | | 2 | BITMAP CONVERSION TO ROWIDS| | | | | | | 3 | BITMAP AND | | | | | | |* 4 | BITMAP INDEX SINGLE VALUE| BOWIE2_1_I | | | | | |* 5 | BITMAP INDEX SINGLE VALUE| BOWIE2_2_I | | | | | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("ID"=42) 5 - access("CODE"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 105 consistent gets 0 physical reads 0 redo size 3030 bytes sent via SQL*Net to client 482 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
The CBO can use a BITMAP AND operation by accessing and ANDing the associated bitmap columns from both single column indexes. However this is little less efficient than using the single concatenated index (105 vs 103 consistent gets) even though the concatenated index is somewhat larger than the other 2 indexes combined as Oracle needs to access and process two Bitmap index segments, not one. However as is very common, note in both examples, most of the consistent gets are in relation to getting the 100 rows out of the table, not so much with regard to the indexes themselves.
However, it we just reference the CODE column in a predicate:
SQL> select * from bowie2 where code = 42; 10000 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2522233487 ------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10000 | 117K| 489 (1)| 00:00:03 | | 1 | TABLE ACCESS BY INDEX ROWID | BOWIE2 | 10000 | 117K| 489 (1)| 00:00:03 | | 2 | BITMAP CONVERSION TO ROWIDS| | | | | | |* 3 | BITMAP INDEX SINGLE VALUE | BOWIE2_2_I | | | | | ------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("CODE"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 2861 consistent gets 0 physical reads 0 redo size 257130 bytes sent via SQL*Net to client 7742 bytes received via SQL*Net from client 668 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10000 rows processed
Providing it’s cheaper than other alternatives, the single column bitmap index can be considered and used by the CBO. However, if we only had the previous concatenated index:
SQL> drop index bowie2_1_i; Index dropped. SQL> drop index bowie2_2_i; Index dropped. SQL> create bitmap index bowie2_3_i on bowie2(id,code) pctfree 0; Index created. SQL> select * from bowie2 where code = 42; 10000 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1495904576 ---------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10000 | 117K| 497 (6)| 00:00:03 | |* 1 | TABLE ACCESS FULL| BOWIE2 | 10000 | 117K| 497 (6)| 00:00:03 | ---------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("CODE"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3011 consistent gets 2343 physical reads 0 redo size 165134 bytes sent via SQL*Net to client 7742 bytes received via SQL*Net from client 668 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10000 rows processed
As the leading column is not specified, the concatenated Bitmap index is ineffective and the CBO decides to use a FTS. So it’s a similar scenario as with B-tree indexes.
A concatenated Bitmap index can potentially use less or more space than corresponding single column Bitmap indexes, it depends on the number of index entries that are derived and the distribution of the data with the table. However regardless, a concatenated Bitmap index can still be a viable alternative if at least the leading column is specified and be the more efficient option if all columns are generally specified, even if the overall size of the index is somewhat greater than the sum of the alternative single column Bitmap indexes. Then again, it’s less flexible and may not be considered if the leading column is not referenced.
If columns are generally all specified in SQL predicates, then combining them all in a single concatenated Bitmap index is a viable option. It all depends. Understanding why it depends is of course important in making the correct decision with regard which way to go with Bitmap indexes …
Concatenated Bitmap Indexes Part I (Two Of Us) May 6, 2010
Posted by Richard Foote in Bitmap Indexes, Block Dumps, Clustering Factor, Concatenated Indexes, Oracle Indexes.6 comments
Although Bitmap Indexes are commonly created on one column, you can create multi-column, concatenated Bitmap indexes as well.
Many of the same issues and factors in deciding to create a single, multi-column index vs. several, single column indexes apply to Bitmap indexes as they do with B-Tree indexes, although there are a number of key differences to consider as well.
The first difference to note is that a bitmap index doesn’t have as many index entries as there are rows in the table (with not null values), as with B-Tree indexes. A Bitmap index can potentially just have only as many index entries as there are distinct values for the indexed columns. This is one of the main reasons why Bitmap indexes can be considerably smaller than equivalent B-Tree indexes. A newly created Bitmap index only needs to have multiple index entries for the same column value if the associated index entry is greater than 1/2 a block size. If an index entry were to be larger than 1/2 a block size, Oracle creates another Bitmap index “piece” for the same indexed value, with a bitmap column covering a different range of rowids. (Note: additional Bitmap index pieces can be created based on subsequent DML, especially in earlier versions of Oracle. To be discussed at a later point in time).
Another thing to note regarding a concatenated Bitmap index is that the potential number of index entries is a product of distinct combinations of data of the indexed columns. For example, if two columns have 100 distinct values each, then as separate Bitmap indexes, they potentially may have as few as 100 index entries each. However, when combined as a concatenated Bitmap index, there may be a minimum of just 100 index entries only if there are just 100 different combinations of data between the columns (ie. there is a 1 to 1 relationship between column values). If there are however say 100 x 100 = 10,000 maximum combinations of data, there will be 10,000 index entries as a minimum in the associated concatenated Bitmap index.
A key point though is that if there are many more combinations of data when stored in a concatenated index, the occurrence of each distinct value will be far less within the table, meaning there will be many more “0” (not true) values in the corresponding bitmap column for each index entry and so can be compressed more effectively and likely use substantially less space than a corresponding index entry in a single column Bitmap index.
A simple little example to illustrate. To start, I’m going to create a 1M row table that has an ID and a CODE Colum, each with 100 distinct values.
In this first example, there’s an implicit 1 to 1 relationship between these 2 columns (eg. they always have the same corresponding values) such that there’s also only 100 distinct combinations of ID and CODE. Additionally, the values are distributed evenly throughout the table so the effective clustering of the data in relation to the index is awful.
SQL> create table bowie as select mod(rownum,100)+1 id, mod(rownum,100)+1 code,'BOWIE' name from dual connect by level <= 1000000; Table created. SQL> create bitmap index bowie_1_i on bowie(id) pctfree 0; Index created. SQL> create bitmap index bowie_2_i on bowie(code) pctfree 0; Index created. SQL> create bitmap index bowie_3_i on bowie(id,code) pctfree 0; Index created.
If we look at the size and characteristics of these indexes we notice a couple of interesting points:
SQL> select index_name,leaf_blocks,num_rows from dba_indexes where index_name in ('BOWIE_1_I', 'BOWIE_2_I', 'BOWIE_3_I'); INDEX_NAME LEAF_BLOCKS NUM_ROWS ---------- ----------- ---------- BOWIE_1_I 200 400 BOWIE_2_I 200 400 BOWIE_3_I 200 400
Firstly, even though there are only 100 distinct values for each indexed column or columns, there are actually 400 index entries in these indexes. This means there are on average 4 Bitmap index pieces for each distinct indexed value. Oracle can’t fit the associated index entry for a specific value within 1/2 a block (in this example, the block size is 8k). In fact, it takes approximately two 8K index leaf blocks it fit all the index data for a specific value and it therefore requires 4 Bitmap index pieces per indexed value.
If we look at a partial block dump of a leaf block from say the BOWIE_1_I:
row#0[4089] flag: ------, lock: 0, len=3947 col 0; len 2; (2): c1 02 col 1; len 6; (6): 01 c2 b5 09 00 60 col 2; len 6; (6): 01 c2 b8 f3 00 3f col 3; len 3926; (3926): 01 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d ... row#1[142] flag: ------, lock: 0, len=3946 col 0; len 2; (2): c1 02 col 1; len 6; (6): 01 c2 b8 f3 00 a0 col 2; len 6; (6): 02 03 62 5d 00 1f col 3; len 3925; (3925): 01 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 5f c5 18 61 5d 61 c3 1b 5f 63 5f c1 ...
We notice there are actually 2 index entries within the block. Each index entry is for the same indexed value (both col 0 have identical values) but because the associated bitmap column (col 3) is so large (3926 and 3925 byes respectively), we need 4 index entries on average to store the bitmap data for each specific indexed value in order for each piece to not exceed the 1/2 block size limit.
Remember, for 100 distinct values in a 1M row table, that’s approximately 10,000 occurrences for each distinct value that need to somehow be mapped within the index. Oracle needs 4 index entries per value to fit the necessary bitmap information with the index such that no single index entry exceeds the 1/2 block size limit.
The next thing to note is that the size of the concatenated Bitmap index is actually about the same size of each of the individual single column Bitmap index (200 leaf blocks). Therefore, the overall storage required to store the two columns in one Bitmap is only half of that required to store the columns as separate Bitmap indexes.
If we look at a partial dump of the concatenated index:
row#0[4091] flag: ------, lock: 0, len=3945 col 0; len 2; (2): c1 02 col 1; len 2; (2): c1 02 col 2; len 6; (6): 01 c2 b5 09 00 60 col 3; len 6; (6): 01 c2 b8 f2 00 57 col 4; len 3921; (3921): 01 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f 63 c5 1b 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d 61 c3 1b 5f 63 5f c1 1c 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d 61 5d c7 1b 63 5f 63 c5 1b 61 5d ...
We notice the overall length of an index entry is practically identical to those of the single column index. The storage required to store the additional indexed column (col 1) within the index entry is just 3 byes in the above example, 2 bytes for the numeric index value and 1 byte for its length. Considering the length of the bitmap column (col 4) is in the order of 3920 bytes for each index entry, an additional 3 bytes here or there is trivial and so doesn’t impact the overall size of the Bitmap index.
OK, let’s look at a slightly different example. The table is again 1M rows with the overall data being similar. However, I’m making a few subtle differences. Firstly, the data for the ID is actually perfectly clustered and is ordered in exactly the same manner as the index. Additionally, the distribution of data between the columns is such that there are now 100 x 100 = 10,000 distinct combinations of ID and CODE values.
SQL> create table bowie2 (id number, code number, name char(5)); Table created. SQL> begin 2 for i in 1..100 loop 3 for x in 1..100 loop 4 for y in 1..100 loop 5 insert into bowie2 values (x, y, 'BOWIE'); 6 end loop; 7 end loop; 8 end loop; 9 commit; 10 end; 11 / PL/SQL procedure successfully completed. SQL> create bitmap index bowie2_1_i on bowie2(id) pctfree 0; Index created. SQL> create bitmap index bowie2_2_i on bowie2(code) pctfree 0; Index created. SQL> create bitmap index bowie2_3_i on bowie2(id,code) pctfree 0; Index created.
If we look at the size and characteristics of the these Bitmap indexes:
SQL> select index_name,leaf_blocks,num_rows from dba_indexes where index_name in ('BOWIE2_1_I', 'BOWIE2_2_I', 'BOWIE2_3_I'); INDEX_NAME LEAF_BLOCKS NUM_ROWS ---------- ----------- ---------- BOWIE2_1_I 25 100 BOWIE2_2_I 200 400 BOWIE2_3_I 417 10000
We see a number of key differences when compared to the Bitmap indexes in the first example. Firstly, the Bitmap index for the ID column is tiny, just 25 leaf blocks compared to the 200 leaf blocks required previously. Additionally, there are only 100 index entries, rather than the 400 previous index entries. This is due to the fact the data is perfectly clustered within the table and as such, all the “1”s (true) are all grouped together and all the “0”s (false) are likewise grouped together and can be compressed very efficiently. The overall size of the bitmap has now reduced such that it can all fit comfortably within 1/2 a block and so just the 1 index entry is necessary for each indexed value.
If we look at a partial block dump of the ID Bitmap index:
row#0[6218] flag: ------, lock: 0, len=1818 col 0; len 2; (2): c1 02 col 1; len 6; (6): 01 c2 c6 16 00 d8 col 2; len 6; (6): 02 03 78 18 01 3f col 3; len 1797; (1797): cf f0 ff ff ff ff ff ff ff cc ff ff ff ff ff fc de 10 80 ff ff ff 07 ff 21 ff ff ff ff ff ff ff ff c8 ff ff de 10 80 ff ff ff ff ff ff ff cd ff ff ff ff ff 07 ff de 10 fc ff ff ff ff ff ff ff cc ff ff ff ff 07 f8 21 07 ff de 10 fc ff ff ff ff ff ff ff cc ff ff ff ff 3f ff fd 19 c0 ff ff ff ff ff ff ff cd ff ff ff ff ff 03 ff de 10 fe ff ff ff ff ff ff ff cc ff ff ff ff 1f ...
We see that the bitmap column (col 3) is now only 1797 bytes and the overall index entry is 1818 bytes, which comfortably fits within 1/2 an 8K block.
The Bitmap index for the CODE remains unchanged because its values are still poorly clustered and distributed throughout the entire table.
The concatenated Bitmap index is now significantly larger than is was previously (417 leaf blocks, up from 200 leaf blocks), primarily because we now have 10,000 distinct values to deal with rather than just 100. However, as the occurrences of each of these 10,000 distinct values is so much rarer (only 100 occurrences per distinct combination of values), the associated bitmap column is going to be relatively small and well compressed for each Bitmap index entry.
If we look at a complete index entry from a partial block dump of the concatenated Bitmap index:
row#0[7710] flag: ------, lock: 0, len=326 col 0; len 2; (2): c1 02 col 1; len 2; (2): c1 02 col 2; len 6; (6): 01 c2 c6 16 00 d8 col 3; len 6; (6): 02 03 78 18 00 d7 col 4; len 302; (302): 04 c7 d8 10 c7 80 11 c2 d9 10 c2 80 11 c6 f7 19 c1 d9 10 c4 d8 10 c4 80 11 c7 d8 10 c3 f8 19 c3 80 11 c6 d8 10 c1 d9 10 c1 80 11 c5 f7 19 c0 d9 10 c0 80 11 c3 d8 10 c3 80 11 c2 d0 19 c2 80 11 c5 d8 10 c5 80 11 c0 d9 10 c4 f7 19 c7 d8 10 c7 80 11 c2 d9 10 c2 80 11 c6 f7 19 c1 d9 10 c1 80 11 c4 d8 10 c7 d8 10 c0 87 09 c3 d8 10 c3 80 11 c6 d8 10 c6 80 11 c1 d9 10 c5 de 08 c5 80 11 c0 d9 10 c0 80 11 c3 d8 10 c3 80 11 c0 a8 f4 ec bb 01 c3 d8 10 c6 d8 10 c6 80 11 c1 d9 10 c1 80 11 c5 de e4 08 c5 80 11 c0 d9 10 c3 d8 10 c3 80 11 c6 d8 10 c7 86 09 c2 d9 10 c2 80 11 c5 d8 10 c0 d9 10 c0 80 11 c4 de 08 c4 80 11 c7 d8 10 c7 80 11 c2 d9 10 c5 d8 10 c6 86 09 c1 d9 10 c1 80 11 c4 d8 10 c4 80 11 c7 d8 10 c0 87 09 c3 d8 10 c6 d8 10 c6 80 11 c1 d9 10 c1 80 11 c5 de 08 c5 80 11 c0 d9 10 c3 d8 10 c3 80 11 c6 d8 10 c7 86 09 c2 d9 10 c2 80 11 c5 d8 10 c0 d9 10 c4 f7 19 c4 80 11 c7 d8 10 c7 80 11 c2 d9 10 c6 f7 19
We notice the index entries are relatively small at just 326 bytes even though we have 2 indexed columns (col 0 and col 1). The size of the bitmap column (col 4) is just 302 bytes, only a fraction of that of the single column Bitmap indexes. With so few 1s (true) to contend with, the resultant bitmap is far more efficiently compressed than with the single Bitmap column indexes as it has far more 0s (false) than can be effectively compressed.
A concatenated Bitmap index can potentially use less or more space than corresponding single column indexes, it depends on the number of index entries that are derived and the distribution of the data with the table.
I’ll post Part II in the next few days.