jump to navigation

Bitmap Indexes & Minimize Records_Per_Block (Little Wonder) July 19, 2011

Posted by Richard Foote in Bitmap Indexes, MINIMIZE RECORDS_PER_BLOCK, Oracle Indexes.
Tags:
trackback

As mentioned in my first post regarding the use of Bitmap Indexes to service Not Equal predicates, an index entry in a Bitmap Index consists of the indexed value and a pair of rowids that determine the range of potential rows spanned by the corresponding bitmap sequence. However, Oracle has no way of determining how many rows might actually be in each specific data block and so must make an assumption that all blocks might hold the maximum number of rows that could potentially fit within a block and assign a bitmap bit accordingly. If a row doesn’t actually exist, then it’s simply a “phantom” row and is assigned a 0 to signify that it doesn’t contain the value of the index entry.
 
This maximum number of possible rows that could potentially fit in a block is called the “Hakan Factor” and is determined at the creation of the table based on the definition of the table (such as number of columns, type of columns, whether they’re nullable, etc.) and of course the block size. The smaller the possible size of the row, the more rows that could fit in a block and the more bits that need to be assigned by the Bitmap Index to cover all possible rowids within the rowid range within a Bitmap Index entry. As an example within an 8K block, taking into consideration block overheads, the maximum number of rows within a block that Oracle can potentially estimate can be as many as 736 rows.
 
These additional 0s that get assigned to cater for rows that might not actually exist, although compressed to some degree, still takes up some space within the index. This additional space can be very significant if:
 
– The difference between the minimum possible size of a row and the actual average size of a row is large (or to put it another way, if the difference between the estimated number of rows per blocks and the actual number of rows per block is large)
– The effective clustering of the indexed data is poor within the table as this will limit the effective compression of the additional 0 bits
 
To highlight how all this can make a significant difference to the size of a Bitmap Index, a simple demo as usual to illustrate.
 
First, I’m going to create a table that has a number of nullable VARCHAR2(100) fields, so they might contain up to 100 characters or perhaps no value at all. The potential size of a row might be tiny or it might be quite large.

 
SQL> create table muse (id number, code number, name1 varchar2(100), name2 varchar2(100), name3 varchar2(100), name4 varchar2(100), name5 varchar2(100), name6 varchar2(100), name7 varchar2(100), name8 varchar2(100), name9 varchar2(100), name10 varchar2(100));
 
Table created.

 
OK, time to populate the table. A couple of key points with the data I’m going to use.
 
Firstly, the CODE column is going to have 100 distinct values but these values will be evenly distributed throughout the entire table. So the clustering associated with this column will be terrible, as bad as it gets.
 
Secondly, although all the VARCHAR2(100) columns might not contain much data (or indeed any at all), in actual fact they’re going to be almost fully populated with data. So although the potential average size of a row could have been quite tiny, in actual fact all the rows are quite large. Although we could potentially have fitted many rows within our (8K) block, in actual fact we’re only going to be able to fit just 7 rows per block. There isn’t actually a single block within our table that contains more than 7 rows.

 
SQL> insert into muse select rownum, mod(rownum,100), 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia','Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia', 'Black Holes and Revelations is a really good album featuring my favourite track Knights Of Cydonia' from dual connect by level <= 1000000;
 
1000000 rows created.
 
SQL> commit;
 
Commit complete.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'MUSE', estimate_percent=>null, cascade=>true, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> select table_name, num_rows, blocks, avg_row_len from dba_tables where table_name='MUSE';
 
TABLE_NAME   NUM_ROWS     BLOCKS AVG_ROW_LEN
---------- ---------- ---------- -----------
MUSE          1000000     145549         998

 
 
Let’s now create a Bitmap Index on the CODE column. I’ll set the PCTFREE to 0 to build the smallest possible index structure:

 
SQL> create bitmap index muse_code_i on muse(code) pctfree 0;
 
Index created.
 
SQL> select index_name, leaf_blocks, avg_leaf_blocks_per_key, num_rows from dba_indexes where index_name = 'MUSE_CODE_I';
 
INDEX_NAME  LEAF_BLOCKS AVG_LEAF_BLOCKS_PER_KEY   NUM_ROWS
----------- ----------- ----------------------- ----------
MUSE_CODE_I         400                       4        800

 
 
So the Bitmap Index currently consists of 400 leaf blocks.
 
As we now know this table has rows that on average are considerably larger than the minimum possible row size, the Bitmap Index has had to cater for the possible existence of many rows that don’t actually exist. Additionally, as the clustering of the indexed data is very poor, the Bitmap Index will not be able to effectively compress these additional 0 bits as much as it might, as there are bits set to 1 littered all over the place that will hamper the effective compression capabilities of the index (I’ve discuss the impact of the Clustering Factor on the effectiveness of Bitmap Index compression previously).
 
Therefore, it might well be beneficial to more accurately determine the number of rows that really exist within a block. We can change the Hakan Factor by altering the table with the MINIMIZE RECORDS_PER_BLOCK clause. Effectively this results in Oracle performing a full table scan, checking for the number of rows per block (a quick check of the nrow count in the block header suffices) and keeping track of the block that currently contains the most number of rows. The highest value of the nrow count within the table becomes the new Hakan Factor.
 
Let’s give it a go:

 
SQL> alter table muse minimize records_per_block;
alter table muse minimize records_per_block
*
ERROR at line 1:
ORA-28602: statement not permitted on tables containing bitmap indexes

 
Unfortunately, this statement is not permitted if there are already any Bitmap indexes assigned to the table as they have already been based on the current Hakan Factor. All current Bitmap Indexes assigned to the table must first be dropped.

 
SQL> drop index muse_code_i;
 
Index dropped.
 
SQL> alter table muse minimize records_per_block;
 
Table altered.

 
 
OK, so now Oracle has a much more accurate picture of the actual number of rows that exist within a block in this table. The new Hakan Factor is based on the maximum number of rows that actually currently exist within a block in the table (just 7 in this specific example), which is significantly less than was defined previously. Oracle ensures the integrity of the new Hakan Factor from here on in by now limiting the number of rows that can be inserted into blocks within the table to this new value, even if in the future additional rows could potentially have fitted within a block. Once the Hakan Factor is reached, the block is taken off the freelist or marked as full in an ASSM segment.
 
Now any Bitmap Index on this table only has to cater for a relatively small number of rows per block, vastly reducing the number of bits that need to be considered and stored.
 
This can significantly reduce the overall size of associated bitmap indexes:
 

 
SQL> create bitmap index muse_code_i on muse(code) pctfree 0;
 
Index created.
 
SQL> select index_name, leaf_blocks, avg_leaf_blocks_per_key, num_rows from dba_indexes where index_name = 'MUSE_CODE_I';
 
INDEX_NAME  LEAF_BLOCKS AVG_LEAF_BLOCKS_PER_KEY   NUM_ROWS
----------- ----------- ----------------------- ----------
MUSE_CODE_I         150                       1        300

The new Bitmap Index is now only 150 leaf blocks in size, substantially smaller than the previous 400 leaf blocks.

Comments»

1. David Aldridge - July 20, 2011

Of course there used to be a nasty issue with performing partition exchanges when minimize records_per_block had been invoked, as the tables would often have different hakan factors and hence entries per block in their indexes. I’ve been mildly allergic to it ever since.

Like

Richard Foote - July 27, 2011

Hi David

Ah nasty issues don’t you just love them !! Thanks for the heads-up with exchanging partitions.

Like

2. Martin Preiss - July 20, 2011

Richard,
in a recent post in his blog Randolf Geist used “minimize records_per_block” and mentioned that the Hakan Factor is shown in the SPARE1 column of SYS.TAB$ (http://oracle-randolf.blogspot.com/2011/07/logical-io-evolution-part-1-baseline.html). Minimize records_per_block sets the 0x8000 bit in SPARE1 – so the Integer value is incremented by 32768. I made some simple tests and the results were as expected – with one interesting exception:

-- test_minimize_records_per_block.sql
create table t&rowcount 
tablespace test_ts
as
select rownum id
  from dual
connect by level <= &rowcount;

alter table T&rowcount minimize records_per_block;

truncate table T&rowcount;

insert into T&rowcount
select rownum id
  from dual
connect by level <= 10000;

exec dbms_stats.gather_table_stats(user, 'T&rowcount', estimate_percent=>100)

-- results in sys.tab$ with different rowcount values
INITIAL_ROWS     ROWCNT     BLKCNT     SPARE1
------------ ---------- ---------- ----------
           1      10000       5001      32769
           2      10000       5001      32769
           3      10000       3337      32770
           4      10000       2500      32771
           5      10000       2003      32772
          10      10000       1004      32777
         100      10000        103      32867

For the case with a single initial row in the test table I get SPARE1 = 32769 and two rows per block. I was not able to get one row by block with minimize records_per_block (except by using bigger rows of a size > blocksize/2).

Regards

Martin

Like

Richard Foote - July 27, 2011

Hi Martin

Thank-you, I forgot to mention the TAB$ table for looking up the Hakan Factor. Not sure why the 2 rows minimum, it’s not something I’ve previously noticed. Thanks for the heads-up 🙂

Like

3. Chagan Lal - November 30, 2011

Hi Richard,
Thanks for this imformative blog. We are facing the issue of ORA-28604: table too fragmented to build bitmap index when doing a datapump import of a schema for a particular table. Any reason why this would happen on an empty schema? Is because there is an issue on the source table?
Thanks,
Chagan

Like

Richard Foote - December 19, 2011

Hi Changan

This can happen when you have a whole bunch of deleted rows that haven’t been cleaned out properly (as deleted entries can’t be reused during the transaction that peformed the deletes). Jonathan Lewis (as usual) has some interesting things to say on the subject: http://jonathanlewis.wordpress.com/2009/05/21/row-directory/

Like

4. Spreading Out Data With MINIMIZE RECORDS_PER_BLOCK | ORAganism - October 8, 2014

[…] in the official documentation but there is a very nice example on Richard Foote’s blog titled “Bitmap Indexes & Minimize Records_Per_Block (Little Wonder)”, and he works for Oracle so perhaps it counts as official […]

Like

5. Spreading Out Data With MINIMIZE RECORDS_PER_BLOCK - Oracle - Oracle - Toad World - October 8, 2014

[…] in the official documentation but there is a very nice example on Richard Foote’s blog titled “Bitmap Indexes & Minimize Records_Per_Block (Little Wonder)”, and he works for Oracle so perhaps it counts as official […]

Like


Leave a comment