Fragmented Indexes Due To Large Number Of Duplicate Entries (More) October 28, 2008Posted by Richard Foote in Fragmented Indexes, Oracle Indexes.
Many think an index can only be “fragmented” if you have a high number of delete or update operations on the index. An index that only has inserts can’t really be fragmented as no space is “wasted” due to delete (or update) related operations, right ?
I use the term “fragmented” here in the context of having free and available space within the index structure which can not actually be used by Oracle for subsequent insert operations. The space is effectively wasted and can only usually be cleaned up after an index maintenance operation such as a rebuild or coalesce.
However, there are a number of scenarios where an index can become fragmented, despite the lack of associated delete or update operations.
One example is where a non-unique index stores large number of duplicate values, indeed so many duplicates of a specific value that it actually requires many index leaf blocks to store all occurrences of each specific value.
The first point to make here is that with Oracle B-Tree indexes, there is actually no such thing as a non-unique index. All indexes, even those defined as non-unique, are actually unique indexes “behind the covers”.
For the simple reason that if we were to (say) delete a specific row in a table, Oracle would need to delete the associated index entry. If there were (say) 1000 occurrences of the indexed value, Oracle can’t simply delete any of the 1000 index entries, it must delete the specific index entry that has the corresponding rowid of the row being deleted.
Therefore, in order for Oracle to quickly find this associated index entry and be able to immediately find the related index leaf block of interest, all the index entries within the same index value are sorted by the rowid. If this wasn’t the case, Oracle would be forced to access the first leaf block that contains an index value of interest and be forced to search through all the duplicate index entries until it finds the specific one Oracle is after, perhaps reading many index leaf blocks in the process.
Therefore, for a specific indexed value, all index entries are sorted by the corresponding rowid. Oracle physically implements this by making the rowid an additional index column value within the index.
A unique index entry basically looks like this, courtesey of a block dump:
row#0 flag: ——, lock: 2, len=20, data:(6): 01 42 1c 7a 00 00
col 0; len 11; (11): 44 41 56 49 44 20 42 4f 57 49 45
Note: the 6 byte rowid is not stored as an index column entry (the index entry only has the one column, starting with 0).
However, a corresponding non-unique index entry look like this:
row#0 flag: ——, lock: 0, len=21
col 0; len 11; (11): 44 41 56 49 44 20 42 4f 57 49 45
col 1; len 6; (6): 01 42 1c 7a 00 00
Note: the rowid is treated and stored as additional index column within the index entry, and hence requires an additional byte per index entry to store the column length value. The rowid is now a second column (number 1, remembering columns start with 0) in the index.
This effectively makes each and every index entry unique, even for non-unique indexes, as the combination of index value and rowid must be unique for a given table.
Now when a new index entry is inserted into an index, Oracle is very specific where the index entry is housed, it must be housed in the index leaf block such that the order of the rowids within duplicate index entries is maintained. The rowid is basically just another column within the index and like all index columns, it must be stored and sorted in the index entry order.
Now here’s a funny thing with rowids. When new rows are added to a table, the rowids generally increase in value. Although this is not always the case and there are exceptions, this is generally the trend with most rowids within most tables.
A (restricted) rowid consists of a relative file id as the leading portion of the rowid. As a tablespace grows and new files are added to the tablespace, generally the subsequent relative files ids increase and so the subsequent rowid increases as well.
The next portion of the rowid is the block id. As a datafile within a tablespace fills up and allocates new extents to a table, the block ids used within the new extent increase and so the rowids increase in value as well. Also as new blocks get used within an extent and the high water mark of a table increases, the rowids increase as well.
Yes there are always exceptions. For example, a newly allocated extent could use space previously allocated to a dropped segment and so use block ids in a range lower than those previously allocated. But they’re the exception rather than the rule. Even with ASSM tablespaces and segments with multiple freelists/freelist groups where multiple blocks can be inserted into concurrently, the trend is still for the associated rowids to generally increase over time.
Now getting back to our index with lots of duplicate index values. What happens when a block is filled and a 50-50 block split eventuates ? 1/2 the entries go into one block and 1/2 the entries go into the newly allocated block and the new index entry goes where ? Well, if the block is full of nothing but the same index entry because we have lots of duplicate values and the new index entry is associated with the maximum rowid for the specific index value, it goes into the newly allocated leaf block because remember, all index entries must maintain their logical sorted order.
In fact, if most/all of the subsequent inserts of the duplicate value are also associated with the maximum rowid currently assigned to the index entry, all the subsequent inserts now go into the newly allocated block. The 50% of free space remaining in the previous leaf block doesn’t get used.
Once the newly allocated block also gets filled, again with the same duplicate index values, a 50-50 block split occurs and again, most if not all the subsequent inserts go into the newly allocated leaf block. We now have two, 1/2 empty leaf blocks that has space that can not effectively not be reused because they are filled the same index value but with rowid values less than those being currently allocated.
And so the cycle continues with these duplicate index entries and any other duplicate index values that span multiple leaf blocks …
This can easily be illustrated with the following example. First, create a simple table and index, and initially insert a row with a large index entry to prevent subsequent 90-10 block splits from occurring:
SQL> CREATE TABLE common_values (id NUMBER, common VARCHAR2(10));
SQL> CREATE INDEX common_values_i ON common_values(common);
SQL> INSERT INTO common_values VALUES (1, ‘ZZZ’);
1 row created.
Next, populate the table with lots of duplicate values:
2 FOR i IN 1..90000 LOOP
4 WHEN mod(i,3) = 0 THEN INSERT INTO common_values VALUES (i, ‘AAA’);
5 WHEN mod(i,3) = 1 THEN INSERT INTO common_values VALUES (i, ‘BBB’);
6 WHEN mod(i,3) = 2 THEN INSERT INTO common_values VALUES (i, ‘CCC’);
7 END CASE;
8 END LOOP;
SQL> ANALYZE INDEX common_values_i VALIDATE STRUCTURE;
Now lets see how much of the allocated index space is actually used:
SQL> SELECT BTREE_SPACE, USED_SPACE, PCT_USED FROM INDEX_STATS;
BTREE_SPACE USED_SPACE PCT_USED
———– ———- ——–
2648032 1355551 52
As we can see, only 52% of the index is actually being used, a value much less than would be expected of most randomly inserted indexes.
Now of course if we had an index with few distinct values, it’s unlikely to be considered by the CBO. However, it’s not uncommon to index column values that have uneven distribution of values, with the index being used to retrieve data for those columns with relatively few occurances of specific values. In these scenarios, it’s possible to have a large portion of the index with poor index space utilisation due to this issue. It’s unlikely to impact performance because the poorly fragmented portion of the index is not usually used, but it’s something to consider if you wish to reclaim wasted index storage or you do indeed have Fast Full Index Scan operations that need addressing.