jump to navigation

Updates and Indexes Part I (Fashion) January 17, 2009

Posted by Richard Foote in Index Delete Operations, Oracle Indexes, Oracle Myths, Update Indexes.
27 comments

Let’s start with a few basic concepts.

Simplistically, Oracle doesn’t generally care where or in what order data is stored in heap tables, unless the table is partitioned or clustered. Therefore, when an update operation is performed on a column value, Oracle can basically make the change “in place” within the table block. If the value of a name column changes from say “BOWIE” to a new value of “FOOTE”, Oracle can simply just make the necessary change to the specific table block. Only if the new value is larger and there’s not sufficient space within the current table block for the new value does the picture get a little more complicated, with Oracle having to migrate the row.

However, the story isn’t quite so simple for corresponding changes to index column values. The key difference (no pun intended) is that unlike most heap tables, the location and order of the data within the index is very much an issue. All index entries must be and must always be in the order of the indexed columns. Otherwise, it would be impossible for Oracle to efficiently navigate down the index structure to find within the index all the  necessary indexed values required for an index range scan. Imagine for one moment how difficult it would be to find someones phone details if the telephone book wasn’t ordered on names or to use an index in the back of a reference book if the index wasn’t ordered.

The order of an index is crucial and must always be in the order of the indexed column values. Therefore, if a specific indexed value of  say “BOWIE” is updated to a new value of “FOOTE”, Oracle can’t simply make the change “in place” within the leaf block as this would result in the index order no longer being maintained. The new value of “FOOTE” would be surrounded by index values starting with “B” and there would be no easy way to subsequently find it within the index. It would likely not even be in the same index leaf block as those other index entries beginning with “F” where the new value should actually reside.

Therefore, Oracle doesn’t actually “update” an index value. The old index entry is marked as deleted and the a new index entry is inserted in the appropriate location within the index to ensure the index order is maintained. So the update of an index value is effectively a delete operation followed by an insert.

Simple example to demonstrate. Let’s first create a little table and index.

SQL> create table test_update (id number, name varchar2(10));

Table created.

SQL> create index test_update_idx on test_update (name);

Index created.

Now, let’s insert one row.

SQL> insert into test_update values (1, ‘BOWIE’);

1 row created.

SQL> commit;

Commit complete.

Let’s have a quick look at a block dump of our little index. As the index only contains the one index entry, the index will only consist of a single leaf block. This specific block is located next to the segment header of the index.

SQL> select header_file, header_block from dba_segments where segment_name = ‘TEST_UPDATE_IDX’;

HEADER_FILE HEADER_BLOCK
----------- ------------
          7       118537

We just need to add 1 to the header_block to get us the specific block id we need to dump.

SQL> alter system dump datafile 7 block 118538;

System altered.

Following is the portion of interest from the index block dump:

Leaf block dump
===============
header address 425361500=0x195a805c
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 1
kdxcofbo 38=0x26
kdxcofeo 8021=0x1f55
kdxcoavs 7983
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8021] flag: ——, lock: 2, len=15
col 0; len 5; (5):  42 4f 57 49 45
col 1; len 6; (6):  01 c1 ce 8a 00 00

—– end of leaf block dump —–

We note for now that the index indeed just has the one index entry (with a row# 0).

Let’s now update this indexed column and see what impact this has in the index.

SQL> update test_update set name = ‘ZIGGY’ where id = 1;

1 row updated.

SQL> commit;

Commit complete.

SQL> alter system dump datafile 7 block 118538;

System altered.

New block dump follows:

Leaf block dump
===============
header address 425361500=0x195a805c
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0x28
kdxcofeo 8006=0x1f46
kdxcoavs 7966
kdxlespl 0
kdxlende 1
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8021] flag: —D–, lock: 2, len=15
col 0; len 5; (5):  42 4f 57 49 45
col 1; len 6; (6):  01 c1 ce 8a 00 00
row#1[8006] flag: ——, lock: 2, len=15
col 0; len 5; (5):  5a 49 47 47 59
col 1; len 6; (6):  01 c1 ce 8a 00 00
—– end of leaf block dump —–

We notice of few interesting things. Firstly, even though we’ve only ever inserted the one row in the table, the index now actually has two index entries (note the index row count kdxconro value has increased from 1 to 2 and we can see two actual index entries, row#0 and row#1).

Another interesting point is that the initial index entry has been marked as deleted (via the “D” flag highlighted in blue) and that the deleted row count kdxlende value has gone up to 1.

So yes indeed, an update index operation actually consists of a delete operation followed by an insert, even if the new index value were to reside in the same index leaf block as the previous value. Note the deleted index entry isn’t actually physically deleted, it’s only marked as deleted and so continues to consume space within the index leaf block.

If we look at the index_stats for this index:

SQL> analyze index test_update_idx validate structure;

Index analyzed.

SQL> select del_lf_rows from index_stats …

DEL_LF_ROWS
-----------
          1

We notice that we do indeed have a deleted index entry listed in the statistics.

Now it’s at this point of the story when some people go “Aaah haa !! So if we have lots of updates, we effectively have lots deletes and we therefore have lots of deleted space that wastes space within the index. Therefore we would need to periodically rebuild such an index as the deleted space will just continue to grow, making the index larger and less efficient over time”.

Not necessarily.

The point that many don’t quite understand is that this “wasted” delete space can be subsequently reused by Oracle. It’s just free space that in the vast majority of indexes is automatically cleaned out and reused by Oracle.

To illustrate, let’s now insert a second row into this table. Notice the new row has an indexed value that is completely different to the previous values but because we still have available space in our current index leaf block, Oracle will simply insert the new value in this leaf block as well.

SQL> insert into test_update values (2, ‘PINK FLOYD’);

1 row created.

SQL> commit;

Commit complete.

SQL> alter system dump datafile 7 block 118538;

System altered.

Let’s have a look at our block dump now …

Leaf block dump
===============
header address 425361500=0x195a805c
kdxcolev 0
KDXCOLEV Flags = – – –
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=— is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 2
kdxcofbo 40=0x28
kdxcofeo 7986=0x1f32
kdxcoavs 7961
kdxlespl 0
kdxlende 0
kdxlenxt 0=0x0
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[7986] flag: ——, lock: 2, len=20
col 0; len 10; (10):  50 49 4e 4b 20 46 4c 4f 59 44
col 1; len 6; (6):  01 c1 ce 8a 00 01
row#1[8006] flag: ——, lock: 0, len=15
col 0; len 5; (5):  5a 49 47 47 59
col 1; len 6; (6):  01 c1 ce 8a 00 00
—– end of leaf block dump —–

Now isn’t that interesting. The previously marked deleted index entry has disappeared. It’s been automatically cleaned out by Oracle and the deleted row count value kdxlende is now back to 0

We only have an index row count  kdxconro value of 2 for the two actual index entries corresponding to the two rows in the table. The previously deleted index entry is no longer recorded or stored in any way within the index.

When the new index entry was inserted, Oracle basically needed to reorganise the index leaf block to accommodate for the new index entry and automatically cleaned out any deleted index entries there may have been in the leaf block in the process. That’s all it takes to clean out deleted index entries from an index leaf block, just one subsequent insert in the leaf block.

If we now look at the index_stats …

SQL> analyze index test_update_idx validate structure;

Index analyzed.

SQL> select del_lf_rows from index_stats;

DEL_LF_ROWS
-----------
          0

We note that indeed, no deleted index entries are now stored within the index. Periodically rebuilding such an index may not be necessary after all …

Regular readers will recall I’ve previously discussed how deleted index entries are generally cleaned out by Oracle automatically. Sometimes, it’s worth hammering the point a few times 😉

More on the topic of update operations and indexes to come …

How To Rebuild And Make An Index Bigger, Not Smaller (Carry That Weight) January 13, 2009

Posted by Richard Foote in Index Shrink, Oracle Myths.
25 comments

I sometimes hear suggestions along the lines of:

 “when you rebuild an index, at least you make the index as small and efficient as possible, even if it doesn’t necessarily improve performance”

or

“when you rebuild an index, at least you’ll always save some space and storage if nothing else”.

However, this of course is not necessarily the case. There are many scenarios where by rebuilding an index, you can actually make the index bigger, not smaller, you can make the index use more storage, not less storage, you can make the index less efficient, not more efficient and hence you can make the exercise worse than useless, not just useless.

It all depends.

Here are just a few little examples whereby rebuilding the index has resulted in a larger index than it was before the index was rebuilt. Note in all of these scenarios, I’ll be sticking with the default PCTFREE value of 10%, which is by far the most commonly set PCTFREE value of every Oracle index out there in existence.

The first example is very simple and so very common. It’s simply an index on a monotonically increasing value, as used by many Primary Keys out there in the “real world”. 

First we create the table and associated index. Nice and simple …

SQL> create table bowie (id number);

Table created.

SQL> create index bowie_i on bowie(id);

Index created.

We now populate the index with a whole bunch of monotonically increasing values …

SQL> insert into bowie select rownum from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete

Let’s see how big our index might now be …

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2048    1999    16024128      100

Note the index has a “perfect” PCT_USED value of 100%, you can’t get any better than that !! Why ? Because as the index is on a monotonically increasing value, all the inserts take place on the “right hand side” of the index and Oracle performs it’s 90-10 index block split operation, rather than the 50-50 block split. 

If we try and rebuild such an index …

SQL> alter index bowie_i rebuild;

Index altered.

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2304    2226    17848160       90

We notice that the index is now actually bigger, not smaller. Not only that, but we have now introduced 10% free space that will likely not be reused.

 

Another example.

This time we build a similar table but this time populate it with randomly generated numbers, not monotonically increasing values.

SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number);

Table created.

SQL> insert into bowie select ceil(dbms_random.value(0,100000)) from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

Let’s now create an index on this numeric column …

SQL> create index bowie_i on bowie(id);

Index created.

Let’s now add more random numbers to the table …

SQL> insert into bowie select ceil(dbms_random.value(0,100000)) from dual connect by level <=50000;

50000 rows created.

SQL> commit;

Commit complete.

OK, let’s check out the index …

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2304    2214    17760192       95

OK, the index has a PCT_USED value of 95%.

However, after the rebuild …

SQL> alter index bowie_i rebuild;

Index altered.

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2432    2324    18640192       90

The index is now bigger, with more leaf blocks and a worse PCT_USED value.

If we rebuild an index before the index has had the opportunity to use its available free space, an index rebuild can introduce more free space, not less. Now this may not necessarily be a bad thing, but it’s another example of the index potentially being bigger, not smaller after a rebuild.

 

Here’s yet another example.

Similar to the previous example, but this time we’ll throw in a whole bunch of update statements as well. As we know, an update statement within an index is effectively a delete statement, followed by an insert statement. All that “wasted” deleted space should be nicely cleaned up after an index rebuild, resulting in a smaller index structure. Or so we might think …

First, create and populate the table with randomly generated numbers similar to before …

SQL> drop table bowie;

Table dropped.

SQL> create table bowie (id number, value number);

Table created.

SQL> insert into bowie select rownum, ceil(dbms_random.value(0,100000)) from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

Now let’s create the index and aftewards insert a bunch of new rows as before.

SQL> create index bowie_i on bowie(value);

Index created.

SQL> insert into bowie select rownum+1000000, ceil(dbms_random.value(0,100000)) from dual connect by level <=50000;

50000 rows created.

SQL> commit;

Commit complete.

Now let’s run a little procedure that will update roughly 10% of all the data (a  reasonable overall percentage), with new random numbers.

SQL> begin
  2  for i in 1..105000 loop
  3    update bowie set value = ceil(dbms_random.value(0,100000))
  4    where id = i;
  5    commit;
  6  end loop;
  7  end;
  8  /

PL/SQL procedure successfully completed.

OK, let’s now look at the index, wasted space and all …

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2304    2234    17952320       94

Interestingly, even with all the random updates (and associated deletes), the index is relatively efficient at 2234 leaf blocks.

Let’s see how many of the 105,000 updated entries have remained deleted:

SQL> select del_lf_rows from index_stats;

DEL_LF_ROWS
-----------
       1129

Not many at all, just 1129 deleted entries remain, less than 1% !!

In fact, the vast majority of the deleted entries have all been automatically cleaned out by Oracle. Much more on the myth that an index with lots of update activity needs to be rebuilt to come in the near future …

Let’s rebuild this index now.

SQL> alter index bowie_i rebuild;

Index altered.

SQL> analyze index bowie_i validate structure;

Index analyzed.

SQL> select blocks, lf_blks, btree_space, pct_used from index_stats;

BLOCKS LF_BLKS BTREE_SPACE PCT_USED
------ ------- ----------- --------
  2432    2324    18640192       90

The index is again bigger than it was previously. Even after we’ve added in a bunch of rows and updated 10% of the entire table.

The point here of course is that even if an index is heavily updated, the deleted space can generally be subsequently reused. Like I said, more on all this in a future post.

For now however, just note that rebuilding an index doesn’t necessarily mean the resultant index will now be smaller or more efficient or more pristine. In many cases, rebuilding an index blindly can result in a larger index structure than it was prior to the rebuild.

Bear this in mind the next time you read or hear someone suggest an index rebuild will always ensure the new index is smaller and as efficient as possible …

Possible To Shrink A Newly Created Index ? (Just) January 5, 2009

Posted by Richard Foote in Shrink Index.
8 comments

OK, first post of the year !!

I thought I might answer a recent question by Alon Principle on my post Index Rebuild vs. Coalesce vs. Shrink Space as it’s an interesting one which deserves a bit of a discussion.

Basically the question is why right after an index has been created has the size of the index reduced after a subsequent Shrink Space command ? Shouldn’t the index be as efficient as possible after it’s been created and yet the index has indeed shrunk with the number of blocks and the overall size of the index reducing ?

Good question !!

Now I was initially a little confused as indeed, after an index has been created, it’s as efficient as the the value of PCTFREE allows. As I discuss in the above mentioned post, a Shrink Space command will reorganise the index in a similar manner to the coalesce but importantly ensure that any freed blocks are at the physical”end” of the segment so that any empty blocks can be de-allocated and returned to the tablespace for reuse by potentially another segments.

Therefore, the Shrink Space command should not reduce the number of leaf blocks for a freshly created index at all …

So I asked Alon for some more information and he kindly provided a demo which immediately told me what the issue was.

BTW, here’s a bit of advice. If someone asks a question and the information provided is not sufficient to provide a meaningful answer, there is absolutely nothing wrong, rude, disrespectful or insulting in asking for more information (as claimed by certain folk I could mention). It’s actually the most appropriate and respectful manner in which to correctly diagnose and provide the most appropriate and accurate answer or solution to a question. When you see a doctor and mention you’re ill, you expect the doctor to ask a few prudent questions to determine what’s actually wrong with you rather than having a doctor say “Oh, I had someone who said they were ill last week, here just take these pills, it worked for them” !! The same goes for our profession …

But I disgress.

OK, the first important thing to note is that to be able to shrink a segment, it must be in an Automatic Segment Space Management (ASSM) tablespace. However, the extent management policy of the tablespace can be either Uniform or Autoallocate, it doesn’t matter. If the tablespace has Uniform extent sizes, then basically all the extents have a uniform (or the same) size, if it’s autoallocate, then Oracle decides the appropriate extent sizes and extent sizes can differ for a specific segment as it grows.

The next important point to note is what do we actually mean by the “size” of an index. Do we mean the actual number of leaf (and branch) blocks that are currently assigned to the index “structure” or do we mean the number of blocks currently assigned to the index “segment” ? It’s an important distinction in this discussion.

In the demo provided by Alon, it showed a new single column table being populated with numbers, a new index being created, with the bytes, the number of extents and blocks in the index “segment” listed from user_segments. The index was immediately shrunk and indeed the bytes and the number of blocks reduced, but the number of extents remained the same.

It’s all I needed, I immediately had the answer to the question (as indeed I now knew exactly what was meant by the question).

The tablespace in question must have been a ASSM tablespace and it was only the size of the index “segment” that had reduced, not the size of the index structure. The index structure however would be exactly the same size as it had previously, with the index just as “efficient” as it had been previously.

Indeed, the Shrink Space command would have gone through the index structure and ultimately would have had nothing to do as each leaf block within the index was packed as efficiently as possible (based on whatever PCTFREE value is set).

BUT, there may very well be some blocks within the last allocated extent that may not yet have been used within the index structure. They currently belong within the last extent assigned to the index segment, but they’re currently above the High Water Mark (HWM) of the segment.

Now, if the tablespace were uniformly managed, then these “unused” blocks can’t be de-allocated and returned to the tablespace, because the extents within the tablespace  need to remain uniform. Therefore there would be no effect of the Shrink Space command on such a segment.

BUT if the tablespace was ASSM with autoallocate, then the extents are controlled and managed by Oracle and can differ in size, both within the tablespace and for a specific segment. Therefore the Shrink Space command can “trim” off the unused blocks in the last extent.

Note if the segment was created with too large a Minextents value and not all the allocated extents have been used, a Shrink Space command could deallocate space from more than one extent, including a Uniform tablespace, but only if entire extents currently exisit above the HWM.

This demo on Shrinking A Newly Created Index expands a little on the demo provided by Alon and shows how the last extent of a freshly created index in an ASSM LMT with autoallocate has been “trimmed” by the Shrink Space command, but has had no effect on an identical segment created in a Uniform LMT. In both examples however, the actual number of leaf blocks in the indexes remains the same as the index structures themselves are as efficient as can be.