Empty Leaf Blocks After Rollback Part I (Empty Spaces) June 23, 2015
Posted by Richard Foote in Index Coalesce, Oracle Indexes, Update Indexes.trackback
There’s been an interesting recent discussion on the OTN Database forum regarding “Index blank blocks after large update that was rolled back“. Setting aside the odd scenario of updating a column that previously had 20 million distinct values to the same value on a 2 billion row table, the key questions raised are why the blank index leaf blocks and why the performance degradation after the update failed and rolled back.
This is a topic I’ve previously discussed in the blog such as in Updates and Indexes and Differences Between Unique and Non-Unique Indexes. However, I though it might be worthwhile discussing these again the in context of the rollback of a large update-based transaction.
The key point to make is that an Update is actually a delete/insert operation in the context of indexes. So if we perform a large update, all the previous indexed values are marked as deleted in the index and the new values re-inserted elsewhere within the index structure, potentially filling up a whole bunch of new leaf blocks. If we then decide to rollback the transaction (or the transaction fails and automatically rolls back), then all these newly inserted index entries are deleted potentially leaving behind now empty new leaf blocks in the expanded index structure. Here’s the thing, Oracle will roll back changes to index entries but not changes to the index structure such as block splits.
If an index scan is forced to navigate through these empty leaf blocks, this can indeed potentially have a detrimental impact on subsequent performance.
However, depending on whether the index is Unique or Non-Unique and the type of update being performed, the impact on the index could be quite different.
To illustrate all this, a simple demo.
Let’s start with a simple table and Non-Unique index on the (effectively) unique ID column:
SQL> create table bowie (id number, name varchar2(42)); Table created. SQL> insert into bowie select rownum, 'DAVID BOWIE' from dual connect by level <=10000; 10000 rows created. SQL> commit; Commit complete. SQL> create index bowie_id_i on bowie(id); Index created.
If we look at an index tree dump of this index:
SQL> select object_id from dba_objects where object_name='BOWIE_ID_I'; OBJECT_ID ---------- 98700 SQL> alter session set events 'immediate trace name treedump level 98700'; Session altered.
—– begin tree dump
branch: 0x1800523 25167139 (0: nrow: 21, level: 1)
leaf: 0x1800524 25167140 (-1: row:485.485 avs:828)
leaf: 0x1800525 25167141 (0: row:479.479 avs:820)
leaf: 0x1800526 25167142 (1: row:479.479 avs:820)
leaf: 0x1800527 25167143 (2: row:479.479 avs:820)
leaf: 0x1800528 25167144 (3: row:479.479 avs:820)
leaf: 0x1800529 25167145 (4: row:479.479 avs:819)
leaf: 0x180052a 25167146 (5: row:479.479 avs:820)
leaf: 0x180052b 25167147 (6: row:479.479 avs:820)
leaf: 0x180052c 25167148 (7: row:479.479 avs:820)
leaf: 0x180052d 25167149 (8: row:479.479 avs:819)
leaf: 0x180052e 25167150 (9: row:479.479 avs:820)
leaf: 0x180052f 25167151 (10: row:479.479 avs:820)
leaf: 0x1800531 25167153 (11: row:479.479 avs:820)
leaf: 0x1800532 25167154 (12: row:479.479 avs:820)
leaf: 0x1800533 25167155 (13: row:479.479 avs:819)
leaf: 0x1800534 25167156 (14: row:479.479 avs:820)
leaf: 0x1800535 25167157 (15: row:479.479 avs:820)
leaf: 0x1800536 25167158 (16: row:479.479 avs:820)
leaf: 0x1800537 25167159 (17: row:479.479 avs:820)
leaf: 0x1800538 25167160 (18: row:479.479 avs:819)
leaf: 0x1800539 25167161 (19: row:414.414 avs:1795)
—– end tree dump
We notice the index is nice and compact with just the default 10% of free space.
I’m going to start by performing an update of the ID column which simply increments the ID by 1 for all rows in the table. But then, rather than commit, I’ll roll back the transaction:
SQL> update bowie set id=id+1; 10000 rows updated. SQL> rollback; Rollback complete.
Now the interesting thing to note here is that for each ID value, we temporarily have the same value twice as we progress and update each ID value (for example, for ID=1, it becomes 2 which already exists. Then the previous ID=2 becomes 3 which already exists, etc.). As the index is Non-Unique, this means when we update say ID=1 to 2, we need mark as deleted the index entry with ID=1 and insert a new index entry with an ID=2. When we update the previous ID=2 to 3, we again mark as deleted the previous indexed value of 2 and insert a new index entry of 3. Etc. Etc.
As we only have 10% of free space available in the index before the update, by updating all rows in this fashion, it means we have to keep performing 50-50 block splits to fit in the new index entries in the corresponding leaf blocks. This effectively results in the index doubling in size as we now have twice the number of index entries (with the previous index entries now marked as deleted).
However, having now performed all these index block splits, if we now roll back the update transaction, it simply means that all the new index entries are deleted and the delete byte removed from the previously deleted entries, with the index structure retaining its newly bloated size. The resultant index block splits are not rolled back. If we look at a new index tree dump:
—– begin tree dump
branch: 0x1800523 25167139 (0: nrow: 43, level: 1)
leaf: 0x1800524 25167140 (-1: row:223.223 avs:4756)
leaf: 0x180053e 25167166 (0: row:262.262 avs:4072)
leaf: 0x1800525 25167141 (1: row:242.242 avs:4373)
leaf: 0x180053f 25167167 (2: row:237.237 avs:4447)
leaf: 0x1800526 25167142 (3: row:217.217 avs:4747)
leaf: 0x180053a 25167162 (4: row:262.262 avs:4073)
leaf: 0x1800527 25167143 (5: row:217.217 avs:4747)
leaf: 0x180053b 25167163 (6: row:262.262 avs:4073)
leaf: 0x1800528 25167144 (7: row:217.217 avs:4747)
leaf: 0x180053c 25167164 (8: row:262.262 avs:4073)
leaf: 0x1800529 25167145 (9: row:217.217 avs:4747)
leaf: 0x180053d 25167165 (10: row:262.262 avs:4072)
leaf: 0x180052a 25167146 (11: row:217.217 avs:4747)
leaf: 0x1800545 25167173 (12: row:262.262 avs:4073)
leaf: 0x180052b 25167147 (13: row:217.217 avs:4747)
leaf: 0x1800546 25167174 (14: row:262.262 avs:4073)
leaf: 0x180052c 25167148 (15: row:217.217 avs:4747)
leaf: 0x1800547 25167175 (16: row:262.262 avs:4073)
leaf: 0x180052d 25167149 (17: row:217.217 avs:4747)
leaf: 0x1800541 25167169 (18: row:262.262 avs:4072)
leaf: 0x180052e 25167150 (19: row:217.217 avs:4748)
leaf: 0x180054e 25167182 (20: row:262.262 avs:4072)
leaf: 0x180052f 25167151 (21: row:136.136 avs:5962)
leaf: 0x180054f 25167183 (22: row:133.133 avs:6006)
leaf: 0x1800542 25167170 (23: row:210.210 avs:4852)
leaf: 0x1800531 25167153 (24: row:217.217 avs:4747)
leaf: 0x1800543 25167171 (25: row:262.262 avs:4073)
leaf: 0x1800532 25167154 (26: row:217.217 avs:4747)
leaf: 0x1800544 25167172 (27: row:262.262 avs:4073)
leaf: 0x1800533 25167155 (28: row:217.217 avs:4747)
leaf: 0x180054d 25167181 (29: row:262.262 avs:4072)
leaf: 0x1800534 25167156 (30: row:217.217 avs:4748)
leaf: 0x1800549 25167177 (31: row:262.262 avs:4072)
leaf: 0x1800535 25167157 (32: row:217.217 avs:4747)
leaf: 0x180054a 25167178 (33: row:262.262 avs:4073)
leaf: 0x1800536 25167158 (34: row:217.217 avs:4747)
leaf: 0x180054b 25167179 (35: row:262.262 avs:4073)
leaf: 0x1800537 25167159 (36: row:217.217 avs:4747)
leaf: 0x180054c 25167180 (37: row:262.262 avs:4073)
leaf: 0x1800538 25167160 (38: row:256.256 avs:4162)
leaf: 0x1800548 25167176 (39: row:223.223 avs:4657)
leaf: 0x1800539 25167161 (40: row:152.152 avs:5722)
leaf: 0x1800555 25167189 (41: row:262.262 avs:4073)
—– end tree dump
The index is now approximately double the size with each leaf block now approximately 1/2 empty (or 1/2 full if you’re the positive type).
If we now perform another update, but this time update all the IDs to values that don’t currently exist (by simply adding 10000 to the ID) and then again rollback:
SQL> update bowie set id=id+10000; 10000 rows updated. SQL> rollback; Rollback complete.
—– begin tree dump
branch: 0x1800523 25167139 (0: nrow: 73, level: 1)
leaf: 0x1800524 25167140 (-1: row:223.223 avs:4756)
leaf: 0x180053e 25167166 (0: row:262.262 avs:4072)
leaf: 0x1800525 25167141 (1: row:242.242 avs:4373)
leaf: 0x180053f 25167167 (2: row:237.237 avs:4447)
leaf: 0x1800526 25167142 (3: row:217.217 avs:4747)
leaf: 0x180053a 25167162 (4: row:262.262 avs:4073)
leaf: 0x1800527 25167143 (5: row:217.217 avs:4747)
leaf: 0x180053b 25167163 (6: row:262.262 avs:4073)
leaf: 0x1800528 25167144 (7: row:217.217 avs:4747)
leaf: 0x180053c 25167164 (8: row:262.262 avs:4073)
leaf: 0x1800529 25167145 (9: row:217.217 avs:4747)
leaf: 0x180053d 25167165 (10: row:262.262 avs:4072)
leaf: 0x180052a 25167146 (11: row:217.217 avs:4747)
leaf: 0x1800545 25167173 (12: row:262.262 avs:4073)
leaf: 0x180052b 25167147 (13: row:217.217 avs:4747)
leaf: 0x1800546 25167174 (14: row:262.262 avs:4073)
leaf: 0x180052c 25167148 (15: row:217.217 avs:4747)
leaf: 0x1800547 25167175 (16: row:262.262 avs:4073)
leaf: 0x180052d 25167149 (17: row:217.217 avs:4747)
leaf: 0x1800541 25167169 (18: row:262.262 avs:4072)
leaf: 0x180052e 25167150 (19: row:217.217 avs:4748)
leaf: 0x180054e 25167182 (20: row:262.262 avs:4072)
leaf: 0x180052f 25167151 (21: row:136.136 avs:5962)
leaf: 0x180054f 25167183 (22: row:133.133 avs:6006)
leaf: 0x1800542 25167170 (23: row:210.210 avs:4852)
leaf: 0x1800531 25167153 (24: row:217.217 avs:4747)
leaf: 0x1800543 25167171 (25: row:262.262 avs:4073)
leaf: 0x1800532 25167154 (26: row:217.217 avs:4747)
leaf: 0x1800544 25167172 (27: row:262.262 avs:4073)
leaf: 0x1800533 25167155 (28: row:217.217 avs:4747)
leaf: 0x180054d 25167181 (29: row:262.262 avs:4072)
leaf: 0x1800534 25167156 (30: row:217.217 avs:4748)
leaf: 0x1800549 25167177 (31: row:262.262 avs:4072)
leaf: 0x1800535 25167157 (32: row:217.217 avs:4747)
leaf: 0x180054a 25167178 (33: row:262.262 avs:4073)
leaf: 0x1800536 25167158 (34: row:217.217 avs:4747)
leaf: 0x180054b 25167179 (35: row:262.262 avs:4073)
leaf: 0x1800537 25167159 (36: row:217.217 avs:4747)
leaf: 0x180054c 25167180 (37: row:262.262 avs:4073)
leaf: 0x1800538 25167160 (38: row:256.256 avs:4162)
leaf: 0x1800548 25167176 (39: row:223.223 avs:4657)
leaf: 0x1800539 25167161 (40: row:152.152 avs:5722)
leaf: 0x1800555 25167189 (41: row:262.262 avs:4073)
leaf: 0x1800552 25167186 (42: row:0.0 avs:8000)
leaf: 0x1800553 25167187 (43: row:0.0 avs:8000)
leaf: 0x1800551 25167185 (44: row:0.0 avs:8000)
leaf: 0x1800556 25167190 (45: row:0.0 avs:8000)
leaf: 0x1800557 25167191 (46: row:0.0 avs:8000)
leaf: 0x1800554 25167188 (47: row:0.0 avs:8000)
leaf: 0x180055d 25167197 (48: row:0.0 avs:8000)
leaf: 0x180055e 25167198 (49: row:0.0 avs:8000)
leaf: 0x180055f 25167199 (50: row:0.0 avs:8000)
leaf: 0x1800558 25167192 (51: row:0.0 avs:8000)
leaf: 0x1800559 25167193 (52: row:0.0 avs:8000)
leaf: 0x1800565 25167205 (53: row:0.0 avs:8000)
leaf: 0x1800566 25167206 (54: row:0.0 avs:8000)
leaf: 0x1800567 25167207 (55: row:0.0 avs:8000)
leaf: 0x1800561 25167201 (56: row:0.0 avs:8000)
leaf: 0x180055c 25167196 (57: row:0.0 avs:8000)
leaf: 0x180055a 25167194 (58: row:0.0 avs:8000)
leaf: 0x180055b 25167195 (59: row:0.0 avs:8000)
leaf: 0x1800563 25167203 (60: row:0.0 avs:8000)
leaf: 0x1800564 25167204 (61: row:0.0 avs:8000)
leaf: 0x180056d 25167213 (62: row:0.0 avs:8000)
leaf: 0x180056e 25167214 (63: row:0.0 avs:8000)
leaf: 0x180056f 25167215 (64: row:0.0 avs:8000)
leaf: 0x1800568 25167208 (65: row:0.0 avs:8000)
leaf: 0x1800569 25167209 (66: row:0.0 avs:8000)
leaf: 0x180056a 25167210 (67: row:0.0 avs:8000)
leaf: 0x180056b 25167211 (68: row:0.0 avs:8000)
leaf: 0x180056c 25167212 (69: row:0.0 avs:8000)
leaf: 0x1800562 25167202 (70: row:0.0 avs:8000)
leaf: 0x1800575 25167221 (71: row:0.0 avs:8000)
—– end tree dump
As all the inserts now occurred in the right-hand most side of the index, Oracle allocated a bunch of new index leaf blocks via 90-10 block splits to store all the new index entries. After the rollback however, all these new entries were removed leaving behind nothing but these new empty leaf blocks which are still part of the overall index structure.
Query performance now depends on what part of the index we need to access.
If we just want to select a single value, then no problem as the ID column is effectively unique and we just need to generally access down to the one leaf block:
SQL> select * from bowie where id=42; ID NAME ---------- ------------------------------------------ 42 DAVID BOWIE Execution Plan ---------------------------------------------------------- Plan hash value: 1405654398 -------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost(%CPU)| Time | -------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 36 | 2 (0)| 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE | 1 | 36 | 2 (0)| 00:00:01 | |* 2 | INDEX RANGE SCAN | BOWIE_ID_I | 1 | | 1 (0)| 00:00:01 | -------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("ID"=42) Note ----- - dynamic statistics used: dynamic sampling (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 618 bytes sent via SQL*Net to client 552 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
4 consistent gets is about as good as it gets for a non-unique Blevel 1 index.
Larger index range scan might need to access additional leaf blocks as they now only contain 1/2 the number of index entries than before, although the additional overhead of such scans would still likely be minimal as most of the work is associated with visiting the table blocks.
One of the worst case scenarios would be having to now plough through all these empty leaf blocks as with the following search for the max ID value:
SQL> select max(id) from bowie; MAX(ID) ---------- 10000 Execution Plan ---------------------------------------------------------- Plan hash value: 3831694747 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | INDEX FULL SCAN (MIN/MAX)| BOWIE_ID_I | 1 | 13 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Note ----- - dynamic statistics used: dynamic sampling (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 32 consistent gets 0 physical reads 0 redo size 541 bytes sent via SQL*Net to client 552 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
Oracle uses the Index Full (Min/Max) Scan by starting with the right-most leaf block but as it’s empty, Oracle is forced to make its way across through all the empty leaf blocks until it finally hits upon the first non-empty leaf block that contains the max ID. The excessive 32 consistent gets is due to having to access all these new empty blocks.
If we now Coalesce the index and try again:
SQL> alter index bowie_id_i coalesce; Index altered. SQL> select max(id) from bowie; MAX(ID) ---------- 10000 Execution Plan ---------------------------------------------------------- Plan hash value: 3831694747 ----------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 13 | 2 (0)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 13 | | | | 2 | INDEX FULL SCAN (MIN/MAX)| BOWIE_ID_I | 1 | 13 | 2 (0)| 00:00:01 | ----------------------------------------------------------------------------------------- Note ----- - dynamic statistics used: dynamic sampling (level=2) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 2 consistent gets 0 physical reads 0 redo size 541 bytes sent via SQL*Net to client 552 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
We notice the Max(ID) query now returns the result with just the 2 expected consistent gets from a BLevel 1 index.
If we now look at the index tree dump:
branch: 0x1800523 25167139 (0: nrow: 21, level: 1)
leaf: 0x1800524 25167140 (-1: row:485.485 avs:828)
leaf: 0x1800525 25167141 (0: row:479.479 avs:820)
leaf: 0x1800526 25167142 (1: row:479.479 avs:820)
leaf: 0x1800527 25167143 (2: row:479.479 avs:820)
leaf: 0x1800528 25167144 (3: row:479.479 avs:820)
leaf: 0x1800529 25167145 (4: row:479.479 avs:819)
leaf: 0x180052a 25167146 (5: row:479.479 avs:820)
leaf: 0x180052b 25167147 (6: row:479.479 avs:820)
leaf: 0x180052c 25167148 (7: row:479.479 avs:820)
leaf: 0x180052d 25167149 (8: row:479.479 avs:819)
leaf: 0x180052e 25167150 (9: row:479.479 avs:820)
leaf: 0x180052f 25167151 (10: row:479.479 avs:820)
leaf: 0x1800531 25167153 (11: row:479.479 avs:820)
leaf: 0x1800532 25167154 (12: row:479.479 avs:820)
leaf: 0x1800533 25167155 (13: row:479.479 avs:819)
leaf: 0x1800534 25167156 (14: row:479.479 avs:820)
leaf: 0x1800535 25167157 (15: row:479.479 avs:820)
leaf: 0x1800536 25167158 (16: row:479.479 avs:820)
leaf: 0x1800537 25167159 (17: row:479.479 avs:820)
leaf: 0x1800538 25167160 (18: row:479.479 avs:819)
leaf: 0x1800539 25167161 (19: row:414.414 avs:1795)
—– end tree dump
We notice the index is back to its original compact self again and we can determine the Max(ID) by just visiting the last leaf block.
That’s enough of a read for now !! I’ll next look at how this scenario differs with a Unique index in Part II.
Excellent read! So is it a good idea to coalesce our indexes in the exception handling section of bulk DML updates i.e. in case of updates failure after we rollback?
LikeLike
Hi Ash
In general, probably not.
A point you may have missed is that the indexes would actually be in a similar state if the transaction had committed rather than rolled back. In the first example, it would have been all the previous index entries that are now marked as deleted and littered throughout the whole index. The index would still be effectively 1/2 filled with useful data. The question remains, will this space be reused by subsequent inserts. Is performance being compromised by having leaf blocks only half filled with useful index entries.
In the second example, we would not have the blank index blocks but we would again have effectively empty blocks in the leaf blocks occupied by the previous index entries. So all the previous leaf blocks would be effectively empty. The max query would indeed return the data straight away but a query based on the min(ID) would now be compromised.
So what needs to be considered is what impact does a large update have on such indexed columns. If possible, if you intend to update all rows of an indexed column, a better strategy would be to mark the index unusable first and then rebuild the index after the update. That way, you significantly improve the performance of the update (as maintaining the index on the fly is really expensive) and you end up with a “perfect” at the end of the process.
So it depends on the size of the update, the type of update and the subsequent impact of SQLs dependent on the index … 🙂
LikeLike
Many thanks for the detailed reply Richard! I get it now!
Kind regards,
Ash
LikeLike
[…] been an interesting recent discussion on the OTN Database forum regarding “Index blank blocks after a large update that was rolled […]
LikeLike
Hi Richard,
When a oracle does a Coalesce, it does not reallocate the freed blocks. So shouldn’t, the max query still go through the empty leaf blocks? And shouldn’t the index not get it into its original self again as it still has its empty leaf blocks with it after coalesce?
Request you to help me understand the same.
Regards
Rathi
LikeLike
Hi Rathi
With a coalesce, empty blocks still belong to the index “segment” (and so can’t be used by other segments), but the empty blocks no longer belong within the index “structure” (else what would be the point of the coalesce operation ?). So no, min/max do not therefore have to visit the empty leaf blocks.
The difference between a coalesce and an index shrink is that with a shrink, the empty blocks can be freed from the index segment and re-used by other segments.
LikeLike
Many thanks Richard!
LikeLike