jump to navigation

When Does A ROWID Change? Part IV (“Mass Production”) December 21, 2022

Posted by Richard Foote in Attribute Clustering, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, Changing ROWID, Clustering Factor, Data Clustering, Flashback, Move Partitions, Oracle, Oracle Blog, Oracle Cloud, Oracle General, Oracle Indexes, Partitioning, Richard's Blog, ROWID.
trackback

In Part II in this series, I discussed how the update of the partitioned key column of a row that results in the row being moved to a different partition, will result in the ROWID of such rows changing.

However, there a quite a number of other user initiated actions in which ROWIDs can easily change (as indeed discussed in Connor McDonald’s video on this subject).

Some of these include:

  • Moving a table or partition, as this results in the segment being reorganised, with all associated rows being physically relocated and their associated ROWIDs changing
  • Altering a non-partitioned table such that it be now be partitioned, which again results in the physical relocation of all rows and their ROWIDs changing (which BTW, can potentially occur on a Autonomous Database without any user intervention)
  • Altering the partitioning strategy of a partitioned table, again changes the physical location of all rows
  • Hybrid Columnar Compression (HCC), which by packing rows more tightly, can more likely result in the physical relocation of a row during subsequent DML statements
  • Altering a table to Shrink Space, which attempts to move rows between table blocks to pack rows more tightly, again potentially resulting in rows physically moving and the changing of their associated ROWIDs
  • Flashback of a table, which results in rows being deleted and inserted and hence the change of their associated ROWIDs

I’ll illustrate an example of all this, with one of the key reasons why you may want to re-organise a table (and implicitly change all the ROWIDs of a table).

I’ll start by creating and populating a simple little table, with a CODE column that has very poorly clustered data:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum, 500), 'DAVID BOWIE' from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed.

Let’s now create an index on this CODE column:

SQL> create index bowie_code_i on bowie(code);

Index created.

We take note of the ROWIDs of a few random rows:

SQL> select id, rowid from bowie where id in (42, 4242, 424242) order by id;

        ID ROWID
---------- ------------------
        42 AAASn1AAMAAAgB2AAp
      4242 AAASn1AAMAAAgCHACL
    424242 AAASn1AAMAAAgbtAAJ

If we run a simple query with a predicate based on the CODE column:

SQL> select * from bowie where code=42;

2000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       | 2000 | 42000 |    1004 (2)| 00:00:01 |
| * 1 |  TABLE ACCESS FULL | BOWIE | 2000 | 42000 |    1004 (2)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CODE"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       3596 consistent gets
          0 physical reads
          0 redo size
      20757 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2000 rows processed

We notice the CBO has chosen to ignore the index and use a FTS instead, even though only 2000 rows in a 1M row table (just 0.2%) are returned.

Why?

Because the clustering of the CODE data is terrible, with the required values littered throughout the table. If we look at the Clustering Factor of the index:

SQL> select index_name, leaf_blocks, clustering_factor from user_indexes where index_name='BOWIE_CODE_I';

INDEX_NAME   LEAF_BLOCKS CLUSTERING_FACTOR
------------ ----------- -----------------
BOWIE_CODE_I        2063           1000000

We notice the index has the worst possible Clustering Factor value of 1000000.

So to improve the performance of this (say critical) query, we can add a Clustering Attribute to this table based on the CODE column and then reorganise the table:

SQL> alter table bowie add clustering by linear order (code);

Table altered.

SQL> alter table bowie move online;

Table altered.

If we now look at the Clustering Factor of the index:

SQL> select index_name, leaf_blocks, clustering_factor from user_indexes where index_name='BOWIE_CODE_I';

INDEX_NAME   LEAF_BLOCKS CLUSTERING_FACTOR
------------ ----------- -----------------
BOWIE_CODE_I        2063              3568

We can see it has substantially improved, down to just 3568 from the previous 1000000 value, as the data is now perfectly clustered based on the CODE column.

If we now re-run the query:

SQL> select * from bowie where code=42;

2000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 853003755

----------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name         | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |              | 2000 | 42000 |      15 (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE        | 2000 | 42000 |      15 (0)| 00:00:01 |
| * 2 |   INDEX RANGE SCAN                   | BOWIE_CODE_I | 2000 |       |       7 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CODE"=42) 

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         17 consistent gets
          0 physical reads
          0 redo size
      50735 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2000 rows processed

The CBO now choses to use the index and the query is much more efficient as a result (consistent gets down to just 17 from the previous 3596).

So all is now much better, except for any application that was reliant on using ROWIDs to fetch the data, as all ROWIDs have now changed:

SQL> select id, rowid from bowie where id in (42, 4242, 424242) order by id;

        ID ROWID
---------- ------------------
        42 AAASn6AAMAAAACvAEf
      4242 AAASn6AAMAAAiRaAA4
    424242 AAASn6AAMAAAiRWAEQ

So there are many ways in which the ROWID of a row can potentially change.

And now there’s another key manner in which a ROWID can very easily change in Oracle Autonomous Database environments, as I’ll next discuss…

Comments»

No comments yet — be the first.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: