jump to navigation

When Does A ROWID Change? Part IV (“Mass Production”) December 21, 2022

Posted by Richard Foote in Attribute Clustering, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, Changing ROWID, Clustering Factor, Data Clustering, Flashback, Move Partitions, Oracle, Oracle Blog, Oracle Cloud, Oracle General, Oracle Indexes, Partitioning, Richard's Blog, ROWID.
add a comment

In Part II in this series, I discussed how the update of the partitioned key column of a row that results in the row being moved to a different partition, will result in the ROWID of such rows changing.

However, there a quite a number of other user initiated actions in which ROWIDs can easily change (as indeed discussed in Connor McDonald’s video on this subject).

Some of these include:

  • Moving a table or partition, as this results in the segment being reorganised, with all associated rows being physically relocated and their associated ROWIDs changing
  • Altering a non-partitioned table such that it be now be partitioned, which again results in the physical relocation of all rows and their ROWIDs changing (which BTW, can potentially occur on a Autonomous Database without any user intervention)
  • Altering the partitioning strategy of a partitioned table, again changes the physical location of all rows
  • Hybrid Columnar Compression (HCC), which by packing rows more tightly, can more likely result in the physical relocation of a row during subsequent DML statements
  • Altering a table to Shrink Space, which attempts to move rows between table blocks to pack rows more tightly, again potentially resulting in rows physically moving and the changing of their associated ROWIDs
  • Flashback of a table, which results in rows being deleted and inserted and hence the change of their associated ROWIDs

I’ll illustrate an example of all this, with one of the key reasons why you may want to re-organise a table (and implicitly change all the ROWIDs of a table).

I’ll start by creating and populating a simple little table, with a CODE column that has very poorly clustered data:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum, 500), 'DAVID BOWIE' from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed.

Let’s now create an index on this CODE column:

SQL> create index bowie_code_i on bowie(code);

Index created.

We take note of the ROWIDs of a few random rows:

SQL> select id, rowid from bowie where id in (42, 4242, 424242) order by id;

        ID ROWID
---------- ------------------
        42 AAASn1AAMAAAgB2AAp
      4242 AAASn1AAMAAAgCHACL
    424242 AAASn1AAMAAAgbtAAJ

If we run a simple query with a predicate based on the CODE column:

SQL> select * from bowie where code=42;

2000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id  | Operation          | Name  | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |       | 2000 | 42000 |    1004 (2)| 00:00:01 |
| * 1 |  TABLE ACCESS FULL | BOWIE | 2000 | 42000 |    1004 (2)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CODE"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       3596 consistent gets
          0 physical reads
          0 redo size
      20757 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2000 rows processed

We notice the CBO has chosen to ignore the index and use a FTS instead, even though only 2000 rows in a 1M row table (just 0.2%) are returned.

Why?

Because the clustering of the CODE data is terrible, with the required values littered throughout the table. If we look at the Clustering Factor of the index:

SQL> select index_name, leaf_blocks, clustering_factor from user_indexes where index_name='BOWIE_CODE_I';

INDEX_NAME   LEAF_BLOCKS CLUSTERING_FACTOR
------------ ----------- -----------------
BOWIE_CODE_I        2063           1000000

We notice the index has the worst possible Clustering Factor value of 1000000.

So to improve the performance of this (say critical) query, we can add a Clustering Attribute to this table based on the CODE column and then reorganise the table:

SQL> alter table bowie add clustering by linear order (code);

Table altered.

SQL> alter table bowie move online;

Table altered.

If we now look at the Clustering Factor of the index:

SQL> select index_name, leaf_blocks, clustering_factor from user_indexes where index_name='BOWIE_CODE_I';

INDEX_NAME   LEAF_BLOCKS CLUSTERING_FACTOR
------------ ----------- -----------------
BOWIE_CODE_I        2063              3568

We can see it has substantially improved, down to just 3568 from the previous 1000000 value, as the data is now perfectly clustered based on the CODE column.

If we now re-run the query:

SQL> select * from bowie where code=42;

2000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 853003755

----------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name         | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |              | 2000 | 42000 |      15 (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE        | 2000 | 42000 |      15 (0)| 00:00:01 |
| * 2 |   INDEX RANGE SCAN                   | BOWIE_CODE_I | 2000 |       |       7 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CODE"=42) 

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         17 consistent gets
          0 physical reads
          0 redo size
      50735 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2000 rows processed

The CBO now choses to use the index and the query is much more efficient as a result (consistent gets down to just 17 from the previous 3596).

So all is now much better, except for any application that was reliant on using ROWIDs to fetch the data, as all ROWIDs have now changed:

SQL> select id, rowid from bowie where id in (42, 4242, 424242) order by id;

        ID ROWID
---------- ------------------
        42 AAASn6AAMAAAACvAEf
      4242 AAASn6AAMAAAiRaAA4
    424242 AAASn6AAMAAAiRWAEQ

So there are many ways in which the ROWID of a row can potentially change.

And now there’s another key manner in which a ROWID can very easily change in Oracle Autonomous Database environments, as I’ll next discuss…

Constraints With Oracle Total Recall (Remember A Day) October 7, 2008

Posted by Richard Foote in 11g, Constraints, Flashback, Total Recall.
15 comments

11g has many new and potentially useful features which I’m slowly investigating and evaluating.

One feature that has the potential to be particularly useful where I work is “Oracle Total Recall“. It provides the facility to automatically capture all changes to any tables you choose, in a “secure, tamper proof” manner. This enables all changes to be easily tracked and viewed for historical, auditing, security control, etc. purposes.

Thought I might “introduce” Total Recall to those of you that may not have seen it in action and to offer an initial word of warning of a little trap one can easily fall into …

One nice aspect with it all is that it’s very easy to setup and configure.

To begin with, you need to create a storage area called a “flashback data archive” where all the historical changes can be stored, e.g.:

SQL> CREATE FLASHBACK ARCHIVE bowie_archive
        TABLESPACE total_recall_ts QUOTA 500m RETENTION 1 YEAR;

You simply specify the tablespace you wish to use, how much space within the tablespace you wish to allocate for archiving purposes and a retention period to specify how long to keep the archived data.

To create a table with the “Total Recall” change tracking capability, simply specify which flashback archive area to use:

SQL> CREATE TABLE bowie_recall (id number, text varchar2(20), hist_date date) FLASHBACK ARCHIVE bowie_archive;

Table created.

All changes to the table are now automatically tracked and archived and can easily be accessed as necessary. To illustrate, I’m just going to simply insert a row into the table.

SQL> insert into bowie_recall values (1, ‘Bowie’, sysdate);

1 row created.

SQL> commit;

Commit complete.

Sometime later, I perform an update on the row:

SQL> update bowie_recall
  2  set text = ‘Ziggy’, hist_date = sysdate
  3  where id = 1;

1 row updated.

SQL> commit;

Commit complete.

OK, so the current row basically looks like thus:

SQL> select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date
         from bowie_recall;

  ID TEXT     HIST_DATE
---- -------- -------------------
   1 Ziggy    06-10-2008 10:09:55

 

However, if I want to see what the row looked like at a previous point in time, I can use the AS OF TIMESTAMP clause such as this:

SQL> select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date
         from bowie_recall
         as of timestamp to_timestamp(‘2008-10-06 10:07:00’,
                     ‘yyyy-mm-dd hh24:mi:ss’)
         where id = 1;

  ID TEXT     HIST_DATE
---- -------- -------------------
   1 Bowie    06-10-2008 10:06:15

 

This is what the row looked like when it was initially inserted.

If I want to look at all the different versions of the row between any specific times of interest, I can write something such as this using the VERSIONS BETWEEN clause:

SQL> select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date
         from bowie_recall
         versions between timestamp
         to_timestamp(‘2008-10-06 10:07:00’, ‘yyyy-mm-dd hh24:mi:ss’)
         and sysdate
         where id = 1;

  ID TEXT     HIST_DATE
---- -------- -------------------
   1 Ziggy    06-10-2008 10:09:55
   1 Bowie    06-10-2008 10:06:15

 

It lists all versions of the rows between the two points of time …

Like I said, very easy to setup and very easy to subsequently query.

One of the disadvantages however of enabling the tracking of changes to a table is that it prevents a number of subsequent operations on the table. For example, you can’t simply drop a column:

SQL> alter table bowie_recall drop column status;
alter table bowie_recall drop column status
*
ERROR at line 1:
ORA-55610: Invalid DDL statement on history-tracked table

You also for example can’t modify a column to say add a NOT NULL constraint:

SQL> alter table bowie_recall modify text not null;
alter table bowie_recall modify text not null
*
ERROR at line 1:
ORA-55610: Invalid DDL statement on history-tracked table

Now the trap …

Oracle does however allow you to add any other type of constraint, such as say a Primary Key, without complaint:

SQL> alter table bowie_recall add primary key (id);

Table altered.

All well and good, except for when you want to now retrieve some of the historical changes from the table:

SQL> select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date from bowie_recall versions between timestamp to_timestamp(‘2008-10-06 10:07:00’, ‘yyyy-mm-dd hh24:mi:ss’) and sysdate where id = 1;
select id, text, to_char(hist_date, ‘dd-mm-yyyy hh24:mi:ss’) hist_date from  bowie_recall versions between timestamp to_timestamp(‘2008-10-06 10:07:00’, ‘yyyy-mm-dd hh24:mi:ss’) and sysdate where id = 1
                                                                             *
ERROR at line 1:
ORA-01466: unable to read data – table definition has changed

Remember, one of the reasons for implementing Total Recall on a table is to implement a “tamper proof” data tracking solution and yet the addition of any constraint on a table (Primary Key, Unique Key, Foreign Key or Check Constraint) will cause you to be unable to automatically read the historical data, without directly accessing the underlining SYS_FBA_ archive tables.

Be careful with constraint management and the historical tracking of tables …

Total Recall is of course all relatively new and these sorts of early glitches are to be expected. More on Total Recall later, including possible performance implications when both modifying and querying historical data.