jump to navigation

The Fake Index Trap (“Nowhere Now”) May 2, 2023

Posted by Richard Foote in CBO, Drop Index, Fake Indexes, Index Internals, NOSEGMENT Option, Online DDL, Oracle, Oracle 21c, Oracle Bugs, Oracle General, Oracle Indexes, Oracle Indexing Internals Webinar, Oracle19c, Tablespace Management, Virtual Indexes.
add a comment

In a recent correspondence, I was alerted to an issue in relation to the use of Virtual/Fake/Nosegment Indexes that I wasn’t aware of previously. Having a play, it appears this issue is still present in at least Oracle Database 21c, so I thought it worth a mention in case anyone else happens to fall into this trap.

I’ve discussed Virtual/Fake/Nosegment Indexes a number of times previously. These are indexes that do not exist as a physical segment (and so can be created almost immediately without consuming any storage), that can be used to determine if an index could potentially be used by the CBO if it were to be actually created.

Although such Fake Indexes don’t physically exist, they can cause issues if forgotten…

To illustrate this issue, I’ll start by creating a new tablespace:

SQL> create tablespace BOWIE_TS datafile 'C:\ORADATA\ZIGGY\ZIGGYPDB1\BOWIE_TS.DBF' size 100M;

Tablespace created.

Next, I’ll create and populate a table in this BOWIE_TS tablespace:

SQL> create table bowie_test (id number, name varchar2(42)) tablespace bowie_ts;

Table created.

SQL> insert into bowie_test select rownum, 'DAVID BOWIE' from dual connect by level <=10000;

10000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_TEST');

PL/SQL procedure successfully completed.

 

I’ll next create a Virtual/Fake index, using the NOSEGMENT option:

SQL> create index bowie_test_id_i on bowie_test(id) nosegment tablespace bowie_ts;

Index created.

We note this Fake Index is NOT listed in either USER_INDEXES or USER_SEGMENTS:

SQL> select index_name, tablespace_name from user_indexes where table_name='BOWIE_TEST';

no rows selected

SQL> select segment_name, segment_type, tablespace_name from user_segments
     where segment_name='BOWIE_TEST_ID_I';

no rows selected

If we run a basic, highly selective query on this table:

SQL> select * from bowie_test where id=42;

Execution Plan
----------------------------------------------------------
Plan hash value: 65548668

--------------------------------------------------------------------------------
| Id | Operation          | Name       | Rows | Bytes | Cost (%CPU) | Time     |
--------------------------------------------------------------------------------
|  0 | SELECT STATEMENT   |            |    1 |    16 |      11 (0) | 00:00:01 |
|* 1 |  TABLE ACCESS FULL | BOWIE_TEST |    1 |    16 |      11 (0) | 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

     1 - filter("ID"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         38 consistent gets
          0 physical reads
          0 redo size
        648 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

We notice the CBO uses a FTS. The Fake Index is NOT considered by default.

However, if we set the session as follows and re-run the query:

SQL> alter session set "_use_nosegment_indexes" = true;

Session altered.

SQL> select * from bowie_test where id=42;

Execution Plan
----------------------------------------------------------
Plan hash value: 1280686875

-------------------------------------------------------------------------------------------------------
| Id | Operation                            | Name            | Rows | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                     |                 |    1 |    16 |       2 (0) | 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_TEST      |    1 |    16 |       2 (0) | 00:00:01 |
|* 2 |   INDEX RANGE SCAN                   | BOWIE_TEST_ID_I |    1 |       |       1 (0) | 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

     2 - access("ID"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
         38 consistent gets
          0 physical reads
          0 redo size
        648 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          1 rows processed

We can see the CBO appears to now use the Fake Index, but as it doesn’t actually physically exist, actually uses a FTS behind the scenes (the number of consistent gets is evidence of this). But at least we now know the CBO would at least consider such an index if it physically existed.

We now decide to drop the tablespace and so first try to MOVE the table to another tablespace using the ONLINE option:

SQL> alter table bowie_test move online tablespace users;
alter table bowie_test move online tablespace users
*
ERROR at line 1:
ORA-14808: table does not support ONLINE MOVE TABLE because of the presence of nosegment index

The error message clearly states we can’t move the table ONLINE if such a Fake/Nosegment Index exists. This is our official warning of the potential danger to come…

We try to move the table using the default OFFLINE method:

SQL> alter table bowie_test move tablespace users;

Table altered.

We have now successfully moved the table to another tablespace.

If we check to see if we have any other segments within the tablespace yto be dropped:

SQL> select segment_name from dba_segments where tablespace_name='BOWIE_TS';

no rows selected

Oracle tells us that no, we do NOT have any current segments in this tablespace.

So it’s now safe to purge and drop this tablespace (or so we think):

SQL> purge tablespace bowie_ts;

Tablespace purged.

SQL> drop tablespace bowie_ts;

Tablespace dropped.

The tablespace has been successfully dropped.

However, if we now re-run the query on this table:

SQL> select * from bowie_test where id=42;
select * from bowie_test where id=42
*
ERROR at line 1:
ORA-00959: tablespace 'BOWIE_TS' does not exist

We get this unexpected error that the tablespace BOWIE_TS does not exist.

BUT, we already know the tablespace doesn’t exist, we’ve just dropped it !!!

So why are we getting this error?

It’s all due to the damn Fake Index we created previously.

Although there is no physical index segment for our Fake Index, there are still some internal Data Dictionary links between the Fake Index and the tablespace it was associated with. The tablespace is gone, but NOT the Fake Index.

The only place where fake indexes can be easily found within Oracle, is within the USER_OBJECTS view:

SQL> select o.object_name, o.object_type, o.status
from user_objects o left join user_indexes i
on o.object_name=i.index_name
where o.object_type='INDEX' and i.index_name is null;

OBJECT_NAME                    OBJECT_TYPE             STATUS
------------------------------ ----------------------- -------
BOWIE_TEST_ID_I                INDEX                   VALID

To eliminate this error, we have to first drop the Fake Index associated with the dropped tablespace:

SQL> drop index bowie_test_id_i;

Index dropped.

We can now safely run the query without error:

SQL> select * from bowie_test where id=42;

        ID NAME
---------- ------------------------------------------
42         DAVID BOWIE

So if you do ever create Fake Indexes, don’t forget to drop them once you’ve finished experimenting with them.

ESPECIALLY if you ever decide to drop the tablespace into which they were associated. This is explained in part in Oracle Support Doc ID 1361049.1.

Cost of Virtual Indexes (Little Lies) May 30, 2012

Posted by Richard Foote in CBO, Clustering Factor, Fake Indexes, Oracle Indexes, Virtual Indexes.
5 comments

I’ve previously discussed Virtual Indexes and how they can be used to do basic “what if” analysis if such an index really existed. However, a recent comment on the OTN forums regarding using them to compare index costs made me think a follow-up post regarding the dangers of Virtual Indexes might be warranted.

The big advantage of a Virtual Index of course is that it doesn’t really exist and so consumes no storage and can be created extremely quickly/cheaply. The disadvantage of a Virtual index is that it doesn’t really exist and so Oracle can not collect segment level statistics. Without statistics however, the CBO has a very tough time of doing its job properly …

To illustrate, a simple little demo. I begin by creating a table in which the data in the table is stored in CODE column order. An index on the CODE column would therefore have an excellent (very low) Clustering Factor. Note that the Clustering Factor is the most important index related statistic regarding the efficiency and potential cost of using the index.

SQL> create table bowie_ordered (id number, code number, name varchar2(30));

Table created.

SQL> create sequence bowie_seq;

Sequence created.

SQL> declare
  2  begin
  3  for i in 1..100 loop
  4     for j in 1..10000 loop
  5        insert into bowie_ordered values (bowie_seq.nextval, i, 'DAVID BOWIE');
  6     end loop;
  7  end loop;
  8  end;
  9  /

PL/SQL procedure successfully completed.

OK, I’m now going to create a Virtual Index on the CODE column and collect 100% accurate statistics on the table:

SQL> create index bowie_ordered_i on bowie_ordered(code) nosegment;

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE_ORDERED', cascade=>true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

Oracle allows statistics to be collected on the table and associated Virtual Index (so that existing statistic gathering jobs won’t now fail), however without an associated segment, no statistics can actually be derived for the index itself.

SQL> select index_name, num_rows,clustering_factor from dba_indexes where index_name='BOWIE_ORDERED_I';

no rows selected

I’ll list the system statistics so anyone who wants to replicate the demo can get similar results (the database blocksize is 8K):

SQL> SELECT pname, pval1 FROM SYS.AUX_STATS$
     WHERE pname IN ('SREADTIM', 'MREADTIM', 'MBRC', 'CPUSPEED');

PNAME                               PVAL1
------------------------------ ----------
SREADTIM                                2
MREADTIM                               10
CPUSPEED                             1000
MBRC                                   20

If we run the following query:

SQL> set arraysize 5000
SQL> select * from bowie_ordered where code = 42;

10000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1678744259

-----------------------------------------------------------------------------------
| Id  | Operation         | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |               | 10000 |   195K|  1005  (13)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| BOWIE_ORDERED | 10000 |   195K|  1005  (13)| 00:00:03 |
-----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CODE"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3471  consistent gets
          0  physical reads
          0  redo size
     100802  bytes sent via SQL*Net to client
        534  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed

We note that the CBO uses a Full Table Scan as the CBO has no real choice here as Virtual Indexes are not considered by default.

However, if we change the following hidden parameter and re-run:

SQL> alter session set "_use_nosegment_indexes" = true;

Session altered.

SQL> select * from bowie_ordered where code = 42;

10000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1860500051

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 | 10000 |   195K|        9(0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE_ORDERED   | 10000 |   195K|        9(0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_ORDERED_I | 10000 |       |        1(0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CODE"=42)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       3471  consistent gets
          0  physical reads
          0  redo size
     100802  bytes sent via SQL*Net to client
        534  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed

We notice a few interesting details. Firstly, the CBO has decided to use the Virtual Index however the number of consistent gets remains the same as the previous run so we can clearly see that behind the covers, the Full Table Scan is still performed. The index is not “really there” and so at execution time, the SQL statement is reparsed using the next best available plan.

If we look at the execution plan costs, both the estimate row (10000) and byte values are spot on as these statistics are based on the underlining table/column statistics and the 100 distinct CODE values are evenly distributed. However, the index related costs look remarkably low. Just a cost of 1 to read the index and extract 10,000 index entries (that means an index entry is less than 1 byte in length on average !!). Just a cost of 9 to visit the table and read 10,000 rows. Even with the most efficient of physical indexes, these costings are not realistic and are based on highly questionable default metrics.

Basically, the creation of this Virtual Column is telling us that there is no reason why the index couldn’t potentially be used, IF (that’s a big IF in case no-one noticed) the actual index related statistics are such that the CBO determines the index to be the cheaper option. But it depends on the actual characteristics of the index which can’t be accurately determined until it’s been physically created.

As the Virtual Index suggests the index might be used if it existed, let’s now create it for real:

SQL> drop index bowie_ordered_i;

Index dropped.

SQL> create index bowie_ordered_i on bowie_ordered(code);

Index created.

SQL> select index_name, num_rows,clustering_factor from dba_indexes where index_
name='BOWIE_ORDERED_I';

INDEX_NAME                       NUM_ROWS CLUSTERING_FACTOR
------------------------------ ---------- -----------------
BOWIE_ORDERED_I                   1000000              3546

As predicted, a Clustering Factor of 3546 on an index with 1M index entries is indeed nice and low.

If we now re-run the query again:

SQL> select * from bowie_ordered where code = 42;

10000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1860500051

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 | 10000 |   195K|       60(4)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE_ORDERED   | 10000 |   195K|       60(4)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_ORDERED_I | 10000 |       |       23(5)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CODE"=42)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
         61  consistent gets
         21  physical reads
          0  redo size
     100802  bytes sent via SQL*Net to client
        534  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed

Indeed, the index has been used by the CBO. However, note that the costs are substantially higher (and more accurate) than previously suggested with the Virtual Index. Indeed the final cost of 60 is very close to the number of consistent gets (61) required by the execution plan and so suggests the CBO is making reasonable index based calculations here.

OK, another demo, but this time with a table in which the CODE values are distributed throughout the whole table (rather than being perfectly clustered together as in the first example):

SQL> create table bowie_random (id number, code number, name varchar2(30));

Table created.

SQL> insert into bowie_random select rownum, mod(rownum,100)+1, 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE_RANDOM', cascade=>true, estimate_percent=>null, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

When we now create a Virtual Index based on the CODE column and re-run the same query:

SQL> create index bowie_random_i on bowie_random(code) nosegment;

Index created.

SQL> select * from bowie_random where code = 42;

10000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 711259049

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                | 10000 |   195K|     9   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE_RANDOM   | 10000 |   195K|     9   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_RANDOM_I | 10000 |       |     1   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CODE"=42)
Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
       3470  consistent gets
          0  physical reads
          0  redo size
     100802  bytes sent via SQL*Net to client
        534  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed

We notice that both the execution plan and all the associated costs are identical to those of the previous example. So although the actual Clustering Factor of the index is likely to be dramatically greater here than it was in the previous example and so likely dramatically impact the costs associated with using this index, the Virtual Index is treated and costed identically. This is the simple consequence of not having the physical index structure by which to calculate the appropriate segment statistics.

If we now physically create this index for real:

SQL> drop index bowie_random_i;

Index dropped.

SQL> create index bowie_random_i on bowie_random(code);

Index created.

SQL> select index_name, num_rows,clustering_factor from dba_indexes where index_
name='BOWIE_RANDOM_I';

INDEX_NAME                       NUM_ROWS CLUSTERING_FACTOR
------------------------------ ---------- -----------------
BOWIE_RANDOM_I                    1000000            344700

We can see that indeed the Clustering Factor is dramatically worse than before, increasing here from 3546 to 344700.

If we now re-run the query:

SQL> select * from bowie_random where code = 42;

10000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1983602984

----------------------------------------------------------------------------------
| Id  | Operation         | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |              | 10000 |   195K|  1005  (13)| 00:00:03 |
|*  1 |  TABLE ACCESS FULL| BOWIE_RANDOM | 10000 |   195K|  1005  (13)| 00:00:03 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter("CODE"=42)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       3470  consistent gets
          0  physical reads
          0  redo size
     100802  bytes sent via SQL*Net to client
        534  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed

We see the CBO has decided to perform the now cheaper Full Table Scan. Although the Virtual Index on this column was used, once the actual characteristics of the index are determined via the index statistics, the CBO has decided the actual physical index was just too expensive to use to retrieve the 1% of rows.

If we re-run the query with an index hint:

SQL> select /*+ index (bowie_random) */ * from bowie_random where code = 42;

10000 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 711259049

----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name           | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                | 10000 |   195K|  3483   (1)| 00:00:07 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE_RANDOM   | 10000 |   195K|  3483   (1)| 00:00:07 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_RANDOM_I | 10000 |       |    23   (5)| 00:00:01 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("CODE"=42)
Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
       3472  consistent gets
         21  physical reads
          0  redo size
     100802  bytes sent via SQL*Net to client
        534  bytes received via SQL*Net from client
          3  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10000  rows processed

The index is used but we can see why at a cost of 3483, the Full Table Scan at a cost of only 1005 was selected by the CBO.

Virtual Indexes can be useful to quickly determine whether an index is a viable option if it were to be actually created. However, caution needs to be exercised if Virtual Indexes are used for cost comparison purposes and although Virtual Indexes might be  used by the CBO, it might be another story entirely once the index is physically created and the actual index related statistics determined.

Introduction to Fake / Virtual / NOSEGMENT Indexes January 11, 2008

Posted by Richard Foote in Fake Indexes, Index Access Path, NOSEGMENT Option, Oracle Cost Based Optimizer, Oracle Indexes, Virtual Indexes.
10 comments

OK, as promised, answer to index fact #5 you may not have known:

“It’s possible to make the CBO reference and use within an execution plan indexes that don’t in actual fact exist”.

Before I start, please note this feature is not officially documented other than the odd Metalink note and requires the setting of an undocumented parameter to work, so please exercise caution.

Fake Indexes (also known as Virtual or Nosegment Indexes) have been around for a long time, since 8i days. They’re used primarily by Oracle Enterprise Manager and its Tuning Pack which has various wizards that can do “what if” type analysis. One of these is the Index Wizard which can kinda “pretend” to create an index and see what the Cost Based Optimizer might do if such an index really existed.

It’s possible to create these Fake indexes manually by using the NOSEGMENT clause when creating an index:

CREATE INDEX Bowie_idx ON Bowie_Table(Ziggy) NOSEGMENT;

This will populate some (but not many) DD related tables but will not actually create an index segment or consume any actual storage. It’s not maintained in any way by DML operations on the parent table and it can’t be altered or rebuilt as can a conventional, “real” index (it will generate an ORA-08114 error if you try to do so). You can analyze or run dbms_stats over the index but the index is not treated as analyzed as such (as can be seen via a 10053 trace).

It’s also only visible to the CBO, if and only if a session has the following parameter set:

ALTER SESSION SET “_use_nosegment_indexes” = true;

The CBO will now consider the index and potentially include it within an execution plan. However, at execution time Oracle can of course not use the index and will revert to the next best thing.

A Fake index is basically an index you have when you don’t really have an index in order to see if it could be useful if it really existed.

This Fake Indexes Demo shows how they work and can be used.

1 down, 7 to go … 😉