jump to navigation

Automatic Indexing: Potential Locking Issues Part I (“Rattle That Lock”) December 1, 2022

Posted by Richard Foote in 19c, Automatic Indexing, Autonomous Database, Exadata, Full Table Scans, Invisible Indexes, Locking Issues, Oracle, Oracle 21c, Oracle Blog, Oracle Cloud, Oracle General, Oracle Indexes, Oracle19c, Unusable Indexes.
1 comment so far

 

I’ve discussed previously locking issues associated with the creation of indexes. Although things have changed and improved over the years, even with the ONLINE option currently, an index creation process still requires (albeit brief and non-escalating) locks on the underlining table.

Basically, there needs to be a brief period where there isn’t an active transaction on the underlining table for the index creation process to complete, else it will forced to wait and hang. Oracle requires a table lock on the underlining table at the start of the CREATE or REBUILD process (to guarantee data dictionary information) and a lock at the end of the process (to merge index changes made during the rebuild into the final index structure).

So how do these index lock requirements potentially impact the Automatic Indexing process?

To investigate, I’ll create and populate a basic table with a highly selective CODE column:

SQL> create table bowie_busy (id number constraint bowie_busy_pk primary key, code number, name varchar2(42));

Table created.

SQL> insert into bowie_busy select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_BUSY');

PL/SQL procedure successfully completed.

 

In a second session, I’ll insert a new row but NOT commit the change, thereby creating a extended transaction:

SQL> insert into bowie_busy values (10000001, 42, 'Ziggy Stardust');

1 row created.

 

Back in the original session, I’ll run the following SQL numerous times:

 

SQL> select * from bowie_busy where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3896751453

--------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |   10 |   230 |    6714 (2)| 00:00:01 |
| * 1 |  TABLE ACCESS FULL | BOWIE_BUSY |   10 |   230 |    6714 (2)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      71423 consistent gets
      38657 physical reads
          0 redo size
        885 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

Without an associated index in place, the CBO currently has no choice but to perform a Full Table Scan. But with the SQL only returning 10 rows from the 10M table, clearly an index would be beneficial.

But how does the existing transaction and associated locks on table impact the Automatic Indexing process?

There’s nothing magical here. With the current transaction in place on the underlying table, the index creation process simply can’t be completed. If we look at the status of the Automatic Index:

SQL> select index_name, auto, constraint_index, visibility, status, num_rows, leaf_blocks from user_indexes where table_
name='BOWIE_BUSY';

INDEX_NAME                     AUT CON VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS
------------------------------ --- --- --------- -------- ---------- -----------
BOWIE_BUSY_PK                  NO  YES VISIBLE   VALID      10000000       19856
SYS_AI_8pkdh6q096qvs           YES NO  INVISIBLE UNUSABLE   10000000       23058

It remains in its initial INVISIBLE/USABLE state.

If we look at the Automatic Indexing monitoring report, some 6 HOURS after the initial running of the Automatic Index process for this index:

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT
--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start              : 01-DEC-2022 07:12:31
Activity end                : 01-DEC-2022 13:05:53
Executions completed        : 0
Executions interrupted      : 0
Executions with fatal error : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates           : 0
Indexes created            : 0
Space used                 : 0 B
Indexes dropped            : 0
SQL statements verified    : 0
SQL statements improved    : 0
SQL plan baselines created : 0
Overall improvement factor : 1x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes   : 0
Space used       : 0 B
Unusable indexes : 0
-------------------------------------------------------------------------------

ERRORS
--------------------------------------------------------------------------------
-------------
No errors found.
--------------------------------------------------------------------------------
-------------

 

We notice that the whole Automatic Indexing process has been locked out and left in a hanging state (the times between the activity start/end times just keep climbing, with 0 executions of the Automatic Indexing process completed).

Without a VISIBLE/USABLE automatic index in place, if we re-run the SQL again:

SQL> select * from bowie_busy where code=42;

10 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 3896751453

--------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |   10 |   230 |    6714 (2)| 00:00:01 |
| * 1 |  TABLE ACCESS FULL | BOWIE_BUSY |   10 |   230 |    6714 (2)| 00:00:01 |
--------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      71423 consistent gets
      38657 physical reads
          0 redo size
        885 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         10 rows processed

 

The CBO has again no choice but to still perform the highly inefficient Full Table Scan.

And the required Automatic Index won’t be able to be created until the existing transaction on the underlying table has completed.

HOWEVER, as we’ll see in Part II, the possible ramifications of this locking transaction goes way past the impact it has on just this SQL or specific automatic index…

Automatic Indexes: Automatically Rebuild Unusable Indexes Part IV (“Nothing Has Changed”) May 31, 2022

Posted by Richard Foote in 19c, 19c New Features, 21c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Full Table Scans, Index Column Order, Index Internals, Local Indexes, Mixing Auto and Manual Indexes, Oracle, Oracle 21c, Oracle Blog, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Indexing Internals Webinar, Oracle19c, Unusable Indexes.
1 comment so far

In a previous post, I discussed how Automatic Indexing (AI) does not automatically rebuild a manually built index that is in an Unusable state (but will rebuild an Unusable automatically created index).

The demo I used was a simple one, based on manually created indexes referencing a non-partitioned table.

In this post, I’m going to use a demo based on manually created indexes referencing a partitioned table.

I’ll start by creating a rather basic range-based partitioned table, using the RELEASE_DATE column to partition the data by year:

SQL> CREATE TABLE big_bowie (id number, album_id number, country_id number, release_date date,
total_sales number) PARTITION BY RANGE (release_date)
(PARTITION ALBUMS_2014 VALUES LESS THAN (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2018 VALUES LESS THAN (TO_DATE('01-JAN-2019', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2019 VALUES LESS THAN (TO_DATE('01-JAN-2020', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2020 VALUES LESS THAN (TO_DATE('01-JAN-2021', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2021 VALUES LESS THAN (MAXVALUE));

Table created.

SQL> INSERT INTO big_bowie SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_BOWIE');

PL/SQL procedure successfully completed.

I’ll next manually create a couple indexes; a non-partitioned index based on just the ALBUM_ID column and a prefixed locally partitioned index, based on the columns RELEASE_DATE, TOTAL_SALES:

 

SQL> create index album_id_i on big_bowie(album_id);

Index created.

SQL> create index release_date_total_sales_i on big_bowie(release_date, total_sales) local;

Index created.

 

If we now re-organise just partition ALBUMS_2017 (without using the ONLINE clause):

SQL> alter table big_bowie move partition albums_2017;

Table altered.

This results in the non-partitioned index and the ALBUMS_2017 local index partition becoming Unusable:

SQL> select index_name, status from user_indexes where table_name='BIG_BOWIE';

INDEX_NAME                     STATUS
------------------------------ --------
ALBUM_ID_I                     UNUSABLE
RELEASE_DATE_TOTAL_SALES_I     N/A

SQL> select index_name, partition_name, status from user_ind_partitions
     where index_name='RELEASE_DATE_TOTAL_SALES_I';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2014          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2015          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2016          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2017          UNUSABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2018          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2019          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2020          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2021          USABLE

Let’s now run a number of queries a number of times. The first series is based on a predicate on just the ALBUM_ID column, such as:

SQL> select * from big_bowie where album_id=42;

2000 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1510748290

-------------------------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows | Bytes | Cost (%CPU) | Time     | Pstart| Pstop |
-------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           | 2000 | 52000 |    7959 (2) | 00:00:01 |       |       |
|   1 | PARTITION RANGE ALL |           | 2000 | 52000 |    7959 (2) | 00:00:01 |     1 |     8 |
| * 2 |  TABLE ACCESS FULL  | BIG_BOWIE | 2000 | 52000 |    7959 (2) | 00:00:01 |     1 |     8 |
-------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("ALBUM_ID"=42)
  - filter("ALBUM_ID"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      48593 consistent gets
      42881 physical reads
          0 redo size
      44289 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
       2000 rows processed

We’ll also run a series of queries based on both the RELEASE_DATE column using dates from the unusable index partition and the TOTAL_SALES column, such as:

SQL> select * from big_bowie where release_date='01-JUN-2017' and total_sales=42;

no rows selected

Execution Plan
----------------------------------------------------------
Plan hash value: 3245457041

----------------------------------------------------------------------------------------------------
| Id  | Operation              | Name      | Rows | Bytes | Cost (%CPU) | Time     | Pstart| Pstop |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT       |           |    1 |    26 |     986 (2) | 00:00:01 |       |       |
|   1 | PARTITION RANGE SINGLE |           |    1 |    26 |     986 (2) | 00:00:01 |     4 |     4 |
| * 2 |  TABLE ACCESS FULL     | BIG_BOWIE |    1 |    26 |     986 (2) | 00:00:01 |     4 |     4 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - storage("TOTAL_SALES"=42 AND "RELEASE_DATE"=TO_DATE(' 2017-06-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))
   - filter("TOTAL_SALES"=42 AND "RELEASE_DATE"=TO_DATE(' 2017-06-01 00:00:00',
'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       5573 consistent gets
          0 physical reads
          0 redo size
        676 bytes sent via SQL*Net to client
         41 bytes received via SQL*Net from client
          1 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          0 rows processed

Without a valid/usable index, the CBO currently has no choice but to use a Full Table Scan on the first query, and a Full Partition Scan on the partition with the unusable local index.

So what does AI make of things? Does it rebuild the unusable manually created indexes so the associated indexes can be used to improve these queries?

If we wait until the next AI task completes and check out the indexes on the table:

SQL> select index_name, status, partitioned from user_indexes where table_name='BIG_BOWIE';

INDEX_NAME                     STATUS   PAR
------------------------------ -------- ---
RELEASE_DATE_TOTAL_SALES_I     N/A      YES
ALBUM_ID_I                     UNUSABLE NO
SYS_AI_aw2825ffpus5s           VALID    NO
SYS_AI_2hf33fpvnqztw           VALID    NO

SQL> select index_name, partition_name, status from user_ind_partitions
     where index_name='RELEASE_DATE_TOTAL_SALES_I';

INDEX_NAME                     PARTITION_NAME       STATUS
------------------------------ -------------------- --------
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2014          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2015          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2016          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2017          UNUSABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2018          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2019          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2020          USABLE
RELEASE_DATE_TOTAL_SALES_I     ALBUMS_2021          USABLE

We notice that AI has created two new non-partitioned automatic indexes, while both the manually created indexes remain in the same unusable state. If we look at the columns associated with these new automatic indexes:

SQL> select index_name, column_name, column_position
from user_ind_columns where table_name='BIG_BOWIE';

INDEX_NAME                     COLUMN_NAME          COLUMN_POSITION
------------------------------ -------------------- ---------------
ALBUM_ID_I                     ALBUM_ID                           1
RELEASE_DATE_TOTAL_SALES_I     RELEASE_DATE                       1
RELEASE_DATE_TOTAL_SALES_I     TOTAL_SALES                        2
SYS_AI_aw2825ffpus5s           ALBUM_ID                           1
SYS_AI_aw2825ffpus5s           RELEASE_DATE                       2
SYS_AI_2hf33fpvnqztw           TOTAL_SALES                        1
SYS_AI_2hf33fpvnqztw           RELEASE_DATE                       2

As we can see, AI has logically replaced both unusable indexes.

The manual index based on ALBUM_ID has been replaced with an inferior index based on the ALBUM_ID, RELEASE_DATE columns. Inferior in that the automatic index is both redundant (if only the manual index on ALBUM_ID were rebuilt) and in that it has the logically unnecessary RELEASE_DATE column to inflate the size of the index.

The manual index based on the RELEASE_DATE, TOTAL_SALES columns has been replaced with a redundant automatic index based on the reversed TOTAL_SALES, RELEASE_DATE columns.

Now, AI has indeed automatically addressed the current FTS performance issues associated with these queries by creating these indexes, but a better remedy would have been to rebuild the unusable manual indexes and hence negate the need for these redundant automatic indexes.

But currently (including with version 21.3), AI will NOT rebuild unusable manually created indexes, no matter the scenario, and will instead create additional automatic indexes if it’s viable for it to do so.

A reason why Oracle at times recommends dropping all current manually created secondary indexes before implementing AI (although of course this comes with a range of obvious issues and concerns).

If these manually created indexes didn’t exist, I’ll leave it as an exercise to the discernable reader on what automatic indexes would have been created…

As always, this restriction may change in future releases…

Automatic Indexes: Automatically Rebuild Unusable Indexes Part III (“Waiting For The Man”) May 17, 2022

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, Exadata, Full Table Scans, Manual Indexes, Mixing Auto and Manual Indexes, Oracle, Oracle Blog, Oracle Cloud, Oracle General, Oracle Indexes, Oracle19c, Unusable Indexes.
1 comment so far

I’ve previously discussed how Automatic Indexing (AI) will not only create missing indexes, but will also rebuild unusable indexes, be it a Global or Local index.

However, all my previous examples have been with Automatic Indexes. How does AI handle unusable indexes in which the indexes were manually created?

In my first demo, I’ll start by creating a basic non-partitioned table:

SQL> create table bowie_stuff (id number, album_id number, country_id number, release_date date, total_sales number);

Table created.

SQL> insert into bowie_stuff select rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800),
ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BOWIE_STUFF');

PL/SQL procedure successfully completed.

We next manually create an index on the highly selective TOTAL_SALES column:

SQL> create index bowie_stuff_total_sales_i on bowie_stuff(total_sales);

Index created.

Let’s now invalidate the index by re-organising the table without the online clause:

SQL> alter table bowie_stuff move;

Table altered.

SQL> select index_name, status from user_indexes where table_name='BOWIE_STUFF';

INDEX_NAME                     STATUS
------------------------------ --------
BOWIE_STUFF_TOTAL_SALES_I      UNUSABLE

So the index is now in an UNUSABLE state.

To perk up the interest of AI, I’ll run a number of queries such as the following with a predicate condition on TOTAL_SALES:

select * from bowie_stuff where total_sales=42;

18 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 910563088

---------------------------------------------------------------------------------
| Id | Operation          | Name        | Rows | Bytes | Cost (%CPU) | Time     |
---------------------------------------------------------------------------------
|  0 | SELECT STATEMENT   |             |   20 |   520 |    7427 (2) | 00:00:01 |
|* 1 |  TABLE ACCESS FULL | BOWIE_STUFF |   20 |   520 |    7427 (2) | 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("TOTAL_SALES"=42)
    filter("TOTAL_SALES"=42)

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
      42746 consistent gets
      42741 physical reads
          0 redo size
       1392 bytes sent via SQL*Net to client
         52 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
         18 rows processed

Without a valid index, the CBO has no choice but to perform an expensive full table scan.

However, it doesn’t matter how long I wait or how many different queries I run similar to the above, AI currently will never rebuild an unusable index if the index was manually created.

AI will only rebuild unusable automatically created indexes.

I’ve discussed previously how automatic and manually created indexes often don’t gel well together and is one of the key reasons why Oracle recommends dropping all manually created secondary indexes if you wish to implement AI (using the DBMS_AUTO_INDEX.DROP_SECONDARY_INDEXES procedure, which I’ll discuss in a future post).

Things can get a little interesting with AI, if the underlining table is partitioned and you have manually created unusable indexes.

As I’ll discuss in my next post…

Oracle 19c Automatic Indexing: Invisible/Valid Automatic Indexes (Bowie Rare) August 31, 2021

Posted by Richard Foote in 19c, 19c New Features, Attribute Clustering, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Exadata, Index Access Path, Index statistics, Invisible Indexes, Invisible/Valid Indexes, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle Indexes, Oracle Statistics, Oracle19c, Unusable Indexes.
1 comment so far

In my previous post, I discussed how newly created Automatic Indexes can have one of three statuses, depending the selectivity and effectiveness of the associated Automatic Index.

Indexes that improve performance sufficiently are created as Visible/Valid indexes and can be subsequently considered by the CBO. Indexes that are woeful and have no chance of improving performance are created as Invisible/Unusable indexes. Indexes considered potentially suitable but ultimately don’t sufficiently improve performance, are created as Invisible/Valid indexes.

Automatic Indexes are created as Visible/Valid indexes when shown to improve performance (by the _AUTO_INDEX_IMPROVEMENT_THRESHOLD parameter). But as I rarely came across Invisible/Valid Automatic Indexes (except for when Automatic Indexing is set to “Report Only” mode), I was curious to determine approximately at what point were such indexes created by the Automatic Indexing process.

To investigate things, I created a table with columns that contain data with various levels of selectivity, some of which should fall inside and outside the range of viability of any associated index, based on the cost of the associated Full Table Scan.

The following table has 32 columns of interest, each with a slight variation of distinct values giving small differences in overall column selectivity:

SQL> create table bowie_stuff1 (id number, code1 number, code2 number, code3 number, code4 number, code5 number, code6 number, code7 number, code8 number, code9 number, code10 number, code11 number, code12 number, code13 number, code14 number, code15 number, code16 number, code17 number, code18 number, code19 number, code20 number, code21 number, code22 number, code23 number, code24 number, code25 number, code26 number, code27 number, code28 number, code29 number, code30 number, code31 number, code32 number, name varchar2(42));

Table created.

SQL> insert into bowie_stuff1 
select rownum, 
       mod(rownum, 900)+1, 
       mod(rownum, 1000)+1, 
       mod(rownum, 1100)+1, 
       mod(rownum, 1200)+1, 
       mod(rownum, 1300)+1, 
       mod(rownum, 1400)+1, 
       mod(rownum, 1500)+1, 
       mod(rownum, 1600)+1, 
       mod(rownum, 1700)+1, 
       mod(rownum, 1800)+1, 
       mod(rownum, 1900)+1, 
       mod(rownum, 2000)+1, 
       mod(rownum, 2100)+1, 
       mod(rownum, 2200)+1, 
       mod(rownum, 2300)+1, 
       mod(rownum, 2400)+1, 
       mod(rownum, 2500)+1, 
       mod(rownum, 2600)+1, 
       mod(rownum, 2700)+1, 
       mod(rownum, 2800)+1, 
       mod(rownum, 2900)+1, 
       mod(rownum, 3000)+1, 
       mod(rownum, 3100)+1, 
       mod(rownum, 3200)+1, 
       mod(rownum, 3300)+1, 
       mod(rownum, 3400)+1, 
       mod(rownum, 3500)+1, 
       mod(rownum, 3600)+1, 
       mod(rownum, 3700)+1, 
       mod(rownum, 3800)+1, 
       mod(rownum, 3900)+1, 
       mod(rownum, 4000)+1,
       'THE RISE AND FALL OF ZIGGY STARDUST' 
from dual connect by level >=10000000;

10000000 rows created.

SQL> commit;

Commit complete.

As always, it’s important that statistics be collected for Automatic Indexing to function properly:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE_STUFF1', estimate_percent=>null);

PL/SQL procedure successfully completed.

 

So on a 10M row table, I have 32 columns with the number of distinct values varying by only 100 values per column (or by a selectivity of just 0.001%):

SQL> select column_name, num_distinct, density, histogram from dba_tab_columns where table_name='BOWIE_STUFF1' order by num_distinct;

COLUMN_NAME  NUM_DISTINCT    DENSITY HISTOGRAM
------------ ------------ ---------- ---------------
NAME                    1  .00000005 FREQUENCY
CODE1                 900    .001111 HYBRID
CODE2                1000       .001 HYBRID
CODE3                1100    .000909 HYBRID
CODE4                1200    .000833 HYBRID
CODE5                1300    .000769 HYBRID
CODE6                1400    .000714 HYBRID
CODE7                1500    .000667 HYBRID
CODE8                1600    .000625 HYBRID
CODE9                1700    .000588 HYBRID
CODE10               1800    .000556 HYBRID
CODE11               1900    .000526 HYBRID
CODE12               2000      .0005 HYBRID
CODE13               2100    .000476 HYBRID
CODE14               2200    .000455 HYBRID
CODE15               2300    .000435 HYBRID
CODE16               2400    .000417 HYBRID
CODE17               2500      .0004 HYBRID
CODE18               2600    .000385 HYBRID
CODE19               2700     .00037 HYBRID
CODE20               2800    .000357 HYBRID
CODE21               2900    .000345 HYBRID
CODE22               3000    .000333 HYBRID
CODE23               3100    .000323 HYBRID
CODE24               3200    .000312 HYBRID
CODE25               3300    .000303 HYBRID
CODE26               3400    .000294 HYBRID
CODE27               3500    .000286 HYBRID
CODE28               3600    .000278 HYBRID
CODE29               3700     .00027 HYBRID
CODE30               3800    .000263 HYBRID
CODE31               3900    .000256 HYBRID
CODE32               4000     .00025 HYBRID
ID               10000000          0 HYBRID

I’ll next run the below queries (based on a simple equality predicate on each column) several times each in batches of 8 queries, so as to not swamp the Automatic Indexing process with potential new index requests (the ramifications of which I’ll discuss in another future post):

SQL> select * from bowie_stuff1 where code1=42;
SQL> select * from bowie_stuff1 where code2=42;
SQL> select * from bowie_stuff1 where code3=42;
SQL> select * from bowie_stuff1 where code4=42;
SQL> select * from bowie_stuff1 where code5=42;
...
SQL> select * from bowie_stuff1 where code31=42;
SQL> select * from bowie_stuff1 where code32=42;

 

If we now look at the statuses of the Automatic Indexes subsequently created:

SQL> select i.index_name, c.column_name, i.auto, i.constraint_index, i.visibility, i.status, i.num_rows, i.leaf_blocks, i.clustering_factor
from user_indexes i, user_ind_columns c
where i.index_name=c.index_name and i.table_name='BOWIE_STUFF1' order by visibility, status;

INDEX_NAME             COLUMN_NAME  AUT CON VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
---------------------- ------------ --- --- --------- -------- ---------- ----------- -----------------
SYS_AI_5rw9j3d8pc422   CODE5        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_48q3j752csn1p   CODE4        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_9sgharttf3yr7   CODE3        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_8n92acdfbuh65   CODE2        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_brgtfgngu3cj9   CODE1        YES NO  INVISIBLE UNUSABLE   10000000       21702           4272987
SYS_AI_1tu5u4012mkzu   CODE11       YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_34b6zwgtm86rr   CODE12       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_gd0ccvdwwb4mk   CODE13       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_7k7wh28n3nczy   CODE14       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_67k2zjp09w101   CODE15       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_5fa6k6fm0k6wg   CODE10       YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_4624ju6bxsv57   CODE9        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_bstrdkkxqtj4f   CODE8        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_39xqjjar239zq   CODE7        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_6h0adp60faytk   CODE6        YES NO  INVISIBLE VALID      10000000       15364          10000000
SYS_AI_5u0bqdgcx52vh   CODE16       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_0hzmhsraqkcgr   CODE22       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_4x716k4mdn040   CODE21       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_6wsuwr7p6drsu   CODE20       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_b424tdjx82rwy   CODE19       YES NO  INVISIBLE VALID      10000000       15366          10000000
SYS_AI_3a2y07fqkzv8x   CODE18       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_8dp0b3z0vxzyg   CODE17       YES NO  INVISIBLE VALID      10000000       15365          10000000
SYS_AI_d95hnqayd7t08   CODE23       YES NO  VISIBLE   VALID      10000000       15366          10000000
SYS_AI_fry4zrxqtpyzg   CODE24       YES NO  VISIBLE   VALID      10000000       15366          10000000
SYS_AI_920asb69q1r0m   CODE25       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_026pa8880hnm2   CODE31       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_96xhzrguz2qpy   CODE32       YES NO  VISIBLE   VALID      10000000       15368          10000000
SYS_AI_3dq93cc7uxruu   CODE29       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_5nbz41xny8fvc   CODE28       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_fz4q9bhydu2qt   CODE27       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_0kwczzg3k3pfw   CODE26       YES NO  VISIBLE   VALID      10000000       15367          10000000
SYS_AI_4qd5tsab7fnwx   CODE30       YES NO  VISIBLE   VALID      10000000       15367          10000000

We can see we indeed have the 3 statuses of Automatic Indexes captured:

Columns with a selectivity equal or worse to that of COL5 with 1300 distinct values are created as Invisible/Unusable indexes. Returning 10M/1300 rows or a cardinality of approx. 7,693 or more rows is just too expensive for such indexes on this table to be viable. This represents a selectivity of approx. 0.077%.

Note how the index statistics for these Invisible/Unusable indexes are not accurate. They all have an estimated LEAF_BLOCKS of 21702 and a CLUSTERING_FACTOR of 4272987. However, we can see from the other indexes which are physically created that these are not correct and are substantially off the mark with the actual LEAF_BLOCKS being around 15364 and the CLUSTERING_FACTOR actually much worse at around 10000000.

Again worthy of a future post to discuss how Automatic Indexing processing has to make (potentially inaccurate) guesstimates for these statistics in its analysis of index viability when such indexes don’t yet physically exist.

Columns with a selectivity equal or better to that of COL23 which has 3100 distinct values are created as Visible/Valid indexes. Returning 10M/3100 rows or a cardinality of approx. 3226 or less rows is cheap enough for such indexes on this table to be viable. This represents a selectivity of approx. 0.032%.

So in this specific example, only those columns between 1400 and 3000 distinct values meet the “borderline” criteria in which the Automatic Indexing process creates Invisible/Valid indexes. This represents a very very narrow selectivity range of only approx. 0.045% in which such Invisible/Valid indexes are created. Or for this specific example, only those columns that return approx. between 3,333 and 7,143 rows from the 10M row table.

Now the actual numbers and total range of selectivities for which Invisible/Valid Automatic Indexes are created of course depends on all sorts of factors, such as the size/cost of FTS of the table and not least the clustering of the associated data (which I’ve blogged about ad nauseam).

The point I want to make is that the range of viability for such Invisible/Valid indexes is relatively narrow and the occurrences of such indexes relatively rare in your databases. As such, the vast majority of Automatic Indexes are likely to be either Visible/Valid or Invisible/Unusable indexes.

It’s important to recognised this when you encounter such Invisible/Valid Automatic Indexes (outside of “REPORT ONLY” implementations), as it’s an indication that such an index is a borderline case that is currently NOT considered by the CBO (because of it being Invisible).

However, this Invisible/Valid Automatic Index status should really change to either of the other two more common statuses in the near future.

I’ll expand on this point in a future post…

Oracle 19c Automatic Indexing: Indexing With Stale Statistics Part I (Dead Against It) October 6, 2020

Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Exadata X8, Full Table Scans, High Frequency Statistics Collection, Index Access Path, Index statistics, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Performance Tuning, Stale Statistics, Unusable Indexes.
5 comments

A “golden rule” when working with Automatic Indexing is that things don’t work properly if there are stale statistics on the dependant objects. Stale statistics can of course be problematic but they can be particularly troublesome when dealing with Automatic Indexing.

In the Oracle Autonomous Database environments, this issue is addressed somewhat by the new High Frequency Statistics Collection capability, which helps to automatically collect stale statistics on a regular basis. However, in on-prem Exadata environments where this can more easily be turned off or collected less frequently, it’s a potential issue worth consideration.

I’ll start with a simple little table, with a CODE column that has lots of distinct values:

SQL> create table bowie_stale (id number constraint bowie_stale_pk primary key, code number, name varchar2(42));

Table created.

SQL> insert into bowie_stale select rownum, mod(rownum, 1000000)+1, 'David Bowie' from dual connect by level <= 10000000;

10000000 rows created.

SQL> commit;

Commit complete.

Importantly, I don’t collect statistics on this newly populated table…

SQL> select table_name, num_rows, blocks, last_analyzed from user_tables
where table_name='BOWIE_STALE';

TABLE_NAME        NUM_ROWS     BLOCKS LAST_ANAL
--------------- ---------- ---------- ---------
BOWIE_STALE

SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols
where table_name='BOWIE_STALE';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       LAST_ANAL
-------------------- ------------ ---------- --------------- ---------
ID                                           NONE
CODE                                         NONE
NAME                                         NONE

If we now run the following query a number of times while there are no statistics on the table:

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)| Time        |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |  437 | 21413 |    553 (16)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |  437 | 21413 |    553 (16)| 00:00:01    |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          6  recursive calls
          0  db block gets
      39026  consistent gets
          0  physical reads
          0  redo size
        867  bytes sent via SQL*Net to client
        588  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

 

The CBO has no choice but to use a FTS as I don’t yet have an index on the CODE column.

If I now wait for the next Automatic Indexing task to kick in AND if there are still NO statistics on the table:

 

SQL> select dbms_auto_index.report_last_activity('text', 'ALL', 'ALL' ) report from dual;

REPORT

--------------------------------------------------------------------------------
GENERAL INFORMATION
-------------------------------------------------------------------------------
Activity start               : 05-JUL-2020 06:36:31
Activity end                 : 05-JUL-2020 06:37:07
Executions completed         : 1
Executions interrupted       : 0
Executions with fatal error  : 0
-------------------------------------------------------------------------------

SUMMARY (AUTO INDEXES)
-------------------------------------------------------------------------------
Index candidates            : 1
Indexes created             : 0
Space used                  : 0 B
Indexes dropped             : 0
SQL statements verified     : 0
SQL statements improved     : 0
SQL plan baselines created  : 0
Overall improvement factor  : 0x
-------------------------------------------------------------------------------

SUMMARY (MANUAL INDEXES)
-------------------------------------------------------------------------------
Unused indexes    : 0
Space used        : 0 B
Unusable indexes  : 0

 

You can see that there was the one index candidate BUT no Automatic Index appears to have been created.

Assuming there are still no statistics:

 

SQL> select table_name, num_rows, blocks, last_analyzed from user_tables where table_name='BOWIE_STALE';

TABLE_NAME                       NUM_ROWS    BLOCKS  LAST_ANAL
------------------------------ ---------- ---------- ---------
BOWIE_STALE

SQL> select column_name, num_distinct, density, histogram, last_analyzed from user_tab_cols
where table_name='BOWIE_STALE2';

COLUMN_NAME          NUM_DISTINCT    DENSITY HISTOGRAM       LAST_ANAL
-------------------- ------------ ---------- --------------- ---------
ID                                           NONE
CODE                                         NONE
NAME                                         NONE

 

If we look now at what indexes exist on the table:

SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_STALE';

INDEX_NAME                     AUT CON VISIBILIT COMPRESSION   STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --- --------- ------------- -------- ---------- ----------- -----------------
BOWIE_STALE_PK                 NO  YES VISIBLE   DISABLED      VALID
SYS_AI_300kk2unp8tr0           YES NO  INVISIBLE DISABLED      UNUSABLE          0           0                 0

SQL> select index_name, column_name, column_position from user_ind_columns
where table_name='BOWIE_STALE2' order by index_name, column_position;

INDEX_NAME                     COLUMN_NAME          COLUMN_POSITION
------------------------------ -------------------- ---------------
BOWIE_STALE_PK                 ID                                 1
SYS_AI_300kk2unp8tr0           CODE                               1

 

We notice there is now an Automatic Index BUT it remains in an UNUSABLE/INVISIBLE state. This means the index can’t be used by the CBO.

So if we now re-run the SQL query again:

 

SQL> select * from bowie_stale where code=42;

10 rows selected.

Execution Plan

-----------------------------------------------------------------------------------------
| Id | Operation                | Name        | Rows | Bytes | Cost (%CPU)| Time        |
-----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |             |  437 | 21413 |    553 (16)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_STALE |  437 | 21413 |    553 (16)| 00:00:01    |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=42)
    filter("CODE"=42)

Note
-----
- dynamic statistics used: dynamic sampling (level=2)
- automatic DOP: Computed Degree of Parallelism is 1

Statistics
----------------------------------------------------------
          6  recursive calls
          0  db block gets
      39026  consistent gets
          0  physical reads
          0  redo size
        867  bytes sent via SQL*Net to client
        588  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
         10  rows processed

 

The CBO has no choice still but to use the FTS.

In Part II, we’ll see that once we get into this scenario, it can be a tad problematic to get ourselves out of it and get the Automatic Index created as we would like…

Oracle 19c Automatic Indexing: Data Skew Part III (The Good Son) September 16, 2020

Posted by Richard Foote in 19c, 19c New Features, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Data Skew, Index Access Path, Oracle, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle Statistics, Oracle19c, Unusable Indexes.
add a comment

 

I’m going to expand just a tad on my previous posts on data skew and run a simple query that returns a few rows based on a column predicate AND another query on the same column that returns many rows.

The following table has a CODE column as with previous posts with the data heavily skewed:

SQL> create table bowie_skew (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie_skew select rownum, 10, 'DAVID BOWIE' from dual connect by level <=1000000;

1000000 rows created.

SQL> update bowie_skew set code = 9 where mod(id,3) = 0;

333333 rows updated.

SQL> update bowie_skew set code = 1 where mod(id,2) = 0 and id between 1 and 20000;

10000 rows updated.

SQL> update bowie_skew set code = 2 where mod(id,2) = 0 and id between 30001 and 40000;

5000 rows updated.

SQL> update bowie_skew set code = 3 where mod(id,100) = 0 and id between 300001 and 400000;

1000 rows updated.

SQL> update bowie_skew set code = 4 where mod(id,100) = 0 and id between 400001 and 500000;

1000 rows updated.

SQL> update bowie_skew set code = 5 where mod(id,100) = 0 and id between 600001 and 700000;

1000 rows updated.

SQL> update bowie_skew set code = 6 where mod(id,1000) = 0 and id between 700001 and 800000;

100 rows updated.

SQL> update bowie_skew set code = 7 where mod(id,1000) = 0 and id between 800001 and 900000;

100 rows updated.

SQL> update bowie_skew set code = 8 where mod(id,1000) = 0 and id between 900001 and 1000000;

100 rows updated.

SQL> commit;

Commit complete.

 

I’ll next collect statistics with NO histogram, as I don’t think they’re required at this point:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'bowie_skew', estimate_percent=>100, method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

If we look at the table data:

SQL> select code, count(*) from bowie_skew group by code order by code;

      CODE   COUNT(*)
---------- ----------
         1      10000
         2       5000
         3       1000
         4       1000
         5       1000
         6        100
         7        100
         8        100
         9     327235
        10     654465

 

The value “7” only has 100 associated rows, while the value “10” is very common with 654,465 rows.

But I currently have no histograms:

SQL> select column_name, num_buckets, histogram from user_tab_cols
where table_name='BOWIE_SKEW';

COLUMN_NAME     NUM_BUCKETS HISTOGRAM
--------------- ----------- ---------------
ID                        1 NONE
CODE                      1 NONE
NAME                      1 NONE

 

If I run the following query with a CODE=7 predicate just once:

SQL> select * from bowie_skew where code=7;

100 rows selected.

Execution Plan

--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time      |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |   100K|  1953K|   570   (7)| 00:00:01  |
|   1 |  PX COORDINATOR              |            |       |       |            |           |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000   |   100K|  1953K|   570   (7)| 00:00:01  |
|   3 |    PX BLOCK ITERATOR         |            |   100K|  1953K|   570   (7)| 00:00:01  |
|*  4 |     TABLE ACCESS STORAGE FULL| bowie_skew |   100K|  1953K|   570   (7)| 00:00:01  |
--------------------------------------------------------------------------------------------

 

It uses a Full Table Scan (the CBO has no choice without an index) AND hopelessly gets the cardinality estimate wrong, thinking 100K are going to be returned (and not the 100 actual rows).  So the CBO is unlikely to use an index anyways as it would be deemed too expensive to return so many rows.

I’ll now run the following query many times on the CODE=10 predicate that returns many rows:

SQL> select * from bowie_skew where code=10;

654465 rows selected.

Execution Plan

--------------------------------------------------------------------------------------------
| Id  | Operation                    | Name       | Rows  | Bytes | Cost (%CPU)| Time      |
--------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |            |   100K|  1953K|   570   (7)| 00:00:01  |
|   1 |  PX COORDINATOR              |            |       |       |            |           |
|   2 |   PX SEND QC (RANDOM)        | :TQ10000   |   100K|  1953K|   570   (7)| 00:00:01  |
|   3 |    PX BLOCK ITERATOR         |            |   100K|  1953K|   570   (7)| 00:00:01  |
|*  4 |     TABLE ACCESS STORAGE FULL| bowie_skew |   100K|  1953K|   570   (7)| 00:00:01  |
--------------------------------------------------------------------------------------------

 

So again, no choice here with a FTS and we likely wouldn’t want to use an index anyways as it would be just too expensive.

If we check out what the Automatic Indexing process has done with such a workload:

SQL> select dbms_auto_index.report_last_activity() report from dual;

REPORT

INDEX DETAILS
-------------------------------------------------------------------------------
The following indexes were created:
*: invisible
-------------------------------------------------------------------------------
--------------------------------------------------------------------------
| Owner | Table      | Index                | Key  | Type   | Properties |
--------------------------------------------------------------------------
| BOWIE | BOWIE_SKEW | SYS_AI_7psvzc164vbng | CODE | B-TREE | NONE       |
--------------------------------------------------------------------------
-------------------------------------------------------------------------------

VERIFICATION DETAILS
-------------------------------------------------------------------------------
The performance of the following statements improved:
-------------------------------------------------------------------------------
-------------------------------------------------------------------------------
Parsing Schema Name : BOWIE
SQL ID              : 6fm3m8cg2jnun
SQL Text            : select * from bowie_skew where code=7
Improvement Factor  : 46.6x

Execution Statistics:
-----------------------------
                    Original Plan                Auto Index Plan
                    ---------------------------- ----------------------------
Elapsed Time (s):   36653                        1992
CPU Time (s):       33899                        967
Buffer Gets:        4291                         103
Optimizer Cost:     52                           4
Disk Reads:         0                            2
Direct Writes:      0                            0
Rows Processed:     100                          100
Executions:         1                            1

 

An Automatic Index on the CODE column is created (SYS_AI_7psvzc164vbng), with ONLY the SQL based on the CODE=7 predicate listed in the report. The other query is indeed too expensive for a new index to be viable and so isn’t listed.

If we look at the Plans Section of the Automatic Indexing report:

 

PLANS SECTION
---------------------------------------------------------------------------------------------

- Original
-----------------------------
Plan Hash Value : 410492785

--------------------------------------------------------------------------------------
| Id | Operation                 | Name       | Rows   | Bytes   | Cost | Time       |
--------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT          |            |        |         | 52   |            |
| 1  | TABLE ACCESS STORAGE FULL | BOWIE_SKEW | 100000 | 2000000 | 52   | 00:00:01   |
--------------------------------------------------------------------------------------

Notes
-----
- dop_reason = no expensive parallel operation
- dop = 1
- px_in_memory_imc = no
- px_in_memory = no

- With Auto Indexes
-----------------------------
Plan Hash Value : 140816325

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                 | Rows | Bytes | Cost | Time       |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                      | 119  | 2380  | 4    | 00:00:01   |
|   1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_SKEW           | 119  | 2380  | 4    | 00:00:01   |
| * 2 | INDEX RANGE SCAN                    | SYS_AI_7psvzc164vbng | 100  |       | 3    | 00:00:01   |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
------------------------------------------
* 2 - access("CODE"=7)

Notes
-----
- Dynamic sampling used for this statement ( level = 11 )

 

The important point to note here is that the cardinality estimates are relatively accurate despite there being no histograms at this stage because the Automatic Indexing session uses Dynamic Sampling Level=11. Missing/inaccurate statistics are calculated on fly and this enables the session to accurately determine the size of the returned data set and that an index is indeed the more efficient access path.

So with mixed workloads, all it takes is one SQL executed once that demonstrably improves thanks to an index for the associated Automatic Index to be created as a VISIBLE/VALID index:

SQL> select index_name, auto, visibility, status, num_rows, leaf_blocks, clustering_factor
from user_indexes where table_name='BOWIE_SKEW';

INDEX_NAME                     AUT VISIBILIT STATUS     NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ --- --------- -------- ---------- ----------- -----------------
SYS_AI_7psvzc164vbng           YES VISIBLE   VALID       1000000        1537              8534

 

If we now run the query AFTER the histograms are subsequently created thanks to the High-Frequency Automatic Statistics Collection (see previous post), the new Automatic Index is now used:

SQL> select * from bowie_skew where code=7;

100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 140816325

------------------------------------------------------------------------------------------------------------
| Id | Operation                          | Name                 | Rows | Bytes | Cost (%CPU)| Time        |
------------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                   |                      | 100  | 2000  |       4 (0)| 00:00:01    |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE_SKEW           | 100  | 2000  |       4 (0)| 00:00:01    |
|* 2 | INDEX RANGE SCAN                   | SYS_AI_7psvzc164vbng | 100  |       |       3 (0)| 00:00:01    |
------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=7)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
        104 consistent gets
          0 physical reads
          0 redo size
       2871 bytes sent via SQL*Net to client
        359 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
        100 rows processed

 

Note if the histogram is NOT yet collected, the CBO will not determine the correct cardinality estimate and will ignore the new Automatic Index (as previously discussed).

If we run again the query that returns many rows:

SQL> select * from bowie_skew where code=10;

654465 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 410492785

----------------------------------------------------------------------------------------
| Id | Operation                | Name       | Rows | Bytes | Cost (%CPU)| Time        |
----------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT         |            |  654K|    12M|     52 (16)| 00:00:01    |
|* 1 | TABLE ACCESS STORAGE FULL| BOWIE_SKEW |  654K|    12M|     52 (16)| 00:00:01    |
----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - storage("CODE"=10)
    filter("CODE"=10)

Note
-----
- automatic DOP: Computed Degree of Parallelism is 1 because of no expensive parallel operation

Statistics
----------------------------------------------------------
          0 recursive calls
          0 db block gets
       3725 consistent gets
          0 physical reads
          0 redo size
    6549708 bytes sent via SQL*Net to client
       1790 bytes received via SQL*Net from client
        132 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
     654465 rows processed

The new Automatic Index is correctly ignored by the CBO, as the query returns too many rows for the index to be viable.

So in this example, Automatic Indexing works exactly as it should. It creates a new Automatic Index for a query where it will indeed improve the performance, while other queries on the same column in which many more rows are returned are also run. For these other queries, the new Automatic Index is correctly not used as such an index would degrade the performance of the query.

In my next post, I’ll look at the first example with data skew where Automatic Indexing can be problematic…

12c Enhanced Online Index DDL Operations (Lady Godiva’s Operation) February 17, 2014

Posted by Richard Foote in 12c, Drop Index, Invisible Indexes, Online DDL, Oracle Indexes, Unusable Indexes.
6 comments

In my last couple of posts, I discussed how table partitions can be moved online since 12c, keeping all indexes in sync as part of the process.

12c also introduced enhancements to a number of index related DDL statements, removing blocking locks and making their use online and far less intrusive. The following commands now have a new ONLINE option:

DROP INDEX ONLINE

ALTER INDEX UNUSABLE ONLINE

So if we look at a little example (initially on 11g R2), where we create a table and associated index on the CODE column:

SQL> create table radiohead (id number, code number, name varchar2(30));

Table created.

SQL> insert into radiohead select rownum, mod(rownum,1000), 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index radiohead_code_i on radiohead(code);

Index created.

If we now insert a new row in one session but not commit:

SQL> insert into radiohead values (1000001, 42, 'ZIGGY STARDUST');

1 row created.

And then attempt any of the following DDL commands in another session:

SQL> drop index radiohead_code_i;

drop index radiohead_code_i
           *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
SQL> alter index radiohead_code_i invisible;

alter index radiohead_code_i invisible
            *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
SQL> alter index radiohead_code_i unusable;

alter index radiohead_code_i unusable
            *
ERROR at line 1:
ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

They all get the well-known “ORA-00054: resource busy” error.

If on the other hand, one of these DDL statements is already running in a session:

SQL> alter index radiohead_code_i unusable;

All DML statements in other sessions will hang until the DDL completes:

SQL> insert into radiohead values (1000002, 42, 'THIN WHITE DUKE');

Once the index is finally made unusable:

SQL> alter index radiohead_code_i unusable;

Index altered.

SQL> select index_name, status from dba_indexes where index_name = 'RADIOHEAD_CODE_I';

INDEX_NAME                     STATUS
------------------------------ --------
RADIOHEAD_CODE_I               UNUSABLE

SQL> select segment_name, blocks, extents from dba_segments where segment_name = 'RADIOHEAD_CODE_I';

no rows selected

We can see not only is the index now in an unusable state but the index segment has been dropped (in 11g r2) as the storage associated with the unusable index is of no further use.

So these commands prior to the Oracle 12c Database previously had locking related issues.

If we now perform the same setup in 12c and again have an outstanding transaction in a session:

SQL> drop index radiohead_code_i online;

The Drop Index command doesn’t now get the Ora-00054: resource busy, but rather hangs until all prior transactions complete.

However, while the Drop Index command hangs, it doesn’t in turn lock out transactions within other sessions. In another session:

SQL> insert into radiohead values (1000002, 42, 'THIN WHITE DUKE');

1 row created.

And in yet other session:

SQL> delete radiohead where id = 42;

1 row deleted.

SQL> commit;

Commit complete.

These all complete successfully. The Drop Index command itself will eventually complete successfully once all prior transaction have finished.

SQL> drop index radiohead_code_i online;

Index dropped.

Another more subtle difference in behaviour with 12c. If there’s an existing transaction when you decide to make an index unusable:

SQL> insert into radiohead values (1000001, 42, 'ZIGGY STARDUST');

1 row created.
SQL> alter index radiohead_code_i unusable online;

As in the previous demo, the alter index command will hang indefinitely until the previous transaction commits:

SQL> commit;

Commit complete.

SQL> alter index radiohead_code_i unusable online;

Index altered.

SQL> select index_name, status from dba_indexes where index_name = 'RADIOHEAD_CODE_I';

INDEX_NAME                STATUS
------------------------- --------
RADIOHEAD_CODE_I          UNUSABLE

SQL> select segment_name, blocks, extents from dba_segments where segment_name = 'RADIOHEAD_CODE_I';

SEGMENT_NAME         BLOCKS    EXTENTS
---------------- ---------- ----------
RADIOHEAD_CODE_I       2176         32

We note the index has eventually been made Unusable, however the segment has not now been dropped (as it was in the 11g R2 demo) due to the use of the ONLINE clause.

With the Oracle 12c Database, the locking implications and concurrency issues associated these index related DDL commands have been reduced with these new ONLINE options.

12c Online Partitioned Table Reorganisation Part I (Prelude) January 7, 2014

Posted by Richard Foote in 12c, Oracle Indexes, Partitioning, Unusable Indexes, Update Indexes, Update Indexes Online.
2 comments

First post for 2014 !!

Although it’s generally not an overly common activity with Oracle databases, reorganising a table can be somewhat painful, primarily because of the associated locking implications and the impact it has on indexes.

If we look at the following example:

SQL> create table muse2 (id number, status varchar2(6), name varchar2(30));

Table created.

SQL> insert into muse2 select rownum, 'CLOSED', 'DAVID BOWIE'
     from dual connect by level <= 3000000;

3000000 rows created.

SQL> commit;

Commit complete.

SQL> create index muse2_id_pk on muse2(id);

Index created.

SQL> alter table muse2 add constraint muse2_id_pk primary key(id);

Table altered.

SQL> create index muse2_status_i on muse2(status);

Index created.

So we have a table with a couple of indexes. We can’t move the table using the ONLINE option as it’s only applicable for Index Organized Tables:

SQL> alter table muse2 move online;
alter table muse2 move online
 *
 ERROR at line 1:
 ORA-01735: invalid ALTER TABLE option

If in one session, we have a current transaction on the table (i.e. not committed):

SQL> insert into muse2 values (3000001, 'OPEN', 'ZIGGY STARDUST');

1 row created.

An attempt to MOVE the table in another session will fail with locking issues:

SQL> alter table muse2 move;
alter table muse2 move
 *
 ERROR at line 1:
 ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

On the other hand, if the table MOVE command proceeds:

SQL> alter table muse2 move;

Table altered.

It in turn locks other transactions out during the duration and leaves all indexes in an UNUSABLE state:

SQL> insert into muse2 values (3000001, 'OPEN', 'ZIGGY STARDUST');
insert into muse2 values (3000001, 'OPEN', 'ZIGGY STARDUST')
 *
 ERROR at line 1:
 ORA-01502: index 'BOWIE.MUSE2_ID_PK' or partition of such index is in unusable state

SQL> select index_name, status from dba_indexes
     where table_name='MUSE2';

 INDEX_NAME      STATUS
 --------------- --------
 MUSE2_ID_PK     UNUSABLE
 MUSE2_STATUS_I  UNUSABLE

If we now look at a similar Partitioned Table example:

SQL> create table muse (id number, status varchar2(6), name varchar2(30))
2 partition by range (id)
3 (partition p1 values less than (1000001),
4 partition p2 values less than (2000001),
5 partition p3 values less than (maxvalue));

Table created.

SQL> insert into muse select rownum, 'CLOSED', 'DAVID BOWIE' from dual connect by level <= 3000000;

3000000 rows created.

SQL> commit;

Commit complete.

SQL> create index muse_id_pk on muse(id);

Index created.

SQL> alter table muse add constraint muse_id_pk primary key(id);

Table altered.

SQL> create index muse_status_i on muse(status) local;

Index created.

Similar locking and indexing issues occur if we try and reorganise a partition, even if we UPDATE INDEXES on the fly. For example, if we have an active transaction in one session:

SQL> insert into muse values (3000001, 'OPEN', 'ZIGGY STARDUST');

1 row created.

While in another session:

SQL> alter table muse move partition p3 tablespace users update indexes;
alter table muse move partition p3 tablespace users update indexes
 *
 ERROR at line 1:
 ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired

We get the same old resource busy error. If we tried things the other way around, so in one session we first attempt to move a table partition:

SQL> alter table muse move partition p3 tablespace users update indexes;
alter table muse move partition p3 tablespace users update indexes
 *
 ERROR at line 1:
 ORA-14327: Some index [sub]partitions could not be rebuilt

We can eventually get the above error if in another session we then attempt to insert a new row into this partition:

SQL> insert into muse values (3000002, 'OPEN', 'ZIGGY STARDUST');

1 row created.

which in turn hangs for the period of time until the above error is generated.

The associated local index is now not a happy chappy:

SQL> select index_name, partition_name, status from dba_ind_partitions
where index_name='MUSE_STATUS_I'
union
select index_name, null, status from dba_indexes where index_name='MUSE_ID_PK';

INDEX_NAME      PARTITION_NAME  STATUS
--------------- --------------- --------
MUSE_ID_PK                      VALID
MUSE_STATUS_I   P1              USABLE
MUSE_STATUS_I   P2              USABLE
MUSE_STATUS_I   P3              UNUSABLE

So Moving tables and table partitions around can all get a bit messy, especially if high availability is required.

The DBMS_REDEFINITION package is designed specifically to enable the online redefinition of objects, but it has its own issues and is not as clean and simplistic as a simple MOVE operation when we just want to perform a table re-org.

So what was has changed in Oracle Database 12c ?

We’ll see in the next post although the title here does rather give it away 🙂

Oracle11g: Zero Sized Unusable Indexes Part II (Nathan Adler) February 27, 2011

Posted by Richard Foote in 11g, 11g New features, Oracle Indexes, Unusable Indexes.
12 comments

In my previous post, I discussed how Oracle from 11g R2 onwards will automatically drop the segment and associated storage from unusable index objects. Mohamend Houri asked in the comments section the excellent question of just how useful this feature will be in real life cases when typically indexes are not left in an unusuable state for a long time, perhaps only when performing large data loads when such indexes would ordinarily be rebuilt anyways.

Thought the question was worth a seperate blog entry to provide a worthy answer.

The first point I would make is that we need to think a little outside the box and consider how such change in behaviour can open up new possibilities and flexibilities in how we index our tables.

For example, previously a Local Partitioned Index must have the same number of index partitions as the parent table. But what if an index is only useful for the “current” partition, where accessing newish data makes sense via an index. However, historical data in “older” partitions might only be accessed in batch processes via full partition scans. Why have a local index for older partitions when such indexes are never used. Previously, we had no choice, it was a case of if one or some of the partitions needed an index, then all the partitions needed to be indexed. If we made such unnecessary partitioned indexes unusable, we still needed to allocate storage for the index segment. Now, we can make any unnecessary index partition unusable and no storage at all is allocated to such index partitions.

Taking this a step further, we now have a really nice method of potentially indexing only portions of a table that need indexing, values which don’t have any benefit of being indexed (perhaps because the values are too numerous to ever be accessed efficiently via an index) no longer need to be indexed at all.

Here’s a classic example. Following is a table with a flag  in which the vast number of rows in the data have been “processed”. However, we have a few rows, those current rows which are of interest to us, which have not yet been processed (they may have a status of another value). We need an index in order to find the few rows which have not yet been processed but the index needs to also include all the values which are not of interest and have been processed.

 
SQL> create table bowie_stuff (id number, processed varchar2(10));
Table created.

SQL> insert into bowie_stuff select rownum, 'YES' from dual connect by level <= 1000000;
1000000 rows created.

SQL> commit;
Commit complete.

SQL> update bowie_stuff set processed = ‘NO’ where id in (999990, 999992, 999994, 999996, 999998);
5 rows updated.

SQL> commit;
Commit complete.

SQL> create index bowie_stuff_i on bowie_stuff(processed) pctfree 0;
Index created.

SQL> select index_name, leaf_blocks from dba_indexes where index_name = 'BOWIE_STUFF_I';

INDEX_NAME                     LEAF_BLOCKS
------------------------------ -----------
BOWIE_STUFF_I                         1877

SQL> select segment_name, blocks from dba_segments where segment_name = 'BOWIE_STUFF_I';

SEGMENT_NAME             BLOCKS
-------------------- ----------
BOWIE_STUFF_I              1920

 
 

Notice how the index is quite large (1,877 leaf blocks) as it needs to hold values for all 1M rows, even though only a relative handful of values within the index are ultimately of any use.

If we now gather stats (note we need to collect histograms as the column value distribution is very skewed) and run a query to select just the 5 rows that have not actually been processed:

 
SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE_STUFF', estimate_percent=>null, cascade=> true, method_opt=> 'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=>'BOWIE', tabname=>'BOWIE_STUFF', estimate_percent=>null, method_opt=> 'FOR COLUMNS PROCESSED SIZE 5');

PL/SQL procedure successfully completed.

SQL> select * from bowie_stuff where processed = 'NO';

Execution Plan
---------------------------------------------------------------------------------------------
| Id  | Operation                   | Name          | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |               |     5 |    40 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE_STUFF   |     5 |    40 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_STUFF_I |     5 |       |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        540  bytes sent via SQL*Net to client
        396  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

 
 

Note the CBO uses the index but it requires a total of 6 consistent reads.

Not bad but we can do somewhat better and perform less I/O , significantly reduce storage overheads and significantly reduce index maintenance operations, if only we didn’t store the unnecessary index values within the index.

One method could be to create a function-based index based on the decode function and only store non-null values that are of interest. However, this requires the application to likewise use the decode function in order to make use of the index.

Another method is to use a partitioned index and now with this new Oracle11g feature of zero sized unusable indexes, we don’t need any storage at all for the unwanted indexed values.

Let’s now re-create the index as a globally partitioned index, with one partition defined to contain all values of interest and another partition defined to contain the vast number of processed values. Initially, the index is created in an unusable state so no segments and no storage is allocated to any of the partitions:

 
SQL> drop index bowie_stuff_i;

Index dropped.

SQL> create index bowie_stuff_i on bowie_stuff(processed)
  2  global partition by range (processed)
  3  (partition not_processed_part values less than ('YES'),
  4   partition processed_part values less than (MAXVALUE))
  5  unusable;

Index created.

 
 

Next, we’re only going to rebuild the partition containing just the relatively few rows of interest. The partition containing the values that are not of interest is left in an unusable state and so continues to occupy no storage at all:

 
SQL> alter index bowie_stuff_i rebuild partition not_processed_part;

Index altered.

SQL> select index_name, partition_name, leaf_blocks from dba_ind_partitions where index_name = 'BOWIE_STUFF_I';

INDEX_NAME           PARTITION_NAME       LEAF_BLOCKS
-------------------- -------------------- -----------
BOWIE_STUFF_I        PROCESSED_PART                 0
BOWIE_STUFF_I        NOT_PROCESSED_PART             1

SQL> select segment_name, partition_name, blocks from dba_segments where segment_name = 'BOWIE_STUFF_I';

SEGMENT_NAME         PARTITION_NAME           BLOCKS
-------------------- -------------------- ----------
BOWIE_STUFF_I        NOT_PROCESSED_PART            8

 
 

Note how the index is now tiny (reduced from 1,877 leaf blocks to just 1) as it is only now just storing the index entries that are of interest. We have just saved ourselves heaps of storage as the other partition remains unusable and uses no storage at all.

If we now run our query again:

 
SQL> select * from bowie_stuff where processed = 'NO';

Execution Plan
--------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name          | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |               |     5 |    45 |     1   (0)| 00:00:01 |       |       |
|   1 |  PARTITION RANGE SINGLE      |               |     5 |    45 |     1   (0)| 00:00:01 |     1 |     1 |
|   2 |   TABLE ACCESS BY INDEX ROWID| BOWIE_STUFF   |     5 |    45 |     1   (0)| 00:00:01 |       |       |
|*  3 |    INDEX RANGE SCAN          | BOWIE_STUFF_I |     5 |       |     1   (0)| 00:00:01 |     1 |     1 |
--------------------------------------------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          4  consistent gets
          0  physical reads
          0  redo size
        542  bytes sent via SQL*Net to client
        395  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          5  rows processed

 
 

We notice that the execution plan is just using the tiny index partition and as a result we have reduced our consistent gets down from 6 to just 4. Additionally, we have not had to change our application at all to use the improved index, it was the exact same query as executed previously.

This method can of course be used in Oracle prior to 11g R2 but now with  zero sized unusable indexes, we do not have to allocate any storage at all to those indexes that we may wish to remain in an unusable state for extended or indefinite periods of time. So yes, zero sized unusable indexes can be extremely useful in many real life scenarios 🙂

Oracle11g: Zero Sized Unusable Indexes (Zeroes) February 25, 2011

Posted by Richard Foote in 11g, 11g New features, Oracle Indexes, Unusable Indexes.
2 comments

Following on from my previous discussion on “Create On Demand” segments, Oracle 11g R2 has also introduced storage saving initiatives in relation to useable indexes.  Starting with a simple Oracle 10g example, we create a table and associated index:
 

 
SQL> create table bowie as select rownum id, 'BOWIE' name from dual connect by level <= 1000000;
 
Table created.
 
SQL> create index bowie_id_i on bowie(id);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

 
  

If we now make the index unusable:
 

 
SQL> alter index bowie_id_i unusable;
 
Index altered.
 
SQL> select index_name, blevel, leaf_blocks, num_rows, status, dropped from dba_indexes where index_name = 'BOWIE_ID_I';
 
INDEX_NAME     BLEVEL LEAF_BLOCKS   NUM_ROWS STATUS   DRO
---------- ---------- ----------- ---------- -------- ---
BOWIE_ID_I          2        2226    1000000 UNUSABLE NO
 

SQL> select segment_name, bytes, blocks, extents from dba_segments where segment_name = 'BOWIE_ID_I';
 
SEGMENT_NAME      BYTES     BLOCKS    EXTENTS
------------ ---------- ---------- ----------
BOWIE_ID_I     18874368       2304         18

 
  

We notice that the storage associated with the segment remains, even though the data within the index is now totally useless to us now. The index definition is of course vital but why bother continuing to assign 18 extents of storage (in this example) to the index ?  Oracle 11g Release 2 has now by default changed this behaviour. 
 
Using the same demo as before but running Oracle11g R2:
 

 
SQL> create table bowie as select rownum id, 'BOWIE' name from dual connect by level <= 1000000;
 
Table created.
 
SQL> create index bowie_id_i on bowie(id);
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', cascade=> true, estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.
 
SQL> alter index bowie_id_i unusable;
 
Index altered.
 
SQL> select index_name, blevel, leaf_blocks, num_rows, status, dropped from dba_indexes where index_name = 'BOWIE_ID_I';
 
INDEX_NAME     BLEVEL LEAF_BLOCKS   NUM_ROWS STATUS   DRO
---------- ---------- ----------- ---------- -------- ---
BOWIE_ID_I          2        2226    1000000 UNUSABLE NO
 
SQL> select segment_name, bytes, blocks, extents from dba_segments where segment_name = 'BOWIE_ID_I';
 
no rows selected

 
  

We notice that the storage associated with the object is all gone. The index object remains but the underling segment and storage have been automatically dropped.

If we now look at a partitioning example, where we create 3 types of indexes:
 

 
SQL> CREATE TABLE big_album_sales(id number, album_id number, country_id number,
          release_date date, total_sales number)  PARTITION BY RANGE (release_date)
(PARTITION ALBUMS_2007 VALUES LESS THAN (TO_DATE('01-JAN-2008', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2008 VALUES LESS THAN (TO_DATE('01-JAN-2009', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2009 VALUES LESS THAN (TO_DATE('01-JAN-2010', 'DD-MON-YYYY')),
 PARTITION ALBUMS_2010 VALUES LESS THAN (MAXVALUE));
 
Table created.
  

SQL> INSERT INTO big_album_sales SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2000), ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 1000000;
 
1000000 rows created.
 
SQL> commit;
 
Commit complete.

 
  

We first create a Non-Partitioned Index:
 

 
SQL> CREATE INDEX big_album_tot_sales_i ON big_album_sales(total_sales);
 
Index created.

 
  

Next a Global Partitioned Index:
 

 
SQL> CREATE INDEX big_album_country_id_i  ON big_album_sales(country_id)
     GLOBAL PARTITION BY RANGE (country_id)
     (PARTITION TS1 VALUES LESS THAN (26),
      PARTITION TS2 VALUES LESS THAN (51),
      PARTITION TS3 VALUES LESS THAN (76),
      PARTITION TS4 VALUES LESS THAN (MAXVALUE));
 
Index created.

  

Finally,  a Local Partitioned index:
  

 
SQL> CREATE INDEX big_album_album_id_i ON big_album_sales(album_id) local;
 
Index created.
 
SQL> exec dbms_stats.gather_table_stats(ownname=> 'BOWIE', tabname=> 'BIG_ALBUM_SALES', estimate_percent=> null, method_opt=> 'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

 
  

If we now split the last table partition, this will effectively make the:
 
1) Non-Partitioned Unusable
2) All partitions of the Global Partitioned index unusable
3) Just the last 2 partitions of the Local Partitioned Index unusable
  

 
SQL> ALTER TABLE big_album_sales SPLIT PARTITION ALBUMS_2010
     AT (TO_DATE('01-JAN-2011', 'DD-MON-YYYY'))
     INTO (PARTITION ALBUMS_2010, PARTITION ALBUMS_2011);
 
Table altered.
 
SQL> select index_name, status from dba_indexes where table_name = 'BIG_ALBUM_SALES';
 
INDEX_NAME               STATUS
------------------------ --------
BIG_ALBUM_TOT_SALES_I    UNUSABLE
BIG_ALBUM_COUNTRY_ID_I   N/A
BIG_ALBUM_ALBUM_ID_I     N/A
 
SQL> select index_name, partition_name, status, leaf_blocks from dba_ind_partitions where index_name like 'BIG_ALBUM_%';
 
INDEX_NAME              PARTITION_NAME STATUS   LEAF_BLOCKS
----------------------- -------------- -------- -----------
BIG_ALBUM_ALBUM_ID_I    ALBUMS_2007    USABLE           807
BIG_ALBUM_ALBUM_ID_I    ALBUMS_2008    USABLE           381
BIG_ALBUM_ALBUM_ID_I    ALBUMS_2009    USABLE           383
BIG_ALBUM_ALBUM_ID_I    ALBUMS_2010    UNUSABLE
BIG_ALBUM_ALBUM_ID_I    ALBUMS_2011    UNUSABLE
BIG_ALBUM_COUNTRY_ID_I  TS1            UNUSABLE         629
BIG_ALBUM_COUNTRY_ID_I  TS2            UNUSABLE         629
BIG_ALBUM_COUNTRY_ID_I  TS3            UNUSABLE         629
BIG_ALBUM_COUNTRY_ID_I  TS4            UNUSABLE         629
 
SQL> select segment_name, partition_name, bytes, blocks from dba_segments where segment_name like 'BIG_ALBUM_%' and segment_type like 'INDEX%';
 
SEGMENT_NAME          PARTITION_NAME    BYTES BLOCKS
--------------------- -------------- -------- ------
BIG_ALBUM_ALBUM_ID_I  ALBUMS_2007     7340032    896
BIG_ALBUM_ALBUM_ID_I  ALBUMS_2008     3145728    384
BIG_ALBUM_ALBUM_ID_I  ALBUMS_2009     4194304    512
BIG_ALBUM_TOT_SALES_I                23068672   2816

 
  

We notice that all segments associated with the Global Partitioned index which are now unusable have been dropped. As have both unusable partitions from the Local Partitioned Index. However, the segment and storage associated with the unusable Non-Partitioned index still remains. Perhaps a missing feature for another time …
 
It’s a nice little touch that the unusable and somewhat useless index segments now get automatically cleaned out in Oracle11g R2, although they did previously act as “placeholders” in that nothing else within the tablespace could come along and use the occupied storage.