jump to navigation

Why Is My Index Not Being Used Quiz (Brain Damage) September 28, 2011

Posted by Richard Foote in CBO, Oracle Indexes, Quiz.
trackback

This one is a little different as it comes in the form of a demo (and about 1 minute to read) so you have to work a little 🙂
 
I create table, index and sequence:
 

SQL> create table bowie (id number, name varchar2(30)) tablespace user_data;
 
Table created.
 
SQL> create index bowie_id_i on bowie(id);
 
Index created.
 
SQL> create sequence bowie_id order;
 
Sequence created.

 
I then create a little procedure that simply adds 100,000 rows to the table:
 

SQL> create or replace procedure add_bowie_rows as
  2  begin
  3  for i in 1..100000 loop
  4  insert into bowie values (bowie_id.nextval, 'DAVID BOWIE');
  5  commit;
  6  end loop;
  7  end;
  8  /
 
Procedure created.

 
I then have 3 different sessions that run the procedure simultaneously (eg. exec add_bowie_rows).
 
I collect 100% accurate stats:
 

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', estimate_percent=>null, cascade=>true, method_opt=>'FOR ALL COLUMNS SIZE 1');
 
PL/SQL procedure successfully completed.

 
I run a query that selects about 10,000 rows (out of the 300,000 the table now has):
 

SQL> select * from bowie where id between 42000 and 52000;
 
10001 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 131801496
 
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)|Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            | 10002 |   166K|   125   (1)|00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE      | 10002 |   166K|   125   (1)|00:00:02 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_ID_I | 10002 |       |    51   (0)|00:00:01 |
------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID">=42000 AND "ID"<=52000)
 

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        129  consistent gets
         44  physical reads
          0  redo size
     100270  bytes sent via SQL*Net to client
        264  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10001  rows processed

 
 
Oracle uses the index and all is well.
 
OK, in exactly the same database, I repeat the demo again with the same 3 sessions populating the data in exactly the same way using exactly the same procedure with 100% accurate statistics, but there’s just one tiny little difference in the setup script:
 

SQL> drop table bowie;
 
Table dropped.
 
SQL> drop sequence bowie_id;
 
Sequence dropped.
 
SQL> create table bowie (id number, name varchar2(30)) tablespace user_data1;
 
Table created.
 
SQL> create index bowie_id_i on bowie(id);
 
Index created.
 
SQL> create sequence bowie_id order;
 
Sequence created.

 
 
I next populate the table in 3 different sessions concurrently and collect stats exactly as before…

However, now when I run my query:
 
 

SQL> select * from bowie where id between 42000 and 52000;
 
10001 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 4157583811
 
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       | 10002 |   166K|   285   (4)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| BOWIE | 10002 |   166K|   285   (4)| 00:00:04 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("ID"<=52000 AND "ID">=42000)
 

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        998  consistent gets
        693  physical reads
          0  redo size
     100270  bytes sent via SQL*Net to client
        264  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
      10001  rows processed

 
It performs a FTS ??? Note, the cardinality estimate of 1002 is practically spot on and identical to previously when the index was used by the CBO.

Instead of selecting 10,000 rows, if I now select say just 500 rows:
 

SQL> select * from bowie where id between 42000 and 42499;
 
500 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 4157583811
 
---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |   501 |  8517 |   284   (4)| 00:00:04 |
|*  1 |  TABLE ACCESS FULL| BOWIE |   501 |  8517 |   284   (4)| 00:00:04 |
---------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - filter("ID"<=42499 AND "ID">=42000)
 

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        997  consistent gets
          0  physical reads
          0  redo size
       5263  bytes sent via SQL*Net to client
        248  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        500  rows processed

 
It still performs a FTS !!
 
Only when I get down to a really low number of rows, for example 100 rows:
 

SQL> select * from bowie where id between 42000 and 42099;
 
100 rows selected.
 

Execution Plan
----------------------------------------------------------
Plan hash value: 131801496
 
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)|Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |   101 |  1717 |    95   (0)|00:00:02 |
|   1 |  TABLE ACCESS BY INDEX ROWID| BOWIE      |   101 |  1717 |    95   (0)|00:00:02 |
|*  2 |   INDEX RANGE SCAN          | BOWIE_ID_I |   101 |       |     3   (0)|00:00:01 |
------------------------------------------------------------------------------------------
 

Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("ID">=42000 AND "ID"<=42099)
 

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
        103  consistent gets
          0  physical reads
          0  redo size
       1266  bytes sent via SQL*Net to client
        248  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
        100  rows processed

 
Will Oracle use the index.
  
QUESTION: Why, what is the tiny little difference that has made such a huge difference in behaviour ???
 
Now there are a couple of possible answers (at least) that come to mind …

Enjoy !!

Comments»

1. Tony Sleight - September 28, 2011

Has tablespace USER_DATA been created with freelist space management and USER_DATA1 with ASSM by any chance?

This example has effects very similar to a demonstration of how freelist management can badly affect clustering factor of indexes. Jonathan Lewis has a similar example in his ‘Cost Based Oracle Fundamentals’ book.

Like

Tony Sleight - September 28, 2011

forgot to add why the different freelist strategies would cause an issue with clustering factor.

Basically with traditional free list management the three sessions inserting rows are more likely to be inserting the sequence numbers in roughly the same data blocks and hence getting a clustering factor that favours the index.

With ASSM, the sessions may create records with sequence numbers in vastly different data blocks and therefore increase clustering factor to make the index less favourable.

Like

2. Flado - September 28, 2011

Tablespaces user_data and user_data1 have hugely different block sizes? The index stays in the user’s default tablespace (which might be user_data, user_data1, or something completely different), but the table uses different numbers of blocks, hence the index has different clustering factor AND the cost of a full scan of the table differs significantly.
It might be that they have the same block sizes but different default storage parameters (pctfree comes to mind) which would affect the stats and the relative costs in a similar way.
I’m not sure what the significance of the three concurrent inserting sessions is – they should crash the index (50-50 splits, index ITL explosion if using the default initrans of 2) the same way in both cases.

Can’t wait 🙂
Flado

Like

3. Mladen Gogala - September 29, 2011

First of all, I wasn’t able to produce the desired effect on my 11.2.0.3 database. Even the table populated from 3 concurrent sessions was not causing a full table scan. Without being able to reproduce, I can only speculate.

Each session is operating on its own extent, clustering factor will be much larger if you populate the table concurrently. I executed your example and I got clustering factor of 303 for the serial load and 398 for 3 concurrent sessions, which is almost 33% increase. Depending on the DB version and the exact text of the concurrent loading scripts, your clustering factor may be significantly increased, which would make the full table scan faster for the large number of rows.

Like

4. vishaldesai - September 29, 2011

tablespace user_data – segment space management manual. If freelist 3 is added to create table here it will probably do full table scan instead of index range scan.

tablespace user_data1 – segment space management auto. clustering_factor of index will be much higher here compared to manual segment space management

Like

5. Martin Preiss - September 29, 2011

Richard,

testing a little bit with different tablespaces I got the following results:

select index_name
     , leaf_blocks
     , clustering_factor 
  from user_indexes
 where index_name = 'BOWIE_ID_I';

-- 8K MSSM
INDEX_NAME                     LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ----------- -----------------
BOWIE_ID_I                            1565              1234

-- 8K ASSM
INDEX_NAME                     LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ----------- -----------------
BOWIE_ID_I                            1623            241479

-- 2K MSSM
INDEX_NAME                     LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ----------- -----------------
BOWIE_ID_I                            1680              4495

-- 2K ASSM
INDEX_NAME                     LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ----------- -----------------
BOWIE_ID_I                            1736            259593

-- 16K MSSM
INDEX_NAME                     LEAF_BLOCKS CLUSTERING_FACTOR
------------------------------ ----------- -----------------
BOWIE_ID_I                            1760               657

Obviously the difference is ASSM. Then I remembered something I had read somewhere: “Generally, rows with monotonically increasing column values are physically inserted in the order of the monotonically increasing columns. This may not be the case however with tables in ASSM tablespaces or tables with multiple freelists or freelist groups as concurrent inserts will be directed to differing blocks. In these cases we may actually have data that is quite well clustered but may have quite poor CF values due to the manner in which the CF is calculated.” So I think Tony already gave a correct explanation.

Like

6. zhwsh - September 29, 2011

I guess
USER_DATA=>MSSM
USER_DATA1=>ASSM

Like

7. Donatello Settembrino - September 29, 2011

A range scan, which turns into a table scan …
First, it makes me think that’s a problem with
Clustering Factor …
so I think(as others have already responded) the difference is called “ASSM” (or “MSSM”) associated with multiple concurrent inserts.In particular, the erroneous calculation of the clustering factor performed by DBMS_STATS is due to the parameter History
the function

sys_op_countchg(substrb(t.rowid,1,15),1)

History, the last parameter of the function sys_op_countchg,
represented by the number 1, that is the number of blocks that are remembered to determine if the block has changed.
In the case of use of ASSM, to calculate the correct CF
you should replace the parameter “History” with the number
concurrent processes that populate the table or the number
of freelist if the number of processes exceeds that of the freelist.
But this suspicion, if you remember, is what I had in mind for the
Alberto problem

Big Tables, Sorts and Indexes Solution (Right On Mother)

Regards,

Donatello

Like

8. Why Is My Index Not Being Used Quiz (Brain Damage) « Ukrainian Oracle User Group - September 30, 2011

[…] Why Is My Index Not Being Used Quiz (Brain Damage) September 28, 2011 […]

Like


Leave a reply to vishaldesai Cancel reply