jump to navigation

Unique Indexes Force Hints To Be “Ignored” Part II (One Of The Few) February 19, 2019

Posted by Richard Foote in CBO, Hash Join, Hints, Oracle Indexes, Transitive Closure, Unique Indexes.
5 comments

Final Cut

In Part I, I showed a demo of how the introduction of a Unique Index appears to force a hint to be “ignored”. This is a classic case of what difference a Unique Index can make in the CBO deliberations.

So what’s going on here?

When I run the first, un-hinted query:

SQL> select * from bowie1, bowie2
where bowie1.id=bowie2.code and bowie1.id=1;
we notice something a little odd in the Predicate Information section of the execution plan:
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("BOWIE1"."ID"=1)
6 - access("BOWIE2"."CODE"=1)
Where the hell is the join condition, it’s not listed? Additionally, where does the BOWIE2.CODE=1 condition come from, it’s not a predicate within the above SQL statement?
The answer is “Transitive Closure“, whereby the CBO can automatically infer that BOWIE2.CODE must equal 1, if BOWIE2.CODE=BOWIE1.ID and BOWIE1.ID=1. This is something that the CBO master Jonathan Lewis has blogged about a number of times, including this post on Cartesian Merge Join.
Because the CBO is picking up the fact (based on the column statistics) that BOWIE1.ID is basically a unique column, it can effectively drop the join condition and simply use a  Merge Join Cartesian to get all rows from BOWIE1 that match the BOWIE1.ID=1 predicate (just 1 row), with all those rows from BOWIE2 that match the BOWIE2.CODE=1 predicate (estimated 50 rows).
-------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name          | Rows | Bytes | Cost (%CPU) |   Time     |
-------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |               |   50 |  1150 |       5 (0) |   00:00:01 |
|  1 | MERGE JOIN CARTESIAN                |               |   50 |  1150 |       5 (0) |   00:00:01 |
|  2 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE1        |    1 |    16 |       2 (0) |   00:00:01 |
|* 3 | INDEX RANGE SCAN                    | BOWIE1_ID_I   |    1 |       |       1 (0) |   00:00:01 |
|  4 | BUFFER SORT                         |               |   50 |   350 |       3 (0) |   00:00:01 |
|  5 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE2        |   50 |   350 |       3 (0) |   00:00:01 |
|* 6 | INDEX RANGE SCAN                    | BOWIE2_CODE_I |   50 |       |       1 (0) |   00:00:01 |
-------------------------------------------------------------------------------------------------------
The CBO is picking up the fact that a Merge Join Cartesian is not as bad as it might sound if only 1 row is likely to be returned from one side of this process that can be combined with just the rows of interest from the other table, with no unnecessary row throwaways.
However, the CBO might not get this right, the efficiency of this depends somewhat on there really only being the one row returned from the BOWIE1.ID=1 condition. If there were many more than one row possible, then this can become relatively inefficient.
The second hinted query is therefore directing Oracle to perform a Hash Join, as I potentially know my data better than Oracle and think it a better option in this case:
SQL> select /*+ use_hash(bowie2) */ * from bowie1, bowie2
where bowie1.id=bowie2.code and bowie1.id=1;
The hint is directing Oracle to access the BOWIE2 table to be the probe table in a Hash Join operation.
We notice that the join predicate is now listed in the Predicate Information of the execution plan:
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("BOWIE1"."ID"="BOWIE2"."CODE")
3 - access("BOWIE1"."ID"=1)
5 - access("BOWIE2"."CODE"=1)
The key point being that there could be a many to many join operation that needs to be performed and Oracle can’t perform a Hash Join unless there is a join predicate listed.
As such, the CBO uses a Hash Join as requested by the hint:
------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name          | Rows | Bytes | Cost (%CPU) |  Time     |
------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |               |   50 |  1150 |       5 (0) |  00:00:01 |
|* 1 | HASH JOIN                           |               |   50 |  1150 |       5 (0) |  00:00:01 |
|  2 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE1        |    1 |    16 |       2 (0) |  00:00:01 |
|* 3 | INDEX RANGE SCAN                    | BOWIE1_ID_I   |    1 |       |       1 (0) |  00:00:01 |
|  4 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE2        |   50 |   350 |       3 (0) |  00:00:01 |
|* 5 | INDEX RANGE SCAN                    | BOWIE2_CODE_I |   50 |       |       1 (0) |  00:00:01 |
------------------------------------------------------------------------------------------------------
Note currently, there is no Unique constraint on the BOWIE1.ID column and the index on BOWIE1.ID is non-unique. Although the column statistics suggests the ID is basically unique (the number of distinct values matches the number of rows in the table), there is no certainly that this is correct. The statistics might not be accurate and there could be a bunch of duplicate IDs that would result in a many to many join operation whereby a Hash Join might be preferable.
But by replacing the non-unique index on BOWIE1.ID with a unique index, the CBO now knows there is indeed just the one viable row from the BOWIE1 side of the join, with the BOWIE1.ID=1 predicate. As such, the CBO goes back to using Transitive Closure to again effectively eliminate the join predicate.

 

Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("BOWIE1"."ID"=1)
5 - access("BOWIE2"."CODE"=1)
The CBO can now safely get just the required row from BOWIE1 table via the BOWIE1.ID=1 predicate and the required data from BOWIE2 directly via the BOWIE2.CODE=1 predicate. The CBO makes this decision before considering the most appropriate join strategy, which can now not possibly be the Hash Join, as the Hash Join is only possible with a join predicate in place.
------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name          | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |               |   50 |  1150 |       5 (0) | 00:00:01 |
|  1 | NESTED LOOPS                        |               |   50 |  1150 |       5 (0) | 00:00:01 |
|  2 | TABLE ACCESS BY INDEX ROWID         | BOWIE1        |    1 |    16 |       2 (0) | 00:00:01 |
|* 3 | INDEX UNIQUE SCAN                   | BOWIE1_ID_I   |    1 |       |       1 (0) | 00:00:01 |
|  4 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE2        |   50 |   350 |       3 (0) | 00:00:01 |
|* 5 | INDEX RANGE SCAN                    | BOWIE2_CODE_I |   50 |       |       1 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------
As such, the USE_HASH hint is now “ignored”, because it’s simply now not a viable option for the CBO. A Nested Loop is now performed instead, in which the row for the Outer Table (BOWIE1) can be retrieved and all “corresponding” rows for the Inner Table (BOWIE2) can be likewise accessed via just the BOWIE2.CODE=1 predicate.
A Nested Loop is the join type you can have when you’re not necessarily performing a join…

“Oracle Indexing Internals and Best Practices” Seminar – Berlin 8-9 May: DOAG Website February 8, 2019

Posted by Richard Foote in Oracle Indexes.
add a comment

DOAG logo

Just a short note to say that DOAG have now a registration page for my upcoming “Oracle Indexing Internals and Best Practices” seminar running in Berlin, Germany on 8-9 May 2019.

For all the details regarding this acclaimed educational experience and how to book your place, please visit:

https://www.doag.org/de/eventdetails?tx_doagevents_single[id]=577320

Please mention you heard this seminar from me when booking and just note that places are strictly limited for these events, so please book early to avoid disappointment.

This is likely to be the only time I run this seminar in mainland Europe this year, so don’t miss this unique opportunity to learn in person all that’s worth learning in to how to use and maintain indexes judiciously to improve Oracle database/application performance.

If you have any questions, don’t hesitate to contact me at richard@richardfooteconsulting.com.

Unique Indexes Force Hints To Be “Ignored” Part I (What’s Really Happening) February 5, 2019

Posted by Richard Foote in CBO, Hash Join, Hints, Non-Unique Indexes, Oracle Indexes, Unique Indexes.
2 comments

hours album

As I was compiling my new “Oracle Diagnostics and Performance Tuning” seminar, I realised there were quite a number of indexing performance issues I haven’t discussed here previously.

The following is a classic example of what difference a Unique Index can have over a Non-Unique index, while also covering the classic myth that Oracle sometimes chooses to “ignore” hints that might be present in your SQL.

To set the scene, I’m going to create two simple little tables, but importantly initially create only non-unique indexes for columns of interest (Note: I’ve had to remove the “<” in the “<=” predicate when populating the table to avoid formatting issues):

SQL> create table bowie1 as
select rownum id, 'David Bowie' name from dual connect by level = 1000;

Table created.

SQL> create table bowie2 as
select rownum id, mod(rownum,20)+1 code from dual connect by level = 1000;

Table created.
SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=>'BOWIE1', method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=>'BOWIE2', method_opt=>'FOR ALL COLUMNS SIZE 1');

PL/SQL procedure successfully completed.

SQL> create index bowie1_id_i on bowie1(id);

Index created.

SQL> create index bowie2_id_i on bowie2(id);

Index created.

SQL> create index bowie2_code_i on bowie2(code);

Index created.

I’m now going to run the following query which does a simple join between the two tables and filters on the ID column from the BOWIE1 table:

 

SQL> select * from bowie1, bowie2
where bowie1.id=bowie2.code and bowie1.id=1;

50 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4266778954

-------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name          | Rows | Bytes | Cost (%CPU) |   Time     |
-------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |               |   50 |  1150 |       5 (0) |   00:00:01 |
|  1 | MERGE JOIN CARTESIAN                |               |   50 |  1150 |       5 (0) |   00:00:01 |
|  2 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE1        |    1 |    16 |       2 (0) |   00:00:01 |
|* 3 | INDEX RANGE SCAN                    | BOWIE1_ID_I   |    1 |       |       1 (0) |   00:00:01 |
|  4 | BUFFER SORT                         |               |   50 |   350 |       3 (0) |   00:00:01 |
|  5 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE2        |   50 |   350 |       3 (0) |   00:00:01 |
|* 6 | INDEX RANGE SCAN                    | BOWIE2_CODE_I |   50 |       |       1 (0) |   00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("BOWIE1"."ID"=1)
6 - access("BOWIE2"."CODE"=1)

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
   8 consistent gets
   0 physical reads
   0 redo size
1815 bytes sent via SQL*Net to client
 641 bytes received via SQL*Net from client
   5 SQL*Net roundtrips to/from client
   1 sorts (memory)
   0 sorts (disk)
  50 rows processed

 

The query uses a MERGE JOIN which I (incorrectly) think is a concern and decide that a HASH JOIN should be a better option. So I now put in a basic USE_HASH hint:

SQL> select /*+ use_hash(bowie2) */ * from bowie1, bowie2
where bowie1.id=bowie2.code and bowie1.id=1;

50 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1413846643

------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name          | Rows | Bytes | Cost (%CPU) |  Time     |
------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |               |   50 |  1150 |       5 (0) |  00:00:01 |
|* 1 | HASH JOIN                           |               |   50 |  1150 |       5 (0) |  00:00:01 |
|  2 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE1        |    1 |    16 |       2 (0) |  00:00:01 |
|* 3 | INDEX RANGE SCAN                    | BOWIE1_ID_I   |    1 |       |       1 (0) |  00:00:01 |
|  4 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE2        |   50 |   350 |       3 (0) |  00:00:01 |
|* 5 | INDEX RANGE SCAN                    | BOWIE2_CODE_I |   50 |       |       1 (0) |  00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - access("BOWIE1"."ID"="BOWIE2"."CODE")
3 - access("BOWIE1"."ID"=1)
5 - access("BOWIE2"."CODE"=1)

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
  15 consistent gets
   0 physical reads
   0 redo size
1815 bytes sent via SQL*Net to client
 641 bytes received via SQL*Net from client
   5 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
  50 rows processed

And the hint has worked as I had hoped.

I then decide that perhaps a Unique Index on the ID column might be a good idea (perhaps because I read up on all the advantages on Unique Indexes in this blog). So I drop and recreate the index as a Unique Index:

SQL> drop index bowie1_id_i;

Index dropped.

SQL> create unique index bowie1_id_i on bowie1(id);

Index created.

I now re-run my hinted query to again use the Hash Join:

SQL> select /*+ use_hash(bowie2) */ * from bowie1, bowie2
where bowie1.id=bowie2.code and bowie1.id=1;

50 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 4272245076

------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name          | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                    |               |   50 |  1150 |       5 (0) | 00:00:01 |
|  1 | NESTED LOOPS                        |               |   50 |  1150 |       5 (0) | 00:00:01 |
|  2 | TABLE ACCESS BY INDEX ROWID         | BOWIE1        |    1 |    16 |       2 (0) | 00:00:01 |
|* 3 | INDEX UNIQUE SCAN                   | BOWIE1_ID_I   |    1 |       |       1 (0) | 00:00:01 |
|  4 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE2        |   50 |   350 |       3 (0) | 00:00:01 |
|* 5 | INDEX RANGE SCAN                    | BOWIE2_CODE_I |   50 |       |       1 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("BOWIE1"."ID"=1)
5 - access("BOWIE2"."CODE"=1)

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
  15 consistent gets
   0 physical reads
   0 redo size
1815 bytes sent via SQL*Net to client
 641 bytes received via SQL*Net from client
   5 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
  50 rows processed

And we notice the hint is now being “ignored”. The hint isn’t really ignored, it’s just no longer relevant to how the CBO has now constructed the query and associated plan with the Unique Index now making a key difference (no pun intended).

I’ll discuss in Part II why the Unique Index has made such a difference and why the hint is no longer viable.

Of course, to learn all this and a lot lot more, you can always attend my new Oracle Diagnostics and Performance Tuning” seminar one day 🙂