
Storing Date Values As Characters Part II (A Better Future) May 30, 2016

Posted by Richard Foote in 12c, CBO, Function Based Indexes, Oracle Indexes, Storing Dates as Characters.
1 comment so far

In the previous post, I discussed how storing date values within a character data type is a really really bad idea and illustrated how the CBO can easily get its costings totally wrong as a result. A function-based date index helped the CBO get the correct costings and protect the integrity of the date data.

During the demo, I re-collected statistics on the table, as the hidden virtual column associated with the newly created function-based index didn't initially have statistics.

Before re-collecting statistics:

SQL> select column_name, num_distinct, density, histogram, hidden_column, virtual_column from dba_tab_cols where table_name='BOWIE';
 
COLUMN_NAME  NUM_DISTINCT DENSITY    HISTOGRAM       HID VIR
------------ ------------ ---------- --------------- --- ---
SYS_NC00004$                         NONE            YES YES
BOWIE_DATE          10000      .0001 NONE            NO  NO
CODE                 1000       .001 NONE            NO  NO
ID                1000000    .000001 NONE            NO  NO

And afterwards:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed. 

SQL> select column_name, num_distinct, density, histogram, hidden_column, virtual_column 
from dba_tab_cols where table_name='BOWIE'; 

COLUMN_NAME  NUM_DISTINCT DENSITY    HISTOGRAM       HID VIR 
------------ ------------ ---------- --------------- --- --- 
SYS_NC00004$        10000      .0001 NONE            YES YES 
BOWIE_DATE          10000      .0001 HYBRID          NO  NO 
CODE                 1000       .001 NONE            NO  NO 
ID                1000000    .000001 NONE            NO  NO

 

We can see that the hidden virtual column now has statistics.

But we also notice another difference: the BOWIE_DATE column now has a histogram (of type HYBRID).

As discussed in the previous post, the issue here is that the date data within the character column covers only a very specific subset of all the potential character values that could reside within the column. The CBO was therefore getting the range scan selectivity hopelessly wrong.

Now that we’ve run a few queries featuring the BOWIE_DATE column in the predicates and as there’s effectively data skew within the column, the column becomes a candidate for a histogram with the default SIZE AUTO collection method.
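As a sanity check, here's a minimal sketch (assuming 11.2.0.2 or later) of confirming that this column usage has indeed been recorded, as it's this recorded usage that drives the SIZE AUTO decision:

SQL> select dbms_stats.report_col_usage(user, 'BOWIE') from dual;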

The histogram now provides the CBO with a much more accurate picture of the distribution of the data within the BOWIE_DATE column, namely that between discrete "date" values, only so many rows actually qualify.

As a result of the histogram, the CBO can now make much more accurate cardinality estimates.
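To eyeball the histogram itself, a hedged sketch using the standard dictionary view:

SQL> select endpoint_number, endpoint_actual_value from dba_tab_histograms where table_name='BOWIE' and column_name='BOWIE_DATE' order by endpoint_number;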

If we now re-run the query that actually returns 8300 rows, but for which the CBO previously estimated that only 100 rows would be returned:

SQL> select * from bowie where bowie_date between '2015 10 10' and '2015 12 31';
                                             
8300 rows selected.
                                                            
Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id | Operation         | Name  | Rows | Bytes | Cost (%CPU) | Time     |
---------------------------------------------------------------------------
| 0  | SELECT STATEMENT  |       | 4152 | 83040 |   1000 (12) | 00:00:01 |
|* 1 | TABLE ACCESS FULL | BOWIE | 4152 | 83040 |   1000 (12) | 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("BOWIE_DATE">='2015 10 10' AND "BOWIE_DATE"<='2015 12 31')
Statistics
----------------------------------------------------------
14 recursive calls
0 db block gets
4063 consistent gets
0 physical reads
0 redo size
282075 bytes sent via SQL*Net to client
6635 bytes received via SQL*Net from client
555 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
8300 rows processed

We see that at an estimated 4,152 rows, it’s a much better estimate. Not perfect, but maybe good enough to now get the more efficient Full Table Scan execution plan.

If we re-run the query that returned over 1/2 the table at some 570,000 rows but with the CBO previously estimating only 116 rows:

SQL> select * from bowie where bowie_date between '2000 10 10' and '2016 12 31';
  
570800 rows selected.
    
Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id | Operation         | Name  | Rows | Bytes | Cost (%CPU) | Time     |
---------------------------------------------------------------------------
| 0  | SELECT STATEMENT  |       | 572K |   10M |   1012 (13) | 00:00:01 |
|* 1 | TABLE ACCESS FULL | BOWIE | 572K |   10M |   1012 (13) | 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("BOWIE_DATE">='2000 10 10' AND "BOWIE_DATE"<='2016 12 31')
   
Statistics
----------------------------------------------------------
14 recursive calls
0 db block gets
41456 consistent gets
4 physical reads
0 redo size
19292352 bytes sent via SQL*Net to client
419135 bytes received via SQL*Net from client
38055 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
570800 rows processed

 

We see that at an estimate of 572K rows, it’s now got this just about right and again has made the right decision with the Full Table Scan execution plan.

Storing date data in character based columns is still a really really bad idea and limits the manner in which date data can be analysed, protected and accessed, but with appropriate histograms in place, at least the CBO has some chance of making a reasonable fist of things with some range based queries.

As a follow-up, I was asked if storing dates in NUMBER format is a better option than as a string. I’ll discuss that next.

Storing Date Values As Characters (What’s Really Happening) May 26, 2016

Posted by Richard Foote in 12c, Function Based Indexes, Oracle Indexes, Storing Dates as Characters.
1 comment so far

For something that’s generally considered an extremely bad idea, I’ve lost count of the number of times I’ve come across applications that insist on storing date values as characters within the database. We’ve all seen them …

I recently got called in to assist a customer who was having issues with a POC, in relation to the database not appearing to want to use the In-Memory Database option as expected. In various key scenarios, the CBO kept coming up with execution plans that used these indexes (which they were hesitant to drop), when simply using the Database In-Memory store made the queries run so much faster. So I was called in to find out what's really happening and it turned out that the main culprit was indeed queries against columns where dates were stored as characters within the database. In the process, we found another issue with some “invalid” date values. Go figure.

Interestingly, both issues could be addressed by creating a new index on the date column …

I’ve kinda replicated the scenario here with the following little test case. I’ll begin by creating a table with a varchar2 field (bowie_date) that actually stores a whole bunch of “dates”:

SQL> create table bowie (id number, code number, bowie_date varchar2(42));
Table created.

SQL> insert into bowie select rownum, mod(rownum,1000), to_char(sysdate-mod(rownum,10000), 'YYYY MM DD') from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

So the bowie_date column basically has 10000 different dates, evenly distributed with 100 occurrences of each date.
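As a quick hedged sanity check of that distribution:

SQL> select count(distinct bowie_date) dates, min(cnt) min_cnt, max(cnt) max_cnt from (select bowie_date, count(*) cnt from bowie group by bowie_date);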

I’ll now collect statistics on the table:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');
PL/SQL procedure successfully completed.

SQL> select column_name, num_distinct, density, histogram, hidden_column, virtual_column from dba_tab_cols where table_name='BOWIE';

COLUMN_NAME  NUM_DISTINCT DENSITY    HISTOGRAM       HID VIR
------------ ------------ ---------- --------------- --- ---
BOWIE_DATE          10000      .0001 NONE            NO  NO
CODE                 1000       .001 NONE            NO  NO
ID                1000000    .000001 NONE            NO  NO

 

So the bowie_date column indeed has 10000 distinct dates.

I’ll now create a standard b-tree index on this column:

SQL> create index bowie_date_i on bowie(bowie_date);
Index created.

So the data looks as follows, with bowie_date a varchar2 column that actually contains date data, with the following min/max range:

SQL> select * from bowie where rownum <11;

        ID       CODE BOWIE_DATE
---------- ---------- ----------
       916        916 2013 11 22
       917        917 2013 11 21
       918        918 2013 11 20
       919        919 2013 11 19
       920        920 2013 11 18
       921        921 2013 11 17
       922        922 2013 11 16
       923        923 2013 11 15
       924        924 2013 11 14
       925        925 2013 11 13

10 rows selected.

SQL> select min(bowie_date) min, max(bowie_date) max from bowie;

MIN        MAX
---------- ----------
1989 01 09 2016 05 26

If we run a query that uses an equality predicate as follows:

SQL> select * from bowie where bowie_date = '2015 06 13';
100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1525056162

----------------------------------------------------------------------------------------------------
| Id | Operation                             | Name         | Rows | Bytes | Cost (%CPU)|     Time |
----------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                      |              |  100 |  2000 |     103 (0)| 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED   | BOWIE        |  100 | 2000  |     103 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN                      | BOWIE_DATE_I |  100 |       |       3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("BOWIE_DATE"='2015 06 13')

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
110 consistent gets
16 physical reads
0 redo size
4383 bytes sent via SQL*Net to client
618 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100 rows processed 

The CBO gets things just about right. 100 rows are estimated and indeed 100 rows are retrieved. So we have confidence the CBO has made the right decision in using the index here as so few rows are actually retrieved.

However, if we run a range scan predicate such as the following:

SQL> select * from bowie where bowie_date between '2015 10 10' and '2015 12 31';

8300 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1525056162

----------------------------------------------------------------------------------------------------
| Id | Operation                          | Name         | Rows | Bytes | Cost (%CPU)|     Time |
----------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                   |              |  100 | 2000  |     104 (0)| 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE        |  100 | 2000  |     104 (0)| 00:00:01 |
|*2  | INDEX RANGE SCAN                   | BOWIE_DATE_I |  100 |       |       3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("BOWIE_DATE">='2015 10 10' AND "BOWIE_DATE"<='2015 12 31')

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
8881 consistent gets
38 physical reads
0 redo size
315219 bytes sent via SQL*Net to client
6635 bytes received via SQL*Net from client
555 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
8300 rows processed

The CBO has got the costings wrong here. It still estimates that only 100 rows are to be returned, when in actual fact 8300 rows come back.

If we select an even larger “date” range:

SQL> select * from bowie where bowie_date between '2000 10 10' and '2016 12 31';

570800 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1525056162

----------------------------------------------------------------------------------------------------
| Id | Operation                          | Name         | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT                   |              |  116 |  2320 |     120 (0)| 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID BATCHED| BOWIE        |  116 |  2320 |     120 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN                   | BOWIE_DATE_I |  116 |       |       3 (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("BOWIE_DATE">='2000 10 10' AND "BOWIE_DATE"<='2016 12 31')

Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
610491 consistent gets
1704 physical reads
0 redo size
21575496 bytes sent via SQL*Net to client
419135 bytes received via SQL*Net from client
38055 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
570800 rows processed

The CBO has got things seriously wrong here. We’re actually returning over 1/2 the table, some 570,800 rows but the CBO thinks only 116 rows will be returned. Why ?

The problem comes back to storing date values as characters. The CBO has absolutely no idea that these “characters” are actually meant to be dates and has no idea that the only valid ranges of values are date values.

With a possible range between “1989 01 09” and “2016 05 26“, any character combination of values (up to the column size of 42) could potentially exist in this column (the value ‘1zxgs$.jKN6tglhasgdlhlhd23bkbk?k’ for example).

So the required range between ‘2000 10 10’ and ‘2016 12 31’ actually represents a relatively narrow range of possible values within the range of all possible values (especially as the leading character differs between the min/max values).
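As a rough, hedged sketch of the arithmetic involved (the exact internals vary by version and are more subtle than this): for a closed range predicate such as BETWEEN, the CBO calculates something along the lines of

selectivity = (high_requested - low_requested) / (max_value - min_value) + 2 * density

where the character strings are first mapped onto very large numbers derived from their leading bytes. Because the leading byte of the column minimum (‘1989…’) differs from that of the maximum (‘2016…’), the denominator is enormous, the first term all but vanishes, and we’re left with a tiny selectivity regardless of how much real data the requested range actually covers.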

Hence why the CBO is estimating such a low number of rows to be returned and hence why the CBO is deciding to incorrectly use the index. “Hiding” the meaning and distribution of values from the CBO in this manner can be problematic to say the least.

Worse of course is the possibility of “invalid” dates being entered, as the database has no implicit way to police the integrity of the data:

SQL> insert into bowie values (1000001, 42, '2016 6 31');

1 row created.

SQL> rollback;

Rollback complete.

We know there’s no such date as 31st June but Oracle has no idea that this is logically invalid data. Or the value ‘lh;ghsgdsd7gdGLH96bb’ for that matter …
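And once such junk creeps in, even reading the data becomes hazardous. As a hedged sketch (assuming a row containing such a value existed), a query as innocent as:

SQL> select * from bowie where to_date(bowie_date, 'YYYY MM DD') > sysdate - 30;

would fail with a conversion error (ORA-01858 or similar) the moment the junk row was processed.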

Did I mention that storing dates in a character column is a really really bad idea …

Now there are a couple of ways to help address these issues if changing the column and its datatype is not possible due to the application. One way is to create a function-based index as follows on a date version of the column:

SQL> create index bowie_date_fn_i on bowie(to_date(bowie_date,'YYYY MM DD'));

Index created.

If we run the equality query but this time referencing the to_date function:

SQL> select * from bowie where to_date(bowie_date, 'YYYY MM DD') = '13-JUN-2015';

100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 960797537

-------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name            | Rows  | Bytes  | Cost (%CPU)|     Time |
-------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                 | 10000 |   273K |     893 (2)| 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE           | 10000 |   273K |     893 (2)| 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | BOWIE_DATE_FN_I |  4000 |        |       3 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(TO_DATE("BOWIE_DATE",'YYYY MM DD')=TO_DATE(' 2015-06-13 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
12 recursive calls
0 db block gets
113 consistent gets
16 physical reads
0 redo size
3268 bytes sent via SQL*Net to client
618 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100 rows processed

We notice the CBO has got the row estimate way wrong here, thinking that 10000 rows, not 100 rows are to be returned. Why ?

Remember, when we create a function-based index, Oracle creates a hidden virtual column on the table as Oracle needs a way to store the statistics associated with the result set from the function. But these statistics aren’t populated until we next collect statistics on the table (or explicitly for just the hidden columns) and without the column statistics, the CBO can make poor assumptions:

SQL> select column_name, data_type, num_distinct, density, histogram, hidden_column, virtual_column from dba_tab_cols where table_name='BOWIE';

COLUMN_NAME  DATA_TYPE NUM_DISTINCT    DENSITY HISTOGRAM       HID VIR
------------ --------- ------------ ---------- --------------- --- ---
SYS_NC00004$ DATE                              NONE            YES YES
BOWIE_DATE   VARCHAR2         10000      .0001 NONE            NO  NO
CODE         NUMBER            1000       .001 NONE            NO  NO
ID           NUMBER         1000000    .000001 NONE            NO  NO

There are no column statistics for the virtual column but we note the data type of the virtual column is DATE. So let’s collect new statistics on the table:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed.
SQL> select column_name, data_type, num_distinct, density, histogram, hidden_column, virtual_column from dba_tab_cols where table_name='BOWIE';

COLUMN_NAME  DATA_TYPE NUM_DISTINCT    DENSITY HISTOGRAM       HID VIR
------------ --------- ------------ ---------- --------------- --- ---
SYS_NC00004$ DATE             10000      .0001 NONE            YES YES
BOWIE_DATE   VARCHAR2         10000      .0001 HYBRID          NO  NO
CODE         NUMBER            1000       .001 NONE            NO  NO
ID           NUMBER         1000000    .000001 NONE            NO  NO

There are actually two key differences in the above statistics, but in this post I’ll just focus on the fact that the hidden virtual column now has associated statistics (I’ll discuss the other key difference in my next blog post).
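As an aside, a minimal sketch of how the statistics for just the hidden columns could have been gathered instead of re-gathering everything (METHOD_OPT as documented for DBMS_STATS):

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE', method_opt=>'FOR ALL HIDDEN COLUMNS SIZE AUTO');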

When we re-run the query:

SQL> select * from bowie where to_date(bowie_date, 'YYYY MM DD') = '13-JUN-2015';

100 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 960797537

-------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name            | Rows | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                 |  100 |  2800 |     102 (0) | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE           |  100 |  2800 |     102 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | BOWIE_DATE_FN_I |  100 |       |       3 (0) | 00:00:01 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(TO_DATE("BOWIE_DATE",'YYYY MM DD')=TO_DATE(' 2015-06-13 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
108 consistent gets
0 physical reads
0 redo size
3268 bytes sent via SQL*Net to client
618 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100 rows processed

The CBO has now got the estimate spot on.

If we now run the previous range scan query that accessed 1/2 the table, referencing the to_date function:

SQL> select * from bowie where to_date(bowie_date, 'YYYY MM DD') between '10-OCT-2000' and '31-DEC-2015';

556100 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id | Operation         | Name  | Rows  | Bytes  | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
| 0  | SELECT STATEMENT  |       |  556K |    14M |   1328 (34)| 00:00:01 |
|* 1 | TABLE ACCESS FULL | BOWIE |  556K |    14M |   1328 (34)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(TO_DATE("BOWIE_DATE",'YYYY MM DD')>=TO_DATE(' 2000-10-10
00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND TO_DATE("BOWIE_DATE",'YYYY MM
DD')<=TO_DATE(' 2015-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
40444 consistent gets
0 physical reads
0 redo size
18804277 bytes sent via SQL*Net to client
408355 bytes received via SQL*Net from client
37075 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
556100 rows processed

The CBO has got the estimates pretty well spot on and is now performing the far more efficient Full Table Scan. The CBO knows that the virtual column is of type DATE and therefore can much more accurately determine the actual cardinality estimates for the range scan on the “date” column.

If we now run the other corresponding range scan that returned a moderate number of rows:

SQL> select * from bowie where to_date(bowie_date, 'YYYY MM DD') between '10-OCT-2015' and '31-DEC-2015';

8300 rows selected.

Execution Plan
----------------------------------------------------------
Plan hash value: 1845943507

---------------------------------------------------------------------------
| Id | Operation         | Name  | Rows | Bytes  | Cost (%CPU) | Time     |
---------------------------------------------------------------------------
| 0  | SELECT STATEMENT  |       | 8401 |   229K |   1205 (27) | 00:00:01 |
|* 1 | TABLE ACCESS FULL | BOWIE | 8401 |   229K |   1205 (27) | 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(TO_DATE("BOWIE_DATE",'YYYY MM DD')>=TO_DATE(' 2015-10-10
00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND TO_DATE("BOWIE_DATE",'YYYY MM
DD')<=TO_DATE(' 2015-12-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))

Statistics
----------------------------------------------------------
14 recursive calls
0 db block gets
4058 consistent gets
0 physical reads
0 redo size
282075 bytes sent via SQL*Net to client
6635 bytes received via SQL*Net from client
555 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
8300 rows processed

We notice that again the CBO has got the row estimate just about right and because the index has a poor clustering factor, the CBO still decided to go down the Full Table Scan path.
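For the curious, a hedged sketch of checking that clustering factor via the standard dictionary view:

SQL> select index_name, clustering_factor, num_rows from dba_indexes where table_name='BOWIE';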

Even if we don’t use the index all that much (perhaps the Database In-Memory store is a better option for most queries), it can still play an important role in policing the integrity of the data. An attempt to insert an invalid date will now automatically be captured by the database and fail:

SQL> insert into bowie values (1000001, 42, '2016 6 31');
insert into bowie values (1000001, 42, '2016 6 31')
*
ERROR at line 1:
ORA-01839: date not valid for month specified

As the function-based index can only be populated or updated if a valid date is inserted into the table, any attempt to insert an invalid date will fail thus protecting the integrity of the data.

The best “fix” here is to store these dates in a date field within the database, where the above issues are automatically addressed. If this is not possible, then the introduction and usage of an associated function-based index can certainly assist the CBO in making the correct decision.
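For completeness, a minimal sketch of such a fix (assuming an outage window, clean data and no dependent application code; the new column name is purely illustrative):

SQL> alter table bowie add (bowie_date_new date);
SQL> update bowie set bowie_date_new = to_date(bowie_date, 'YYYY MM DD');
SQL> commit;
SQL> alter table bowie drop column bowie_date;
SQL> alter table bowie rename column bowie_date_new to bowie_date;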

However, if data integrity is not a concern (the application does a fantastic job of it and no one ever has to perform manual data fixes directly in the database), then there’s another option to help make the CBO do the right thing.

The clue is back in how I collected the table statistics, which I’ll discuss in my next post.

New Oracle Cloud Offering – Indexing as a Service (IDXaaS) (I Pity The Fool) April 1, 2016

Posted by Richard Foote in 12c Rel 2, IDXaaS, Oracle Cloud, Oracle Indexes.
12 comments

This of course is an April Fools joke. Sorry !!

A very exciting announcement !! I’ve recently been promoted within Oracle Corporation to lead their brand new Oracle Cloud offering “Indexing as a Service” (IDXaaS) team, based on my previous work and experience in the indexing space. Yes, I’m both thrilled and excited to be involved in this new venture :)

Promising both improved levels of service and reductions in maintenance and administration overheads, this new Oracle Cloud offering will take all indexing related hassles from on-premise databases and address them in the Cloud. No more index rebuilds, no more concerns you don’t have the right indexes in place and importantly no more worrying that indexes are taking up valuable TBs of storage. Oracle indexes can now be safely and seamlessly migrated and stored in the Oracle Cloud.

Relational databases depend heavily on indexes to ensure data can be retrieved quickly and efficiently, but indexes can be very difficult to administer, can use considerable resources to maintain and consume considerable amounts of disk space. They can be so problematic that some folk actually devote an entire blog to the subject !! Imagine if all these issues and overheads could be taken away from you when administering Oracle databases…

Index DDL will be enhanced in the 12c Release 2 Oracle Database to optionally specify a Cloud Domain Identifier (CDI) instead of a tablespace. So you will be able to either build or rebuild an index into the Oracle IDXaaS Cloud with syntax such as follows: CREATE INDEX bowie_idx on bowie(id) CLOUD LOCATION bowies_cloud_domain; This index will fully index the bowie table that exists in an on premise database (or indeed a database that is also in the cloud) but the index itself will be created and maintained in the Oracle Cloud. The database is fully aware of the actual physical location of the index as it is when created in a traditional tablespace and will simply update the index structure as DML is applied to the base table as it has always done. All Oracle indexing types and options will be fully supported including Oracle Text and Spatial indexes.

Like indexes stored in a tablespace, these indexes will need to be rebuilt or coalesced every now and then, but you won’t need to worry about this as Oracle will now implicitly do this for you in the Oracle Cloud. I’m currently hiring a team of DBAs with expertise in the “art” of rebuilding indexes who will ensure all your indexes will be rebuilt automatically for customers as necessary. By default, these operations will usually be performed on Sundays within the Oracle Cloud, but as indexes can be rebuilt online and as these indexes are no longer physically stored within your database, it’s an operation that is now performed automatically and seamlessly for you. Customers no longer need concern themselves with these types of index operations or when they occur, so while “we” work over the weekend, your DBAs can instead relax and keep their gardens nice and weed free. Please leave a comment on the blog if you think you have what it takes to effectively rebuild Oracle indexes and are interested in joining my elite IDXaaS team !!

For customers that choose to use the “Premium IDXaaS” offering, Oracle will automatically create or drop indexes as required in the Oracle Cloud. Customers no longer have to make the decision on what indexes should be created on which columns in which order; this will be performed automatically for you by Oracle’s IDXaaS. By running new advanced analysis routines on your database workloads, the Premium IDXaaS offering will create an index if it will help the efficiency of any of your current workloads. Conversely, if Oracle believes an index is redundant (for example if your database is running on Exadata), indexes will first be automatically made “invisible” and if there are no serious repercussions, will be automatically dropped 7 days later. DBAs and Developers will no longer need to worry about which indexes to create/drop, significantly reducing the complexities in managing both current and future Oracle database environments.

Oracle will also introduce a new package that will easily migrate all your existing indexes for you into the Oracle Cloud. The DBMS_INDEX_CLOUD package will automatically migrate all indexes for a specified Table/Partition/Schema/Pluggable Database/Database into the Oracle Cloud, which will be a fully online, seamless operation. Once completed, associated indexes within “on premise” tablespaces will effectively be dropped and be replaced with their Cloud equivalents. All that “expensive” storage that was previously tied up in storing all those Oracle indexes can now be freed up to store much more business rich data such as emails, selfie photos, David Bowie music catalogs, etc. Note that these significant storage savings will also translate into smaller backups, smaller Standby databases, smaller Production clones, etc. Importantly, the Oracle Cost Based Optimizer will be fully aware of the Cloud Indexes (as their definitions, statistics and other metadata are still stored within the database data dictionary) and will use the Cloud Indexes as appropriate. No changes to existing SQL are necessary; the CBO will decide to use an index in the cloud in the same manner as it chooses to use an index currently. There is also a new Cloud Index execution path option that allows data to be retrieved via the Cloud Index in a “good enough” manner which is much more efficient than a normal index scan, although it might not necessarily retrieve all the possible data that meets an SQL predicate. It effectively provides what we refer to as that “internet” data retrieval experience.

Oracle Open World 2016 will be the time of the official release for this exciting new Oracle Cloud offering. I will be presenting a number of sessions at OOW16 on this new IDXaaS and there will be a number of live demos to demonstrate its unique capabilities and the simplicity of the migration process. However, there will be a number of announcements before then, including the opportunity to be involved in a beta customer program. There will also be a number of hands-on workshops being conducted globally, with customers getting the chance to see for themselves how easy it is to move database indexes into the Oracle Cloud.

Lastly and perhaps most exciting of all, this new IDXaaS capability will initially be a free option with all current Oracle databases from 12c Release 2 onwards. You will be able to migrate all your existing 12.2 Oracle database indexes onto the Oracle Cloud and they will be stored, maintained and administered for free as part of your current Oracle database licenses. All Oracle database editions will be supported. This free offer however is likely to end once the 12c Release 2 database is officially released, but the additional pay as you go licensing constructs are proposed to be extremely competitive with yearly “indexed” price increments. Also to be subsequently released will be Oracle IDXaaS support for most other commercial databases including DB2, SQL Server, NoSQL, MySQL, Sybase, Hana, Hadoop and Notepad.

So administering Oracle (and other) Databases will soon change forever. Oracle indexes will no longer have to be a concern for customers as all database indexes can instead be stored in the Oracle Cloud via the new IDXaaS offering, allowing Oracle to automatically store and manage these tricky database objects for you. Not only will index rebuilds be a thing of the past, but so will all concerns regarding index storage, creation and maintenance. I’m very excited to be involved in this new undertaking and indeed with my move to Oracle HQ in San Francisco in the coming months (I’ll be up in the Clouds level of Building 1). I look forward to talking IDXaaS with you in the days to come :)

 

This of course is an April Fools joke. Sorry !!

New Round of Let’s Talk Database Events February 12, 2016

Posted by Richard Foote in Oracle Indexes.
3 comments

I’ll be presenting a new round of “Let’s Talk Database” events around Australia and NZ next month. These are free events but have often “sold out” in the past so booking early is recommended to avoid disappointment.

All events run between 9:00am – 12:30pm and are followed by a networking lunch.

Currently, the confirmed events are:

  • Sydney – Tuesday, 22 March 2016: Let’s Talk Oracle Database: Performance Diagnostics – How To Correctly Use AWR Reports. To register, email: mitch.foster@oracle.com
  • Melbourne – Wednesday, 23 March 2016: Let’s Talk Oracle Database: Oracle 12c Database New Features. To register, email: johanne.sergeant@oracle.com SOLD OUT !!
  • Canberra – Thursday, 31 March 2016: Let’s Talk Database: Intro To CBO Costings, APEX New Features and Software-on-Silicon. To register, email: richard.foote@oracle.com
  • Perth – Thursday, 5 May 2016: Let’s Talk Database: Performance Diagnostics – How To Correctly Use AWR Reports. To register, email: dennis.ward@oracle.com
  • Adelaide – Friday, 6 May 2016: Let’s Talk Database: Intro To CBO Costings, Database Futures. To register, email: brodie.james@oracle.com
  • Wellington – Wednesday, 11 May 2016: Let’s Talk Oracle Database: Oracle 12c Database New Features. To register, email: tanya.fea@oracle.com
  • Brisbane: Wednesday, 18 May 2016: Let’s Talk Oracle Database: Performance Diagnostics – How To Correctly Use AWR Reports. To register, email: gordon.laing@oracle.com

 

More “Let’s Talk Database” events will be added in the coming days.

Details of sessions:

Oracle 12c Database New Features

An in-depth technical look at many of the exciting enhancements and new capabilities introduced in the Oracle 12c Database that help make life easier for the DBA and Oracle Developer. Among a number of topics, we’ll have a look at changes with respect to the Cost Based Optimizer, new statistics gathering options, discuss many of the new indexing related capabilities such as Partial Indexes, Advanced Index Compression, Asynchronous Global Index Maintenance, etc. and look at many of those cool little things such as In-Database Archiving, Automatic Data Optimization, Identity Columns, Row-Limiting Clauses, Temporary Undo, Online Move operations, Clustering Attributes, Temporal and Improved Flashback Queries etc. to name but a few that can sometimes make all the difference. Might even squeeze in a bit of Database In-Memory. So plenty to talk about !!!

Performance Diagnostics – How To Correctly Use AWR Reports

I often get called in by customers to determine and address the root cause of database performance issues. Depending on the issue, a request for a simple Automatic Workload Repository (AWR) report is often sufficient to accurately diagnose the root problem(s). However, many DBAs find AWR reports daunting as they contain so much database diagnostic information, they don’t even know where to begin to find the truly useful information.

In this session, I begin by discussing the basic concepts of response times, DB times, the Oracle Wait Interface and how to focus on what is truly important. I then look at how to best read an AWR report to quickly go to the most relevant sections that detail any specific issues. I also discuss a general tuning and diagnostic methodology that ensures one can quickly determine whether an AWR report will indeed be sufficient and how to accurately and consistently use the AWR report to pinpoint and determine root causes for global database performance issues. We’ll go through a number of actual “real-life” examples that highlight various performance issues and how one accurately determines the actual root issues through an appropriate AWR report.

Time permitting, we’ll go through some other useful diagnostic tools such as the AWR Warehouse, Automatic Database Diagnostic Monitor (ADDM), Active Session History (ASH) and SQL Extended Tracing, and the scenarios in which they can prove to be invaluable in addressing database performance issues.

Intro To CBO Costings, APEX New Features and Software-on-Silicon

A common issue is confusion on why the Oracle Cost Based Optimizer (CBO) has made the decision to use a full table scan, or to use that index and not the one that appears more appropriate. The costing figures that the CBO derives and displays in execution plans are often viewed as cryptic, which is unfortunate as these costings are key to CBO decisions. Often the attempt to resolve such issues is to set inappropriate database parameters, which introduces as many issues as it tries to resolve. This session looks under the covers of the CBO to see how it precisely derives its various costings and hence how to correctly troubleshoot and resolve inefficient queries. Bring a calculator, it might come in handy :)

Oracle Application Express (APEX) is a rapid development environment built into the Oracle database. APEX 5 is the latest and largest release of Application Express. This APEX session will include an overview of APEX concentrating on new features, followed by a live demonstration. The session will also look at how easy it is to deploy APEX applications in the cloud.

Oracle’s Software-in-Silicon features extend the concepts of co-engineering hardware and software inherent in engineered systems such as the Exadata. Embedding software which has traditionally sat in the database layer into the hardware layer can yield both performance and security beyond that available of a traditional CPU. This presentation covers how Oracle Database 12c can leverage Software-in-Silicon features, and how third party developers can make use of the new features within their own applications.

 

Hope to see you at one of these events :)

Indexes and Initrans (Blackstar) January 14, 2016

Posted by Richard Foote in Oracle Indexes.
7 comments

It’s been a very tough week, so to help keep my mind off rather sad events, I thought I’d finish off one of my unpublished articles.

Initrans is a physical attribute that determines the initial number of concurrent transaction entries allocated within each data block for a given table/index/cluster. Every transaction that updates a block has to acquire an Interested Transaction List (ITL) slot, in order to store the transaction id, rollback information and ultimately lock the necessary row/index entry within the block.

For a table, the inability to acquire a necessary ITL can lead to hanging transactions and the resultant poor performance. A simple demonstration.

I’ll first create and populate a table with PCTFREE 0 so that the table blocks are full with no free space:

SQL> create table ziggy (id number, code number, name varchar2(42)) pctfree 0;

Table created.

SQL> insert into ziggy select rownum, mod(rownum,100), 'Ziggy Stardust'
from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

If we look at a partial block dump of an example block from the table (which happens to hold the first rows inserted):

Block header dump: 0x01835b37
Object id on Block? Y
seg/obj: 0x190b7 csc: 0x00.2affca5 itc: 2 flg: E typ: 1 - DATA
brn: 0 bdba: 0x1835b30 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0009.010.0000d52e 0x014001a4.0576.0e --U- 302 fsc 0x0000.02affcf5
0x02 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
bdba: 0x01835b37
data_block_dump,data header at 0x1fc8264
===============
tsiz: 0x1f98
hsiz: 0x26e
pbl: 0x01fc8264
76543210
flag=--------
ntab=1
nrow=302
frre=-1
fsbo=0x26e
fseo=0x283
avsp=0x15
tosp=0x15
0xe:pti[0] nrow=302 offs=0
0x12:pri[0] offs=0x718
0x14:pri[1] offs=0x730
0x16:pri[2] offs=0x748
0x18:pri[3] offs=0x760
0x1a:pri[4] offs=0x778

….

0x26a:pri[300] offs=0x6e6
0x26c:pri[301] offs=0x6ff
block_row_dump:
tab 0, row 0, @0x718
tl: 24 fb: --H-FL-- lb: 0x1 cc: 3
col 0: [ 2] c1 02
col 1: [ 2] c1 02
col 2: [14] 5a 69 67 67 79 20 53 74 61 72 64 75 73 74
tab 0, row 1, @0x730
tl: 24 fb: --H-FL-- lb: 0x1 cc: 3
col 0: [ 2] c1 03
col 1: [ 2] c1 03
col 2: [14] 5a 69 67 67 79 20 53 74 61 72 64 75 73 74
tab 0, row 2, @0x748
tl: 24 fb: --H-FL-- lb: 0x1 cc: 3
col 0: [ 2] c1 04
col 1: [ 2] c1 04
col 2: [14] 5a 69 67 67 79 20 53 74 61 72 64 75 73 74
tab 0, row 3, @0x760
tl: 24 fb: --H-FL-- lb: 0x1 cc: 3
col 0: [ 2] c1 05
col 1: [ 2] c1 05
col 2: [14] 5a 69 67 67 79 20 53 74 61 72 64 75 73 74
tab 0, row 4, @0x778
tl: 24 fb: --H-FL-- lb: 0x1 cc: 3
col 0: [ 2] c1 06
col 1: [ 2] c1 06
col 2: [14] 5a 69 67 67 79 20 53 74 61 72 64 75 73 74

….

The key aspect to note here is that the table is allocated two ITL entries by default. Therefore, two concurrent transactions can safely update this block but as there is no free space sufficient for an additional ITL to be dynamically added, what happens if we have three concurrent transactions ?

In session one:

SQL> delete ziggy where id=1;

1 row deleted.

In session two:

SQL> delete ziggy where id=2;

1 row deleted.

But in session three:

SQL> delete ziggy where id=3;

 

It hangs and will do so until one of the other transactions completes.
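As an aside, sessions stuck in this manner typically wait on the "enq: TX - allocate ITL entry" event, and a hedged sketch of spotting historical ITL contention (assuming the default STATISTICS_LEVEL so segment statistics are maintained):

SQL> select owner, object_name, value from v$segment_statistics where statistic_name = 'ITL waits' and value > 0;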

So Initrans can preallocate more ITL entries (up to 255) to prevent these types of scenarios.
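A minimal sketch, with an illustrative INITRANS of 3 (noting that altering INITRANS on an existing segment only applies to newly formatted blocks):

SQL> create table ziggy (id number, code number, name varchar2(42)) pctfree 0 initrans 3;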

But what of indexes, is the story the same ? Well, no, as we’ll see.

I’ll create another table, but this time give the table its default 10% of freespace but create the index with a PCTFREE 0:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,100), 'David Bowie'
from dual connect by level <=1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_id_i on bowie(id) pctfree 0;

Index created.

If we look at a partial block dump of the first leaf block in the index:

Block header dump: 0x018ad71c
Object id on Block? Y
seg/obj: 0x190b6 csc: 0x00.2aff9d5 itc: 2 flg: E typ: 2 - INDEX
brn: 0 bdba: 0x18ad718 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x02 0xffff.000.00000000 0x00000000.0000.00 C--- 0 scn 0x0000.02aff9d5
Leaf block dump
===============
header address 33325668=0x1fc8264
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 540
kdxcofbo 1116=0x45c
kdxcofeo 1120=0x460
kdxcoavs 4
kdxlespl 0
kdxlende 0
kdxlenxt 25876253=0x18ad71d
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 80 01 57 00 00
row#1[8012] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 80 01 57 00 01
row#2[4523] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 04
col 1; len 6; (6): 01 80 01 57 00 02
row#3[4535] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 05
col 1; len 6; (6): 01 80 01 57 00 03
row#4[4547] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 06
col 1; len 6; (6): 01 80 01 57 00 04

We notice that it also has two ITL entries by default.

However, the first ITL entry is reserved exclusively for Oracle recursive operations on the index (such as changes as a result of an index block split). An index leaf block can therefore only handle one concurrent transaction by default.

Notice also that the leaf block currently contains 540 index entries.

So what happens if we have 2 concurrent transactions wanting to update the same leaf block ?

In session one:

SQL> delete bowie where id=1;

1 row deleted.

In session two:

SQL> delete bowie where id=2;

1 row deleted.

Both complete successfully !! So what has occurred here ? Let’s look at a fresh partial block dump of this leaf block:

Block header dump: 0x018ad71c
Object id on Block? Y
seg/obj: 0x190b6 csc: 0x00.2affa76 itc: 3 flg: E typ: 2 - INDEX
brn: 0 bdba: 0x18ad718 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0007.00b.0000ce8a 0x0140da06.0568.01 CB-- 0 scn 0x0000.02affa76
0x02 0x0001.013.0000cd57 0x01400083.0559.02 ---- 1 fsc 0x000e.00000000
0x03 0x0005.01a.0000d062 0x014009eb.0532.13 ---- 1 fsc 0x000e.00000000
Leaf block dump
===============
header address 33325692=0x1fc827c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 278
kdxcofbo 592=0x250
kdxcofeo 4499=0x1193
kdxcoavs 3907
kdxlespl 0
kdxlende 2
kdxlenxt 25675901=0x187c87d
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8012
row#0[4499] flag: ---DS--, lock: 2, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 80 01 57 00 00
row#1[4511] flag: ---D---, lock: 3, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 80 01 57 00 01
row#2[4523] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 04
col 1; len 6; (6): 01 80 01 57 00 02
row#3[4535] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 05
col 1; len 6; (6): 01 80 01 57 00 03
row#4[4547] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 06
col 1; len 6; (6): 01 80 01 57 00 04

So we have a few interesting things happening here.

Firstly, Oracle has allocated an additional ITL entry; we now have three in this leaf block. And indeed, both deleted rows have the deleted byte (D) set. But how is this possible when the leaf block was previously full ?

When an index block is full but requires additional data to be inserted (as in the case of a new index entry), it has a mechanism to cater for this. It’s called an index block split and this is exactly what has occurred here. The second transaction needed additional space within the leaf block for a new ITL entry, so Oracle performed a 50-50 index block split, thus providing a half empty block in which to now insert the new ITL entry.

If we look at the number of index entries now in the leaf block (kdxconro), we notice it has gone down from 540 to just 278. If we look at the next leaf block pointer (kdxlenxt), we notice it now points to a different block (25675901=0x187c87d).

An index block split, which ordinarily occurs frequently enough within an index, is a much better option than having a hanging transaction suffering from ITL contention issues.
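As an aside, a hedged sketch of observing such split activity at the instance level (standard v$sysstat statistic names):

SQL> select name, value from v$sysstat where name like 'leaf node%splits';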

As we now have heaps of free space within the leaf block, if we want to now perform yet another concurrent transaction:

In session three:

SQL> delete bowie where id=3;

1 row deleted.

It has also completed successfully. A fresh partial block dump of the index leaf block will show that Oracle has just added yet another ITL entry:

Block header dump: 0x018ad71c
Object id on Block? Y
seg/obj: 0x190b6 csc: 0x00.2affb79 itc: 4 flg: E typ: 2 - INDEX
brn: 0 bdba: 0x18ad718 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0007.00b.0000ce8a 0x0140da06.0568.01 CB-- 0 scn 0x0000.02affa76
0x02 0x0001.013.0000cd57 0x01400083.0559.02 ---- 1 fsc 0x000e.00000000
0x03 0x0005.01a.0000d062 0x014009eb.0532.13 ---- 1 fsc 0x000e.00000000
0x04 0x0006.015.0000cfbe 0x0140040f.055e.18 ---- 1 fsc 0x000e.00000000
Leaf block dump
===============
header address 33325716=0x1fc8294
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 278
kdxcofbo 592=0x250
kdxcofeo 4475=0x117b
kdxcoavs 3883
kdxlespl 0
kdxlende 3
kdxlenxt 25675901=0x187c87d
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 7988
row#0[4475] flag: ---DS--, lock: 2, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 80 01 57 00 00
row#1[4487] flag: ---D---, lock: 3, len=12
col 0; len 2; (2): c1 03
col 1; len 6; (6): 01 80 01 57 00 01
row#2[4499] flag: ---D---, lock: 4, len=12
col 0; len 2; (2): c1 04
col 1; len 6; (6): 01 80 01 57 00 02
row#3[4511] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 05
col 1; len 6; (6): 01 80 01 57 00 03
row#4[4523] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 06
col 1; len 6; (6): 01 80 01 57 00 04

So indexes are far more forgiving than tables when it comes to concurrency issues in relation to transactions acquiring ITL entries.

If we want to perhaps avoid (the generally minimal) overheads of dynamically allocating new ITL entries within indexes or perhaps unnecessary wasted space in monotonically increasing indexes from such block split operations on compactly created indexes, we can use Initrans to preallocate more than the default two ITL entries for indexes:

SQL> drop index bowie_id_i;

Index dropped.

SQL> create index bowie_id_i on bowie(id) initrans 10;

Index created.

Block header dump: 0x018ad71d
Object id on Block? Y
seg/obj: 0x190b2 csc: 0x00.2aff443 itc: 10 flg: E typ: 2 - INDEX
brn: 0 bdba: 0x18ad718 ver: 0x01 opc: 0
inc: 0 exflg: 0

Itl Xid Uba Flag Lck Scn/Fsc
0x01 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x02 0xffff.000.00000000 0x00000000.0000.00 C--- 0 scn 0x0000.02aff443
0x03 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x04 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x05 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x06 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x07 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x08 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x09 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
0x0a 0x0000.000.00000000 0x00000000.0000.00 ---- 0 fsc 0x0000.00000000
Leaf block dump
===============
header address 33325860=0x1fc8324
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 466
kdxcofbo 968=0x3c8
kdxcofeo 1791=0x6ff
kdxcoavs 823
kdxlespl 0
kdxlende 0
kdxlenxt 25876254=0x18ad71e
kdxleprv 25876252=0x18ad71c
kdxledsz 0
kdxlebksz 7844
row#0[7831] flag: -------, lock: 0, len=13
col 0; len 3; (3): c2 05 4b
col 1; len 6; (6): 01 80 01 53 00 a6
row#1[7818] flag: -------, lock: 0, len=13
col 0; len 3; (3): c2 05 4c
col 1; len 6; (6): 01 80 01 53 00 a7

But this is generally unnecessary for indexes and just wastes space within index leaf blocks.

Now back to my grieving 😦

David Bowie 1947-2016. My Memories. January 12, 2016

Posted by Richard Foote in Oracle Indexes.
16 comments

bowie1

In mid-April 1979, as a nerdy 13 year old, I sat in my bedroom in Sale, North England listening to the radio when a song called “Boys Keep Swinging” came on by a singer called David Bowie, who I’d never heard of before. I instantly loved it and taped it next time it came on the radio via my cheap radio-cassette recorder. A few weeks later I saw David Bowie for the first time on TV performing this song on the Kenny Everett Video Show and from that moment on I was hooked. His performance was not only funny, but the song rocked and Bowie had a swagger that I just thought was instantly cool. OK, I can never be as cool as Bowie but perhaps I can be cool just by being a fan of Bowie.

I had little money but saved my pennies and eventually bought the “Best of Bowie” K-tel album on cassette (which I still have) to check out his previous work and just couldn’t believe that I loved each and every track. Space Oddity, Life on Mars, Jean Genie, Sound and Vision, “Heroes”, etc. etc. How can one person have so many great songs ?? So began the thrilling journey during the following years of slowly buying his back catalogue, album by album. Hunky Dory, “Heroes”, Diamond Dogs, his first “David Bowie” album, each and every one of them brilliant and I can still remember that feeling of sitting in my bedroom or in the family room back in Sale, listening to each of them for the first time.

In 1980, Bowie released “Scary Monsters and Super Creeps”, his first album release while a Bowie fan. It was stunning and watching the “Ashes to Ashes” video for the first time was an amazing experience. It felt more like a movie than a music video. I was excited that the single made No 1 in the UK charts, a bit like my football team winning a match.

In 1981, my parents bought a record player and the first album I bought on record (previously they had all been on cassette) was “The Rise and Fall of Ziggy Stardust and the Spiders From Mars”. I remember mum being cranky for playing it so loud but I said I had to, look, it says so on the back cover: “To be played at maximum volume”.

That year, I got my first book on David Bowie for my birthday, “David Bowie: An Illustrated Record” by Roy Carr and Charles Shaar Murray. I treasure it still, even though I now have over 70 books on Bowie. Yes, I love music and I love to read.

In 1982, we moved to Canberra Australia but my Bowie odyssey continued. The first album I bought soon after our move was Station to Station. What an album. That first Aussie Christmas, I got the “really” expensive “David Live” double-album and just played it to death. What a great time.

In 1983, Bowie toured Australia during the huge “Let’s Dance” world tour and my parents bought me a ticket and bus ride to see him live in Sydney. I couldn’t sleep the night before and after getting to the Sydney Show Grounds nice and early, managed to get myself to the second front row. I spent hours in eager excitement, reading through the program (they were like mini books in those days) and then the moment when I saw him live for the first time. I was THAT close to him and I know he looked at me several times during the show.

As the years passed, I finally bought all his albums and started making enough money that I didn’t have to sell some books or whatever in order to afford the latest magazine that featured Bowie on the cover. Having some money proved particularly useful as his music got re-released on CD format !!

1987, Bowie toured Australia again, this time as part of the “Glass Spiders” world tour. I had my first full-time job but no leave. Tricky, especially as Bowie was playing 8 shows at the Sydney Entertainment Arena. So I bought 2 tickets to each show and spent one of the most wonderful weeks of my life going to work, leaving at around 4pm, driving to Sydney with someone different at most shows and then driving home afterwards to be in bed by 2am to do it all again the next day. Yes, I’m a fully fledged David Bowie fan and yes I loved each and every show :)

While many people stopped following Bowie after his Tin Machine period (which I of course loved), I continue to follow his career keenly and just couldn’t wait to buy his latest album, be it “Black Tie White Noise”, “The Buddha of Suburbia” and one of my all-time favorites, “Outside”. The great albums continued on into the 2000’s when he joined forces again with the great Tony Visconti as record producer.

And so came February 2004 and Bowie’s last tour of Australia during his “Reality” world tour. This time I went to both of his concerts again at the Sydney Entertainment arena, now some 17 years since he last toured. He looked and sounded great and it was fantastic to have my wife with me this time to share the experience. I remember as he walked off after the final show I had a deep and sad feeling that I would never see him live again …

A few months later, he had a heart attack while on stage and the tour finished prematurely. Except for the odd single and public appearance he was rarely seen, before he finally disappeared from the scene in 2006. I continued to play and love his music but with rumours that he wasn’t well health-wise, I really thought he had finally retired from making music. Or so it seemed.

While on holiday in Hawaii in January 2013, my son said “Hey Dad, Bowie’s releasing a new single”. I thought it was a joke but to my utter amazement and sheer joy, not only had he just released a new single “Where Are We Now?”, but also a new album “The Next Day”. Not only was Bowie back, but the album was just as good as anything he’d done. What a comeback.

Last week on Friday (was it really just a few days ago !!), Bowie released his new album, Blackstar and again, I couldn’t contain my excitement. The Blackstar single and video were both stunning, as was the follow-up single Lazarus. But I was a touch worried. He looked frail and thinner than he’d been for a while. But hey, this is Bowie, the Thin White Duke; he often looks thin and frail.

On Sunday night I had a dream about Bowie (not surprisingly really, as his new songs were going through my mind), but I dreamt he had died (seriously). I was sooooo relieved it was just a silly dream. But on Monday at 6pm, as I was leaving work in the car and the ABC news came on the radio, I was shocked, stunned, shattered to hear that David Bowie had indeed suddenly passed away. I just burst into tears in the car park (thankfully it was late and nobody was around) and then my wife rang and then messages from friends and colleagues came pouring in and so life post David Bowie began.

It’s funny how so many people have said to me that I was the first person they thought of after they heard the news David Bowie had passed away. Some people who I hadn’t spoken to in years contacted me (and thank you, it really means a lot). I guess I wasn’t a closet fan. But that’s because David Bowie for the past 37-odd years has been a very real, important part of my life, one of my best ever friends even though I never actually (quite) met him in person. But I knew him like a best friend through his music and the incredible cultural influence he’s had on my life and those of so many many others. Hey, we were even both born in South London.

While it’s deeply sad that I’ll never again enjoy a new David Bowie album, how lucky am I to have lived at the same time as the great genius that is/was David Bowie. And though he’s now gone, his music, legacy and wonderful memories will live on with me for the rest of my life. He was indeed the soundtrack of my life. The Starman is finally up among the stars.

Thank you David Bowie. RIP.


An Index or Disaster, You Choose (It’s The End Of The World As We Know It) October 30, 2015

Posted by Richard Foote in Oracle Indexes, skip_unusable_indexes.
3 comments

This came up recently with a customer and I saw a related question on OTN that made me realize I haven’t previously discussed the SKIP_UNUSABLE_INDEXES parameter.

Let me start by making a very simple point. All it potentially takes to make your entire database effectively inoperable or indeed to maybe bring down a RAC instance is for one teeny weeny index to become unusable. That’s it.

If the unusable/missing index is used often enough on a table that is big enough, the resultant explosion of load on the database generated by 10,000s of (say) massive Full Table Scans can be catastrophic for all users of the database (or indeed for all users of the disk subsystem or database servers, as in the case of my customer).

Sometimes the lesser evil would be to simply ensure the statement(s) that were reliant on the now unusable index are prevented from executing and causing general havoc, so that all other database activities can continue unabated.

Hence why we have the SKIP_UNUSABLE_INDEXES parameter. When set to TRUE (the default), the CBO simply ignores the fact there’s an index in an unusable state it would have used and chooses the next best available plan (which might be a horrendous plan, such as a FTS on a massive table).

If however SKIP_UNUSABLE_INDEXES is set to FALSE, then an error is generated if an index the CBO would have used is in an unusable state. Yes, such statements now don’t run, causing possible frustration for some end users, but at least the “damage” is minimized to just those users, with the rest of the database community blissfully unaware and unimpacted by the unusable index.
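
As an aside, before deciding which way to set the parameter, it’s worth knowing what’s actually broken. A minimal sketch (assuming the usual DBA dictionary views) to list any currently unusable indexes or index partitions:

SQL> select owner, index_name, status from dba_indexes where status = 'UNUSABLE';

SQL> select index_owner, index_name, partition_name, status from dba_ind_partitions where status = 'UNUSABLE';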

To illustrate, a simple little example. Let’s just create a table and populate it with 1M rows. Note that the CODE column has 100,000 distinct values, so 10 rows returned for each value.

SQL> create table ziggy (id number, code number, name varchar2(42), description varchar(42));

Table created.

SQL> insert into ziggy select rownum, mod(rownum,100000), 'ZIGGY STARDUST', 'AND THE SPIDERS FROM MARS' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY');

PL/SQL procedure successfully completed.

Let’s create an index on this column as this would be a really cheap way to find a particular CODE of interest:

SQL> create index ziggy_code_i on ziggy(code);

Index created.

So a query that just selects a CODE of interest is going to be nice and efficient:

SQL> select * from ziggy where code=42;

10 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3294205578

----------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |              |    10 |   510 |    13   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY        |    10 |   510 |    13   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY_CODE_I |    10 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CODE"=42)

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
14  consistent gets
0  physical reads
0  redo size
1325  bytes sent via SQL*Net to client
551  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
10  rows processed

The index is used as expected and at just 14 consistent gets, it burns little CPU and performs no physical IOs.

If however the index were to become unusable …

SQL> alter index ziggy_code_i unusable;

Index altered.

And we now re-run the query:

SQL> select * from ziggy where code=42;

10 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2421001569

---------------------------------------------------------------------------
| Id  | Operation         | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |       |    10 |   510 |  2122   (1)| 00:00:01 |
|*  1 |  TABLE ACCESS FULL| ZIGGY |    10 |   510 |  2122   (1)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("CODE"=42)

Statistics
----------------------------------------------------------
0  recursive calls
0  db block gets
7765  consistent gets
7760  physical reads
0  redo size
965  bytes sent via SQL*Net to client
551  bytes received via SQL*Net from client
2  SQL*Net roundtrips to/from client
0  sorts (memory)
0  sorts (disk)
10  rows processed

We can see the query is now using the next best execution plan involving a FTS on our “large” table, which is performing many more consistent gets and physical IOs. If this query (or others also dependent on the now unusable index) were executed frequently enough and if the table was big enough, the resultant increase in both CPU and IO could be crippling to the entire database environment(s).

Rather than continually running this now extremely costly query, perhaps the better alternative is to prevent this query from running at all.

By changing the SKIP_UNUSABLE_INDEXES parameter to FALSE:

SQL> alter system set skip_unusable_indexes=false;

System altered.

If we try now to execute this query:

SQL> select * from ziggy where code=42;
select * from ziggy where code=42
*
ERROR at line 1:
ORA-01502: index 'BOWIE.ZIGGY_CODE_I' or partition of such index is in unusable
state

We get our beloved (in this scenario) ORA-01502 error. We now prevent the 10,000s of expensive FTS executions and the general database remains unaffected.

So you have the choice in what to do here. In the event of an unusable index, is there a sufficiently “good enough” alternate option that makes executing the statements impacted by the unusable index viable, or is the resultant impact on the database just too much, making the prevention of such statements from executing at all the better option ?

It’s potentially a quick fix for a dire database emergency.
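
Either way, the real cure is of course to make the index usable again as soon as practical. A sketch using the demo index above (an online rebuild assumes sufficient space and an edition that permits it):

SQL> alter index ziggy_code_i rebuild online;

Index altered.

SQL> alter system set skip_unusable_indexes=true;

System altered.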

Richard Foote: Upcoming Presentation Events (David Live) October 20, 2015

Posted by Richard Foote in Oracle Indexes.
4 comments

I’ll be doing the rounds in Australia/NZ in the coming weeks so there’ll be plenty of opportunity to catch up and see me in action:)

I’ll be doing a “National Tour” of my Let’s Talk Oracle Database sessions that I’ve been running locally in Canberra for a number of years. All events have limited places available so please register early to avoid disappointment.

All Let’s Talk Oracle Database events are free, start with registration and coffee between 8:30am and 9:00am, conclude at 12:30pm and are followed by lunch, networking and an informal Q&A. Details of events and dates are:

11 November Brisbane: Let’s Talk Oracle Database: Oracle 12c Database New Features (click here to register). Note: This event has only a few places remaining.

17 November Canberra: Let’s Talk Oracle Database: Performance Diagnostics – How To Correctly Use AWR Reports (email richard.foote@oracle.com to register). Note: This event is now officially FULL.

19 November Sydney: Let’s Talk Oracle Database: Oracle 12c Database New Features (email mitch.foster@oracle.com or click here to register)

25 November Adelaide: Let’s Talk Oracle Database: Performance Diagnostics – How To Correctly Use AWR Reports (email brodie.james@oracle.com to register).

I’ll also be participating on the OTN APAC 2015 Tour with a number of other great presenters. I’ll be presenting at the following events:

30 November Sydney (click here to register)

2 December Perth (click here to register)

4 December Wellington (click here to register)

Summary of the Let’s Talk Oracle Database sessions:

Oracle 12c Database New Features

An in-depth technical look at many of the exciting enhancements and new capabilities introduced in the Oracle 12c Database that help make life easier for the DBA and Oracle Developer. Among a number of topics, we’ll have a look at changes with respect to the Cost Based Optimizer, new statistics gathering options, and many of the new indexing related capabilities such as Partial Indexes, Advanced Index Compression, Asynchronous Global Index Maintenance, etc. We’ll also look at many of those cool little things such as In-Database Archiving, Automatic Data Optimization, Identity Columns, Row-Limiting Clauses, Temporary Undo, Online Move operations, Clustering Attributes, Temporal and Improved Flashback Queries etc., to name but a few, that can sometimes make all the difference. Might even squeeze in a bit of Database In-Memory. So plenty to talk about !!!

Performance Diagnostics – How To Correctly Use AWR Reports

I often get called in by customers to determine and address the root cause of database performance issues. Depending on the issue, a request for a simple Automatic Workload Repository (AWR) report is often sufficient to accurately diagnose the root problem(s). However, many DBAs find AWR reports daunting as they contain so much database diagnostic information that they don’t even know where to begin to find the truly useful information.

In this session, I begin by discussing the basic concepts of response times, DB times, the Oracle Wait Interface and how to focus on what is truly important. I then look at how to best read an AWR report to quickly go to the most relevant sections that detail any specific issues. I also discuss a general tuning and diagnostic methodology that ensures one can quickly determine whether an AWR report will indeed be sufficient and how to accurately and consistently use the AWR report to pinpoint and determine root causes for global database performance issues. We’ll go through a number of actual “real-life” examples that highlight various performance issues and how one accurately determines the actual root issues through an appropriate AWR report.

Time permitting, we’ll go through some other useful diagnostic tools such as the AWR Warehouse, Automatic Database Diagnostic Monitor (ADDM), Active Session History (ASH)  and SQL Extended Tracing and scenarios when they can prove to be invaluable in addressing database performance issues.

Hope to see you at one of these events:)

Index Advanced Compression: Multi-Column Index Part II (Blow Out) September 24, 2015

Posted by Richard Foote in Advanced Index Compression, Concatenated Indexes, Index Column Order, Index Rebuild, Oracle Indexes.
add a comment

I previously discussed how Index Advanced Compression can automatically determine not only the correct number of columns to compress, but also the correct number of columns to compress within specific leaf blocks of the index.

However, this doesn’t mean we can just order the columns within the index without due consideration from a “compression” perspective. As I’ve discussed previously, the column order within an index can be very important (especially with regard to the use of the index if the leading column of the index is not specified in the SQL), including with regard to the possible compression capabilities of an index.

Advanced Index Compression does not change this and if we order columns inappropriately, one of the consequences can be the index simply can’t be compressed.

To illustrate, back to my simple little example:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

But this time, I’m going to create the index with the columns the other way around than I had in the previous post. The effectively unique ID column is now the leading column in the index, followed by the CODE column that indeed has many duplicate values. There is a “myth” that suggests this is actually a more “efficient” way to order an index: put the column with the most distinct values first. This is of course not true (yes, I’ve covered this one before as well).

SQL> create index bowie_idx on bowie(id, code) pctfree 0;

Index created.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2363          2

So the index without compression has 2363 leaf blocks.

As the leading column is effectively unique, we simply can’t now compress this index effectively. That’s because compression requires there to be duplicate index entries starting with at least the leading column. If the leading column has few (or no) duplicates, then by compressing the index Oracle is effectively creating a prefixed entry within the leaf block for each and every index entry. The whole point of index (or table) compression is to effectively de-duplicate the index values but there’s nothing to de-duplicate if there are no repeating values in at least the leading column of the index.

If we attempt to fully compress the index anyway:

SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3115          2

It actually results in a bigger, not smaller index. The number of leaf blocks has gone up from 2363 to 3115.

Unlike the previous post where the columns in the index were the other way around, if we attempt to just compress the first column, it makes no difference to the inefficiency of the index compression because the number of prefix entries we create remains exactly the same:

SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3115          2

So the index remains at the higher 3115 leaf blocks.

The good thing with Advanced Index Compression is that we can “give it a go”, but it will not result in a larger index structure. If there’s nothing to compress within a leaf block, Oracle just ignores it and moves on to the next leaf block. If there’s nothing to compress at all within the index, the index remains the same as if it’s not been compressed:

SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2363          2

So the index is now back to 2363 leaf blocks, the same as if it wasn’t compressed at all. No, it hasn’t helped, but at least it hasn’t made things worse.

So the order of the columns still plays a vital part in the “compress-ability” of the index, even with Index Advanced Compression at your disposal. If both the ID and CODE columns are referenced in your code, then having CODE as the leading column of the index would both improve the manner in which the index can be compressed and make an Index Skip Scan viable in those cases when the CODE column is occasionally not specified.
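
For example (just a sketch using the same demo table), we could recreate the index with CODE as the leading column and let Advanced Index Compression take care of the rest:

SQL> drop index bowie_idx;

Index dropped.

SQL> create index bowie_idx on bowie(code, id) pctfree 0 compress advanced low;

Index created.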

Now, if we change the leading column and create some duplicates (in this case, we update about 10% of the rows to now have duplicate values in the leading ID column):

SQL> update bowie set id=42 where id between 442000 and 542000;

100001 rows updated.

SQL> commit;

Commit complete.

SQL> alter index bowie_idx rebuild nocompress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2338          2

With a whole bunch of IDs with a value of 42, the non-compressed index now has 2338 leaf blocks. Yes, 10% of the leading columns have duplicates, but 90% of the index doesn’t and remains effectively unique. So if we try and compress this index now:

SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2941          2

The compressed index now has 2941 leaf blocks and is still larger than the 2338 leaf blocks of the non-compressed index. Yes, it’s compressed the 10% of the index that it could, but the inefficiencies in dealing with the other 90% have resulted in an overall larger index. So not too good really.

Again, compressing just the leading ID column doesn’t improve matters:

SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2977          2

In fact, at 2977 it’s even worse than compressing the whole index, because by compressing both columns we could at least also compress the duplicate CODE values within that 10% of the index where we had duplicate ID values. By compressing just the ID column, we don’t get the benefit of compressing the duplicate CODE values. So not very good either.

In either case, compressing the index is ineffective as we end up with a bigger, not smaller index.

But not with Index Advanced Compression:

SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2265          2

We now have an index structure at just 2265 leaf blocks that is indeed smaller than the non-compressed index (2338 leaf blocks), because Oracle can now compress just the 10% of the index where compression is effective and just ignore the rest of the index (90%) where compression is ineffective.

The best of both worlds: Index Advanced Compression compresses just the parts of an index where it can do so effectively, while ignoring (and so not making matters worse in) the parts of the index where compression is ineffective.

An indexing no-brainer …

Index Advanced Compression: Multi-Column Index Part I (There There) September 17, 2015

Posted by Richard Foote in 12c, Advanced Index Compression, Concatenated Indexes, Index Rebuild, Oracle Indexes.
add a comment

I’ve discussed Index Advanced Compression here a number of times previously. It’s the really cool additional capability introduced to the Advanced Compression Option with 12.1.0.2, that not only makes compressing indexes a much easier exercise but also enables indexes to be compressed more effectively than previously possible.

Thought I might look at a multi-column index to highlight just how truly cool this new feature is in automatically managing the compression of indexes.

First, let’s create a little table and multi-column index:

SQL> create table bowie (id number, code number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, mod(rownum,10), 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_idx on bowie(code, id) pctfree 0;

Index created.

OK, the key thing to note here is that the leading CODE column in the index only has 10 distinct values and so is repeated very frequently. However, the second ID column is effectively unique such that the index entry overall is also likewise effectively unique. I’ve created this index initially with no compression, but with a PCTFREE 0 to make the non-compressed index as small as possible.

If we look at the size of the index:

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2361          2

We notice the index currently has 2361 leaf blocks.

I’ve previously discussed how index compression basically de-duplicates the indexed values by storing them in a prefix table within the index leaf block. These prefix entries are then referenced in the actual index entries, meaning it’s only now necessary to store repeated values once within a leaf block. Only repeated index values within an index leaf block can therefore be effectively compressed.

In this example, it would be pointless to compress both indexed columns as this would only result in a unique prefix entry for each and every index entry, given that the ID column is unique. In fact, the overhead of having a prefix table entry for each and every index entry would actually result in a larger, not smaller, overall index structure.

To show how compressing the whole index would be a really dumb idea for this particular index:

SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         3120          2

The COMPRESS option basically compresses the whole index and we note that rather than creating a smaller, compressed index structure, the index is in fact bigger at 3120 leaf blocks.

However, as the leading CODE column in the index only has 10 distinct values and so is heavily repeated, it would make sense to compress just this first CODE column in the index. This of course requires us to fully understand the data associated with the index.

We can do this by specifying just how many leading columns to compress (in this case just 1):

SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2002          2

We note the index is indeed smaller than it was originally, now at just 2002 leaf blocks.

So this requires us to make the correct decision on how many columns in the index to compress. Getting this wrong can result in a worse, not better, overall index structure.
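
Note that Oracle can at least help with this decision via the INDEX_STATS view. A sketch (bearing in mind that ANALYZE ... VALIDATE STRUCTURE locks the underlying table while it runs, so it’s not something to fire off casually in production):

SQL> analyze index bowie_idx validate structure;

Index analyzed.

SQL> select name, opt_cmpr_count, opt_cmpr_pctsave from index_stats;

OPT_CMPR_COUNT reports the optimal prefix column count (which should be 1 here), with OPT_CMPR_PCTSAVE estimating the associated percentage saving.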

Now with Advanced Index Compression, we don’t have to make this decision; we can simply let Oracle do it for us. As discussed previously, Oracle can go through each leaf block and decide how to best compress each leaf block. In this case, it can automatically determine that it’s only beneficial to compress the CODE column throughout the index.

If we compress this index with the new COMPRESS ADVANCED LOW clause:

SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2002          2

We note we get the index at the nice, small 2002 leaf blocks, as if we used the correct COMPRESS 1 decision.

However, the story gets a little better than this …

Let’s now modify the contents of the table so that we create some duplicates also for the second ID column:

SQL> update bowie set id=42 where id between 442000 and 542000;

100001 rows updated.

SQL> commit;

Commit complete.

OK, so for about 10% of rows, the ID column value is indeed repeated with the value 42. However, for the remaining 90% of rows (and hence index entries), the ID column remains effectively unique. So we have this 10% section of the index where ID is indeed heavily repeated with the value 42, but everywhere else within the index the ID remains unique.

If we rebuild this index again with no compression:

SQL> alter index bowie_idx rebuild nocompress pctfree 0;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2336          2

We now end up with 2336 leaf blocks (a little smaller than before the update as we’re replacing 10% of the IDs with a smaller value of just 42).

However, the vast majority (90%) of the index entries are still unique, so attempting to compress the entire index is again unlikely to be beneficial:

SQL> alter index bowie_idx rebuild compress;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         2946          2

Indeed, the index is again now bigger at 2946 than it was when it wasn’t compressed.

We can again effectively compress just the CODE column in the index:

SQL> alter index bowie_idx rebuild compress 1;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         1977          2

OK, just compressing the CODE column has indeed resulted in a smaller index structure (just 1977 leaf blocks) as it did before.

Without Advanced Index Compression we have the option to not compress the index (the result is average), compress both columns (the result is worse) or compress just the leading column (the result is better). It’s an all or nothing approach to index compression with the best method decided at the overall index level.

We don’t have the option to compress just the leading column where that makes sense, while also compressing both columns in just the 10% portion of the index where that makes sense (where we have lots of repeating 42 values for ID).

We do have this option though with Advanced Index Compression, and indeed this is performed automatically by Oracle in just those leaf blocks where it’s beneficial, because the decision on how to compress an index is made not at the overall index level but at the leaf block level. As such, Advanced Index Compression has the potential to compress an index in a manner that was simply not possible previously:

SQL> alter index bowie_idx rebuild compress advanced low;

Index altered.

SQL> select index_name, leaf_blocks, blevel from dba_indexes where table_name='BOWIE';

INDEX_NAME LEAF_BLOCKS     BLEVEL
---------- ----------- ----------
BOWIE_IDX         1941          2

We notice the index is now even smaller, at just 1941 leaf blocks, than it was when just compressing the leading column, as we now also compress the ID column in just that 10% of the index where we have repeating ID values.

I can’t emphasise enough just how cool this feature is !!

In fact, I would recommend something I don’t usually recommend and that is rebuilding all your indexes at least once (where you know the leading column has some repeated values) with the Advanced Index Compression option, so that all indexes can be compressed in their optimal manner.
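
A simple way to script such an exercise (just a sketch against USER_INDEXES; you’d obviously sanity check the generated list before running anything):

SQL> select 'alter index ' || index_name || ' rebuild compress advanced low;'
     from user_indexes
     where index_type = 'NORMAL' and compression = 'DISABLED';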

Note though that this does require the Advanced Compression Option !!

More later:)

Presenting the Hotsos Symposium Training Day – 10 March 2016 (Heat) September 16, 2015

Posted by Richard Foote in Oracle Indexes.
4 comments

I’ve just accepted an invitation to present the Hotsos Symposium Training Day on 10 March 2016 in sunny Dallas, Texas. In the age of Exadata and In-Memory databases, it’ll be an updated and consolidated version of my Index Internals and Best Practices seminar. With an emphasis on using indexes appropriately to boost performance, it’ll feature lots of tips, tricks and references to David Bowie:)

The Hotsos Symposium (6-10 March 2016), with its focus on performance analysis, is one of the very best conferences I’ve ever presented at and I’m very excited at the prospect of returning next year. The confirmed list of presenters is already impressive: Tanel Põder, Bryn Llewellyn, Kellyn Pot’Vin, Jim Czuprynski, Jeff Smith, Kerry Osborne, Carlos Sierra and now Richard Foote, as I’ll likely present a session or two during the conference proper as well.

If you get the chance, do yourself a favour and join me at one of the best Oracle conferences available and hopefully stay on for the Training Day where there’ll be lots of learning and fun to be had !!

Presenting in Perth on 9 September and Adelaide on 11 September (Stage) August 22, 2015

Posted by Richard Foote in Oracle Indexes.
add a comment

For those of you lucky enough to live on the western half of Australia, I’ll be presenting at a couple of events in both Perth and Adelaide in the coming weeks.

On Wednesday, 9th September 2015, I’ll be presenting on Oracle Database 12c New Features For DBAs (and Developers) at a “Let’s Talk Oracle” event in Perth, with fellow Ex-Oracle ACE Directors Connor McDonald and Chris Muir. Full agenda as follows:

8:30-9:00 Registration and coffee
9:00-10:30 Richard Part I – Database 12c New Features for DBAs (and Developers)
10:30-11:00 Break
11:00-12:30 Richard Part II – Database 12c New Features for DBAs (and Developers)
12:30-1:30 Lunch
1:30-2:30 Chris – Creating RESTful APIs with Oracle Data Services (for Developers and DBAs)
2:30-2:45 Break
2:45-4:15 Connor – Database 12c New Features for Developers (and DBAs)
4:15-5:00 Q&A with the ACES !
5:00         Wrap up

This is a free event to be held at the Oracle Offices in Perth so places are strictly limited (and almost exhausted). It should be a fantastic day, so I hope you can make it. To register, just send an email to dennis.ward@oracle.com

The following morning, I’ll likely run an Indexing Q&A breakfast session for the WAOUG. Further details to come.

On Friday, 11th September 2015, I’ll be presenting a similar session on Oracle Database 12c New Features (For both DBAs and Developers) at the Adelaide Offices between 9am and 12:30pm. Again, places will be strictly limited on a first come first served basis. To register, just send an email to brodie.james@oracle.com.

Hopefully, I’ll see you at one of these events. If not, I’m planning to run similar events in the other capital cities in Australia in the coming weeks, more details to follow soon.

Why A Brand New Index Might Benefit From An Immediate Coalesce (One Slip) July 6, 2015

Posted by Richard Foote in ASSM, Index Block Splits, Index Internals, Insert Append, Oracle Indexes, Tree Dumps, Truncate Indexes.
add a comment

A recent question on the OTN Forums (“Reg: Index – Gathering Statistics vs. Rebuild”) got me thinking about a scenario, not unlike the one raised in the question, where a newly populated index might immediately benefit from a coalesce.

I’ve previously discussed some of the pertinent concepts such as how index rebuilds can make indexes bigger, not smaller and some of the issues around 90-10 block splits not occurring.

Let me show you a scenario where a newly populated index might benefit from an immediate coalesce.

As usual, I start with my little Bowie table and index on the ID column:

SQL> create table bowie (id number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, 'DAVID BOWIE' from dual connect by level <= 1000000;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> create index bowie_id_i on bowie(id);

Index created.

If we look at the current statistics on table and index:

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed.

SQL> select num_rows, blocks, empty_blocks from dba_tables where table_name='BOWIE';

NUM_ROWS     BLOCKS EMPTY_BLOCKS
---------- ---------- ------------
1000000       3142            0

SQL> select blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_ID_I';

BLEVEL LEAF_BLOCKS   NUM_ROWS
---------- ----------- ----------
2        2226    1000000

We note the index currently has 2226 leaf blocks. A look at a partial tree dump of the index:

—– begin tree dump
branch: 0x1834723 25380643 (0: nrow: 4, level: 2)
branch: 0x18340b6 25378998 (-1: nrow: 672, level: 1)
leaf: 0x1834724 25380644 (-1: row:485.485 avs:828)
leaf: 0x1834725 25380645 (0: row:479.479 avs:820)
leaf: 0x1834726 25380646 (1: row:479.479 avs:820)
leaf: 0x1834727 25380647 (2: row:479.479 avs:820)
leaf: 0x1834728 25380648 (3: row:479.479 avs:820)
leaf: 0x1834729 25380649 (4: row:479.479 avs:819)
leaf: 0x183472a 25380650 (5: row:479.479 avs:820)
leaf: 0x183472b 25380651 (6: row:479.479 avs:820)
leaf: 0x183472c 25380652 (7: row:479.479 avs:820)
leaf: 0x183472d 25380653 (8: row:479.479 avs:819)
leaf: 0x183472e 25380654 (9: row:479.479 avs:820)
leaf: 0x183472f 25380655 (10: row:479.479 avs:820)

Shows us that the index is fully populated with just the default 10% pctfree of free space (most leaf blocks have an avs of 819/820 free bytes).

The forum question mentions a scenario where 70% of the table is archived away. This is done by storing in a temporary table the 30% of required data, followed by a truncate of the original table and the 30% of data being re-inserted. Something like the following:

SQL> create table bowie_temp as select * from bowie where id > 700000;

Table created.

SQL> truncate table bowie;

Table truncated.

So we store in the temp table all values with an ID > 700000. When a table is truncated, so are all the associated indexes. If we performed a tree dump of the index straight after the truncate:

—– begin tree dump
leaf: 0x1834723 25380643 (0: row:0.0 avs:8000)
—– end tree dump

We can see the index consists now of nothing but an empty leaf block.

If we now re-insert the 30% of data of interest and collect fresh statistics:

SQL> insert into bowie select * from bowie_temp;

300000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed.

SQL> select num_rows, blocks, empty_blocks from dba_tables where table_name='BOWIE';

NUM_ROWS     BLOCKS EMPTY_BLOCKS
---------- ---------- ------------
300000       1000            0

SQL> select blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_ID_I';

BLEVEL LEAF_BLOCKS   NUM_ROWS
---------- ----------- ----------
2        1112     300000

We note the number of leaf blocks has effectively halved to just 1112 leaf blocks down from 2226 leaf blocks.

But isn’t this index somewhat larger than it should be ? If we removed 70% of data, why has the index just reduced by 50% ?

If we look at a partial tree dump of the index now:

branch: 0x1834723 25380643 (0: nrow: 3, level: 2)
branch: 0x18335de 25376222 (-1: nrow: 332, level: 1)
leaf: 0x1834727 25380647 (-1: row:255.255 avs:3922)
leaf: 0x18004f6 25167094 (0: row:255.255 avs:3923)
leaf: 0x18004f2 25167090 (1: row:255.255 avs:3922)
leaf: 0x18004f3 25167091 (2: row:255.255 avs:3923)
leaf: 0x18004f4 25167092 (3: row:255.255 avs:3922)
leaf: 0x1800505 25167109 (4: row:449.449 avs:821)
leaf: 0x18004f5 25167093 (5: row:246.246 avs:4066)
leaf: 0x18004f1 25167089 (6: row:246.246 avs:4067)
leaf: 0x1834724 25380644 (7: row:500.500 avs:5)
leaf: 0x1834725 25380645 (8: row:500.500 avs:5)
leaf: 0x1834726 25380646 (9: row:500.500 avs:5)
leaf: 0x18004f7 25167095 (10: row:500.500 avs:5)
leaf: 0x18004f0 25167088 (11: row:255.255 avs:3922)
leaf: 0x1833d8c 25378188 (12: row:255.255 avs:3923)
leaf: 0x18331ed 25375213 (13: row:255.255 avs:3922)
leaf: 0x18331fd 25375229 (14: row:255.255 avs:3923)
leaf: 0x18331fa 25375226 (15: row:255.255 avs:3922)
leaf: 0x18331c8 25375176 (16: row:255.255 avs:3923)
leaf: 0x18331c9 25375177 (17: row:253.253 avs:3954)
leaf: 0x18331cd 25375181 (18: row:255.255 avs:3923)
leaf: 0x18331d4 25375188 (19: row:255.255 avs:3923)
leaf: 0x18331e0 25375200 (20: row:255.255 avs:3922)

We notice that vast areas of the index now have 50% free space, not the default 10% they had previously.

The “problem” here is that when the data was stored in the temp table, there was nothing to guarantee that it would be physically stored in ID order. ASSM tablespaces will effectively pick random free blocks below the high water mark of the table, so that although data might well be ordered within a block, the blocks within the table might not be logically ordered with respect to the data being inserted.

A simple select from the temp table displaying the “first” 20 rows of the table will illustrate this:

SQL> select * from bowie_temp where rownum <=20;

ID NAME
---------- ------------------------------------------
701717 DAVID BOWIE
701718 DAVID BOWIE
701719 DAVID BOWIE
701720 DAVID BOWIE
701721 DAVID BOWIE
701722 DAVID BOWIE
701723 DAVID BOWIE
701724 DAVID BOWIE
701725 DAVID BOWIE
701726 DAVID BOWIE
701727 DAVID BOWIE
701728 DAVID BOWIE
701729 DAVID BOWIE
701730 DAVID BOWIE
701731 DAVID BOWIE
701732 DAVID BOWIE
701733 DAVID BOWIE
701734 DAVID BOWIE
701735 DAVID BOWIE
701736 DAVID BOWIE

Note that the first selected range of values starts with 701717 and not 700001.

Therefore, when the data is loaded back into the table, the data is not necessarily ordered as it was previously, and an insert into a full leaf block might not necessarily be for the largest ID value currently within the table/index. That being the case, a 50-50 block split is performed rather than the 90-10 block splits that only occur when it’s the largest indexed value being inserted into a full leaf block. 90-10 block splits leave behind nice full leaf blocks, but 50-50 block splits leave behind 1/2 emptied blocks that don’t get filled if they don’t house subsequent inserts.

The 50-50 leaf block splits result in a larger index but, most importantly, these areas of free space are not going to be used by subsequent monotonically increasing ID index values.

To reclaim this effectively unusable index storage, the index would benefit from an immediate coalesce.
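
For example, a sketch using the demo index above; a coalesce quietly merges the contents of adjacent, sparsely populated leaf blocks together:

SQL> alter index bowie_id_i coalesce;

Index altered.

SQL> exec dbms_stats.gather_index_stats(ownname=>null, indname=>'BOWIE_ID_I');

PL/SQL procedure successfully completed.

SQL> select blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_ID_I';

The leaf block count should drop back towards that of a fully compacted index, with the emptied leaf blocks placed back on the index freelist for subsequent reuse.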

Of course, the best cure would be to prevent this scenario from occurring in the first place.

One option would be to insert the data in a manner that ensures the effective ordering of the indexed data:

SQL> truncate table bowie;

Table truncated.

SQL> insert into bowie select * from bowie_temp order by id;

300000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed.

SQL> select num_rows, blocks, empty_blocks from dba_tables where table_name='BOWIE';

NUM_ROWS     BLOCKS EMPTY_BLOCKS
---------- ---------- ------------
300000       1000            0

SQL> select blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_ID_I';

BLEVEL LEAF_BLOCKS   NUM_ROWS
---------- ----------- ----------
1         600     300000

We notice by inserting the data in ID order within the table (via the “order by” clause), we guarantee that an insert into a full leaf block will indeed be as a result of the highest current index entry and that 90-10 block splits are performed. This leaves behind perfectly full leaf blocks and a nice, compact index structure. The index now only has 600 leaf blocks, significantly less than before with no “wasted” storage.

A partial tree dump of the index highlights this:

—– begin tree dump
branch: 0x1834723 25380643 (0: nrow: 600, level: 1)
leaf: 0x1834727 25380647 (-1: row:500.500 avs:5)
leaf: 0x1834724 25380644 (0: row:500.500 avs:5)
leaf: 0x1834725 25380645 (1: row:500.500 avs:5)
leaf: 0x1834726 25380646 (2: row:500.500 avs:5)
leaf: 0x18004f7 25167095 (3: row:500.500 avs:5)
leaf: 0x18004f0 25167088 (4: row:500.500 avs:5)
leaf: 0x18004f1 25167089 (5: row:500.500 avs:5)
leaf: 0x18004f5 25167093 (6: row:500.500 avs:5)
leaf: 0x18004f6 25167094 (7: row:500.500 avs:5)
leaf: 0x18004f2 25167090 (8: row:500.500 avs:5)
leaf: 0x18004f3 25167091 (9: row:500.500 avs:5)
leaf: 0x18004f4 25167092 (10: row:500.500 avs:5)

The index leaf blocks are chock-a-block full with just 5 bytes free, not enough space for another index entry.

Another alternative would of course be to make the index unusable before re-inserting the data into the table and to rebuild the index later with a pctfree of 0, a safe and appropriate value for monotonically increasing values. This has the added advantage of significantly improving the performance of the bulk insert operation.
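
A sketch of that approach using the demo objects above (with SKIP_UNUSABLE_INDEXES at its default of TRUE, the bulk insert simply bypasses maintenance of the unusable index):

SQL> truncate table bowie;

Table truncated.

SQL> alter index bowie_id_i unusable;

Index altered.

SQL> insert into bowie select * from bowie_temp;

300000 rows created.

SQL> commit;

Commit complete.

SQL> alter index bowie_id_i rebuild pctfree 0;

Index altered.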

If the data isn’t monotonically increasing, then 50-50 block splits are fine as the resultant free space would indeed be effectively used.

Update: 7 July 2015

And as Jonathan Lewis kindly reminded me, another method to avoid this issue is to simply use a direct-path insert (via the APPEND hint). This will record all the key entries as it goes, sort them and populate the index much more efficiently in one step:

SQL> truncate table bowie;

Table truncated.

SQL> insert /*+ append */ into bowie select * from bowie_temp;

300000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'BOWIE');

PL/SQL procedure successfully completed.

SQL> select num_rows, blocks, empty_blocks from dba_tables where table_name='BOWIE';

NUM_ROWS     BLOCKS EMPTY_BLOCKS
---------- ---------- ------------
300000        936            0

SQL> select blevel, leaf_blocks, num_rows from dba_indexes where index_name='BOWIE_ID_I';

BLEVEL LEAF_BLOCKS   NUM_ROWS
---------- ----------- ----------
1         600     300000

So at 600 leaf blocks, the index is again populated within a fully compacted index structure.

Quiz Time. Why Do Deletes Cause An Index To Grow ? (Solution) June 29, 2015

Posted by Richard Foote in Oracle Indexes.
add a comment

OK, time to reveal how a couple of simple deletes can cause an index to double in size.

If we go back and look at the tree dump before the delete operation:

—– begin tree dump
branch: 0x180050b 25167115 (0: nrow: 19, level: 1)
leaf: 0x180050c 25167116 (-1: row:540.540 avs:4)
leaf: 0x180050d 25167117 (0: row:533.533 avs:10)
leaf: 0x180050e 25167118 (1: row:533.533 avs:11)
leaf: 0x180050f 25167119 (2: row:533.533 avs:10)
leaf: 0x1800510 25167120 (3: row:533.533 avs:10)
leaf: 0x1800511 25167121 (4: row:533.533 avs:11)
leaf: 0x1800512 25167122 (5: row:533.533 avs:10)
leaf: 0x1800513 25167123 (6: row:533.533 avs:10)
leaf: 0x1800514 25167124 (7: row:533.533 avs:11)
leaf: 0x1800515 25167125 (8: row:533.533 avs:10)
leaf: 0x1800516 25167126 (9: row:533.533 avs:10)
leaf: 0x1800517 25167127 (10: row:533.533 avs:11)
leaf: 0x1800519 25167129 (11: row:533.533 avs:10)
leaf: 0x180051a 25167130 (12: row:533.533 avs:10)
leaf: 0x180051b 25167131 (13: row:533.533 avs:11)
leaf: 0x180051c 25167132 (14: row:533.533 avs:10)
leaf: 0x180051d 25167133 (15: row:533.533 avs:10)
leaf: 0x180051e 25167134 (16: row:533.533 avs:11)
leaf: 0x180051f 25167135 (17: row:399.399 avs:2019)
—– end tree dump

We notice there are only 10 or 11 free bytes available (the avs) in most of the leaf blocks, as we created the index with a pctfree of 0.

If we look at a partial block dump of the first leaf block after the deletes:

Block header dump:  0x0180050c
Object id on Block? Y
seg/obj: 0x18244  csc: 0x00.4b9940  itc: 2  flg: E  typ: 2 - INDEX
brn: 0  bdba: 0x1800508 ver: 0x01 opc: 0
inc: 0  exflg: 0

Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0009.000.0000097f  0x0140010a.0186.01  -BU-    1  fsc 0x0000.004b9956
0x02   0x0005.001.00000a03  0x01400ba7.0156.28  ----    0  fsc 0x0000.00000000
Leaf block dump
===============
header address 37170788=0x2372e64
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 1
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 278
kdxcofbo 592=0x250
kdxcofeo 4523=0x11ab
kdxcoavs 3931
kdxlespl 0
kdxlende 0
kdxlenxt 25167140=0x1800524
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[4523] flag: -------, lock: 0, len=12
col 0; len 2; (2):  c1 02
col 1; len 6; (6):  01 80 04 e7 00 00
row#1[4535] flag: -------, lock: 0, len=12
col 0; len 2; (2):  c1 03
col 1; len 6; (6):  01 80 04 e7 00 01

.....

The key thing to note here is that the leaf block has two Interested Transaction List (ITL) slots, each of which uses 24 bytes. Two is the default number of ITL slots in an index leaf block (index branch blocks only have 1 by default) and they are used by transactions to store vital information such as the transaction id, locking information, location of undo and SCN details. However, the first slot (No. 1) is only used by recursive transactions such as those required to perform index block splits and can’t be used for standard user-based transactions. I discuss this in my (in)famous Rebuilding The Truth presentation.

Now my quiz demo had two concurrent delete transactions occurring within the same leaf block(s) but with effectively just the one free ITL slot available for the two transactions. Ordinarily, Oracle would just allocate another ITL slot so that both transactions could concurrently delete the different index entries within the same leaf block. However, Oracle is unable to simply add another ITL slot in this scenario as it requires 24 bytes of free space and there are only the 10 or 11 bytes free in our leaf blocks.

In a similar scenario with a table segment, being unable to allocate another ITL slot like this would result in a nasty ITL wait event for one of the transactions. But for indexes, there is a “naturally occurring” event that results in plenty of additional free space as required.

The index block split.

So rather than have one transaction hang and wait for an ITL slot to become available (i.e. for the other transaction to complete), Oracle simply performs a 50-50 block split and allocates the additional ITL slot as necessary, if both transactions still occur within the same leaf block after the block split.

In my quiz demo, both delete transactions were actually performed on index entries that existed in the other half of the block split. Therefore, the number of ITL slots in the first leaf block remains at just the default two and the kdxlende value which denotes deleted index entries remains at 0.

If we look at a partial block dump of the other half of the leaf block split, now the second logical leaf block (as identified by kdxlenxt 25167140=0x1800524):

Block header dump:  0x01800524
Object id on Block? Y
seg/obj: 0x18244  csc: 0x00.4b9956  itc: 3  flg: E  typ: 2 - INDEX
brn: 1  bdba: 0x1800518 ver: 0x01 opc: 0
inc: 0  exflg: 0

Itl           Xid                  Uba         Flag  Lck        Scn/Fsc
0x01   0x0009.000.0000097f  0x0140010c.0186.01  CB--    0  scn 0x0000.004b9956
0x02   0x0005.001.00000a03  0x01400ba7.0156.28  ----    1  fsc 0x000f.00000000
0x03   0x0003.01b.00000944  0x01400dfd.01f1.19  --U-    1  fsc 0x000e.004b9970
Leaf block dump
===============
header address 37170812=0x2372e7c
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 1
kdxconro 262
kdxcofbo 560=0x230
kdxcofeo 4609=0x1201
kdxcoavs 4049
kdxlespl 0
kdxlende 2
kdxlenxt 25167117=0x180050d
kdxleprv 25167116=0x180050c
kdxledsz 0
kdxlebksz 8012
row#0[4609] flag: -------, lock: 0, len=13
col 0; len 3; (3):  c2 03 50
col 1; len 6; (6):  01 80 04 e7 01 16
row#1[4622] flag: -------, lock: 0, len=13
col 0; len 3; (3):  c2 03 51
col 1; len 6; (6):  01 80 04 e7 01 17

....

row#219[7454] flag: -------, lock: 0, len=13
col 0; len 3; (3):  c2 05 63
col 1; len 6; (6):  01 80 04 e3 00 93
row#220[7467] flag: ---DS--, lock: 2, len=13
col 0; len 3; (3):  c2 05 64
col 1; len 6; (6):  01 80 04 e3 00 94
row#221[7480] flag: ---D---, lock: 3, len=12
col 0; len 2; (2):  c2 06
col 1; len 6; (6):  01 80 04 e3 00 95

....

The first thing we notice is that this leaf block as three, not the default two ITL slots. As both concurrent delete transactions deleted entries from this particular leaf block, an additional ITL slot was allocated as we now have plenty of free space.

The kdxlende value is set to 2 as we now have the two index entries marked as deleted (these are index entries 220 and 221 within the block). Index entry 220 was deleted by the transaction logged in ITL slot 2 and index entry 221 was deleted by the transaction logged in the new ITL slot 3.

So having two concurrent transactions wanting to delete from the same “full” leaf block resulted in the leaf block performing a 50-50 block split with a new leaf block being added to the index in order to accommodate the additional required ITL slot.

I was very careful when deleting rows from the table to cause maximum “damage” to the index. Both delete transactions in my demo effectively deleted every 500th pair of index entries. As there were previously approx. 533 index entries per leaf block, this resulted in every leaf block in the index splitting in this exact manner, as all the leaf blocks had two index entries deleted. This is why the deletes resulted in the index practically doubling in size.

The only index leaf block that didn’t have to split was the very last leaf block as it had plenty of free space (2019 bytes) to accommodate the additional ITL slot. This last leaf block only had 1995 bytes of free space after the deletes, as it lost the 24 bytes due to the additional ITL slot being allocated. You can see these numbers in the tree dumps (following is the tree dump after the delete operations):

—– begin tree dump
branch: 0x180050b 25167115 (0: nrow: 37, level: 1)
leaf: 0x180050c 25167116 (-1: row:278.278 avs:3931)
leaf: 0x1800524 25167140 (0: row:262.260 avs:4049)
leaf: 0x180050d 25167117 (1: row:271.271 avs:3938)
leaf: 0x1800525 25167141 (2: row:262.260 avs:4048)
leaf: 0x180050e 25167118 (3: row:271.271 avs:3938)
leaf: 0x1800526 25167142 (4: row:262.260 avs:4049)
leaf: 0x180050f 25167119 (5: row:271.271 avs:3937)
leaf: 0x1800527 25167143 (6: row:262.260 avs:4049)
leaf: 0x1800510 25167120 (7: row:271.271 avs:3938)
leaf: 0x1800520 25167136 (8: row:262.260 avs:4048)
leaf: 0x1800511 25167121 (9: row:271.271 avs:3938)
leaf: 0x1800521 25167137 (10: row:262.260 avs:4049)
leaf: 0x1800512 25167122 (11: row:271.271 avs:3937)
leaf: 0x1800522 25167138 (12: row:262.260 avs:4049)
leaf: 0x1800513 25167123 (13: row:271.269 avs:3914)
leaf: 0x1800523 25167139 (14: row:262.262 avs:4072)
leaf: 0x1800514 25167124 (15: row:271.269 avs:3914)
leaf: 0x1800529 25167145 (16: row:262.262 avs:4073)
leaf: 0x1800515 25167125 (17: row:271.269 avs:3913)
leaf: 0x180052a 25167146 (18: row:262.262 avs:4073)
leaf: 0x1800516 25167126 (19: row:271.269 avs:3914)
leaf: 0x180052e 25167150 (20: row:262.262 avs:4072)
leaf: 0x1800517 25167127 (21: row:271.269 avs:3914)
leaf: 0x180052f 25167151 (22: row:262.262 avs:4073)
leaf: 0x1800519 25167129 (23: row:271.269 avs:3913)
leaf: 0x180052b 25167147 (24: row:262.262 avs:4073)
leaf: 0x180051a 25167130 (25: row:271.269 avs:3914)
leaf: 0x180052c 25167148 (26: row:262.262 avs:4072)
leaf: 0x180051b 25167131 (27: row:271.269 avs:3914)
leaf: 0x180052d 25167149 (28: row:262.260 avs:4049)
leaf: 0x180051c 25167132 (29: row:271.271 avs:3937)
leaf: 0x1800534 25167156 (30: row:262.260 avs:4049)
leaf: 0x180051d 25167133 (31: row:264.264 avs:4042)
leaf: 0x1800535 25167157 (32: row:269.267 avs:3944)
leaf: 0x180051e 25167134 (33: row:271.271 avs:3938)
leaf: 0x1800536 25167158 (34: row:262.260 avs:4049)
leaf: 0x180051f 25167135 (35: row:399.397 avs:1995)
—– end tree dump

Be very careful allocating a pctfree of 0 to indexes as it may not ultimately help in keeping the indexes as compact as you might have hoped, even if you don’t insert new index entries into the existing full portions of the index.
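
If concurrent DML within the same leaf blocks is expected, one mitigation (a sketch, not a blanket recommendation) is to pre-allocate additional ITL slots and leave a little free space when building the index:

SQL> alter index bowie_id_i rebuild pctfree 5 initrans 4;

Index altered.

Each pre-allocated ITL slot costs 24 bytes per block, a modest premium compared to leaf blocks having to split just to make room for one.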

Thanks to all of those that had a go at the quiz and well done to those that got it right:)

Quiz Time. Why Do Deletes Cause An Index To Grow ? (Up The Hill Backwards) June 25, 2015

Posted by Richard Foote in Oracle Indexes.
14 comments

OK, time for a little quiz.

One of the things I’ve seen at a number of sites is the almost fanatical drive to make indexes as small as possible because indexes that are larger than necessary both waste storage and hurt performance.

Or so the theory goes …   :)

In many cases, this drives DBAs to create or rebuild indexes with a PCTFREE set to 0 as this will make the index as compact and small as possible.

Of course, this is often the very worst setting for keeping an index small, because the insert of a new index entry is likely to cause a 50-50 block split and result in two 1/2 empty leaf blocks (unless the index entry is the maximum current value). Before very long, the index is back to a bloated state and in some sad scenarios, the process is repeated again and again.

A point that is often missed though is that it doesn’t even take an insert to cause the index to expand out. A few delete statements is all that’s required.

To illustrate I create my favorite little table and populate it with a few rows:

SQL> create table bowie (id number, name varchar2(42));

Table created.

SQL> insert into bowie select rownum, 'DAVID BOWIE' from dual connect by level <=10000;

10000 rows created.

SQL> commit;

Commit complete.

OK, time to create an index but because I’m paranoid about having indexes larger than necessary, I’ll create it with a pctfree of 0:

SQL> create index bowie_id_i on bowie(id) pctfree 0;

Index created.

If we look at a tree dump of the index:

SQL> select object_id from dba_objects where object_name='BOWIE_ID_I';

OBJECT_ID
----------
98884

SQL> alter session set events 'immediate trace name treedump level 98884';

Session altered.

—– begin tree dump
branch: 0x180050b 25167115 (0: nrow: 19, level: 1)
leaf: 0x180050c 25167116 (-1: row:540.540 avs:4)
leaf: 0x180050d 25167117 (0: row:533.533 avs:10)
leaf: 0x180050e 25167118 (1: row:533.533 avs:11)
leaf: 0x180050f 25167119 (2: row:533.533 avs:10)
leaf: 0x1800510 25167120 (3: row:533.533 avs:10)
leaf: 0x1800511 25167121 (4: row:533.533 avs:11)
leaf: 0x1800512 25167122 (5: row:533.533 avs:10)
leaf: 0x1800513 25167123 (6: row:533.533 avs:10)
leaf: 0x1800514 25167124 (7: row:533.533 avs:11)
leaf: 0x1800515 25167125 (8: row:533.533 avs:10)
leaf: 0x1800516 25167126 (9: row:533.533 avs:10)
leaf: 0x1800517 25167127 (10: row:533.533 avs:11)
leaf: 0x1800519 25167129 (11: row:533.533 avs:10)
leaf: 0x180051a 25167130 (12: row:533.533 avs:10)
leaf: 0x180051b 25167131 (13: row:533.533 avs:11)
leaf: 0x180051c 25167132 (14: row:533.533 avs:10)
leaf: 0x180051d 25167133 (15: row:533.533 avs:10)
leaf: 0x180051e 25167134 (16: row:533.533 avs:11)
leaf: 0x180051f 25167135 (17: row:399.399 avs:2019)
----- end tree dump

We note the index has just 19 leaf blocks and that most leaf blocks hold 533 index entries with an avs (available free space) of only some 10 or 11 bytes. Only the last leaf block is partially filled, with some 2019 free bytes.
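As a quick cross-check (an aside on my part; index statistics have been collected automatically at creation time since 10g), the data dictionary tells the same story:

SQL> select blevel, leaf_blocks from user_indexes where index_name = 'BOWIE_ID_I';

    BLEVEL LEAF_BLOCKS
---------- -----------
         1          19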

That’s fantastic, the index really is as small as can be. Trying to use index compression would be futile, as the indexed values are effectively unique.

I’m now going to delete just a few rows. Surely deleting rows from the table (and hence entries from the index) can only have a positive impact (if any) on the index structure.

In one session, I delete 20 rows:

SQL> delete bowie where id in (select rownum*499 from dual connect by level<=20);

20 rows deleted.

Meanwhile, in another session, I delete another 20 different rows:

SQL> delete bowie where id in (select rownum*500 from dual connect by level<=20); 

20 rows deleted.
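As an aside, if the CONNECT BY row-generator trick is unfamiliar, the subqueries above simply generate the first 20 multiples of 499 and 500 respectively; you can run them standalone to see exactly which ids are being deleted:

SQL> select rownum*499 id from dual connect by level <= 3;

        ID
----------
       499
       998
      1497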

I then commit the delete statements in both sessions.

So I’ve just deleted a total of 40 rows out of the 10000 rows in the table.

If I look at a tree dump of the index now:

----- begin tree dump
branch: 0x180050b 25167115 (0: nrow: 37, level: 1)
leaf: 0x180050c 25167116 (-1: row:278.278 avs:3931)
leaf: 0x1800524 25167140 (0: row:262.260 avs:4049)
leaf: 0x180050d 25167117 (1: row:271.271 avs:3938)
leaf: 0x1800525 25167141 (2: row:262.260 avs:4048)
leaf: 0x180050e 25167118 (3: row:271.271 avs:3938)
leaf: 0x1800526 25167142 (4: row:262.260 avs:4049)
leaf: 0x180050f 25167119 (5: row:271.271 avs:3937)
leaf: 0x1800527 25167143 (6: row:262.260 avs:4049)
leaf: 0x1800510 25167120 (7: row:271.271 avs:3938)
leaf: 0x1800520 25167136 (8: row:262.260 avs:4048)
leaf: 0x1800511 25167121 (9: row:271.271 avs:3938)
leaf: 0x1800521 25167137 (10: row:262.260 avs:4049)
leaf: 0x1800512 25167122 (11: row:271.271 avs:3937)
leaf: 0x1800522 25167138 (12: row:262.260 avs:4049)
leaf: 0x1800513 25167123 (13: row:271.269 avs:3914)
leaf: 0x1800523 25167139 (14: row:262.262 avs:4072)
leaf: 0x1800514 25167124 (15: row:271.269 avs:3914)
leaf: 0x1800529 25167145 (16: row:262.262 avs:4073)
leaf: 0x1800515 25167125 (17: row:271.269 avs:3913)
leaf: 0x180052a 25167146 (18: row:262.262 avs:4073)
leaf: 0x1800516 25167126 (19: row:271.269 avs:3914)
leaf: 0x180052e 25167150 (20: row:262.262 avs:4072)
leaf: 0x1800517 25167127 (21: row:271.269 avs:3914)
leaf: 0x180052f 25167151 (22: row:262.262 avs:4073)
leaf: 0x1800519 25167129 (23: row:271.269 avs:3913)
leaf: 0x180052b 25167147 (24: row:262.262 avs:4073)
leaf: 0x180051a 25167130 (25: row:271.269 avs:3914)
leaf: 0x180052c 25167148 (26: row:262.262 avs:4072)
leaf: 0x180051b 25167131 (27: row:271.269 avs:3914)
leaf: 0x180052d 25167149 (28: row:262.260 avs:4049)
leaf: 0x180051c 25167132 (29: row:271.271 avs:3937)
leaf: 0x1800534 25167156 (30: row:262.260 avs:4049)
leaf: 0x180051d 25167133 (31: row:264.264 avs:4042)
leaf: 0x1800535 25167157 (32: row:269.267 avs:3944)
leaf: 0x180051e 25167134 (33: row:271.271 avs:3938)
leaf: 0x1800536 25167158 (34: row:262.260 avs:4049)
leaf: 0x180051f 25167135 (35: row:399.397 avs:1995)
----- end tree dump

I notice the index has basically doubled in size (37 leaf blocks, up from the original 19), with most leaf blocks now half empty.
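To put some numbers on the bloat (again an aside of my own, not part of the original quiz; note that ANALYZE ... VALIDATE STRUCTURE locks the table while it runs and INDEX_STATS is only populated for your own session):

SQL> analyze index bowie_id_i validate structure;

Index analyzed.

SQL> select lf_blks, del_lf_rows, pct_used from index_stats;

LF_BLKS should now report 37, DEL_LF_ROWS should show the 40 deleted index entries still physically present in the leaf blocks, and PCT_USED should have dropped to around the 50% mark, matching the half-empty leaf blocks in the tree dump.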

So, the key question: how could deleting just a handful of rows cause my beautifully compact index to bloat to double its size???

I’ve given away one clue within the post. I’ll post the answer in the coming days :)
