Oracle 19c Automatic Indexing: Indexing Partitioned Tables Part II (Neighbourhood Threat) January 13, 2021
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Transaction Processing, CBO, Exadata, Local Indexes, Oracle, Oracle Cloud, Oracle General, Oracle Indexes, Oracle19c, Partitioned Indexes, Partitioning.
In my first post on Automatic Indexing on Partitioned Tables, I discussed how Automatic Indexing (AI) can now create a Non-Partitioned index if that’s deemed the most effective indexing structure (this wasn’t supported when AI was initially released). A Non-Partitioned index is indeed likely the most efficient indexing structure if the underlying table has many partitions and the associated SQL equality predicates only reference non-partition key columns. A Non-Partitioned index ensures Oracle only needs to scan the single index structure and not all the partitions of a Local index.
But what if SQLs do reference the column by which the underlying table is partitioned?
The following SQL has an equality filtering predicate on the RELEASE_DATE column, the column by which the BIG_BOWIE1 table is partitioned:
SQL> SELECT * FROM big_bowie1 where release_date = to_date('2013-12-30 22:15:25', 'syyyy-mm-dd hh24:mi:ss');

no rows selected
If we look at the subsequent AI report:
INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: *: invisible ------------------------------------------------------------------------------- ------------------------------------------------------------------------------------------------ | Owner | Table | Index | Key | Type | Properties | ------------------------------------------------------------------------------------------------ | BOWIE | BIG_BOWIE1 | SYS_AI_14gpurjp8m76s | RELEASE_DATE | B-TREE | LOCAL | ------------------------------------------------------------------------------------------------ -------------------------------------------------------------------------------
We notice that Automatic Indexing has in this instance created a Local Index.
If we look further down the AI report:
------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : 4mm3mbkk38pa8 SQL Text : SELECT * FROM big_bowie1 where release_date = to_date('2013-12-30 22:15:25', 'syyyy-mm-dd hh24:mi:ss') Improvement Factor : 8339x Execution Statistics: ----------------------------- Original Plan Auto Index Plan ---------------------------- ---------------------------- Elapsed Time (s): 146957 71 CPU Time (s): 146124 71 Buffer Gets: 16678 3 Optimizer Cost: 162 4 Disk Reads: 0 0 Direct Writes: 0 0 Rows Processed: 0 0 Executions: 2 1 PLANS SECTION --------------------------------------------------------------------------------------------- - Original ----------------------------- Plan Hash Value : 4031749531 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 162 | | | 1 | PARTITION RANGE SINGLE | | 3602 | 93652 | 162 | 00:00:01 | | 2 | TABLE ACCESS STORAGE FULL | BIG_BOWIE1 | 3602 | 93652 | 162 | 00:00:01 | ----------------------------------------------------------------------------------- Notes ----- - dop = 1 - px_in_memory_imc = no - px_in_memory = no - With Auto Indexes ----------------------------- Plan Hash Value : 4049653350 -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 3 | 78 | 4 | 00:00:01 | | 1 | PARTITION RANGE SINGLE | | 3 | 78 | 4 | 00:00:01 | | 2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_BOWIE1 | 3 | 78 | 4 | 00:00:01 | | * 3 | INDEX RANGE SCAN | SYS_AI_14gpurjp8m76s | 1 | | 3 | 00:00:01 | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ------------------------------------------ * 3 - access("RELEASE_DATE"=TO_DATE(' 2013-12-30 22:15:25', 'syyyy-mm-dd hh24:mi:ss')) Notes ----- - Dynamic sampling used for this statement ( level = 11 )
We can see Automatic Indexing has created the index because it provides an average Improvement Factor of 8339x. As the necessary indexed column matches the table partitioning key, it makes sense for the associated index to be a Local index, as Oracle knows exactly which specific index partition to visit based on the value of the equality predicate.
If we look at the details of this new AI:
SQL> select index_name, auto, constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name='BIG_BOWIE1'; INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ------------------------------ --- --- --------- ------------- -------- ---------- ----------- ----------------- SYS_AI_14gpurjp8m76s YES NO VISIBLE ADVANCED LOW N/A 20000000 30742 19941449 SYS_AI_8armv0hqq73fa YES NO VISIBLE ADVANCED LOW VALID 20000000 42697 19995451 SQL> select index_name, column_name, column_position from user_ind_columns where table_name='BIG_BOWIE1' order by index_name, column_position; INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------------------------ --------------- --------------- SYS_AI_14gpurjp8m76s RELEASE_DATE 1 SYS_AI_8armv0hqq73fa TOTAL_SALES 1 SQL> SELECT index_name, partitioning_type, partition_count, locality FROM user_part_indexes WHERE table_name = 'BIG_BOWIE1'; INDEX_NAME PARTITION PARTITION_COUNT LOCALI ------------------------------ --------- --------------- ------ SYS_AI_14gpurjp8m76s RANGE 8 LOCAL SQL> select index_name, partition_name, status, compression from user_ind_partitions where index_name in (select index_name from user_indexes where table_name='BIG_BOWIE1') order by partition_position; INDEX_NAME PARTITION_NAME STATUS COMPRESSION -------------------- -------------------- -------- ------------- SYS_AI_14gpurjp8m76s ALBUMS_2013 USABLE ADVANCED LOW SYS_AI_14gpurjp8m76s ALBUMS_2014 USABLE ADVANCED LOW SYS_AI_14gpurjp8m76s ALBUMS_2015 USABLE ADVANCED LOW SYS_AI_14gpurjp8m76s ALBUMS_2016 USABLE ADVANCED LOW SYS_AI_14gpurjp8m76s ALBUMS_2017 USABLE ADVANCED LOW SYS_AI_14gpurjp8m76s ALBUMS_2018 USABLE ADVANCED LOW SYS_AI_14gpurjp8m76s ALBUMS_2019 USABLE ADVANCED LOW SYS_AI_14gpurjp8m76s ALBUMS_2020 USABLE ADVANCED LOW
We can see that indeed, a Visible, Usable, Local index was created by Automatic Indexing.
So depending on the column(s) within the index, Automatic Indexing can potentially create either a Local or Non-Partitioned index when indexing a partitioned table.
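If you just want a quick summary of which automatic indexes on the table ended up Local and which Non-Partitioned, a dictionary query along the following lines does the job (a simple sketch, using the same USER_INDEXES and USER_PART_INDEXES views queried above):

SQL> select i.index_name, i.auto, i.partitioned, p.locality, p.partition_count
     from   user_indexes i
            left outer join user_part_indexes p on p.index_name = i.index_name
     where  i.table_name = 'BIG_BOWIE1'
     and    i.auto = 'YES';

A NULL LOCALITY simply means the automatic index isn’t partitioned at all.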
Oracle 19c Automatic Indexing: Indexing Partitioned Tables Part I (Conversation Piece) October 14, 2020
Posted by Richard Foote in 19c, 19c New Features, Automatic Indexing, Autonomous Data Warehouse, Autonomous Database, Autonomous Transaction Processing, CBO, Exadata, Index Access Path, Local Indexes, Oracle, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Partitioned Indexes, Partitioning, Performance Tuning.
In this little series, I’m going to discuss how Automatic Indexing works in relation to Partitioning.
I’ve discussed Indexing and Partitioning many times previously and how Oracle has various options when indexing a partitioned table:
- Non-Partitioned Index
- Globally Partitioned Index
- Locally Partitioned Index
So the question is: how does Automatic Indexing handle scenarios with partitioned objects?
A very important point to make at the start is that, based on my research, the answer has already changed significantly since Automatic Indexing was first released. So it’s important to understand that Automatic Indexing is an ever-evolving capability that will advance and improve as time goes on.
I’ll focus on how the feature currently works (as of Oracle Database 19.5), but will mention previously identified behaviour as a reference on how things can easily change.
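As an aside, the demos in this series assume Automatic Indexing is actually implementing indexes for the schema in question. If you want to follow along, the current AI settings can be checked via DBA_AUTO_INDEX_CONFIG and the feature enabled along these lines (a minimal sketch of the DBMS_AUTO_INDEX.CONFIGURE calls, not the exact commands from my environment):

SQL> select parameter_name, parameter_value from dba_auto_index_config;

SQL> exec dbms_auto_index.configure('AUTO_INDEX_MODE', 'IMPLEMENT');

SQL> exec dbms_auto_index.configure('AUTO_INDEX_SCHEMA', 'BOWIE', TRUE);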
In my first simple little example, I’m just going to create a range-partitioned table, partitioned based on RELEASE_DATE, with a partition for each year’s worth of data:
SQL> CREATE TABLE big_bowie1(id number, album_id number, country_id number, release_date date, total_sales number)
     PARTITION BY RANGE (release_date)
     (PARTITION ALBUMS_2013 VALUES LESS THAN (TO_DATE('01-JAN-2014', 'DD-MON-YYYY')),
      PARTITION ALBUMS_2014 VALUES LESS THAN (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')),
      PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')),
      PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')),
      PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')),
      PARTITION ALBUMS_2018 VALUES LESS THAN (TO_DATE('01-JAN-2019', 'DD-MON-YYYY')),
      PARTITION ALBUMS_2019 VALUES LESS THAN (TO_DATE('01-JAN-2020', 'DD-MON-YYYY')),
      PARTITION ALBUMS_2020 VALUES LESS THAN (MAXVALUE));

Table created.
I’ll now add about 8 years’ worth of data:
SQL> INSERT INTO big_bowie1 SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800), ceil(dbms_random.value(1,500000)) FROM dual CONNECT BY LEVEL <= 10000000;

10000000 rows created.

SQL> COMMIT;

Commit complete.
As discussed previously, I’ll make sure to collect statistics:
SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_BOWIE1'); PL/SQL procedure successfully completed.
I’ll now run the following very selective query based on the TOTAL_SALES column, which is NOT part of the partitioning key:
SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42; 19 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2468051548 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 520 | 643 (15)| 00:00:01 | | | | 1 | PARTITION RANGE ALL | | 20 | 520 | 643 (15)| 00:00:01 | 1 | 8 | |* 2 | TABLE ACCESS STORAGE FULL| BIG_BOWIE1 | 20 | 520 | 643 (15)| 00:00:01 | 1 | 8 | --------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - storage("TOTAL_SALES"=42) filter("TOTAL_SALES"=42) Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 44014 consistent gets 9516 physical reads 0 redo size 1107 bytes sent via SQL*Net to client 369 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 19 rows processed
Without an index in place, the CBO has no choice but to use a FTS. But what will Automatic Indexing make of things?
If we look at the next Automatic Indexing report:
SQL> select dbms_auto_index.report_last_activity() from dual; GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 13-OCT-2020 01:47:48 Activity end : 13-OCT-2020 02:59:48 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 1 Indexes created (visible / invisible) : 1 (1 / 0) Space used (visible / invisible) : 184.55 MB (184.55 MB / 0 B) Indexes dropped : 0 SQL statements verified : 2 SQL statements improved (improvement factor) : 1 (44119.6x) SQL plan baselines created : 0 Overall improvement factor : 25135.8x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0 ------------------------------------------------------------------------------- INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: *: invisible ------------------------------------------------------------------------------- --------------------------------------------------------------------------------- | Owner | Table | Index | Key | Type | Properties | --------------------------------------------------------------------------------- | BOWIE | BIG_BOWIE1 | SYS_AI_2zt7rg40mxa4n | TOTAL_SALES | B-TREE | NONE | --------------------------------------------------------------------------------- ------------------------------------------------------------------------------- VERIFICATION DETAILS ------------------------------------------------------------------------------- The performance of the following statements improved: ------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : chwm2gubm8fx9 SQL Text : SELECT * FROM big_bowie1 WHERE total_sales = 42 Improvement Factor : 44119.6x Execution Statistics: ----------------------------- Original Plan Auto Index Plan ---------------------------- ---------------------------- Elapsed Time (s): 4387193 1173 CPU Time (s): 2599423 1037 Buffer Gets: 749507 22 Optimizer Cost: 643 22 Disk Reads: 470976 2 Direct Writes: 0 0 Rows Processed: 323 19 Executions: 17 1 PLANS SECTION --------------------------------------------------------------------------------------------- - Original ----------------------------- Plan Hash Value : 2468051548 ----------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 643 | | | 1 | PARTITION RANGE ALL | | 20 | 520 | 643 | 00:00:01 | | 2 | TABLE ACCESS STORAGE FULL | BIG_BOWIE1 | 20 | 520 | 643 | 00:00:01 | ----------------------------------------------------------------------------------- Notes ----- - dop = 1 - px_in_memory_imc = no - px_in_memory = no - With Auto Indexes ----------------------------- Plan Hash Value : 937174207 -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | 
-------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 19 | 494 | 22 | 00:00:01 | | 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE1 | 19 | 494 | 22 | 00:00:01 | | * 2 | INDEX RANGE SCAN | SYS_AI_2zt7rg40mxa4n | 19 | | 3 | 00:00:01 | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ------------------------------------------ * 2 - access("TOTAL_SALES"=42) Notes ----- - Dynamic sampling used for this statement ( level = 11 )
We notice a couple of interesting points.
Firstly, yes Automatic Indexing has created an index based on the TOTAL_SALES column (SYS_AI_2zt7rg40mxa4n) as it improves performance by a reported 44119.6x.
Note also that the Automatic Index is a Non-Partitioned (Global) Index. From a performance perspective, this is the most efficient index to create to improve the performance of this query, as the CBO only has the one index structure to navigate (vs. a Local index, which would require navigating down all 8 index structures, one for each table partition).
If we look at the index details:
SQL> SELECT index_name, partitioned, auto, visibility, status FROM user_indexes WHERE table_name = 'BIG_BOWIE1';

INDEX_NAME                     PAR AUT VISIBILIT STATUS
------------------------------ --- --- --------- --------
SYS_AI_2zt7rg40mxa4n           NO  YES VISIBLE   VALID
We notice that this is indeed a Non-Partitioned Index that is both VISIBLE and VALID, and so can potentially be used by any database session.
If we now re-run the query:
SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42; 19 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 937174207 ----------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ----------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 520 | 23 (0)| 00:00:01 | | | | 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| BIG_BOWIE1 | 20 | 520 | 23 (0)| 00:00:01 | ROWID | ROWID | |* 2 | INDEX RANGE SCAN | SYS_AI_2zt7rg40mxa4n | 20 | | 3 (0)| 00:00:01 | | | ----------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("TOTAL_SALES"=42) Note ----- - automatic DOP: Computed Degree of Parallelism is 1 Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 23 consistent gets 0 physical reads 0 redo size 1166 bytes sent via SQL*Net to client 369 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 19 rows processed
We can see the query now uses the newly created Automatic Index and is indeed more efficient, performing now just 23 consistent gets (previously 44014 consistent gets).
However, this was NOT previous behaviour.
The documentation previously mentioned that only LOCAL indexes are used when indexing partitioned tables.
If we run the same demo on Oracle Database 19.3, we get the following report:
GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 14-OCT-2020 13:12:07 Activity end : 14-OCT-2020 14:24:07 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 1 Indexes created (visible / invisible) : 1 (1 / 0) Space used (visible / invisible) : 192.94 MB (192.94 MB / 0 B) Indexes dropped : 0 SQL statements verified : 1 SQL statements improved (improvement factor) : 1 (1950.5x) SQL plan baselines created : 0 Overall improvement factor : 1950.5x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0 ------------------------------------------------------------------------------- INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: *: invisible ------------------------------------------------------------------------------- --------------------------------------------------------------------------------- | Owner | Table | Index | Key | Type | Properties | --------------------------------------------------------------------------------- | BOWIE | BIG_BOWIE1 | SYS_AI_8armv0hqq73fa | TOTAL_SALES | B-TREE | LOCAL | --------------------------------------------------------------------------------- ------------------------------------------------------------------------------- VERIFICATION DETAILS ------------------------------------------------------------------------------- The performance of the following statements improved: ------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : 2pp8ypramw30s SQL Text : SELECT * FROM big_bowie1 WHERE total_sales = 42 Improvement Factor : 1950.5x Execution Statistics: ----------------------------- Original Plan Auto Index Plan ---------------------------- ---------------------------- Elapsed Time (s): 6996973 27327 CPU Time (s): 6704215 12819 Buffer Gets: 815306 49 Optimizer Cost: 12793 28 Disk Reads: 2 40 Direct Writes: 0 0 Rows Processed: 475 25 Executions: 19 1 PLANS SECTION --------------------------------------------------------------------------------------------- - Original ----------------------------- Plan Hash Value : 4294056405 ----------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | ----------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 12793 | | | 1 | PARTITION RANGE ALL | | 20 | 520 | 12793 | 00:00:01 | | 2 | TABLE ACCESS FULL | BIG_BOWIE1 | 20 | 520 | 12793 | 00:00:01 | ----------------------------------------------------------------------------- - With Auto Indexes ----------------------------- Plan Hash Value : 3781269341 -------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | -------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 25 | 650 | 28 | 00:00:01 | | 1 | PARTITION RANGE ALL | | 25 | 650 | 28 | 00:00:01 | | 
2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_BOWIE1 | 25 | 650 | 28 | 00:00:01 | | * 3 | INDEX RANGE SCAN | SYS_AI_8armv0hqq73fa | 25 | | 17 | 00:00:01 | -------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ------------------------------------------ * 3 - access("TOTAL_SALES"=42) Notes ----- - Dynamic sampling used for this statement ( level = 11 )
As we can see, in this scenario, the newly created Automatic Index has a “Property” of LOCAL.
If we look at its index details:
SQL> SELECT index_name, partitioned, auto, visibility, status FROM user_indexes WHERE table_name = 'BIG_BOWIE1'; INDEX_NAME PAR AUT VISIBILIT STATUS ------------------------------ --- --- --------- -------- SYS_AI_8armv0hqq73fa YES YES VISIBLE N/A SQL> SELECT index_name, partitioning_type, partition_count, locality FROM user_part_indexes WHERE table_name = 'BIG_BOWIE1'; INDEX_NAME PARTITION PARTITION_COUNT LOCALI ------------------------------ --------- --------------- ------ SYS_AI_8armv0hqq73fa RANGE 8 LOCAL
We can see how a Local Index was previously created.
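Note that the N/A STATUS at the overall index level is expected for a partitioned index, as usability is tracked per index partition. A quick per-partition check, along the lines of the following sketch (reusing the generated index name above), would confirm whether each partition is USABLE:

SQL> select partition_name, status
     from   user_ind_partitions
     where  index_name = 'SYS_AI_8armv0hqq73fa'
     order  by partition_position;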
As such if we re-run an equivalent query:
SQL> SELECT * FROM big_bowie1 WHERE total_sales = 42; 25 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3781269341 ----------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ----------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 520 | 26 (0)| 00:00:01 | | | | 1 | PARTITION RANGE ALL | | 20 | 520 | 26 (0)| 00:00:01 | 1 | 8 | | 2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED| BIG_BOWIE1 | 20 | 520 | 26 (0)| 00:00:01 | 1 | 8 | |* 3 | INDEX RANGE SCAN | SYS_AI_8armv0hqq73fa | 20 | | 17 (0)| 00:00:01 | 1 | 8 | ----------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 50 consistent gets 0 physical reads 0 redo size 1555 bytes sent via SQL*Net to client 409 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 25 rows processed
Although the query returns 6 more rows (the random number generation produced a slightly different data set), it’s proportionally more expensive, now having to perform 50 consistent gets as it has to read 8 index structures rather than just the one.
So (IMHO), Automatic Indexing has improved here, creating a more efficient index structure than previously. So always bear in mind that Automatic Indexing is an evolving beast, improving and adapting as time moves on.
However, note the compromise here: by having what is effectively a Global index structure, there may be some additional issues depending on any subsequent structural changes to the table.
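For example (purely illustrative, not something performed in this demo), dropping an old table partition leaves orphaned entries in a Non-Partitioned index that Oracle then has to clean up asynchronously, whereas a corresponding Local index partition would simply be dropped along with the table partition:

SQL> alter table big_bowie1 drop partition albums_2013 update indexes;

SQL> select index_name, status, orphaned_entries
     from   user_indexes
     where  table_name = 'BIG_BOWIE1';

The UPDATE INDEXES clause keeps the Non-Partitioned index usable, with its orphaned entries flagged in USER_INDEXES.ORPHANED_ENTRIES and tidied up later by the asynchronous global index maintenance job (or on demand via DBMS_PART.CLEANUP_GIDX).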
More on Automatic Indexing and Partitioning in my next post…
“Hidden” Efficiencies of Non-Partitioned Indexes on Partitioned Tables Part IV (Hallo Spaceboy) October 31, 2018
Posted by Richard Foote in Global Indexes, Local Indexes, Oracle Indexes, Partitioned Indexes.
In Part I, Part II and Part III we looked at some advantages of Global Indexes that may not be obvious to some.
One of the advantages of a Local Index vs. a Non-Partitioned Global Index is that a Local Index, being made up of smaller index structures, may have a reduced BLEVEL in comparison. This can save a logical read each and every time the index is accessed.
However, if this is a performance concern when using a corresponding Global Index, it’s a key reason why Global Indexes can likewise be partitioned.
As we saw in the demo in Part III, when the Global Index is used in a query with a predicate on the table partition key:
SQL> SELECT * FROM big_bowie WHERE total_sales = 42 and release_date between '01-JAN-2017' and '31-JUL-2017'; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 1000041 42 42 20-JUL-17 42 Execution Plan ---------------------------------------------------------- Plan hash value: 1081241859 -------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | -------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 25 | 13 (0) | 00:00:01 | | | |* 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE | 1 | 25 | 13 (0) | 00:00:01 | 7 | 7 | |* 2 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_I | 10 | | 3 (0) | 00:00:01 | | | -------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RELEASE_DATE"=TO_DATE('2017-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 2 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 5 consistent gets 0 physical reads 0 redo size 885 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
The query required 5 consistent gets.
But when the Local Index is used with a reduced BLEVEL:
SQL> SELECT * FROM big_bowie WHERE total_sales = 42 and release_date between '01-JAN-2017' and '31-JUL-2017'; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 1000041 42 42 20-JUL-17 42 Execution Plan ---------------------------------------------------------- Plan hash value: 3499166408 -------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | -------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 25 | 2 (0) | 00:00:01 | | | | 1 | PARTITION RANGE SINGLE | | 1 | 25 | 2 (0) | 00:00:01 | 7 | 7 | |* 2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_BOWIE | 1 | 25 | 2 (0) | 00:00:01 | 7 | 7 | |* 3 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_LOCAL_I | 1 | | 1 (0) | 00:00:01 | 7 | 7 | -------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("RELEASE_DATE"<=TO_DATE(' 2017-07-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 3 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 885 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
The query used just 4 consistent gets (vs. 5) as a result of the reduction of 1 for the BLEVEL.
So to have the best of both worlds, excellent performance when the query doesn’t reference the table partition key columns in a predicate and excellent performance matching Local Indexes when the table partition key is specified, a Global Index can also be partitioned into many smaller index structures.
However, unlike a Local Index, a Global Partitioned Index can be partitioned in a manner totally different to that of the table (indeed, the table doesn’t even have to be partitioned).
In this example, the Global Index on the TOTAL_SALES column is partitioned based on TOTAL_SALES (unlike the table which is partitioned based on RELEASE_DATE) and partitioned into 16 partitions (unlike the table which has 8 partitions):
SQL> CREATE INDEX big_bowie_total_sales_global_i ON big_bowie(total_sales) 2 GLOBAL PARTITION BY RANGE (total_sales) 3 (PARTITION P1 VALUES LESS THAN (12501), 4 PARTITION P2 VALUES LESS THAN (25001), 5 PARTITION P3 VALUES LESS THAN (37501), 6 PARTITION P4 VALUES LESS THAN (50001), 7 PARTITION P5 VALUES LESS THAN (62501), 8 PARTITION P6 VALUES LESS THAN (75001), 9 PARTITION P7 VALUES LESS THAN (87501), 10 PARTITION P8 VALUES LESS THAN (100001), 11 PARTITION P9 VALUES LESS THAN (112501), 12 PARTITION P10 VALUES LESS THAN (125001), 13 PARTITION P11 VALUES LESS THAN (137501), 14 PARTITION P12 VALUES LESS THAN (150001), 15 PARTITION P13 VALUES LESS THAN (162501), 16 PARTITION P14 VALUES LESS THAN (175001), 17 PARTITION P15 VALUES LESS THAN (187501), 18 PARTITION P16 VALUES LESS THAN (MAXVALUE)) invisible; Index created. SQL> select index_name, partition_name, blevel, leaf_blocks from dba_ind_partitions where index_name='BIG_BOWIE_TOTAL_SALES_GLOBAL_I'; INDEX_NAME PARTITION_NAME BLEVEL LEAF_BLOCKS ------------------------------ -------------------- ---------- ----------- BIG_BOWIE_TOTAL_SALES_GLOBAL_I P1 1 335 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P10 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P11 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P12 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P13 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P14 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P15 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P16 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P2 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P3 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P4 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P5 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P6 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P7 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P8 1 349 BIG_BOWIE_TOTAL_SALES_GLOBAL_I P9 1 349
We notice that each Global Index partition now only has a BLEVEL of 1, the same as the corresponding Local Index.
As such, the performance of the Global Index now matches that of the Local Index when the table partition key is referenced in an SQL predicate:
SQL> alter index BIG_BOWIE_TOTAL_SALES_LOCAL_I invisible; Index altered. SQL> alter index BIG_BOWIE_TOTAL_SALES_GLOBAL_I visible; Index altered. SQL> SELECT * FROM big_bowie WHERE total_sales = 42 and release_date between '01-JAN-2017' and '31-JUL-2017'; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 1000041 42 42 20-JUL-17 42 Execution Plan ---------------------------------------------------------- Plan hash value: 2458305506 ---------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | ---------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 25 | 11 (0) | 00:00:01 | | | | 1 | PARTITION RANGE SINGLE | | 1 | 25 | 11 (0) | 00:00:01 | 1 | 1 | |* 2 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE | 1 | 25 | 11 (0) | 00:00:01 | 7 | 7 | |* 3 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_GLOBAL_I | 1 | | 1 (0) | 00:00:01 | 1 | 1 | ---------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("RELEASE_DATE"=TO_DATE(' 2017-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 3 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 885 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
So Global Indexes can perform optimally, regardless of whether the table partition key is specified in a predicate or not.
The same can’t always be said for a corresponding Local Index.
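As a footnote, if picking 16 range boundaries on TOTAL_SALES feels arbitrary, a hash-partitioned Global Index gives much the same effect of many smaller, reduced-BLEVEL index structures without having to choose the boundaries yourself (a sketch only, with a made-up index name, not part of the demo above):

SQL> CREATE INDEX big_bowie_total_sales_hash_i ON big_bowie(total_sales)
     GLOBAL PARTITION BY HASH (total_sales)
     PARTITIONS 16;

The usual caveat applies: hash partitioning only prunes for equality (or IN) style predicates on the partitioning key, which suits the TOTAL_SALES = 42 style queries used throughout this series.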
“Hidden” Efficiencies of Non-Partitioned Indexes on Partitioned Tables Part III (Ricochet) October 25, 2018
Posted by Richard Foote in Block Dumps, Global Indexes, Local Indexes, Oracle Indexes, Partitioned Indexes, Partitioning.
In Part I and Part II of this series, we looked at how Global Indexes can effectively perform “Partition Pruning” when the partition keys are specified in SQL predicates, by only using those index entries that have a Data Object Id of interest stored within the index Rowids.
In this piece, I’ll cover the key performance advantage that Global Indexes have over Local Indexes and why I generally recommend Global Indexes from a purely performance perspective.
First, a quick recap of how the Global Index performed. Following is the performance of a query where the table partition key is specified:
SQL> SELECT * FROM big_bowie WHERE total_sales = 42 and release_date between '01-JAN-2017' and '31-JUL-2017'; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 1000041 42 42 20-JUL-17 42 Execution Plan ---------------------------------------------------------- Plan hash value: 1081241859 -------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | -------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 25 | 13 (0) | 00:00:01 | | | |* 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE | 1 | 25 | 13 (0) | 00:00:01 | 7 | 7 | |* 2 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_I | 10 | | 3 (0) | 00:00:01 | | | -------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RELEASE_DATE"=TO_DATE('2017-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 2 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 5 consistent gets 0 physical reads 0 redo size 885 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
As discussed previously, at just 5 consistent gets it’s very efficient, as only the table blocks that reside in partitions of interest are accessed.
The following query selects all TOTAL_SALES values of interest, with no partition key predicate:
SQL> SELECT * FROM big_bowie WHERE total_sales = 42; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 400041 42 42 28-JAN-12 42 1800041 42 42 28-JAN-12 42 800041 42 42 03-MAR-13 42 1200041 42 42 07-APR-14 42 1600041 42 42 12-MAY-15 42 200041 42 42 12-MAY-15 42 600041 42 42 15-JUN-16 42 1000041 42 42 20-JUL-17 42 41 42 42 24-AUG-18 42 1400041 42 42 24-AUG-18 42 10 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1761527485 -------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | -------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 250 | 13 (0) | 00:00:01 | | | | 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE | 10 | 250 | 13 (0) | 00:00:01 | ROWID | ROWID | |* 2 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_I | 10 | | 3 (0) | 00:00:01 | | | -------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 14 consistent gets 0 physical reads 0 redo size 1184 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
So the index scan costs 4 consistent gets, plus 10 consistent gets for the 10 rows accessed from the table (as the clustering here is poor), for a total of 14 consistent gets.
Let’s now compare this to an equivalent Local Index.
SQL> create index big_bowie_total_sales_local_i on big_bowie(total_sales) local invisible;

Index created.

SQL> alter index big_bowie_total_sales_i invisible;

Index altered.

SQL> alter index big_bowie_total_sales_local_i visible;

Index altered.
If we compare the size characteristics between the two indexes we notice a couple of important differences:
SQL> select index_name, blevel, leaf_blocks from dba_indexes where table_name='BIG_BOWIE'; INDEX_NAME BLEVEL LEAF_BLOCKS ------------------------------ ---------- ----------- BIG_BOWIE_TOTAL_SALES_I 2 5585 BIG_BOWIE_TOTAL_SALES_LOCAL_I 1 4444 SQL> select index_name, partition_name, blevel, leaf_blocks from dba_ind_partitions where index_name='BIG_BOWIE_TOTAL_SALES_LOCAL_I'; INDEX_NAME PARTITION_NAME BLEVEL LEAF_BLOCKS ------------------------------ -------------------- ---------- ----------- BIG_BOWIE_TOTAL_SALES_LOCAL_I ALBUMS_2011 1 525 BIG_BOWIE_TOTAL_SALES_LOCAL_I ALBUMS_2012 1 581 BIG_BOWIE_TOTAL_SALES_LOCAL_I ALBUMS_2013 1 579 BIG_BOWIE_TOTAL_SALES_LOCAL_I ALBUMS_2014 1 579 BIG_BOWIE_TOTAL_SALES_LOCAL_I ALBUMS_2015 1 579 BIG_BOWIE_TOTAL_SALES_LOCAL_I ALBUMS_2016 1 581 BIG_BOWIE_TOTAL_SALES_LOCAL_I ALBUMS_2017 1 580 BIG_BOWIE_TOTAL_SALES_LOCAL_I ALBUMS_2018 1 440 8 rows selected.
The first difference is that the corresponding Local index segments have a reduced BLEVEL (just 1) when compared to the Global Index (value of 2). A reduction in BLEVEL is quite possible as instead of one “big” index segment, we now have 8 “smaller” index segments.
However, if we look at the overall size of both indexes, we notice that the Local Index (at 4444 leaf blocks) is somewhat smaller than the Global Index (5585 leaf blocks). This is because the Rowids of a Local Index don’t have to be the extended 10 byte Global Index version (which contains the 4 byte Data Object Id), but can be the standard 6 byte version. A Local Index can only reference the one table partition and so it’s unnecessary to store the corresponding Data Object Id within the Rowid.
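A rough back-of-the-envelope check shows those numbers stack up (assuming an 8K block size, which is consistent with the kdxlebksz 8036 value visible in the block dumps that follow):

2,000,000 index entries x 4 bytes (Data Object Id) = 8,000,000 bytes, or roughly 980 x 8K blocks

Observed difference: 5585 - 4444 = 1141 leaf blocks

The two figures are in the same ballpark, with the remainder down to per-block overheads and leaf block split behaviour.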
A partial block dump of a Local Index leaf block:
Leaf block dump
===============
header address 924483684=0x371a8064
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 483
kdxcofbo 1002=0x3ea
kdxcofeo 1823=0x71f
kdxcoavs 821
kdxlespl 0
kdxlende 0
kdxlenxt 29412237=0x1c0cb8d
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 47
col 1; len 6; (6): 01 c0 20 7b 00 a6
row#1[8012] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 47
col 1; len 6; (6): 01 c0 22 3a 00 00
row#2[8000] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 48
col 1; len 6; (6): 01 c0 20 7b 00 a7
This shows that the Rowids are only 6 bytes.
If we re-run the query that references the partition key in a SQL predicate:
SQL> SELECT * FROM big_bowie WHERE total_sales = 42 and release_date between '01-JAN-2017' and '31-JUL-2017'; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 1000041 42 42 20-JUL-17 42 Execution Plan ---------------------------------------------------------- Plan hash value: 3499166408 -------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | -------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 25 | 2 (0) | 00:00:01 | | | | 1 | PARTITION RANGE SINGLE | | 1 | 25 | 2 (0) | 00:00:01 | 7 | 7 | |* 2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_BOWIE | 1 | 25 | 2 (0) | 00:00:01 | 7 | 7 | |* 3 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_LOCAL_I | 1 | | 1 (0) | 00:00:01 | 7 | 7 | -------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("RELEASE_DATE"<=TO_DATE(' 2017-07-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 3 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 4 consistent gets 0 physical reads 0 redo size 885 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
We notice that this is slightly more efficient with only 4 consistent gets, when previously the Global Index required 5 consistent gets. This is directly due to the reduction in the BLEVEL.
So this is a good thing, especially if this query is frequently executed.
If we now run the query without the partition key SQL predicate:
SQL> SELECT * FROM big_bowie WHERE total_sales = 42; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 400041 42 42 28-JAN-12 42 1800041 42 42 28-JAN-12 42 800041 42 42 03-MAR-13 42 1200041 42 42 07-APR-14 42 1600041 42 42 12-MAY-15 42 200041 42 42 12-MAY-15 42 600041 42 42 15-JUN-16 42 1000041 42 42 20-JUL-17 42 41 42 42 24-AUG-18 42 1400041 42 42 24-AUG-18 42 10 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3527547124 -------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | -------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 250 | 15 (0) | 00:00:01 | | | | 1 | PARTITION RANGE ALL | | 10 | 250 | 15 (0) | 00:00:01 | 1 | 8 | | 2 | TABLE ACCESS BY LOCAL INDEX ROWID BATCHED | BIG_BOWIE | 10 | 250 | 15 (0) | 00:00:01 | 1 | 8 | |* 3 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_LOCAL_I | 10 | | 9 (0) | 00:00:01 | 1 | 8 | -------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 27 consistent gets 0 physical reads 0 redo size 1088 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
We notice that consistent gets have increased more significantly, up to 27 consistent gets when it was previously 14 consistent gets.
This is because instead of accessing the one Global Index structure, we are now forced to access all 8 Local index structures, as the required TOTAL_SALES value could potentially be found in any of the table partitions. So that’s a minimum of 2 consistent gets per Local Index partition (with an index of BLEVEL 1) that has to be accessed, even if there are actually no corresponding rows of interest in the particular table partition.
Imagine if this table had 1000+ table partitions; you can easily see how the cost of using such Local Indexes can quickly become excessive.
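To put a rough number on it, even if every Local index partition kept its BLEVEL of 1, that’s:

1000 index partitions x 2 consistent gets (branch plus leaf) = 2000 consistent gets

just to probe the indexes, before a single matching table row is fetched, for a query that the single Non-Partitioned index above could satisfy in a handful of consistent gets.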
So Local Indexes can be very problematic if the partition key is NOT referenced in the SQL or if the range of possible table partitions is excessive. The advantage of a Non-Partitioned index is that there is only the one index structure that need be accessed, regardless of the number of table partitions.
So what if you want to protect yourself from the possible ramifications of the table partition key not being referenced in SQL predicates, but you want to take advantage of the performance benefits of smaller index structures that might have a reduced index BLEVEL?
That’s the topic of Part IV in this series 🙂
“Hidden” Efficiencies of Non-Partitioned Indexes on Partitioned Tables Part II (Aladdin Sane) October 9, 2018
Posted by Richard Foote in Global Indexes, Index Internals, Local Indexes, Oracle Indexes, Partitioned Indexes, Partitioning, ROWID.
In Part I of this series, I highlighted how a Non-Partitioned Global Index on a Partitioned Table is able to effectively perform “Partition Pruning” by reading only the associated index entries to access just the table blocks of interest from the relevant table partitions when the table partition key is specified in an SQL predicate.
Understanding how Oracle achieves this is key (pun fully intended) in understanding the associated advantages of Global Indexes.
Back in time before Oracle introduced Partitioning (pre-Oracle 8 days), the 6 byte ROWID was made up of the following components:
- File Number
- Block Number
- Row Number
to uniquely determine the location of any given row.
If we look at a partial block dump of a leaf block from the index based on the Non-Partitioned table:
Leaf block dump
===============
header address 1881436260=0x70247064
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 513
kdxcofbo 1062=0x426
kdxcofeo 1880=0x758
kdxcoavs 818
kdxlespl 0
kdxlende 0
kdxlenxt 29387269=0x1c06a05
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8024] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 c0 1d 68 00 18
row#1[8012] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 c0 24 c8 00 c1
row#2[8000] flag: -------, lock: 0, len=12
col 0; len 2; (2): c1 02
col 1; len 6; (6): 01 c0 3a 1c 00 96
…
We notice that the ROWID for each index entry is the standard 6 bytes in size.
With the introduction of Oracle 8 and the Partitioning Option, the File Number was no longer unique across the database: this number of files (approx. 1K) became possible not for the database at large, but for each Tablespace (thus enabling Oracle to cater for very large databases, with the option of having so many more data files in a database).
This means that for a Partitioned Table, in which each table partition (or sub-partition) could potentially reside in a different tablespace, the associated file number (RELATIVE_FNO) within the ROWID is no longer unique. Therefore, for Global Indexes in which index entries span all table partitions, the ROWID is extended to include the 4 byte Data Object Id. A specific object can only live in one tablespace, and if Oracle knows the object, it can determine which specific file the ROWID is referencing. So an extended ROWID consists of:
- Data Object Id
- File Number
- Block Number
- Row Number
If we look at a partial block dump of a leaf block from the index based on the Partitioned table:
Leaf block dump
===============
header address 1881436260=0x70247064
kdxcolev 0
KDXCOLEV Flags = - - -
kdxcolok 0
kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y
kdxconco 2
kdxcosdc 0
kdxconro 399
kdxcofbo 834=0x342
kdxcofeo 1652=0x674
kdxcoavs 818
kdxlespl 0
kdxlende 0
kdxlenxt 29385221=0x1c06205
kdxleprv 0=0x0
kdxledsz 0
kdxlebksz 8036
row#0[8020] flag: -------, lock: 0, len=16
col 0; len 2; (2): c1 02
col 1; len 10; (10): 00 01 57 4a 01 c0 5e cf 00 cc
row#1[8004] flag: -------, lock: 0, len=16
col 0; len 2; (2): c1 02
col 1; len 10; (10): 00 01 57 4a 01 c0 5f 74 00 e7
row#2[7988] flag: -------, lock: 0, len=16
col 0; len 2; (2): c1 02
col 1; len 10; (10): 00 01 57 4b 01 c0 5c 32 00 c9
…
We notice that the ROWID for each index entry is now the extended 10 bytes in size as it includes the Data Object Id.
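For what it’s worth, you don’t need a block dump to see these components; the DBMS_ROWID package will decode them from any rowid (a quick sketch, picking an arbitrary row from the partitioned table):

SQL> select dbms_rowid.rowid_object(rowid)       as data_object_id,
            dbms_rowid.rowid_relative_fno(rowid) as relative_fno,
            dbms_rowid.rowid_block_number(rowid) as block_no,
            dbms_rowid.rowid_row_number(rowid)   as row_no
     from   big_bowie
     where  rownum = 1;

The DATA_OBJECT_ID returned matches the DATA_OBJECT_ID of the corresponding table partition in USER_OBJECTS, which is precisely the value Oracle relies on for the pruning discussed below.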
Storing the Data Object Id as part of the ROWID has various advantages, such as being able to asynchronously maintain index entries following table partition operations such as dropping a table partition (as discussed previously here).
However, the key advantage of storing the Data Object Id as part of the ROWID is that it enables Oracle, when using Global Indexes, to automatically perform “Partition Pruning” (the ability to access only those partitions that can possibly contain data of interest) when the table partition key is specified in an SQL predicate.
When the table partition key is specified in an SQL predicate, Oracle can determine which table partitions could possibly contain such data and then only access the table blocks via the index ROWIDs that have corresponding Data Object Ids of interest. This is how, in the example in Part I, Oracle was able to access just the table block that belongs in the table partition of interest, effectively performing predicate filtering at the index level, without unnecessarily having to access any table blocks from partitions that are not of interest.
This enables Global Indexes to have almost Local Index like performance in scenarios where the table partition key is specified in SQL predicates. Local Indexes do have the advantage of potentially having a reduced BLEVEL, in that if you have say 100 table partitions, each Local Index partition would only be approx. 1/100 the size of the single, Non-Partitioned Index (although Global Indexes can in turn be partitioned if individual index size were problematic, even if the table were not partitioned). Additionally, Local Indexes don’t have to read through unnecessary index entries when only the index entries associated with a specific subset of table partitions are of interest.
However, Global Indexes have a key performance advantage over Local Indexes which I’ll discuss in Part III.
“Hidden” Efficiencies of Non-Partitioned Indexes on Partitioned Tables Part I (The Jean Genie) October 4, 2018
Posted by Richard Foote in Global Indexes, Local Indexes, Non-Partitioned Indexes, Oracle Indexes, Partitioned Indexes, Partitioning.
When it comes to indexing a partitioned table, many automatically opt for Local Indexes, as it’s often assumed they’re simply easier to manage and more efficient than a corresponding Global Index.
Having smaller index structures that are aligned to how the table is partitioned certainly has various advantages. The focus in this little series is on the “more efficient” claim.
A key point that many miss is that a Non-Partitioned Index on a Non-Partitioned table is not exactly the same beast as a Non-Partitioned Index on a Partitioned Table. The purpose of this initial post is to illustrate this difference.
Let’s begin by creating a Non-Partitioned table that has a number of years’ worth of data:
SQL> CREATE TABLE big_ziggy (id number, album_id number, country_id number, release_date date, total_sales number);

Table created.

SQL> INSERT INTO big_ziggy SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800), mod(rownum,200000)+1 FROM dual CONNECT BY LEVEL <= 2000000;

2000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_ZIGGY');

PL/SQL procedure successfully completed.
So we have a 2M row table with about 8 years’ worth of data (based on the RELEASE_DATE column) and a TOTAL_SALES column that has some 200,000 distinct values throughout this period.
Let’s next create a standard Non-Partitioned index based on the TOTAL_SALES column:
SQL> create index big_ziggy_total_sales_i on big_ziggy(total_sales); Index created.
If we now run a query to access the 10 rows with a value equal to 42:
SQL> SELECT * FROM big_ziggy WHERE total_sales = 42; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 1400041 42 42 24-AUG-18 42 400041 42 42 28-JAN-12 42 1000041 42 42 20-JUL-17 42 1800041 42 42 28-JAN-12 42 600041 42 42 15-JUN-16 42 800041 42 42 03-MAR-13 42 1200041 42 42 07-APR-14 42 1600041 42 42 12-MAY-15 42 200041 42 42 12-MAY-15 42 41 42 42 24-AUG-18 42 10 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1252095634 --------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | --------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 250 | 13 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | BIG_ZIGGY | 10 | 250 | 13 (0) | 00:00:01 | |* 2 | INDEX RANGE SCAN | BIG_ZIGGY_TOTAL_SALES_I | 10 | | 3 (0) | 00:00:01 | --------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 14 consistent gets 0 physical reads 0 redo size 1184 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
We notice we need 14 consistent gets to access these 10 rows: 4 gets for index block accesses and 10 gets to access the relevant rows from the table blocks (as we have terrible clustering, with the relevant data distributed throughout the table).
If we run a query where we’re only interested in accessing data within a specific year:
SQL> SELECT * FROM big_ziggy WHERE total_sales = 42 and release_date between '01-JAN-2017' and '31-JUL-2017'; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 1000041 42 42 20-JUL-17 42 Execution Plan ---------------------------------------------------------- Plan hash value: 1252095634 --------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | --------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 25 | 13 (0) | 00:00:01 | |* 1 | TABLE ACCESS BY INDEX ROWID BATCHED | BIG_ZIGGY | 1 | 25 | 13 (0) | 00:00:01 | |* 2 | INDEX RANGE SCAN | BIG_ZIGGY_TOTAL_SALES_I | 10 | | 3 (0) | 00:00:01 | --------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RELEASE_DATE">=TO_DATE(' 2017-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "RELEASE_DATE"<=TO_DATE(' 2017-07-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 2 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 14 consistent gets 0 physical reads 0 redo size 885 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
Even though we now only return the one row, notice we still have to perform the same 14 consistent gets. That’s because the RELEASE_DATE column is NOT part of the index, so we still need to fetch all 10 rows matching TOTAL_SALES=42 and then filter out those that don’t have a RELEASE_DATE of interest. The Predicate Information section above shows this additional filtering taking place.
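As an aside (and purely as an illustrative sketch rather than part of the original demo), the conventional way to avoid those extra table gets on a Non-Partitioned table would be to add the RELEASE_DATE column to the index itself, so the date filtering could be performed on the index entries before visiting the table. The index name below is just made up for illustration:

SQL> create index big_ziggy_total_sales_rd_i on big_ziggy(total_sales, release_date);

The interesting point, as we’re about to see, is that a partitioned table can achieve a similar saving without adding the partitioning column to the index at all.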
Let’s run the same queries on a table with identical data, but this time on a table that is partitioned based on the RELEASE_DATE column, with a partition for each year’s worth of data:
SQL> CREATE TABLE big_bowie(id number, album_id number, country_id number, release_date date, total_sales number) 2 PARTITION BY RANGE (release_date) 3 (PARTITION ALBUMS_2011 VALUES LESS THAN (TO_DATE('01-JAN-2012', 'DD-MON-YYYY')), 4 PARTITION ALBUMS_2012 VALUES LESS THAN (TO_DATE('01-JAN-2013', 'DD-MON-YYYY')), 5 PARTITION ALBUMS_2013 VALUES LESS THAN (TO_DATE('01-JAN-2014', 'DD-MON-YYYY')), 6 PARTITION ALBUMS_2014 VALUES LESS THAN (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')), 7 PARTITION ALBUMS_2015 VALUES LESS THAN (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')), 8 PARTITION ALBUMS_2016 VALUES LESS THAN (TO_DATE('01-JAN-2017', 'DD-MON-YYYY')), 9 PARTITION ALBUMS_2017 VALUES LESS THAN (TO_DATE('01-JAN-2018', 'DD-MON-YYYY')), 10 PARTITION ALBUMS_2018 VALUES LESS THAN (MAXVALUE)); Table created. SQL> INSERT INTO big_bowie SELECT rownum, mod(rownum,5000)+1, mod(rownum,100)+1, sysdate-mod(rownum,2800), mod(rownum,200000)+1 FROM dual CONNECT BY LEVEL <= 2000000; 2000000 rows created. SQL> commit; Commit complete. SQL> exec dbms_stats.gather_table_stats(ownname=> null, tabname=> 'BIG_BOWIE'); PL/SQL procedure successfully completed.
Again, we create a standard, Non-Partitioned Index:
SQL> create index big_bowie_total_sales_i on big_bowie(total_sales); Index created.
If we now run the equivalent of the first query:
SQL> SELECT * FROM big_bowie WHERE total_sales = 42; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 400041 42 42 28-JAN-12 42 1800041 42 42 28-JAN-12 42 800041 42 42 03-MAR-13 42 1200041 42 42 07-APR-14 42 1600041 42 42 12-MAY-15 42 200041 42 42 12-MAY-15 42 600041 42 42 15-JUN-16 42 1000041 42 42 20-JUL-17 42 41 42 42 24-AUG-18 42 1400041 42 42 24-AUG-18 42 10 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1761527485 -------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | -------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10 | 250 | 13 (0) | 00:00:01 | | | | 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE | 10 | 250 | 13 (0) | 00:00:01 | ROWID | ROWID | |* 2 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_I | 10 | | 3 (0) | 00:00:01 | | | -------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 14 consistent gets 0 physical reads 0 redo size 1184 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
We get the exact same performance, with the same 14 consistent gets necessary to access the 10 rows of interest.
If we now run the equivalent of the second query:
SQL> SELECT * FROM big_bowie WHERE total_sales = 42 and release_date between '01-JAN-2017' and '31-JUL-2017'; ID ALBUM_ID COUNTRY_ID RELEASE_D TOTAL_SALES ---------- ---------- ---------- --------- ----------- 1000041 42 42 20-JUL-17 42 Execution Plan ---------------------------------------------------------- Plan hash value: 1081241859 -------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | Pstart | Pstop | -------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 25 | 13 (0) | 00:00:01 | | | |* 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED | BIG_BOWIE | 1 | 25 | 13 (0) | 00:00:01 | 7 | 7 | |* 2 | INDEX RANGE SCAN | BIG_BOWIE_TOTAL_SALES_I | 10 | | 3 (0) | 00:00:01 | | | -------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RELEASE_DATE">=TO_DATE(' 2017-01-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss') AND "RELEASE_DATE"<=TO_DATE(' 2017-07-31 00:00:00', 'syyyy-mm-dd hh24:mi:ss')) 2 - access("TOTAL_SALES"=42) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 5 consistent gets 0 physical reads 0 redo size 885 bytes sent via SQL*Net to client 624 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
We notice a key difference. Even though this is effectively the same index as on the previous Non-Partitioned table, and even though it still contains just the TOTAL_SALES column, the number of consistent gets has dropped from 14 to just 5.
In this example, Oracle has clearly not had to fetch the table rows that don’t match the RELEASE_DATE of interest. Even though the Predicate Information still lists the filtering requirement, this filtering has been performed within the index, without having to actually fetch any of the rows that aren’t of interest.
The index is able to only access the row(s) of interest from the Partitioned Table…
This is the little “hidden efficiency” of Global Indexes on Partitioned Tables, which is what we effectively have here.
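A hint of how this is possible can be seen in the extended ROWIDs stored within the Global Index. The following is just a rough sketch using DBMS_ROWID against the BIG_BOWIE table; the object numbers returned will of course differ on every system:

SQL> select release_date, dbms_rowid.rowid_object(rowid) as data_object_id from big_bowie where total_sales = 42;

Each table partition has its own data object number embedded within the ROWID, which gives Oracle enough information to discard index entries pointing to partitions that can’t possibly contain the RELEASE_DATE range of interest, before visiting any table blocks.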
In Part II, I’ll discuss how Oracle does this additional filtering within the index and why understanding this is important in deciding which type of index to deploy, as from a “performance” perspective, Global Indexes are often the preferred option.
12.2 Some Cool Partitioning New Features (Big Wheels) April 5, 2017
Posted by Richard Foote in 12c Rel 2, Oracle Indexes, Partitioning.2 comments
I previously discussed just how easy it is to convert online a non-partitioned table to be partitioned with Oracle Database 12.2.
Thought I might run through a number of really cool new partitioning features and capabilities that were also introduced in 12.2.
To start, I’m just going to create a basic range-partitioning table and populate it with a year’s worth of data:
SQL> create table ziggy 2 (prod_id NUMBER, 3 cust_id NUMBER, 4 time_id DATE, 5 quantity NUMBER) 6 PARTITION BY RANGE (time_id) 7 (PARTITION z_2016_q1 VALUES LESS THAN (TO_DATE('01-APR-2016','dd-MON-yyyy')), 8 PARTITION z_2016_q2 VALUES LESS THAN (TO_DATE('01-JUL-2016','dd-MON-yyyy')), 9 PARTITION z_2016_q3 VALUES LESS THAN (TO_DATE('01-OCT-2016','dd-MON-yyyy')), 10 PARTITION z_2016_q4 VALUES LESS THAN (TO_DATE('01-JAN-2017','dd-MON-yyyy'))); Table created. SQL> insert into ziggy select mod(rownum,10), mod(rownum,100), sysdate-dbms_random.value(94, 454), 100 from dual connect by level <= 100000; 100000 rows created. SQL> commit; Commit complete.
I’ll then create both a global, non-partitioned index and a locally partitioned index:
SQL> create index ziggy_prod_id_i on ziggy(prod_id); Index created. SQL> select index_name, num_rows, status from dba_indexes where index_name='ZIGGY_PROD_ID_I'; INDEX_NAME NUM_ROWS STATUS -------------------- ---------- -------- ZIGGY_PROD_ID_I 100000 VALID SQL> create index ziggy_time_id_i on ziggy(time_id) local; Index created. SQL> select index_name, partition_name, num_rows, status from dba_ind_partitions where index_name='ZIGGY_TIME_ID_I'; INDEX_NAME PARTITION_NAME NUM_ROWS STATUS -------------------- -------------------- ---------- -------- ZIGGY_TIME_ID_I Z_2016_Q1 23941 USABLE ZIGGY_TIME_ID_I Z_2016_Q2 25276 USABLE ZIGGY_TIME_ID_I Z_2016_Q3 25522 USABLE ZIGGY_TIME_ID_I Z_2016_Q4 25261 USABLE
OK, the first 12.2 new feature is the capability to now “Split” a partition online (previously this was an offline only operation that resulted in invalid global indexes and invalid corresponding local indexes):
SQL> alter table ziggy 2 split PARTITION z_2016_q4 into 3 (PARTITION z_2016_oct VALUES LESS THAN (TO_DATE('01-NOV-2016','dd-MON-yyyy')), 4 PARTITION z_2016_nov VALUES LESS THAN (TO_DATE('01-DEC-2016','dd-MON-yyyy')), 5 PARTITION z_2016_dec) online; Table altered. SQL> select index_name, num_rows, status from dba_indexes where index_name='ZIGGY_PROD_ID_I'; INDEX_NAME NUM_ROWS STATUS -------------------- ---------- -------- ZIGGY_PROD_ID_I 100000 VALID SQL> select index_name, partition_name, num_rows, status from dba_ind_partitions where index_name='ZIGGY_TIME_ID_I'; INDEX_NAME PARTITION_NAME NUM_ROWS STATUS -------------------- -------------------- ---------- -------- ZIGGY_TIME_ID_I Z_2016_DEC 8276 USABLE ZIGGY_TIME_ID_I Z_2016_NOV 8298 USABLE ZIGGY_TIME_ID_I Z_2016_OCT 8687 USABLE ZIGGY_TIME_ID_I Z_2016_Q1 23941 USABLE ZIGGY_TIME_ID_I Z_2016_Q2 25276 USABLE ZIGGY_TIME_ID_I Z_2016_Q3 25522 USABLE 6 rows selected.
Nice !!
OK, let’s quickly check how many rows I have for each PROD_ID value that belongs within the Q1 partition:
SQL> select prod_id, count(*) from ziggy where time_id between '01-JAN-2016' and '31-MAR-2016' group by prod_id order by prod_id; PROD_ID COUNT(*) ---------- ---------- 0 2391 1 2334 2 2324 3 2372 4 2284 5 2462 6 2348 7 2399 8 2380 9 2388
So we have a PROD_ID with values between 0 and 9 that have roughly the same number of rows.
Let’s now check the size of each table partition in blocks:
SQL> select partition_name, blocks from dba_tab_partitions where table_name='ZIGGY'; PARTITION_NAME BLOCKS -------------------- ---------- Z_2016_DEC 44 Z_2016_NOV 44 Z_2016_OCT 45 Z_2016_Q1 1006 Z_2016_Q2 1006 Z_2016_Q3 1006
Note that the Q1 partition currently has 1006 blocks allocated.
OK, the next cool new feature is the ability to select which rows we may want to keep during a subsequent re-org operation. In the following example, I’m going to re-org the Q1 partition and compress the data, but I’m only going to keep those rows where the PROD_ID is between 1 and 8 (hence getting rid of all rows with a PROD_ID of 0 or 9):
SQL> ALTER TABLE ziggy 2 MOVE PARTITION z_2016_q1 TABLESPACE users COMPRESS ONLINE 3 INCLUDING ROWS WHERE prod_id between 1 and 8; Table altered.
The new INCLUDING ROWS clause explicitly states that I’m only going to include those rows where the PROD_ID is between 1 and 8.
If we now check the size of the partition and its contents:
SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY'); PL/SQL procedure successfully completed. SQL> select partition_name, blocks from dba_tab_partitions where table_name='ZIGGY'; PARTITION_NAME BLOCKS -------------------- ---------- Z_2016_DEC 44 Z_2016_NOV 44 Z_2016_OCT 45 Z_2016_Q1 57 Z_2016_Q2 1006 Z_2016_Q3 1006 SQL> select prod_id, count(*) from ziggy where time_id between '01-JAN-2016' and '31-MAR-2016' group by prod_id order by prod_id; PROD_ID COUNT(*) ---------- ---------- 1 2334 2 2324 3 2372 4 2284 5 2462 6 2348 7 2399 8 2380
We see the Q1 partition has indeed decreased in size (down to just 57 blocks from 1006 blocks) because it has now been compressed AND because it now only has rows where the PROD_ID is between 1 and 8.
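If you want to double-check that the partition really has been compressed, a simple sketch against the data dictionary:

SQL> select partition_name, compression, compress_for from dba_tab_partitions where table_name = 'ZIGGY' and partition_name = 'Z_2016_Q1';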
Nice !!
The next cool new feature is that we now have new syntax (FOR EXCHANGE WITH TABLE) to more easily create a table with which we wish to subsequently perform a partition exchange. This ensures the new table is entirely compatible for such an exchange, although note that associated indexes are NOT created as part of this process:
SQL> CREATE TABLE ziggy_exchange 2 TABLESPACE users 3 FOR EXCHANGE WITH TABLE ziggy; Table created. SQL> ALTER TABLE ziggy 2 EXCHANGE PARTITION z_2016_q1 WITH TABLE ziggy_exchange; Table altered.
If we look at the contents of each object, we can see the partition exchange has been successful:
SQL> select prod_id, count(*) from ziggy where time_id between '01-JAN-2016' and '31-MAR-2016' group by prod_id order by prod_id; no rows selected SQL> select prod_id, count(*) from ziggy_exchange where time_id between '01-JAN-2016' and '31-MAR-2016' group by prod_id order by prod_id; PROD_ID COUNT(*) ---------- ---------- 1 2334 2 2324 3 2372 4 2284 5 2462 6 2348 7 2399 8 2380
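Note again that no indexes were created on the exchange table as part of the FOR EXCHANGE WITH TABLE syntax. Had we wanted the exchange to also swap index segments, one approach (a sketch only, with an illustrative index name) would have been to first build an equivalent index on the exchange table and then use the INCLUDING INDEXES clause on the exchange:

SQL> create index ziggy_exchange_time_id_i on ziggy_exchange(time_id);
SQL> alter table ziggy exchange partition z_2016_q1 with table ziggy_exchange including indexes;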
Nice !!
The final new 12.2 partitioning feature I want to introduce is the ability now to make a particular partition (or sub-partition) read only:
SQL> alter table ziggy modify partition z_2016_q1 read only; Table altered. SQL> select table_name, partition_name, read_only from dba_tab_partitions where table_name='ZIGGY'; TABLE_NAME PARTITION_NAME READ -------------------- -------------------- ---- ZIGGY Z_2016_DEC NO ZIGGY Z_2016_NOV NO ZIGGY Z_2016_OCT NO ZIGGY Z_2016_Q1 YES ZIGGY Z_2016_Q2 NO ZIGGY Z_2016_Q3 NO SQL> insert into ziggy values (1,1,'13-JAN-2016', 1); insert into ziggy values (1,1,'13-JAN-2016', 1) * ERROR at line 1: ORA-14466: Data in a read-only partition or subpartition cannot be modified.
Nice !!
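Should the data in that partition ever need to be modified again, the attribute can simply be switched back (a quick sketch):

SQL> alter table ziggy modify partition z_2016_q1 read write;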
There are lots of great new features introduced with Oracle Database 12c Release 2, but sometimes it’s these lesser known features that can be so terribly useful.
12.2 Online Conversion of a Non-Partitioned Table to a Partitioned Table (A Small Plot Of Land) March 27, 2017
Posted by Richard Foote in 12c Release 2 New Features, Attribute Clustering, Clustering Factor, Online DDL, Oracle, Oracle Indexes, Partitioning.4 comments
In my previous post, I discussed how you can now move heap tables online with Oracle Database 12.2 and how this can be very beneficial in helping to address issues with the Clustering Factor of key indexes.
A problem with this technique is that it requires the entire table to be effectively reorganised, when most of the data might already be well clustered. It would be much more efficient if we could somehow only move and reorganise just the portion of the table that has had poorly clustered data introduced since the last reorg.
Partitioning the table appropriately would help to address this disadvantage, but converting a non-partitioned table to be partitioned can be a pain. To do this online with as little complication as possible, one could previously use the dbms_redefinition package, which has improved with later releases.
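For comparison, the DBMS_REDEFINITION approach requires an interim partitioned table to be pre-created and then a series of procedure calls. The following is only a rough outline (assuming a hypothetical interim table ZIGGY_INTERIM already created with the desired partitioning), not a complete working redefinition:

SQL> exec dbms_redefinition.can_redef_table(user, 'ZIGGY', dbms_redefinition.cons_use_rowid)
SQL> exec dbms_redefinition.start_redef_table(user, 'ZIGGY', 'ZIGGY_INTERIM', options_flag => dbms_redefinition.cons_use_rowid)
SQL> -- copy indexes, constraints, triggers and grants across with dbms_redefinition.copy_table_dependents, then:
SQL> exec dbms_redefinition.finish_redef_table(user, 'ZIGGY', 'ZIGGY_INTERIM')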
However, with Oracle Database 12.2, there is now an even easier, more flexible method of performing such a conversion.
Using the same table definition and data as from my previous post, I’m going to first create a couple of additional indexes (on the ID column and on the DATE_CREATED column) :
SQL> create unique index ziggy_id_i on ziggy(id); Index created. SQL> create index ziggy_date_created_i on ziggy(date_created); Index created.
To convert a non-partitioned table to a partitioned table online, we can now use this new extension to the ALTER TABLE syntax:
SQL> alter table ziggy 2 modify partition by range (date_created) 3 (partition p1 values less than (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')), 4 partition p2 values less than (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')), 5 partition p3 values less than (maxvalue)) online; Table altered.
How simple is that !! We now have a table that is range partitioned based on the DATE_CREATED column and this conversion was performed online.
We notice not only is the table now partitioned with all the indexes remaining Valid, but the index based on the partitioning key (DATE_CREATED) has also been implicitly converted to be a Local partitioned index:
SQL> select table_name, status, partitioned from dba_tables where table_name='ZIGGY'; TABLE_NAME STATUS PAR ------------ -------- --- ZIGGY VALID YES SQL> select index_name, status, partitioned, num_rows from dba_indexes where table_name='ZIGGY'; INDEX_NAME STATUS PAR NUM_ROWS -------------------- -------- --- ---------- ZIGGY_DATE_CREATED_I N/A YES 2000000 ZIGGY_CODE_I VALID NO 2000000 ZIGGY_ID_I VALID NO 2000000 SQL> select index_name, partition_name, status, leaf_blocks from dba_ind_partitions where index_name like 'ZIGGY%'; INDEX_NAME PARTITION_NAME STATUS LEAF_BLOCKS -------------------- --------------- -------- ----------- ZIGGY_DATE_CREATED_I P1 USABLE 865 ZIGGY_DATE_CREATED_I P2 USABLE 1123 ZIGGY_DATE_CREATED_I P3 USABLE 1089 SQL> select index_name, partitioning_type, partition_count, locality from dba_part_indexes where table_name='ZIGGY'; INDEX_NAME PARTITION PARTITION_COUNT LOCALI -------------------- --------- --------------- ------ ZIGGY_DATE_CREATED_I RANGE 3 LOCAL
As part of the table conversion syntax, we have the option to also update all the associated indexes and partition them in any manner we may want. For example:
SQL> alter table ziggy 2 modify partition by range (date_created) 3 (partition p1 values less than (TO_DATE('01-JAN-2015', 'DD-MON-YYYY')), 4 partition p2 values less than (TO_DATE('01-JAN-2016', 'DD-MON-YYYY')), 5 partition p3 values less than (maxvalue)) online 6 update indexes 7 (ziggy_code_i local, 8 ziggy_id_i global partition by range (id) 9 (partition ip1 values less than (maxvalue))); Table altered.
In this example, not only are we converting the non-partitioned table to be partitioned, but we’re also explicitly converting the index on the CODE column to be a Locally partitioned index and the index on the ID column to be Globally partitioned in its own manner.
If we look at the definition of these indexes, we see that they also have all been converted to partitioned indexes online along with the table:
SQL> select table_name, status, partitioned from dba_tables where table_name='ZIGGY'; TABLE_NAME STATUS PAR ------------ -------- --- ZIGGY VALID YES SQL> select index_name, status, partitioned from dba_indexes where table_name = 'ZIGGY'; INDEX_NAME STATUS PAR -------------------- -------- --- ZIGGY_CODE_I N/A YES ZIGGY_ID_I N/A YES ZIGGY_DATE_CREATED_I N/A YES SQL> select index_name, partitioning_type, partition_count, locality from dba_part_indexes where table_name='ZIGGY'; INDEX_NAME PARTITION PARTITION_COUNT LOCALI -------------------- --------- --------------- ------ ZIGGY_CODE_I RANGE 3 LOCAL ZIGGY_ID_I RANGE 1 GLOBAL ZIGGY_DATE_CREATED_I RANGE 3 LOCAL
If we look at the Clustering Factor of the important CODE column index, we see that all partitions have an excellent Clustering Factor as all partitions have just been created.
SQL> select partition_name, num_rows, clustering_factor from dba_ind_partitions where index_name='ZIGGY_CODE_I'; PARTITION_NAME NUM_ROWS CLUSTERING_FACTOR -------------------- ---------- ----------------- P1 490000 2275 P2 730000 3388 P3 780000 3620
However, if we now add new rows to the table as would occur with a real application, the new data going into the “current” partition results in the Clustering Factor of this partition “eroding” over time.
SQL> insert into ziggy select 2000000+rownum, mod(rownum,100), sysdate, 'DAVID BOWIE' from dual connect by level <= 500000; 500000 rows created. SQL> commit; Commit complete. SQL> exec dbms_stats.gather_index_stats(ownname=>null,indname=>'ZIGGY_CODE_I'); PL/SQL procedure successfully completed. SQL> select partition_name, num_rows, clustering_factor from dba_ind_partitions where index_name='ZIGGY_CODE_I'; PARTITION_NAME NUM_ROWS CLUSTERING_FACTOR -------------------- ---------- ----------------- P1 490000 2275 P2 730000 3388 P3 1280000 238505
As discussed previously, the Clustering Attribute has no effect with standard DML operations. Therefore, the efficiency of the CODE index reduces over time in the partition where new data is being introduced. The Clustering Factor has now substantially increased from 3620 to 238505. Note for all the other partitions where there are no modifications to the data, the Clustering Factor remains excellent.
Having the table/index partitioned means we can therefore periodically reorg just the problematic partition:
SQL> alter table ziggy move partition p3 update indexes online; Table altered. SQL> select partition_name, num_rows, clustering_factor from dba_ind_partitions where index_name='ZIGGY_CODE_I'; PARTITION_NAME NUM_ROWS CLUSTERING_FACTOR -------------------- ---------- ----------------- P1 490000 2275 P2 730000 3388 P3 1280000 5978
The Clustering Factor for this partition has now reduced substantially from 238505 to just 5978.
For those of you with the Partitioning database option, the ability in 12.2 to now so easily convert a non-partitioned table to be partitioned, along with its associated indexes, is just brilliant 🙂
12c Online Partitioned Table Reorganisation Part II (Move On) January 15, 2014
Posted by Richard Foote in 12c, Move Partitions, Oracle Indexes, Update Indexes Online.15 comments
In Part I, we looked at some of the issues associated with locking and unusable indexes when Moving both tables and table partitions.
The Oracle 12c Database has introduced a number of great new capabilities associated with online DDL activities. One of these really useful features is the capability to now move table partitions online while maintaining associated indexes.
If we look at the same demo as with the previous post:
SQL> create table muse (id number, status varchar2(6), name varchar2(30)) 2 partition by range (id) 3 (partition p1 values less than (1000001), 4 partition p2 values less than (2000001), 5 partition p3 values less than (maxvalue)); Table created. SQL> insert into muse select rownum, 'CLOSED', 'DAVID BOWIE' from dual connect by level <= 3000000; 3000000 rows created. SQL> commit; Commit complete. SQL> create index muse_id_pk on muse(id); Index created. SQL> alter table muse add constraint muse_id_pk primary key(id); Table altered. SQL> create index muse_status_i on muse(status) local; Index created.
If in one session we have an active transaction (i.e. not yet committed):
SQL> insert into muse values (3000001, 'OPEN', 'ZIGGY STARDUST'); 1 row created.
while we move a table partition in another session as we did previously:
SQL> alter table muse move partition p3 tablespace users update indexes; alter table muse move partition p3 tablespace users update indexes ERROR at line 1: ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
We get the same resource busy error.
However, if we now use the new 12c UPDATE INDEXES ONLINE clause:
SQL> alter table muse move partition p3 tablespace users update indexes online;
The session now simply hangs until the transaction in session one completes, in a similar manner to that of an index rebuild online operation.
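If you’re curious, the blocked session can be observed from a third session while the move is waiting (a rough sketch; the session details and wait event will of course vary):

SQL> select sid, event, blocking_session from v$session where blocking_session is not null;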
Once we commit in session one:
SQL> commit; Commit complete.
The Move Partition operation can obtain the necessary table partition lock and complete successfully:
SQL> alter table muse move partition p3 tablespace users update indexes online; Table altered.
The indexes remain in a USABLE state throughout:
SQL> select index_name, partition_name, status from dba_ind_partitions where index_name='MUSE_STATUS_I' union select index_name, null, status from dba_indexes where index_name='MUSE_ID_PK'; INDEX_NAME PARTITION_NAME STATUS --------------- -------------------- -------- MUSE_ID_PK VALID MUSE_STATUS_I P1 USABLE MUSE_STATUS_I P2 USABLE MUSE_STATUS_I P3 USABLE
If we perform these operations the other way around with the Move Partition first:
SQL> alter table muse move partition p3 tablespace users update indexes online;
While in session two we now insert a new row into the partition being moved:
SQL> insert into muse values (3000004, 'OPEN', 'ZIGGY STARDUST'); 1 row created.
The insert operation is now not locked out by the Move Partition operation and can proceed. The Move Partition operation in turn requires a brief lock at the end of the process (in much the same way as a normal online index rebuild) and so will hang if there are any outstanding transactions on the partition being moved at that time. And again like an online index rebuild, this will not in turn lock out other DML transactions.
Once the commit is performed in session two:
SQL> commit; Commit complete.
The Move Partition can complete:
SQL> alter table muse4 move partition p3 tablespace users update indexes online; Table altered.
And importantly, all the associated indexes again remain in a USABLE state:
SQL> select index_name, partition_name, status from dba_ind_partitions where index_name='MUSE_CODE_I' union select index_name, null, status from dba_indexes where index_name='MUSE_ID_PK'; INDEX_NAME PARTITION_NAME STATUS --------------- -------------------- -------- MUSE_CODE_I P1 USABLE MUSE_CODE_I P2 USABLE MUSE_CODE_I P3 USABLE MUSE_ID_PK VALID
Now this is only applicable to the online move of table partitions. Moving a non-partitioned table still has the same issues and restrictions as before (as discussed in Part I).
Therefore for those with the Partitioning option, serious consideration could be given to converting tables to partitioned tables, even if the table only has the one partition:
SQL> create table muse3 (id number, status varchar2(6), name varchar2(30)) 2 partition by range (id) 3 (partition p1 values less than (maxvalue)); Table created.
Such a table is effectively the equivalent of a non-partitioned table, but can now take advantage of the flexibility that the new online move partition capability provides 🙂
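A full reorganisation of such a table can then effectively be performed as an online move of its single partition, for example (sketch only):

SQL> alter table muse3 move partition p1 tablespace users update indexes online;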
12c Online Partitioned Table Reorganisation Part I (Prelude) January 7, 2014
Posted by Richard Foote in 12c, Oracle Indexes, Partitioning, Unusable Indexes, Update Indexes, Update Indexes Online.2 comments
First post for 2014 !!
Although it’s generally not an overly common activity with Oracle databases, reorganising a table can be somewhat painful, primarily because of the associated locking implications and the impact it has on indexes.
If we look at the following example:
SQL> create table muse2 (id number, status varchar2(6), name varchar2(30)); Table created. SQL> insert into muse2 select rownum, 'CLOSED', 'DAVID BOWIE' from dual connect by level <= 3000000; 3000000 rows created. SQL> commit; Commit complete. SQL> create index muse2_id_pk on muse2(id); Index created. SQL> alter table muse2 add constraint muse2_id_pk primary key(id); Table altered. SQL> create index muse2_status_i on muse2(status); Index created.
So we have a table with a couple of indexes. We can’t move the table using the ONLINE option as it’s only applicable for Index Organized Tables:
SQL> alter table muse2 move online; alter table muse2 move online * ERROR at line 1: ORA-01735: invalid ALTER TABLE option
If in one session, we have a current transaction on the table (i.e. not committed):
SQL> insert into muse2 values (3000001, 'OPEN', 'ZIGGY STARDUST'); 1 row created.
An attempt to MOVE the table in another session will fail with locking issues:
SQL> alter table muse2 move; alter table muse2 move * ERROR at line 1: ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
On the other hand, if the table MOVE command proceeds:
SQL> alter table muse2 move; Table altered.
It in turn locks out other transactions for the duration and leaves all indexes in an UNUSABLE state:
SQL> insert into muse2 values (3000001, 'OPEN', 'ZIGGY STARDUST'); insert into muse2 values (3000001, 'OPEN', 'ZIGGY STARDUST') * ERROR at line 1: ORA-01502: index 'BOWIE.MUSE2_ID_PK' or partition of such index is in unusable state SQL> select index_name, status from dba_indexes where table_name='MUSE2'; INDEX_NAME STATUS --------------- -------- MUSE2_ID_PK UNUSABLE MUSE2_STATUS_I UNUSABLE
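The only way back to a usable state at this point is to rebuild each impacted index, for example (a simple sketch):

SQL> alter index muse2_id_pk rebuild;
SQL> alter index muse2_status_i rebuild;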
If we now look at a similar Partitioned Table example:
SQL> create table muse (id number, status varchar2(6), name varchar2(30)) 2 partition by range (id) 3 (partition p1 values less than (1000001), 4 partition p2 values less than (2000001), 5 partition p3 values less than (maxvalue)); Table created. SQL> insert into muse select rownum, 'CLOSED', 'DAVID BOWIE' from dual connect by level <= 3000000; 3000000 rows created. SQL> commit; Commit complete. SQL> create index muse_id_pk on muse(id); Index created. SQL> alter table muse add constraint muse_id_pk primary key(id); Table altered. SQL> create index muse_status_i on muse(status) local; Index created.
Similar locking and indexing issues occur if we try and reorganise a partition, even if we UPDATE INDEXES on the fly. For example, if we have an active transaction in one session:
SQL> insert into muse values (3000001, 'OPEN', 'ZIGGY STARDUST'); 1 row created.
While in another session:
SQL> alter table muse move partition p3 tablespace users update indexes; alter table muse move partition p3 tablespace users update indexes * ERROR at line 1: ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired
We get the same old resource busy error. If we tried things the other way around, so in one session we first attempt to move a table partition:
SQL> alter table muse move partition p3 tablespace users update indexes; alter table muse move partition p3 tablespace users update indexes * ERROR at line 1: ORA-14327: Some index [sub]partitions could not be rebuilt
We can eventually get the above error if in another session we then attempt to insert a new row into this partition:
SQL> insert into muse values (3000002, 'OPEN', 'ZIGGY STARDUST'); 1 row created.
which in turn hangs for a period of time until the above error is generated.
The associated local index is now not a happy chappy:
SQL> select index_name, partition_name, status from dba_ind_partitions where index_name='MUSE_STATUS_I' union select index_name, null, status from dba_indexes where index_name='MUSE_ID_PK'; INDEX_NAME PARTITION_NAME STATUS --------------- --------------- -------- MUSE_ID_PK VALID MUSE_STATUS_I P1 USABLE MUSE_STATUS_I P2 USABLE MUSE_STATUS_I P3 UNUSABLE
So Moving tables and table partitions around can all get a bit messy, especially if high availability is required.
The DBMS_REDEFINITION package is designed specifically to enable the online redefinition of objects, but it has its own issues and is not as clean and simplistic as a simple MOVE operation when we just want to perform a table re-org.
So what has changed in Oracle Database 12c ?
We’ll see in the next post although the title here does rather give it away 🙂
12c Partial Indexes For Partitioned Tables Part II (Vanishing Act) July 12, 2013
Posted by Richard Foote in 12c, Local Indexes, Oracle Indexes, Partial Indexes, Partitioning.5 comments
In Partial Indexes Part I, we looked at how it was possible with the 12c database to create a Partial Index based on data from only selected table partitions. The resultant Partial Index can be either a Global or Local Index.
In Part I, we only really looked at Global Indexes, so let’s look at a Local Index example. Using the same Partitioned Table example as before:
SQL> create table pink_floyd (id number, status varchar2(6), name varchar2(30)) indexing off partition by range (id) (partition pf1 values less than (1000001), partition pf2 values less than (2000001) indexing off, partition pf3 values less than (maxvalue) indexing on); Table created.
This time, we’ll create a Local Partial Index:
SQL> create index pink_floyd_status_i on pink_floyd(status) local indexing partial; Index created.
If we look at the details of the resultant Local Index:
SQL> select index_name, partition_name, num_rows, status, leaf_blocks from dba_ind_partitions where index_name = 'PINK_FLOYD_STATUS_I'; INDEX_NAME PARTITION_NAME NUM_ROWS STATUS LEAF_BLOCKS -------------------- --------------- ---------- -------- ----------- PINK_FLOYD_STATUS_I PK1 0 UNUSABLE 0 PINK_FLOYD_STATUS_I PK2 0 UNUSABLE 0 PINK_FLOYD_STATUS_I PK3 1000000 USABLE 2513
We can see that for those table partitions with INDEXING OFF, the associated Local Indexes have simply been made UNUSABLE. Since Unusable Indexes consume no storage, there is effectively no corresponding index segment for these index partitions.
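This can be confirmed by checking for associated index segments (a quick sketch; only the index partition with INDEXING ON should have a segment):

SQL> select segment_name, partition_name, bytes from dba_segments where segment_name = 'PINK_FLOYD_STATUS_I';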
For the one and only PK3 table partition with INDEXING ON, its associated Local Index has been created as normal. So the end result is very similar to the previous Global Index example, only those rows from the table partitions with the INDEXING ON property are effectively being indexed.
There is one scenario in which the creation of a Partial Index is not permitted: the creation of a Unique Index, or of a Non-Unique Index used to police a Primary Key or Unique Key constraint. Some examples:
SQL> create unique index pink_floyd_id_i on pink_floyd(id) indexing partial; create unique index pink_floyd_id_i on pink_floyd(id) indexing partial * ERROR at line 1: ORA-14226: unique index may not be PARTIAL SQL> alter table pink_floyd add constraint pink_floyd_pk primary key(id) using index (create index pink_floyd_id_i on pink_floyd(id) indexing partial); alter table pink_floyd add constraint pink_floyd_pk primary key(id) using index (create index pink_floyd_id_i on pink_floyd(id) indexing partial) * ERROR at line 1: ORA-14196: Specified index cannot be used to enforce the constraint. SQL> create index pink_floyd_id_i on pink_floyd(id) indexing partial; Index created. SQL> alter table pink_floyd add primary key(id); alter table pink_floyd add primary key(id) * ERROR at line 1: ORA-01408: such column list already indexed
It clearly doesn’t make sense to create a Partial Unique Index, or a Partial Non-Unique Index policing a PK or Unique Key constraint, as it would be impossible to use such an index to guarantee the required uniqueness. With index entries missing for the non-indexed partitions, how could Oracle determine whether a value from a new row already exists or not ? It can’t, and hence Oracle doesn’t permit the creation of such a Partial Index.
Partial Indexes can potentially be extremely useful in reducing unnecessary storage requirements, reducing index maintenance overheads and in improving performance by reducing index block accesses.
But they’re only useful (possible) with Partitioned Tables.
I’ll next look at another cool index improvement introduced with the Oracle 12c Database that’s associated with Partitioning, Asynchronous Global Index Maintenance …
12c Partial Indexes For Partitioned Tables Part I (Ignoreland) July 8, 2013
Posted by Richard Foote in 12c, Oracle Indexes, Partial Indexes, Partitioning.14 comments
In my opinion, one of the unsung “Heroes” of the new 12c Oracle database is the set of improvements associated with Partitioning. There are lots of really nice capabilities introduced in the 12c database that significantly improve the manageability of partitioned environments.
One of those new capabilities is the ability to now create both local and (importantly) global indexes on only a subset of partitions within a partitioned table. This provides us with the flexibility to, say, only index those partitions containing data that would make sense to index, or to not index current partitions where perhaps data insert performance is paramount, etc. Additionally and just as importantly, the CBO is aware of the indexing characteristics of individual partitions and can access partitions in differing manners accordingly.
To illustrate, a simple little demo as usual 🙂 Firstly, I’ll create a partitioned table with the new INDEXING clause:
SQL> create table pink_floyd (id number, status varchar2(6), name varchar2(30)) INDEXING OFF partition by range (id) (partition pf1 values less than (1000001), partition pf2 values less than (2000001) INDEXING OFF, partition pf3 values less than (maxvalue) INDEXING ON); Table created.
The INDEXING clause determines whether or not the partition is to be indexed. It can be set at the table level and so set the default behaviour for the table or at the individual partition/subpartition level.
In the above example, I’ve set INDEXING OFF at the table level and so indexing by default is not enabled for the table partitions. Therefore the PF1 partition is not indexed by default. The PF2 partition is explicitly set to also not be indexed, but the PF3 partition is explicitly set (INDEXING ON) to enable indexing and so override the table level default.
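The resulting indexing property of each partition can be confirmed from the data dictionary (a simple sketch):

SQL> select partition_name, indexing from dba_tab_partitions where table_name = 'PINK_FLOYD';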
Let’s now populate the table with some basic data:
SQL> insert into pink_floyd select rownum, 'CLOSED', 'DAVID BOWIE' from dual connect by level <= 3000000; 3000000 rows created. SQL> commit; Commit complete. SQL> update pink_floyd set status = 'OPEN' where id > 2000000 and mod(id,10000)=0; 100 rows updated. SQL> commit; Commit complete.
Most of the data has a STATUS column value of ‘CLOSED’ but I’ve updated a few rows within just the last partition with a STATUS set to ‘OPEN’.
Let’s now create an index on this STATUS column and collect table statistics:
SQL> create index pink_floyd_status_i on pink_floyd(status); Index created. SQL> exec dbms_stats.gather_table_stats(ownname=>user, tabname=>'PINK_FLOYD', estimate_percent=>null, cascade=> true, method_opt=>'FOR ALL COLUMNS SIZE 1 FOR COLUMNS STATUS SIZE 5'); PL/SQL procedure successfully completed. SQL> select index_name, num_rows, leaf_blocks, indexing from dba_indexes where index_name = 'PINK_FLOYD_STATUS_I'; INDEX_NAME NUM_ROWS LEAF_BLOCKS INDEXIN -------------------- ---------- ----------- ------- PINK_FLOYD_STATUS_I 3000000 9203 FULL
By default, an index will include all partitions in a table, regardless of the INDEXING table clause setting. So this index covers all 3M rows in the table and currently has 9203 leaf blocks. The new INDEXING column in DBA_INDEXES shows us that this index is a FULL (non-Partial) index.
We can of course get the data of interest (STATUS = ‘OPEN’) via this index now:
SQL> select * from pink_floyd where status = 'OPEN'; 100 rows selected. Execution Plan ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 964 | 24100 | 4 (0)| 00:00:01 | | | | 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD | 964 | 24100 | 4 (0)| 00:00:01 | ROWID | ROWID | |* 2 | INDEX RANGE SCAN | PINK_FLOYD_STATUS_I | 100 | | 3 (0)| 00:00:01 | | | ---------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("STATUS"='OPEN') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 110 consistent gets 0 physical reads 0 redo size 4479 bytes sent via SQL*Net to client 609 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
However, we can potentially also run a query based on just the last partition as all the ‘OPEN’ statuses of interest only reside in this last partition:
SQL> select * from pink_floyd where status = 'OPEN' and id > 2000001; 100 rows selected. Execution Plan ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 99 | 2475 | 4 (0)| 00:00:01 | | | |* 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD | 99 | 2475 | 4 (0)| 00:00:01 | 3 | 3 | |* 2 | INDEX RANGE SCAN | PINK_FLOYD_STATUS_I | 100 | | 3 (0)| 00:00:01 | | | ---------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("ID">2000001) 2 - access("STATUS"='OPEN') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 110 consistent gets 0 physical reads 0 redo size 2787 bytes sent via SQL*Net to client 609 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
Currently however, the index includes data from all table partitions, even though we’re only really interested in using the index to retrieve the less common ‘OPEN’ status that resides in only the last table partition. With 12c, there is now the capability to only index those partitions that are of interest to us, which with proper design can also be implemented such that only those column values of interest are included within an index.
I’m going to drop and create the index as a “Partial” Index:
SQL> drop index pink_floyd_status_i; Index dropped. SQL> create index pink_floyd_status_i on pink_floyd(status) indexing partial; Index created. SQL> select index_name, num_rows, leaf_blocks, indexing from dba_indexes where index_name = 'PINK_FLOYD_STATUS_I'; INDEX_NAME NUM_ROWS LEAF_BLOCKS INDEXIN -------------------- ---------- ----------- ------- PINK_FLOYD_STATUS_I 1000000 3068 PARTIAL
The new INDEXING PARTIAL clause means only those table partitions with INDEXING ON are to be included within the index.
Notice how the index, which is a Global, Non-Partitioned Index, now only has 1M entries (not all 3M as previously) and with 3068 leaf blocks is only 1/3 of what it was previously. The INDEXING column now denotes this as a “Partial” index.
If we run the query again that only explicitly references the last “active” table partition:
SQL> select * from pink_floyd where status = 'OPEN' and id >= 2000001; 100 rows selected. Execution Plan ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100 | 2500 | 4 (0)| 00:00:01 | | | |* 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD | 100 | 2500 | 4 (0)| 00:00:01 | 3 | 3 | |* 2 | INDEX RANGE SCAN | PINK_FLOYD_STATUS_I | 33 | | 3 (0)| 00:00:01 | | | ---------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("ID">=2000001) 2 - access("STATUS"='OPEN') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 110 consistent gets 0 physical reads 0 redo size 2787 bytes sent via SQL*Net to client 609 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
We see that the index is used as it was previously. Because the ID >= 2000001 predicate means we’re only interested in data that can only reside in the last table partition (the partition with INDEXING ON), the CBO knows the index can be used to retrieve all the rows of interest. If we know the application will only extract data in this manner, all is well with our smaller, Partial index.
However, if it’s possible within the application to perhaps search for STATUS values from other partitions, that have INDEXING OFF:
SQL> select * from pink_floyd where status = 'OPEN'; 100 rows selected. Execution Plan ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop ---------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100 | 2500 | 2474 (1)| 00:00:01 | | | 1 | VIEW | VW_TE_2 | 99 | 3465 | 2474 (1)| 00:00:01 | | | 2 | UNION-ALL | | | | | | | |* 3 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD | 33 | 825 | 4 (0)| 00:00:01 | ROWID | ROWID |* 4 | INDEX RANGE SCAN | PINK_FLOYD_STATUS_I | 100 | | 3 (0)| 00:00:01 | | | 5 | PARTITION RANGE ITERATOR | | 66 | 1650 | 2470 (1)| 00:00:01 | 1 | 2 |* 6 | TABLE ACCESS FULL | PINK_FLOYD | 66 | 1650 | 2470 (1)| 00:00:01 | 1 | 2 ---------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 3 - filter("PINK_FLOYD"."ID">=2000001 OR "PINK_FLOYD"."ID" IS NULL) 4 - access("STATUS"='OPEN') 6 - filter("STATUS"='OPEN') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 16341 consistent gets 8204 physical reads 0 redo size 2787 bytes sent via SQL*Net to client 609 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
We notice the index is still used to efficiently find those rows of interest from the last partition, but a Full Table (Partition) Scan is performed to search for data from the other two partitions which, having INDEXING OFF, have no entries within the index. As a result, this query is now much more expensive than it was previously, as the index cannot be used exclusively to find the rows of interest. Within the one execution plan, the CBO uses the index where it can and a full scan of the other partitions where it can’t.
If however we were a little cleverer in how we designed our table and also created table subpartitions based on the STATUS column so that rows of interest resided in their own subpartitions and only set INDEXING ON for these subpartitions:
SQL> create table pink_floyd (id number, status varchar2(6), name varchar2(30)) indexing off partition by range (id) subpartition by list(status) subpartition template (subpartition closed values ('CLOSED') indexing off, subpartition open values ('OPEN') indexing on) (partition pf1 values less than (1000001), partition pf2 values less than (2000001), partition pf3 values less than (maxvalue)) enable row movement; Table created.
Notice how only the subpartitions with a STATUS of ‘OPEN’ are now to be indexed. If we populate the table with the exact same data as before, we find the table and Partial index have the following characteristics:
SQL> select subpartition_position, subpartition_name, num_rows, indexing from dba_tab_subpartitions where table_name = 'PINK_FLOYD'; SUBPARTITION_POSITION SUBPARTITION_NAME NUM_ROWS IND --------------------- -------------------- ---------- --- 1 PF1_CLOSED 1000000 OFF 2 PF1_OPEN 0 ON 1 PF2_CLOSED 1000000 OFF 2 PF2_OPEN 0 ON 1 PF3_CLOSED 999900 OFF 2 PF3_OPEN 100 ON SQL> select index_name, num_rows, leaf_blocks, indexing from dba_indexes where index_name = 'PINK_FLOYD_STATUS_I'; INDEX_NAME NUM_ROWS LEAF_BLOCKS INDEXIN -------------------- ---------- ----------- ------- PINK_FLOYD_STATUS_I 100 1 PARTIAL
We can see that only the subpartitions with data of interest are now indexed. The resultant Partial global non-partitioned index is now tiny, with just the 100 index entries of interest residing in a single leaf block.
Just as importantly, a query searching for this data across the whole table is now extremely efficient and can be fully serviced by this tiny Partial index:
SQL> select * from pink_floyd where status = 'OPEN'; 100 rows selected. Execution Plan ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop | ---------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 100 | 2500 | 2 (0)| 00:00:01 | | | | 1 | TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| PINK_FLOYD | 100 | 2500 | 2 (0)| 00:00:01 | ROWID | ROWID | |* 2 | INDEX RANGE SCAN | PINK_FLOYD_STATUS_I | 100 | | 1 (0)| 00:00:01 | | | ---------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("STATUS"='OPEN') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 16 consistent gets 0 physical reads 0 redo size 4479 bytes sent via SQL*Net to client 609 bytes received via SQL*Net from client 8 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 100 rows processed
The query can now be fully serviced by the Partial index, as all subpartitions that could contain data of interest are indexed. And because the data of interest is all neatly clustered within its own table subpartitions, it can be retrieved with far fewer consistent gets than previously.
If we ever wanted to access those STATUS values of ‘CLOSED’, the CBO can only do so via a Full Table Scan as such values are not indexed. However, as these values represent the vast majority of rows in the table, the Full Table Scan would be the most appropriate and efficient manner to access these rows anyway.
The new Partial Index capabilities introduced in Oracle 12c enables us to easily use global (and local) indexes to just index data of interest without having to change the application. Such a capability has many potential uses.
More on Partial Indexes to come soon in Part II.
Partition Pruning – Some Examples (Knives Out) September 3, 2008
Posted by Richard Foote in Oracle Cost Based Optimizer, Partitioning.4 comments
Following on from my last post where I looked at the use of partition pruning vs. the use of an index, I thought it might be worthwhile to just go through a few additional basic examples of where partition pruning can come into play.
I’ll also include a little example of how execution plans may not favour indexes as much with partitioned tables as the resultant cost of a Full Table Scan can be reduced significantly due to partition pruning.
All the examples use exactly the same tables as created in the previous post. For a full listing of the examples and resultant executions plans, look at this Partition Pruning Demo. You may want to open up the demo in a different browser window to make it easier to follow.
In Example 1:
SQL> select * from big_album_sales where release_date between '01-JUN-2003' and '13-APR-2004';
the CBO can determine that all the data can only belong in Partitions 3 and 4 (as the only years that need to be referenced are 2003 and 2004) and so does not have to visit any of the other partitions. The PARTITION RANGE ITERATOR step clearly shows that it only has to visit the partitions in the range 3 – 4.
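If you want to reproduce these plans yourself rather than follow the demo, one option (just a sketch) is to run the query and then display the last executed plan with DBMS_XPLAN, which includes the Pstart/Pstop columns for partitioned access:

SQL> set serveroutput off
SQL> select * from big_album_sales where release_date between '01-JUN-2003' and '13-APR-2004';
SQL> select * from table(dbms_xplan.display_cursor(format => 'TYPICAL'));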
In Example 2:
SQL> select * from big_album_sales where release_date > '01-JUN-2006';
All the possible values must exist from partition 6, which stores all the 2006 data, onwards through to the last partition in the table, which is partition 8. The PARTITION RANGE ITERATOR step clearly shows that it only has to visit the partitions in the range 6 – 8.
In Example 3:
SQL> select * from big_album_sales where release_date between '01-JUN-2006' and '30-NOV-2006' or release_date > '14-JUN-2008';
The only possible partitions the data can reside in are partition 6 or partition 8 (for the years 2006 or 2008). In this case, the CBO performs the PARTITION RANGE OR step which visits only these specific partitions.
In Example 4:
SQL> select * from big_album_sales where release_date is null;
All NULL values must reside in the last partition, as the table was created with the last partition having a MAXVALUE boundary. That being the case, the CBO need only visit the last partition (8) within the table to find all possible NULL values of RELEASE_DATE.
Because partition pruning can make a so-called Full Table Scan much more efficient, as only those partitions that can possibly contain the data need be accessed, in some rare cases this can make the CBO appear to ignore reasonable indexes that would be used in an equivalent non-partitioned table.
Note that the data includes a series of RELEASE_DATE values but many of the dates actually have the same timestamp component as all the row values were determined and loaded in a very quick manner. As a result, the CBO considers there are actually 718 rows per distinct occurrence of a RELEASE_DATE value. This fact is significant when costing an equality predicate.
In Example 5:
SQL> select * from big_album_sales where release_date = '01-JUN-2006';
we’re only interested in the one specific date value but Oracle still favours the Full Table Scan (FTS). Note this value can only possibly exist in partition 6 associated with data from 2006 and so the FTS need only actually access partition 6. Oracle assumes it will retrieve approximately 718 rows which is too costly via the index as partition pruning can make the FTS relatively efficient. Note that the actual cost of the FTS is 303.
In Example 6:
SQL> select /*+ index(b) */ * from big_album_sales b where release_date = '01-JUN-2006';
a hint is used to force the use of the associated index on the RELEASE_DATE column. Note that in this example it’s actually the much cheaper option (there are far fewer consistent gets) as no rows are actually being retrieved, but its cost of 722 is significantly greater than that of the FTS (303) as Oracle is expecting to retrieve and access 718 rows.
In Example 7:
SQL> select * from big_album_sales2 where release_date = '01-JUN-2006';
we run exactly the same statement but this time on the non-partitioned table and notice that the execution plan is totally different. Oracle is indeed using the index because in this example the FTS is so much more expensive as it needs to read the entire table and not just the one partition (or approximately 1/8 of the table) as it does with the partitioned table example.
Lots more on partitioning to come in the future …
Regular Index vs. Partitions (The Best Of Both Worlds) August 28, 2008
Posted by Richard Foote in Oracle Cost Based Optimizer, Oracle Indexes, Partitioning.20 comments
When asking for requests and suggestions on what I could cover in the blog, Brian Tkatch asked “When is a partition better than just a regular index ?”
As it’s a very good question, I thought I might spend a few moments attempting to provide some kind of meaningful answer.
The first point I would make however, is that indexes and partitions are not at all mutually exclusive. A table of course could very well benefit from being both partitioned and indexed. Most partitioned tables have indexes and indeed indexes can in turn be partitioned.
The second point I would make is that there are two key advantages or reasons why it might be beneficial to partition a segment. They are:
- Better Manageability
- Better Performance
This post can be viewed as being but an introduction to the potential performance benefits of partitioned segments. I will leave the topic of better manageability for another time.
The point of Brian’s question is when can the performance benefits available to partitioned segments out perform and be a better option than that of a possible associated index.
As I’ve discussed a number of times, there are no magic numbers or ratios as to when an index is the most appropriate and cost effective access path. In fact, the link shows a theoretical example of when a Full Table Scan (FTS) is by far the cheaper option when selecting just 1% of data. Generally speaking though, the higher the percentage of returned rows, the more likely a FTS is going to be the more efficient option.
However, a Full Table Scan is just that, a FULL Table Scan. Oracle needs to read the entire table, each and every block below the segment High Water Mark. So if a query needed to retrieve just 5% of the data, it’s quite likely it would perform a FTS because it’s possibly a cheaper alternative than using an associated index (the above link explains why), but it would need to access 100% of the table to do so. That means it would effectively access the other 95% of the data and simply throw those rows away, just so it can access the 5% of data it actually requires more efficiently than is possible via an index access path.
That’s potentially (say) 95% of the work and resources for no real reason other than Oracle has no choice but to access all the data when performing a FTS. That’s potentially 95% of wasted resources, 95% of wasted effort. Yes, a FTS might be more efficient than using an index, but unnecessarily accessing say 95% of the data sounds rather costly nonetheless.
Wouldn’t it be nice if somehow we could get the benefits of performing a FTS (multiblock reads, reading a specific block just the once, etc.) but only read the say 5% of the data that we were actually interested in (or something significantly less than the 100%)? Somehow combine the benefits of a FTS with the benefits provided by an index, that being only needing to access data that is of direct interest.
Enter Partitioning.
Partitioning provides Oracle with another level of granularity (or levels of granularity with composite partitioning) when accessing a table. It's no longer a question of using an index or reading the entire table. Oracle can now either use an index, access the entire table or, with partitioning, access just those partitions (or bits) of the table that could possibly contain the data of interest.
By partitioning a table, Oracle knows that data belonging to the partitioning keys must belong in a specific partition. The logical table is effectively broken down into multiple physical segments with each partition segment containing only data associated with the partitioning keys or columns.
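As a purely illustrative sketch (the table, columns and partition boundaries below are hypothetical and not from any of the demos above), a Range partitioned table might look something like this, with each row stored in the one partition determined by its RELEASE_DATE value:

-- Hypothetical table, Range partitioned on RELEASE_DATE
CREATE TABLE album_sales_demo (
  id           NUMBER,
  release_date DATE,
  total_sales  NUMBER
)
PARTITION BY RANGE (release_date) (
  PARTITION p2006 VALUES LESS THAN (TO_DATE('01-JAN-2007','DD-MON-YYYY')),
  PARTITION p2007 VALUES LESS THAN (TO_DATE('01-JAN-2008','DD-MON-YYYY')),
  PARTITION p2008 VALUES LESS THAN (MAXVALUE)
);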
If the say 5% of data that's of interest logically resides in just one partition (or some small number of partitions), then there's no need for Oracle to access the data in all the other partitions containing the other say 95% of unwanted data. The Cost Based Optimizer (CBO) can automatically take advantage of "partition pruning" and only needs to access the partition or partitions that could possibly contain the data of interest.
Oracle can now potentially perform a so-called “FTS” of the table, but only actually accesses and reads the 5% (or significantly less than the 100%) of data that’s potentially of direct interest by accessing just those partitions it needs to.
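Continuing the hypothetical example above, a query with an equality predicate on the partitioning key lets the CBO prune down to a single partition; the plan described in the comments is indicative only of the sort of output you might see:

SELECT * FROM album_sales_demo
WHERE  release_date = TO_DATE('01-JUN-2007','DD-MON-YYYY');

-- Indicative plan: a "full scan" of just the one candidate partition
-- (PARTITION RANGE SINGLE), with Pstart/Pstop identifying partition P2007,
-- rather than a scan of the entire table or an index range scan.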
At a simplistic level, indexes are best when accessing a relatively “small” amount of data. A FTS is best when accessing a relatively “large” amount of the data and partitions can be extremely useful and beneficial when accessing an amount of data somewhere between the two. As such, partitioning is more likely to be beneficial in a Data Warehouse or Reporting environment where accessing a relatively high percentage of data is common.
This demo of indexes vs. a FTS vs. partition pruning shows how a FTS outperforms an index when accessing approximately 1/8 of the data in a table, but an appropriately partitioned table outperforms both options.
More on partitioning in the future.
Local Index Issue With Partitioned PK and Unique Key Constraints December 20, 2007
Posted by Richard Foote in Constraints, Index Access Path, Local Indexes, Oracle Indexes, Partitioning, Performance Tuning, Unique Indexes.12 comments
Nuno Souto (Noons) also asked a really interesting question on my Differences between Unique and Non-Unique Indexes blog entry (comment 4) that I thought worthy of a separate blog entry to do the answer justice. The question was:
“Isn’t it still the case that unique indexes cannot be locally partitioned unless the partition key is part of the index key? Not sure if 11g removes this. If still so, that would weigh heavily in favour of non-unique indexing for PK on a table potentially requiring local index partitions.”
Simplistically, the answer to the first part is Yes, it is still the case, even in 11g, and the answer to the second part is No, it wouldn't weigh heavily in favour of non-unique indexing for a PK on a table potentially requiring local index partitions. It wouldn't actually be a consideration at all.
Let me explain why.
Firstly, there is a really, really good reason why Oracle doesn't allow us to create a Unique Local index in which the partition key is not part of the index key. It's called protecting us from ourselves !!
Let’s start by mentioning constraints again.
Remember, the main reason we have indexes policing PK and Unique constraints is so that Oracle can very quickly and efficiently determine whether or not a new value already exists. Do a quick index look-up, is the value there, yes or no, allow the insert (or update), yes or no.
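A tiny, purely hypothetical example of that policing in action (table and values invented for illustration):

CREATE TABLE pk_demo (
  id   NUMBER CONSTRAINT pk_demo_pk PRIMARY KEY,  -- policed by a unique index
  name VARCHAR2(30)
);

INSERT INTO pk_demo VALUES (1, 'BOWIE');
INSERT INTO pk_demo VALUES (1, 'ZIGGY');
-- The second insert fails with an ORA-00001 unique constraint violation,
-- detected via a quick look-up of the single index policing the constraint.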
Just imagine for one moment what would happen if Oracle actually allowed us to create a Unique Local index in which the index didn’t include the partitioned column(s).
Let's say a table is Range Partitioned on column 'A' and we try to create a Unique Local index on just column 'B'. Let's assume we have (say) 500 table partitions, meaning we must therefore have 500 local index partitions as well. When we insert a new value of column B, Oracle will attempt to do so in the corresponding local index partition, as governed by the value of column A for the new row. However, Oracle can't just check this one index partition for uniqueness to ensure the value of column B doesn't already exist; Oracle would need to check all 500 index partitions, because our new value of column B could potentially have previously been inserted into any of the other 499 partitions !!
Each and every insert into our partitioned table (partitioned by column A) would therefore require Oracle to check all (say) 500 index partitions each and every time to check for duplicates of column B. Again, it's important to understand that any given value of column B could potentially be in any of the 500 partitions, IF Oracle allowed us to create a Local Partitioned Index on just column B.
Checking all 500 index partitions looking for a specific value of column B would obviously be impractical, inefficient and totally un-scalable. Therefore Oracle doesn't allow us to do this. It doesn't allow us to create a Unique Local index in which the indexed columns don't include the partitioning columns as well.
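A minimal sketch of Oracle protecting us from ourselves (table and column names are hypothetical): attempting to create a Unique Local index on just column B of a table Range partitioned on column A is simply rejected.

CREATE TABLE part_demo (a DATE, b NUMBER, c VARCHAR2(30))
PARTITION BY RANGE (a) (
  PARTITION p1 VALUES LESS THAN (TO_DATE('01-JAN-2008','DD-MON-YYYY')),
  PARTITION p2 VALUES LESS THAN (MAXVALUE)
);

CREATE UNIQUE INDEX part_demo_b_i ON part_demo(b) LOCAL;
-- Fails with ORA-14039: partitioning columns must form a subset of key
-- columns of a UNIQUE index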
This is actually a good thing.
If you want to create a Unique index on a partitioned table, you MUST either include all the partitioning columns and make them part of the LOCAL unique index (that way each and every insert only has to check the one local index partition, as the relevant partition is known because the partitioning columns are part of the index key), or you must create it as a GLOBAL index (in which case, again, Oracle only has to check the one index structure).
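Continuing the hypothetical sketch, both permitted alternatives would look something like this:

-- Option 1: include the partitioning column so the unique index can be LOCAL
CREATE UNIQUE INDEX part_demo_ab_i ON part_demo(a, b) LOCAL;

-- Option 2 (instead of Option 1): keep the index on just column B,
-- but as a single, non-partitioned GLOBAL structure
CREATE UNIQUE INDEX part_demo_b_i ON part_demo(b);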
It actually makes a lot of sense to do this.
Moving onto the second part of the question. Let’s just use a Local Non-Unique index to police our PK constraints then.
Fortunately this isn’t allowed either for exactly the same reasons. You can’t create a Local Non-unique index to police a PK (or Unique) constraint if the Constraint does not also include the partitioned columns. Otherwise again, Oracle would need to check each and every index partition to determine whether the constraint has been violated or not.
If you attempt to use an existing Local Non-Unique index to police a PK or Unique constraint that does not contain the partitioning columns, you will get an error saying the (by default Global) index can't be created, because the Local Non-Unique index (useless from a constraint-policing point of view) already exists.
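Again a hedged sketch, reusing the hypothetical PART_DEMO table (and assuming any unique indexes from the previous sketch have been dropped); the exact error text may vary by version:

CREATE INDEX part_demo_b_i ON part_demo(b) LOCAL;   -- Non-Unique, so this is allowed

ALTER TABLE part_demo ADD CONSTRAINT part_demo_pk PRIMARY KEY (b);
-- Fails (typically with ORA-01408: such column list already indexed):
-- the Local Non-Unique index can't police the constraint, and its existence
-- also prevents Oracle from creating the Global index it would build by default.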
Again, if you want to create a Non-Unique index to police a PK or Unique constraint, you must either ensure the constraint includes all the partitioning columns (in which case the index can be Local) or you must use a Global Non-Unique index.
In other words, the rules apply equally to both Unique and Non-Unique indexes.
So it’s not really a case of Oracle not allowing one to create a Local Unique index without including the partitioned columns (although that’s of course true) but really a case of Oracle not allowing a PK or Unique *constraint* to be policed via *any* Local index (whether Unique or Non-Unique), unless the partitioned columns are also included.
Little demo to illustrate: Local Index Issue With Partitioned PK and Unique Key Constraints