Oracle 12c: Indexing JSON in the Database Part II (Find A Little Wood) August 5, 2016
Posted by Richard Foote in 12c, JSON, JSON Text Index, Oracle Indexes.
In Part I, we looked at how you can now store JSON documents within the Oracle 12c Database. For efficient access to JSON documents stored in the Oracle database, we can create a function-based index based on either the JSON_VALUE function or JSON dot notation.
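As a quick recap of the Part I options (a sketch only; the index name and the PONumber attribute used here are illustrative, not from Part I itself):

```sql
-- Function-based index on one specific JSON attribute via JSON_VALUE
create index ziggy_json_po_idx on ziggy_json
  (json_value(ziggy_order, '$.PONumber' RETURNING NUMBER));

-- The same attribute can be referenced with the simplified
-- dot notation in queries (a table alias is required):
select * from ziggy_json z where z.ziggy_order.PONumber = 1000001;
```

Both approaches index or access a single JSON attribute at a time, which motivates the composite-index discussion below.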
These indexes are useful for indexing specific JSON attributes, but what if we want to index multiple JSON attributes within a single index structure?
To start, I’m just going to add an extra row to increase the selectivity of other columns.
SQL> insert into ziggy_json
     select
     100001,
     sysdate,
     '{"PONumber"             : 1000001,
       "Reference"            : "MTOM-20161",
       "Requestor"            : "Major Tom",
       "User"                 : "MTOM",
       "CostCenter"           : "B42",
       "ShippingInstructions" : {"name" : "Major Tom",
           "Address" : {"street"  : "42 Ziggy Street",
                        "city"    : "Canberra",
                        "state"   : "ACT",
                        "zipCode" : 2601,
                        "country" : "Australia"},
           "Phone" : [{"type" : "Office", "number" : "417-555-7777"},
                      {"type" : "Mobile", "number" : "417-555-1234"}]},
       "Special Instructions" : null,
       "AllowPartialShipment" : true,
       "LineItems" : [{"ItemNumber" : 1,
                       "Part" : {"Description" : "Hunky Dory",
                                 "UnitPrice"   : 10.95},
                       "Quantity" : 5.0},
                      {"ItemNumber" : 2,
                       "Part" : {"Description" : "Pin-Ups",
                                 "UnitPrice"   : 10.95},
                       "Quantity" : 3.0}]}'
     from dual;

1 row created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.
We can still create composite indexes based on the JSON_VALUE function as we can with conventional columns:
SQL> create index ziggy_json_idx3 on ziggy_json(
       json_value(ziggy_order, '$.User' RETURNING VARCHAR2(20)),
       json_value(ziggy_order, '$.CostCenter' RETURNING VARCHAR2(6)));

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.
If we now run a query with a couple of JSON_VALUE based predicates:
SQL> select * from ziggy_json
     where json_value(ziggy_order, '$.User' RETURNING VARCHAR2(20)) = 'MTOM'
     and json_value(ziggy_order, '$.CostCenter' RETURNING VARCHAR2(6)) = 'B42';

Execution Plan
----------------------------------------------------------
Plan hash value: 3402615542

-------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                 |     1 |  1533 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY_JSON      |     1 |  1533 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY_JSON_IDX3 |     1 |       |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access(JSON_VALUE("ZIGGY_ORDER" FORMAT JSON , '$.User' RETURNING
       VARCHAR2(20) NULL ON ERROR)='MTOM' AND JSON_VALUE("ZIGGY_ORDER" FORMAT
       JSON , '$.CostCenter' RETURNING VARCHAR2(6) NULL ON ERROR)='B42')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
       1248  bytes sent via SQL*Net to client
        820  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
The composite index is effectively used by the CBO as expected.
It does, though, make our SQL a little cumbersome to write. To simplify things a tad, we can instead create a couple of virtual columns based on these JSON_VALUE expressions, create the composite index on the virtual columns and simplify the SQL accordingly.
First, we create the virtual columns (note they’re virtual columns and so consume no storage):
SQL> ALTER TABLE ziggy_json ADD (userid VARCHAR2(20)
     GENERATED ALWAYS AS (json_value(ziggy_order, '$.User' RETURNING VARCHAR2(20))));

Table altered.

SQL> ALTER TABLE ziggy_json ADD (costcenter VARCHAR2(6)
     GENERATED ALWAYS AS (json_value(ziggy_order, '$.CostCenter' RETURNING VARCHAR2(6))));

Table altered.
Next, create the index based on these newly created virtual columns:
SQL> CREATE INDEX ziggy_user_costctr_idx on ziggy_json(userid, costcenter);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.
And then write a simplified version of the SQL to reference the virtual columns:
SQL> select * from ziggy_json where userid='MTOM' and costcenter='B42';

        ID ZIGGY_DAT
---------- ---------
ZIGGY_ORDER
--------------------------------------------------------------------------------
USERID               COSTCE
-------------------- ------
    100001 24-JUN-16
{"PONumber" : 1000001,
  "Reference" : "MTOM-20161",
MTOM                 B42

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 5717455

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                        |     1 |  1535 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY_JSON             |     1 |  1535 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY_USER_COSTCTR_IDX |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("USERID"='MTOM' AND "COSTCENTER"='B42')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
       1396  bytes sent via SQL*Net to client
        820  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
The index is again used as expected.
Of course, if we still want to reference the JSON functions directly within the SQL, the query can still be written as previously:
SQL> select * from ziggy_json
     where json_value(ziggy_order, '$.User' returning varchar2(20)) = 'MTOM'
     and json_value(ziggy_order, '$.CostCenter' returning varchar2(6)) = 'B42';

        ID ZIGGY_DAT
---------- ---------
ZIGGY_ORDER
--------------------------------------------------------------------------------
USERID               COSTCE
-------------------- ------
    100001 24-JUN-16
{"PONumber" : 1000001,
  "Reference" : "MTOM-20161",
MTOM                 B42

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 5717455

--------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                        |     1 |  1535 |     4   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID BATCHED| ZIGGY_JSON             |     1 |  1535 |     4   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN                  | ZIGGY_USER_COSTCTR_IDX |     1 |       |     3   (0)| 00:00:01 |
--------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("ZIGGY_JSON"."USERID"='MTOM' AND "ZIGGY_JSON"."COSTCENTER"='B42')

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          5  consistent gets
          0  physical reads
          0  redo size
       1396  bytes sent via SQL*Net to client
        820  bytes received via SQL*Net from client
          4  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
The index is again used as expected.
I’ll next look at using a JSON text-based index to effectively index the entire JSON document.
I never regret the day I decided to follow you… your blog is amazing
Thanks florin 🙂
I just wish I had more free time to write more content more often.
The use of virtual columns when working with JSON is a great tip.
Thank you for the series, great stuff. JSON is everywhere, and growing, we better get comfortable.
Thanks rimblas.
Yes, JSON is indeed everywhere, so it might as well be stored in an Oracle database so you can also make use of all the other enterprise capabilities of the database such as DR, HA, indexing, etc.