jump to navigation

Oracle 12c: Indexing JSON in the Database Part III (Paperback Writer) September 2, 2016

Posted by Richard Foote in 12c, JSON, JSON Text Index, Oracle Indexes.
2 comments

In Part I and Part II, we looked at how to index specific attributes within a JSON document store within an Oracle 12c database.

But what if we’re not sure which specific attributes might benefit from an index or indeed, as JSON is by it’s nature a schema-less way to store data, what if we’re not entirely sure what attributes might be present currently or in the future.

On a JSON document store within the Oracle Database, you can create a special JSON aware Text Index that can automatically index any field/attribute within a JSON document and use a Text based function to then search efficiently for data from any attribute.

Using the same table created in Part I, you can create a JSON Text index as follows:

SQL> CREATE INDEX ziggy_search_idx ON ziggy_json (ziggy_order)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS ('section group CTXSYS.JSON_SECTION_GROUP SYNC (ON COMMIT)');
Index created.

Note this Text index is (optionally) defined to be automatically synchronised when data in the ZIGGY_JSON table is committed.

We can use the JSON_TEXTCONTAINS Oracle Text function to efficiently access data for any data within the JSON defined column. For example:

SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.Reference', 'DBOWIE-201642');

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 3069169778

------------------------------------------------------------------------------------------------
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                  |    1 |  1534 |    2550 (0) |  00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON       |    1 |  1534 |    2550 (0) |  00:00:01 |
|* 2 | DOMAIN INDEX                | ZIGGY_SEARCH_IDX |      |       |    2549 (0) |  00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{DBOWIE-201642}
 INPATH(/Reference)')>0)

Statistics
----------------------------------------------------------
  65 recursive calls
   0 db block gets
 118 consistent gets
   0 physical reads
   0 redo size
1863 bytes sent via SQL*Net to client
1088 bytes received via SQL*Net from client
   6 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   2 rows processed

Note the JSON-based Text index is used to retrieve data efficiently.

The Text index can also be used to search data efficiently from within an array set:

SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.LineItems.Part.Description', 'Low');

no rows selected

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3069169778

------------------------------------------------------------------------------------------------
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                  |    1 |  1534 |    5927 (0) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON       |    1 |  1534 |    5927 (0) | 00:00:01 |
|* 2 | DOMAIN INDEX                | ZIGGY_SEARCH_IDX |      |       |    5927 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{Low}
 INPATH(/LineItems/Part/Description)')>0)

Statistics
----------------------------------------------------------
132 recursive calls
  0 db block gets
182 consistent gets
  0 physical reads
  0 redo size
627 bytes sent via SQL*Net to client
552 bytes received via SQL*Net from client
  2 SQL*Net roundtrips to/from client
  0 sorts (memory)
  0 sorts (disk)
  0 rows processed

We can also search for a specific data value across any attribute within the JSON document store:

SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$', '4242');

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 3069169778

------------------------------------------------------------------------------------------------
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                  |    1 |  1534 |       2 (0) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON       |    1 |  1534 |       2 (0) | 00:00:01 |
|* 2 | DOMAIN INDEX                | ZIGGY_SEARCH_IDX |      |       |       1 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{4242}')>0)
     
Statistics
----------------------------------------------------------
  10 recursive calls
   0 db block gets
  32 consistent gets
   0 physical reads
   0 redo size
1865 bytes sent via SQL*Net to client
1088 bytes received via SQL*Net from client
   6 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   2 rows processed

Let’s now add more data to the JSON column, but this time introducing a few new attributes (such as AlbumName):

SQL> insert into ziggy_json
 2 select
 3 rownum,
 4 SYSdate,
 5 '{"AlbumId" : ' || rownum || ',
 6 "AlbumName" : "HUNKY DORY",
 7 "ArtistName" : "David Bowie"}'
 8 from dual connect by level <= 10;
10 rows created.

SQL> commit;

Commit complete.

As the JSON-based Text index was defined to be automatically synchronised when we commit data in the table, these new attributes can be immediately searched and accessed via the index:

SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.AlbumName', 'HUNKY DORY');

10 rows selected.

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 3069169778

------------------------------------------------------------------------------------------------
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                  |  198 |  296K |    1948 (0) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON       |  198 |  296K |    1948 (0) | 00:00:01 |
|* 2 | DOMAIN INDEX                | ZIGGY_SEARCH_IDX |      |       |    1780 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{HUNKY DORY}
 INPATH(/AlbumName)')>0)
     
Statistics
----------------------------------------------------------
  48 recursive calls
   0 db block gets
 103 consistent gets
   0 physical reads
   0 redo size
6751 bytes sent via SQL*Net to client
3232 bytes received via SQL*Net from client
  22 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
  10 rows processed

Not only can JSON data be stored within the Oracle 12c database, but we have a number of index strategies available to search such data efficiently.

Advertisements

Oracle 12c: Indexing JSON in the Database Part II (Find A Little Wood) August 5, 2016

Posted by Richard Foote in 12c, JSON, JSON Text Index, Oracle Indexes.
4 comments

In Part I, we looked at how you can now store JSON documents within the Oracle 12c Database. For efficient accesses to JSON documents stored in the Oracle database, we can either create a function-based index based on the JSON_VALUE function or on JSON .dot notation.

These indexes are useful for indexing specific JSON attributes, but what if we want to index multiple JSON attributes within a single index structure.

To start, I’m just going to add an extra row to increase the selectivity of other columns.

SQL> insert into ziggy_json
2 select
3 100001,
4 SYSdate,
5 '{"PONumber" : 1000001,
6 "Reference" : "MTOM-20161",
7 "Requestor" : "Major Tom",
8 "User" : "MTOM",
9 "CostCenter" : "B42",
10 "ShippingInstructions" : {"name" : "Major Tom",
11 "Address": {"street" : "42 Ziggy Street",
12 "city" : "Canberra",
13 "state" : "ACT",
14 "zipCode" : 2601,
15 "country" : "Australia"},
16 "Phone" : [{"type" : "Office", "number" : "417-555-7777"},
17 {"type" : "Mobile", "number" : "417-555-1234"}]},
18 "Special Instructions" : null,
19 "AllowPartialShipment" : true,
20 "LineItems" : [{"ItemNumber" : 1,
21 "Part" : {"Description" : "Hunky Dory",
22 "UnitPrice" : 10.95},
23 "Quantity" : 5.0},
24 {"ItemNumber" : 2,
25 "Part" : {"Description" : "Pin-Ups",
26 "UnitPrice" : 10.95},
27 "Quantity" : 3.0}]}'
28 from dual;

1 row created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.

We can still create composite indexes based on the JSON_VALUE function as we can with conventional columns:

SQL> create index ziggy_json_idx3 on 
     ziggy_json(json_value(ziggy_order, '$.User' RETURNING VARCHAR2(20)), 
                json_value(ziggy_order, '$.CostCenter' RETURNING VARCHAR2(6)));

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.

If we now run a query with a couple of JSON_VALUE based predicates:

SQL> select * from ziggy_json
     where json_value(ziggy_order, '$.User' RETURNING VARCHAR2(20))='MTOM' and 
           json_value(ziggy_order, '$.CostCenter' RETURNING VARCHAR2(6)) = 'B42';

Execution Plan
----------------------------------------------------------
Plan hash value: 3402615542

-------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name            | Rows | Bytes | Cost (%CPU) | Time     |
-------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                 |    1 |  1533 |       4 (0) | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_JSON      |    1 |  1533 |       4 (0) | 00:00:01 |
|*2  | INDEX RANGE SCAN                    | ZIGGY_JSON_IDX3 |    1 |       |       3 (0) | 00:00:01 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(JSON_VALUE("ZIGGY_ORDER" FORMAT JSON , '$.User' RETURNING VARCHAR2(20) NULL ON
ERROR)='MTOM' AND JSON_VALUE("ZIGGY_ORDER" FORMAT JSON , '$.CostCenter' RETURNING VARCHAR2(6) NULL ON ERROR)='B42')
 
Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
   5 consistent gets
   0 physical reads
   0 redo size
1248 bytes sent via SQL*Net to client
 820 bytes received via SQL*Net from client
   4 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   1 rows processed

The composite index is effectively used by the CBO as expected.

It does though make our SQL a little cumbersome to write. To simplify things a tad, we could create a couple of virtual columns based on these functions, create the JSON function-based indexes on these virtual columns and simplify the SQL accordingly.

First, we create the virtual columns (note they’re virtual columns and so consume no storage):

 

SQL> ALTER TABLE ziggy_json ADD (userid VARCHAR2(20)
2 GENERATED ALWAYS AS (json_value(ziggy_order, '$.User' RETURNING VARCHAR2(20))));

Table altered.

SQL> ALTER TABLE ziggy_json ADD (costcenter VARCHAR2(6)
2 GENERATED ALWAYS AS (json_value(ziggy_order, '$.CostCenter' RETURNING VARCHAR2(6))));

Table altered.

 

Next, create the index based on these newly created virtual columns:

SQL> CREATE INDEX ziggy_user_costctr_idx on ziggy_json(userid, costcenter);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.

And then write a simplified version of the SQL to reference the virtual columns:

SQL> select * from ziggy_json where userid='MTOM' and costcenter='B42';

ID ZIGGY_DAT
---------- ---------
ZIGGY_ORDER
--------------------------------------------------------------------------------

USERID COSTCE
-------------------- ------
100001 24-JUN-16
{"PONumber" : 1000001,
"Reference" : "MTOM-20161",
MTOM B42
  
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 5717455

--------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                   | Rows | Bytes | Cost (%CPU) | Time     |
--------------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                        |    1 |  1535 |       4 (0) | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_JSON             |    1 |  1535 |       4 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | ZIGGY_USER_COSTCTR_IDX |    1 |       |       3 (0) | 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("USERID"='MTOM' AND "COSTCENTER"='B42')
   
Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
   5 consistent gets
   0 physical reads
   0 redo size
1396 bytes sent via SQL*Net to client
 820 bytes received via SQL*Net from client
   4 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   1 rows processed

The index is again used as expected.

Of course, if we still want to reference the JSON functions directly within the SQL, the query can still be written as previously:

SQL> select * from ziggy_json 
     where json_value(ziggy_order, '$.User' returning varchar2(20))='MTOM' and
           json_value(ziggy_order, '$.CostCenter' returning varchar2(6))='B42';

ID ZIGGY_DAT
---------- ---------
ZIGGY_ORDER
--------------------------------------------------------------------------------

USERID COSTCE
-------------------- ------
100001 24-JUN-16
{"PONumber" : 1000001,
"Reference" : "MTOM-20161",
MTOM B42
   
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 5717455

--------------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name                   | Rows | Bytes | Cost (%CPU) | Time     |
--------------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                        |    1 |  1535 |       4 (0) | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_JSON             |    1 |  1535 |       4 (0) | 00:00:01 |
|*2  | INDEX RANGE SCAN                    | ZIGGY_USER_COSTCTR_IDX |    1 |       |       3 (0) | 00:00:01 |
--------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("ZIGGY_JSON"."USERID"='MTOM' AND "ZIGGY_JSON"."COSTCENTER"='B42')

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
   5 consistent gets
   0 physical reads
   0 redo size
1396 bytes sent via SQL*Net to client
 820 bytes received via SQL*Net from client
   4 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   1 rows processed

The index is again used as expected.

I’ll next look at using a JSON Text based index to effectively index the entire JSON document.

Oracle 12c: Indexing JSON In The Database Part I (Lazarus) August 2, 2016

Posted by Richard Foote in 12c, JSON, Oracle Indexes.
add a comment

One of the very cool new features introduced in Oracle Database 12c Rel 1 is the ability to store JavaScript Object Notation (JSON) documents within the database. Unlike XML which has its own data type, JSON data can be stored as VARCHAR2, CLOB or BLOB data types, but with a JSON check constraint to ensure the stored parsed document meets JSON document standards.

This enables ‘No-SQL’ schema-less type development within the Oracle Database  for “next generation” applications. Although developers can work with the JSON store without using SQL (via say RESTful APIs directly within the database), you can still leverage the power of SQL for reporting and analytics type purposes. Of course, reading JSON data efficiently then becomes important, and that’s where indexing the JSON document store kicks in.

Let’s look at a simple example.

Firstly, let’s create a table with a column called ZIGGY_ORDER which stores JSON documents.

SQL> CREATE TABLE ziggy_json  (id number,  ziggy_date date,  ziggy_order CLOB  CONSTRAINT ensure_ziggy_json CHECK (ziggy_order IS JSON));</pre>
Table created.

The ZIGGY_ORDER column has a JSON check constraint which ensures only valid JSON documents can be stored. An attempt to insert a row with invalid JSON data will fail:

SQL> insert into ziggy_json values (1, sysdate, '{"This is not legal JSON"}');
insert into ziggy_json values (1, sysdate, '{"This is not legal JSON"}')
*
ERROR at line 1:
ORA-02290: check constraint (BOWIE.ENSURE_ZIGGY_JSON) violated

Let’s insert some JSON data into the table. Note I’m using ROWNUM to insert some almost unique JSON data and then re-inserting the data again to get data worth accessing via an index:

SQL> insert into ziggy_json
2 select
3 rownum,
4 SYSdate,
5 '{"PONumber" : ' || rownum || ',
6 "Reference" : "DBOWIE-2016' || rownum || '",
7 "Requestor" : "David Bowie",
8 "User" : "DBOWIE",
9 "CostCenter" : "A42",
10 "ShippingInstructions" : {"name" : "David Bowie",
11 "Address": {"street" : "42 Ziggy Street",
12 "city" : "Canberra",
13 "state" : "ACT",
14 "zipCode" : 2601,
15 "country" : "Australia"},
16 "Phone" : [{"type" : "Office", "number" : "417-555-7777"},
17 {"type" : "Mobile", "number" : "417-555-1234"}]},
18 "Special Instructions" : null,
19 "AllowPartialShipment" : true,
20 "LineItems" : [{"ItemNumber" : 1,
21 "Part" : {"Description" : "Hunky Dory",
22 "UnitPrice" : 10.95},
23 "Quantity" : 5.0},
24 {"ItemNumber" : 2,
25 "Part" : {"Description" : "Pin-Ups",
26 "UnitPrice" : 10.95},
27 "Quantity" : 3.0}]}'
28 from dual connect by level <= 1000000;

1000000 rows created.

SQL> insert into ziggy_json select * from ziggy_json;

1000000 rows created.

SQL> commit;

Commit complete.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.

There are a number of ways we can reference and access data from within JSON. One method is use .dot notation to access specific JSON data elements of interest:

SQL> SELECT z.ziggy_order.PONumber FROM ziggy_json z where z.ziggy_order.PONumber=42;

PONUMBER
--------------------------------------------------------------------------------

42
42

We can also use the JSON_VALUE function to access specific JSON data of interest:

SQL> select json_value(ziggy_order, '$.Reference') from ziggy_json  
     where json_value(ziggy_order, '$.PONumber' returning number)=42;

JSON_VALUE(ZIGGY_ORDER,'$.REFERENCE')
--------------------------------------------------------------------------------

DBOWIE-201642
DBOWIE-201642

Without an index, the CBO has no choice but to use an expensive Full Table Scan:

SQL> SELECT z.ziggy_order.PONumber FROM ziggy_json z where z.ziggy_order.PONumber=42;

Elapsed: 00:00:34.04

Execution Plan
----------------------------------------------------------
Plan hash value: 1413303849

--------------------------------------------------------------------------------
| Id | Operation         | Name       | Rows  | Bytes | Cost (%CPU) | Time     |
--------------------------------------------------------------------------------
| 0  | SELECT STATEMENT  |            | 20000 |   28M |    129K (3) | 00:00:06 |
|* 1 | TABLE ACCESS FULL | ZIGGY_JSON | 20000 |   28M |    129K (3) | 00:00:06 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter(TO_NUMBER(JSON_QUERY("Z"."ZIGGY_ORDER" FORMAT JSON ,
'$.PONumber' RETURNING VARCHAR2(4000) ASIS WITHOUT ARRAY WRAPPER NULL
ON ERROR))=42)

Statistics
----------------------------------------------------------
     0 recursive calls
     0 db block gets
500057 consistent gets
299745 physical reads
     0 redo size
   596 bytes sent via SQL*Net to client
   552 bytes received via SQL*Net from client
     2 SQL*Net roundtrips to/from client
     0 sorts (memory)
     0 sorts (disk)
     2 rows processed

 

At some 500,057 consistent gets and an elapsed time of 34.04 seconds, the above query is “slow” and expensive.

So one option to speed things up is to create a function-based index using the JSON_VALUE function. This can then be used to quickly access data that matches specific JSON name values of interest:

SQL> CREATE INDEX ziggy_po_num_idx ON 
     ziggy_json (json_value(ziggy_order, '$.PONumber' RETURNING NUMBER ERROR ON ERROR));

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.

As it’s a function-based index, collecting statistics on the implicitly created virtual column is advisable. If we now re-run the query:

SQL> select * from ziggy_json 
     where json_value(ziggy_order, '$.PONumber' returning number)=42;

ID ZIGGY_DAT
---------- ---------
ZIGGY_ORDER
------------------------------------------------------------
42 24-JUN-16
{"PONumber" : 42,
"Reference" : "DBOWIE-201642",
42 24-JUN-16
{"PONumber" : 42,
"Reference" : "DBOWIE-201642",

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1939019025

--------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name             | Rows | Bytes | Cost (%CPU) | Time     |
--------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                  |    2 |  3058 |       5 (0) | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_JSON       |    2 |  3058 |       5 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | ZIGGY_PO_NUM_IDX |    2 |       |       3 (0) | 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(JSON_VALUE("ZIGGY_ORDER" FORMAT JSON , '$.PONumber' RETURNING NUMBER ERROR ON
ERROR)=42)
Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
   9 consistent gets
   0 physical reads
   0 redo size
1863 bytes sent via SQL*Net to client
1088 bytes received via SQL*Net from client
   6 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   2 rows processed

The query now uses the JSON function-based index, performs just 9 consistent gets and completes in 0.01 second.

The index can be used for any data accesses in which the CBO considers the index the cheaper alternative:

SQL> select * from ziggy_json where json_value(ziggy_order, '$.PONumber' returning number)<42;

82 rows selected.

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 1939019025

--------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name             | Rows | Bytes | Cost (%CPU) | Time     |
--------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                  |   82 |  122K |      86 (0) | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_JSON       |   82 |  122K |      86 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | ZIGGY_PO_NUM_IDX |   82 |       |       3 (0) | 00:00:01 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(JSON_VALUE("ZIGGY_ORDER" FORMAT JSON , '$.PONumber' RETURNING NUMBER ERROR ON
ERROR)<42)

Statistics
----------------------------------------------------------
    0 recursive calls
    0 db block gets
  249 consistent gets
    0 physical reads
    0 redo size
50623 bytes sent via SQL*Net to client
22528 bytes received via SQL*Net from client
  166 SQL*Net roundtrips to/from client
    0 sorts (memory)
    0 sorts (disk)
   82 rows processed

Indexes can also be created based on the .dot JSON notation:

SQL> CREATE INDEX ziggy_po_num_idx2 ON ziggy_json z (z.ziggy_order.PONumber);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.

A query based on the associated JSON .dot notation can now run efficiently via the index:

SQL> SELECT * FROM ziggy_json z where z.ziggy_order.PONumber='42';

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 4224387816

---------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name              | Rows | Bytes| Cost (%CPU) | Time     |
---------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                   |    2 | 3062 |       6 (0) | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_JSON        |    2 | 3062 |       6 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | ZIGGY_PO_NUM_IDX2 |    2 |      |       3 (0) | 00:00:01 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(JSON_QUERY("ZIGGY_ORDER" FORMAT JSON , '$.PONumber' RETURNING VARCHAR2(4000) ASIS
WITHOUT ARRAY WRAPPER NULL ON ERROR)='42')

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
   9 consistent gets
   0 physical reads
   0 redo size
1863 bytes sent via SQL*Net to client
1088 bytes received via SQL*Net from client
   6 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   2 rows processed

Query again uses the index and is just as efficient with almost immediate response times at just 9 consistent gets.

JSON indexes can also be used to police and ensure data constraints and integrity (which can be problematic with JSON documents). The following numeric index example also implicitly adds data constraint capabilities:

SQL> CREATE INDEX ziggy_po_num_idx3 ON ziggy_json z (to_number(z.ziggy_order.PONumber));

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=>'ZIGGY_JSON');

PL/SQL procedure successfully completed.

SQL> insert into ziggy_json
2 select
3 rownum,
4 SYSdate,
5 '{"PONumber" : "200000A",
6 "Reference" : "DBOWIE-2016' || rownum || '",
7 "Requestor" : "David Bowie",
8 "User" : "DBOWIE",
9 "CostCenter" : "A42",
10 "ShippingInstructions" : {"name" : "David Bowie",
11 "Address": {"street" : "42 Ziggy Street",
12 "city" : "Canberra",
13 "state" : "ACT",
14 "zipCode" : 2601,
15 "country" : "Australia"},
16 "Phone" : [{"type" : "Office", "number" :"417-555-7777"},
17 {"type" : "Mobile", "number" :"417-555-1234"}]},
18 "Special Instructions" : null,
19 "AllowPartialShipment" : true,
20 "LineItems" : [{"ItemNumber" : 1,
21 "Part" : {"Description" : "Hunky Dory",
22 "UnitPrice" : 10.95},
23 "Quantity" : 5.0},
24 {"ItemNumber" : 2,
25 "Part" : {"Description" : "Pin-Ups",
26 "UnitPrice" : 10.95},
27 "Quantity" : 3.0}]}'
28 from dual;
insert into ziggy_json
*
ERROR at line 1:
ORA-01722: invalid number

The PONumber value has to now be numeric for it to be successfully added to the JSON document store. The index of course can also be used for efficient data access:

SQL> SELECT * FROM ziggy_json z where to_number(z.ziggy_order.PONumber)=42;

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 692052820

---------------------------------------------------------------------------------------------------------
| Id | Operation                           | Name              | Rows | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
| 0  | SELECT STATEMENT                    |                   |    2 |  3048 |      5 (0) | 00:00:01 |
| 1  | TABLE ACCESS BY INDEX ROWID BATCHED | ZIGGY_JSON        |    2 |  3048 |      5 (0) | 00:00:01 |
|* 2 | INDEX RANGE SCAN                    | ZIGGY_PO_NUM_IDX3 |    2 |       |      3 (0) | 00:00:01 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------

2 - access(TO_NUMBER(JSON_QUERY("ZIGGY_ORDER" FORMAT JSON , '$.PONumber' RETURNING
VARCHAR2(4000) ASIS WITHOUT ARRAY WRAPPER NULL ON ERROR))=42)

Statistics
----------------------------------------------------------
   0 recursive calls
   0 db block gets
   9 consistent gets
   0 physical reads
   0 redo size
1863 bytes sent via SQL*Net to client
1088 bytes received via SQL*Net from client
   6 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   2 rows processed

In Part II, I’ll look at how to create composite JSON indexes and how to use a text index to automatically index all name fields within a JSON document.