jump to navigation

Oracle 12c: Indexing JSON in the Database Part III (Paperback Writer) September 2, 2016

Posted by Richard Foote in 12c, JSON, JSON Text Index, Oracle Indexes.
3 comments

In Part I and Part II, we looked at how to index specific attributes within a JSON document store within an Oracle 12c database.

But what if we’re not sure which specific attributes might benefit from an index or indeed, as JSON is by it’s nature a schema-less way to store data, what if we’re not entirely sure what attributes might be present currently or in the future.

On a JSON document store within the Oracle Database, you can create a special JSON aware Text Index that can automatically index any field/attribute within a JSON document and use a Text based function to then search efficiently for data from any attribute.

Using the same table created in Part I, you can create a JSON Text index as follows:

SQL> CREATE INDEX ziggy_search_idx ON ziggy_json (ziggy_order)
2 INDEXTYPE IS CTXSYS.CONTEXT
3 PARAMETERS ('section group CTXSYS.JSON_SECTION_GROUP SYNC (ON COMMIT)');
Index created.

Note this Text index is (optionally) defined to be automatically synchronised when data in the ZIGGY_JSON table is committed.

We can use the JSON_TEXTCONTAINS Oracle Text function to efficiently access data for any data within the JSON defined column. For example:

SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.Reference', 'DBOWIE-201642');

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 3069169778

------------------------------------------------------------------------------------------------
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU) | Time      |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                  |    1 |  1534 |    2550 (0) |  00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON       |    1 |  1534 |    2550 (0) |  00:00:01 |
|* 2 | DOMAIN INDEX                | ZIGGY_SEARCH_IDX |      |       |    2549 (0) |  00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{DBOWIE-201642}
 INPATH(/Reference)')>0)

Statistics
----------------------------------------------------------
  65 recursive calls
   0 db block gets
 118 consistent gets
   0 physical reads
   0 redo size
1863 bytes sent via SQL*Net to client
1088 bytes received via SQL*Net from client
   6 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   2 rows processed

Note the JSON-based Text index is used to retrieve data efficiently.

The Text index can also be used to search data efficiently from within an array set:

SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.LineItems.Part.Description', 'Low');

no rows selected

Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3069169778

------------------------------------------------------------------------------------------------
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                  |    1 |  1534 |    5927 (0) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON       |    1 |  1534 |    5927 (0) | 00:00:01 |
|* 2 | DOMAIN INDEX                | ZIGGY_SEARCH_IDX |      |       |    5927 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{Low}
 INPATH(/LineItems/Part/Description)')>0)

Statistics
----------------------------------------------------------
132 recursive calls
  0 db block gets
182 consistent gets
  0 physical reads
  0 redo size
627 bytes sent via SQL*Net to client
552 bytes received via SQL*Net from client
  2 SQL*Net roundtrips to/from client
  0 sorts (memory)
  0 sorts (disk)
  0 rows processed

We can also search for a specific data value across any attribute within the JSON document store:

SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$', '4242');

Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 3069169778

------------------------------------------------------------------------------------------------
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                  |    1 |  1534 |       2 (0) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON       |    1 |  1534 |       2 (0) | 00:00:01 |
|* 2 | DOMAIN INDEX                | ZIGGY_SEARCH_IDX |      |       |       1 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

 2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{4242}')>0)
     
Statistics
----------------------------------------------------------
  10 recursive calls
   0 db block gets
  32 consistent gets
   0 physical reads
   0 redo size
1865 bytes sent via SQL*Net to client
1088 bytes received via SQL*Net from client
   6 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
   2 rows processed

Let’s now add more data to the JSON column, but this time introducing a few new attributes (such as AlbumName):

SQL> insert into ziggy_json
 2 select
 3 rownum,
 4 SYSdate,
 5 '{"AlbumId" : ' || rownum || ',
 6 "AlbumName" : "HUNKY DORY",
 7 "ArtistName" : "David Bowie"}'
 8 from dual connect by level <= 10;
10 rows created.

SQL> commit;

Commit complete.

As the JSON-based Text index was defined to be automatically synchronised when we commit data in the table, these new attributes can be immediately searched and accessed via the index:

SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.AlbumName', 'HUNKY DORY');

10 rows selected.

Elapsed: 00:00:00.02

Execution Plan
----------------------------------------------------------
Plan hash value: 3069169778

------------------------------------------------------------------------------------------------
| Id | Operation                   | Name             | Rows | Bytes | Cost (%CPU) | Time     |
------------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                  |  198 |  296K |    1948 (0) | 00:00:01 |
|  1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON       |  198 |  296K |    1948 (0) | 00:00:01 |
|* 2 | DOMAIN INDEX                | ZIGGY_SEARCH_IDX |      |       |    1780 (0) | 00:00:01 |
------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{HUNKY DORY}
 INPATH(/AlbumName)')>0)
     
Statistics
----------------------------------------------------------
  48 recursive calls
   0 db block gets
 103 consistent gets
   0 physical reads
   0 redo size
6751 bytes sent via SQL*Net to client
3232 bytes received via SQL*Net from client
  22 SQL*Net roundtrips to/from client
   0 sorts (memory)
   0 sorts (disk)
  10 rows processed

Not only can JSON data be stored within the Oracle 12c database, but we have a number of index strategies available to search such data efficiently.