Oracle 12c: Indexing JSON in the Database Part III (Paperback Writer) September 2, 2016
Posted by Richard Foote in 12c, JSON, JSON Text Index, Oracle Indexes.3 comments
In Part I and Part II, we looked at how to index specific attributes within a JSON document store within an Oracle 12c database.
But what if we’re not sure which specific attributes might benefit from an index or indeed, as JSON is by it’s nature a schema-less way to store data, what if we’re not entirely sure what attributes might be present currently or in the future.
On a JSON document store within the Oracle Database, you can create a special JSON aware Text Index that can automatically index any field/attribute within a JSON document and use a Text based function to then search efficiently for data from any attribute.
Using the same table created in Part I, you can create a JSON Text index as follows:
SQL> CREATE INDEX ziggy_search_idx ON ziggy_json (ziggy_order) 2 INDEXTYPE IS CTXSYS.CONTEXT 3 PARAMETERS ('section group CTXSYS.JSON_SECTION_GROUP SYNC (ON COMMIT)'); Index created.
Note this Text index is (optionally) defined to be automatically synchronised when data in the ZIGGY_JSON table is committed.
We can use the JSON_TEXTCONTAINS Oracle Text function to efficiently access data for any data within the JSON defined column. For example:
SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.Reference', 'DBOWIE-201642'); Elapsed: 00:00:00.00 Execution Plan ---------------------------------------------------------- Plan hash value: 3069169778 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 1534 | 2550 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON | 1 | 1534 | 2550 (0) | 00:00:01 | |* 2 | DOMAIN INDEX | ZIGGY_SEARCH_IDX | | | 2549 (0) | 00:00:01 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{DBOWIE-201642} INPATH(/Reference)')>0) Statistics ---------------------------------------------------------- 65 recursive calls 0 db block gets 118 consistent gets 0 physical reads 0 redo size 1863 bytes sent via SQL*Net to client 1088 bytes received via SQL*Net from client 6 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2 rows processed
Note the JSON-based Text index is used to retrieve data efficiently.
The Text index can also be used to search data efficiently from within an array set:
SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.LineItems.Part.Description', 'Low'); no rows selected Elapsed: 00:00:00.01 Execution Plan ---------------------------------------------------------- Plan hash value: 3069169778 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 1534 | 5927 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON | 1 | 1534 | 5927 (0) | 00:00:01 | |* 2 | DOMAIN INDEX | ZIGGY_SEARCH_IDX | | | 5927 (0) | 00:00:01 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{Low} INPATH(/LineItems/Part/Description)')>0) Statistics ---------------------------------------------------------- 132 recursive calls 0 db block gets 182 consistent gets 0 physical reads 0 redo size 627 bytes sent via SQL*Net to client 552 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed
We can also search for a specific data value across any attribute within the JSON document store:
SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$', '4242'); Elapsed: 00:00:00.00 Execution Plan ---------------------------------------------------------- Plan hash value: 3069169778 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 1534 | 2 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON | 1 | 1534 | 2 (0) | 00:00:01 | |* 2 | DOMAIN INDEX | ZIGGY_SEARCH_IDX | | | 1 (0) | 00:00:01 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{4242}')>0) Statistics ---------------------------------------------------------- 10 recursive calls 0 db block gets 32 consistent gets 0 physical reads 0 redo size 1865 bytes sent via SQL*Net to client 1088 bytes received via SQL*Net from client 6 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 2 rows processed
Let’s now add more data to the JSON column, but this time introducing a few new attributes (such as AlbumName):
SQL> insert into ziggy_json 2 select 3 rownum, 4 SYSdate, 5 '{"AlbumId" : ' || rownum || ', 6 "AlbumName" : "HUNKY DORY", 7 "ArtistName" : "David Bowie"}' 8 from dual connect by level <= 10; 10 rows created. SQL> commit; Commit complete.
As the JSON-based Text index was defined to be automatically synchronised when we commit data in the table, these new attributes can be immediately searched and accessed via the index:
SQL> SELECT * FROM ziggy_json WHERE json_textcontains(ziggy_order, '$.AlbumName', 'HUNKY DORY'); 10 rows selected. Elapsed: 00:00:00.02 Execution Plan ---------------------------------------------------------- Plan hash value: 3069169778 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU) | Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 198 | 296K | 1948 (0) | 00:00:01 | | 1 | TABLE ACCESS BY INDEX ROWID | ZIGGY_JSON | 198 | 296K | 1948 (0) | 00:00:01 | |* 2 | DOMAIN INDEX | ZIGGY_SEARCH_IDX | | | 1780 (0) | 00:00:01 | ------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("CTXSYS"."CONTAINS"("ZIGGY_JSON"."ZIGGY_ORDER",'{HUNKY DORY} INPATH(/AlbumName)')>0) Statistics ---------------------------------------------------------- 48 recursive calls 0 db block gets 103 consistent gets 0 physical reads 0 redo size 6751 bytes sent via SQL*Net to client 3232 bytes received via SQL*Net from client 22 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10 rows processed
Not only can JSON data be stored within the Oracle 12c database, but we have a number of index strategies available to search such data efficiently.