jump to navigation

Indexing NULLs: (Empty Spaces) January 23, 2008

Posted by Richard Foote in Indexing NULLs, Indexing Tricks, Oracle General, Oracle Indexes, Performance Tuning.

There have always been issues with NULLs and indexes. The main issue being of course if the indexed columns are all null then the associated row is not indexed.

Generally, this is a good thing. If we have a table with lots of null values for indexed columns, then the associated rows are not indexed resulting in a smaller index structure. Also, very often we’re simply not interested in result sets where the indexed values are null so it’s generally not an issue.

However, what if the number of rows where the values are null are relatively small and what if we want to find all rows where the index column or columns are indeed null. If the column or columns don’t have nulls indexed then a potentially expensive Full Table Scan (FTS) is the CBO’s only option.

The first thing to point out is that nulls are actually indexed, if other columns in the index have a not null value. For example, if we have a concatenated index on columns (A,B), so long as A has a not null value then column B can have an indexed null value and if column B has a not null value then column A can have an indexed null value. Only if both columns A and B contain nulls, will the associated row not be indexed.

If column B has a NOT NULL constraint, then Oracle knows that B can not contain any null values. Therefore, if column A can contain null values, Oracle also knows that each and every null value of A must also be indexed as it’s not possible to have an entirely null indexed entry. Therefore, with an index on (A,B), we can use the index to return every null value for A, providing of course the CBO considers the costs of doing so to be cheaper than a FTS. We can also always of course use the index to return all null values of A for any corresponding not null value of B.

So with concatenated indexes and with at least one not null column, Oracle can guarantee that every null for all the other columns are contained within the index and so could potentially use the index for corresponding IS NULL predicates.

But what if the index has a single column or what if none of the indexes have a NOT NULL constraint, we’re done for, the CBO won’t be able to use the associated index to just retrieve nulls, right ?

Well not quite.

Let’s assume we have an index that consists just of column A and it’s a null column. Let’s also assume there are not too many rows that have a null for A and we have an important query that would dearly love to use an index to retrieve rows based on these null values for column A.

Well one alternative of course as I’ve seen a number of times is to just include a NOT NULL column in the  index as well, say (A,B). Yes, we don’t particularly want to include column B in the index but at least by doing so, we ensure all null values for column A are indexed, making A IS NULL predicates viable through an index.

However a somewhat cheaper and less expensive alternative is to just simply append a single character to the index, for example a space (A, ‘ ‘). The space character takes up one byte, the column length in the index takes up an additional byte for a total of 2 bytes overhead per index entry. Yes this will reduce the capacity of a leaf block to contain as many index entries and so potentially increase somewhat the overall size of the index. However, this will also guarantee that the index can not contain all null entries thereby ensuring all other columns have all their null values indexed.

 See this demo on Indexing Null Values for examples on how this all works.

About these ads


1. Christian Antognini - January 23, 2008

Hi Richard

For numeric values another possibility is to use BINARY_FLOAT/DOUBLE datatype and insert NAN instead of NULL. Since NAN has a specific binary value, it is indexed as any other value.


2. Richard Foote - January 23, 2008

Hi Christian

Good thinking, nice idea.

In case anyone is unsure what Christian is suggesting, don’t use null values for (say) binary_float numerics but use a Not A Number (NAN) instead. It’s not a null, is hence indexable and can be used to retrieve a number when it’s not really a number.

SQL> create table nan (id binary_float);

Table created.

SQL> insert into nan values (binary_float_nan);

1 row created.

SQL> commit;

Commit complete.

SQL> select * from nan where id is nan;


Create an index and all NaNs can be potenitally retrieved via the index.

Thanks for the suggestion Christian :)

Dion Cho - September 1, 2010

Hi, Richard. For NAN indexing, following query does not use index.

SQL> select * from nan where id is nan;

But this query does use index.

SQL> select * from nan where id = binary_float_nan;

Hmmm, very funny. :)

3. Richard Foote - January 23, 2008

Just a general comment on comments.

I’ve noticed that my spam filter has been picking up some comments and automatically removing them without having a chance to moderate them first (the spam count has suddenly gone up) :(

If you submit a comment and you never see it appear here, please just drop me a line.

4. Neil Johnson - January 23, 2008

So a function based index is no longer my only option. The beauty of your solution being that user queries don’t need amending. The flip side in favour of a function based index (on say DECODE(processed,null,’N’,null)) could, potentially, be the reduced number of rows in the index.

Horses for courses I guess.

Excellent post, thanks.

5. Robert - January 23, 2008

Richard, I’m amazed that this actually works: you would expect Oracle to be smart enough to know that a constant value does not make sense in an index and silently remove it. :-)

If you have a large table with few NULL entries and queries particularly need NULL values then I’d rather use a FBI because that will dramatically save space.

If the client system was built in a way to query only views then you might get away with unchanged code, if this would work:

create index col_a_nulls on T1 (case when A is null ‘x’ else null end);
create view V1 as
select …, case(when (case when A is null ‘x’ else null end) is null then ‘x’ else null end) col_a
from T1…

Once I think about it it seems pretty unrealistic that Oracle actually uses the FBI for a query with “where col_a is null”. Hm…

6. Brian Tkatch - January 23, 2008

Neil and Robert, why would you use DECODE or CASE instead of NVL, which works in exactly this situation?

Also, Robert “you would expect Oracle to be smart enough to know that a constant value does not make sense in an index and silently remove it. :-)” perhaps that is because it treats it as a COLUMN and not just as a character. Therefore, to ignore the value is to ignore the pseudo-COLUMN, and that cannot be done.

The beauty of Richard’s solution, is that the queries do not need to have any knowledge of the INDEX (such as is the case with an FBI). Currently, where i am employed, we use NVL([column], ‘~’) which look particularly ugly and is not intuitive. I am planning to suggest this change in a few minutes. :)

But how much space is wasted anyway? It’s two bytes per record. It doesn’t seem like much space, unless every single INDEX will start using this trick. Is that really an issue when there is such a benefit?

7. Robert - January 23, 2008

Brian, the disadvantage of NVL is that you will get a larger index because it will contain entries for every row. If you are really only interested in rows where col_X is NULL, your table is large and the portion of rows with NULL value is small then you want to create an FBI based on a function that will return a non NULL value for rows with NULL column value and NULL for all others. That can not be done with NVL as far as I can see. (Personally I prefer DECODE because IIRC that is standard SQL.)

8. Neil - January 23, 2008

Quote – “The beauty of Richard’s solution, is that the queries do not need to have any knowledge of the INDEX”
I agree completely, but also agree with Robert that the FBI solution has it’s place. If you have no control over the code then the subject of this blog entry is a godsend. Anyway, Robert has done a good job above explaining the FBI reasoning. Two good solutions to keep in the locker for those rare yet troublesome queries.

9. pekka - January 24, 2008

you can use bitmap indexes to speed up queries having null-not null as filter. They can index null values, and have advantage of that.

10. Martin W - January 24, 2008

This is an excellent suggestion Richard and some good stuff in the comments too.
Bitmap indexes index null of course, though bitmaps should only be used where suitable.

For many years I’ve used two techniques.

If I want to index the normal contents as well as nulls, I use a false value to replace null (-1 in numbers, ‘Z’ or ‘#’ for varchar2′s), which always makes me cringe internally as it can lead to confusion, go wrong if you pick a value that later becomes valid, and needs documenting so that people realise -1 means null.
If I want to identify only the nulls then I added a new column as a null indicator and set it only for nulls. It could be argued that the indicator column is the “correct” thing to do from a logical modelling perspective.
Both the above suffer terribly from being usable only if you have access to the code base so you can change that, or even that you plan them in from the start (especially the null indicator column).

I think I will use both the above two techniques a lot less now and replace with the suggestions in this thread

11. Richard Foote - January 24, 2008

Hi all, thanks for the comments, they’re great and really adds to the whole piece.

Regarding the use of a FBI, I totally agree that it can be very useful and was in fact the “solution” to #6 in my things you may not know about indexes “It’s possible and potentially very useful to just index some column values and not all column values within a table”.

It’s not only useful in this scenario but in any similar example where you may have a whole bunch of common values that you would not ordinarily search on and if you did would ignore the index anyways but have a few less common values, which through the use of say a decode function, would only return a value and index it if it’s of interest via an index access.

But but but, if also requires the application to have knowledge of the function-based index (virtual column) and be coded accordingly. The approach in the blog requires no rewrite of the application and has a relatively small overhead, assuming you may also want to query some of the not null values as well.

Yes bitmap indexes index null values but but but, they’re just not suitable for OLTP environments with any real concurrency requirements, even after all the improvements in later releases. I’ll cover bitmap indexes for sure in the future.

So definitely horses for courses ..

12. Brian Tkatch - January 24, 2008

Robert and Neil, thanx for the explanations. I’m learning quite a bit from this blog and comments such as yours.

13. Mathew Butler - January 24, 2008

I’ve been aware of this behaviour for some time. I had a discussion with a colleague where I pointed out that NULLs can indeed be indexed. On production of a test case, his comment was that the behaviour must have changed, as he remembers that NULLs could not be indexed in v6.

I’m curious, has this behaviour changed, or was my colleague being a little disingenuous?


14. Richard Foote - January 25, 2008

Hi Mathew

I’m not sure which but I would say your colleague has a good memory if correct ;)

A lot lot lot has changed since Oracle 6 days, I hope he doesn’t do too many things based on how Oracle 6 may have worked !!

Oracle 6 was before my time with Oracle so I can’t say with certainty however I would be surprised if it has changed because it would mean in Oracle 6 you could not index columns in a concatenated index if it contained a nullable column.

Because let’s pretend for a minute you couldn’t index null values. Let’s say you have an index on (A,B) and say A was not null but B could have nulls.

If you had value of A = ‘Bowie’ and B=null, then this row wouldn’t be indexed in Oracle 6. Either it would give you an error (in which case B would in effect be a not null column as well if indexed) or it would simply not index the row as B has a null.

But if you don’t index the row, then the index wouldn’t contain all the possible values for A. This “Bowie” row for example wouldn’t be in the index.

That being the case, how could we use the index to retrieve all A=”Bowie”, we couldn’t as Oracle can’t guarantee all the associated values are in the index.

So the index would be useless.

So unless Oracle 6 allowed you to create useless indexes or it didn’t permit you to create a concatenated index on nullable columns, I don’t see how it would have worked ?

You get my point ;)

15. Mathew Butler - January 25, 2008

I get the point and agree with your reasoning. My view is that nothing “should” have changed in the way an index works when processing nulls ( kind of seems inherent to the way that indexes work that they need to work this way when processing nulls ).

I always argue that things even change significantly between dot releases, so some behaviours cannot be relied upon even with releases of the same Oracle base version. ( a good example being the behaviour of STATISTICS_LEVEL for the various settings through 9i and 10G releases ).

Assuming behaviour is how myths continue to propogate.


16. Richard Foote - January 26, 2008

Hi Mathew

Agree 100%

17. Jonathan Lewis - January 26, 2008

Regarding v6 – the problem isn’t one of indexing, it’s one of optimizer access paths. (The following may be unreadable, if so, I’ll email the example to RIchard and ask him to edit it into place).

drop table t1;
create table t1(n1 number,n2 number, v1 varchar2(10));
create index t1_i1 on t1(n1,n2);
insert into t1 values(null,1,’x’);

alter session set optimizer_mode = rule;
set autotrace traceonly explain

select * from t1 where n1 is null;


Execution Plan
Plan hash value: 3617692013

|Id | Operation                 | Name |
|0 |SELECT STATEMENT   |           |
|*1|  TABLE ACCESS FULL |      T1 |
Predicate Information (identified by operation id):

   1 - filter(“N1″ IS NULL)

   - rule based optimizer used (consider using cbo)


18. Richard Foote - January 26, 2008

Hi Jonathan

I’ve fixed it up a tad.

The fact the RBO can’t deal with nulls makes sense. It didn’t sound right that the problem was with Oracle6 not being able to index nulls.

Thanks for clearing it up :)

19. Mathew Butler - January 26, 2008

Ah! So the RBO couldn’t understand when “is null” can be used as an index access.

@Jonathan – was this test on 10G?


20. Mathew Butler - January 26, 2008

@Jonathan – posted the provious quetsion without thinking. Your output if dbms_xplan, and the note at the end of the output implies 10G.

I was answering some questions about a problematic RBO -> CBO migration recently ( RBO queries running faster than CBO ) , and hadn’t considered jumping between optimizers on the10GR2 platform.

Incidentally, the solution that the team came up with involved, system stats and setting *both* optimizer_index_caching and cost_adj parameters. Needing to set both of these was a surprise to me. I haven’t managed to get my hands on the system to form a view as to whether there is another approach that would provide the same result. Raised some interesting questions for me though.


21. Jonathan Lewis - January 27, 2008

I wouldn’t normally consider setting 10g to RBO – but in this case I didn’t have a 6.0 around, so I just did it to demonstrate the point.

I’d look closely at what’s going on with system statistics (CPU costing) and the value of db_file_multiblock_read_count before setting the optimizer_index_xxx parameters. (My default position is to eliminate all three parameters on upgrade).

Jonathan Lewis

22. Mathew Butler - January 27, 2008


I understood the reasoning for running 10G in RBO. I just haven’t seen this done before, though understood the RBO code still executes despite being unsupported.

My expectation when moving RB0(9i)->CBO(10GR2) was to be able to use systems statistics, the default 10G DBFMBRC ( ie: unset ) and the default optimizer_index_* parameters ( ie: unset). I think there is something screwy going on with the stats generation in the area of histograms ( they use SKEWONLY ), and possibly with their workload stats. I haven’t been able to get hands-on to get to the bottom of things, and as it stands I’m told that subject to some final testing they now have satisfactory performance. So it seems unlikely that I’ll be able to take this further.

Anyway – apologies to Richard for temporarily hijacking his blog.


23. Jeff C - January 29, 2008

It amazes me sometimes how other people are working on the exact same set of issues as I am.

We have been suffering under the misunderstanding that nulls “don’t work” in indexes, until this weekend. We had an index that was being used to resolve a query that had a nullable column way down on the end somewhere that was in the criteria (as IS NULL) . We created a new index from several columns being used to select data, but not the nullable column and forced Oracle (9.2) to use the new index.

The query slowed down badly, though the non-nullable columns in the new index should have allowed many fewer index records to be scanned. It was then that my brain turned on and I realized that the nullable column was only null for about 0.2% of the population. So, though we were reading more index rows, we were reading fewer data rows per index row. And that made a huge difference.

So, when scanning indexes, Oracle obviously uses null values. The question we’re now trying to test to gain the answer for is: can Oracle use NULL to locate a position in the index like “any other value” or does it have to read all the leaf nodes and pick out the nulls?

24. Richard Foote - January 30, 2008

Hi Jeff

The answer is that NULLs are treated as being the largest possible value and so if found in the leading column of an index, will all be sorted and grouped at the end.

This question has me thinking that it might be worth writing a short piece on how to actually prove this is the case …

25. Jeff C - January 31, 2008

>This question has me thinking that it might be worth writing
>a short piece on how to actually prove this is the case …

I’d like to see that, because my real-world data attempts to prove this have not convinced me.

26. Richard Foote - January 31, 2008

Hi Jeff

Have you checked out the lastest post ?

27. Hemant K Chitale - February 15, 2008

See http://hemantoracledba.blogspot.com/2007/08/nulls-are-not-indexed-right-not.html for my example.

However, see Note#5005939.8 {ORA-600 [qkssao2s1] }
that I referenced in the example when I created the exampleAbout #5005939.8, the error I encountered was not with SQLTUNE but with dbms_stats itself.

Recently, , I have also seen Note#551754.1 (Bug#6737251) where a query using ROWNUM and such an Index with a NULL+constant would return incorrect results.

28. Richard Foote - February 17, 2008

Hi Hemant

Thank you very much for the references.

As a general point, whenever one does something a little “differently” or unusal or uses a feature in an unusual manner, the risk is that you force Oracle down an execution path less trodden, less tested and hence more likely to find these nasty bugs or issues.

Thanks again.

29. Thomas - February 19, 2008

Hi Richard

when playing with the constant-in-index option i recognized that under the cover it obviously (i.e. according to the *_index-view) creates an FBI as well . That’s probably the reason, why RBO, though index-crazy, refuses to use that index.

30. Richard Foote - February 20, 2008

Hi Thomas

You might just be onto something …

31. cristiancudizio - March 20, 2008

Hi Richard,
there is a thing that i can’t understand. i’ve made a test of creation of a index on (columnx,’a’). I see in my db in USER_INDEXES that this index is Function-based normal. How does Oracle knows that there’s an entry in the index for every record in the table? With FBI has every record in the table an entry in the index?


32. Richard Foote - March 20, 2008

Hi Cristian

An index entry is guaranteed to exist if any of the indexed columns are non null. Only totally null column entries are not indexed.

By having a constant value as part of the index, the constant can never be null. Therefore all other indexed column values, including any nulls, must and are guaranteed to be in the index.

If Oracle knows all null values for a column are indexed, it can potentially use the index to retrieve all the null column entries.

So it’s not the FBI that’s important here (unless you’re on the RBO) as yes a FBI can potentially have “missing” null values, it’s the simple fact the constant column can never be null.

33. Oracle 11g, Istogrammi e DBMS_STATS.AUTO_SAMPLE_SIZE « Oracle and other - July 18, 2008

[...] mi spiego, se lascio il valore null e poi devo cercare i record con quel campo a null mi precludo (non sempre come ha ben spiegato Richard Foote) l’utilizzo di un indice. Utilizzando il -1 il valore viene sempre indicizzato, solo che se [...]

34. Vladimir - August 17, 2008

Hi Richard,

In the fourth paragraph of your article, you assert that if index is based on columns A and B, then only one of these columns may have null values (let’s calling it as null-column). But at the end of your article only examples with the leading null-column are represented.

Are these two cases not equivalent? What is the reason of lack of consideration of the case when null-column is not leading?


35. Richard Foote - August 25, 2008

Hi Vladmir

Because they’re indeed equivalent, I haven’t bothered to give an example the other way around.

I try to make the examples as simplistic as possible to get the “message” across, leaving it to the humble reader to experiment further for themselves.

36. Index Access best approach?? Not Always… « AskDba.org Weblog - December 18, 2008

[...] null columns too if index is a concatenated index and the other column has non-null value. Refer to this article from Richard Foote). CBO assumed that only 1 row will be returned and it choose index [...]

37. VKoul - December 7, 2009

Hi Richard,

Great Great post !!!

Oracle 10g Express Edition For UBUNTU Linux 9.10.

While experimenting with the example, I created another index as:

SQL> CREATE INDEX test_nulls_i4 ON test_nulls(pct_free, ”) COMPUTE STATISTICS;

Index created.

SQL> CREATE INDEX test_nulls_i5 ON test_nulls(pct_free, NULL) COMPUTE STATISTICS;

Index created.


Q1 : I was expecting it would not let me create TEST_NULLS_I5 and would say that such set of columns already indexed, as I was expecting it would treat a null string same as NULL. What could be the reason behind it ?

Q2 : After running the query it uses the INDEX TEST_NULLS_I4 (for what so ever reason) and returns ZERO rows. If I force it to use FTS, it returns me the correct number of rows.


SQL> @auto_on
Setting On “autotrace trace explain statistics”
Verifying …
Setting On “Timing”
Verifying …
timing ON
SQL> SELECT * FROM test_nulls WHERE pct_free IS NULL;

no rows selected

Elapsed: 00:00:00.03

Execution Plan
Plan hash value: 1977379870

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 50 | 10300 | 5 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST_NULLS | 50 | 10300 | 5 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | TEST_NULLS_I4 | 50 | | 2 (0)| 00:00:01 |

Predicate Information (identified by operation id):

2 – access(“PCT_FREE” IS NULL)

318 recursive calls
0 db block gets
77 consistent gets
4 physical reads
0 redo size
3172 bytes sent via SQL*Net to client
373 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
6 sorts (memory)
0 sorts (disk)
0 rows processed


Richard Foote - December 8, 2009

Hi VKoul

You have stumbled across a bug here.

Note that it makes no real sense to concatenate a null to an index as an index entry that contains nothing but null values is not actually indexed. The whole point of this post was to add a not null constant to the index thereby ensuring all nulls are indeed indexed.

However, although the index is indeed not indexing the empty string as expected, meaning the null index entries are not actually stored in the index, the CBO is incorrectly thinking such values are in the index and is hence using the index in an invalid manner.

Bug Bug Bug !!

I get the same results in as well.

Advice. Don’t append an empty string to an index ;)

38. Gary - December 8, 2009

For the first part of VKoul’s issue, if you do a
select table_name, column_name, data_type from user_tab_cols where table_name = ‘TEST_NULLS’;
You’ll see the ” column was defined as a CHAR, but the NULL was defined as a VARCHAR (both zero length).

The incorrect result, I guess, Oracle getting confused whether a zero length fixed length string is null or not. I can also reproduce the bug in 11gR2

create table test_nulls (pct_free number);
insert into test_nulls values (1);
insert into test_nulls values (null);
CREATE INDEX test_nulls_i4 ON test_nulls(pct_free,”) COMPUTE STATISTICS;
CREATE INDEX test_nulls_i5 ON test_nulls(pct_free, NULL) COMPUTE STATISTICS;
select table_name, column_name, data_type, data_length, nullable from user_tab_cols where table_name = ‘TEST_NULLS’;
————- ————– ———- ———– -
select dump(SYS_NC00002$) from test_nulls;

select /*+NO_INDEX (T TEST_NULLS_I4)*/ count(*) from test_nulls t where pct_free is null;

select count(*) from test_nulls t where pct_free is null;

Richard Foote - December 8, 2009

Hi Gary

Indeed. Thanks for the demo.

The other point I would make is that if you tried now to collect stats on the virtual column, it will fail with a divising by zero error.

I guess not many folk try to append empty strings to their indexes ;)

39. goiyala3 - December 29, 2010


I tried that. but it did not take index .

SQL> create table yy4 as select * from dba_objects;
Table created.

SQL> ect object_id,object_name from yy4 where rownum update yy4 set object_name=null where object_id=46;

1 row updated.

SQL> update yy4 set object_id=null where object_name=’IND$’;

1 row updated.

SQL> select object_id,object_name from yy4 where rownum create index idx_yy4 on yy4 (object_id,object_name);

Index created.

SQL> exec dbms_stats.gather_table_stats(ownname=>’MAS’,TABNAME=>’YY4′,CASCADE=>TRUE);

PL/SQL procedure successfully completed.



Execution Plan
Plan hash value: 3318007362

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 5 | 51 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 5 | | |
|* 2 | TABLE ACCESS FULL| YY4 | 1 | 5 | 51 (0)| 00:00:01 |



Execution Plan
Plan hash value: 3318007362

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
| 0 | SELECT STATEMENT | | 1 | 19 | 51 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 19 | | |
|* 2 | TABLE ACCESS FULL| YY4 | 1 | 19 | 51 (0)| 00:00:01 |

40. Repeat After Me: NULL Values are Not Stored in Indexes? « Charles Hooper's Oracle Notes - February 28, 2012

[...] several techniques for utilizing b*tree indexes to locate rows with NULL values in a column.  Richard Foote’s blog also has at least one article that describes how to use b*tree indexes to locate NULL values in a [...]

41. Yes, yes you can index nulls » SQLfail - October 3, 2012

[...] by posts from Charles Hooper and Richard Foote. Share this:FacebookTwitter  Posted by Chris Saxon at 08:00  Tagged with: index, [...]

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s


Get every new post delivered to your Inbox.

Join 1,713 other followers

%d bloggers like this: