jump to navigation

How Are NULLs Actually Indexed ? (Fascination) January 30, 2008

Posted by Richard Foote in Index Internals, Indexing NULLs, Oracle General, Oracle Indexes.
trackback

A nice question by Jeff regarding how Oracle goes about indexing NULLs has prompted me to show how one could go about actually determining the answer. The basic question is are NULLs treated as just another column value and grouped accordingly or does Oracle have to somehow search through all the leaf blocks looking for all occurrences of these mysterious NULLs.

The answer is that NULLs are basically considered to be potentially the largest value possible by Oracle and so are all grouped and sorted together at the “end” of the index structure (assuming the column is the leading column in the concatenated index, else they’ll be listed last for each distinct column that precedes it in the index).

The fact that index range scans are just as efficient when searching for NULLs values as for any other value strongly supports this assumption, but how does one actually prove it ?

The first obvious thing to check would be to create a little table and associated index with a few rows and a few NULL column values thrown in and see the results of a SELECT … ORDER BY. One would expect the order of an ascending index to match the order of the resulting output. Indeed, NULL values are by default listed last in ORDER BY ascending listings suggesting they would likewise be grouped and sorted last within an index.

The next thing to check would possibly be to use the DUMP function to again see what Oracle is likely to do with NULL values. The DUMP function displays the raw decimal representation of the specific character (depending of course on the character-set) . For NULL values however, there’s actually nothing to display other than a NULL text to represent there’s nothing actually there.

The best place to check of course is within the actual index itself. By determining the actual block that stores our example index, we can perform an index block dump and look at the resultant trace file that describes a representation of the index block to see precisely how Oracle deals with NULLs within indexes.

A quick check of the HEADER_FILE and HEADER_BLOCK in DBA_SEGMENTS will give us the index segment header location.To find the associated index root/leaf block simply add 1 to the HEADER_BLOCK.

Dump the block via the

ALTER SYSTEM DUMP DATAFILE a BLOCK b

command and look at the trace file in USER_DUMP_DEST (where ‘a‘ represents the datafile id and ‘b represents the block id determined from dba_segments).

The resultant output clearly shows that yes:

  1. Leading column NULLs values are all grouped together
  2. They are all listed at the “end” of the index structure
  3. Any NULLs in the non-leading indexed columns are listed “last” for each distinct value in the leading columns in which they appear
  4. Any index entry consisting of nothing but NULLs are not actually stored within the index

This NULLs Index Dump demo goes through this entire process with a little working example and describes the relevant section of the index block dump.

I spend some time discussing block dumps in my seminar as it’s an extremely useful tool when determining and learning how Oracle actually works.

About these ads

Comments»

1. Brian Tkatch - January 30, 2008

“*** Note how the NULL values are logically ordered last in the output”

That’s the default. The query uses ORDER BY. ORDER BY defaults to ASC. ASC defaults to NULLS LAST. Docs

Is there some way to show a better test by how many block reads it does? Such as grabbing the first and last blocks to get early and later data?

2. Richard Foote - January 30, 2008

Hi Brian

The SELECT … ORDER BY was designed simply to show how NULLs are likely to appear in the index.

The “better test” was to actually see how NULLs are stored in the index via the block dump. You need to read the next bit in the demo ;)

3. Brian Tkatch - January 31, 2008

Hey Richard,

I don’t understand why that is “likely”. Do you mean to say, that since when an ORDER BY is used, there is a default, the INDEX is likely to store tham as such?

I *did* read the entire demo. Sheesh. :P I meant to offer a better test in place of the first test. Obviously the block dump was better. I was wondering why that wasn;t the *only* test.

4. Richard Foote - January 31, 2008

Hi Brian

The question was how are NULL values stored in the index.

“Do you mean to say, that since when an ORDER BY is used, there is a default, the INDEX is likely to store tham as such”.

Precisely. Index entries need to be sorted and stored as such. In an ASC index, it just seemed “likely” that the index entries would be sorted in the same manner in which an ORDER BY ASC would sort data by default.

However, it was the block dump that actually proved this was the case. And perhaps more importantly, hopefully illustrated a way in which everyone can check these things for themselves.

5. Brian Tkatch - January 31, 2008

OK, it was the line “Indeed, NULL values are always shown to be listed last in any ORDER BY listings” that threw me off. Particularly the word “always”.

6. Richard Foote - February 1, 2008

Hi Brian

No worries, if that throws you it will likely throw others as well so I’ve changed the wording a little, removing “always” and chucking in the odd “default” instead.

Thanks for your feedback !!

7. tanelp - February 3, 2008

Hi Richard, one comment regarding following:

“If the index is in a NON-ASSM tablespace, simply add 1 to the HEADER_BLOCK to find the associated index root/leaf block, else you’ll need to skip a few more additional blocks due to the ASSM bitmap blocks.”

Actually the index root block is always immediately next from segment header. That helps the index traversing functions to find the index root block without having to read the segment header. The first ASSM bitmap blocks will be written to the beginning of segment, the segment header comes after first ASSM blocks and then the index root block immediately follows…

8. Richard Foote - February 3, 2008

Hi Tanel

Thanks, you’re of course absolutely correct.

This is a “mistake” I’ve made a number of times as I previously always used the file_id and block_id from the first extent in dba_extents to do this sort of thing in the days before ASSM (out of pure habit). I now generally use dba_segments to simplify the arithmetic but still manage to get it wrong :(

Thanks again, I’ve now corrected the post.

9. Mithun - February 11, 2008

Hi,
Dunno if this is the correct place to put this question, but since lot of discussion going around NULL, firing it here.

What is the order in which NULL values get orderd..?

for eg:- take this query

select emp_name, salary from emp order by salary;

If there are a few employees, say 5 employees for whom salary is NULL, then what would be the order in which those 5 employees would be ordered?

I was expecting for it to be ordered based on rowid, but then that is not the way its happening.

10. Richard Foote - February 11, 2008

Hi Mithun

As with any ORDER BY clause, Oracle will only guarantee the order of the specific columns as specified by the ORDER BY.

In your example, Oracle will only gurantee that the returned data is in SALARY order (with all the NULLS listed at the end by default). However, the order of those salaries with the same value is undetermined.

Therefore, the order within the 5 NULLs is undetermined and depends on how Oracle has gone about processing and returning the sorted salary set. This will vary between Oracle releases and on factors such as whether there was parallel processing etc.

11. Vladimir - August 21, 2008

Hi Richard,

In demo “NULLs Index Dump demo” you insert rows in successive order so, that NULLs are follow after not-NULLs. Say more – inserting columns values are in the order of every branch of index if we would to build it.

Would it be more correctly (for the “purity of experiment”) to reverse the order of inserting rows?.. :)

Thanks.

12. Richard Foote - August 25, 2008

Hi Vladimir

Wow, while I’ve been setting in front of the TV watching the Olympics, you’ve really been reading through these posts !!

Possibly for the “purity of the experiment” I could have ordered things differently, but again I make the point that a main reason why I show how I explain things with demos and the such is so that you the humble reader can experiment with the demo yourself.

If you insert the rows in a different order, what difference does it make to the order of the index entries ? Give it a go and see … :)


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 1,861 other followers

%d bloggers like this: