Demonizing Foreign Keys (“Helden”) August 25, 2009Posted by Richard Foote in Foreign Keys, Richard's Musings.
My mate Marcel Kratochvil, a fellow Canberran Oracle ACE Director and David Bowie fan, has recently published a paper finally called “Discussing Foreign Keys and their usage in the Database” in which (in his own words) he is “demonizing foreign keys”.
It’s a really good read, well worth the effort, in which covers various interesting topics but primarily questions the general usage of Foreign Keys in the relational (or object-relational) database, especially in the world of multimedia. He’s been arguing against the general use of Foreign Keys in the database for a long time and has finally put down his thoughts on the subject. I would highly recommend giving the paper a read and see if it might alter (or confirm) your own views on the subject.
Although I think Marcel makes many a good point and I agree with much of what he says, I however basically disagree with his general central notion and arguments against Foreign Keys. I guess I still consider myself a “Foreign Key Dinosaur”
These are just some of the comments and feedback I gave Marcel when I was given an early look at the paper. I suggest you read Marcel’s paper before my comments so they make more sense and can be put into context and perspective.
“Let me say that I agree with much of what you say and you make many valid points. Certainly the world of multimedia is one which we at work are slowing moving into and it certainly has many challenges. Now, with respect to the specific issue of FKs in the database, I suspect we’re actually in some ways in agreement. I agree with you that there isn’t necessarily one correct answer for all situations. One of your closing statements:
“So by all means use Foreign Keys, but don’t blindly use them. Use them when it makes sense, when it solves a tuning issue or when maintaining a relationship between tables is crucial to the application running”
is actually pretty well what I think as well. However, I suspect our interpretation of what this actually means differs somewhat with my opinion being that it makes sense to use FKs far more widely and comprehensively than perhaps you suggest in your paper. So our likely differences of opinion lies mainly with the occurrences of when it makes sense to use FKs. However, your paper doesn’t alter my fundamental view that it almost always makes sense to use FKs in the database where possible.
Firstly, many of your comments suggest DBAs generally find FKs problematic, developers like to push them. As a DBA (and ex developer of nearly 10 years), I find FKs to be essential in many of my activities and find they generally assist, rather than impede my tasks. More on this later but the importance of FKs on DBAs activities and responsibilities is paramount.
I don’t think anyone seriously suggests that FKs (or other constraints) ensure data integrity in the general sense. Yes, someone can enter incorrect details or pick the incorrect menu option. However, constraints and FKs specifically are an important method of reducing the occurrence of data integrity related issues that if not implemented in the database, needs likely to be implemented less effectively elsewhere. More on this to come but whenever I mention data integrity later, I mean it only in the sense of having consistent data relationships that follow key business rules.
Examples are good because the strength of the example can often be telling. Take your deleting an account but keep the purchase order example. Tricky scenario but one which simply requires a correct physical model with FKs to support it to start with. If person accounts must be deleted, then either it doesn’t indeed make sense to use a FK in the purchase order table to point to the person account or the FK needs an “on delete set null” type logic or the FK needs to point to a entity details table that contains data that doesn’t have to be deleted, etc.. Will the tax audit really be happy to have purchase orders associated with unknown entities, how can you prove the purchase was legitimate and not a fraudulent tax reduction if the purchase order can reference entities that don’t exist. A entity number that references nothing is meaningless therefore it either needs to reference something else, perhaps sufficient person details to keep both privacy and auditing requirements satisfied, or it should be removed. A FK would be
critical in such an application but it needs to point to the correct entity and/or be maintained appropriately to ensure the business rules are followed.
The issue of “not trusting” developers is an interesting one. I fundamentally don’t trust developers to perform and code FK functionality outside the database for 4 key reasons:
1) Developers come in sorts of shapes and sizes, have all sorts of experiences and come in varying degrees of quality. There are good developers and not so good developers and as I have no control on their hiring and firing, can’t assume they’re necessarily “trustworthy” when it comes to ensuring their code will maintain necessary data relationship integrity. I’ll trust FKs in the database over developers in this respect any day as FKs in the database are far simpler to implement and control.
2) Even the best of the best, the most brilliant of developers is limited by being human. And no matter how good, humans make mistakes. They occasionally write code that might leave orphaned data, they write code that might not take into consideration a specific business rule, they change the business rule in the application but forget to change it correctly in a specific module. A FK in a database doesn’t make a mistake. If it’s there, it’s centralised and it’s enforced. Period.
3) Note the database itself, with all its clever internal code, must still lock tables when a parent record is deleted/updated/merged. Now Oracle internally can get around this if an index exists on the FK because it can in effect check the index structure to ensure no new associated FKs are inserted. This “trick” is only available to the Oracle code, a developer can’t effectively lock a row in an index that doesn’t exist. Therefore, if Oracle itself is forced to lock a table to ensure the relationship between tables remains consistent, a developer must likewise lock a table during any similar operation. The very best of trustworthy developers only really has the choice between locking tables or potentially allowing “dodgy” data in the system. Give me an indexed FK constraint any day as it will fundamentally be more efficient to implement.
4) Even the best developer, writing the most perfect of error proof code, can’t necessarily control how data is accessed and changed in the database. In highly complex environments (note some of our applications have 2,500+ tables), with these users that make mistakes, it’s virtually impossible for the application to cater for every possible data “cleanup” scenario. Therefore, DBAs (in particular) and other power users are frequently required to go in the database via the “backend” and clean things up. I
don’t know of any of our more complex applications, be they in-built or 3rd part apps, where this hasn’t been necessary. Business rules in the application, with FKs effectively managed in the application are of no use when data is not access via the application. And we can’t wait the 6 months it might take to change the application to prevent these scenarios from occurring or to make the application able to deal with the data spill or data “integrity” issue.
Expanding on this point a little, we have many many Oracle applications. These are “real-world”, modern applications. I can tell you without exception, that the most problematic, worst performing, data integrity error filled atrocities of applications we administer have a common characteristic. They don’t have FKs. I don’t have to imagine an application that has no FKs, we have a number of 3rd party apps, running in production right now that have no FKs, ranging from case management systems to CAD systems that I need to deal with every day. Guess what. They have various performance issues AND all manner of data integrity issues, directly as a result of not implementing FKs in the database. I’m sure these software companies trusted their developers … As a humble DBA, we have little control over the adoption of these various apps, although lessons are being learnt and things are slowly changing.
Not only are data integrity issues much more prevalent in these apps, I also don’t have to imagine the nightmare that is trying to resolve data related issues without FKs in the database. If I have to remove (say) an over classified document after its been incorrectly inserted a few weeks ago into the system and subsequently linked and referenced all over the place, it’s extremely difficult to safely remove it. We have no way of knowing the necessary dependencies, if you delete a row here or there, how to ensure there’s no orphaned data left behind as a result and measure the impact in other parts of the apps. Interestingly, even the software vendors themselves struggle and forever provide scripts that don’t clean data out correctly. Application upgrades invariably result in corrupted data relationships.
Our applications with FKs have far less issues and when we do need to clean things up, the FKs both protect us and clearly document what needs to be done. One can’t “accidentally” delete a parent row, leaving orphaned data behind, the enabled FKs won’t let us. We know the impact of changing this column definition or dropping this column, the FKs tell us all the related dependencies. We don’t go in blind, we don’t have to guess, we know. We know that by removing (say) this document from the system, it will impact all these related cases, that this person will no longer have these details recorded against him. And no, having some data integrity issues due to FKs being left out is not acceptable. The implications of having say orphaned data might mean a vital document containing critical intelligence information is inaccessible by the application, resulting in perhaps a fugitive being able to leave the country or conversely, sensitive data still being accessible. The implications here can potentially be life threatening. It doesn’t get any more real world than that and FKs are a vital component in ensuring the correct data is visible when it should be.
When the help desk forwards an ORA-02292 – child record found error or if we get one during a data spill cleanup, unlike some who grumble and complain, I just thank goodness for the FK. Because someone is attempting something that a business rule doesn’t allow and hasn’t thought through the implications. Without the FK in the database, the operation would have succeeded and caused a data integrity issue. No, the child records need to be removed as well or no, the parent record needs to be logically deleted instead or no, you simply can’t now perform that operation due to the live data in the system, etc. The FKs help to protect us from ourselves, be it at the
application level or indeed via any accesses to the database.
Without FKs, one is blind to the relationships and dependencies between objects within the database. The application is not a pretty place to search and neither is the documentation, assuming it exists and is up to date.
Not only is the DBA and others “blind” to these relationships if the FKs are not enabled in the database, but so is the CBO. A point you’ve missed a little in the paper is the performance implications of the CBO not being able to make important assumptions on the data relationships if FKs are missing. It’s not a good idea to hide information from the CBO, it reduces its options and limits its decision making processes. It can’t for example determine and ignore redundant join conditions, it can’t make use of query rewrite opportunities, it can’t assume it can use an existing index or use an efficient join method, etc. And the CBO gets cleverer and can make use of database constraint information more often with each new release. Without enabled FKs, the CBO can’t use this information and might make sub-optimal decisions as a result.
At the end of the day, I agree with your statement that we should use FKs when they “make sense”. It’s just IMHO, they make sense far far more often than you appear to suggest. I work with applications without FKs in the database, it isn’t pretty !!”
Marcel has started a Foreign Key Discussion website where you can find his paper and make comments on the various subject matters discussed. I’m sure Marcel would welcome any such comments and feedback.
The CBO and Indexes: OPTIMIZER_INDEX_COST_ADJ Part III August 20, 2009Posted by Richard Foote in Index Access Path, OPTIMIZER_INDEX_COST_ADJ, Oracle Indexes.
After a bit of a layoff to organise a few upcoming overseas trips, while watching plenty of Ashes Cricket and the brilliantly funny “Flight Of The Conchords” DVDs, it’s about time I got back to my humble little blog.
In Part II, we looked at a really bad way to set the optimizer_index_cost_adj parameter, by just setting it a really low value and allow indexes to blindly reign supreme in the database.
Remember, the purpose of the optimizer_index_cost_adj parameter is to accurately reflect differences and discrepancies in costs associated with single block I/Os when compared with corresponding multi-block I/Os so that the CBO considers and incorporates these discrepancies in its costings.
A second method of setting the optimizer_index_cost_adj parameter is to set it to a value that attempts to accurately reflect these comparative costs. So if a single block I/O is typically only half as expensive and/or only takes half the time to complete when compared to a multi-block I/O, then a reasonable setting for the optimizer_index_cost_adj parameter would be 50.
So how to set the optimizer_index_cost_adj parameter “intelligently” ?
Well, Oracle has excellent instrumentation and the comparative wait times for each of these types of I/Os are automatically measured and captured by Oracle. A single block I/O as performed typically by an index range scan is measured via the “db file sequential read” wait event while the multi-block I/O as typically performed during a FTS is measured via the “db file scattered read” wait event.
By determining the average wait times for each of these events and comparing the differences, one can determine how much longer it takes on average for one type of I/O to complete versus the other. This will then provide us with a reasonable starting point with which to set the optimizer_index_cost_adj parameter.
One can simply look at these average wait events for the database since startup by querying v$system_event:
SQL> select event, average_wait from v$system_event where event like ‘db file s%read’;
EVENT AVERAGE_WAIT ----------------------- ------------ db file sequential read .59 db file scattered read .78
In order to determine these wait events during a specific time period to perhaps better reflect typical loads during these times, one could also simply run a statspack or an AWR report and look at the wait event section of the report.
So in the above example, a “sequential” read only takes approximately 75% of the time when compared to a “scattered” read. As such, a value of 75 would be an appropriate starting value with which to set the optimizer_index_cost_adj parameter.
With the I/O costing model, the CBO is basing it’s costs on the number of I/Os performed by each possible access path. If an index is only going to take 75% of the time to perform it’s associated I/Os when compared to the time it takes to typical perform I/Os during a FTS, it’s reasonable to adjust the associated costs of an index access down to 75% of its overall costs.
This will hopefully have the desired effect of making it a “level playing field” between an index based access path and a FTS when determining how long all the I/Os associated with each possible execution path might take.
If we plug a value of 75 into the optimizer_index_cost_adj parameter and re-run the demo in Part Iwhere the CBO initially choose the more expensive FTS which had a cost of 65:
SQL> alter session set optimizer_index_cost_adj=75;
SQL> select * from bowie_stuff2 where id in (20, 30, 40, 50, 60);
10000 rows selected.
Plan hash value: 2964430066
| Id | Operation | Name | Rows | Bytes | Cost |
| 0 | SELECT STATEMENT | | 10000 | 175K| 60 |
| 1 | INLIST ITERATOR | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID| BOWIE_STUFF2 | 10000 | 175K| 60 |
|* 3 | INDEX RANGE SCAN | BOWIE_STUFF2_I | 10000 | | 27 |
We note the CBO is now choosing to use the index, which is the more appropriate plan as it provides a somewhat faster response that the previous FTS.
However, if we also re-run the demo from Part IIwith the optimizer_index_cost_adj also set to 75, where previously Oracle initially choose to use a FTS quite correctly:
SQL> alter session set optimizer_index_cost_adj=75;
SQL> select * from bowie where id between 1 and 1000;
1000873 rows selected.
Plan hash value: 1845943507
| Id | Operation | Name | Rows | Bytes | Cost |
| 0 | SELECT STATEMENT | | 1000K| 69M| 16499 |
|* 1 | TABLE ACCESS FULL| BOWIE | 1000K| 69M| 16499 |
We note that the FTS is still selected as the change in the CBO index related costs were not significant enough to change the execution plan. A really low value of 2 for the optimizer_index_cost_adj parameter really stuffed things up previously, but a more appropriate value of 75 in this database has ensured that the FTS is still chosen when appropriate.
So in both scenarios, the CBO is now choosing an appropriate execution plan. By setting the optimizer_index_cost_adj parameter in a logical manner, consistent with the relative wait time differences between single and mutli-block I/Os, the CBO is more likely to choose appropriate execution plans.
Of course, there are always likely to be some discrepancies when dealing with such “averages”. We only have the one parameter after all which impacts the costs of all index range scan access paths, so we can only deal with averages. Perhaps there are some specific indexes which take significantly more (or less) time to complete than the average, as their associated I/Os are impacted by where the blocks might physically sit on the disk arrays, or on contention issues due to other concurrent activity, or on index caching characteristics (Note: I’ll discuss the optimizing_index_caching parameter at another time), etc. etc.
Same for some specific FTS which have multi-block I/Os that take significantly less (or more) time to complete than the average, as it’s associated I/Os might be also be impacted by similar factors. Perhaps some of these I/O characteristics and timings might change depending on the load on the system at different times of the day or week or month.
But that’s what an “average” value means right, some objects will have a higher (or slower) value while some have a lower (or faster) value.
So setting the optimizer_index_cost_adj parameter is not a precise science although of course the CBO in general is not a precise science either and close enough is usually good enough for the vast majority of cases. The name of the game is ensuring that the parameter is set to a value that’s in the “ballpark” and using the associated wait events to determine comparative wait times for single and multi-block I/Os is a reasonable way to do this.
However, despite being able to set the optimizer_index_cost_adj parameter in a reasonably “intelligent” manner, my preferred method of setting this parameter is still method number 3. That is to simply not set the optimizer_index_cost_adj parameter at all and leave it at the default value of 100 and use system statistics and the CBO CPU costing model instead.
By generating and maintaining accurate system statistics, you can effectively get the desired “level playing field” benefits of a well tuned optimizer_index_cost_adj parameter in a somewhat easier manner but with a few other added benefits as well. I would therefore strongly recommend the use and implementation of system statistics and leave the optimizer_index_cost_adj parameter well alone. IMHO, the optimizer_index_cost_adj parameter is there now only for backward compatibility reasons since the introduction of the CBO CPU costing model.
However, these discussions have not all been in vain because the optimizer_index_cost_adj parameter still has an impact even with system statistics in place. It’s just that the use of the optimizer_index_cost_adj parameter in conjunction with system statistics typically has the effect of screwing up the “level playing field” environment system statistics is meant to create.
Also, the costing formulas for indexes as previously discussed are still very much relevant as the CPU costing model often has little impact on the actual costs associated with using indexes. As I’ll discuss later, system statistics actually achieves a very similar outcome to the optimizer_index_cost_adj parameter. It’s just that it does so in a somewhat different manner by generally increasing the associated FTS costings to a more appropriate comparative value, rather than simply decreasing the index related costs, while taking both I/O and CPU overheads into consideration.