Managing relational tables in a database is a topic that all developer will eventually bump into. There are quite a few ways of managing relational data, and the best solution sometimes can be hard to find. As a developer you have to look at the scope of the application, the types of tools you have on hand, and where you want the responsibility for the referential integrity to reside. The most often used ways of maintaining referential integrity are as follows:
- foreign key constraints – simply a key in a table that refers to the index of another database table. MySql Server as of 3.23.44 supports this feature in InnoDB based tables. You can use ON DELETE and CASCADE functions to lock in update and delete logic. There are several benefits to pushing the ref. responsibility to the database. Introducing constraints makes it much more difficult for a developer to frell with the database. Also, the processing overhead for checking the tables is moved from the application server to the database layer. This can be good for low horsepower or high demand systems. This also frees the developer from writing code that maintains integrity, such as searching for orphans, etc.
- application logic – oh, the joys of writing application code to maintain your database tables. I would have to say that one of the best advantages of maintaining the database from the application layer is the amount of control you have. Also there is the issue of portability. It’s a lot easier to maintain code that doesn’t rely on a specific database set up to work properly. You will have to write more code for sure, but if you know your SQL, the additional lines of code can be fairly minor.
One pitfall I want to mention here is one that some inexperienced coder might fall into, and thats the concept of looping over what should be a JOIN query to produce the same results. Take the example of a table of customers, with a one-to-many relationship to the orders via a relational table to info stored in an orders table. (Each customer has multiple orders.) If the deletion of a customer record mandated the removal of all of their orders from the system, a coder may be temped to retrieve the customer’s id, then query the relational table for all orders that belong to the customer. Once that query is complete the developer will use a loop and a delete query in the application logic to perform the delete. It sounds crazy, but I’ve fixed a ton of applications that suffer from this type of coding.
There are several problems with the above example. The most important issue deals with integrity. If any part of the above fails or is interrupted you will end up with orphaned records in the database. Since the type of coding that produces orphans, rarely provides for their cleanup, you will tend to build a backlog of dead records in your table. nk no good. To a lesser extent, there are serious performance issues to deal with if you are running a loop of deletes against a table.
An overlooked solution to the above issue is that you can use a JOIN in a DELETE just as you can a SELECT. Using the customer / order relation described above, one could do the following:
DELETE orders.*, ordersRelativetoCustomers.*
LEFT JOIN orders
ON ordersRelativetoCustomers.orderId = orders.orderId
WHERE ordersRelativetoCustomers.customerId= ’12’
Using the code above you can delete multiple orders from a single customer without relying on a application logic loop.