As a developer, I often get across some database designs, that are quite complex caused by a developer not quite understanding the problem (we’ve all been there!), and therefore cannot solve it correctly.
The problem is as follows (or similar):
You have 2+ tables:
It’s quite simple, you have two types of data (media and documents), that you want to store in your database. I get that, I would too.
Now the requirements are as follows:
Both media and document has a set of user defined properties.
Said properties must store the following values: Type and Value, and a reference to both Media and Document.
There are a couple of ways to solve this lil’ problem, one (the one I encounter the most):
In this setup the Property-table knows about both media and document. We could make the two foreign keys nullable, either way we depend heavily on our code to keep media and document properties separated. And what happens if we add an other type (say Users), then we have to add a new foreign key to the property-table, and expand our code even more.
An other approach is this:
I must admit, I have done this one as well as the other one, and just admit it, so have you at some point!
So what are the pro’s and con’s of this setup: Well the pros are simple, neither media or document are referenced in the property table, we can have as many properties as we want per media and document, and we can quite simple add other types, such as Users. BUT:
When we have this setup, we must rely heavily on our code to help us not to have the same property on more than one media, and to ensure we don’t mix media properties with documents and users. And if we add an other type (Users) we must create, not only one, but two new tables, and still expand a complex code to handle that new type as well as the other types.
So how can we solve this problem?
We have Media, Documents and more types, that has dynamic properties without the other types must know about it, we could do this:
Yeah, I’ve also done this one. And this is almost, (I wrote, almost), as bad as the other ones. Well no property can be on more than one media (or document, or whatever), and no property can be on both media and document, so whats the problem?!
Well, for starters, we have to tables instead of one, per type. If we add an other field to our properties, we must add them to all of our *Property-tables. And if we want to list all properties, including the media/document/user/whatever it is attached to, it’s nearly impossible.
So here’s the solution, I find most fitting for the problem:
So, this is my solution. I have added a Node-table, with the shared fields from Media and Document (ID and Name). Removed ID- and Name-fields from Media and Document, added a NodeID field, as both primary key and foreign key, this field must NOT be autoincremented! It will not work, then I added a Property-table, that references the Node-table.
The pros and cons: The pros are easy, One table per type, each type gets its ID from the Node-table, all properties are stored in one table, referencing the Node-table, so a Document can get its properties, using only its primary key. No property can ever be on two entities at once, and no entity knows about other entities or properties, except its own.
The cons are, that we must have some code that handles the inheritance. When I make a SELECT * FROM Media, I must make a JOIN on the Node-table as well. If you’re a .NET developer, like I, then you should take a look at the Entity Framework, as it handles this smoothly. I will write a post on that later on.