(I’m speaking at an undergraduate database class today at UNCW and needed a place to stick some supplemental notes, and thought I’d put them here.)
Some best practices in data modeling for web application development, particularly for “backend” business systems:
1. Initially focus on database tables, fields and relationships and not what the web pages are going to look like. If the data model is right, the user interface will (pretty much) take care of itself.
2. For all major “things” in your system, have a table for notes: product_note, client_note, quote_note, task_note, etc. (And encourage users to add notes.)
3. For each database table, have the following fields:
- date added
- date last modified
- last modified by
4. At the application level, have edit locks to prevent two users from trying to edit the same record at the same time, with the one submitting the form last overwriting the one who submitted it first. (At the data model level, in addition to the three recommend fields above, also have “date last locked” and “last locked by”.)
5. Think through (and discuss with your client) one-to-many vs. many-to-many relationships. For example, when setting up a database for a school, does each class have one and only one teacher or can a class have multiple teachers? When discussing this with a client, ask them if the relationship in question needs a select list (one-to-many) or checkboxes (many-to-many) in the web form. In our example, some schools will be one-to-many but others will be many-to-many. Getting this right from the start will save headaches down the road.
6. Log user access — both system access (logins) and page access (page views and actions). You probably won’t often need to know who accessed what page when and from where, but when you do, having that info in the database is golden.
7. Be careful about losing historical information. For example, if you have a product table that contains “unit_cost”, you might be tempted to not include “unit_cost” in your order_line database table, since that info is already available in the product table and “don’t duplicate info” is a value. However, over time the products’ costs are likely to change, and at that point you won’t know what the cost was for a product ordered last year unless you captured that info in the order_line database table. The solution is to have the “unit_cost” in the order_line table.
8. When setting up the data model for user access, seriously consider having users and usergroups be “many to many”, so that user access to particular parts of the system is “nonlinear”. I’ve seen others’ systems where each user could be given basic access, or intermediate access, or admin access, and I’ve heard about a system where one’s access rights were a scale from 1 to 100, with someone at a 70 getting everything below that, and someone at an 80 getting everything below that. It’s much better to have multiple usergroups (e.g. customer service, accounting, marketing, etc.) and then allow users to be in multiple usergroups based on their roles within the organization, which each usergroup having its own set of access permissions.
9. I name all of my “join table” tables with a “j” at the front, e.g. jstudent_class and jemployee_usergroup. This is really helpful for quickly knowing what relationship type the table encompasses.
10. Sometimes calculated fields are OK. For some reason at one point in my life I was under the impression that calculated fields in database tables were evil, but there are times when they are very helpful, particularly depending upon the framework you are using. With wm, we can set a flag for a field being a calculation, so that it doesn’t show up in the add/edit form, but it does show up in the display and the search filters. The only catch is that we need to add the code for the calculation, but we have things in place to handle that.
———–
Any comments, or any more you’d like to add?