Your data schema will change. This is as certain as death and taxes (although the income tax is really only about 100 years old!) You design your project and your data schema, you implement and deploy your solution and... boom!, you need to change something. When your database design needs to change and you have code in production, or your tables are being accessed by many other applications, you can't just rename columns or delete tables. Scott, in his book Refactoring Databases: Evolutionary Database Design, talks about how to implement changes to your schemas without affecting current production code.
During his talk at SD West, Scott mentioned that data problems in the United States is costing us about 600 billion dollars! I truly believe that with proper evolutionary database design practices in place, we can bring that value down quite a bit, perhaps making some money in the process. After all, $600 billion seems like quite a reasonable market to me!
Unfortunately, we currently lack tools to version our database schemas. The only one I've come across so far is from Ruby on Rails and is called Migrations, which is really useful. You basically edit a file for each new revision of your schema that tells Migrations what to do to get your schema to the next version number. Then, should you want to go from version 2 to 8, Migrations will do that for you automatically.
But I need something more powerful. I work with two other developers and we might each have a different version of a database design on our desktops in order to isolate, or sandbox, our development. Once we are ready, we apply our changes to the integration server. This means that four different computers will have different schema versions. If I integrate first, and then Joe integrates, what happens? I added a column A to a table and he removed the table! We have a conflict. What if he added a column B and I added a column C to the same table? We need to merge.
Wait, that sounds a lot like source control! Yep, and Migrations doesn't do that. I want to create a tool that handles database schemas the way source control handles source code, with commits, branches and merges in a way that allows programmers to experiment with their designs. Exploratory and evolutionary data schema design; we'll see where this takes me!

0 comments:
Post a Comment