Rethinking Agility in Databases - Part III: Testing


Rethinking Agility in Databases 2/29/2008 2:03 PM


by Max Guernsey, III - Managing Member, Hexagon Software LLC

Introduction

This article builds upon the concepts laid out in its predecessors. If you have not read the introduction, Part I, and Part II, it is recommended that you do so now.

For those of you who have, here is a quick refresher on Part I and Part II of this series:

In Part I, we discussed evolutionary thinking in software development – a major player in emergent design – and found that it is a great fit for software but doesn’t really work for databases. A better analogy is metamorphosis, which also permits emergent design.

In Part II, we discussed deployment. We saw that most of the mechanisms used were fundamentally unreliable or manual. We hypothesized that this might be because we are trying to apply deployment models developed for software – where builds are reliable and generally happen as a separate step from deployment. Finally, we came to the conclusion that database deploys are database builds. Because of this fact, a “build in place” mentality might be more useful than a “deploy to a place” mindset.

In many respects, this installment is a continuation of Part II that “closes the loop” with Part I. We will cover how testing works in software and how it works in databases. Of course, we will find the differences, and see if there is a way we can reinvent the practice of testing software in the database domain.

Testing in Software

While there are many kinds of tests that we write in the world of software, there are generally two with which developers spend most of their time: unit and acceptance tests.

Acceptance tests tell you whether the thing(s) they validate are useful in some way. They tell you whether or not your application does the job it is required to do. They tell you if your components are fast enough. They are all about validating that the software fulfills business intent.

Unit tests, on the other hand, prove that your system was built correctly. They do not directly test design but they are tied to public interfaces closely enough to test behavior at a relatively fine grain. Unit tests are about proving that the developers’ intent is expressed by the thing(s) they test.

It is generally accepted that unit and acceptance tests which validate programs and components confer confidence upon an entire species – all of the binaries – rather than just the instance they are testing. Yet, these tests do not directly validate your code – which is the definition of the species. Instead, they test the output of your code, the definition of an individual. So how is it possible for a unit test or an acceptance test to give us confidence in the species as a whole?

The answer is reliable builds. Because there is so much fidelity in the build and deployment paths defined for programs, the object under test ends up being a good representative for other objects of the same kind. When you test something that is prototypical, you gain an understanding of its archetype.

Creating a Species of Databases

If you’re writing acceptance tests and unit tests for your database you’re already ahead of the game, so to speak. However, there’s a lot more you can get out of your testing.

In the previous section we saw that acceptance and unit tests are a great way to gain confidence in how a program or object will behave because they act on members of a species (or prototypes of an archetype). This is often not the case in the database world: databases are usually not part of a species.

In Part II of this series, we saw that we could gain some confidence in how databases are constructed by focusing on a repeatable, reliable build process. That’s easy to say… but how do we get a repeatable and reliable build system? The answer to that question is the same as the answer to “how do we build a species of databases?”

Reliability comes from tests. Compilers have long been tested through sheer use, and many are covered by automated tests at this point. Why, then, should we not test our database build mechanisms just as fastidiously?

The Metamorphosis Force

The key to reliable builds is recognizing, and testing for, the metamorphic nature of data storage. Unlike a program, which can be rebuilt from the ground up in seconds using a definition of its species, databases transform over time. At first they are created and built up to whatever state is required of them; from that point forward, they are modified using delta scripts.

So, if every step except for the first one is defined as a delta, it seems like maybe we should find a way to treat all of the steps the same. In fact, we can do this: all we have to do is treat the first step as a transformation from nothing to something. Once we make that shift in our minds, then our database is just a series of deltas applied in the correct order exactly once starting from nothing – a simple thing to understand.
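As a concrete illustration, here is a minimal sketch of such a build mechanism in Python using the standard-library sqlite3 module. The table and delta scripts are invented for the example; the point is that version 1 is just another delta, applied from nothing, and every delta runs exactly once, in order:

```python
import sqlite3

# Each delta transforms the database from version N-1 to version N;
# version 1 is simply a transformation from nothing to something.
DELTAS = [
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",  # v1
    "ALTER TABLE customer ADD COLUMN email TEXT",                 # v2
]

def build(db):
    """Apply every outstanding delta exactly once, in order, starting
    from whatever version the database is at now."""
    current = db.execute("PRAGMA user_version").fetchone()[0]
    for version in range(current + 1, len(DELTAS) + 1):
        db.execute(DELTAS[version - 1])
        db.execute("PRAGMA user_version = %d" % version)  # record progress
    return db.execute("PRAGMA user_version").fetchone()[0]

db = sqlite3.connect(":memory:")
build(db)  # applies v1, then v2
build(db)  # running again applies nothing: every delta ran exactly once
```

The `PRAGMA user_version` bookkeeping is one possible way to track how far a given database has progressed; any durable version marker would serve the same purpose.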

Now all we have to do is gain confidence in those change scripts…

Transition Tests

We’ve recognized the influence of metamorphosis and discovered that we can embrace that force. By treating our database as something that regularly changes shape and is defined in terms of those alterations rather than in terms of what it should be at the moment, we give ourselves a powerful new descriptive tool. Now we have to figure out how to harness those transformations and bring them under control.

I’m calling this “transition testing.” It’s actually a very simple concept, once you’ve gotten to the point where you see the need. In fact, many of you could probably stop reading here and guess how the rest of this article goes. To transition test, you bring a data structure up to one version (one series of transformations), fill it with some data, and then advance the database exactly one step. After you’ve done that, you validate that all of the transformations were applied correctly. I even test that the tables which should not have changed are still the same.
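A transition test along those lines might look like the following sketch (again Python with sqlite3; the schema and the two versions are illustrative):

```python
import sqlite3

V1 = "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)"
V2 = "ALTER TABLE customer ADD COLUMN email TEXT"

def test_transition_v1_to_v2():
    db = sqlite3.connect(":memory:")
    db.execute(V1)                                            # bring the structure up to version 1
    db.execute("INSERT INTO customer (name) VALUES ('Ada')")  # fill it with some data

    db.execute(V2)                                            # advance exactly one step

    # Validate the transformation: new column present, existing data intact.
    assert db.execute(
        "SELECT id, name, email FROM customer").fetchone() == (1, "Ada", None)
    # Tables that should not have changed are still the same.
    names = [r[0] for r in db.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    assert names == ["customer"]
    return db

test_transition_v1_to_v2()
```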

Unlike unit tests and acceptance tests, which need to run in a vacuum to be repeatable, it is actually desirable for transition tests to depend upon one another. If the test to get from version one to version two added some data, carrying that data forward through the rest of the database definition could only increase the realism of your transition tests.
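That chaining can be as simple as running every transition against one shared database, in order. A sketch, with hypothetical table and step names:

```python
import sqlite3

# One shared database: each transition test depends on the one before it,
# so data added by an early step flows forward and keeps later tests realistic.
db = sqlite3.connect(":memory:")

def to_v1(db):
    db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)")
    db.execute("INSERT INTO orders (total) VALUES (9.99)")  # data added here...

def to_v2(db):
    db.execute("ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'USD'")
    # ...is still present, and picks up the new default, when this step validates.
    assert db.execute(
        "SELECT total, currency FROM orders").fetchone() == (9.99, "USD")

for step in (to_v1, to_v2):  # each transition runs exactly once, in order
    step(db)
```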

Transition testing gives you confidence that all databases created using the same script are structurally similar. What does that sound like? That sounds like a species to me. With transition tests you know that all databases are individuals of a species – like binaries – rather than each one being “unique and beautiful.”

"...but Wait!"

Some of you may be wondering “If I define my database in terms of deltas won’t it eventually get confusing?” If you were only doing transition testing and defining your database in terms of changes, then yes it would eventually be hard to derive context when examining the database script.

However, you should not just be doing transition testing. You will still need unit and acceptance tests to validate customer and developer intent. You could think of the acceptance and unit tests as the definition of what the database is or should be now, instead of a mammoth SQL script or a “source” database. This would be a shift in valuation similar to the one many of us have made from source code to tests.

In addition, with a reliable database build model, you are constantly just seconds away from a working database that is prototypical of others in its species. This means you can easily create a database just for exploration and analysis. The new database can be poked, prodded, and otherwise exercised by whatever tools you desire, just as you would with a gigantic database script or a “source” database, with one important difference: you have some certainty, rather than hope, that it will be like other databases created using the script.

Conclusion

For unit and acceptance tests to be really useful, the confidence they create needs to propagate from the object they are testing to the species of which it is a member. This is already the case with programs and components but not with data structures.

Most database deployment techniques do not create a species. In order to get confidence in an archetype, you must first create one. To create a species of databases, we need to take their metamorphic nature to heart and create a build-in-place tool that applies transformations in the correct order, each exactly once.

To achieve all of this, we must regiment and marshal our databases’ transformations using transition tests to ensure each database grows smoothly. This involves all of the traditional testing plus transition testing, which validates the metamorphic path along which your data structures grow.