Rethinking Agility in Databases - Part I: Evolution

Rethinking Agility in Databases 2/23/2008 6:57 AM

by Max Guernsey, III - Managing Member, Hexagon Software LLC

Introduction

If you haven’t read the introduction for this channel of articles, I highly recommend you do so now. For those who just need a brief reminder, the point is this:

In each article, we pick a practice used in traditional, program-centric Agility. We deconstruct that practice until we have reduced it to its essence: principles and values. Then, we put Humpty-Dumpty together again, with data in mind. The result should be a new practice which fills the same need as the one with which we started but is applicable to persistent data stores.

In this installment we discuss what I think is one of the most central but least portable practices: evolutionary software development. Why is this so important? It's the practice that lets designs emerge. Without the ability to change our design for the better, we have to plan for every kind of variation, if not every variant. Why does it not work with data? Simple: databases can't evolve.

Why Evolution Works in Software

It's more than just a buzzword or a loosely fitted analogy. The word "evolution" describes quite precisely how code changes in the software world. This section covers why. We look at several traits of evolution, as we understand it in biology, and how they relate to an Agile software product.

Evolution acts on species, not individuals. While we often like to think of our code as "the program," it is not. Source code is the definition of a species of programs or components. It is like DNA: a plan for how to build an individual program.

Evolution also functions by way of new generations replacing older ones. When we change the source code, we evolve the species. Over time the new generation of that species – the new version of our software – supplants its predecessors. The analogy is very strong so far.

Let's continue.

Evolution in biological species is, primarily, driven by environmental factors: new meteorological conditions, new competitors, changes in available resources... these are all the kinds of things that can accelerate the evolutionary process. All of these things are true of source code, almost verbatim. Changing environmental factors lead to new tests which, in turn, drive changes in source code.

Now that we've established that the term "evolution" does apply to software in an Agile environment, let's consider why it is beneficial to think of changing code as evolving. If code is evolving, it is always getting better. If code is always getting better, then we don’t have to do a big design up front; instead, we can design just to the needs of today in a way that allows for change tomorrow. This is a key element of Lean and Agile software development: emergence.

Why Evolution Does Not Work with Data Structures

Thinking in terms of evolution is appropriate in an iterative software development environment and adds value. However, it does not apply to data storage systems and, as a result, such a mindset can actually cost us something.

Let’s go through why:

Whereas evolution drives change within a species by adding new attributes and subtracting useless ones, the information in an individual database is likely to change more in a single week than its design changes, through all feature-related work, over its entire life. In other words, there's something else that drives database change.

We've already seen that the vehicle of evolution is the death of one generation and the birth of another. That applies to code but not to databases. Individual databases in an agile environment need to be able to survive multiple iterations. This is especially true of the information they contain.

While evolution’s focus is on the species, we find that individual databases often take center stage. Rarely do we care if we lose a binary - we can always just rebuild it - but I'm sure that, at some point, you've heard someone say (or been the one who said) "the database is down!"

While there are lots of reasons why databases and evolution do not match, the real deal-breaker is the fact that databases are persistent things that need to live for a long time but evolution is a process of constant destruction and renewal.
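
To make that deal-breaker concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module and a hypothetical customers table, neither of which comes from the discussion above. It treats a database the way we treat a binary: the improved design is delivered by building a fresh "next generation" from its definition.

```python
import sqlite3

def build_generation(schema_sql):
    """Build a brand-new database from its definition, like compiling a binary."""
    db = sqlite3.connect(":memory:")
    db.execute(schema_sql)
    return db

# Generation 1 goes into service and accumulates information.
gen1 = build_generation("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
gen1.execute("INSERT INTO customers (name) VALUES ('Ada')")
gen1.commit()

# Generation 2 has the improved design, but it is a different individual.
gen2 = build_generation(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)

rows_before = gen1.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
rows_after = gen2.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(rows_before, rows_after)
```

The new generation has the better structure and none of the information. For a binary that trade is free; for a database it is usually unacceptable.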

Databases Still Have to Change

We started applying the evolutionary model to code as a way of responding to change. Up until that point, change was considered bad. Now we realize that change is not bad; the inability to respond to change is the problem. Change is inevitable.

This is the principle of emergence. For a detailed exploration of this principle, I recommend Scott Bain’s upcoming book:

Emergent Design: The Evolutionary Nature of Professional Software Development

Just because we cannot apply the evolutionary analogy to our database development doesn’t mean we can go back to thinking "change is bad." What we must do, instead, is find another model we can use to react to change.

Data Stores and Metamorphosis

In order to attain any degree of Agility, we need to shift how we think about databases. We can no longer afford to think of them as things we build and then leave standing like some Roman bridge. Nor can they evolve like our code. I propose that even though databases do not evolve, they do metamorphose.

Let’s look at how metamorphosis matches up with database development...

Metamorphosis allows individuals to change form, and databases are individual objects which need to be able to change form. Metamorphosis allows a single entity to gain new attributes and shed old ones, and database instances need the ability to change their design while retaining their identity. Metamorphosis uses the individual as its focal point, and database instances (which are individuals) tend to play a large role in development efforts.

Hopefully, you agree that the word "metamorphosis" is very apt as a description for how an Agile database would need to change.
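
As a rough illustration of what metamorphosis could look like in practice, here is a minimal sketch. It is one possible reading of the idea, not a prescribed technique. It assumes Python's built-in sqlite3 module; the TRANSFORMATIONS list, the metamorphose function, and the customers table are hypothetical names invented for the example. SQLite's user_version pragma is used to record which form the individual database currently has.

```python
import sqlite3

# Hypothetical ordered list of transformations. Each one changes the form of
# an existing database without discarding the information it already holds.
TRANSFORMATIONS = [
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)",
    "ALTER TABLE customers ADD COLUMN email TEXT",
]

def metamorphose(db, target_form):
    """Bring this individual database from its current form to target_form."""
    current_form = db.execute("PRAGMA user_version").fetchone()[0]
    for form in range(current_form, target_form):
        db.execute(TRANSFORMATIONS[form])
        # Record the new form inside the database itself.
        db.execute(f"PRAGMA user_version = {form + 1}")
    db.commit()

db = sqlite3.connect(":memory:")

# The database takes its first form and starts accumulating information.
metamorphose(db, 1)
db.execute("INSERT INTO customers (name) VALUES ('Ada')")
db.commit()

# Later the design changes. The same individual gains a new attribute.
metamorphose(db, 2)

row = db.execute("SELECT id, name, email FROM customers").fetchone()
form_now = db.execute("PRAGMA user_version").fetchone()[0]
print(row, form_now)
```

The same database instance survives both steps: the row inserted under the first form is still there under the second, with the new attribute simply empty until someone fills it in.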

Conclusion

The value of evolutionary thinking in Lean/Agile software development is the facilitation of emergence. However, evolution does not allow database designs to emerge, because databases cannot evolve the way code does. To permit emergent design in the database world, we need to adopt metamorphosis as the model for change in persistent data structures. Once we shift our paradigm in that way, we may then be able to produce tools, practices, and processes to support truly agile database development.