The Problem

In this article I’m going to discuss a problem which most people wouldn’t even consider a problem: managing project configurations when automating deployment workflows.

Now this first sentence probably raises more questions than it answers:

  • What do I mean by project configurations, and more precisely by managing project configurations?
  • Why do I consider this a problem?
  • And why do I think most people don’t?

Let’s address these questions one at a time.

An application is usually deployed to different environments. The first one is typically a test environment where the application is tested and shown to customers or product owners. Then – according to the company’s deployment workflow – a number of other environments might follow, until the application is finally deployed to a production server.

These environments usually differ from one another in a number of parameters. On a test server, for instance, one would probably not use the same database as on a production server. Some features might be turned on or off. The credentials for some API the application uses would certainly be different. And so forth.

Parameters are usually stored in configuration files, which also differ from one environment to another. So if there were, say, a test, an acceptance and a production server involved in the deployment cycle, one might have the configuration files application-test.properties, application-acceptance.properties and application-prod.properties (this example is simplified; usually there are more configuration files, such as a server.xml or logback.xml). Before an application is deployed to a server, the respective properties files have to be put in place.
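To make this concrete, two such files might look like the following (the property names and values are made up for illustration):

application-test.properties:

    database.url=jdbc:postgresql://test-db.internal:5432/app
    feature.new-checkout.enabled=true
    payment.api.key=test-key-1234

application-prod.properties:

    database.url=jdbc:postgresql://prod-db.internal:5432/app
    feature.new-checkout.enabled=false
    payment.api.key=<the real key>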

The typical approaches to this are as follows:

Either the files are stored in the project itself and included at build time in the artifact (JAR, WAR or similar deployable file) that is to be deployed. This is of course hard to maintain, as properties cannot be changed without triggering a new build. Or the files are stored externally on the web container or application server, i.e. they are put in a designated directory. When a property changes, someone simply changes the respective configuration file and reloads the web container or application server.

Most people manage their project configurations that way. As both solutions work for better or worse, managing project configurations is usually not considered a problem.

Still, there are – at least – three downsides:

  • Both solutions are error-prone. Someone might forget to include a property, there might be typos, and so on.
  • If one works with external configuration files, it is hard to keep track of changes, because they are not under version control. Imagine there is an error because of some configuration that has changed, and the change has not been documented. Then some poor soul might spend hours finding out what the problem is.
  • Because the right configuration needs to be picked, fully automating deployment is a problem. This one step – including the configuration files – has to be done manually. One could of course automate copying a file to the correct location, or set markers in a file, iterate over them with a script and replace them with values from some store. Still, the problem remains: the correct file or the correct properties (i.e. the one(s) matching the desired environment) must be chosen.

The Solution

At techdev we decided to tackle this problem and develop a tool that manages our configurations in a way that allows fully automated deployment. Thus an application called Shelf was born, and a prototype was implemented by my colleague Markus.

Shelf – in a nutshell – is a web service endpoint from which one can retrieve the configuration files for a specific project, in a specific version and for a specific environment. The HTTP API includes user administration, but the main part is the configuration retrieval. The API call looks like this: GET https://shelf.io/your-project/environment/version, for example GET https://shelf.io/shypp.it/test/2.1.54.

The call returns a zip archive containing all relevant configuration files matching the given parameters (in the above case, all configuration files for the project shypp.it in version 2.1.54 on the test environment).
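For illustration, fetching and inspecting such an archive from a script could look roughly like this (a Python sketch; shelf.io and the project name are just the example values from above, not a real public endpoint):

    import io
    import urllib.request
    import zipfile

    # Fetch the configuration archive for one project, environment and version ...
    with urllib.request.urlopen("https://shelf.io/shypp.it/test/2.1.54") as response:
        archive = zipfile.ZipFile(io.BytesIO(response.read()))

    # ... and list the configuration files it contains.
    print(archive.namelist())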

Shelf takes files from a VCS repository. This repository is not run or maintained by Shelf but by the user; its URL and credentials are provided within a project configuration in Shelf. Currently we support Git, but we envision supporting other mainstream VCSs like SVN and Mercurial.
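Conceptually, such a project configuration contains little more than the following (the format and field names here are made up for illustration and are not Shelf’s actual syntax):

    project = shypp.it
    repository.url = https://git.example.com/configs/shypp.it.git
    repository.username = shelf-reader
    repository.password = ...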

Shelf works on top of a directory structure. The files need to be placed in this structure so that Shelf can retrieve them based on “project”, “environment” and “version”. Since the directory structure is a tree, we encode this information in it as follows: the leaf nodes are actual files whose file names define their version. The nodes on the path to them denote project, environment and file name.

Let me provide an – in fact not very far-fetched – example to clarify matters. Say an application called shypp.it has two kinds of configuration files, application.properties and logback.xml, and there are three servers shypp.it is deployed to: test, acceptance and production. The folder structure in the storage would then look like this:

/shypp.it
----- /test
---------- /application.properties
--------------- 1.0 (file)
--------------- 2.0.3
--------------- 2.0.10
--------------- ...
---------- /logback.xml
--------------- 1.0
--------------- 2.0
--------------- 4.0
--------------- 4.0.1
--------------- ...
----- /acceptance
---------- /...
----- /production
---------- /...
----- /...

In this example, the original application.properties file for the test server originates from shypp.it version 1.0. In version 2.0.3 there was a change in application.properties – say, a port number changed. The next change was made in version 2.0.10, and so on. The same principle applies to logback.xml.

Let’s assume one called GET https://shelf.io/shypp.it/test/2.0.4.

The zip archive they retrieve would then contain application.properties in version 2.0.3, because that is the highest version equal to or smaller than the requested version 2.0.4. All changes made after that point are not relevant for version 2.0.4. Similarly, logback.xml would be delivered in version 2.0.
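The selection rule can be sketched in a few lines of Python (this is only an illustration of the rule, not Shelf’s actual implementation):

    # Pick the highest available version that is equal to or smaller than the
    # requested one; versions are compared numerically, component by component.
    def parse(version):
        return tuple(int(part) for part in version.split("."))

    def resolve(available, requested):
        candidates = [v for v in available if parse(v) <= parse(requested)]
        return max(candidates, key=parse) if candidates else None

    # The versions available for application.properties on test (see the tree above) ...
    print(resolve(["1.0", "2.0.3", "2.0.10"], "2.0.4"))      # -> 2.0.3
    # ... and for logback.xml.
    print(resolve(["1.0", "2.0", "4.0", "4.0.1"], "2.0.4"))  # -> 2.0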

Now how would this setup help automate the deployment of – sticking to our example – shypp.it version 2.0.4 on our test server?

We imagine it like this: the HTTP call would be made automatically by the deployment process, to which the necessary parameters would be passed through a prompt (i.e. “Which project do you want to deploy?”, “Which version do you want to deploy?”, “Which environment do you want to deploy to?”). The same process – be it a shell script, an Ansible playbook or anything else – would unpack the zip archive and move the configuration files it contains to the correct place. Then the artifact would be deployed as usual.
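A rough sketch of that glue code, here in Python (a shell script or an Ansible playbook would work just as well; the target directory is only a placeholder):

    import io
    import urllib.request
    import zipfile

    # Ask for the parameters of this deployment.
    project = input("Which project do you want to deploy? ")
    version = input("Which version do you want to deploy? ")
    environment = input("Which environment do you want to deploy to? ")

    # Retrieve the matching configuration archive from Shelf ...
    url = f"https://shelf.io/{project}/{environment}/{version}"
    with urllib.request.urlopen(url) as response:
        archive = zipfile.ZipFile(io.BytesIO(response.read()))

    # ... and unpack it to wherever the application expects its configuration.
    archive.extractall("/opt/myapp/config")

    # From here on, the artifact itself is deployed as usual.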

In the next section the implementation of Shelf will be discussed in more detail.

Implementation

I will elaborate on three aspects of the implementation and explain why we made the choices we did:

  1. Properties update
  2. Storage
  3. File access

Ad 1.) Technically, changing properties can be done in two ways: one can either exchange individual property values within the files, or replace the files as a whole. Both approaches can be automated. Because replacing whole files is much easier than parsing properties in a JSON or XML structure, or iterating over a file with markers and replacing individual values, Shelf takes the second approach and delivers whole configuration files.
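For comparison, the first approach would mean maintaining template files with markers and a small substitution script along these lines (a simplified sketch; the marker syntax and the value store are made up):

    import re

    # Values for the current environment, taken from some store (made up here).
    values = {"database.port": "5432"}

    # Read a template containing markers such as ${database.port} ...
    with open("application.properties.template") as template:
        content = template.read()

    # ... replace every marker with its value ...
    content = re.sub(r"\$\{([^}]+)\}", lambda match: values[match.group(1)], content)

    # ... and write the finished configuration file.
    with open("application.properties", "w") as target:
        target.write(content)

With whole-file replacement, all of this collapses into copying a single file.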

Ad 2.) Actually, any means of storage would do the job, and there’s a huge selection. Most often I’ve seen configuration properties “stored”, i.e. documented in the project wiki or in a README file in the project itself. If data protection is considered very important, people often use a password administration tool like KeePass (this only works, though, if individual property values are exchanged rather than whole configuration files, and as described under 1.) we decided against that approach).

To keep it short, we decided to use a version control system for the following reasons:

  • Auditing of changes: With version control every change is tracked: when it was made, by whom, and what its predecessor looked like. No more wondering who made the last change, and what exactly was changed if something breaks. Plus in that case it’s easy to go back to the previous configuration.
  • Basically any editor can be used for editing and diffing before changes are committed to the VCS. This would not be the case if the configuration files were, for instance, stored in a database. Again, there’s a huge selection of editors out there and most of them do their job really well, so Shelf does not need to manage editing and diffing and can thus be kept leaner.

Ad 3.) There are two places where access rights to the properties can be managed: the storage and the configuration management application itself.

Read and write access can be controlled to some extent by the VCS anyway, so we decided to take advantage of that. On top of this, Shelf itself offers access control at the project level, in order to provide a more fine-grained means of access control.

Having decided to use version control, we directly face the question of whether to store the configurations of all projects – assuming there are multiple projects one wants to deploy – in a single Git repository, or to give every project its own configuration repository. In the current implementation we took the latter approach, but we plan to change this in the future. We might even support both ways.

All these considerations reveal why we chose to build a web service in the first place. The reason is (apart from the fact that we have lots of experience in building web applications) that we did not want to spend any time building some fancy GUI when there are plenty of tools for jobs like storage, editing, version history and access control. Why not use them? What’s left is the actual configuration retrieval, and a web service is an easy way to provide it.

Discussion

This article is meant to serve as a basis for discussion. We would like to know how others manage their configurations in the course of deployment, as well as their experiences with their workflows. We are curious to hear what people think of our approach (i.e. Shelf, the way it was described here). Finally, we would love to hear which requirements an application would have to meet so that it would really help people automate their deployment processes. Thank you for your feedback!