The hidden truth about CMS upgrades

This post was originally published on AIIM's Expert Blogs by Serge Huber, CTO at Jahia Solutions

********
Among the criteria most often ignored in requests for proposal (RFPs) are content management system (CMS) major version upgrades. Most clients assume that upgrades will be transparent, and that they will only need to follow a simple procedure to move from one version to the next. This assumption is especially common now that the same person probably owns a smartphone and is used to tapping the “Update all” button to have all the native applications upgraded automatically.

Unfortunately, the main difficulty lies in understanding that a server-side solution doesn’t evolve the same way as a desktop or mobile application. A mobile application, for example, will often have fewer features, and will not have strong availability or scalability requirements, as it runs directly on the portable device and only needs to serve one user. So although many people compare mobile/desktop applications to server software, the comparison is, as of today at least, unrealistic. A better comparison would be to treat server software like an operating system. Migrating between versions of the same server software can be compared, in some basic regards, to upgrading an open source operating system. (Some of the problems described below also apply to closed source operating systems, but the analogy is stronger with open source ones.)

Operating system upgrades may range from minor updates that are trivial to install, to very complex procedures that may take a long time and involve significant manual work. This is especially true of open source software, where it is possible to modify the source code and patch or fork a piece of the system to fit a specific need. At a basic level, if we look at software configuration, the format of the configuration storage may change from one version to another, and new parameters may be added or old ones removed. The changes can then either be merged automatically or must be redone manually to upgrade the operating system component.
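To make the configuration problem a little more concrete, here is a minimal Python sketch that compares the defaults shipped with two versions and sorts a site's settings into those that can be carried over automatically and those that need a human. Everything in it is hypothetical: the function `plan_config_merge`, the dictionaries `old_defaults`, `new_defaults` and `site_config`, and the sample keys are invented for illustration and do not describe how any particular operating system or CMS performs its upgrades.

```python
def plan_config_merge(old_defaults, new_defaults, site_config):
    """Classify configuration keys for an upgrade (illustrative only).

    Returns a dict with three sorted lists of keys:
      "auto"    - can be carried over or filled in automatically
      "manual"  - customized by the site AND changed by the new version
      "dropped" - no longer exists in the new version
    """
    plan = {"auto": [], "manual": [], "dropped": []}
    all_keys = set(old_defaults) | set(new_defaults) | set(site_config)
    for key in sorted(all_keys):
        if key not in new_defaults:
            # The new version removed this parameter.
            plan["dropped"].append(key)
        elif key not in old_defaults:
            # Brand-new parameter: the new default can simply be used.
            plan["auto"].append(key)
        else:
            customized = key in site_config and site_config[key] != old_defaults[key]
            default_changed = old_defaults[key] != new_defaults[key]
            if customized and default_changed:
                # Both sides moved: someone has to decide which value wins.
                plan["manual"].append(key)
            else:
                plan["auto"].append(key)
    return plan


# Hypothetical sample data.
old_defaults = {"cache_size": 64, "log_level": "info", "legacy_mode": True}
new_defaults = {"cache_size": 128, "log_level": "info", "cluster_nodes": 1}
site_config = {"cache_size": 256, "log_level": "debug"}

plan = plan_config_merge(old_defaults, new_defaults, site_config)
print(plan)
```

With this sample data, only `cache_size` ends up in the manual list, because the site customized it and the new version also changed its default. Real upgrades are harder than this sketch suggests, since the storage format itself may change and not just the keys.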

In the case of a patched operating system, the upgrade becomes a lot more complex. It will probably require up-porting the modifications and recompiling the new version, provided the modifications are still compatible with the new code. If they are not, the patch will have to be re-engineered to work properly.

You might be wondering why someone would want to patch an open source operating system. Well, maybe they didn’t ask for the patch: they simply had a requirement that they gave to an integrator, and that could apparently only be fulfilled by making this modification. Or maybe a requirement came very late in the development process (or sometimes even after the process had finished) and a “quick patch” was done to reduce development costs. This latter case might not occur all that much in operating systems (except maybe for security patches), but it is much more frequent in web server software, and is the cause of a lot of suffering when upgrade time comes around.

Now that we’ve explored the analogy, let’s get back to CMS software upgrades. Users of a CMS will usually want to choose a solution that fits their needs at a specific time, and will be willing to spend some effort and money to make sure that they get exactly what they need at that point. They will usually not be very concerned about upgrading, unless previous experience has taught them a hard lesson, since it is hard to anticipate future needs while the technology itself is still evolving at a very rapid pace. But they might pay dearly for this oversight, and they usually won’t get much help avoiding the problem from any third party they subcontract to.

CMS integrators are in the business of quickly delivering a solution that fits the RFP demands, and of keeping the project on track in order to make a profit. So they are under heavy pressure to stay on schedule, and might sometimes cut a few corners that they really should not be cutting. Among those cuts are proper analysis and communication of long-term maintenance issues. Maybe some third parties are not sufficiently experienced or, more realistically, do not have the proper resources available at that point in time to counsel a client and recommend a course of action. They may also be as hopeful as the end client that the software provider will be able to solve any issue they may have.

It is of course also the responsibility of the CMS software vendor to make sure that upgrades are as seamless as possible. However, there are a few technological barriers that everyone involved in developing and deploying such a solution must be aware of. Some upgrade problems are very difficult to automate - much in the same way as we have illustrated for open source operating system upgrades. Of course, when talking about CMS upgrades in this article we are not considering minor upgrades, but rather major releases, which usually introduce lots of new features and possibly even major architectural changes.

Most CMSs use templates to build the HTML that will be sent to the browser. Templates are usually a mixture of logic and content, even if the logic is kept very basic in order to avoid tying the two together and creating separation-of-concerns issues. Most templates are written using a scripting language (such as PHP, JSP, Velocity, Perl, etc.) or using tags to perform tasks such as inserting content into HTML markup, looping through lists of content objects, displaying content based on some type of condition, or even performing subsequent calls to business logic or remote systems.

Templates are therefore usually customized for a client installation, and are rarely used as-is out of the box. The CMS software usually provides sample templates, or sometimes even recommended ones, that have been tested for scalability, security and performance, and that are maintained over the long term. But as these templates are often written using powerful scripting languages, it is very difficult to “upgrade” their content. Even templates using tag-based systems might need to be upgraded, and this might prove harder than expected if their syntax and interactions are complex. In the best case an XML parser could do the job, but this is rarely realistic, as the tags are often mixed with hand-written HTML markup that an XML parser might not even be able to process.
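The XML parser limitation is easy to reproduce. The following minimal Python sketch uses the standard library's `xml.etree.ElementTree` on two versions of the same tiny template. The `tpl-content` tag and both template strings are invented for illustration and do not come from any real CMS.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(template_source):
    """Return True if the template source is well-formed XML."""
    try:
        ET.fromstring(template_source)
        return True
    except ET.ParseError:
        return False


# Hypothetical tag-based template written as strict, well-formed XML.
strict_template = '<div><h1><tpl-content name="title"/></h1><br/></div>'

# The same template as it is more likely to be written by hand:
# the <br> is valid HTML but is never closed, so it is not valid XML.
loose_template = '<div><h1><tpl-content name="title"/></h1><br></div>'

print(parses_as_xml(strict_template))
print(parses_as_xml(loose_template))
```

A single unclosed `<br>` is enough to make the second template unreadable for an XML-based migration tool, and real templates usually contain far more than that.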

Automating the upgrade of script-based or tag-based templates is therefore a really complex task and, truth be told, rarely the practice in the CMS world. More often, CMS vendors will work hard to make sure that templates need to be upgraded as little as possible, for example by cutting them up into smaller pieces so that the smaller parts can simply be replaced instead of needing to be transformed. Also, as templates are often tied to the design of the client’s site, an upgrade may be a good opportunity to redesign the site at the same time, which avoids having to up-port the old templates and allows starting fresh with new ones.

If the old templates are still compatible with the new CMS version, the temptation will be great to leave them as is rather than upgrade them, but this is a mistake since they will not benefit from new features, best practices, security fixes or performance updates. So in effect even compatible templates may need to be reviewed and possibly modified when upgrading the CMS version.

As we have explained, templates are a major barrier to seamless upgrading, and their design, customization and implementation must be carefully analyzed in order to make the maintenance process as cost-efficient as possible. Unfortunately, they are not the only pitfall that may be encountered when upgrading.

A lot of integration work may include specific extensions to the CMS’s out-of-the-box functionality. Some of these may be done using a modular system if the platform supports it. Others may require actual forking of the main code, which will definitely involve some merging work when upgrading to a new version.

In the case of modular pieces of code, different technologies may go a long way towards making the migration process smoother. Use of standards such as the Java Content Repository (JCR) API or the JEE framework is usually a guarantee that the code will still be compatible with the more recent version of the platform. Various wiring and hot deployment technologies such as OSGi, the Spring Framework or portlets may ensure that the various pieces of code are less interdependent and may be upgraded incrementally.

Despite this, if a major architectural change is introduced at some point in the CMS’s version history, it will probably require a lot of changes. And with CMS technology evolving to cover completely new deployment topologies such as cloud scalability, it is quite possible that the software vendor will need to modify the architecture to introduce such capabilities. So a migration path will be needed, and it will, in a lot of cases, include some manual work (yes, despite the marketing that says that it is 100% seamless and effortless :)).

Actually I can provide you with a very simple rule of thumb: the more the solution was customized and extended, the more work will be required when upgrading to a new version.

With all the possible barriers to upgrading that we have described, it is no wonder that CMS end users, when faced with the realities of the process, might be reluctant to upgrade to new versions, and sometimes even consider changing vendors, because after all the competitors’ marketing makes their products look so much better than the solution they have now :). But it is very important that they upgrade: staying with the old version only delays the cost and increases it over time, and as the legacy grows they run the risk of hitting end-of-life deadlines and being forced to upgrade at a time that is inconvenient or even impossible for them. In the worst case they might be stuck for a long time with a piece of software written by a company that no longer exists, and that no one has the expertise to maintain properly. Even open source software doesn’t fully protect against that scenario; it only guarantees that someone with the proper skills may still be found, or may develop those skills, to maintain the code. This has happened before in the history of software: finance applications developed in COBOL created a steady need for engineers to stay familiar, or become familiar again, with a language that had been almost forgotten once the generation of engineers who initially wrote the code left the workforce.

Another problem for the end customer is to properly estimate the cost of upgrading, especially if there is a need for major modifications. In a completely closed system that doesn’t allow any customization, the migration path can be handled entirely by the software vendor, but the more open the system is, the more difficulties may arise if the modifications render automated migration impossible. Content migration is usually possible, either through an automated procedure when the data structures are compatible, or through manual input if it is decided to reorganize or replace the content completely. The cost may range from very small to a major template redesign, so it is important to estimate it properly and to understand the various factors involved in order to ensure a smooth upgrade path.

So it is very important that end customers of a CMS be made aware of all the problems described above as early as possible in the process of selecting a solution and drawing up their requirements. As they will usually keep their solution for a few years, they need to understand that the reality of the upgrade process is quite different from the marketing done by CMS vendors, mostly because the two take different points of view. CMS vendors usually talk about upgrading the out-of-the-box software, while clients look at it from the perspective of their fully customized and deployed solution.

Serge Huber


Serge Huber is co-founder and CTO of Jahia. He is a member of the Apache Software Foundation and Chair of the Apache Unomi project management committee as well as a co-chair of the OASIS Customer Data Platform Specification.
