Reinventing standards

This post was originally published on AIIM's Expert Blogs by Serge Huber, CTO at Jahia Solutions


A little while ago, I wrote a blog post called "JCR is not dead, neither is CMIS", in reaction to a "JCR is dead" article that was making some noise at the time. Recently my article received some interesting feedback from a user who calls himself "Roy". Given the clear expertise shown in the reply, I'm assuming it came from Roy Fielding (the father of REST), but that is only my assumption :)

Anyway, what's interesting in this feedback is that, although the CMIS standard has added a REST-like API, the commenter argued that it wasn't really "REST"-like, since it didn't comply with some of the basic requirements of a REST API, such as the notion that URLs must always address content objects and that the operations on these URLs are expressed either as HTTP methods or as URL parameters. I won't go into many more details here, but if you're interested I strongly suggest you read the comment. One important point was that some existing standards, such as REST or HTTP, were not being properly used in the new specification. This can be problematic because the result might not interoperate well with systems such as web proxies or content delivery networks (CDNs). It is becoming difficult to fully understand the implications of the various existing and new standards, so some breakage or reinvention is to be expected. This might be the result of miscommunication or lack of knowledge of prior work, but whatever the reason, some reinventing might be unavoidable.
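To make the commenter's point a bit more concrete, here is a minimal sketch in Python of what "URLs address content objects, operations are HTTP methods or URL parameters" could look like. Everything in it is an assumption made for illustration: the `handle` function, the `STORE` dictionary, the `/documents/...` paths and the `fields` parameter are hypothetical and are not taken from CMIS or from the comment.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical in-memory content store: each URL path addresses one content object.
STORE = {"/documents/42": {"title": "Annual report", "status": "draft"}}


def handle(method, url, body=None):
    """Dispatch on the HTTP method; the URL only identifies the object."""
    parts = urlsplit(url)
    path = parts.path
    params = parse_qs(parts.query)

    if method == "GET":
        if path not in STORE:
            return 404, None
        doc = STORE[path]
        # Hypothetical "fields" URL parameter: narrows the representation,
        # but the URL still addresses the same content object.
        if "fields" in params:
            wanted = params["fields"][0].split(",")
            doc = {k: v for k, v in doc.items() if k in wanted}
        return 200, doc

    if method == "PUT":
        created = path not in STORE
        STORE[path] = body
        return (201 if created else 200), body

    if method == "DELETE":
        if STORE.pop(path, None) is None:
            return 404, None
        return 204, None

    # Any other method is not supported by this sketch.
    return 405, None
```

Because the operation is carried by the HTTP method rather than hidden in the URL or the request body, an intermediary such as a proxy or a CDN can tell that a `GET` is safe to cache while a `PUT` or `DELETE` is not, which is the interoperability concern raised above.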

I am also currently part of a working group on the Web Experience Management Interoperability (aka WEMI) standard at OASIS. The aim of this standard is to define a way for content systems to interoperate and exchange content and possibly even context. I'll probably write a future blog post on this standard, so please stay tuned if you're interested in learning more about this upcoming specification. Even while working on this new standard, it has quickly become clear to me that it is almost impossible for one person to be aware of and integrate all the past work, and that a strong team and especially lots of public reviews are necessary to avoid reinvention or, worse, duplication. Fortunately we seem to have both in this working group, so this should minimize the problems, and I’m hopeful that we will come up with something interesting and useful.

These days, the information technology field is so wide and so complex that it is quite difficult to invent new standards without overlapping at least a little with previous work. So it is always important to have new standards reviewed as widely as possible, in the hope that many eyeballs will bring the input needed to avoid reinventing the wheel. Although this seems quite logical, reinvention is often difficult to avoid: computing standards are now getting older, and people who master the whole history of technology standards are becoming more and more difficult to find.

So can we avoid reinventing existing standards? Probably not, but we might be able to re-invent them a little better, and at the same time improve the adoption of some technologies. This is one of my main interests in developing new standards: to make the standard as easy as possible to understand and implement, so that it will be easy to integrate and use in as many products as possible. This might not always be possible to achieve, but it is an important goal to keep in mind while working on any project, since it will always benefit the resulting specification.

If we look at global file format trends for example, we can see that we mostly went from rather complex and obscure to mostly simple and straightforward. In the 80s (and even before), because of size constraints, formats were mostly binary, compact and undocumented and could quickly become quite complex. Some of these are (unfortunately) still in use and are quite complex to process correctly, as their complexity has grown over the years: yes, I'm talking about the Microsoft Office legacy formats :) I worked for some time with the Apache POI project, which implements parsers for these old formats, and I couldn't believe how complex and undocumented they were. I think it’s possible that even Microsoft now relies on open source projects to parse the older formats, as it is quite probable that they've lost or didn't port the old code. Anyway, since then, new specifications have brought us formats such as XML and more recently JSON, which are much simpler to process and make data exchange much easier. JSON is a particularly interesting format since it started out as a de-facto standard: the formal specifications (RFC 4627, and later ECMA-404) mostly wrote down what a lot of people had been doing for a long time, re-inventing the same simple idea over and over.
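As a small illustration of that trade-off, the sketch below encodes the same record once as JSON and once as a compact binary blob, using only Python's standard `json` and `struct` modules. The record and the binary layout (`LAYOUT`) are made up for this example and do not correspond to any real file format.

```python
import json
import struct

# Hypothetical record, invented for this example.
record = {"id": 7, "width": 640, "height": 480}

# JSON: the field names travel with the data, so a reader needs no
# out-of-band documentation to make sense of it.
text = json.dumps(record)
from_json = json.loads(text)

# Binary: compact, but the layout lives only in the code (or in someone's head).
# Assumed layout: little-endian uint32 id, uint16 width, uint16 height.
LAYOUT = "<IHH"


def encode_binary(rec):
    """Pack the record into 8 bytes following LAYOUT."""
    return struct.pack(LAYOUT, rec["id"], rec["width"], rec["height"])


def decode_binary(blob):
    """Unpack the bytes; this is only correct if the reader knows LAYOUT."""
    id_, width, height = struct.unpack(LAYOUT, blob)
    return {"id": id_, "width": width, "height": height}


blob = encode_binary(record)
from_binary = decode_binary(blob)
```

The binary form is several times smaller than the JSON text, which is exactly why size-constrained systems favoured it, but a reader who loses the layout string is left with eight opaque bytes.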

So maybe this is it: we will indeed keep re-inventing standards, but as long as they find their user base, this could turn out to be a good thing. After all, standards are also often revised, and in the process they might be simplified (although the opposite is also true, I'm looking at you CMIS :)), giving them a new lease of life they might not have had otherwise. Some of the most successful standards (I'm including de-facto standards here) are usually quite simple in their basic form: HTTP, AJAX, REST, JSON or URL/URI. This doesn't prevent them from being used in complex applications, but as they are easily understood and implemented, it is easier to build complexity on top of them and to make sure that they will not be dropped and re-invented too often.

Serge Huber

Serge Huber is co-founder and CTO of Jahia. He is a member of the Apache Software Foundation and chair of the Apache Unomi Project Management Committee, as well as co-chair of the OASIS Customer Data Platform Specification.