The cloud sync tower of Babel

This post was originally published on AIIM's Expert Blogs by Serge Huber, CTO at Jahia Solutions

********

Cloud document syncing services are all the rage nowadays, but for a regular user, using several of them can quickly turn into a document syncing tower of Babel. As in the old myth, where humans who once shared a single language were divided by the introduction of many, uploading documents to the cloud became an incompatible mess once every service introduced its own synchronisation client. Wouldn’t it be great if you only had to install one client on your machine that would take care of all document synchronisation to cloud hosting services? Is this even a reachable objective?

Like many others, I often need to share documents. Be it Word documents, presentations, photos or even videos, I want to publish them somewhere and make it easy for others to access them, whatever their size. I also want to make sure that I can synchronize these documents between my work and home office computers, without the hassle of copying them back and forth using USB drives. So I tried out Dropbox, and rather liked it, to the point that I actually started becoming almost dependent on it. There are of course also quite a lot of interesting alternatives to Dropbox: Microsoft SkyDrive, Google Drive, Apple iCloud, Box.net and many others.

Most of these cloud hosting services offer the possibility to install a desktop client that will synchronize the contents of a local folder with the content hosted online. By installing this client software on multiple computers, it becomes incredibly easy to keep everything in sync and never have to worry about where you are accessing the documents from, or where you made the last modification.

When Google announced Google Drive, their new alternative to Dropbox, I became curious and also installed their desktop client syncing software. And then it hit me: does this mean that for every service I sign up for, I will need to install a different desktop client? Also, as these clients regularly check the file system for modifications, they all need to be active all the time, eating up memory and CPU time to perform exactly the same tasks. So if I became a power user I would need a Google Drive client, a Microsoft SkyDrive client, a Box.net client, a Dropbox client, an Apple iCloud client, and so on... This is crazy and something needs to be done about it!
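To make the duplicated work concrete, here is a minimal sketch of what "checking the file system for modifications" can look like in its simplest polling form. It is not how any of the clients named above is actually implemented; the function names `snapshot` and `diff` are hypothetical and chosen only for illustration.

```python
import os


def snapshot(root):
    """Map each file path (relative to root) to its (size, mtime in ns)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            info = os.stat(full)
            rel = os.path.relpath(full, root)
            state[rel] = (info.st_size, info.st_mtime_ns)
    return state


def diff(old, new):
    """Compare two snapshots; return sorted (added, removed, modified) paths."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return added, removed, modified
```

A client built this way would call `snapshot` on its synced folder every few seconds and pass the previous and current results to `diff`. Five such clients watching five folders repeat the same directory walk five times, which is the waste described above.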

But what exactly are the solutions to such a problem, especially since every cloud service provider will want to differentiate on features, and will therefore probably require that its client implementation be at least slightly different from the competition’s?

In order to answer that question, I believe we need to look in the direction of open source software and standard definitions. Wouldn’t it be great if there was a standard that made it easy to write document syncing clients? Well, actually there is more than one, and I believe that either WebDAV or CMIS could fit the bill quite well. CMIS is still quite young in terms of client implementations, especially stable and reliable ones, but in theory it should not be impossible to write a sync client using such a technology. WebDAV on the other hand is well established and has numerous client implementations, but it suffers from fragmentation among those products, and some of the most widely distributed implementations have quality issues. Microsoft Windows has long had a “Web Folders” feature that is actually a WebDAV implementation, but it has suffered from so many bugs and incompatibilities that nobody recommends using it anymore. On the other hand there are alternative Windows WebDAV clients that are quite powerful, and some of them even mount network drives, which in effect behaves much like a synchronization client.
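To give an idea of why WebDAV is a plausible foundation, here is a small sketch of the one piece every WebDAV sync client needs: reading the multistatus XML that a server returns for a `PROPFIND` request with `Depth: 1` (as defined in RFC 4918) to learn what is in a remote folder. The sample response, the paths in it and the function name `parse_multistatus` are made up for illustration; a real client would also have to handle authentication, error statuses and server quirks.

```python
import xml.etree.ElementTree as ET

# Namespace used by all standard WebDAV elements.
NS = {"D": "DAV:"}

# Illustrative response only; not captured from a real server.
SAMPLE_MULTISTATUS = """\
<D:multistatus xmlns:D="DAV:">
  <D:response>
    <D:href>/docs/</D:href>
    <D:propstat>
      <D:prop>
        <D:resourcetype><D:collection/></D:resourcetype>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
  <D:response>
    <D:href>/docs/report.doc</D:href>
    <D:propstat>
      <D:prop>
        <D:resourcetype/>
        <D:getetag>"abc123"</D:getetag>
        <D:getcontentlength>2048</D:getcontentlength>
      </D:prop>
      <D:status>HTTP/1.1 200 OK</D:status>
    </D:propstat>
  </D:response>
</D:multistatus>
"""


def parse_multistatus(xml_text):
    """Return one dict per resource: href, is_collection, etag, size."""
    entries = []
    root = ET.fromstring(xml_text)
    for response in root.findall("D:response", NS):
        prop = response.find("D:propstat/D:prop", NS)
        if prop is None:
            continue  # no property block for this resource, skip it
        size = prop.findtext("D:getcontentlength", default=None, namespaces=NS)
        entries.append({
            "href": response.findtext("D:href", namespaces=NS),
            "is_collection": prop.find("D:resourcetype/D:collection", NS) is not None,
            "etag": prop.findtext("D:getetag", default=None, namespaces=NS),
            "size": int(size) if size is not None else None,
        })
    return entries
```

Comparing the `etag` of each entry with the value remembered from the previous run is one simple way for a client to decide which remote files have changed and need to be downloaded again.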

One of the most interesting projects, although still quite young, is Syncany. Originally started by a student, this open source project is a portable desktop client capable of using different cloud hosting providers as storage destinations. It works with cloud services such as Amazon S3 and Box.net, as well as with standards such as FTP, IMAP (!), WebDAV and Windows Sharing, and it uses encryption to make sure that files sent remotely are not easily accessible by others. As it is open source, it would of course be possible to add features or additional remote providers, and I believe this is what makes this project really interesting. So the pipe dream of using only one desktop client for synchronization to multiple providers is not so unrealistic after all. It just needs help getting started.

You might be wondering what the incentive would be for companies such as Dropbox, Google or even Microsoft to integrate, or even help, such projects. Well, pretty much the same one that leads competing companies to use the same open source frameworks: you can cut development costs and benefit from a widely used code base. One of the best examples of commercial companies competing on products but using the same code base is the WebKit project. This open source web rendering framework is used by Apple’s Safari browser (on Mac OS X, iOS and even Windows), Google’s Chrome browser, Android’s native browser and many other browser implementations. This project was originally born out of another open source project: KHTML, the rendering engine of Konqueror, the Linux KDE project’s browser. Apple picked it up and decided to leverage it to build a powerful and standards compliant browser, which they did very well, while keeping it open source and contributing back their improvements. What resulted is a highly portable, high quality and standards compliant code base that many can re-use and embed.

It is my hope that the same could happen in the cloud hosting world. For example, Google could help sponsor or contribute to the Syncany project, and then use it as a base for their software, or start another open source project. The only drawback I see with using the Syncany code base as a starting point is that it is written in Java, and this might not be a good fit for building a lightweight desktop client.

Apart from the large hosting providers, other actors in the document management world might be interested in such a project, such as the content management or document management product vendors. Their products are usually server-based and use the browser as a client, so these vendors are not used to developing client applications that run directly on the end-user’s desktop. It would therefore be very interesting for them to have an open source project for which they only need to write a connector to their product in order to get a full-featured and stable client to offer as a syncing solution. These actors could of course also contribute back to such a project.
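The "write only the connector" idea can be sketched as follows. This is a hypothetical design and not Syncany's actual plugin API: the names `StorageConnector`, `InMemoryConnector` and `push_changes` are invented for this example, and the in-memory connector merely stands in for a real vendor backend.

```python
import hashlib
from abc import ABC, abstractmethod


class StorageConnector(ABC):
    """What a vendor would implement; the shared client does the rest."""

    @abstractmethod
    def list_files(self):
        """Return a dict mapping file name to a content hash."""

    @abstractmethod
    def upload(self, name, data):
        """Store the bytes in data under name."""

    @abstractmethod
    def download(self, name):
        """Return the bytes stored under name."""


class InMemoryConnector(StorageConnector):
    """Stand-in backend that keeps files in a dict (illustration only)."""

    def __init__(self):
        self._files = {}

    def list_files(self):
        return {n: hashlib.sha256(d).hexdigest() for n, d in self._files.items()}

    def upload(self, name, data):
        self._files[name] = data

    def download(self, name):
        return self._files[name]


def push_changes(local_files, connector):
    """Upload local files that are missing or different on the remote side.

    local_files maps file names to bytes. Returns the sorted list of
    names that were uploaded.
    """
    remote = connector.list_files()
    uploaded = []
    for name, data in sorted(local_files.items()):
        if remote.get(name) != hashlib.sha256(data).hexdigest():
            connector.upload(name, data)
            uploaded.append(name)
    return uploaded
```

The point of the split is that `push_changes` knows nothing about any particular service. Under this assumption, supporting a new document management product means writing one more subclass of `StorageConnector`, while change detection, scheduling and the user interface stay in the shared client.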

On the enterprise side, standardizing on a single client application such as Syncany would also ensure more control over what is being deployed, and might offer the opportunity to add enterprise features such as access control, deployment scenarios, or even monitoring. Using an open source product would make it easy to customize the solution to whatever needs a specific company might have for such technologies.

Given all the reasons and motivations above, it is almost surprising to me that this hasn’t already happened, and it is my hope that enough momentum will build soon for this to finally become a reality.

Serge Huber

Serge Huber is co-founder and CTO of Jahia. He is a member of the Apache Software Foundation and Chair of the Apache Unomi project management committee as well as a co-chair of the OASIS Customer Data Platform Specification.
