Advocates of Web 2.0 suggest that we can access nearly all of the services we need from web suppliers. We can edit our documents, store our photos or company data, and run our applications. It sounds great - but what happens when the web is unavailable? Over the last few years I have travelled quite a bit and I've often found myself in places with no Wi-Fi connectivity - or at least none at a price I'm willing to pay. So I value having a copy of my data on my laptop, so that I can carry on working.
I've put forward this argument at a couple of events recently. At an excellent session on Web 2.0 and science at the UK e-Science All Hands Meeting, the response was that 3G coverage will soon be sufficient to give us access almost everywhere. The next generation will take it for granted, the way they take GSM talk coverage for granted already. I have to admit that this scenario seems quite likely, although of course there are still places that don't even have talk coverage.
Nevertheless, there are still problems. Cloud services are far from 100% reliable, at least as yet. The word from companies using cloud computing for their business is that we should expect failure and deploy applications on multiple providers. I believe we should do the same with our data. In addition to guarding against technical failures, it would protect us from vendors who go out of business or close down a service. It would also prevent vendors from taking advantage of "lock-in" to increase their prices.
So, we need systems that can replicate data from one data store to another. Fortunately, we know how to do this, whether via Grid or via P2P technologies. Unfortunately, we seem no nearer achieving standards for interoperability, so we will need to build systems that interface to the variety of proprietary systems out there.
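To make the idea of interfacing to a variety of proprietary systems a little more concrete, here is a minimal sketch in Python. It assumes each provider can be wrapped in a small adapter offering the same three operations. The names `DataStore`, `InMemoryStore` and `replicate` are purely illustrative and do not correspond to any real vendor's API; the in-memory class simply stands in for a real backend.

```python
from typing import Dict, Iterable, Protocol


class DataStore(Protocol):
    """Hypothetical minimal interface that each provider adapter would implement."""

    def keys(self) -> Iterable[str]: ...
    def get(self, key: str) -> bytes: ...
    def put(self, key: str, data: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for a real provider adapter; keeps objects in a dict."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def keys(self) -> Iterable[str]:
        return list(self._objects)

    def get(self, key: str) -> bytes:
        return self._objects[key]

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data


def replicate(source: DataStore, target: DataStore) -> int:
    """One-way replication: copy objects that are missing or different in target.

    Returns the number of objects written. A real adapter would probably
    compare checksums or timestamps instead of fetching whole objects.
    """
    existing = set(target.keys())
    copied = 0
    for key in source.keys():
        data = source.get(key)
        if key not in existing or target.get(key) != data:
            target.put(key, data)
            copied += 1
    return copied
```

Because `replicate` only relies on the three shared operations, a copy of the data could be pushed to a second provider, or pulled down to a laptop, without the code knowing which proprietary system sits behind each adapter.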
Ideally, the data should be self-describing, so that two copies can be synchronised by a different application from the one that actually created the copies. I'm put in mind of the apparently simple problem of syncing my calendar between my PDA and my PC. When I migrated my PC calendar to a new application, the next synchronisation created two copies of each event. The iCalendar format does define a UID property that tags each event so that multiple copies can be easily reconciled, but it seems that the applications involved did not preserve or match on it. Let's make stable unique identifiers a ground rule for storing data in the cloud.
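As a rough illustration of that ground rule, the sketch below reconciles copies of a calendar by identifier. It assumes every event is given an identifier once, at creation, and that every application preserves it. The `Event` class, its fields, and the "latest modification wins" policy are assumptions made for the example, not a description of how any particular calendar application behaves.

```python
import uuid
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Event:
    uid: str                 # assigned once at creation, never regenerated
    summary: str
    last_modified: datetime


def new_event(summary: str, when: datetime) -> Event:
    """Create an event with a freshly generated random UUID as its identifier."""
    return Event(uid=str(uuid.uuid4()), summary=summary, last_modified=when)


def reconcile(*copies: Iterable[Event]) -> List[Event]:
    """Merge several copies of a calendar into one.

    Events with the same uid are treated as the same event; the most
    recently modified version is kept (an assumed conflict policy).
    """
    merged: Dict[str, Event] = {}
    for copy in copies:
        for event in copy:
            current = merged.get(event.uid)
            if current is None or event.last_modified > current.last_modified:
                merged[event.uid] = event
    return list(merged.values())
```

With identifiers in place, a synchronisation after migrating to a new application would see each event on both sides as the same item, rather than adding a second copy, and any tool that understands the `uid` field could do the merge.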
I'll leave the last word to a panellist at the Cloud Computing event in Newcastle. When I explained that I wanted my data on my laptop so that I could work on the plane, he suggested that perhaps I'd be better using the time to read a good book.