The Sidekick Cloud Disaster
- 13 Oct 09, 09:48 GMT
A year ago, I visited a giant data centre belonging to Microsoft a hundred miles or so north of Seattle in Washington State.
It was an impressive sight: room after room of servers cooled by electricity which arrived from a nearby hydroelectric scheme, plus two sources of back-up power if the mains connection should somehow fail. All of this was just part of Microsoft's substantial global investment in cloud computing, to ready itself for a future where we'd all keep more and more of our data online in secure locations like this.
But I had a question.
What if, due to some unforeseeable chain of circumstances, the whole place went up in smoke - taking our valuable data with it? Back they came straight away with an answer - "redundancy". No, no, not mass sackings amongst Microsoft employees responsible for data loss, but backup systems.
In other words, every single piece of data stored in the Washington data centre would also be held elsewhere, just in case. It all seemed pretty satisfactory to me - and indeed I've put more and more of my own data into the cloud, from Google Documents, to web-based e-mail, to photo libraries stored on Facebook.
Users of a popular mobile phone on the American T-Mobile network have lost some of their data, and the apparent cause is a server failure.
It's being called the biggest disaster yet for the whole concept of cloud computing. The software and the services for the Sidekick phone are designed by a company called Danger, which helps users store their contacts, photos and all sorts of other personal data in the cloud. But Danger was bought last year by - guess who? - Microsoft, so the software behemoth is now going to cop a lot of the flak for this disaster.
A statement from the companies says:
"T-Mobile and Microsoft/Danger continue to do all we can to recover and return any lost information. Recent efforts indicate the prospects of recovering some lost content may now be possible."
But it also talks of compensating people if they suffer "a significant and permanent loss of personal content" - which sounds pretty ominous.
What's not really clear is what happened to the famed redundancy of Microsoft's cloud operation. Reuters is quoting a statement from the company talking of "a confluence of errors from a server failure that hurt its main and backup databases supporting Sidekick users." But does that mean the backup databases were in the same place as the main ones?
If we're all to entrust our most valuable data to Microsoft's - or anyone else's - cloud, we're going to need to be sure that they tend it as if it were their own. If this kind of reassurance is not forthcoming, then all those forecasts of explosive growth in cloud computing will be, well, redundant.
Comment number 1.
At 13th Oct 2009, Mo McRoberts wrote:The issue isn’t the cloud per se, but the age-old problem of putting all of your important information (even if fairly trivial, it still has personal value and takes time and effort to reconstruct) in one single place and having a synchronisation mechanism which deletes the local copy if it goes away.
You’re no smarter keeping your only copy of your address book and e-mail folders on your laptop than you are in Microsoft’s (or Google’s, or Yahoo!’s, or whoever’s) cloud.
If it’s important, back it up. Quite often, cloud services are being used _as_ the backup for information held securely elsewhere (which makes a fair amount of sense: you consider the cloud version temporarily expendable if necessary).
However, I do think that there’s a responsibility for mobile operators to make it easy and straightforward to customers to back up information from their devices—whether it’s synchronised to a hosted service or not—and to back it up in a device-agnostic fashion (it’s not like we don’t have fairly standard formats for most of the stuff which gets stored on phones, after all).
Comment number 2.
At 13th Oct 2009, Kite wrote:I know the 'cloud' is the future, but I would rather keep my data & backups on my external hard drives. I know where it is and the chance of all 3 hard drives breaking at the same time is pretty remote.
Good blog Rory
Comment number 3.
At 13th Oct 2009, steveellwood wrote:Chris Robinson, referred to at said:
There is no Sass without a rub …
Sass = Software as a SECURE Service
RUB = Relocatable user backup
Unless you can get a full backup in an agnostic format of data (and preferably application too) which you can relocate to somewhere other than your service provider you are not secure. Period. You can get more details of Sass and RUB in Wikipedia or at the Webrecs website webrecs.com.au
Comment number 4.
At 13th Oct 2009, Technicalfault wrote:This is exactly why rolling out brand names like Google Health, Microsoft Doctor or whatever is NOT a universal answer to solving IT in our own NHS.
Comment number 5.
At 13th Oct 2009, badgercourage wrote:I will NEVER entrust anything to so-called cloud computing. It's been a disaster waiting to happen from day one.
But this story does remind me to do my own backups more frequently. And to keep copies of important stuff at a friend's house. And not to use cheap CDs/DVDs for backups. And all the other simple logical steps one can take...
Comment number 6.
At 13th Oct 2009, AviemoreBusiness wrote:In the last year or so we have progressively used 'the cloud' because it is easy, frees up space on the laptop, means that other people can access it and all in all we are delighted with it. However, if you put things that your business couldn't run without up there and have it nowhere else then quite frankly you are silly. There is no one and only totally secure for ever place, well not that I know of anyway. But the shift is towards cloud working and I for one am all for it.
Currently using Google Docs and even though we are a small 2 man team it is just perfect for us to comment on, adjust, delete and basically work side by side on the same document and then give individual customers access as and when required. How else are we going to do that if we don't use the cloud?
Comment number 7.
At 13th Oct 2009, Jedra wrote:There is no real difference between 'the cloud' and any other normal network store - it's just a question of scale and accessibility. At this scale the technology is in its infancy and Microsoft and Google will be balancing the ability to make data 100% recoverable against the cost of doing so.
In the end, every user should treat his/her data responsibly - should it be important then you should not rely on one place to back it up (even if that place is a cloud run by MS or Google). It is your data and is more important to you than anyone else. If you always assume that you may lose it, then you can judge how important it is to you and take the necessary steps to secure it elsewhere as well just in case.
Comment number 8.
At 13th Oct 2009, Mark_MWFC wrote:Rory, although Microsoft must be held accountable for this I would point out that the storage was not at their facility but was subcontracted to Hitachi Data Systems. It was their failure to make a backup (and Microsoft's to check they had) that caused this issue.
The latest update is that T-Mobile (the carrier operating the Sidekick) also believe they will now be able to restore at least some of the data. For those whose data cannot be restored it will offer compensation.
However, let's make no mistake - this is bad and a poor advert for cloud computing whoever offers it. It is sheer incompetence not to back up critical data before attempting a migration or other change and I can only hope the industry learns from this.
Comment number 9.
At 13th Oct 2009, tonyhrx wrote:I have run server farms for a few years and have used a few different suppliers who always assure me there are "backups" for power failures, bandwidth failures and so forth. The trouble is these systems are not really testable. Sure, you can run a test and fix things - but there are lots of things that can go wrong after that test up until when a real failure occurs.
So these failures really always happen no matter what the contract says - so you should always have your own local backups
Comment number 10.
At 13th Oct 2009, Jon wrote:There's one thing I don't understand - why would it be down for a whole week? Even if the user data is unrecoverable, it doesn't take a whole week to get the servers back up and running.
From what I've read, Danger's system was designed with an array of many high-availability redundant servers, so this means that EVERY server would have to have been taken out by whatever it was that happened.
I also find it hard to believe that a company as big as Microsoft could be so lax with backups. They run Hotmail and Microsoft Online Services so it's not like they've never done this kind of thing before.
It just doesn't add up.
Comment number 11.
At 13th Oct 2009, David wrote:I work in business continuity so am pretty paranoid about my own data. All emails, files and pictures on my main computer and my two sons' machines are backed up to an external HD and a smaller memory stick which leaves the house with me. I don't rely on the cloud for anything but I do sometimes forward emails to my Yahoo account so I've got a copy there too.
A few years ago when all those online office suite providers started springing up, I bookmarked a few sites. Recently I was looking back through them - a few obvious ones are still going (Google Docs, Zoho) but quite a few sites no longer exist or the domain now belongs to someone else. Imagine if you had data on one of those and the plug is pulled.
The cloud is not yet to be relied on technologically. The whole privacy & ownership issue is a debate for another day.
Comment number 12.
At 13th Oct 2009, Andrew Downes wrote:Thanks for a good blog post Rory, but you don't highlight the real issue - COST!
No IT system can be 100% guaranteed against failure. Professionals talk about RTO (recovery time objective = the seconds of downtime the design accepts) and RPO (recovery point objective = the maximum period, in seconds, before a failure from which the design accepts that data could be lost). Note the use of the word "objective", not "guarantee" - any complex design can turn out to be flawed.
It's common to turn RTO into "99.x% availability" because it sounds like failure is very unlikely, but that's less specific because it is an average over a time period and might represent more than one individual failure.
Cloud service providers are trying to deliver at a low price. RTO = 0 and RPO = 0 is expensive, that's just a fact. NASA has such objectives for the IT supporting manned space missions, but they can afford it and they definitely wouldn't use cloud services to achieve it!
Cloud providers claim they can offer low costs by exploiting their scale, but that's only part of the story. In fact, all have sacrificed both RTO and RPO. Example: GMail has published 99.9% application uptime (for commercial clients) - that's definitely not RTO=0, but nor does it tell you what their RTO actually is. You can be sure RPO is not zero either.
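As a rough illustration of the availability arithmetic in this comment, here is a minimal Python sketch that turns a "99.x%" figure into the downtime it permits over a year. The function name `allowed_downtime_hours` and the 365-day year are assumptions made for this example; no provider publishes its figures in this form.

```python
def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = 365 * 24) -> float:
    """Return the hours of downtime an availability figure permits over a period.

    The default period is one 365-day year (an assumption for illustration).
    """
    return (1 - availability_percent / 100) * period_hours


if __name__ == "__main__":
    # Compare a few commonly quoted availability levels.
    for pct in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% availability allows about {hours:.2f} hours of downtime per year")
```

On these assumptions a 99.9% figure works out at roughly 8.76 hours a year. It says nothing about whether that arrives as one long outage or many short ones, which is the commenter's point about averages.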
We all need to be more aware of these weaknesses as our data - especially life-critical healthcare data - is increasingly put into IT services that might in turn be hosted in the cloud. There's not yet a substitute for writing something down and putting it in fireproof storage - think of the dead sea scrolls.
See this article and the comments to it:
Comment number 13.
At 13th Oct 2009, RevK wrote:I think all would agree this is a poor ad for the cloud. I guess the Sidekick loss story raises a question of whether MS had integrated that data into the normal MS cloud set-up or if it continued on its own independent cloud server which was not provided with the resilience features that were shown to Rory last year.
I hesitate to use the cloud because I do not yet feel mobile internet is strong enough for me to guarantee access wherever I may go, which means I need a local copy. Perhaps cloud for backup but not the main docs under use.
The story also raises a data security question, not in the conventional sense of some unauthorised individual having access to data but that too much data in one place becomes a target for a rogue state or terrorism. Perhaps they have dealt with the risk of a car bomb etc but I am reminded too much of Tom Clancy's story of NYSE data being lost through a deliberate attack on servers recording the data. Not through internet attack but a disreputable software developer installing an update to all servers.
Comment number 14.
At 13th Oct 2009, ravenmorpheus2k wrote:Good blog Rory. Far more tech related and better than Jobs@Disney...
My main gripe with "cloud computing" is that I don't trust corporations to not sell my data on for a profit.
And also there is the question of how secure it all is when it comes to hacking such backups to obtain personal details.
These are two points that in my opinion people seriously need to consider instead of blindly adopting "cloud computing" as the way forward.
Comment number 15.
At 13th Oct 2009, WildGardener wrote:#10 "I also find it hard to believe that a company as big as Microsoft could be so lax with backups."
Never forget the old joke:
Q: How many MS employees does it take to change a lightbulb?
A: None. MS lightbulbs do not fail.
But it's not fair to pick on MS. There are hundreds of other IT companies, large and small, who do no better.
Comment number 16.
At 13th Oct 2009, contextfree wrote:The Sidekick data was maintained on a completely separate system Microsoft inherited when they bought Danger. The rumor is that Microsoft's mobile management severely neglected the Sidekick in favor of its own preexisting projects and laid off, transferred or alienated much of the Danger staff leaving nobody left who understood how to run Danger's servers, but at any rate, their data wasn't kept on the Microsoft data centers you were shown.
Comment number 17.
At 14th Oct 2009, David wrote:The last comment was very good. Business results (priority) rather than competence was the problem, probably.
And I, too, have experienced many websites that were seemingly here forever then gone forever...
Comment number 18.
At 14th Oct 2009, Marc wrote:The last time I remember something like this happening was back in 2000 when Microsoft lost the contact lists of many MSN Messenger users.
After that, they added the option to export and import contacts, which is still there today. Of course back then we didn't have the buzzword "cloud" (even though many "cloud" services existed) so it wasn't made out to be some catastrophic realignment of the computer industry as seems to be the case these days.
Comment number 19.
At 14th Oct 2009, jamezmosley wrote:What has this got to do with cloud computing? Sounds to me like Danger had a dedicated infrastructure to run an application and that dedicated infrastructure broke. Media hype is putting the "cloud" tag on anything online. Cloud is not perfect but let's not blame it for things it has nothing to do with.
Comment number 20.
At 14th Oct 2009, Radiowonk wrote:Substitute the word "fog" for "cloud" and the whole concept starts to sound less appealing. It is also advisable to remember that the only person who has your best interests at heart is, er, you. Excessive reliance on others is seldom a good idea.
Comment number 21.
At 14th Oct 2009, MechanicsOfTechology wrote:BrianJohnHunt (11:49 Oct 13) is completely correct: cloud computing is simply an extension of the network. It is part of our future for the same reason data center consolidation/outsourcing is so popular: cost containment, regulatory compliance, and operational efficiencies. The fact that it has such a long pedigree (arguably just the latest name for ASP, SaaS, BSP, etc. etc) suggests it is a concept with enough appeal and dynamism to remain a factor in the foreseeable future.
It would not at all surprise me if what contextfree (9:06 Oct 13) commented were true. I have personally seen several examples of technology orphaned by mergers and cost cutting produce results (on a smaller scale) exactly like the Sidekick fiasco.
Other than some allusions I have not read any comments on what first occurred to me: the bogeyman single point of failure that is the backup process itself. All the hardware and hot sites in the world won't help you if the backup "tapes" are bad. If the people whacking resulting from the acquisition isn't the issue, my money is on a corrupt backup replicated to the recovery sites.
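One way to picture the "bad backup" risk raised here is a check that a backup copy is byte-for-byte identical to its source before anyone relies on it. The sketch below is a minimal Python example using SHA-256 digests; the helper names `sha256_of` and `backup_matches` are hypothetical, and nothing here reflects how Danger, Microsoft or any other provider actually verifies backups.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read until an empty bytes object signals end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(source: Path, backup: Path) -> bool:
    """Return True if the backup file has the same SHA-256 digest as the source."""
    return sha256_of(source) == sha256_of(backup)
```

A check like this only confirms that a copy matches its source. If the source was already corrupt when it was copied, every replica will match it perfectly, so a periodic test restore is still needed.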
Comment number 22.
At 14th Oct 2009, RefMinor wrote:This comment was removed because the moderators found it broke the house rules.
Comment number 23.
At 20th Oct 2009, syamdive wrote:The Sidekick has a peculiar & potentially dangerous way of storing personal data: the data is stored in some kind of cache that will be deleted when the phone is off. This in itself is suicidal.
Blaming the cloud is not entirely fair. Any data that is worth any value to a user should be backed up in multiple places (online, local, on paper) and should be portable. As a user you have a right to your data and should be allowed to export your data for safekeeping or use in another service.
This requirement of not wanting to be locked to a single mobile operator or a phone maker led me to use Rseven. It's cross platform and allows me to sync data between my Windows Mobile Samsung i780 to my Nokia E75, both phones that I use on different networks.