Cloud computing is great, but it isn't a backup system

Written by Phil Rhodes | May 11, 2021 11:00:00 PM

Replay: One of the great things about the modern internet is the way in which we can share and move files around. But if you are dealing with important information, cloud computing is in no way a good backup system.

Sometimes it’s hard to know how excited to be about the thing we’re now calling “the cloud.” In one sense, it’s not a particularly new idea: most of us have been using something we could reasonably call “cloud email” since the 1990s. Where’s your email stored? Doesn’t matter; it’s on a server somewhere. Of course, we could write off much of the modern cloud as “a server somewhere.” What’s changed is that everyone’s internet connections, even wireless internet connections over large areas, have made it possible to do a lot more with the system than just shift tiny fragments of data such as emails.

On some level, anything connected to the internet is part of some sort of cloud

In a lot of the world, and particularly here in the UK, governments and corporations seem to have concluded that there’s a certain maximum amount of money people are willing to pay for an internet connection, and that making greater provision doesn’t necessarily mean more income for a service provider. That’s a shame, because it risks denying a number of possible futures for the internet in general and the cloud – let’s call it wide-area distributed computing – in particular. Even so, the capability available is now enough to mean anyone who encounters a short-notice and temporary need for several terabytes of storage can have it, even if a bit slowly on the average home broadband connection. Hopefully, that’ll be a statement we can come back to and laugh about in a few years.

Cellphone tech - as here with JVC's Connected Cam series - allows us to move data around with unprecedented ease. Where it ends up still matters

One of the most popular and encouraging things about the cloud is that reliability isn’t the user’s problem. At some level, cloud storage still (almost invariably) boils down to some hard disks in a rack somewhere, but we can quite safely assume that those hard disks will be part of some larger reliability arrangements, whether a conventional disk array or, more likely, some sort of vast distributed object storage system, the details of which are hidden from, and irrelevant to, the user. Either way, there will be redundancy built in. Usually, those racks full of servers are kept in deliberately nondescript buildings on industrial estates round the world.

The simplest cloud provision might not, or in fact probably would not, even let a user know where the data is physically located; as we know from Amazon’s example, these places are somewhat security sensitive. Some cloud providers, though, offer the option to specifically keep data in more than one of these locations, perhaps several at once. That creates a really impressive degree of what the information technology industry terms “disaster recovery.” If something genuinely catastrophic happens – fire, flood, meteor strike, zombie apocalypse – then there’s a huge resistance to actually losing any data. Great! Now we don’t need that LTO drive anymore, do we?

At some point it all comes down to this, no matter where you store it

Not so fast, sparky. The things we’ve talked about provide more or less the same sort of data security that a RAID does: security against equipment failure, or, with multi-site storage, against natural or man-made disasters. It’s still one storage system; to recover the data, it must pass through the iris of one system in the form of whatever mechanism the cloud provider offers. A lost password, or a simple mistake such as accidental deletion or an inadvertently closed account, still represents a significant single point of failure.
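To make the distinction concrete, here is a toy sketch in Python. It is not how any real cloud provider works; the `ReplicatedStore` class, the file name and the `offline_copy` dictionary are all invented for illustration. It assumes only that a redundant store mirrors every operation, deletes included, to all of its replicas.

```python
class ReplicatedStore:
    """Toy model (hypothetical) of storage that mirrors every operation."""

    def __init__(self, replica_count):
        # Each replica stands in for a disk, a rack or a whole site.
        self.replicas = [dict() for _ in range(replica_count)]

    def put(self, name, data):
        for replica in self.replicas:
            replica[name] = data

    def delete(self, name):
        # A delete is just another operation, so it is mirrored too.
        for replica in self.replicas:
            replica.pop(name, None)

    def fail_replica(self, index):
        # Simulate losing one disk or one site outright.
        self.replicas[index] = {}

    def get(self, name):
        # Any surviving replica can serve the data.
        for replica in self.replicas:
            if name in replica:
                return replica[name]
        return None


cloud = ReplicatedStore(replica_count=3)
cloud.put("rushes.mov", b"footage")

# An independent copy that the cloud account cannot touch, such as a tape.
offline_copy = {"rushes.mov": b"footage"}

# Redundancy handles equipment failure.
cloud.fail_replica(0)
survives_hardware_failure = cloud.get("rushes.mov") is not None

# Redundancy does not handle a mistake made through the front door.
cloud.delete("rushes.mov")
survives_deletion_in_cloud = cloud.get("rushes.mov") is not None
survives_deletion_offline = "rushes.mov" in offline_copy
```

In this model the file outlives a failed replica, but one accidental delete removes it from every replica at once. Only the separate copy is still there afterwards.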

Yes, cloud providers probably have backup systems of their own; whether that’s automatically mirroring data to two sites or making a backup to magnetic tape, it’s likely to be invisible to users. The only time a user might become aware of this is if a datacentre is burned to the ground in an unanticipated brush fire and an apologetic email turns up warning that the data will be unavailable for a week and might then be restored only as it existed some time in the past. That sort of backup is for the benefit of the cloud services provider, not the user. Begging for access to that sort of backup after having made a crucial mistake relies on a provider feeling charitable, if that sort of service wasn’t specifically mentioned as part of the package to begin with.

The Cloud basically boils down to this. Lots of this

Certainly, some cloud services providers do offer that sort of backup, whether as a sort of rollback operation – “undo,” as it were – or to different sites, or to tape, as a feature. There will of course be a fee, perhaps to provide the feature at all, and perhaps to make use of it. In that case, it might be reasonable to consider cloud storage a very robust backup mechanism, but potential cloud clients would be well-advised to ask specific questions about exactly what’s available and for how long, and how fast the backups can be accessed.

Sure, put it on the cloud. But also put it here

As we’ve said before, RAID is not backup. Now that cloud storage is becoming less of an esoteric approach, it’s necessary to be clear that cloud storage isn’t backup either. Keeping data on a RAID and on cloud storage is one layer of backup, and cloud platforms certainly offer useful reliability features, but the old adage remains true.

If you don’t have at least three copies, you don’t have it, and the cloud doesn’t change that.
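For anyone who wants to check that rule rather than trust it, a minimal Python sketch follows. The function names and the idea of comparing SHA-256 hashes are assumptions made for illustration, not something any provider prescribes. It also cannot tell whether the copies live on genuinely separate systems, which is the part that matters most.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Hash a file in chunks so large media files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def count_good_copies(original, copies):
    """Return how many files in `copies` exist and match `original` exactly."""
    expected = sha256_of(original)
    good = 0
    for candidate in copies:
        candidate = Path(candidate)
        if candidate.is_file() and sha256_of(candidate) == expected:
            good += 1
    return good


def has_three_copies(original, copies):
    # The original counts as one copy; the rest must be verified.
    return 1 + count_good_copies(original, copies) >= 3
```

A call such as `has_three_copies("rushes.mov", ["/mnt/raid/rushes.mov", "/mnt/cloud/rushes.mov"])`, with those paths standing in for wherever the copies actually live, returns `True` only if both extra copies exist and are byte-for-byte identical to the original.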

Header image: Shutterstock
