A remote access company without automatic fail-over...
... amateurs.
An explosion in Los Angeles last night triggered a power blackout and data centre outage, which led to a knock-on impact for UK customers of LogMeIn today, who were left unable to access remote desktop services, The Register can reveal. The explosion in a Los Angeles high rise hospitalised two people and caused a power blackout …
... amateurs.
Not necessarily. Automatic DR fail-over is a mistake often made by amateurs. It may be viable when your secondary data centre is a few blocks away, where you can have guaranteed redundant links and synchronous replication or mirroring, but that's more a case of High Availability than Disaster Recovery. It's a very bad idea when your DR site is 500+ miles away.
If you've ever experienced a major disaster, like an earthquake or flood, you'll know that it can be hours before you really know what's going on. Having the IT systems start their own recovery while the business continuity staff are rolling out the BC plan to the company can make things much worse. A switchover to a remote site can take a long time when it involves things like fsck, database recovery/restart, DNS updates, etc. You don't want to do it unless you have to.
Indeed, that could be what happened here: an over-eager decision (manual or automatic) to switch to the remote site when, if they had simply waited for the backup power to kick in, they could have had just a five-minute outage like Shania Twain.
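To put some flesh on that: a manual fail-over runbook typically makes a human type the magic word before anything gets repointed. A very rough sketch in Python - every hostname, IP and key path below is invented, and BIND's nsupdate is only there as one example of an authenticated DNS update:

#!/usr/bin/env python3
# Rough sketch of a *manual* fail-over runbook: the script can check the
# primary for you, but a human has to type the magic word before anything
# is repointed. Every hostname, IP and key path here is made up.
import socket
import subprocess
import sys

PRIMARY = ("la-dc.example.com", 443)   # primary site (hypothetical)
SECONDARY_IP = "203.0.113.50"          # warm standby site (hypothetical)
SERVICE_NAME = "remote.example.com"    # record your clients resolve

def primary_is_up(timeout=5):
    """Cheap liveness probe: can we still open a TCP connection to the primary?"""
    try:
        with socket.create_connection(PRIMARY, timeout=timeout):
            return True
    except OSError:
        return False

if primary_is_up():
    sys.exit("Primary answers; do nothing.")

print("Primary unreachable. Failing over means database recovery at the "
      "standby and a DNS repoint - not quick, and not quick to undo.")
if input("Type FAILOVER to proceed, anything else to abort: ").strip() != "FAILOVER":
    sys.exit("Aborted - wait for the generators (or more information) instead.")

# Repoint the service record. nsupdate is used here purely as an example;
# any authenticated dynamic DNS update would do the same job.
commands = (f"update delete {SERVICE_NAME} A\n"
            f"update add {SERVICE_NAME} 60 A {SECONDARY_IP}\n"
            "send\n")
subprocess.run(["nsupdate", "-k", "/etc/dr/tsig.key"],
               input=commands, text=True, check=True)
print("DNS repointed to the DR site; now work through the database recovery checklist.")

The point is the prompt in the middle: the decision to cut over stays with a person who knows whether the generators are about to come back.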
The concert would only need localised power to continue. A data centre may indeed kick back in with its UPS and generators; however, if the explosion also knocked out the various telecoms exchanges between it and the rest of the world, then it doesn't matter much if the data centre is back online.
Wow, either US networks run to very different standards from the rest of the world, or standards are different across the UK, Europe, the Middle East and the Far East, the locations in which I worked. A fire in a single building should never kill whole networks over a wide area; transmission suppliers should have their own backup capabilities.
An earthquake might be, and was, a different story.
"outage in one of our primary data centers"
"We began a roll over to our other global data centre"
Translation: All they had were a few servers in those two DCs? Does "roll over" mean "spin up a VPS and restore from last week's backup"?
Are they still running it like a free service?
It's not always about power. Living in an earthquake zone, I once had to point out in a disaster review meeting that any earthquake that took out our control centre and systems would likely mean all our infrastructure was compromised to the point that we wouldn't be running anything for weeks while inspections were completed, and general telecoms infrastructure would be out as well, so a multi-million alternative out-of-region backup would be of little value in terms of quick recovery in that instance. Sometimes you have to look at the bigger picture.
That said, there were other scenarios that would justify it, just not the 'big one' happening any day soon, though that was the preferred excuse as there were funding and grants available for earthquake resilience at the time.
General telecoms also has backup and redundancy.
At least, it does in the EU - it's a legal requirement of being a telco.
Perhaps that's not true in 3rd world countries.
So if you pay for "last mile" redundancy yourself (separated links to different exchanges), you're covered for most scenarios.
Rob Lee - we use LMI Central and have hundreds of clients on it, but as soon as we couldn't use it we just started using TeamViewer during the downtime. Any customer with a problem just installs the QuickSupport client and away you go.
IT Support companies should have backup too in case of supplier failure.
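Agreed, and the supplier-failure check doesn't have to be elaborate. Something along these lines - a generic probe with placeholder URLs, not anyone's real status API - is enough to tell the helpdesk to start handing out the fallback client:

#!/usr/bin/env python3
# Very rough watchdog for supplier failure: probe the remote-access service
# you depend on, and if it stops answering, tell the helpdesk to start handing
# out the fallback client instead. The URLs here are placeholders, not real APIs.
import time
import urllib.error
import urllib.request

PRIMARY_URL = "https://remote-access.example.com/health"   # stand-in URL
ALERT_WEBHOOK = "https://chat.example.com/hooks/helpdesk"  # stand-in URL
FALLBACK_NOTE = ("Primary remote-access tool looks down - "
                 "switch new support sessions to the fallback client.")

def is_up(url, timeout=10):
    """True if the URL answers with anything other than a server error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except (urllib.error.URLError, OSError):
        return False

def alert(message):
    """Post a plain-text note to the helpdesk chat webhook (placeholder)."""
    req = urllib.request.Request(ALERT_WEBHOOK, data=message.encode("utf-8"),
                                 method="POST")
    try:
        urllib.request.urlopen(req, timeout=10)
    except (urllib.error.URLError, OSError):
        print("Alert webhook unreachable too:", message)

if __name__ == "__main__":
    was_up = True
    while True:
        up = is_up(PRIMARY_URL)
        if was_up and not up:          # only alert on the up-to-down transition
            alert(FALLBACK_NOTE)
        was_up = up
        time.sleep(60)                 # one probe a minute is plenty

Alerting only on the transition keeps the chat channel quiet until it actually matters.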
Lost power completely (i.e. grid supplies (all of them), batteries and generators all failed) and possibly other critical services at one site.
OK, that's into disaster recovery mode. That's something that is practised a few times a year, surely, for such a professional outfit. They've surely switched to (one of) their secondary data centre(s) and resumed service with hardly a blip. Perhaps a 50ms impact. Hardly noticeable.
Oh, sorry, this is the cloud, not telcos, who generally know what they are doing, know what an SLA is, and know what the regulatory impact of failing to meet 5+ 9s availability means.
Oh, you didn't pay for 5 9s availability? Well, shout and scream at the hand, because the face is targeting the profitable part of the client base.
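For anyone who's forgotten what the marketing nines actually buy you, it's just arithmetic:

# What the various "nines" actually allow per year, in plain numbers.
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600

for nines in range(2, 6):
    availability = 1 - 10 ** -nines             # e.g. 0.99999 for five nines
    downtime = MINUTES_PER_YEAR * 10 ** -nines  # allowed minutes down per year
    print(f"{availability:.{nines - 2}%} uptime -> {downtime:8.2f} min down/year")

Five nines allows roughly 5.3 minutes a year; an outage measured in hours blows straight through three nines, never mind five.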
I used to work for a company that made data centre power backups (UPS and diesel gennies combined, 'Active Power', a nice company to work for).
Why they didn't have that, or something similar, reeks somewhat of incompetence.
I'm willing to bet other similar companies in the area DID have something in place and their customers wouldn't have even noticed.