NHS and POLICE non-emergency services GO DARK after Voda switch failure

The collapse of Britain's non-emergency telephone numbers for the police (101) and the NHS (111) last week was due to a switch failure at Vodafone, The Register understands. The outage affected a number of Vodafone's customers on 22 November, including First Great Western, Barclays and RAC. A source told El Reg the outage …

  1. Alister

    Auto-failover is all very well, but if you can't trust it (and I've had various vendors' examples fail on me) then you might as well not bother, and just have a manual process instead.

    Frankly even with active/active stuff, I tend to make sure I'm alerted just to check it did actually swap properly...

    Yes I know, I'm paranoid...

    1. Anonymous Coward

      Testing Testing 1 0 1

      If you work on a big enough site your fire alarm will likely be tested regularly.

      If the site is big enough to have a backup generator as well as grid power, the backup will likely be tested from time to time.

      If your critical connectivity is provided by Vodafone, it will not have been properly tested and the first time it is tested it will fail?

      What's up with this picture?

    2. A Non e-mouse Silver badge

      Yes I know, I'm paranoid...

      No, you're not paranoid. You're a sysadmin who's learned over the years not to trust the vendors, and also knows that resiliency, despite all the claims to the contrary, is fragile (usually due to it introducing extra complexity).

      1. Elmer Phud

        "Vodafone would not confirm or deny in a statement whether the cock-up was due to a switch failure, but said it has "identified the issue"."

        So, not admitting to trying to save money and opting for 'Live testing' instead?

  2. Anonymous Coward

    "We do not anticipate any further outages"

    Unusual for a company to publish such an honest and succinct statement of their approach to system resilience.


  3. Anonymous Coward

    At one point at Cable & Wireless in the mid-1990s, their main network fault monitoring centre computer was a wonderful PDP-11. I didn't dare touch it, breathe near it etc. I think they were looking for a spare as it was getting slightly 'iffy'. At least there was no danger of a cyber-attack on the NOC, as all my colleagues at C&W refused to believe me when I told them "the internet is coming soon", so there wasn't any internet connectivity; they just didn't believe in it. (I'm aware that some lucky PDP-11 maintainers will be looking after nuclear plants in the UK until around 2050.)

    The NOC PDP-11, running Westinghouse Brake & Signal Company Ltd software, did mostly detect network problems & re-route everything via Glasgow when the WDM fibres were taken out.

    Oh, and whilst running the C&W network (called 'Mercury' in those days) we often tested the enormous Rolls-Royce back-up gen-sets at strategic locations. They did sometimes fail to come online & work; isn't SNAFU the appropriate acronym?

  4. Spud

    Don't trust your spares

    We had a motto in the RAF not to trust spares and on occasions it was true. Sometimes replacement systems or spares develop faults while not in use and these faults only show up when they are placed into active service.

    I had some ACE modules that had a backplane fault which didn't manifest until they were passing traffic and certain BUS connections were in use. Even if you're testing your backup infrastructure on a weekly basis you could still be exposed for up to 7 days if there is an issue with the standby kit.

    That being said ... this is Vodafone so I don't expect that applies here :)
