
BBC Online Outage on Wednesday 11th July 2012


Richard Cooper | 10:38 UK time, Thursday, 12 July 2012

Hi, I'm Richard Cooper, the BBC's Controller of Digital Distribution for BBC Future Media.

As some of you will have noticed, we suffered a major failure of BBC Online last night. The site started to fail at 20:10, and by 20:25 was completely down. It stayed down until 21:10, when it started to recover, and by 21:30 the site was back. Some of you may then have experienced problems accessing some pages between 21:55 and 22:10 as we restored full resilience, and from 22:10 onwards we were back to full operation.
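For readers who want the durations spelled out, the times above can be turned into minutes with a short Python sketch. The phase labels and the `minutes_between` helper are illustrative names only, not part of any BBC tooling, and the sketch assumes all the times fall on the same evening.

```python
from datetime import datetime


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day HH:MM times."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


# Phase labels are hypothetical; the times are those quoted in the post.
TIMELINE = {
    "degrading": ("20:10", "20:25"),
    "completely down": ("20:25", "21:10"),
    "recovering": ("21:10", "21:30"),
    "intermittent problems": ("21:55", "22:10"),
}

durations = {
    phase: minutes_between(start, end)
    for phase, (start, end) in TIMELINE.items()
}

for phase, minutes in durations.items():
    print(f"{phase}: {minutes} minutes")
```

On these figures the site was completely down for 45 minutes, and 80 minutes passed between the first failures at 20:10 and the site being back at 21:30.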

The problem was caused by a failure of the traffic managers in both our data centres.

These traffic managers are a critical part of our infrastructure, responsible for handling all requests to the site, and routing those requests to the right servers. They are designed to be highly reliable, and have served us very well to date.

We are still investigating the root cause of this incident, and I would like to apologise for any inconvenience that this outage may have caused. We are working hard to make sure that the causes of the issue are addressed, and that this does not happen again. I will keep you updated on this blog in the coming days.

Richard Cooper is Controller of Digital Distribution, BBC Future Media

Comments

  • Comment number 1.

    Was that a hardware or a software failure of the traffic managers?

  • Comment number 2.

    Your post suggests there are two physical systems so presumably there are separate traffic managers with redundancy in each data centre (or is that not correct)?
    If not why not, and if so, what took both out at the same time - if indeed that is what happened, because one can only assume some kind of attack . . .

  • Comment number 3.

    At one point (~21:30) BBC iPlayer told me I needed to be in the UK (I am in the UK and TV-licence paying), which made me check my broadband connection (which turned out fine).

    This post makes it clearer that it probably was the traffic managers at the BBC data centres then trying to reroute me while they were sputtering back to life. Thanks.

  • Comment number 4.

    "website_outage_june_11_2012.html" or July 11th?

  • Comment number 5.

    @SeeYouOnTheWayDown

    Well spotted. Since changing the web address would break any existing links, I won't change it; but as the blog post says, the outage was on July 11th.

  • Comment number 6.

    Ah yes, 4 and 5. On the BBC News website yesterday - well before the reported troubles (say around 15:00) - there was this June 11/July 11 confusion.

  • Comment number 7.

    As a side note, can I suggest that you change the image on the error page? A picture of the scary clown from the test card against a background of flame does not lead the user to think that all is under control.

  • Comment number 8.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 9.

    Either the system is redundant (in which case the odds of both data centre traffic managers failing are small) or it is not. Which is it? If not, why not? And will the BBC be publicising this as much as the O2 outage? I couldn't see it on the homepage today or last night.

  • Comment number 10.

    Next you'll want us to use snail mail and read real newspapers again!

  • Comment number 11.

    It would be interesting to know more about your setup and why a failed traffic manager led to this incident... I am assuming it was the head manager that died or had an issue, otherwise redundancy would kick in and send you to another DC (yes?). But if the head fails you simply don't get directed, and this is where we get the fallback to dolly girl errors. Am I right in saying that iPlayer was formerly kept separate to prevent this happening? If so, what has changed and why? Was it with the homepage / news updates that you moved to the same DCs?

  • Comment number 12.

    Bit of a wobble again this evening (12 July 2012). Has anybody been watching where the cleaning staff plug in the vacuum cleaner?

  • Comment number 13.

    This comment was removed because the moderators found it broke the house rules.

  • Comment number 14.

    I looked for info about the downtime and found the link from the home page of BBC News (/news/technology-18805912). I read, "The problem was caused by a failure of the traffic managers in both our data centres."

    After recent news about the Border Agency and G4S, my immediate assumption was that the curse of inefficient middle managers had hit the BBC website as well.

    I am a little more aware of the handling of big data than that and immediately corrected the mental image to server context. However, there must be others who clicked on that link to this blog - and even more to the quotation from it on the Tech News page - who would misinterpret that phrase.

    Perhaps you could explain just a little more clearly, "The problem was caused by a failure of the computers which manage the data traffic in both our data centres."



    Once I had got my head straight, I wondered, "What, both of them? At the same time?" After which I assumed that my ignorance was showing. Later I read Jules (post #2) and realised that I was not the only one:

    "Your post suggests there are two physical systems so presumably there are separate traffic managers with redundancy in each data centre (or is that not correct)?
    If not why not, and if so, what took both out at the same time - if indeed that is what happened, because one can only assume some kind of attack . . ."

    So, was the BBC under attack?

  • Comment number 15.

    After this I can no longer play iPlayer on my iPad. It worked perfectly previously; now the picture drops off after a few minutes whilst the audio continues. I first noticed this through the app but, having tried to view via Safari, I also get an error message... please help!

  • Comment number 16.

    @Rachel

    has a place to report problems with BBC iPlayer specifically.

  • Comment number 17.

    I am Mr Vivian Hankey. I have been tweeting SteveH at @SteveH about the trouble I have been having trying to watch video through my feeds on my 64-bit Windows 7 Premium computer running the 64-bit IE9 browser. After 25 June 2012, when the EU cookie law was brought in, my 64-bit computer stopped playing video; before then I could watch video through my news feeds using the same browser. For the moment I have gone back to my 32-bit Vista Premium computer running IE9, and on it I can watch the videos from my BBC news feeds, which I cannot do on my 64-bit Windows 7 IE9 computer. So may I suggest that the part of your browser that you use to generate video has become incompatible with 64-bit computers, since at this moment I can watch on my 32-bit computer all of the video news feeds that I could not watch on my 64-bit computer. So, contrary to what one of your call centre staff said (it was a man), it is not the browser on my 64-bit computer that wants resetting. I think the fault is at your end and lies somewhere in your browser and in the mind of that call centre operator.

  • Comment number 18.

    @Vivian

    You might find the BBC help page for technical faults useful. I think there might have been a misunderstanding - BBC Online does not have a call centre.

    Thanks,

    Ian
