1. Reed McLay
    RIM Blames Service Outage on App Updates

    * "Between 3:00 and 4:00 PM EST - Problems with BBM and BIS internet browsing reported around the web.

    * "Between 6:30 and 7:00 PM - The problem extended to BES email, preventing the delivery of BES emails to and from BlackBerry smartphones. At each of our customers, BoxTone detected a greater than normal quantity of users with messages pending, based on our learned baseline of what is normal for each server and carrier, and immediately generated a warning alert to our customers before the flood of user calls. BoxTone also placed all affected BES and Carriers in a Critical state on our customers' Operations Dashboards (depicted by the red dots next to each BES and carrier). The steady growth in Pending Messages beginning around 6:45 continued until the issue was resolved early this morning. From our monitoring data, it appears that BES were able to communicate with the RIM NOC throughout the outage; however, the NOC was unable to deliver messages.

    * "At approximately 12:09 AM, BoxTone detected a brief disconnect in the SRP connection of each BES to the NOC; it appears RIM reset the NOC SRP connection to complete their fixes. Following this reset, delivery of BES mail resumed.

    * "By 2:45 AM or earlier, BoxTone detected that most of our customers had returned to their normal (baselined) service levels, and that the backlog of pending mail had been delivered. BoxTone generated notifications informing our users that their service levels had returned to normal and updated the status of the BES and carriers to Normal."
    3:00 PM EST - 2:45 AM. Just shy of 12 hours, middle of the day on the West Coast.

    The first one was of similar length, but not as disruptive. This one impacted BES users.

    100 - (12 / (24 * 365)) * 100 = 99.8630% uptime.
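For anyone who wants to check the arithmetic, here is a rough Python sketch. It uses the start and end times quoted above, and the calendar dates are an assumption inferred from the post timestamp. The function names are made up for illustration, and the year is taken as 365 days, as in the calculation above.

```python
from datetime import datetime

HOURS_PER_YEAR = 24 * 365  # non-leap year, matching the figure used in the post


def outage_hours(start: datetime, end: datetime) -> float:
    """Length of an outage in hours."""
    return (end - start).total_seconds() / 3600


def uptime_percent(downtime_hours: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Share of the period (in percent) during which the service was up."""
    return 100 - (downtime_hours / period_hours) * 100


# Assumed dates: the outage began the afternoon before the post was written.
start = datetime(2009, 12, 22, 15, 0)   # 3:00 PM
end = datetime(2009, 12, 23, 2, 45)     # 2:45 AM

print(outage_hours(start, end))          # 11.75 hours, "just shy of 12"
print(f"{uptime_percent(12):.4f}")       # 99.8630, rounding the outage up to 12 hours
```

Counting the earlier outage as well (roughly another 12 hours, going by "similar length") would mean calling `uptime_percent(24)`, which gives a lower figure than the single-outage number.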

    12-23-09 01:20 PM
  2. Reed McLay
    RIM Sorry For Service Outage; Backup Systems Questioned - WSJ.com

    "RIM continues to monitor its systems to maintain normal service levels and apologizes for any inconvenience to customers," the statement said. A spokeswoman declined to comment further.
    ...

    Ken Dulaney, vice-president of mobile computing at Gartner Inc., said all technology companies have failures from time to time, which is why it's important to have a proper "failover" in place, or a standby system that can be switched to instantly.

    "If this is due to the fault of RIM not having good failover procedures, then I think we can come after them and say, once again, they didn't follow proper process," he told Dow Jones.
    ...

    Maribel Lopez, principal analyst at Lopez Research, said the BlackBerry has the perception of being a "super-reliable email platform," and when the service goes down, it's seen as a very big deal, unlike for other smartphones such as Apple Inc.'s (AAPL) iPhone.

    "People do have higher expectations from RIM. They're considered the gold standard, and when the gold standard goes down, people freak out," she said, adding that it also shows their services are considered "critical."

    ...
    "We're moving into new ground here. If a messenger app can take your service down, once you start getting these really rich robust applications on it, there's going to be a need for a lot of testing, and in some ways you have to wonder, does that slow things down?" she asked.
    ...
    12-23-09 03:57 PM