Amazon Stubs Cloud Toe: Again, and Again…

Brian Wood Blog

I wouldn’t be surprised if customers of Amazon Web Services (AWS) are placing a fresh new item at the very top of their New Year’s Resolutions lists:

“Get a new cloud provider — one that’s reliable, stable, dependable, and won’t cause me prematurely gray hair or drive me to drink.”

It’s not asking too much, is it, that the services you need for your business (the services you pay for) are there when you need them?

Fool me once, shame on you. Fool me twice, shame on me.

Fool me four times (Dec ’12, Oct ’12, Jun ’12, Apr ’11), and it’s clear that I must care much more about inertia and/or low prices than about reliability and service quality — hence AWS.

But for all others who DO CARE about reliability and service quality and non-prematurely-gray-hair with highly competitive pricing, there is AIS — as in AIS BusinessCloud1 and AIS ClearCompute.

The two articles below are re-posted from WSJ and NYTimes.

Emphasis in red added by me.

Brian Wood, VP Marketing

————–

Amazon’s Snafu Rattles Customers

By GREG BENSINGER, December 27, 2012

Amazon.com Inc.’s latest technical glitch—interrupting service for Netflix Inc. and others—is causing some companies to rethink their reliance on the Seattle-based company for the bulk of their Web-computing needs.

Millions of Netflix customers from Canada to Brazil were unable to stream video on Christmas Eve after technical issues in Amazon’s servers in Northern Virginia felled service from Dec. 24 through the following morning. Netflix said the outage lasted nearly half a day for some of its users, and stemmed from problems with Amazon’s Web Services unit, or AWS, which manages online operations for many companies.

The Amazon unit said it identified and fixed technical problems at its operations in Northern Virginia. They affected other firms besides Netflix, including Scope, a San Francisco social-media company, and software company Heroku Inc.

Amazon hasn’t offered an explanation for the source of the outage, which didn’t affect its own online operations. A spokeswoman said Amazon would release a full summary of the outage in the coming days.

“Our goal remains to make our operational performance indistinguishable from perfect, and we know we have more work to do,” she said. “Our operational performance has been quite strong over the last seven years and one of the key reasons we’ve grown as quickly as we have.”

Scope Chief Executive Amit Kumar said he had been considering alternatives to AWS after the most recent outage.

“This does seem to be happening on a fairly consistent basis,” said Mr. Kumar, whose engineers devised an AWS workaround on Christmas Eve. “I’d like to spread our risk out to prevent this from happening again and I am looking into what options we have.”

The online retailer, which offers a video-streaming service that competes against Netflix, has placed an intense focus on the Web-services unit lately. Last month it hosted a three-day conference in Las Vegas to promote AWS that featured a talk by Jeff Bezos, Amazon’s chief executive, as well as parties and developer events.

The division has been growing rapidly as more companies seek to avoid the expense of operating their own servers. Baird Equity Research estimates AWS will account for $1.5 billion in revenue this year, about triple the result in 2010, and will reach $3 billion within two years. Amazon doesn’t break out revenue for AWS and won’t say if it is profitable.

AWS outages earlier this year brought down other prominent websites for several hours, including Pinterest and Foursquare, as well as Netflix. Others weren’t affected, including Amazon’s own streaming service.

One reason is that some companies, including Netflix, use AWS servers that are dependent on traffic routers in only one location in the Americas. A failure disrupting one of those locations can bring down all of a customer’s operations, so some companies make use of facilities in different states. As a safeguard, AWS also operates multiple data centers at individual sites to help provide redundant capability, but it said that services designed to funnel traffic among them temporarily failed in Northern Virginia.

During the Christmas Eve Netflix outage, Twitter and Facebook were abuzz with speculation about why Amazon’s competing Prime streaming movie service was still functioning. Amazon said it wasn’t using the failed service at the time for its own streaming service, which was unaffected.

Rob Bernshteyn, chief of software company Coupa, said he uses two AWS operations to help prevent major outages.

“It’s a little bit more expensive, but it’s worth it,” said Mr. Bernshteyn, whose firm has used AWS since 2007. “We have to spread the risk out.”

A spokesman for Netflix, which is based in Los Gatos, Calif., said it was in discussions with Amazon about how to prevent future recurrences. “We are investigating what happened,” he said.

Netflix has said it relies on AWS for 95% of its computation and cloud-storage needs. Chief Executive Reed Hastings spoke at the AWS event in Las Vegas, and stated that Netflix will be almost entirely dependent on Amazon by the end of 2013. “It’s worked out great for us,” said Mr. Hastings at the conference.

But Manish Chandra, the CEO of shopping app Poshmark, said he was hoping for an explanation from Amazon, even though his company wasn’t affected.

“Potentially in the next year or so, we may look at other providers,” said Mr. Chandra, adding that he had been “happy” with his AWS experience to date.

“This is a pretty serious thing to have it go down and there’s no way Amazon could reimburse us for lost sales if it did,” Mr. Chandra said. “I’ve definitely been thinking more about it in the last few days.”

—————–

‘The Cloud’ Challenges Amazon

By BRIAN X. CHEN, December 26, 2012

For some on Christmas Eve, “White Christmas” was a blackout on Netflix.

That’s because problems with Amazon’s cloud computing service, which provides storage and computing power for all kinds of Web sites and services, caused Netflix to go down for much of the day.

In updates on a Web site that reports on the status of its online services, Amazon traced the trouble to Elastic Load Balancing, a part of its service that helps spread heavy traffic among multiple servers to prevent overload. The company gave few details about the problems in its data center in Northern Virginia beyond this.
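For readers unfamiliar with load balancing, here is a minimal sketch of the general idea of spreading requests among multiple servers. It is not how Elastic Load Balancing is implemented; the round-robin strategy, the `RoundRobinBalancer` class and the server names are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical balancer that hands requests to servers in rotation."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        # cycle() repeats the server list endlessly, in order.
        self._servers = cycle(servers)

    def route(self, request_id):
        # The request id is ignored here; a real balancer might also
        # weigh server health or current load.
        return next(self._servers)


# Made-up server names, used only for this illustration.
balancer = RoundRobinBalancer(["server-1", "server-2", "server-3"])
assignments = [balancer.route(i) for i in range(9)]
load = Counter(assignments)
print(load)
```

With nine requests and three servers, each server receives three requests, so no single machine absorbs the whole spike. If the balancing layer itself fails, as Amazon reported here, traffic cannot reach otherwise healthy servers.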

Social networks filled with complaints. Some customers also complained that Amazon’s own streaming service, Amazon Prime, was down. Amazon said it had fixed the problem completely by the afternoon of Christmas Day, and Netflix said it had restored its services to most of the affected consumers by late Christmas Eve. But the episode highlighted how consumers are increasingly using “the cloud.”

As more everyday devices, appliances and even automobiles rely on services connected to the Internet, consumers expect those services to be available at all times. Yet all sorts of disruptions — harsh weather conditions or an apparent overload — can knock a service out for hours.

In October, problems with the same Amazon data center in Virginia took down Reddit, Foursquare and Heroku. The instance was explained on the status Web site as “degraded performance” in some parts of Amazon’s storage service.

In June, a lightning storm hit the Virginia data center, taking Netflix as well as Pinterest, Instagram and other sites off line for hours. That time, too, customers were offered little insight into what had happened.

In April 2011, an Amazon failure took down many smaller sites that had rented cloud storage space from the Internet giant. That time, the companies that were most affected were start-ups that were less likely to pay for so-called redundancies, or backup systems that kick in when a service fails. Netflix was not affected then, and said at the time it was because it had taken advantage of the redundancies that Amazon offers.

Netflix has said that it has built several redundancies into its cloud-based system. For instance, it stores its data across multiple “zones,” so if there is a failure in one zone, it can retry in another. It says it also spends money on more capacity than it needs, so that if there are large spikes in customer activity, the service is less likely to go down.
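The "retry in another zone" idea can be sketched roughly as follows. This is an illustration only, not Netflix's or Amazon's actual code: the zone names, the `ZoneUnavailable` exception and the `demo_fetch` function are hypothetical, and the outage is simulated.

```python
from typing import Callable, Sequence


class ZoneUnavailable(Exception):
    """Raised by a (hypothetical) fetch function when a zone cannot serve a request."""


def fetch_with_failover(zones: Sequence[str], fetch: Callable[[str], str]) -> str:
    """Try each zone in order and return the first successful response."""
    errors = []
    for zone in zones:
        try:
            return fetch(zone)
        except ZoneUnavailable as exc:
            # Remember the failure and move on to the next zone.
            errors.append(f"{zone}: {exc}")
    raise ZoneUnavailable("all zones failed: " + "; ".join(errors))


def demo_fetch(zone: str) -> str:
    # Simulated backend: pretend the first zone is down.
    if zone == "zone-a":
        raise ZoneUnavailable("simulated outage")
    return f"served from {zone}"


result = fetch_with_failover(["zone-a", "zone-b", "zone-c"], demo_fetch)
print(result)
```

The safeguard only helps when the failure is confined to one zone. A fault in a shared component that sits in front of every zone, such as the traffic-balancing layer, would make every attempt fail.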

Joris Evers, a Netflix spokesman, declined to elaborate on why Netflix went down despite these safeguards. He said the company was investigating the cause and would do what it could to prevent the interruption from recurring.

“We are happy that people opening gifts of Netflix or Netflix-capable devices on Christmas morning could watch TV shows and movies and apologize for any inconvenience caused Christmas Eve,” Mr. Evers said.

Tera Randall, an Amazon spokeswoman, said the company has been “heads down” to ensure services are running smoothly and that a full summary of the incident would be published in a few days.

Amazon is one of the biggest players in online services, hosting data storage and computation for hundreds of companies, including Netflix, Instagram and Pinterest. Once a sideline Amazon set up six years ago, the cloud service has since exploded into a business that is expected to bring in about $1 billion to the company this year.

Other companies offer similar services, notably Google, which introduced its competitor in June. Microsoft is also in the business with Windows Azure.

Although the service disruptions may annoy some companies and their customers, it’s unlikely many businesses will end their partnerships with Amazon in light of this latest Netflix failure, said James McQuivey, an analyst for Forrester Research. He added that it was unlikely that a temporary service failure for Netflix was going to cause many to cancel subscriptions.

He said companies can pay extra to Amazon to add safeguards that increase reliability of their online services, but they typically choose to save costs and take the risk of their services going down temporarily. He said that Amazon has been especially popular among businesses because it has been gradually improving its services and lowering its costs.

Businesses, “of course, are going to say, ‘Gee, Amazon, what’s going on?’” Mr. McQuivey said. “But in reality they’re all getting such a great deal. I don’t see them getting that upset about it.”

For consumers, though, it may be a different matter. On Christmas Eve, Merrilee and Alex Barton were watching an episode of “It’s Always Sunny in Philadelphia” when their Netflix feed started to stammer and finally froze, then began to buffer excessively. “It would try to load and get to about 2 to 7 percent of the way through and then just hang there for five minutes,” Mrs. Barton said.

Eventually the two said they gave up and — with nothing else going on in Farmingdale, N.Y. — decided to “nerd it up.” They played a few games of Minecraft, a video game in which the players can build whatever they wish. In the game, all the technology worked.

http://www.nytimes.com/2012/12/27/technology/latest-netflix-disruption-highlights-challenges-of-cloud-computing.html?_r=0