Online Tsunami

In 2009 a company called Lockerz was launched by Kathy Savitt, the sister of Andy Jassy, who was CEO of Amazon Web Services at the time and is now CEO of Amazon itself. The idea was simple: give away lots of stuff to attract users, then keep them around to buy things. She raised a pile of money from blue-chip venture firms and even Amazon, then set out to accomplish her goal.

The plan was to award points (“PTZ”) to users who participated in various activities on the site and then let them redeem those points in a massive giveaway each month, which included everything from Apple iPads to screensavers. But the launch was so successful that it instantly melted the web site! So they knew they had to rewrite the site from scratch to make it scalable enough to handle over 1 million hits per minute (which they saw during the launch).

That’s when they turned to Ken Ruck, President and Co-Founder of Hemisphere Interactive, a company with deep experience in building scalable web sites for media delivery and other demanding applications. He and Glyn Beaumont, the Founder and CTO, put their entire team to work to meet a very pressing deadline. One thing they would need for sure was a way to recreate the loads they saw in the original launch to validate their work.

According to Glyn Beaumont, Founder and CEO of Hemisphere Interactive:

I run a professional services team called ‘Hemisphere’, specializing in very high-volume video delivery and large-scale social media tools. We’re based in Auckland, New Zealand, but our clients are all in the United States. At the end of 2009, we were asked to develop a cloud-based (EC2) application for a client that was expected to receive some incredible load. We were frankly skeptical but shown evidence that suggested we could anticipate millions of hits per minute sustained across a short period.

As it turned out, our go-live began with 21,000,000 hits in the first 20 minutes. We cast about for performance-testing resource so that we could be sure the application we built stood up to the impending tsunami of visitors, and almost everybody said (a) load testing like that had never been done, and (b) load testing like that probably never WOULD be done, because nobody had the infrastructure capable.

Then we met Randy Hayes and the team from CapCal. CapCal runs cloud-based load-testing, and Randy was sure their team could pull it off. We wanted to hit three million concurrent users across a test suite of over three hundred highly-spec’ed EC2 Apaches, plus a dedicated memcache farm and a master/slave MySQL implementation. No problem, said Randy. He wondered aloud if it might be the biggest load test in the history of the internet, and we suspect it probably was.

The largest test we ran generated 1.5 million hits per minute from 500 load agents in the cloud. It even moved the needle on the AWS dashboard that monitors all the datacenters. It was also famous inside AWS because this was the CEO’s sister’s company!
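To put those numbers in perspective, here is a rough back-of-the-envelope sketch of the arithmetic. It is not output from CapCal; the function names are made up for illustration, and it assumes the load was spread evenly across agents, which a real test would only approximate.

```python
def hits_per_minute(total_hits, minutes):
    """Average hit rate over a time window."""
    return total_hits / minutes


def per_agent_rate(hits_per_min, agents):
    """Average load per agent, as (hits per minute, hits per second).

    Assumes an even split across agents (an illustrative simplification).
    """
    per_min = hits_per_min / agents
    return per_min, per_min / 60


# Figures quoted in the article.
launch_rate = hits_per_minute(21_000_000, 20)   # the original go-live
agent_load = per_agent_rate(1_500_000, 500)     # the largest test run

print(f"Launch average: {launch_rate:,.0f} hits/minute")
print(f"Per agent: {agent_load[0]:,.0f} hits/minute "
      f"({agent_load[1]:,.0f} hits/second)")
```

On these figures the original launch averaged a little over a million hits per minute, and each of the 500 agents in the largest test would have been driving about 50 hits per second.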

To achieve those kinds of loads I had to increase my cap with AWS from 100 to 500 instances of the load agent, but I knew the controller could handle at least that many. As we ran a series of ever-increasing loads against the site, the system admin, Drew Zhodrague, monitored the health of the server farm while the developers watched the logs. At over a million hits per minute, the traffic could almost be guaranteed to exhaust the entire million-dollar-plus inventory in less than a minute. According to Drew Zhodrague:

These were LAMP architecture servers running on a Debian instance and load-balanced by a customized version of NGINX. One of the key factors in being able to deliver things as quickly as necessary was the extensive use of memcache. That meant that we could take a firehose approach to database writes and the reads would all be coming out of memory. During our testing, we even discovered and reported a bug in memcache that had to be fixed for us to be able to deal with the high concurrency.
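The pattern Drew describes might look something like the sketch below. This is one interpretation, not the Lockerz code: plain Python dictionaries stand in for memcache and MySQL, and the class name, method names, and key format are all hypothetical. The idea is that every write lands in the cache right away and is queued for the database, so reads never have to wait on the database.

```python
from collections import deque


class FirehoseStore:
    """Illustrative only: serve reads from memory, queue writes for the DB."""

    def __init__(self):
        self.cache = {}             # stand-in for memcache
        self.database = {}          # stand-in for MySQL
        self.write_queue = deque()  # writes waiting to reach the database

    def write(self, key, value):
        # Update the cache first so readers see the new value immediately,
        # then queue the write instead of blocking on the database.
        self.cache[key] = value
        self.write_queue.append((key, value))

    def read(self, key):
        # Normal path: answer from memory.
        if key in self.cache:
            return self.cache[key]
        # Cache miss: fall back to the database and remember the result.
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def flush(self, batch_size=100):
        # Drain up to batch_size queued writes into the database.
        written = 0
        while self.write_queue and written < batch_size:
            key, value = self.write_queue.popleft()
            self.database[key] = value
            written += 1
        return written


store = FirehoseStore()
store.write("ptz:alice", 120)
print(store.read("ptz:alice"))  # served from the cache
print(store.flush())            # number of writes pushed to the database
```

The trade-off in a design like this is that the database briefly lags the cache, which is acceptable only if losing a few queued writes in a crash can be tolerated or handled some other way.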

When the site went live we knew it could withstand the tsunami, and it did, with flying colors. But ultimately, even though the giveaways were hugely successful, the demographic they were aiming for (young people) proved too fickle to keep engaged and buying things without continued giveaways, and the site eventually petered out. Still, it was a great idea and a huge lesson for so many of us about what it takes to build a massively scalable web application and what it takes to test it!
