Alexa, hypnotize me
It was Friday night after a long and stressful week at work. I came home around 5:30 PM, had a light dinner and took a nap. My daughter called me at 7:08 PM to remind me to pick up her dog on Saturday morning. As I finished the call I looked at my phone … something very weird was going on.
The Amazon Alexa skill that I had created with Barry Thain was suddenly having issues. The error rate was going through the roof. The skill had been running without problems for over a year and I hadn’t made any changes recently. “What the heck is going on?” I was thinking. I tried to start the skill on my Echo device and got an immediate error.
I opened my laptop and logged into the AWS console. I had written the skill in Python and was hosting it on AWS Lambda. Monitoring showed an exponentially growing Lambda invocation count, and invocation duration was peaking at 30 seconds, which was the pre-configured Lambda timeout.
I looked at the CloudWatch logs and saw a huge spike in errors in my Python code, which had been running reliably for over a year. As I went through the logs I saw a rapidly increasing number of errors in the section that stores the state of the user interaction in AWS DynamoDB. Thankfully there was a clear error message:
“ClientError: An error occurred (ProvisionedThroughputExceededException) when calling the UpdateItem operation (reached max retries: 9): The level of configured provisioned throughput for the table was exceeded. Consider increasing your provisioning level with the UpdateTable API.”
A quick look at DynamoDB monitoring explained the problem. I had provisioned only the minimum write capacity, based on the usage patterns I had seen over the past 12 months. I increased the capacity for read and write operations and also enabled autoscaling.
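For readers who would rather script this than click through the console, here is a minimal sketch of how autoscaling for a provisioned DynamoDB table could be set up with boto3 and the Application Auto Scaling API. It is not the exact configuration I used: the helper names, the capacity limits and the 70% target utilization are assumptions for illustration.

```python
def build_autoscaling_requests(table_name, min_capacity=5, max_capacity=500,
                               target_utilization=70.0):
    """Build Application Auto Scaling parameters for a DynamoDB table.

    Returns one (scalable_target, scaling_policy) pair for reads and one for
    writes. The default limits and target utilization are illustrative only.
    """
    requests = []
    for kind in ("Read", "Write"):
        common = {
            "ServiceNamespace": "dynamodb",
            "ResourceId": f"table/{table_name}",
            "ScalableDimension": f"dynamodb:table:{kind}CapacityUnits",
        }
        # Capacity range within which autoscaling may move the table.
        target = {**common, "MinCapacity": min_capacity, "MaxCapacity": max_capacity}
        # Target tracking keeps consumed capacity near the chosen utilization.
        policy = {
            **common,
            "PolicyName": f"{table_name}-{kind.lower()}-target-tracking",
            "PolicyType": "TargetTrackingScaling",
            "TargetTrackingScalingPolicyConfiguration": {
                "TargetValue": target_utilization,
                "PredefinedMetricSpecification": {
                    "PredefinedMetricType": f"DynamoDB{kind}CapacityUtilization"
                },
            },
        }
        requests.append((target, policy))
    return requests


def apply_autoscaling(table_name, **kwargs):
    """Send the requests to AWS (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("application-autoscaling")
    for target, policy in build_autoscaling_requests(table_name, **kwargs):
        client.register_scalable_target(**target)
        client.put_scaling_policy(**policy)
```

Calling `apply_autoscaling("my-skill-state", max_capacity=200)` with a hypothetical table name would register both the read and the write dimension. Keeping the parameter building separate from the AWS calls makes it easy to inspect the requests before anything is sent.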
As I was working on this problem, the usage count in the Lambda service was still growing at an exponential rate. I checked the Amazon developer console, and the total number of unique customers was growing very quickly. During the last 12 months the daily unique customers had varied from 66 to 250 users. Now I was seeing a peak of 1,258 customers in a 15-minute period.
“What the heck is going on?” I was thinking. “What is driving this growth?”
I checked my emails and then it dawned on me.
Check bullet #13 in this “What’s new with Alexa” marketing email.
I said the phrase “Alexa, hypnotize me” and sure enough, my ‘hypno therapist’ skill was launched on the Amazon Echo device. Amazon had created a shortcut to my skill and published it in the weekly email that goes out every Friday.
This brought back memories of the Slashdot Effect: when a website gets linked on the front page of a popular site, the massive influx of web traffic overloads the smaller site, causing it to slow down or even become temporarily unavailable.
This little episode demonstrates how massive a scale the Amazon Alexa ecosystem has already reached. Over 40,000 people had read the 13th bullet point in this marketing email and tried it on their Amazon Echo devices. This drove a usage growth rate of over 15,000% in a few short hours.
So what are the lessons learned for software developers working on Alexa skills?
If you are hosting your service in the AWS cloud, you should really utilize its autoscaling functionality.
Make sure your service monitoring is able to catch this kind of event.
Be ready: when the Amazon Alexa team promotes your skill, you can expect huge usage growth within a few short hours.
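As one way to act on the monitoring lesson, here is a small sketch of a CloudWatch alarm on the Lambda `Errors` metric, again using boto3. The helper names, the threshold of 5 errors per 5 minutes and the optional SNS topic are assumptions for illustration, so tune them to your own traffic.

```python
def build_error_alarm(function_name, threshold=5, period_seconds=300,
                      sns_topic_arn=None):
    """Build put_metric_alarm parameters for a Lambda function's error count.

    The default threshold and period are illustrative, not recommendations.
    """
    alarm = {
        "AlarmName": f"{function_name}-errors",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",               # total errors within each period
        "Period": period_seconds,
        "EvaluationPeriods": 1,           # fire after a single bad period
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no invocations means no alarm
    }
    if sns_topic_arn:
        # Notify an SNS topic (email, SMS, ...) when the alarm fires.
        alarm["AlarmActions"] = [sns_topic_arn]
    return alarm


def create_error_alarm(function_name, **kwargs):
    """Create the alarm in CloudWatch (needs boto3 and valid credentials)."""
    import boto3  # imported lazily so the builder above works without AWS access

    boto3.client("cloudwatch").put_metric_alarm(
        **build_error_alarm(function_name, **kwargs)
    )
```

A similar alarm on the `Invocations` or `Duration` metric would catch a traffic spike before errors start to appear.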
PS. The story of how I came up with the idea for this ‘hypno therapist’ Alexa skill is here.