Airbnb's Three Biggest Mistakes
This is the perspective of one lowly engineer who cared too much. For context, I spent 6 years at Airbnb from 2016 to 2022, first as a senior engineer, then as an engineering manager and finally as a staff engineer. So take these opinions with a grain of salt; it's not like I was steering the ship.
Also, I really loved my time at Airbnb. It's a cool company and it's done a lot of amazing things. But this is me giving some tough love: these are the things I was most bummed about during and after my time there.
"Don't fuck up the culture"
Spoiler, we did... If you don't know, this title is a reference to a great blog post of Brian Chesky's from back in 2014. There are some beautiful insights in it and it's one of the primary reasons I was so excited to join Airbnb in the first place. This excerpt was particularly inspiring to me:
The stronger the culture, the less corporate process a company needs. When the culture is strong, you can trust everyone to do the right thing. People can be independent and autonomous. They can be entrepreneurial.
The things that I loved when I joined Airbnb: it was quirky, intimate, entrepreneurial, people cared a lot about our values like "being a host". It was a bunch of 20-somethings that made Airbnb a household name. The culture was incredible and infectious. There was a high quality bar, a commitment to craft that didn't come out of process but came from shared passion. Like Brian said:
Culture is simply a shared way of doing something with passion.
So what went wrong? Rapid growth.
Rapid growth destroys culture. Hiring top performers from other companies inevitably imports those cultures into your company. The more senior the ICs and leadership you're bringing in, the more of their culture you are adopting.
For existing employees, this is incredibly disorienting; it causes significant attrition and damage to the product. Suddenly what was once a shared way of building and pushing the product forward now becomes an endless debate about tech stacks, programming conventions and scrum processes among a thousand other details that required no debate before the onslaught of a thousand new perspectives. All because "this worked great at my last job".
Under rapid growth, you end up with many fractured little cultures. Little fiefdoms of competing cultures and some islands of cultural refugees. No longer do you have a shared passion and vision, but now you have orgs and reorgs, promo culture, and an endless parade of "best practices" evangelists. Ultimately you have no culture.
From the ashes of our eviscerated culture, Airbnb had an identity crisis. We went through an Amazon phase, then a Google phase, and the company now seems to be in an Apple phase. Gotta love those keynotes!
In the wake of this culture vacuum, influential new leaders end up in a land grab, trying to grow out the biggest org with the most influence. The bigger the team, the more you can get your agenda carried out and steamroll others. The bigger the team, the bigger your career. What's the number one question leaders get asked in interviews... "How big was your org?"
Rapid growth significantly damaged Airbnb's culture and its product.
In retrospect, I believe there was a key thing that maybe broke Brian's commitment to culture, at least from the engineering side. To be clear, this is 100% speculation and probably wrong. But during that big hiring push, the promise seemed to be "just give us more engineers and the product will be amazing". In my time there, I'd say it got worse. Eventually, after a few years, this led to Brian taking over, claiming to get rid of product managers, and becoming the single gatekeeper for all product changes. I imagine he was really frustrated and felt lied to. Like "We spent millions? billions? growing this team and the app is worse?!?"
What would I have done?
First, to be totally honest, I probably wouldn't have been hired had they not been growing so fast, so I'm begrudgingly very thankful for this mistake.
But if I were Brian, I would've focused on this:
I thought to myself, how many company CEOs are focused on culture above all else? Is it the metric they measure closest? Is it what they spend most of their hours on each week?
Airbnb should've only grown as fast as it could indoctrinate the new recruits into the Airbnb way. Don't allow people to change shit just "because it worked so well at Google." Hire great engineers with small egos!
Microservices
So Airbnb is growing really fast, hiring lots of smart people from Google, Amazon, Apple, Facebook etc., who all want to do big things, make big changes and continue to grow their careers. Airbnb wants to increase velocity. The conventional wisdom at the time was "do microservices."
Microservices really aren't about scalability, reliability or a better dev experience - they're about scaling your eng team. I'll explain...
Microservices are not about scalability for a lot of reasons that you can research more deeply if you'd like, but an incomplete list might be: n+1 queries, thundering herd, hot partitions, fan-out and data skew. Some of this list is really more specific to distributed databases, but often microservices architectures become duct-taped together databases. Database joins get replaced with composition services (either bespoke or something like GraphQL) that stitch together lots of different data sources. These in-service joins are hotspots for poor performance and inconsistent data states.
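To make the in-service join problem concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Everything in it is hypothetical: the listings and hosts tables, the CountingClient wrapper (where each query stands in for one network round trip) and both view functions are illustrative assumptions, not Airbnb's actual schema or services.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a tiny in-memory database with made-up listings and hosts."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE hosts (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE listings (id INTEGER PRIMARY KEY, title TEXT, host_id INTEGER);
        INSERT INTO hosts VALUES (1, 'Ana'), (2, 'Ben');
        INSERT INTO listings VALUES (1, 'Loft', 1), (2, 'Cabin', 2), (3, 'Studio', 1);
        """
    )
    return conn


class CountingClient:
    """Hypothetical service client: every query() models one network round trip."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        self.calls = 0

    def query(self, sql: str, params: tuple = ()) -> list:
        self.calls += 1
        return self.conn.execute(sql, params).fetchall()


def composed_listing_view(client: CountingClient) -> list:
    """In-service join: one call for the listings, then one call per listing (N+1)."""
    listings = client.query("SELECT id, title, host_id FROM listings ORDER BY id")
    view = []
    for _listing_id, title, host_id in listings:
        (name,) = client.query("SELECT name FROM hosts WHERE id = ?", (host_id,))[0]
        view.append((title, name))
    return view


def joined_listing_view(client: CountingClient) -> list:
    """Database join: the same result in a single call."""
    return client.query(
        "SELECT l.title, h.name FROM listings l "
        "JOIN hosts h ON h.id = l.host_id ORDER BY l.id"
    )
```

Both functions return the same rows, but the composed version needs one call for the list plus one per listing, while the join needs one call in total. The gap grows linearly with the size of the result.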
Often by splitting out services, you're fracturing code, not for performance or simplicity, but so that teams can operate "independently". Only to find out that this independence is very short-lived and now you've replaced hundreds of function calls with network calls.
The real scalability issues exist in your database, not your app code. Thinking hard about sharding, denormalization, query optimization, indexing strategies, connection pooling, read replicas, and caching layers pays far more dividends than breaking your monolith apart.
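As one small illustration of the database-first work listed above, the sketch below uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN to show how adding an index changes a lookup from a full table scan to an index search. The bookings table, its columns and the index name are assumptions made up for this example, and the exact plan wording varies between SQLite versions.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> list:
    """Return the human-readable detail column from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (id INTEGER PRIMARY KEY, guest_id INTEGER, total_cents INTEGER)"
)

# Hypothetical hot query: all bookings for one guest.
LOOKUP = "SELECT id, total_cents FROM bookings WHERE guest_id = ?"

# Without an index on guest_id, SQLite has to scan the whole table.
plan_before = query_plan(conn, LOOKUP, (42,))

# With the index, the planner can search it instead of scanning.
conn.execute("CREATE INDEX idx_bookings_guest ON bookings (guest_id)")
plan_after = query_plan(conn, LOOKUP, (42,))
```

Printing plan_before and plan_after shows a SCAN step turning into a SEARCH step that names idx_bookings_guest. Production databases such as MySQL expose the same idea through their own EXPLAIN output.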
Microservices are not about reliability. Here's an incomplete list of things you now have to think about as you spin up more services: back pressure, split brain, retry storms, clock synchronization and eventual consistency.
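Retry storms are easy to underestimate, so here is a rough sketch of how they amplify. It assumes a hypothetical chain of services where every layer naively retries the layer below it and the bottom dependency is down; the class and function names are invented for illustration.

```python
class AlwaysFailing:
    """Hypothetical downstream dependency that is currently unavailable."""

    def __init__(self) -> None:
        self.calls = 0

    def call(self) -> str:
        self.calls += 1
        raise RuntimeError("downstream unavailable")


class RetryingLayer:
    """Hypothetical service that retries its downstream call with no backoff."""

    def __init__(self, downstream, attempts: int) -> None:
        if attempts < 1:
            raise ValueError("attempts must be at least 1")
        self.downstream = downstream
        self.attempts = attempts

    def call(self) -> str:
        last_error = None
        for _ in range(self.attempts):
            try:
                return self.downstream.call()
            except RuntimeError as err:
                last_error = err
        raise last_error


def leaf_calls_for_one_request(depth: int, attempts: int) -> int:
    """Count how many calls hit the failing leaf for a single user request."""
    leaf = AlwaysFailing()
    service = leaf
    for _ in range(depth):
        service = RetryingLayer(service, attempts)
    try:
        service.call()
    except RuntimeError:
        pass  # the request fails either way; we only care about the load
    return leaf.calls
```

With three layers that each try three times, leaf_calls_for_one_request(3, 3) sends 27 calls to the already struggling dependency for one user request, because the attempts multiply at each layer. Common mitigations such as backoff with jitter or retry budgets are outside this sketch.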
In light of all these distributed system challenges, observability becomes significantly more difficult. In a system with a monolithic process, most errors can be understood in a single stack trace. In microservice architectures, errors can originate from a complex dance between multiple services, making it nearly impossible to debug the full path of a failure without sophisticated distributed tracing tools. These errors often span multiple teams or orgs, making coordination and resolution significantly more complex and time-consuming.
Microservices are not about dev experience. At first glance it might appear they are; spinning up a microservice is sort of a greenfield experience. You get to make good decisions, you get a lot of freedom to design a great piece of software, you can choose modern tech stacks, you can implement clean architectures - if only you were in a vacuum. Putting this in the context of tens or hundreds of other teams also building their services and trying to make a cohesive whole is a nightmare.
Testing is the biggest issue. Testing is the absolute foundation of good software. Tests are fundamental to velocity. You know a piece of software has amazing tests when someone new to the codebase can confidently make a change, ship it and not break a thing. You know software has bad tests when even the architect breaks into a cold sweat before every deploy.
Microservices were a way to scale engineering but they led to a fragmented and brittle stack, turning our coherent monolith into a chaotic distributed monolith held together with duct tape and prayers.
Culturally this led to very perverse incentives. Architecting and building services was seen as the mark of a senior engineer. Want to get promoted? Build a service. I did it. With maybe 2000 engineers we had something close to 1000 services, not to mention the sprawling ocean of MySQL databases and other random data stores.
Building anything became an exercise in convincing other teams to build the features you need in their services so your service could do the thing it needs to do. How well you could collaborate and align teams was a significant factor in promotion, causing more senior engineers to act like PMs rather than builders.
We eventually "shipped our org chart" to our product, creating a complex web of services that mirrored our organizational structure rather than focusing on optimal system design. The product became bad enough that Brian decided he had to become a gatekeeper for all product changes.
To give credit where it is due, towards the end of my time there, infra had made significant progress with monorepos, dev tooling and observability to help with the complexity, but the costs to get there were massive.
I can't help but wonder if...
What would I have done?
What if eng had stuck with the monolith? What if we had focused on database performance: denormalized, dropped crazy joins, created clear modules within the monolith to separate concerns like app, payments and support, and sped up tests and the deploy train? What if eng hadn't quadrupled in size, and had instead dedicated more headcount to dev tooling and testing on the monolith earlier, and rewarded deleting code as much as or more than creating it? Ultimately Airbnb isn't a very complex product, and a polished monolith would have served us far better than our sprawling service ecosystem.
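The "clear modules within the monolith" idea can be sketched roughly as follows. This is only one way to draw such a boundary, using a typing.Protocol as the module's public interface. The Payments, InProcessPayments and Support names are hypothetical and do not describe Airbnb's codebase.

```python
from dataclasses import dataclass, field
from typing import List, Protocol, Tuple


class Payments(Protocol):
    """The only surface other modules may use to reach payments."""

    def refund(self, booking_id: int, amount_cents: int) -> str: ...


@dataclass
class InProcessPayments:
    """Hypothetical payments module living inside the same process."""

    refunds: List[Tuple[int, int]] = field(default_factory=list)

    def refund(self, booking_id: int, amount_cents: int) -> str:
        if amount_cents <= 0:
            raise ValueError("refund amount must be positive")
        self.refunds.append((booking_id, amount_cents))
        return f"refund-{len(self.refunds)}"


class Support:
    """Hypothetical support module; it depends on the Payments interface only."""

    def __init__(self, payments: Payments) -> None:
        self._payments = payments

    def resolve_with_refund(self, booking_id: int, amount_cents: int) -> str:
        # A plain function call: no serialization, no network hop, one stack trace.
        receipt = self._payments.refund(booking_id, amount_cents)
        return f"booking {booking_id} refunded ({receipt})"
```

The boundary is still explicit, so teams can own their module, but crossing it is an in-process call that can be covered by an ordinary unit test, for example Support(InProcessPayments()).resolve_with_refund(7, 2500).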
Outsourced support
This is a tale of two call centers.
First, back in 2016, an Airbnb office with customer support agents who were full-time employees, colocated with engineers and product folks. Airbnb agents were paid decent wages, 70-80k if I recall correctly, with full benefits and real opportunities for career growth. At this time the agent tools were very... sharp. Think Rails templated pages that provided a thin layer over direct database access. These agent tools weren't particularly pretty, but they were easy to build and they got the job done.
Watching these seasoned Airbnb agents work was a sight to behold - armed with sharp tools and direct database access, they had the autonomy to solve problems and issue refunds on the spot. Lightning fast and fully integrated with the team, they genuinely cared about customers and approached each issue with ownership and care.
The other is a few years later, a call center for hire, contracted by Airbnb. I'm not sure how much contractors were paid... but it was definitely much less. Attrition for contractors was through the roof; they'd spend significant time getting trained only to quit shortly after because the job suuuucked.
Contracted support sometimes cared. Most did not. Most call centers simply gamed whatever performance metric we set up to maximize their profits. I do not judge them; it makes complete sense - they have no incentive to do the right thing for a customer, only to meet the metric that leadership believes represents the "right thing."
Since contractors weren't paid as well and weren't a part of the team... well, we couldn't REALLY "trust them." So we made lots and lots... and lots of tools to redact personal data, monitor screen activity, limit database access, track keystrokes, and record calls - building an entire surveillance apparatus just to manage our own agents. But then engineering was constantly behind, unable to build all the perfectly secure redacted tools needed, simply because we couldn't trust the agents we'd hired.
This is me making up a number, but I don't think it's an exaggeration to say that the FTE agents were 3x faster and more effective at solving host and guest issues than the contracted agents. This is not to shit on contracted agents; there were very kind and lovely people that I met. This is to throw shade at Airbnb, thinking they'd cut costs and boost profits by paying less and outsourcing. As my cofounder James Pozdena said:
Airbnb's product was search and customer support, but customer support became a cost and not a core value
I look at outsourcing support as both a significant moral and strategic failure. Moral because I don't believe the treatment of contracted agents aligned with Airbnb's values; they were not treated like "Airfam." It was a strategic failure because it caused and continues to cause significant damage to the brand and the bottom line, while ironically they did it in the interest of cost savings.
Let's do some back of the envelope, completely made up math:
FTE Version
Total Revenue Impact: +$500M
Net Cost: ~$138M annually
Contracted Version
Total Revenue Impact: -$500M
Net Cost: ~$1.16B annually
So in my totally accurate hypothetical numbers, hiring support agents as full-time employees and paying them a living wage to solve problems saves Airbnb a cool billion, not too bad for doing the right thing.
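For anyone checking the "cool billion", the arithmetic is just the difference between the two net cost figures above. The snippet below only restates those admittedly made-up numbers; the constant and function names are illustrative.

```python
# Figures copied from the made-up estimate above (annual, in US dollars).
FTE_NET_COST = 138_000_000
CONTRACTED_NET_COST = 1_160_000_000


def annual_savings(contracted_cost: int, fte_cost: int) -> int:
    """Difference in net annual cost between the two support models."""
    return contracted_cost - fte_cost


savings = annual_savings(CONTRACTED_NET_COST, FTE_NET_COST)
summary = f"${savings / 1e9:.2f}B"  # formatted in billions
```

Here savings comes out to 1,022,000,000, or roughly $1.02B per year under these hypothetical inputs.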
Leadership liked to quote Rahm Emanuel:
You never want a serious crisis to go to waste
On the public stage they did this well; the million-dollar guarantee came out of a huge public failing. It basically said "we really screwed up, we take responsibility for this awful situation, here's how we're going to make real sacrifices to make it better." Every failure is an opportunity to build trust by showing up and owning your mistakes. In the private world of guests' and hosts' little crises, Airbnb is doing much worse. Airbnb Hell, NerdWallet and many others document the decline.
What would I have done?
Keep support in-house, compensate them fairly, and treat them like valuable team members - because they are. Invest heavily in simple, powerful tools that let agents solve problems quickly. Trust them to do the right thing and empower them to actually help customers. The cost may look higher on a spreadsheet, but the long-term value of excellent customer support is worth far more than any short-term savings from outsourcing. Most importantly, recognize that customer support isn't a cost center - it's a core part of the product experience that can build or destroy trust in your brand.
Conclusion
Looking back at these three mistakes - destroying our culture through rapid growth, fragmenting our architecture with microservices, and outsourcing our core support function - there's a common thread. Each represents choosing short-term scalability over long-term sustainability. We sacrificed what made Airbnb special - its culture, its technical simplicity, and its human touch - in pursuit of rapid growth and cost optimization.
The irony is that these "scalable" solutions often created more problems than they solved. Importing corporate cultures destroyed our shared vision. Microservices made our system more complex and brittle. Outsourcing support damaged our relationship with customers while likely costing more in the long run.
Great companies aren't built on PowerPoint strategies and MBA frameworks. They're built on strong cultures, thoughtful technical decisions, and genuine care for customers. Sometimes the most "inefficient" solutions - like maintaining a monolith or paying support staff well - are actually the most effective paths to sustainable growth.
Written with revi.so
Software Engineer
3 months ago: This article is so insightful. Thanks for sharing!
Software Engineer at Fortune Electric | Master of Science in Computer Science and Financial Engineering
3 months ago: Nicely done. The most insightful article I have read recently. This article immediately reminds me of my ex-company, which has completed 87% of what you mention. It is ongoing and, yes, one of the founders jumped back in as the CEO again.
Workforce Management Professional, and Data Enthusiast
3 months ago: There are reasons outside of cost that outsourcing happens. For instance, who covers the phones during an all hands meeting? Who covers them on holidays? During overnight shifts? How do you appropriately staff weekends without massively overstaffing midweek, knowing that FTEs expect to always have at least one weekend day off and to always have their days off together? That doesn't even begin to touch on Tom Getty's point around language capability, or the challenges of Airbnb's very steep seasonality. Forecasting and staffing for customer service is a far more complex exercise than most folks give it credit for, and requires flexibility that just isn't possible when you staff 100% with FTEs.
As a former Airbnb CX Specialist in Portland, I really appreciate this post. I’m not familiar with how Airbnb CX was run after I was let go during the pandemic, but I know I felt the culture slipping away and the impending outsourcing by Airbnb One 2017. It does seem inevitable to lose culture as a company scales, but I sure hope there’s still some of the magic that was there before the Portland office closed. Thanks for the post and all the information.