Decoding Stack Overflow's On-Premises Monolith: A Dive into Infrastructure
Stack Overflow is a popular online forum where programmers ask and answer questions about programming. It is the largest such community, with over 100 million visitors each month. The site was founded in 2008.
Stack Overflow is a multi-tenant application that hosts over 180 Q&A sites, all powered by the same application. Server Fault, Super User, and Ask Ubuntu are a few of the other sites hosted on the same infrastructure.
Looking at the site's usage statistics and performance, you might assume it runs on microservices in the cloud, with Kubernetes or a serverless platform.
In fact, it is a 15-year-old on-premises monolithic application. This single application, hosted on IIS, serves all 180 sites and is spread across nine web servers.
To ensure high availability, Stack Overflow and the other applications are deployed on nine IIS web servers, hosted across nine virtual machines. Because the infrastructure sits in an on-premises data center, it lacks the on-demand resource scaling that cloud providers typically offer. To handle the significant volume of requests, they opted for virtual machines (VMs) with ample memory, 64 GB each. CPU load averages 5%, with a peak of 12%, which leaves abundant unused capacity to absorb surges in demand.
Stack Overflow uses one primary SQL Server along with one standby server, with ample RAM (1.5 TB).
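To put the CPU figures in perspective, here is a rough back-of-the-envelope sketch. It assumes throughput scales linearly with CPU utilization, which is a simplification; real systems usually hit other limits (memory, I/O, the database) first. The function name `headroom_factor` is illustrative only.

```python
def headroom_factor(utilization_percent: float) -> float:
    """Return how many times the current load would fit into 100% CPU.

    Assumes load scales linearly with CPU utilization (a simplification).
    """
    if not 0 < utilization_percent <= 100:
        raise ValueError("utilization must be in the range (0, 100]")
    return 100.0 / utilization_percent


# Figures quoted in the article: 5% average load, 12% peak load.
average_headroom = headroom_factor(5)
peak_headroom = headroom_factor(12)

print(f"Headroom at average load: {average_headroom:.1f}x")
print(f"Headroom at peak load: {peak_headroom:.1f}x")
```

Under that assumption, even the peak load could grow roughly eightfold before the CPUs saturate.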
Two Redis servers (master and replica) are used for caching.
In addition, three Elasticsearch servers handle 34 million daily searches.
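As a rough sense of scale, the daily search figure can be converted into an average per-second rate. This sketch assumes searches are spread evenly over the day and evenly across the three servers, which real traffic is not; peak rates will be higher. The function name is illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_per_second(daily_total: int, servers: int = 1) -> float:
    """Average requests per second, optionally divided across servers.

    Assumes a perfectly even spread over time and servers (a simplification).
    """
    if servers < 1:
        raise ValueError("servers must be at least 1")
    return daily_total / SECONDS_PER_DAY / servers


# Figures quoted in the article: 34 million searches per day, three servers.
total_rate = average_rate_per_second(34_000_000)
per_server_rate = average_rate_per_second(34_000_000, servers=3)

print(f"Average searches per second: {total_rate:.0f}")
print(f"Average per server: {per_server_rate:.0f}")
```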
To distribute traffic evenly among the nine web servers, Stack Overflow uses HAProxy. Put simply, HAProxy takes in the incoming traffic and balances the workload across the servers. It also handles server failures: the load balancer detects unresponsive servers and stops sending traffic to them automatically.
In one podcast, Roberta Arcoverde, Director of Engineering at Stack Overflow, was asked about moving to the cloud. She said that, given the cost and latency compared with the current setup, the move isn't worth the effort.
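The idea of spreading requests evenly while skipping failed servers can be illustrated with a minimal sketch. This is not HAProxy's implementation or configuration: the round-robin strategy, the `RoundRobinBalancer` class, and the server names are assumptions made for illustration, and the article does not say which balancing algorithm Stack Overflow uses.

```python
class RoundRobinBalancer:
    """Toy round-robin load balancer that skips servers marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0  # index of the next server to try

    def mark_down(self, server):
        """Record that a health check failed for this server."""
        self.healthy.discard(server)

    def mark_up(self, server):
        """Record that a known server is responding again."""
        if server in self.servers:
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server in rotation."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


# Hypothetical server names, for illustration only.
balancer = RoundRobinBalancer(["web1", "web2", "web3"])
first_round = [balancer.pick() for _ in range(3)]

balancer.mark_down("web2")  # simulate a failed health check
second_round = [balancer.pick() for _ in range(3)]

print(first_round)
print(second_round)
```

A real load balancer runs health checks itself on a timer; here `mark_down` and `mark_up` stand in for the result of those checks.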
Roberta mentioned that although Stack Overflow deals with numerous inquiries concerning cutting-edge technology trends, the platform has mostly avoided incorporating technologies like Kubernetes and microservices. This avoidance stems from the fact that the company isn't encountering the specific challenges these tools were developed to address. Roberta questioned the rationale behind breaking down a monolithic system into microservices or services, stating that such a transition is typically undertaken to enable seamless scaling across different teams, facilitate collaborative work without conflicts, and achieve swift deployment. She emphasized that rapid deployments have never posed an issue for Stack Overflow. Nevertheless, she acknowledged the possibility of this stance evolving in the future.
Credits: .NET Conf 2022 (https://www.youtube.com/watch?v=nZX13dVxnJw&t=1170s); diagrams are taken from https://stackexchange.com/performance.