System Design - Conference Management

Goals of this article:

  • To share what I've learned while working on a web application that is scaling up rapidly and pushing us to improve our software architecture.
  • To understand how a real-world web application handles a high volume of traffic and how to ensure that information provided by the service is consistent and available at all times.

How does a Conference Management Application help?

The basic functional requirements of a Conference Management (CM) application:

  • The application hosts in-person, hybrid, and virtual-only conferences for multiple organizations, each with its own sessions. It consists of many sub-modules/workflows serving different purposes, like registrations, paper reviewing, abstracts, etc.
  • The main focus of this article is participants' registration (Write heavy). All other sub-modules are out of scope.
  • The conference managers review each registration and perform functions like moderation (approve, reject), email communications, etc. (Read heavy)
  • The hierarchy is Category (organization), then sub-category (sub-organization), then conferences within a category or a sub-category.
  • An event can have multiple customizable registration forms, each of which stores users' registrations.

Categories -> Events -> Registration Forms -> Registrations
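To make the hierarchy concrete, here is a minimal sketch of it as plain Python dataclasses. All class and field names (for example `participant_email`, `state`, `path`) are my own assumptions for illustration and not the application's actual schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Registration:
    participant_email: str
    data: dict  # answers keyed by form field name (hypothetical layout)
    state: str = "pending"  # assumed states: pending / approved / rejected


@dataclass
class RegistrationForm:
    title: str
    registrations: list[Registration] = field(default_factory=list)


@dataclass
class Event:
    title: str
    forms: list[RegistrationForm] = field(default_factory=list)


@dataclass
class Category:
    name: str
    parent: Optional["Category"] = None  # None for a top-level organization
    events: list[Event] = field(default_factory=list)

    def path(self) -> str:
        """Return the chain of category names from the root down to this one."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return " -> ".join(reversed(names))
```

A sub-category simply points at its parent, so an event can hang off either level, matching the "conferences within a category or a sub-category" rule.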

Non-functional requirement:

  • As the application is globally used, it is expected to be consistent and available at all times. Maybe error-free too :)

System Interface Design, and extended requirements:

  • The web application must expose REST APIs so that other applications with their own specific use cases, like a participant check-in mobile app, badge printing software, etc., can integrate with it.

Capacity Estimation and Constraints (CEC)

  • The number of participants registering for conferences is far larger than the number of conferences or the number of managers per conference. We can therefore conclude that the application is Write heavy.
  • We can assume a Read-Write ratio of 30:70.

Traffic Estimates

  1. How many registrations are written to the database in a minute?
  2. How many registrations are reviewed in a minute?

  • Let's assume an event receives 10K registrations in a day.

10000 / (60 * 24) ≈ 7 registrations/minute

  • Converting to Queries Per Second (QPS) gives 7/60 ≈ 0.117. Such a small number is awkward to use in further calculations, so let's stick to per-minute figures. For other large-scale distributed systems like Facebook, Uber, Quora, Amazon, Netflix, etc., queries per second makes perfect sense.
  • Sometimes, unplanned last-minute conferences are hosted on the application with short registration deadlines, which makes the system even more write-heavy. This is similar to Flipkart's Big Billion Days (Diwali), Black Friday sales, Twitter's celebrity problem, Quora hot topics, etc. To balance the peak load, we scale our web servers horizontally and add dedicated write or read-only replicated DBs to the cluster.
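The traffic arithmetic above can be reproduced in a few lines. This is only a sketch of the back-of-the-envelope numbers: the 10K/day figure comes from the text, while `peak_rate_per_minute` and its burst factor are hypothetical additions to show how a last-minute spike would scale the average.

```python
REGISTRATIONS_PER_DAY = 10_000  # assumption taken from the text
MINUTES_PER_DAY = 24 * 60

registrations_per_minute = REGISTRATIONS_PER_DAY / MINUTES_PER_DAY
registrations_per_second = registrations_per_minute / 60


def peak_rate_per_minute(average_per_minute: float, burst_factor: float) -> float:
    """Scale the average write rate by a hypothetical burst factor."""
    return average_per_minute * burst_factor


print(f"{registrations_per_minute:.2f} registrations/minute")
print(f"{registrations_per_second:.3f} registrations/second")
print(f"{peak_rate_per_minute(registrations_per_minute, 5):.1f} registrations/minute at a 5x burst")
```

The unrounded average is slightly below 7 per minute; rounding up keeps the estimate on the conservative side.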

Storage Estimates

  • Registration forms are customizable by managers. A form can therefore contain many file fields, whose uploads are stored in a file system, and many text fields, whose values are stored in a database.
  • In my experience, there are usually at least 4 mandatory file fields in every form (participant's picture, proof of identity, organization's referral letter, any other).
  • Let's assume each uploaded file is approximately 5 MB:

4 * 5 MB = 20 MB of file storage per registration

  • Some fields on the form are filled in by users with their information, and some are for the managers, like notes and background check information.
  • We can safely assume 40 fields (INTEGER, BOOLEAN, CHARACTER, TEXT) per registration, with each field taking approx. 1 Kilobyte.

40 * 1 KB = 40 KB

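Therefore, each registration takes approx. 20 MB + 40 KB, which we round up to 21 MB of file and database storage.

  • If we store registration information at the current rate for 10 years,

21 MB * 10K registrations/day * 30 days * 12 months * 10 years = 756,000,000 MB ≈ 720 Terabytes of data (using 1024-based units; about 756 TB in decimal units)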

This estimate helps us decide what limit to set on file uploads of user content.
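As a quick check of the storage estimate, here is the same arithmetic as a small script. The constants mirror the assumptions stated above (4 files of 5 MB, 40 fields of 1 KB, 10K registrations per day, 30-day months); the variable names are mine.

```python
import math

FILES_PER_REGISTRATION = 4
FILE_SIZE_MB = 5
FIELDS_PER_REGISTRATION = 40
FIELD_SIZE_KB = 1
REGISTRATIONS_PER_DAY = 10_000
DAYS = 30 * 12 * 10  # 30-day months over 10 years, as in the text

file_mb = FILES_PER_REGISTRATION * FILE_SIZE_MB
db_mb = FIELDS_PER_REGISTRATION * FIELD_SIZE_KB / 1024
# About 20.04 MB in total, rounded up to a whole megabyte for headroom.
per_registration_mb = math.ceil(file_mb + db_mb)

total_mb = per_registration_mb * REGISTRATIONS_PER_DAY * DAYS
total_tb_decimal = total_mb / 1000**2  # 1 TB = 1,000,000 MB
total_tb_binary = total_mb / 1024**2   # 1024-based conversion

print(f"{per_registration_mb} MB per registration")
print(f"{total_tb_decimal:.0f} TB (decimal) or {total_tb_binary:.0f} TB (1024-based)")
```

The two totals differ only in the unit convention; either way the order of magnitude is several hundred terabytes, dominated almost entirely by the uploaded files.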

Bandwidth Estimates

Bandwidth estimates are crucial for balancing the load between multiple application servers:

7 registrations/minute * 21 MB = 147 MB/minute

Dividing by 60 seconds, data is written to the file system and database at about 2.45 MB/second.
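The bandwidth figure follows directly from the write rate and the per-registration size. A small sketch, where the conversion to megabits per second is my own addition because network links are usually quoted in bits:

```python
REGISTRATIONS_PER_MINUTE = 7   # from the traffic estimate
MB_PER_REGISTRATION = 21       # from the storage estimate

mb_per_minute = REGISTRATIONS_PER_MINUTE * MB_PER_REGISTRATION
mb_per_second = mb_per_minute / 60
megabits_per_second = mb_per_second * 8  # 1 byte = 8 bits

print(f"{mb_per_minute} MB/minute")
print(f"{mb_per_second:.2f} MB/second, about {megabits_per_second:.1f} Mbit/s")
```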

Reads mostly happen in bulk through the server-side registration list used by the managers, where a manager can read up to 200 registrations on one page. This information is usually fetched from the Postgres database, while files are accessed one at a time.
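The 200-per-page read pattern can be sketched with a plain in-memory list standing in for the database result. The `paginate` function and its return shape are hypothetical; in a real deployment the slicing would be done by the database query rather than in Python.

```python
from math import ceil

PAGE_SIZE = 200  # maximum registrations shown on one page, as in the text


def paginate(rows: list, page: int, page_size: int = PAGE_SIZE) -> dict:
    """Return one page of rows plus the metadata a list view needs."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    start = (page - 1) * page_size
    return {
        "items": rows[start:start + page_size],
        "page": page,
        "pages": ceil(len(rows) / page_size),
        "total": len(rows),
    }
```

For example, 450 registrations yield three pages, the last one holding the remaining 50 rows.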

Cache or Memory Estimates

The Postgres DB cluster (Pgpool-II) is powerful enough to handle multiple large queries per second. It currently stores the registration information in a redundant, de-normalized JSONB column, much like a document DB (MongoDB), which helps with efficient pagination and filtering.

The user's session-related information and some frequently accessed temporary information are stored on Redis (distributed cache).
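Temporary data in a cache like this normally carries an expiry time. The class below is not Redis; it is a tiny in-memory stand-in, written under my own assumptions, that only illustrates the time-to-live (TTL) behavior: a value is returned until its TTL elapses and is dropped afterwards. The `clock` parameter is a hypothetical hook that makes the expiry testable.

```python
import time


class TTLCache:
    """Minimal in-memory key/value store with a fixed time-to-live per entry."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value) -> None:
        self._store[key] = (value, self._clock() + self._ttl)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has outlived its TTL: evict it lazily on read.
            del self._store[key]
            return default
        return value
```

A session cache would be created with something like `TTLCache(ttl_seconds=3600)` and queried with `get` on each request; a miss means the caller falls back to the database.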

The operating system or the WSGI/ASGI web server might cache frequently accessed files, such as publicly available event materials like slide decks (PPTs) or other documents.

Some registrations are frequently accessed by different managers. Let's assume the number of registrations read often matches the write rate, i.e. 7 per minute, and that each one is cached with a TTL of 1 hour.

20 MB * 7 registrations/minute * 60 minutes = 8,400 MB ≈ 8.4 GB, so we should budget roughly 10 GB of RAM for the 1-hour cache window
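The cache sizing can be checked the same way as the other estimates. This sketch multiplies out the three assumptions from the text (20 MB per registration, 7 hot registrations per minute, 1-hour TTL); note that the result is the memory held over the whole TTL window, not a per-minute figure.

```python
CACHED_MB_PER_REGISTRATION = 20
HOT_REGISTRATIONS_PER_MINUTE = 7
TTL_MINUTES = 60

# Entries added during one full TTL window all coexist in memory.
cache_mb = CACHED_MB_PER_REGISTRATION * HOT_REGISTRATIONS_PER_MINUTE * TTL_MINUTES
cache_gb = cache_mb / 1000

print(f"{cache_mb} MB, about {cache_gb:.1f} GB held in cache at steady state")
```

Rounding this up to about 10 GB leaves some headroom for session data and other temporary keys.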

Done

These back-of-the-envelope estimates do not include the growth we are currently scaling for or any future predictions. For example, they leave out the number and type of sub-organizations being onboarded, which bring new events every month and thus increase the number of registrations and active users per day.

It's important to monitor the application's performance regularly, so that we can make adjustments and remove code- or infrastructure-related bottlenecks.

Thanks for reading,
