System Design and Development of a URL Shortener Service (1 of 2)

Article Goals

The intent of this article is to present a holistic approach to systems design, using a URL shortening service as a case study. It aims to relay an overview of baseline considerations, a workable solution, and a template for future articles.

This article, the first installment of a two-part series, aims to explore the system design of a URL shortener service. The follow-up article will shift towards the practical aspects of development and testing.

Table of Contents

  1. Introduction
  2. Initial Considerations
  3. Functional Requirements
  4. Non-Functional/Capacity Requirements
  5. Initial System Diagram, Redundancy Considerations, and Caching Strategy
  6. System Tradeoffs
  7. Calculation of Database Size
  8. Designing the length of the Short URL
  9. Encode and Decode Methods
  10. Scaling for Data Size
  11. Final Systems Architecture
  12. Building a Practical Demo: A Step Toward Realization - Personal Takeaway
  13. References
  14. Appendix


Introduction

Designing a URL shortener is a great introduction to systems design as a whole, since it is a common tool on the web for truncating URLs. A URL shortening service offers businesses valuable insight through data analytics and affiliate marketing, and customers can use the service to post cumbersome links on social media to enhance brand visibility. Over time the underlying system will need to accommodate increasing demand for instant access, and the sheer volume of data being processed necessitates both efficiency and scalability.


Initial Considerations for a URL Shortener Service

To create our URL shortener service we will store key-value pairs for O(1) read and write operations, such that the short URL is the key and the long URL is its respective value. Making the short URL the key supports the case where the client service needs a quick lookup of the long URL for a given short URL. Should a long URL be re-entered into the system, we treat the request as non-idempotent and generate a new unique short URL.

//pseudo-code
DB = {[shortURL]: [longURL]}         

Functional Requirements

  1. Generate unique and short URLs
  2. Encode/Decode methods
  3. If a recurring long URL is entered, return a new unique short URL
  4. Custom URLs (optional)
  5. Least Recently Used (LRU) cache (optional)
  6. Data Time To Live (TTL) (optional)
  7. Batched processing of URL generations (optional)
  8. Data analytics (optional)


This service can be created as a typical client-server application; however, there are several considerations for why we may need to opt for a distributed system:

  1. To create our service we need a place to store our URLs in a database, as well as a cache tier for high-speed data retrieval. Both the database and the cache may need to be scaled if the data is too intensive for a single machine.
  2. We should consider scaling the service for throughput in case the number of requests to our server grows too large.
  3. If the response time of our system is slow then we may need to parallelize the computation.
  4. To optimize for availability/reliability in the case of faults or disaster we need system redundancy.
  5. To optimize for client geolocation we can minimize network latency by placing our servers closer to the client.
  6. Hotspots may occur due to disproportionately high loads.


Non-Functional/Capacity Requirements

Calculation for the Short URL generation via the encoder function

There are an estimated 2-3 Billion short URLs created per year:

(Number of requests made per year) / (Number of seconds in a year) = Number of queries per second (qps)

(2.5 x 10^9) / (365 days x 24 hours x 60 mins x 60 seconds) ~= 75 qps


Calculation for the Long URL retrieval via the decoder function

There are an estimated 20 billion clicks per month to obtain the redirect URL from this service:

(20 x 10^9) / (30 days x 24 hours x 60 mins x 60 seconds) ~= 7700 qps


QPS CALCULATIONS SUMMARY

Short URL generation: ~75 qps for Write

Long URL retrieval: ~7700 qps for Read
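
These figures are easy to reproduce. Below is a quick back-of-envelope script (in Python) using the same estimates; the 2.5 billion and 20 billion inputs are the assumptions stated above.

# Back-of-envelope QPS estimates
SECONDS_PER_YEAR = 365 * 24 * 60 * 60     # ~3.15 x 10^7
SECONDS_PER_MONTH = 30 * 24 * 60 * 60     # ~2.59 x 10^6

writes_per_year = 2.5e9                   # estimated short URLs created per year
reads_per_month = 20e9                    # estimated redirect clicks per month

write_qps = writes_per_year / SECONDS_PER_YEAR
read_qps = reads_per_month / SECONDS_PER_MONTH

print(f"Write (short URL generation): ~{write_qps:.0f} qps")   # ~79 qps
print(f"Read (long URL retrieval):    ~{read_qps:.0f} qps")    # ~7716 qps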


Initial System Diagram, Redundancy Considerations, and Caching Strategy

Accessing our data on disk is significantly slower than accessing it in memory, where RAM provides near-instant data access. RAM is also volatile, meaning we may lose our data if the system is turned off, so keeping a record of the database on disk provides better fault tolerance should the system go offline due to a power outage. Disk also offers far more storage capacity per unit cost than RAM.

For our architecture we will use a hybrid solution to store our database both on disk and in-memory due to their aforementioned tradeoffs. We require a few services for our initial architecture: a compute instance, a cache to allow for high speed data retrieval, and a database to serve as a long term, non-volatile and fairly low cost storage solution.

For our initial architecture we are also employing a cache-aside (lazy-loading) strategy over a cache-through approach. This reflects the design's preference for availability over consistency, since strict freshness of the data in such a service is not paramount.
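
To make the cache-aside flow concrete, below is a minimal sketch of the read path. The `cache` and `database` objects and their get/set methods are hypothetical stand-ins, not a specific client library.

# Cache-aside (lazy-loading) read path -- minimal sketch, not production code.
def get_long_url(short_url, cache, database, ttl_seconds=3600):
    # 1. Try the cache first for low-latency reads.
    long_url = cache.get(short_url)
    if long_url is not None:
        return long_url                      # cache hit

    # 2. On a miss, fall back to the database (the on-disk source of truth).
    long_url = database.get(short_url)
    if long_url is None:
        return None                          # unknown short URL

    # 3. Lazily populate the cache so subsequent reads are fast.
    cache.set(short_url, long_url, ttl_seconds)
    return long_url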

Initial Design

System Tradeoffs

No distributed system is safe from network failures, so we need to build in fault tolerance. Moving forward we will create the system with two replicas per tier as a baseline for redundancy. We will also assume an arbitrary 3-year overall data retention policy to ground further system optimizations pertaining to data load and throughput. Afterwards the service will need to be re-evaluated and adjusted around the observed system analytics.


Partitioned Design


The designed system is read-heavy, as projected earlier in the long URL retrieval QPS calculation, so opting for a hot-warm architecture will optimize the system for availability. The overall system is therefore formulated around the availability and partition tolerance guarantees of the CAP Theorem.

In a hot-warm architecture the client of the service reads and writes data through the lead partition, and a read quorum also includes the secondary partition, the replica. The tradeoff for the added redundancy is potentially stale data in the secondary partition. Finally, the third partition, a replica of the secondary partition, serves as a failover backup. This pattern allows the client to continue reading data should the primary partition go down, with the tertiary partition providing an additional layer of durability to recover data.
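
As a rough illustration of this read path, the sketch below tries the lead partition first and falls back to the warm replica, then the backup, when a node is unreachable. The replica objects and their `get` method are hypothetical placeholders.

# Hot-warm read with failover -- illustrative sketch only.
# `primary`, `secondary`, and `tertiary` are hypothetical replica clients.
def read_with_failover(short_url, primary, secondary, tertiary):
    for replica in (primary, secondary, tertiary):
        try:
            return replica.get(short_url)    # may return slightly stale data
        except ConnectionError:
            continue                         # node is down; try the next replica
    raise RuntimeError("all replicas unavailable")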


Hot-Warm Architecture Nested Within Partitions



Calculation of Database Size

In order to calculate data size we may refer back to our QPS for creating the short URLs, which was about 75 qps, and use this alongside the projected life of our system (three years stated earlier) to help determine the overall expected data:

  • Let W be the number of write queries, in QPS, for the short URL generation such that W = 75 qps ~= 100 qps = 10^2 qps
  • Since the keys in the database will be the short URLs, the qps of short URL generation is important because it corresponds to the number of key-value pairs we need to keep on record.


Total Number of Queries will be Number of seconds in three years x W:

  • Number of seconds per day = 24 hours x 60 minutes x 60 seconds = 86,400 ~= 100,000 seconds/day = 10^5 seconds/day
  • Lifetime of 3 Years = 365 days x 3 ~= 1000 days = 10^3 days
  • Number of seconds in three years = (10^3 days) x (10^5 seconds/day) = 10^8 seconds
  • Total Number of Queries = Number of seconds in three years x W = (10^8 seconds) x (10^2 qps) = 10^10 queries

Using the Total Number of Queries of 10 billion we may calculate the Total Data Size, which would be the Total Number of Queries x Size of Data.

The Size of Data must cover a single key-value pair. Since the short and long URLs are text-based, an allowance of an arbitrarily long 2048 bytes (2 kB) per record is a sufficient expectation for a single key-value pair.

Total Data Size = (10^10 queries) x 2 kB

Total Data Size = (2 x 10^10) kB

Total Data Size = 20 TB
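
The same storage estimate, reproduced as a quick script using the rounded figures above:

# Back-of-envelope storage estimate
write_qps = 1e2                 # ~75 qps rounded up to 10^2
seconds_in_3_years = 1e3 * 1e5  # ~1000 days x ~100,000 seconds/day = 10^8 seconds
record_size_kb = 2              # 2 kB allowance per key-value pair

total_records = write_qps * seconds_in_3_years          # 10^10 key-value pairs
total_size_tb = total_records * record_size_kb / 1e9    # kB -> TB (10^9 kB per TB)

print(f"Total records: {total_records:.0e}")    # 1e+10
print(f"Total size:    {total_size_tb:.0f} TB") # 20 TB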


Designing the length of the Short URL

In the previous section, Calculation of Database Size, a key detail is that the number of write queries for short URL generation corresponds to the number of key-value pairs to keep on record in our database. Since the unique keys in the system are short URLs, there are approximately 10 billion queries, i.e. 10 billion key-value pairs, to keep on record. We can use this information to determine the length of the short URLs, because the encoding needs to generate a sufficient number of unique URLs.

Initial Considerations for Short URL Generation

For generating short URLs we might consider applying an existing cryptographic hash function such as SHA-1, SHA-2, SHA-3, or MD5 to the long URL. However, these functions are deterministic (the same long URL always produces the same digest) and their output is much longer than we need, so they will not meet our requirements; we are better off creating our own encode and decode methods to have full control of the system functionality.

Faulty Approach: Store an index as the short URL by maintaining a counter

We need to come up with a solution that is both short and unpredictable. As an initial approach we can generate our shortened URLs by storing an index as the short URL via an incrementing counter. Although this implementation may generate unique values for the short URL, it is a predictable solution and can be trivially reverse engineered. The generated value would also not stay short, as the index will eventually grow into a very long number.

Sophisticated Approach: Hash the URL and update the counter to avoid data collisions

To make our encoding unpredictable it makes sense to generate a random value for our encode functionality. However, random values alone cannot guarantee uniqueness without collision checks, so we will recycle the previous approach of maintaining an index via a count. This count is special because it serves as a Primary Key: it prevents data collisions by being unique and never null. Alternatively, we may opt to use another non-repeating value such as a system timestamp or epoch time.

//pseudo-code
shortURL = encode(longURL, primaryKey)        

Ultimately we will convert a number n to a base x such that a unique and short URL is generated for a given long URL. Suppose the short URLs were binary strings of 0's and 1's; each character of the short URL would then be a base-2 digit. This may suffice, but we can certainly do better by simply using a higher base, so let's up it to base-16. This means we can use the digits 0-9 and the characters A-F to represent the short URL.

We can take it even further by moving to base-62, using 0-9, a-z, and A-Z. Adding two more characters, hyphen "-" and underscore "_", makes the digits base-64, where the math tends to be cleaner since we are working with a power of 2.

Calculate the length of the short URLs

There are 10 billion key-value pairs of data, so at the very least we need 10^10 unique keys. Since each character of the short URL can take one of 64 values, we can calculate the minimum length of the short URL needed to cover that key space.

Let L be the length of the short URL, let's calculate the value of L:

64^L >= 10^10

L x log10(64) >= 10

L >= 10 / log10(64) ~= 5.54

Rounding up to the nearest whole number gives L = 6, meaning a 6-character short URL can address 64^6 key-value pairs.

Let's verify how many unique short URLs we can generate, using more human-readable numbers:

64^6 = (2^6)^6 = 2^36 = (2^10)^3 x 2^6 ~= (1 billion) x 64 ~= 64 billion unique short URLs may be generated, meeting and exceeding our expected requirement of 10 billion keys.
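
The minimum length can also be sanity-checked directly:

# Verify the minimum short URL length
import math

required_keys = 1e10                       # ~10 billion key-value pairs
base = 64                                  # 0-9, a-z, A-Z, '-', '_'

min_length = math.ceil(math.log(required_keys, base))
print(min_length)                          # 6
print(f"{base ** min_length:.2e}")         # ~6.87e+10 unique short URLs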


Encode and Decode Methods

Here's a reference solution for the short URL generation using an encode function, based on what we've gathered in previous sections; the example below is written in Python. The comments in the following code are present to guide learners. See the Appendix for Python, TypeScript, and Go specific code.

counter = 0  # Counter needed to guarantee uniqueness
base_url = "https://url.shorten/"

# Simulated database using a global dictionary (consider replacing with a real database in development)
db = {}

def encoder(long_url):
    global counter
    counter += 1
    digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_"
    base = len(digits)  # Base-64
    digit_list = []
    curr_val = counter

    while curr_val >= base:
        quotient, remainder = divmod(curr_val, base)
        digit_list.append(digits[remainder])
        curr_val = quotient

    # Append the most significant digit
    digit_list.append(digits[curr_val])

    # Reverse the list and join to form the short URL path
    short_url = ''.join(reversed(digit_list))

    # Store the mapping in the dictionary
    db[short_url] = long_url

    # Return the complete short URL
    return base_url + short_url        

And the decode will be a simple lookup in the database:

def decode(short_url):
    key = short_url.replace(base_url, "")
    return db.get(key, "URL not found")        
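
A quick round-trip check of the two functions above, using the in-memory dictionary as the stand-in database:

# Round-trip sanity check for the encoder/decode pair above
long_url = "https://example.com/some/very/long/path?with=query&parameters=true"

short_url = encoder(long_url)
print(short_url)                 # e.g. https://url.shorten/1 for the first entry
print(decode(short_url))         # returns the original long URL

# Re-encoding the same long URL yields a new short URL (non-idempotent by design)
print(encoder(long_url))         # e.g. https://url.shorten/2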

Scaling for Data Size

Before delving into the optimization strategies for data size within our URL shortener service, it's essential to establish a few heuristics related to caching and database management. These heuristics will guide our approach to efficiently managing and accessing the vast amounts of data our service will handle.

Heuristics for Caching and Data Access:

  1. Cache Utilization: We estimate that maintaining around 10% of the database entries in our cache should suffice for most use cases. With this setup, we can anticipate a cache hit rate of approximately 90%. This means that 90% of the time, data requests can be served directly from the cache, significantly reducing data retrieval times and lessening the load on our database.
  2. Cache Expansion: If we opt to increase our cache size to include 20-30% of the database entries, we project a substantial improvement in our cache hit rate, reaching 98-99%. This approach ensures even faster data access for the majority of requests, further minimizing the dependency on direct database queries. However, this comes with increased memory requirements for the cache, which needs to be balanced against the available resources and cost implications.

Heuristic for Optimal Cache-Hit Rate


Database Partitioning

Referring back to the calculation of the data size present in our system over its lifecycle, it's projected that there will be 20 TB worth of data in the system. Assuming that a typical commodity machine has roughly 1-2 TB of disk space, we would need 10 partitions of 2 TB disk space each. Following our initial system design, each partition will have 2 replicas, bringing the total to 30 partitions for database storage.

Cache Partitioning

Opting to store 20% of the database in our cache, as specified in our heuristic approach, means that 4 TB of data, or approximately 2 billion key-value pairs, will be present in our cache at all times. Given that a standard commodity machine comes with about 128 GB of RAM, dividing 4 TB by 128 GB yields roughly 30 partitions to accommodate the cache. As per our design, each partition will have two replicas, requiring roughly 90 partitions in total for the cache.
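
Both partition counts can be reproduced with the same rough arithmetic:

# Rough partition-count estimates
total_data_tb = 20
disk_per_machine_tb = 2
ram_per_machine_gb = 128
cache_fraction = 0.20               # store ~20% of the data in cache
replicas_per_partition = 2          # two replicas per primary, as per the design

db_primaries = total_data_tb // disk_per_machine_tb          # 10 primaries
db_total = db_primaries * (1 + replicas_per_partition)       # 30 database partitions

cache_data_gb = total_data_tb * cache_fraction * 1000        # 4 TB ~= 4000 GB
cache_primaries = cache_data_gb / ram_per_machine_gb         # ~31, rounded to ~30 in the text
cache_total = cache_primaries * (1 + replicas_per_partition) # ~94, rounded to ~90 in the text

print(db_total, round(cache_total))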


Data Size Optimized Design

Final Systems Architecture

In the context of our distributed URL shortener service, the architecture's resilience and efficiency are paramount. This necessitates an effective routing policy at the application tier, typically managed by an API Gateway or a Load Balancer. These components play a pivotal role in distributing incoming traffic to various service instances, ensuring that no single point becomes a bottleneck or a point of failure.

Finalized High-Level Distributed System Architecture

Load Balancer and Redundancy

To mitigate risks associated with a single point of failure, it's crucial to implement redundancy for the Load Balancer. This can be achieved by deploying multiple Load Balancers and using a Domain Name System (DNS) strategy to distribute traffic among them. Such a setup not only enhances system reliability but also enables seamless handling of load spikes and failover scenarios.

DNS Strategy

A sophisticated DNS strategy can significantly contribute to system resilience and load distribution. By directing traffic across multiple Load Balancers, the DNS serves as an additional layer of load balancing. This approach also facilitates geographical load distribution, directing users to the nearest or most available service instance, thus minimizing latency and improving the overall user experience.
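
As a toy illustration of DNS-level distribution, the sketch below simply rotates through a list of load balancer addresses round-robin style. The addresses are placeholders; a real deployment would rely on DNS records (e.g. weighted or latency-based routing) managed by the DNS provider rather than application code.

# Toy round-robin selection across load balancers -- conceptual sketch only.
from itertools import cycle

load_balancers = cycle(["lb-1.url.shorten", "lb-2.url.shorten"])  # placeholder addresses

def next_load_balancer():
    return next(load_balancers)

print(next_load_balancer())   # lb-1.url.shorten
print(next_load_balancer())   # lb-2.url.shorten
print(next_load_balancer())   # lb-1.url.shorten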

Considerations

  • Scalability: As the service scales, both the API Gateway and Load Balancers must scale accordingly. This may involve scaling out (adding more instances) and scaling up (upgrading existing instances) based on traffic patterns and system load.
  • Health Checks: Regular health checks are essential to ensure all system components, including Load Balancers and service instances, are operational. This enables the automatic rerouting of traffic away from any failing components, maintaining the system's availability and reliability.
  • Security: Implementing security measures at the Load Balancer level, such as SSL/TLS termination, can help offload encryption tasks from the application servers, optimizing performance while ensuring secure communications.

The final architecture of our URL shortener service demands a robust, scalable, and resilient setup, with a special focus on effective traffic distribution and failover management. By leveraging a combination of API Gateways, Load Balancers, and a strategic DNS configuration, we can achieve a highly available and efficient system capable of handling the demands of a distributed environment.


Building a Practical Demo: A Step Toward Realization - Personal Takeaway

While the comprehensive system architecture outlined represents an ideal, large-scale deployment, the reality of development, especially for individuals or smaller teams, often necessitates starting on a more manageable scale. To bridge this gap between theory and practice, I aim to create a simplified yet functional demo of our URL shortener service. This demo will serve as a tangible example of applying the architectural principles discussed, scaled down to a level that's feasible for solo developers or small teams.

Why a Demo?

  • Understanding through Doing: Building a demo allows for a deeper understanding of the system's components and their interactions, illuminating the complexities of distributed systems in a hands-on way.
  • Validation of Concepts: It provides an opportunity to validate the architectural decisions made on paper, identifying potential oversights or areas for optimization.
  • Educational Tool: The demo can serve as an educational tool, not just for myself, the developer, but also for the community, offering insights into practical challenges and solutions in system design.

Demo Overview

For the upcoming demo, we'll employ a tech stack that's both robust and accessible:

  • Programming Language: Go (Golang), chosen for its efficiency and suitability for building high-performance, concurrent applications.
  • Caching Layer: Redis, a versatile in-memory data store, will serve as our caching solution, offering rapid access to frequently requested data.
  • Database: A NoSQL database, such as MongoDB, will store URL mappings, providing scalability and flexibility in handling unstructured data.
  • Containerization: Docker will encapsulate our service's components, ensuring consistency across different development and production environments.
  • Orchestration: Kubernetes will manage the deployment, scaling, and operation of our containerized applications, facilitating the demo's scalability and resilience. Depending on the risk associated with the cost and other personal factors this may be dropped.

Building the Demo

The development process will involve:

  1. Setting Up the Development Environment: Configuring Go, Redis, MongoDB, Docker, and Kubernetes on a local development machine or development server.
  2. Implementing the Core Service: Coding the URL shortening logic in Go, integrating Redis for caching, and MongoDB for persistent storage.
  3. Containerization: Packaging the service and its dependencies into Docker containers, making the application environment-agnostic.
  4. Deployment and Orchestration (Stretch Goal): Deploying the containers onto a Kubernetes cluster, setting up services, and load balancing for handling traffic.

Future Steps

This demo will not only demonstrate our system architecture but also lay the groundwork for scaling up. As my development and data engineering skills evolve, I plan to incrementally incorporate more elements of the full system architecture, tackling challenges such as distributed caching, database sharding, and advanced load balancing techniques.

Stay tuned for a follow-up article where I'll share the development journey, challenges encountered, code snippets, and insights gained from building this demo. Together, we'll take a step closer to turning the theoretical architecture of our URL shortener service into a tangible, functioning reality.


References

  1. "LRU Cache." LeetCode. Accessed January 24, 2024. https://leetcode.com/problems/lru-cache/description/.
  2. "What is an LRU Cache?" Redis. Accessed January 26, 2024. [https://redis.com/glossary/lru-cache/#:~:text=The%20Least%20Recently%20Used%20(LRU,remains%20available%20in%20the%20cache.](https://redis.com/glossary/lru-cache/#:~:text=The%20Least%20Recently%20Used%20(LRU,remains%20available%20in%20the%20cache.)
  3. Patel, Abhi. "System Design: Storage, Latency, Throughput, and Availability." Medium. Accessed January 30, 2024. https://medium.com/@abhipatel3112001_24075/system-design-storage-latency-throughput-and-availability-ca009a6a57ac
  4. "How Do We Design for High Availability?" ByteByteGo Blog. Accessed February 1, 2024. https://blog.bytebytego.com/p/how-do-we-design-for-high-availability
  5. "CAP Theorem." Wikipedia. Accessed February 3, 2024. https://en.wikipedia.org/wiki/CAP_theorem
  6. "What is Quorum in Distributed Systems?" Educative.io. Accessed February 5, 2024. https://www.educative.io/answers/what-is-quorum-in-distributed-systems
  7. Poppe, Mauricio. "Back-of-the-Envelope Calculations." Mauricio Poppe's Notes. Accessed February 8, 2024. https://www.mauriciopoppe.com/notes/computer-science/system-design/back-of-the-envelope-calculations/


Appendix: Code Implementations

This appendix provides code samples for the URL shortener service's encode function in Go, TypeScript, and Python. Additionally, a simple decode function is provided for each language, demonstrating how to retrieve the original URL from the shortened version.

Go Implementation

Encoding

package main

import (
    "sync"
)

var (
    counter int
    base64Digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_"
    base         = len(base64Digits)
    db           = make(map[string]string)
    lock         = sync.Mutex{}
    baseURL      = "https://url.shorten/"
)

func encode(longURL string) string {
    lock.Lock()
    defer lock.Unlock()
    counter++
    var digitList []byte
    currVal := counter

    for currVal > 0 {
        remainder := currVal % base
        digitList = append([]byte{base64Digits[remainder]}, digitList...)
        currVal /= base
    }

    shortURL := string(digitList)
    db[shortURL] = longURL
    return baseURL + shortURL
}        

Decoding

func decode(shortURL string) string {
    lock.Lock()
    defer lock.Unlock()

    shortURL = shortURL[len(baseURL):] // Remove the base URL part

    return db[shortURL]
}        


TypeScript Implementation

Encoding

let counter = 0;
const base64Digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_";
const base = base64Digits.length;
const db = new Map<string, string>();
const baseURL = "https://url.shorten/";

function encode(longURL: string): string {
    counter++;
    let digitList: string[] = [];
    let currVal = counter;

    while (currVal > 0) {
        let remainder = currVal % base;
        digitList.unshift(base64Digits[remainder]); // Prepend to make reversing unnecessary
        currVal = Math.floor(currVal / base);
    }
    const shortURL = digitList.join('');
    db.set(shortURL, longURL);
    return baseURL + shortURL;
}        

Decoding

function decode(shortURL: string): string {
    const key = shortURL.replace(baseURL, "");
    return db.get(key) || "URL not found";
}        


Python Implementation

Encoding

counter = 0
base64_digits = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-_"
base = len(base64_digits)
db = {}
base_url = "https://url.shorten/"

def encode(long_url):
    global counter
    counter += 1
    digit_list = []
    curr_val = counter

    while curr_val > 0:
        remainder = curr_val % base
        digit_list.insert(0, base64_digits[remainder])  # Prepend to make reversing unnecessary
        curr_val //= base
    
    short_url = ''.join(digit_list)
    db[short_url] = long_url
    return base_url + short_url        

Decoding

def decode(short_url):
    key = short_url.replace(base_url, "")
    return db.get(key, "URL not found")        

This appendix provides a concise, clear presentation of the encode and decode functionalities across three different programming languages, intended for readers to easily adapt or extend based on their specific requirements or to explore further optimizations.
