Software Architecture & System Design!

Thanks to the original writer and article:

https://medium.com/@olgamitroshyna/software-architecture-i-wish-i-had-known-about-this-earlier-4df43eae57db


[Image: architecture diagram. The numbered components are described below.]

0. Customers: the end users of your web application.

1. Domain Name System (DNS): resolves a domain name (e.g. olgamitroshyna.com) to an IP address (the address of the server that will process the user request).

2. Load Balancer: distributes traffic among multiple servers to share the load.

3, 5. Cache: stores data to serve user requests faster.

4. Front Application (Front-end): the user interface, an application skin, a presentation layer.

6. Message Queue: stores user requests for further processing by web services.

7. Web Services (Back-end): where the business logic (app functionality) lives.

8. Data Store: where web services write data to and read data from.

9. Search Engine: responsible for complex search queries that a data store can’t handle efficiently.

10. Content Delivery Network (CDN): stores static files like images, CSS, and JavaScript files to serve user requests faster.

11. Queue Workers: additional servers that process requests (i.e. messages) from a message queue.

Front-end layer

Front-end layer components are:


  • DNS;
  • CDN;
  • Load balancer/reverse proxy, which can be one of three types:
    a) A hosted service (e.g. Elastic Load Balancer by Amazon);
    b) A self-managed software-based load balancer (e.g. Nginx);
    c) A hardware load balancer.
  • Front-end web servers (a presentation & back-end results aggregation layer; technologies: PHP, Python, Groovy, Ruby, or JavaScript (Node.js)).

It’s also important to know that the front-end layer stores the HTTP session (the data about a user) in one of three ways: a) in cookies; b) in an external data store; or c) on the web server itself, in which case the load balancer uses sticky sessions: it makes sure that requests with the same session cookie always go to the server that initially issued the cookie.
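The sticky-session option can be illustrated with a small sketch. The class name and server names are hypothetical, and the balancer here simply pins a cookie to a server the first time it sees it; a real load balancer would also deal with server failures and session expiry.

```python
import itertools


class StickyLoadBalancer:
    """Toy load balancer: round-robin for new sessions, sticky afterwards."""

    def __init__(self, servers):
        self._round_robin = itertools.cycle(servers)
        self._session_to_server = {}  # session cookie -> pinned server

    def route(self, session_cookie=None):
        # Requests without a session cookie go to the next server in turn.
        if session_cookie is None:
            return next(self._round_robin)
        # The first request carrying a cookie pins it to a server; every
        # later request with the same cookie is sent to that same server.
        if session_cookie not in self._session_to_server:
            self._session_to_server[session_cookie] = next(self._round_robin)
        return self._session_to_server[session_cookie]


balancer = StickyLoadBalancer(["web-1", "web-2", "web-3"])
first = balancer.route("session-abc")
second = balancer.route("session-xyz")
again = balancer.route("session-abc")  # same server as `first`
```

The trade-off is that a sticky server becomes a single point of failure for its sessions, which is one reason cookies or an external session store are often preferred.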

Back-end layer / web services

Options to implement an app:


  1. Build a monolithic app, then add web services according to the business needs;
  2. Follow an API-first approach: all clients (mobile app, desktop website, mobile website, etc.) use the same API interface when talking to a web application;
  3. Combination of those two above.

Types of web services:

  • Function-centric
  • > The ability to call functions or methods on remote machines without needing to know how they are implemented;
  • > Example: SOAP (XML messages, usually sent over HTTP). SOAP is more complex than REST and ships with its own security standards (WS-Security); REST is more lightweight and needs less documentation.
  • Resource-centric (REST + JSON)
  • > Resources are treated as objects, and 4 operations can be performed on the objects: read, create, update, and delete (GET, POST, PUT, DELETE);
  • > Access to resources is usually protected by authentication (e.g. OAuth 2);
  • > Depends on transport layer security (HTTPS).
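To make the resource-centric idea concrete, here is a minimal sketch of how the four HTTP methods map to operations on a resource. The `handle` function, the `products` dictionary, and the status codes chosen are illustrative assumptions, not part of the original article; a real service would sit behind a web framework, a data store, and an authentication layer.

```python
import itertools

products = {}                  # in-memory stand-in for a data store
_next_id = itertools.count(1)  # hypothetical id generator


def handle(method, product_id=None, body=None):
    """Map an HTTP method to a CRUD operation; return (status, payload)."""
    if method == "POST":       # create a new resource in the collection
        new_id = next(_next_id)
        products[new_id] = body
        return 201, {"id": new_id, **body}
    if product_id not in products:
        return 404, None       # the remaining methods need an existing id
    if method == "GET":        # read
        return 200, products[product_id]
    if method == "PUT":        # update (replace the stored representation)
        products[product_id] = body
        return 200, body
    if method == "DELETE":     # delete
        del products[product_id]
        return 204, None
    return 405, None           # method not allowed


status, created = handle("POST", body={"name": "Lamp"})
```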

Scaling REST web services:

  • Into functional pieces / functional partitioning
  • > A way to split a service into smaller, independent web services, where each web service focuses on a particular functionality;
  • > There can be a few dependencies between web services — and that’s ok (e.g. between a user (UserProfileService) and a product catalog (ProductCatalogService) when a user saves some products from a catalog);
  • > Each web service can be scaled independently;
  • > Services integration may be challenging;
  • > The author recommends using the service-oriented architecture and web services only when a tech team grows above 10-20 engineers.
  • Adding clones
  • HTTP protocol caching
  • > GET responses are cached, so a response is returned from the cache rather than by asking a web service for it.

Scalability solutions

  1. Adding more clones/servers (the easiest, cheapest option);
  2. Division by functionality (server specialization, i.e. a service-oriented architecture (SOA); requires more effort, and the number of functionalities you can split out is limited);
  3. Division by data (see the “Data layer” section below).

Data layer

Traditional scaling — vertical (buying stronger servers, adding RAM, more hard drives, etc.).


Scaling a relational data store (e.g. MySQL):

  1. Replication
  • > Having multiple copies of the same data stored on different machines;
  • > The state of the source server and its replicas needs to be kept in sync;
  • > Data is modified only via the source server, but read queries can be distributed among replicas;
  • > Challenges of replication: a) it scales only reads (excellent for read-heavy apps); b) it does not solve the problem of an actively growing data set; c) replicas can return outdated data.
  2. Data partitioning / sharding
  • > Division of a data set into smaller pieces (no need to process the entire data set);
  • > A sharding key is the criterion for partitioning (e.g. in an online shop the user id can be the sharding key, so all of a user’s data, such as orders, is stored in the same shard);
  • > Disadvantages: a) adds a significant amount of work and complexity; b) you cannot easily execute queries that span multiple shards; c) depending on how you map from the sharding key to the server number, it might be difficult to add more servers to your infrastructure;
  • > Azure SQL Database Elastic Scale is a ready-to-use solution for sharding.
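The point about mapping a sharding key to a server number can be shown with a short sketch. It assumes numeric user ids and a plain modulo mapping, which is only one possible scheme; the function names are made up for illustration.

```python
def shard_for(user_id, shard_count):
    """Map a sharding key (a numeric user id) to a server number."""
    return user_id % shard_count


def moved_keys(user_ids, old_count, new_count):
    """Count keys whose shard changes when the number of servers changes."""
    return sum(
        1 for uid in user_ids
        if shard_for(uid, old_count) != shard_for(uid, new_count)
    )


user_ids = range(1000)
moved = moved_keys(user_ids, old_count=4, new_count=5)
print(f"{moved} of {len(user_ids)} users change shard")  # 800 of 1000
```

With a modulo mapping, growing from 4 to 5 servers relocates 80% of the users in this example, which is why adding servers can be painful. Schemes such as a lookup table or consistent hashing are commonly used to reduce the amount of data that has to move.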

Scaling with NoSQL (e.g. Cassandra, Redis, MongoDB, Riak, CouchDB):

Eric Brewer’s CAP theorem: it is impossible to build a distributed system that would simultaneously guarantee Consistency, Availability, and Partition tolerance.

  • Consistency: the same data becomes visible to all of the nodes at the same time.
  • Availability: all available nodes need to process all incoming requests and return a valid response.
  • Partition tolerance: the cluster must continue to work despite any number of communication breakdowns between nodes in the system.

That means only 2 of the 3 attributes can be guaranteed at a time. E.g. MongoDB is a CP data store: it trades availability for consistency. Cassandra is an AP data store: it delivers availability and partition tolerance, but can’t deliver consistency all the time.

Current trend: using functional partitioning of the web services layer and different data stores based on the business needs.

Caching

  • Used to increase performance and scalability because it returns the ready-to-use results;
  • Try to achieve a higher cache hit ratio (the share of requests that are served from the cache instead of being computed again);
  • Caching is good for apps with many reads and may be useless for the apps with many writes;
  • Any caching can be added at a later stage if needed.

HTTP-based caches are read-through caches: a client talks to the cache, and only if the cache can’t respond does it ask the web service.
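A minimal sketch of the read-through idea, which also tracks the cache hit ratio. The class and the `slow_web_service` function are hypothetical; the sketch ignores expiry, whereas real HTTP caches decide what to store and for how long from response headers such as `Cache-Control`.

```python
class ReadThroughCache:
    """Clients talk to the cache; the cache calls the service on a miss."""

    def __init__(self, fetch_from_service):
        self._fetch = fetch_from_service  # invoked only on a cache miss
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self._store:
            self.hits += 1
            return self._store[url]
        self.misses += 1
        response = self._fetch(url)
        self._store[url] = response
        return response

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


service_calls = []


def slow_web_service(url):
    # Stand-in for a real web service; records every call it receives.
    service_calls.append(url)
    return f"response for {url}"


cache = ReadThroughCache(slow_web_service)
for url in ["/products", "/products", "/cart", "/products"]:
    cache.get(url)
print(cache.hit_ratio())  # 0.5: two of the four requests were hits
```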

Types of HTTP-based cache:

  1. Browser cache
  • > We store data in the browser.
  2. Caching proxies
  • > A server usually installed in a local corporate network or by the Internet service provider (ISP).
  3. Reverse proxies (e.g. Nginx)
  • > Placed in your own data center to reduce the load put on your own web servers;
  • > An excellent way to scale.
  4. CDNs
  • > Used to cache static files like images, CSS, JavaScript, videos, PDF (but can also serve dynamic content if needed).

Custom object caches:

  1. Object caches on the client side
  • > Stored on the client’s device.
  2. Caches co-located with code
  • > Located on web servers (FE or BE);
  • > Objects can be cached in: a) the application’s memory/RAM; b) shared memory (multiple processes running on the same machine can access them); c) a caching server deployed on each web server as a separate application (for tiny web apps).
  3. Distributed object caches
  • > Redis, Memcached
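As an example of a cache co-located with code (option a, the application’s own memory), Python’s standard `functools.lru_cache` keeps results inside the process. The `load_user_profile` function is a made-up stand-in for an expensive data store or web service call.

```python
from functools import lru_cache


@lru_cache(maxsize=1024)
def load_user_profile(user_id):
    # Stand-in for an expensive call to a data store or web service.
    return {"id": user_id, "name": f"user-{user_id}"}


load_user_profile(7)  # computed, then stored in the in-process cache
load_user_profile(7)  # served from the cache, the function body is skipped
info = load_user_profile.cache_info()
print(info.hits, info.misses)  # 1 1
```

Such a cache is private to one process, so each web server warms its own copy; a distributed object cache like Redis or Memcached is shared by all servers instead.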

Asynchronous Processing

Synchronous processing: the caller sends a request and waits for the response before continuing its own work. It is hard to build modern responsive apps using synchronous processing alone.

Asynchronous processing: a client can finish its own job without knowing whether the request was processed or not (a “fire-and-forget” principle).

Message queues are an asynchronous processing technology:

  • Message producers: a part of the client code; they create a message and send it to a message queue;
  • Message queues: where messages are sent and buffered for consumers;
  • Message consumers: receive and process messages from a message queue. Types of message consumers: a) cron-like (pull messages from the queue); b) daemon-like (a push model).
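The producer, queue, and consumer roles can be sketched inside a single process with Python’s standard `queue` and `threading` modules. This is only an illustration: the message format and the "send email" task are assumptions, and a real message broker runs as a separate service.

```python
import queue
import threading

message_queue = queue.Queue()  # buffers messages for the consumer
processed = []


def consumer():
    """Daemon-like consumer: waits for messages and processes each one."""
    while True:
        message = message_queue.get()  # blocks until a message arrives
        if message is None:            # sentinel value: stop the worker
            message_queue.task_done()
            break
        processed.append(f"sent email to {message['email']}")
        message_queue.task_done()


worker = threading.Thread(target=consumer, daemon=True)
worker.start()

# Producer: enqueue the work and move on without waiting for the result.
for email in ["a@example.com", "b@example.com"]:
    message_queue.put({"email": email})

message_queue.put(None)  # ask the worker to stop
message_queue.join()     # only this demo waits, to inspect `processed`
```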

Messaging platforms:

  • Amazon Simple Queue Service (SQS) (simple, pragmatic; a good solution for early-stage startups);
  • RabbitMQ (provides many features (incl. complex routing), rather simple, flexible);
  • ActiveMQ (Java-based, much lower latency, less flexible routing, can be sensitive to large spikes of messages being published).

Event-driven architecture

  • Not a request/response model: components announce events that have already happened (instead of requesting work to be done);
  • An event is an object or a message that indicates something has happened;
  • Publishers and consumers don’t know anything about each other; they know just the format and meaning of the event message.
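A minimal in-process sketch of this publish/subscribe style follows. The `EventBus` class and the `order_placed` event are hypothetical names; in a real system the events would usually travel through a message broker rather than direct function calls.

```python
from collections import defaultdict


class EventBus:
    """Publishers and subscribers share only event names and payloads."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # The publisher announces what happened; it does not know who,
        # if anyone, reacts to the event.
        for handler in self._subscribers[event_type]:
            handler(payload)


bus = EventBus()
log = []
bus.subscribe(
    "order_placed",
    lambda event: log.append(f"email receipt for order {event['order_id']}"),
)
bus.subscribe(
    "order_placed",
    lambda event: log.append(f"update stock for order {event['order_id']}"),
)
bus.publish("order_placed", {"order_id": 42})
```

New consumers can be added by subscribing to the event, without changing the code that publishes it.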

Searching for data

  • Without an index, a search is a full table scan (you need to scan the entire data set to find the row you are looking for);
  • Indexes are used to speed up the search: like the index of a book, they point straight to the matching rows.

  • As for data models, a relational data model is the representation of tables that have relations. In a nonrelational data model, you focus on use cases and design the corresponding queries, e.g. return a collection of products (and that’s usually a JSON with the list of products).
  • It’s recommended to use search engines for complex search queries. They usually use an inverted index, which allows searching for phrases or individual words. Ready-to-use search engines include Amazon CloudSearch, Azure Search, Elasticsearch, Solr, and Sphinx.
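The inverted index mentioned above can be sketched in a few lines: it maps each word to the set of documents that contain it. The sample documents and function names are made up, and this version only matches individual words; real search engines also store word positions (for phrase search), apply stemming, and rank the results.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map every word to the ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


documents = {
    1: "red running shoes",
    2: "blue running jacket",
    3: "red winter jacket",
}
index = build_inverted_index(documents)
print(search(index, "running"))     # {1, 2}
print(search(index, "red jacket"))  # {3}
```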

And more…

Scalability is not just about the architecture, it’s also about:


  • Automation of various processes (whether it’s testing, build and deployment processes, monitoring and alerting, or log aggregation);
  • Scaling yourself:
  • > Work smarter, not harder;
  • > Avoid overtime as it leads to mental problems and burnout;
  • > Manage your tasks by prioritization, understand their real value;
  • > Build simple, minimalistic functionality;
  • > Delegate;
  • > Share knowledge, collaborate;
  • > Use 3rd-party services and don’t reinvent the wheel;
  • > Negotiate deadlines;
  • > Release in small chunks, gather feedback, don’t develop in a vacuum;
  • > Create small cross-functional autonomous teams of 4–9 people for particular product areas (e.g. a team around a checkout functionality);
  • > Keep all your project procedures and standards flexible, as rigid ones restrain creativity and innovation;
  • > Align teams, set common goals, and build a good engineering culture;
  • > And much more useful advice!

There is still so much to learn.
