Painless Serverless for Enterprise
Abstract
Interest in serverless architectures is rapidly increasing. The promises of cost reduction, high availability and native autoscaling are too tempting to ignore. But is it also a stable technology, ready for the enterprise world? What are the use cases, and what are the advantages and disadvantages? We are going to design an architecture relying on AWS services: API Gateway, Lambda functions, Step Functions, Aurora DB and S3. The code will be written in Python, using some of the main frameworks such as SQLAlchemy and Marshmallow.
Introduction
According to observers, the interest around serverless architectures will grow enormously in the next 12-18 months. Just a few weeks ago, O'Reilly published its Serverless Survey 2019, which showed that companies are mostly focused on cost reduction and autoscaling features. But results do not always come quickly: the survey reveals that companies that adopted serverless architectures more than three years ago consider it a successful choice, while those that have not yet adopted it have been held back by doubts about security issues or a lack of skills. We want to investigate further by studying a possible implementation.
IT has always chased the business and its speed of evolution, lagging behind. A software architecture should be agile and flexible, able to adapt easily to changes. These goals are achievable only by adopting patterns and paradigms that allow:
- decomposing applications into simple objects, easily maintainable and adaptable, generic and reusable.
- decoupling application components so that they are independent
The most popular synthesis of these concepts today is microservices. Serverless architectures push these concepts to an even more advanced level and, in addition, make scalability and high availability easy to achieve, turning the software into a truly flexible object. Each cloud vendor has its own proposal in this area (Google Firebase, IBM BlueMix OpenWhisk, Azure Functions): a wide range of possibilities. In this article, we will focus on the services offered by AWS.
Architecture
Let's assume that we have to create an application for product management. We want to perform the basic CRUD operations on products and to attach one or more documents to each product.
The application will be split into three tiers, user interface, business logic and the data layer, according to the n-tier model.
In the Data layer we are going to use the following managed services:
- A MySQL instance of Aurora as database
- S3 as documents repository
Business logic is implemented in the backend layer using several Python Lambda functions exposed through a REST interface built with API Gateway.
Finally, the UI layer is completely decoupled and can be implemented with any technology.
The REST API will eventually also be used for any future system integration. The choice of managed services gives us two key features out of the box at every layer: high availability and autoscaling. In an IaaS approach we would have to design an HA solution and manage autoscaling ourselves to get a highly available, fault-tolerant and load-adaptive system. These are not trivial issues and must be taken into consideration when comparing different solutions' operating costs. Let's analyse every tier following a bottom-up path. We will leave out just the front-end layer, which is completely decoupled and independent of the other layers.
Data Layer
Aurora is the AWS serverless relational database service. It is compatible with MySQL and PostgreSQL and uses a distributed, fault-tolerant storage system with automatic repair. It offers more than 99.99% availability (no more than about 52 minutes of downtime per year), keeps 6 replicas of the data across 3 different availability zones and continuously backs up data to S3. This means we get a highly available database without worrying about the design and maintenance of the complex infrastructure necessary to guarantee such reliability. We'll pick the MySQL flavour and request a minimal sizing to keep development costs extremely low.
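As a minimal sketch (the endpoint, credentials and driver choice here are hypothetical), the SQLAlchemy engine that our Lambda functions will use later could be configured against the Aurora MySQL endpoint like this:

# Minimal sketch: creating the SQLAlchemy engine against the Aurora MySQL endpoint.
# Host, credentials, database name and pool sizing are hypothetical placeholders.
from sqlalchemy import create_engine

engine = create_engine(
    "mysql+pymysql://app_user:app_password@aurora-cluster.cluster-xxxx.eu-west-1.rds.amazonaws.com:3306/products_db",
    pool_size=1,          # keep the pool tiny: each Lambda container handles one request at a time
    pool_recycle=3600,    # recycle connections to avoid stale sockets between invocations
    echo=False,
)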
The data model is trivial, made of just two tables. The product table contains an id column as primary key and a name column (plus an isDeleted flag used for soft deletion). The attachments table stores the metadata of the files attached to products.
Files, as anticipated, will be stored in a dedicated S3 bucket that we set up.
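A minimal sketch of how an attachment could be stored and retrieved with boto3; the bucket name and the key scheme are our own assumptions:

# Minimal sketch: storing and fetching an attachment on S3 with boto3.
# Bucket name and key scheme are hypothetical.
import boto3

s3 = boto3.client("s3")
BUCKET = "painless-serverless-attachments"

def put_attachment(product_id, file_name, content, mime_type):
    key = f"products/{product_id}/{file_name}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=content, ContentType=mime_type)
    return key

def get_attachment(key):
    obj = s3.get_object(Bucket=BUCKET, Key=key)
    return obj["Body"].read()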
Backend
Once the data layer is completed, we can analyze the backend layer. This tier can be decomposed into four further layers, each with a specific role according to the single responsibility principle:
Model & ORM deals with the definition of the data models and manages sessions and persistence through an ORM tool. It ensures efficient access to the database.
DTO maps model objects (non-serializable) to serializable objects easily reusable in the other application layers (REST API and UI).
Business Logic contains only the application logic.
REST API finally exposes an interface according to RESTful conventions.
Naturally, the back-end components will be serverless too. The REST API is implemented using the AWS API Gateway service, the other layers using the AWS Lambda service. Lambda is a fully managed service that executes code in response to various events. It's scalable, efficient and highly available. It natively supports many programming languages, such as Java, Go, PowerShell, Node.js, C#, Python and Ruby, and allows you to define up to five Lambda Layers per function. A layer is a tool that allows you to integrate code, packages, frameworks, libraries and other dependencies. Layers help to build a very complex project by breaking it down into simpler and independent modules. This is a key feature for enterprise applications.
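To give an idea of the programming model, here is a deliberately minimal handler, together with a comment recalling the directory convention Python layers follow; everything beyond the handler signature is illustrative:

# Minimal sketch of the Lambda programming model.
# A Python layer is a zip archive whose content lives under a "python/" folder:
#
#   layer.zip
#   └── python/
#       ├── sqlalchemy/ ...
#       └── marshmallow/ ...
#
# At runtime that folder is added to the import path, so the function
# imports the packages as usual.
import json

def handler(event, context):
    # event carries the request data, context the runtime metadata
    return {
        "statusCode": 200,
        "body": json.dumps({"message": "hello from Lambda"}),
    }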
Why not Java
Initially, we assessed the adoption of Java because of the team members' skills and because Java has an indisputable enterprise vocation. The idea was, of course, to use JPA (Jakarta Persistence) as the persistence API, possibly through Micronaut Data, a toolkit for accessing data with extremely innovative and interesting features:
- native support for serverless computing
- absence of a runtime model for modeling relationships between entities
- queries generated at compile time and not translated at runtime
- absence of runtime proxies or reflection
Micronaut Data supports Java (both JPA and JDBC flavours), Groovy and Kotlin. It seems the ideal choice for a serverless context. However, the announcement of the stable 1.0 version is very recent (August 2019) and we preferred not to use it in an enterprise operating context. It remains an extremely interesting project, as is the whole Micronaut galaxy (a full-stack framework for building modular, easily testable microservice and serverless applications).
Additional considerations regarding the use of Java in serverless contexts:
- additional JVM overhead
- worse performance in case of "cold start"
- larger packages size
- higher memory consumption (which is reflected in operating costs)
- higher latency (which is reflected in operating costs)
led us to consider adopting Python as an alternative.
Why Python
Python is an extremely powerful and versatile programming language. Its concise syntax often helps to increase productivity and to write more readable, more maintainable and overall better code. It is open source, stable (the first version dates back to 1991), cross-platform and not hard to learn. Python, like Java, compiles the source code into bytecode, which is then interpreted by the PVM. The software is portable and performs better than a purely interpreted language (the bytecode generated after the first execution is reused), and the distribution of closed-source bytecode is also allowed. The huge amount of standard and third-party libraries is an added value: Python gets wide support and contributions both from independent developers and from companies such as Microsoft, IBM, Google, Facebook, Dropbox and many others.
We are going to use in our project some of these libraries:
- SQLAlchemy is a SQL and ORM toolkit that allows high-performance access to the database by implementing the major persistence patterns. From the official website we learn that it is widely used in enterprise contexts by brands like Yelp, Reddit and Dropbox.
- Marshmallow is a library for converting complex objects into native data types and is useful for easy serialization and deserialization of objects.
Finally, the adoption of Python is also an educational opportunity not to be missed, allowing the development team to acquire new skills.
And there must also be a reason why 47% of all Lambda functions are currently written in Python, and another 39% in Node.js.
All these features make Python certainly suitable even for extremely complex enterprise applications, as Google, YouTube, Merrill Lynch, Cisco, VMware and Philips have widely demonstrated.
The implementation
Backend: Model & ORM
The first step is to build the model objects that map our database tables. We'll work with workbench_alchemy, a scaffolding tool that automatically creates SQLAlchemy model objects from a MySQL Workbench project. It can also create the tables on the database, saving us a bit of work. The project is a few years old and, according to the author, it is currently under revision; it seems not to have been updated for a while. In any case, it does everything we need.
Here are the generated Product and Attachment classes:
# minimal imports needed by the generated classes
from sqlalchemy import Column, ForeignKey
from sqlalchemy.dialects.mysql import INTEGER, TINYINT, VARCHAR
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

DECLARATIVE_BASE = declarative_base()


class Product(DECLARATIVE_BASE):
    __tablename__ = 'products'
    __table_args__ = (
        {'mysql_engine': 'InnoDB', 'mysql_charset': 'utf8'}
    )

    id = Column(INTEGER, autoincrement=True, primary_key=True, nullable=False)  # pylint: disable=invalid-name
    name = Column(VARCHAR(45), nullable=False)
    isDeleted = Column(TINYINT, default=0, nullable=False)

    attachments = relationship("Attachment", back_populates="product")

    def __repr__(self):
        return self.__str__()

    def __str__(self):
        return "<Product(%(id)s)>" % self.__dict__


class Attachment(DECLARATIVE_BASE):
    __tablename__ = 'attachments'
    __table_args__ = (
        {'mysql_engine': 'InnoDB', 'mysql_charset': 'utf8'}
    )

    id = Column(INTEGER, autoincrement=True, primary_key=True, nullable=False)  # pylint: disable=invalid-name
    fileName = Column(VARCHAR(255))
    mimeType = Column(VARCHAR(50))
    isDeleted = Column(TINYINT, default=0, nullable=False)
    products_id = Column(INTEGER, ForeignKey("products.id"), index=True, nullable=False)

    product = relationship("Product", foreign_keys=[products_id], back_populates="attachments")

    def __repr__(self):
        return self.__str__()

    def __str__(self):
        return "<Attachment(%(id)s)>" % self.__dict__
The tool also created the properties needed to map the relationship between the tables. The SQLAlchemy relationship API describes the relationship between the Product and the Attachment classes: in the first we find the attachments property containing all the files linked to the product, in the second the product property containing the parent product.
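As a quick illustration (assuming a session created as shown later in the handler code), the relationship can be navigated in both directions:

# Minimal sketch: navigating the relationship from both sides.
# Assumes a SQLAlchemy session created as shown later in the handler code.
product = session.query(Product).get(1)
for attachment in product.attachments:        # Product -> Attachments
    print(attachment.fileName, attachment.mimeType)

attachment = session.query(Attachment).get(1)
print(attachment.product.name)                # Attachment -> parent Product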
Backend: DTO
Model objects, defined as we have just seen, cannot be serialized. In order to move information between front-end and back-end, we need objects that can be serialized to JSON. We therefore introduce serializable counterparts of the model objects, according to the DTO pattern. Mapping, serialization and deserialization will be delegated to marshmallow, a library for the bi-directional conversion of complex data types to native Python data types. Here are the DTO objects that we need (called Schemas in marshmallow):
from marshmallow import Schema, fields


class ProductDTO(Schema):
    id = fields.Int()
    name = fields.Str()
    isDeleted = fields.Bool()
    # two-way nesting is declared via class names to avoid forward references
    attachments = fields.List(fields.Nested("AttachmentDTO"))


class AttachmentDTO(Schema):
    id = fields.Int()
    fileName = fields.Str()
    mimeType = fields.Str()
    isDeleted = fields.Bool()
    product_id = fields.Int()
    content = fields.Str()
    # the back-reference excludes "attachments" to break the recursion
    product = fields.Nested("ProductDTO", exclude=("attachments",))
At this level the relationships between the objects are defined through Schema nesting. The code is more readable and the properties are immediately bound to native types: ProductDTO has a nested AttachmentDTO fields.List for the attachments, and AttachmentDTO has a nested ProductDTO Schema for its product. Marshmallow is an extensible, very powerful and well documented tool. Like SQLAlchemy, it is another key component for building a structured, flexible and robust enterprise architecture.
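A short sketch of how these Schemas are used in practice; the data values are made up and the classes are those defined above:

# Minimal sketch: serializing model objects and validating incoming JSON
# with the Schemas defined above (values are made up).
products_dto = ProductDTO(many=True)
serializable = products_dto.dump(
    [Product(id=1, name="Widget", isDeleted=0)]
)   # -> list of plain dicts, ready to be returned as JSON

payload = {"name": "New product", "isDeleted": False}
loaded = ProductDTO().load(payload)   # -> validated native data types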
Backend: Business Logic
The business logic of our REST services, as already said, is implemented using Lambda functions. The first issue to solve is how to structure and organise the services (and the functions). In the "monolithic" approach, a Lambda function is seen as the component that has complete responsibility for the actions that can be performed on a resource. So there will be a single service invoked whenever we need to perform any operation on the product resource. This implies that the service must be able to discriminate which action (HTTP verb) we actually want to perform:
method = event['context']['http-method']
if method == 'DELETE':
    # ...
elif method == 'GET':
    # ...
elif method == 'POST':
    # ...
elif method == 'PUT':
    # ...
Those who prefer this approach generally argue that it:
- reduces the number of functions
- facilitates the organization of the code (fewer objects to manage)
On the other hand, the code is less elegant, more confusing and certainly less maintainable, and things get worse as the development team grows. We don't want that. We should instead split the code into small units in order to reduce the risk of conflicts and to encourage the parallelization of work. With a monolithic function, every developer has to deal with more lines of code and it becomes a little harder to find the portion to work on. It also significantly slows down service execution, especially in the case of a cold start. When a Lambda function is invoked for the first time, behind the scenes a container is created for its execution, the code is downloaded from S3, then loaded and executed. Once the execution is complete, the container is kept alive for a while in case the same function is requested again. This is why the first execution is always slower and is called a "cold start". So the shorter the code, the faster the cold start: monolithic functions inevitably lead to longer cold starts.
The opposite approach involves specialized functions, each responsible for a single action (HTTP verb). The number of functions inevitably increases, but each one contains a specific portion of code. It is easy to catalog and find them using naming conventions and classification tools (such as tags). This approach increases fragmentation but respects the single responsibility principle and helps to write shorter, simpler and more specialized code that is easy to maintain. It also helps in parallelizing development activities and reduces cold start latencies. The following could be the implementation of the Lambda function that returns the products' list and is executed in response to the invocation of the GET method of the resource .../v1/products
import logging

from sqlalchemy.orm import sessionmaker
from sqlalchemy.exc import DBAPIError, SQLAlchemyError
from sqlalchemy import text

from custom_library import *
from dto import *
from model import *

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)


def handler(event, context):
    # dump the values received in the request event
    logger.debug(f'Request event: {event}')
    logger.debug('name-' + event['name'] + '-')
    headers = event['headers']
    logger.debug('user-' + headers['user'] + '-')
    logger.debug('role-' + headers['role'] + '-')

    # define the standard response
    response = {
        'statusCode': 200,
        'products': 'error message or json created',
    }

    session = None
    try:
        products_dto = ProductDTO(many=True)

        # create a session (the engine is defined in the model layer)
        Session = sessionmaker(bind=engine)
        session = Session()

        filter = "isDeleted=0"
        if event['name']:
            filter = filter + " and name like :name"

        resultset = session.query(Product).filter(text(filter)).params(name='%' + event['name'] + '%').all()
        productsList = products_dto.dump(resultset)
        logger.debug('Products retrieved -- %s' % len(resultset))
    except (DBAPIError, SQLAlchemyError, Exception) as ex:
        logger.error(type(ex))   # the exception instance
        logger.error(ex.args)    # arguments stored in .args
        logger.error(ex)         # __str__ allows args to be printed directly
        raise Exception('500 Bad request: Internal Server error')
    finally:
        if session:
            session.close()

    return {
        'statusCode': 200,
        'rows': len(productsList),
        'products': productsList
    }
In the first lines we define the dependencies to be imported: the SQLAlchemy components that we need, the model and DTO objects and the other project utility components. These resources are included in two Lambda Layers: the first contains all third-party libraries and external dependencies, the second contains internal dependencies, components and project libraries. As already said, Layers allow us to integrate libraries and dependencies so that we can design structured enterprise applications. The handler method is the function entry point. It is invoked every time we request the resource .../v1/products via HTTP GET.
Here is the main logic:
products_dto = ProductDTO(many=True)
The marshmallow Schema that will be used for serialization is instantiated.
# create a session
Session = sessionmaker(bind=engine)
session = Session()
The SQLAlchemy sessionmaker factory produces a configured Session class; calling it instantiates the Session, which is responsible for database communication and provides methods and constructs for querying.
filter = "isDeleted=0" if event['name']: filter = filter + " and name like :name" resultset = session.query(Product).filter(text(filter)).params(name='%' + event['name'] + '%').all()
The query() method invocation results in the creation of a Query object, a very powerful object that can manage a variable and combinable number of arguments and methods to obtain complex queries with filters, sorts, joins, unions, projections, aggregations, groupings and so on. Please refer to the documentation to learn more.
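Purely as an illustration, reusing the session and the model classes defined earlier, a slightly richer query joining products with their attachments, filtered, sorted and paginated, could look like this:

# Purely illustrative: a richer query combining join, filter, ordering and pagination.
# Assumes the session, Product and Attachment defined earlier.
from sqlalchemy import desc

rows = (
    session.query(Product)
    .join(Attachment, Attachment.products_id == Product.id)
    .filter(Product.isDeleted == 0, Attachment.mimeType == "application/pdf")
    .order_by(desc(Product.name))
    .limit(20)
    .all()
)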
productsList = products_dto.dump(resultset)
The session.query().all() method returns a list of model objects.
We then use marshmallow, calling the dump method of the previously defined Schema. This way we get a list of serialized, JSON-ready objects that we can return to the API Gateway that invoked our Lambda function. We are going to see that in a while:
return {
    'statusCode': 200,
    'rows': len(productsList),
    'products': productsList
}
Now we only have to define our REST API.
Backend: REST API
API Gateway is a fully managed AWS service for creating HTTP, WebSocket and REST API interfaces, and it supports the OpenAPI standard. We can easily define the API interface using a JSON description file. Sometimes we have to deal with non-standard APIs for various reasons, especially in enterprise contexts.
In my humble opinion, moving away from the conventions is always a bad idea. A RESTful API should always behave as expected, especially if other developers will deal with our services. We absolutely want to be as RESTful compliant as possible. So here is the definition, complete with:
- route
- allowed HTTP verbs
- input and output parameters definition
- request and response template
- consumed and produced json objects
- responses
- lambda functions to invoke
Here are only the significant parts:
{ "swagger": "2.0", "info": { "version": "2019-11-27T13:11:08Z", "title": "painlessServerless" }, "host": "xxxxxxxx.execute-api.yyyyyyyyyy.amazonaws.com", "basePath": "/painlessServerless", "schemes": [ "https" ], “paths”: { "/v1/products": { "get": { "consumes": [ "application/json" ], "produces": [ "application/json" ], "parameters": [ { "name": "name", "in": "query", "required": false, "type": "string" } ], "responses": { "200": { "description": "200 response", "schema": { "$ref": "#/definitions/responseProducts" }, "headers": { "Content-Type": { "type": "string" } } } }, "x-amazon-apigateway-integration": { "uri": "arn:lambdaFunctionGetProduct", "responses": { "default": { "statusCode": "200" } }, "passthroughBehavior": "never", "httpMethod": "POST", "requestTemplates": { "application/json": "{\n\"body\" : $input.json('$'),\n\"name\": \"$input.params('name')\",\n\"headers\": {\n #foreach($param in $input.params().header.keySet())\n \"$param\": \"$util.escapeJavaScript($input.params().header.get($param))\" #if($foreach.hasNext),#end\n \n #end \n }\n}" }, "contentHandling": "CONVERT_TO_TEXT", "type": "aws" } }, }, [...] “definitions”: { "Empty": { "type": "object", "properties": { "property": { "type": "string" } }, "title": "Empty Schema" }, "responseProducts": { "type": "object", "properties": { "statusCode": { "type": "integer" }, "rows": { "type": "integer" }, "products": { "type": "array", "items": { "$ref": "#/definitions/product" } } }, "title": "responseProducts" }, "product": { "type": "object", "required": [ "name" ], "properties": { "id": { "type": "integer" }, "name": { "type": "string", "maxLength": 45 }, "isDeleted": { "type": "boolean" } }, "title": "product" } }, [...] }
Orchestration
Beyond the classical CRUD operations, an application generally needs an orchestration layer to run services in sequence, compose the results, retry failed operations, and manage error cases and service timeouts. AWS Step Functions is a fully managed service that helps coordinate multiple AWS services into serverless workflows. It makes it possible to coordinate the execution of various Lambda functions in order to perform a complex operation composed of simple tasks, managing timeouts, retries and logging. It introduces another decoupling level that makes our design more flexible and adaptable to future evolutions. It also moves part of the business logic to another service, increasing the fragmentation of the code. Furthermore, this logic is written in ASL (Amazon States Language), a JSON-like language that describes the behavior of the Step Function. This sounds like vendor lock-in.
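To give a flavour of ASL, here is a minimal, hypothetical state machine chaining two Lambda functions with a retry policy, created through boto3; all ARNs and names are placeholders:

# Minimal sketch: a two-step ASL workflow with a retry policy, created via boto3.
# Function ARNs, role ARN and names are placeholders.
import json
import boto3

definition = {
    "StartAt": "CreateProduct",
    "States": {
        "CreateProduct": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:createProduct",
            "Retry": [{"ErrorEquals": ["States.TaskFailed"], "MaxAttempts": 2}],
            "Next": "AttachDocument",
        },
        "AttachDocument": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:attachDocument",
            "End": True,
        },
    },
}

sfn = boto3.client("stepfunctions")
sfn.create_state_machine(
    name="productWorkflow",
    definition=json.dumps(definition),
    roleArn="arn:aws:iam::123456789012:role/stepFunctionsRole",
)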
Build & Deploy Automation
Build and deploy automation is the last feature we should talk about. Any project needs automation tools to manage the lifecycle phases: build, test and deploy. These tools greatly increase the quality of the delivered software and lighten the developers' workload. They are truly indispensable in cloud systems, where we would otherwise have to deal with unfriendly Command Line Interface (CLI) tools or with uncomfortable, slow and repetitive web interfaces. As seen in this small project, the number of components tends to rise very quickly: we can expect several dozen Lambda functions, some Lambda Layers, a dozen Step Functions, the REST API... We also have to manage the versioning and the cross dependencies between Lambda functions and Lambda Layers: when we release a Lambda function we want it hooked to the latest layer versions, and when we release a new version of a layer we want the Lambda functions to be updated to use the just-released layer version, as in the sketch below. And we have to take care of this in three environments. You don't want to do it by hand!
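To make the point concrete, here is a sketch of what hooking a function to a freshly published layer version boils down to with boto3; all names are placeholders, and this is exactly the kind of repetitive task we want the pipeline to perform for us:

# Sketch of the cross-dependency problem: after publishing a new layer version,
# every function using it must be updated to the new ARN. Names are placeholders.
import boto3

lambda_client = boto3.client("lambda")

new_layer = lambda_client.publish_layer_version(
    LayerName="painless-internal-libs",
    Content={"S3Bucket": "painless-artifacts", "S3Key": "layers/internal-libs.zip"},
    CompatibleRuntimes=["python3.8"],
)

lambda_client.update_function_configuration(
    FunctionName="getProducts",
    Layers=[new_layer["LayerVersionArn"]],
)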
There is a very wide range of options to address these issues: GitLab, Jenkins, the deployment automation services of the cloud providers (CodeDeploy for AWS). The choice may depend on many factors: the context, the platforms involved, the TCO. In our case we chose Azure DevOps because it is a corporate tool. Azure DevOps has been paired with Terraform to automate the cloud-side operations. Terraform is a very powerful tool that allows us to build, modify and update cloud infrastructures in a simple and efficient way using the Infrastructure as Code (IaC) paradigm. It is a must-have tool for the deployment of enterprise projects in the cloud.
Final considerations
We have reached the final considerations about the architecture we have designed and tested. What impressions do we have? Were we able to answer the initial questions? We covered a lot of topics and technologies and introduced a lot of tools, and we couldn't go deeply into everything here. Fortunately, we had the opportunity to try it "on the road", because we had to implement this architecture for a real use case.
The serverless paradigm is mature and stable. The programming language we chose, Python, is stable too: it is a thirty-year-old project, developed, maintained and documented by a large community of users that produces an impressive number of components and frameworks. The libraries and tools we used are reliable and stable; they are indeed de facto standards. These are surely enough reasons to consider these technologies widely adoptable in the enterprise world. The decoupling and the code specialization allow us to efficiently parallelize the activities between team members, reducing delivery times and costs. The use of fully managed, reliable and efficient services simplifies the implementation, especially if we consider that we are able to build highly available and scalable applications without any effort on our part. This is not a detail: designing highly available and autoscaling solutions is not a trivial topic, and we need to keep this in mind when comparing solutions. Serverless architectures also significantly reduce the TCO, since they are pay-per-use and there are no unnecessary infrastructure costs. The cloud provider charges just the execution costs and we don't pay for an idle infrastructure. Here's what we paid for a month of development on this project:
Let's consider an IaaS solution for the same project over the same period: a t2.large EC2 instance turned on for about 10 hours a day, 5 days a week, costs something like $34 (almost 13 times more expensive). It's also true that IaaS solutions can be optimized and there are large margins for savings, but the fact remains that you also pay for idle systems and that you won't get autoscaling and high availability out of the box. Serverless computing also gives:
- short time to market
- autoscaling
- high availability
- short delivery times
These are very important advantages, hard to quantify in terms of savings.
Certainly the serverless model will help to push towards truly fluid, dynamic and flexible systems: strongly decoupled systems, based on reusable services that can be easily recomposed to meet future needs, to build new functions, or to update existing ones without changing the whole infrastructure.
The IT companies that will be able to exploit this potential will make a difference in the near future.
Maybe the cloud is not just someone else's PC.
References
https://www.oreilly.com/radar/oreilly-serverless-survey-2019-concerns-what-works-and-what-to-expect/
https://www.oreilly.com/radar/what-is-next-architecture/
https://www.fullstackpython.com/enterprise-python.html
https://www.javaworld.com/article/2078655/python-coming-to-the-enterprise--like-it-or-not.html
https://www.datadoghq.com/state-of-serverless/
https://techbeacon.com/enterprise-it/essential-guide-serverless-technologies-architectures
https://www.cloudtp.com/doppler/is-serverless-ready-for-the-enterprise/
https://cloudacademy.com/blog/austin-collins-serverless-framework-interview/
https://medium.com/@mwaysolutions/10-best-practices-for-better-restful-api-cbe81b06f291
https://en.wikipedia.org/wiki/AWS_Lambda
https://aws.amazon.com/it/blogs/compute/container-reuse-in-lambda/
https://docs.aws.amazon.com/lambda/latest/dg//running-lambda-code.html
https://marshmallow.readthedocs.io/en/stable/
https://micronaut-projects.github.io/micronaut-data/latest/guide/
https://medium.com/the-theam-journey/benchmarking-aws-lambda-runtimes-in-2019-part-i-b1ee459a293d
https://github.com/PiTiLeZarD/workbench_alchemy/blob/master/README.md