Serverless AI Systems. Multicloud Orchestration
Serverless computing and functional programming are fundamentally changing the way we build and deploy applications. The modern world is unthinkable without distributed intelligent systems and platforms that provide services such as queues, APIs, gateways, and authentication tools. Function-based serverless computing and workflow applications have made significant progress in the last year, and container technologies are still under active development. Even the simplest mobile application has an API that connects to cloud storage. However, the design of distributed systems is still an art, not an exact science. The need to build on a solid foundation is long overdue, and if you want the confidence to build, maintain, and operate distributed systems, start with this book!
In this book, leading expert Prof. Nikolay Raychev sets out the minimum you need to properly design and implement scalable Machine Learning and Deep Learning models using serverless architectures for distributed systems. The book discusses the basic models for designing distributed systems. This will help you not only to create such systems from scratch, but also to upgrade existing ones effectively.
Today, almost every developer is both a creator and a user of distributed systems. Even relatively simple mobile applications rely on cloud APIs to make data available on any device the customer wants to use. Whether you are new to distributed systems development or a battle-hardened veteran, the models and components described in this book can help turn such systems from an art into a science. Reusable components and distributed system design patterns allow you to focus on the critical details of your application. This book will help any developer create distributed systems better, faster, and more efficiently.
Why I wrote this book
During my career as a software developer, from web applications and bots, through ERP applications, to automated cloud systems with artificial intelligence, I have built many scalable, reliable distributed systems. Although their principles and operating logic often coincide, template solutions and reusable components are not easy to implement. As a result, I wasted time building systems whose quality could have been better than it turned out to be.
Recent container technologies and the tools for orchestrating them have radically changed the direction in which distributed systems are developing. We now have an object and an interface that allow us to express the basic design patterns of distributed systems and to compose containerized components. I wrote this book so that we have a common language and a common standard library with which we can build better systems in less time.
All roads lead to clouds
2020 was an exciting year for cloud service providers. Clouds were needed not only by start-ups and technology companies, but also by government agencies, hospitals, banks, and insurance companies. The trend will continue in 2021, with companies of all sizes moving or planning to move to the cloud. Gartner forecasts 17% growth in cloud revenue in 2021. If your company wants to become a market leader, you may need to rethink your cloud computing strategy. If you are a software developer, 2021 will be a good time to get acquainted with the cloud technologies of Amazon, Microsoft, and Google.
Amazon AWS is the leader, but other vendors will continue to grow.
2019 and 2020 were Amazon's years, and 2021 will be the same.
Microsoft will focus on the corporate market. In 2019 it won the Pentagon's JEDI contract, and this deal, worth $10 billion, will accelerate the adoption of Azure.
Google is promoting the Cloud Native Computing Foundation and will facilitate cloud migration for small businesses and for those already partnering with AWS.
In 2021, many multi-cloud start-ups will appear, and competition will bring lower prices and more innovation in this area.
Containerization: Kubernetes will get better
The battle between Kubernetes, Docker Swarm, and Mesos is over. Kubernetes won.
Kubernetes' popularity is growing exponentially, which will keep it strong in 2021. Meanwhile, Docker's enterprise business was sold to Mirantis. A few years ago the world revolved around Docker; now it revolves around Kubernetes. Docker tried to monetize its efforts too late, when the industry had already shifted to competitors. This once again shows that in technology, timing is everything.
Software architecture. Microservices are and will be widespread
In the world of cloud platforms, microservices will dominate. For Cloud Native, this is the ideal architecture for rapid development.
Python's popularity will continue to grow
Thanks to machine learning, data analysis and processing, the web, enterprise software development, and even black hole imaging, Python will be used almost everywhere.
In 2021, Python will become either the most popular language or the second most popular after Java.
It should be noted that in 2019 Python's popularity doubled (from 5% to 10%).
Python will continue to grow in 2021, displacing C and Java. JavaScript, another ubiquitous programming language, is facing a downward trend. Why is Python so popular? It has lowered the barrier to entry into programming, has a great community, and is popular with the coming generation of scientists and data specialists.
Programming (Enterprise): Java and JVM will remain in place.
According to the TIOBE index, Java is the most popular language on the planet and will remain so in 2021. The JVM is one of the best pieces of software available today, and many languages, such as Kotlin, Scala, Clojure, and Groovy, run on it. Recall that in 2018 Oracle changed the licensing of its JDK so that commercial use requires a paid subscription.
Java Enterprise: Spring is here
Years ago, Spring Framework and Java Enterprise Edition (Java EE) competed fiercely in enterprise software development. But Oracle's inactivity on Java EE cost it the lead. This led to the MicroProfile initiative and eventually to Jakarta EE, for which Oracle opened the Java EE source code.
While the community was occupied with Java EE, Spring Framework won the war of the JVM enterprise frameworks through very active development and responsiveness to the changing environment.
Two very attractive projects are under development that make Java smaller, faster, and a good choice for serverless computing: Micronaut and Quarkus. Both target GraalVM and should gain ground in 2021.
Programming: Rust, Swift, Kotlin, TypeScript will make a breakthrough
At the beginning of the 21st century, the language landscape stagnated. Most people thought that there was no need for a new programming language, since Java, C, C++, JavaScript, and Python met every need. Google opened the door to new programming languages with Go. Over the last decade, many interesting languages have appeared, such as Rust, Swift, Kotlin, and TypeScript. One reason is that existing languages often fail to take advantage of recent hardware changes (e.g. multi-core processors, fast networks, clouds). Another is that modern languages focus on developer ergonomics, i.e. faster and easier development.
Microsoft recently announced that it is actively exploring Rust for developing secure software. At the same time, Amazon announced that it will sponsor the development of the language.
Kotlin also became a major competitor to Java in the JVM world when Google announced official support for Kotlin on Android.
Angular uses TypeScript as its primary programming language instead of standard JavaScript. Other JavaScript frameworks such as React and Vue have also begun to offer more TypeScript support.
The trend will continue in 2021, with many other giants likely to take a closer look at next-generation programming languages such as Rust, Swift, TypeScript, and Kotlin, and to state their support openly.
Web: JavaScript will continue to dominate
JavaScript was once not a strong enough programming language, and frontend applications were developed mainly with backend technologies and server-side rendering, such as JSF, Ruby on Rails, Django, and Laravel. This changed for good once AngularJS came on the scene in the early 2010s. Since then, many other JavaScript web frameworks have appeared (Angular 2+, React, Vue.js, Meteor.js), and JavaScript is now the main technology for web development. With the many innovations in JavaScript frameworks and the advent of microservice architecture, JavaScript will dominate frontend development even more in 2021.
JavaScript Web Framework: React Rocks
Although it appeared after AngularJS, React had the biggest impact on web development in the last decade and helped Facebook in its fight against Google+. React brought fresh and innovative ideas to frontend development, for example event sourcing, the virtual DOM, one-way data binding, and component-based development. This affected the community so strongly that Google abandoned AngularJS and completely rewrote it as Angular 2+, borrowing ideas from React. React is the most dominant and stable JavaScript web framework.
Facebook has also announced the React Fiber project, a complete rewrite of React's core.
In 2021, React should be the default web framework for new projects. What about other frontend frameworks such as Angular (Angular 2+) and Vue? Angular is a stable web development framework, especially suitable for enterprise use. I am confident that Google will actively invest in Angular in the coming years. Vue is another very popular web framework, maintained by the community and backed by several giant Chinese corporations. If you already use Angular or Vue, there is no need to switch to React in 2021.
Application Development: Native for Enterprises
In mobile development, the buzz around hybrid and cross-platform applications has subsided a bit. Developing cross-platform applications is faster because you need only one team instead of two. But native applications provide a better user experience and better performance. Also, cross-platform applications always need platform-specific customization to support advanced features. For companies, developing native applications is still the preferred solution, and this trend will continue in 2021.
While Facebook is trying to improve React Native and Google is actively promoting its own Flutter platform, these are best suited for prototypes, proofs of concept, MVPs, or simple functional applications, and native app development will continue to dominate in 2021.
An interesting fact about native app development is that Google promotes Kotlin and Apple promotes Swift as the primary programming language. Google recently reaffirmed its support for Kotlin, which is good news for the language's users.
Development of cross-platform applications: React Native
There are many scenarios in which a hybrid or cross-platform application is a rational choice. There are many options in this area: the established Xamarin and Ionic, and the newer React Native and Flutter. Facebook built React Native on top of the highly successful and mature React web framework. Like its web counterpart, React Native is the dominant framework for developing hybrid and cross-platform applications.
React Native and React share the same base, so they offer code reuse and a "write once, run anywhere" approach. An added benefit of React Native (as of other Facebook platforms) is that Facebook uses it to develop its own mobile app. Google lagged behind in this area, but has gained significant popularity in the last year with its own cross-platform framework, Flutter. Flutter offers the best performance, but requires Dart, a programming language that is not yet widely used. Given all the changes taking place in the ecosystem, React Native will continue to dominate this area in 2021.
API: REST
REST is the 800-pound gorilla of API technology. It is the de facto standard for API-based communication between services. There are alternatives: as you may have guessed, these are gRPC from Google and GraphQL from Facebook.
Both are interesting technologies, but they serve different purposes. Google developed gRPC as a reincarnation of remote procedure call techniques (such as SOAP), but on steroids. It uses Protobuf instead of JSON as its message format. Facebook, on the other hand, developed GraphQL as an aggregation layer to avoid frequent REST calls. Both gRPC and GraphQL have succeeded in their respective niches. In 2021, REST will remain the dominant API style, while GraphQL and gRPC will be used as complementary tools.
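To make the "aggregation layer" idea concrete, here is a minimal sketch in plain Python. It uses in-memory dictionaries as a stand-in for a backend and only counts round trips. The names `USERS`, `POSTS`, `rest_get`, and `aggregated_query` are hypothetical and do not belong to any real REST, GraphQL, or gRPC library.

```python
from __future__ import annotations

from typing import Any

# Hypothetical in-memory data standing in for a backend service.
USERS = {1: {"id": 1, "name": "Ada"}}
POSTS = {1: [{"id": 10, "title": "Hello"}, {"id": 11, "title": "World"}]}

# Counts how many simulated round trips each style needs.
calls = {"rest": 0, "aggregated": 0}


def rest_get(path: str) -> Any:
    """Simulate one REST call: one resource per round trip."""
    calls["rest"] += 1
    parts = path.strip("/").split("/")
    user_id = int(parts[1])
    if len(parts) == 2:      # /users/<id>
        return USERS[user_id]
    return POSTS[user_id]    # /users/<id>/posts


def aggregated_query(user_id: int, fields: dict[str, list[str]]) -> dict[str, Any]:
    """Simulate a GraphQL-style aggregation layer: the client names the
    fields it wants and receives them in a single round trip."""
    calls["aggregated"] += 1
    user: dict[str, Any] = {k: USERS[user_id][k] for k in fields["user"]}
    user["posts"] = [{k: p[k] for k in fields["posts"]} for p in POSTS[user_id]]
    return user


# REST style: two round trips, each returning a full resource.
profile = rest_get("/users/1")
posts = rest_get("/users/1/posts")

# Aggregated style: one round trip, only the requested fields.
result = aggregated_query(1, {"user": ["name"], "posts": ["title"]})
```

In this toy setup the REST client needs two calls to assemble the same view that the aggregated query returns in one, which is the trade-off the paragraph above describes.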
Artificial Intelligence: TensorFlow 2.3 will dominate
In the field of Deep Learning and neural networks, Google and Facebook are the main players. Google gave us TensorFlow, which builds on ideas from the popular Deep Learning library Theano. It quickly became the main library for deep learning. Google even introduced a specially designed accelerator, the Tensor Processing Unit (TPU), to speed up TensorFlow's calculations.
Facebook is not lagging behind in Deep Learning, as it probably has the largest collection of images and videos. Facebook launched the PyTorch Deep Learning library, based on another popular Deep Learning library, Torch. The two platforms differ in the way they work. TensorFlow 1.x used a static graph for its calculations, while PyTorch uses a dynamic graph. The advantage of a dynamic graph is that it can be adjusted on the fly. In addition, PyTorch is friendlier to Python, which is the main programming language in Data Science.
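The difference between the two styles can be sketched without either library. The toy below is plain Python, not the TensorFlow or PyTorch API; `Node`, `run`, `static_graph`, and `dynamic_forward` are hypothetical names chosen for illustration. In the static style the computation is described first and executed later; in the dynamic style the computation is simply the Python code that runs, so ordinary control flow can change it for each input.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Node:
    """One operation in a pre-built (static) computation graph."""
    op: Callable[[float, float], float]
    left: "Node | str"
    right: "Node | str"


def run(node, feed):
    """Evaluate a static graph: the structure is fixed, only the inputs vary."""
    if isinstance(node, str):  # a leaf is the name of an input
        return feed[node]
    return node.op(run(node.left, feed), run(node.right, feed))


# Static style: define the graph for (x + y) * x once, execute it later.
static_graph = Node(
    lambda a, b: a * b,
    Node(lambda a, b: a + b, "x", "y"),
    "x",
)


def dynamic_forward(x, y):
    """Dynamic style: the graph is built as the code runs, so a plain
    Python `if` can pick a different computation for each input."""
    h = x + y
    if h > 0:
        return h * x
    return -h


static_result = run(static_graph, {"x": 2.0, "y": 3.0})
dynamic_result = dynamic_forward(2.0, 3.0)
other_branch = dynamic_forward(-2.0, -3.0)
```

With positive inputs both styles compute the same value; the second call to `dynamic_forward` takes the other branch, something the fixed `static_graph` cannot express without extra graph-level operations.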
With the growing popularity of PyTorch, Google moved TensorFlow 2.x to eager (dynamic) execution by default and made it more convenient for Python users. TensorFlow 2.3 was released in 2020.
In 2021, TensorFlow 2.3 and PyTorch will fight for their niches. Given TensorFlow's larger community, it is reasonable to expect that TensorFlow 2.3 will be the dominant library for Deep Learning.
Database: SQL is great, but distributed SQL will be the Holy Grail
During the NoSQL hype, many laughed at SQL and pointed out its limitations. Many publications explained how NoSQL was much better and would replace SQL. But once the hype ended, people soon realized that the world could not live without SQL databases.
SQL databases hold the top four places in database popularity rankings. SQL dominates because these databases provide ACID guarantees, the most critical requirement for business applications. NoSQL databases offer horizontal scaling, but at the cost of weakening the ACID guarantees.
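As a small illustration of what the ACID guarantee buys a business application, here is a sketch of an atomic transfer using Python's built-in `sqlite3` module. The `accounts` table and the `transfer` and `balances` helpers are hypothetical examples, not taken from any particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src would go below zero.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)        # succeeds
rejected = transfer(conn, "alice", "bob", 500)  # fails, credit to bob is rolled back
final = balances(conn)
```

The second transfer credits `bob` before the debit fails, yet the rollback leaves no trace of that partial update. This all-or-nothing behavior is what is hard to keep once data is spread across many nodes.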
Internet-scale companies are looking for a "master database", i.e. a database that provides ACID guarantees like SQL databases and scales horizontally like NoSQL databases. Currently, two solutions partially meet the requirements of such a master database: Amazon Aurora and Google Spanner. Aurora offers almost all SQL functionality but not horizontal write scaling, while Spanner offers horizontal write scaling but does not support many SQL functions.
We hope that in 2021 the two databases will converge or that someone else will deliver distributed SQL. Whoever does will probably win a Turing Award.
Data Lake: MinIO will become more and more famous
As discussed in the previous section, today's data platform is complex. Companies typically have OLTP (SQL) databases to support ACID transactions and OLAP (NoSQL) databases for analytics. In addition, businesses have other kinds of data systems, such as search (Solr, Elasticsearch) or computation (Apache Spark). Companies build their data platform around a data lake, i.e. the data is copied from the OLTP databases into the data lake. All other data applications (e.g. OLAP, search) use the data lake as their main source.
The Hadoop Distributed File System (HDFS) was the de facto data lake until Amazon released its object storage service, S3. Scalable and inexpensive, S3 soon became the de facto data lake for many companies. The only problem is that using S3 ties the data platform tightly to the Amazon AWS cloud. Microsoft Azure has Blob Storage and Google has its own object storage, but they are not compatible with AWS S3.
MinIO, a newer object store, is compatible with S3 and is open source, so it can be a lifesaver for many companies. With enterprise support and a design aimed at cloud-native environments, MinIO offers a cloud-neutral data lake.
Microsoft recently announced the launch of MinIO on the Azure Marketplace under the slogan: "Provide Amazon S3 API compatible data access for Azure Blob Storage Services." If Google Cloud Platform and others also offer MinIO, this could be a big step forward for multi-cloud support.
Big Data Calculation: Spark will continue to shine
Nowadays, businesses usually have to run computations over large data sets, which requires distributed batch jobs. Hadoop MapReduce was the first distributed batch computing platform. Apache Spark has since overtaken Hadoop as the king of batch processing. How can Apache Spark give better results than Hadoop?
Spark specifically addresses the limitations of Hadoop MapReduce, i.e. it processes everything in memory instead of writing data to storage after each expensive operation. Although Spark runs on a CPU- and memory-intensive JVM, it will still dominate batch processing in 2021 and beyond. We would like someone to develop a more efficient batch processing framework in Rust, which could replace Spark and save companies huge bills.
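The difference can be sketched with a toy word count in plain Python. This is not the Hadoop or Spark API; `map_stage`, `reduce_stage`, `run_with_disk`, and `run_in_memory` are hypothetical helpers, and a temporary JSON file stands in for the intermediate storage a MapReduce job would write between stages.

```python
import json
import tempfile
from collections import Counter
from pathlib import Path

LINES = ["spark keeps data in memory", "hadoop writes data to disk"]


def map_stage(lines):
    """Map: emit one [word, 1] pair per word."""
    return [[word, 1] for line in lines for word in line.split()]


def reduce_stage(pairs):
    """Reduce: sum the counts per word."""
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def run_with_disk(lines, workdir):
    """MapReduce style: persist the intermediate result between stages."""
    spill = Path(workdir) / "map_output.json"
    spill.write_text(json.dumps(map_stage(lines)))
    return reduce_stage(json.loads(spill.read_text()))


def run_in_memory(lines):
    """Spark style: hand the intermediate result straight to the next stage."""
    return reduce_stage(map_stage(lines))


with tempfile.TemporaryDirectory() as tmp:
    disk_result = run_with_disk(LINES, tmp)
memory_result = run_in_memory(LINES)
```

Both functions return the same counts; the in-memory version simply skips the serialize, write, read, and parse steps between stages, which is where much of the cost sits on large data sets.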
Big Data Streaming: Looking ahead
A few years ago, true real-time stream processing was not practical. A micro-batch framework such as Spark Streaming was typically used, which provides "almost" real-time stream processing. However, Apache Flink changed the landscape by offering the ability to process live data streams.
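The gap between "almost" real-time and real-time can be sketched with two plain Python generators. This is a toy model, not the Spark Streaming or Flink API; `counting_source`, `micro_batches`, and `per_event` are hypothetical names. The sketch records how many events must arrive before each style produces its first output.

```python
def counting_source(events, seen):
    """Yield events one by one, recording each arrival in `seen`."""
    for event in events:
        seen.append(event)
        yield event


def micro_batches(events, batch_size):
    """Micro-batch style: buffer events and emit them in groups."""
    batch = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # flush the final, possibly smaller, batch
        yield batch


def per_event(events):
    """Streaming style: handle each event as soon as it arrives."""
    for event in events:
        yield event


readings = [1, 2, 3, 4, 5]

seen_batch = []
first_batch = next(micro_batches(counting_source(readings, seen_batch), 3))

seen_stream = []
first_event = next(per_event(counting_source(readings, seen_stream)))

all_batches = list(micro_batches(readings, 3))
```

Here the micro-batch pipeline waits for three arrivals before emitting anything, while the per-event pipeline emits after the first. In a real system that wait is the added latency of micro-batching.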
Until 2019, Apache Flink could not gain enough traction to compete with Spark. That changed after the Chinese technology giant Alibaba bought data Artisans (the company behind Apache Flink) in January 2019.
Flink should be the number one choice for a company that wants to process real-time data streams in 2021 and beyond. However, Flink has the same drawback as Spark: it runs on a heavy, resource-intensive JVM.
ByteCode: WebAssembly is gaining ground en masse
To learn more about WebAssembly, you can read an interview with Brendan Eich, the creator of JavaScript. Modern JavaScript (after ES5) is a great programming language. But like any other language, it has limitations. The biggest is that it is slow, because the JavaScript engine must read and parse the source text into an abstract syntax tree before it can execute it. Another problem is that JavaScript is single-threaded and cannot take advantage of modern hardware (e.g. multicore CPUs, GPUs). As a result, many compute-intensive applications (such as games and 3D graphics) cannot run well in the browser.
Several companies (led by Mozilla) developed WebAssembly, a low-level bytecode format for the browser, so that any supported programming language can run on the web. The released WebAssembly MVP supports close-to-the-metal languages (e.g. C++, Rust).
WebAssembly allows the browser to run computationally intensive applications such as games and AutoCAD. WebAssembly's ambition is even greater, and it also works outside the browser. Because it was designed for the web, WebAssembly offers security and sandboxing. This means that it can be used in the following scenarios outside the browser:
· Hybrid native mobile applications.
· Serverless computing without cold start problems (cloud).
· Running untrusted code on the server (CDN).
2021 could be a breakthrough for WebAssembly, as many giant corporations (including cloud service providers) and the community are embracing it.
Coding: Low-Code / No-Code
The rapid digital transformation and Industry 4.0 mean that there is a huge gap between the supply of and demand for software developers. As a result, many people and companies are unable to implement their ideas for lack of developers. To lower the barrier to entry into software development, there are attempts to build software with no code or with minimal code. This effort is known as LCNC (Low-Code / No-Code), and it achieved some success in 2019.
The goal of this movement is for everyone with a great idea to be able to develop software without needing programming experience.
While one may be skeptical about using LCNC frameworks in production, they can lay the foundations for others. Companies like Amazon and Google can build a reliable product on this basis (just as AWS Lambda built on the groundwork of Google App Engine).
The LCNC movement is worth watching, as it will gain more momentum in 2021.
Modern software development is huge, complex, and diverse. I have tried to analyze only the trends in areas in which I have experience. In addition, the choice of tech stack always depends on how it will be used. If you need to choose a technology stack for a particular area, do your own research.
What you will learn as you read this book
· The basic principles around distributed systems
· Basic algorithms and protocols
· Launch an AWS web application with Docker: ECS and Fargate.
· Minimize downtime when deploying changes with health checks and load balancing: ALB.
· Scale your web application with a scalable database: RDS Aurora Serverless.
· Minimize operational effort with managed services. Reduce costs and avoid paying for idle resources.
· Dockerize your web application (Python, Spring Boot and Node.js).
· Monitor and debug with CloudWatch. Protect your web application with HTTPS.
· Use infrastructure as code to automate your stack: AWS CloudFormation.
· Version your source code with Git: AWS CodeCommit.
· Continuously deploy your code and infrastructure: AWS CodeBuild.