The future of cloud computing
Vision from the BrainBoard team, WYSIWYG of the multi-cloud
S3 - Small Scaleway Stories
I’d like to begin with three small stories showing how BrainBoard arose from the minds of four engineers who were experts at making one another’s jobs less efficient.
Big data, big story
Two years ago, when I started working at Scaleway, a cutting-edge technology company, I was in charge of deploying a big data strategy and product. Without going into the details of how this ended up failing, here is one step I took along the way: since we had data-integration problems and spent too much time consolidating dashboards, I decided to deploy a simple data lake where we could replicate all our data and leverage it. Easier said than done, especially for the SRE team. Despite their tremendous experience in system administration, I figured out how to make their lives painful. First, ask for a lot of resources (dozens of cores, hundreds of GB of RAM, terabytes of disk). Second, just wait and not use them, because I was playing in a Jupyter Notebook instead. Last, go back to them and ask for a different set of machines (say I needed a GPU, or preferred an ARM architecture). And that’s it! I drove this first-class team crazy because I ended up using the machines for less time than they needed to deploy them. Because data-science environments are volatile, with a short life cycle.
The data center that relied on YAML files
To save my life, I decided to stop with big data and join Jeremy as an SRE, to help him and the team deploy the latest Scaleway data center in Warsaw. On paper, the plan was easy: we already had data centers, we knew which infrastructure components we needed to deploy, and we wanted to use Ansible to do it. Everything was already expressed as Infrastructure as Code, but until then we had used SaltStack. And this is where everything exploded: we didn’t spend three months designing the architecture or thinking about business value. We spent months just transcribing YAML files from the SaltStack API to the Ansible API. Because writing a complex, low-level, secured infrastructure in YAML is painful! The learning curve is steep, with hundreds of properties to learn. We often introduce typos that make us spend days understanding why our VMs have 4 cores instead of the 8 we specified. We have to navigate a wall of thousands of lines of YAML to find the one environment variable that changes the behavior of a role we didn’t even write but found on GitHub, and that is spread across hundreds of servers.
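To illustrate how easily that kind of typo slips through, here is a small sketch (the variable names below are invented for this example, not taken from our actual SaltStack or Ansible roles):

```yaml
# Intended: a hypothetical role variable requesting an 8-core VM.
vm_frontend:
  cpu_cores: 8      # 8 cores, as specified
  ram_gb: 16
---
# What actually shipped: an unknown key is silently ignored by the role,
# so the VM falls back to its 4-core default -- no error, no warning.
vm_frontend:
  cpu_core: 8       # typo: missing the trailing "s"
  ram_gb: 16
```

Nothing fails at deploy time; you only notice days later, when the workload is mysteriously slow.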
Documenting faster than you YAML
So now that we had transcribed all those tabs and spaces (worse than semicolons, because you don’t see them), enter the front team. For a cloud provider, the frontend is not only the client side (dev tools, CLI, website, …) but also the internet-facing part of our microservices (the API gateway). The front team wanted up-to-date documentation of the infrastructure microservices’ APIs. But how do you document hundreds of microservices that are updated and extended every week? Writing text files would mean spending more time on Markdown than on YAML, drawing the state of things would mean spending more time drawing than writing the infrastructure code, and so on and so forth. Documentation is useless if it’s not up to date, and producing it is time-consuming. That’s why documenting complex architectures is still a painful subject today.
The future of cloud management
Now that these small stories have shown typical pain points of managing infrastructure in the cloud, how can we envision its future? What is the cloud providers’ equivalent of “high-level languages”? What future workflows can we expect?
Design your cloud
To answer these questions, let’s start with a simpler one: how does a DevOps engineer create an architecture? Starting on a whiteboard, we draw by hand small boxes describing the different components we need. Those components are rarely cloud products; they are abstract representations of the basic blocks needed to make an application run. For example, one small box represents a virtual machine that performs computations. Another has the same shape but represents a load balancer. If you feel artistic, you can draw a bucket to represent an object storage endpoint. And because you want to visualise how they work together, you draw simple arrows describing how data flows through those components.
As a result, just as object-oriented programming is now able to represent business concerns and abstract away technical implementations like memory management, cloud management will eventually be done with visual representations of architectural concepts (several patents already exist in this space). Databases, even when stored on NVMe SSDs, will still look like hard disk drives (just as, in 2020, the save icon is still a floppy disk…). Even though object storage implementations are cutting-edge mathematical algorithms distributed across racks of heterogeneous machines, we draw a simple bucket. And while networking can rely on various physical means and implement complex IP Fabric architectures, we’ll just draw a line.
One view for all the clouds
One of the biggest limitations of cloud management today is the diversity of the APIs that vendors expose. Of course, the set of services they offer is more or less the same. And that’s simply because they did not change the way we use computers; they instead decided to handle part of the complexity for us. That doesn’t remove the need for a communication interface with users, what we call the APIs. Those APIs play a major role in cloud fragmentation, because each vendor has designed its own in an opinionated way. The signatures, the names of objects, and the way they interact together are all defined by the cloud vendor in a way that matches its point of view.
Unlike virtualization, containers, or open source projects more generally, cloud providers don’t bring progress to computer science itself. They instead shift the complexity from managing hardware to choosing and adapting to their APIs. That’s what we call vendor lock-in. In the future, ordering a VM or defining the size of the persistent storage we need will be done using a unified model, independently of the vendor we choose to pay. The same way higher-level languages created a uniform way to handle memory management, we need a uniform way to manipulate cloud resources, whatever our use case or problem. For now, Kubernetes has done a very good job in this direction, but at a higher level (applications).
All the clouds for one
Computer science has proven the tremendous power of abstraction. Thanks to it, everyone has access to the Internet, to machine-learning models built on complex mathematics, or to satellite images of their homes, right at their fingertips. Once again, when cloud computing is abstracted, it unlocks two new powers.
First, exactly the same way we draw abstract architectures on a whiteboard, the future of cloud management is to declare abstract architectures that can be deployed on any cloud provider. That power instantly removes vendor lock-in by reasoning about the nature of resources, instead of complying with a GAFA vendor’s opinion about how distributed storage should interact with CPU computing.
Second, by reasoning at the resource level, we will be able to develop real multi-cloud strategies. That means we will decide that this relational database will be hosted in Europe for privacy, that this block of CPU computing power is not a priority and should go to the cheapest vendor, or that these object storage buckets have to stay in the hands of our preferred vendor because of a commercial partnership. Moreover, failover strategies won’t be scoped to a single vendor anymore: when GCP is down, just move your instances to Microsoft. Imagine automating the location of your data by matching tags and privacy levels against each provider’s nationality. Trust me, you’ll always prefer to keep your sensitive data in the hands of a European one.
BrainBoard - Implementing this future, today.
From your brain to a board
As I wrote earlier, we want a future where engineers keep their habit of designing infrastructure on a whiteboard, and are able to deploy it without any transcription step. For this, the BrainBoard software starts with a simple whiteboard. On the left panel, the products of the selected cloud provider are listed so you can pick them. Soon, the agnostic mode will let you pick architecture-level components (computing, GPUs, object storage, …) without selecting a particular cloud product or provider. Exactly the same way you use a pencil to draw your infrastructure, you can now use BrainBoard to design your architectures, save them independently, replicate them between your different environments and, most importantly, keep a visual overview of what is in production. This board is the first step of your work, so it lives in the first tab, named “Design”.
From a board to any cloud
I recently had a discussion with Ivan, CTO of the AI company Jalgos. Among other reflections, I asked him how he envisioned the future of cloud computing. We quickly agreed that an abstraction would eventually emerge above the various vendors’ APIs. This abstraction could be another API, oriented toward infrastructure concepts instead of products, or any other means of removing vendor lock-in. With BrainBoard, we currently use Terraform to abstract the way you interact with your preferred provider. Of course, many of you already use this technology, but the combination of visual design with our Terraform Generator Engine (TGE) is what shifts this powerful tool from yet another Infrastructure-as-Code tool into a real cloud-provider abstraction. You no longer need to know the specifics of each provider’s API, nor its Terraform properties: you focus only on your design, while still having access to the production-grade tools you already use in your environments. This feature lives in the second-step tab, named “Deploy”.
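To make the fragmentation concrete, here is a sketch of what it looks like in raw Terraform: the same conceptual resource, “one small VM”, must be written differently for each provider (the instance types, image names and AMI ID below are illustrative examples; check each provider’s catalog before relying on them):

```hcl
# The same conceptual resource, "one small VM", in two vendors' dialects.

# AWS: the machine shape is chosen via "instance_type",
# the OS image via an AMI ID.
resource "aws_instance" "web" {
  ami           = "ami-0abcdef1234567890"  # placeholder AMI ID
  instance_type = "t3.small"
}

# Scaleway: the shape is chosen via "type", the image by label.
resource "scaleway_instance_server" "web" {
  type  = "DEV1-S"
  image = "ubuntu_jammy"
}
```

Same intent, two vocabularies: that translation burden is exactly what a visual, provider-abstracted design layer is meant to absorb.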
All the clouds, just for you
Finally, we explained how the power of abstraction will eventually enable multi-cloud management. Failover, privacy, pricing or geographic strategies become possible once we no longer express our needs through opinionated APIs but instead reason at the component level. This vision is at the heart of BrainBoard, and that’s why it has been designed from the ground up to embrace moves between different cloud providers. At the click of a button, any architecture can be moved or replicated from one provider to another. Soon, the agnostic mode will even allow designing at the component level and deploying on all providers at once. Part of your architecture can stay safe under the GDPR in Europe, while your CI/CD environment is deployed on the cheapest VMs on the market at the moment. Cloud providers are always accessible, whether you are designing or deploying your infrastructure.
Conclusion
This article has shown you three important things. First, cloud management is still in its infancy and far from mature: it is fragmented and creates almost as many problems as it is supposed to solve. Second, we described our vision of a mature cloud ecosystem: abstraction, compatibility and vendor agnosticism are, for us, what will finally bring cloud management to maturity. Third, we showed how BrainBoard implements this vision in a very easy-to-use way. The tool doesn’t aim to replace your habits or the tools you already use, but to integrate with them. Don’t hesitate to come test it; we’ll be grateful for any feedback and, most of all, would be happy to design it with you. For that, come join us on our BrainBoard Community Slack. See you there!