Compute options on Google Cloud

Data engineers, ML engineers and data architects need to know the storage and compute options available on Google Cloud. At first this might look like a job for the platform or infra team, but anyone who has worked on implementations knows how important these concepts are, because they play a significant role in performance tuning.

Google Cloud has five compute options, which can be used directly to deploy data workloads or indirectly as the underlying infrastructure for some of the managed services in the data and AI lifecycle.

The image below gives a quick view of the compute options available on Google Cloud, classified as IaaS, PaaS and serverless.



IaaS - Short for Infrastructure as a Service. This lets users spin up virtual machines with full flexibility over the configuration: number of cores, memory, operating system, disk type, disk size, image and so on.

It's like renting a space and building your own kitchen, bringing your own chefs, furniture and so on, with full control over how you build and run the restaurant.

PaaS - Short for Platform as a Service. This lets users rent a ready-made platform and start using it. The platform includes the servers, operating system, runtime environment, the libraries that bind the application code to the underlying infrastructure, and the storage and networking components.

It's like renting a commercial kitchen that already comes with utensils, furniture and so on. You just come in as a chef and start cooking your own food.

Serverless - This does not mean there are no servers. It is a category where users don't have to set up any infrastructure before deploying code. They focus fully on writing and deploying the application code, while the cloud service provider takes care of spinning up the resources and scaling them up and down.

It's like preparing the food and delivering it to customers via Just Eat or Deliveroo. We don't have to worry about the mode of delivery, we just focus on food quality.

Now the compute options:

Google Compute Engine - This is an IaaS option that gives users full flexibility to create a virtual machine with a suitable configuration. Several machine families are available with predefined configurations, such as high-memory, compute-optimised and GPU-enabled, which make it easier to spin up the infra.

GCE (Google Compute Engine) can be used to create a development environment, to deploy databases such as MySQL, Postgres or Oracle (BYOL), or to run ETL pipelines.

Google Kubernetes Engine - This is the managed Kubernetes service on Google Cloud, used to run containerised apps. It is widely used for ML model training, where the training code is packaged as a container image and run on the cluster. GKE (Google Kubernetes Engine) offers flexibility close to that of GCE together with the scalability of a PaaS, which makes it a strong choice for large workloads.

Google App Engine - This is a PaaS offering, popular for hosting web applications and APIs without managing infrastructure. It is a fully managed environment and supports multiple languages such as Java, Python, Go, Node.js, PHP and Ruby. Other languages can be run through custom runtimes in the flexible environment.

Google Cloud Functions - This is a serverless, Functions as a Service offering. Users write a function in a language such as Python, Java or Node.js and deploy it. An HTTP-triggered function gets an HTTPS URL that can be invoked directly or integrated into applications, while an event-triggered function runs in response to events. GCF (Google Cloud Functions) is popular for trigger-based use cases, such as processing a file as it arrives in the data lake or starting a data pipeline whenever an event happens.
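To make the file-arrival use case concrete, here is a minimal sketch of the logic such a function might contain. It is plain Python with no Google libraries, so it shows only the handler body, not the deployment wiring. The `PipelineRequest` type, the `handle_file_arrival` name, the pipeline name and the "only react to .csv files" rule are all hypothetical choices for illustration. The `bucket` and `name` keys mirror the fields a Cloud Storage object event carries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineRequest:
    """Hypothetical description of the pipeline run we want to start."""
    source_uri: str
    pipeline: str


def handle_file_arrival(event_data: dict) -> Optional[PipelineRequest]:
    """Decide what to do when a new object lands in the data lake bucket.

    event_data is assumed to hold the "bucket" and "name" fields of a
    Cloud Storage object event. In a real deployment this function would
    be called from the entry point registered with Cloud Functions.
    """
    bucket = event_data["bucket"]
    name = event_data["name"]

    # Hypothetical rule: only CSV files should trigger the pipeline.
    if not name.lower().endswith(".csv"):
        return None

    return PipelineRequest(
        source_uri=f"gs://{bucket}/{name}",
        pipeline="load_csv_to_warehouse",  # hypothetical pipeline name
    )


if __name__ == "__main__":
    sample = {"bucket": "my-data-lake", "name": "landing/sales.csv"}
    print(handle_file_arrival(sample))
```

Keeping the decision logic in a plain function like this makes it easy to unit test locally before attaching it to a trigger.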

Cloud Run - This is another serverless offering like Cloud Functions, with the difference that Cloud Run is for running stateless containerised applications. It is built on Knative, an open source platform that provides a serverless development experience on Kubernetes.

Cloud Run is quite popular for ML workloads where the model is deployed as an endpoint. Since Cloud Run scales up with traffic and back down to zero, users pay only for what they use, which can make it very cost effective.
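As a rough illustration of a model served as a stateless endpoint, the sketch below uses only the Python standard library. The `predict` function is a placeholder that averages its inputs, not a real model, and the names `handle_predict`, `PredictHandler` and `build_server` are made up for this example. The one Cloud Run specific detail is that the container is expected to listen on the port given in the `PORT` environment variable, which defaults to 8080.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def predict(features: list) -> float:
    """Placeholder 'model' (hypothetical): returns the mean of the inputs."""
    return sum(features) / len(features) if features else 0.0


def handle_predict(body: bytes) -> Tuple[int, bytes]:
    """Turn a JSON request body into (HTTP status, JSON response body)."""
    try:
        features = json.loads(body)["features"]
        score = predict([float(x) for x in features])
    except (ValueError, KeyError, TypeError):
        error = {"error": "expected a JSON object with a 'features' list"}
        return 400, json.dumps(error).encode()
    return 200, json.dumps({"score": score}).encode()


class PredictHandler(BaseHTTPRequestHandler):
    """Stateless handler: every request is answered from its own body."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status, payload = handle_predict(self.rfile.read(length))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def build_server() -> HTTPServer:
    """Bind to the port Cloud Run passes in via the PORT variable."""
    port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("0.0.0.0", port), PredictHandler)
```

Calling `build_server().serve_forever()` starts the endpoint. Because the handler keeps no state between requests, any number of container instances can serve traffic, which is what lets the platform add instances under load and remove them all when idle.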

Thanks for reading!
