
Compute options on Google Cloud

Nikhil (Srikrishna) Challa

Cloud, Data & AI expertise | Google Cloud Champion Innovator | Authorised Trainer | Startup advisor | Writer | Twitter/X @srikrishna6488

Published: November 12, 2024

It is essential for data engineers, ML engineers and data architects to know the storage and compute options on Google Cloud. At first this might look like the job of the platform or infra team, but anyone who works on implementations knows how important these concepts are, because they play a significant role in performance tuning.

Google Cloud has 5 compute options, which can be used directly to deploy data workloads or indirectly as the underlying infrastructure for some of the managed services in the data and AI lifecycle.

The image below gives a quick view of the compute options available on Google Cloud, classified as IaaS, PaaS and serverless.

IaaS - Short for Infrastructure as a Service. It lets users spin up virtual machines with full flexibility over the configuration: number of cores, memory, operating system, disk type, disk size, image and so on.

It's like renting a space and building your own kitchen, bringing your own chefs, furniture and so on, with full control over how you build and run the restaurant.

PaaS - Short for Platform as a Service. It lets users rent a ready-to-use platform that includes the servers, operating system, runtime environment, the libraries that bind the application code to the underlying infrastructure, and the storage and networking components.

It's like renting a commercial kitchen that already comes with the utensils, furniture and so on; you just come in as a chef and start cooking your own food.

Serverless - This does not mean there won't be any servers. It is a category where users don't have to set up any infra before deploying their code. They focus fully on writing and deploying the application code, while the cloud service provider takes care of spinning up the resources and scaling them up and down.

It's like preparing the food and delivering it to customers via Just Eat or Deliveroo: we don't have to worry about the mode of delivery, we just focus on food quality.

Now the compute options:

Google Compute Engine - This is an IaaS option that gives users full flexibility to create a virtual machine with a suitable configuration. Several machine families with predefined configurations are available, such as high-memory, high-CPU and GPU-enabled, which makes it easier to spin up the infra.

GCE (Google Compute Engine) can be used to create a development environment, to deploy databases such as MySQL, Postgres or Oracle (BYOL), or to run ETL pipelines.

Google Kubernetes Engine - This is a managed Kubernetes service on Google Cloud, used to run containerised apps. It is widely used for ML model training, where the training code is packaged as a container image and run on the cluster. GKE (Google Kubernetes Engine) gives users the flexibility that GCE offers and the scalability that PaaS offers, making it a strong choice for massive workloads.
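To make the "training as a containerised image" idea concrete, here is a minimal sketch that builds a Kubernetes Job manifest in Python. The job name, image name and the helper `training_job_manifest` are hypothetical and only for illustration; the GPU request assumes the cluster exposes GPUs through the `nvidia.com/gpu` resource.

```python
import json


def training_job_manifest(name: str, image: str, gpus: int = 0) -> dict:
    """Build a Kubernetes batch/v1 Job that runs a training container once."""
    # The container image is assumed to hold the training code and its
    # dependencies; the image name is supplied by the caller.
    container = {"name": "trainer", "image": image}
    if gpus > 0:
        # Assumption: GPUs are exposed as the "nvidia.com/gpu" resource.
        container["resources"] = {"limits": {"nvidia.com/gpu": gpus}}
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Retry a failed training run at most twice.
            "backoffLimit": 2,
            "template": {
                "spec": {
                    "containers": [container],
                    # A finished training pod should not be restarted.
                    "restartPolicy": "Never",
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image name, for illustration only.
    manifest = training_job_manifest("train-demo", "example/trainer:latest", gpus=1)
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and submitted to a cluster with `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.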

Google App Engine - This is a PaaS offering that is widely popular for hosting web applications and APIs without managing infrastructure. It's a fully managed environment and supports multiple languages such as Java, Python, Go, Node.js, PHP and Ruby, with other languages possible through custom runtimes in the flexible environment.

Google Cloud Functions - This is a serverless offering that works as Functions as a Service. Users write a function in a programming language such as Python, Java or Node.js and deploy it; an HTTP-triggered function is given an HTTPS URL that can be invoked directly or integrated into applications. GCF (Google Cloud Functions) is popular for trigger-based use cases, such as processing a file as it arrives in the data lake or starting a data pipeline whenever an event happens.
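The "process a file as it arrives" use case can be sketched as below. This is a simplified illustration, not a deployable function: the event is modelled as a plain dict with `bucket` and `name` keys so the code runs locally, and the handler name and routing rules are assumptions made up for the example. In a real deployment the function would be registered with the platform and receive the event object from it.

```python
import json


def on_file_arrival(event: dict) -> dict:
    """Decide what to do with a newly arrived file (hypothetical rules)."""
    # Assumption: the event carries the bucket and the object name.
    bucket = event["bucket"]
    name = event["name"]

    # Route the file by extension; these actions are illustrative labels.
    if name.endswith(".csv"):
        action = "load_to_warehouse"
    elif name.endswith(".json"):
        action = "validate_schema"
    else:
        action = "ignore"

    return {"uri": f"gs://{bucket}/{name}", "action": action}


if __name__ == "__main__":
    # Hypothetical bucket and object names.
    sample = {"bucket": "my-data-lake", "name": "sales/2024-11-12.csv"}
    print(json.dumps(on_file_arrival(sample)))
```

Keeping the decision logic in a plain function like this makes it easy to unit test without deploying anything.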

Cloud Run - This is another serverless capability, like Cloud Functions, with the difference that Cloud Run is for running stateless containerised applications. It is built on Knative, an open-source platform that provides a serverless developer experience on Kubernetes.
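A stateless containerised application of this kind can be sketched with the Python standard library alone. The container is expected to listen on the port passed in the `PORT` environment variable; the fallback of 8080 is for local runs. The `predict` function is a hypothetical stand-in for real application logic, and the request shape (`{"features": [...]}`) is an assumption for the example.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_port(env=os.environ) -> int:
    # Read the listening port from PORT; fall back to 8080 locally.
    return int(env.get("PORT", "8080"))


def predict(features: list) -> dict:
    # Stand-in for real logic: returns the mean of the inputs.
    score = sum(features) / len(features) if features else 0.0
    return {"score": score}


class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Stateless: everything needed is in the request body, so any
        # instance can serve any request.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(predict(payload.get("features", []))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve() -> None:
    # Bind to all interfaces so traffic can be routed into the container.
    HTTPServer(("0.0.0.0", resolve_port()), PredictHandler).serve_forever()
```

Calling `serve()` from the container's entrypoint starts the server. Because no state is kept between requests, instances can be added or removed freely as traffic changes.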

Cloud Run is quite popular for ML workloads, where the model is deployed as an endpoint on Cloud Run. Since Cloud Run scales up with traffic and scales down to 0 when idle, users pay only for what they use, which makes it very cost-effective for intermittent traffic.
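The effect of scaling to 0 on cost can be illustrated with some rough arithmetic. All rates below are hypothetical placeholders, not actual Google Cloud prices, and the model is deliberately simplified: it ignores free tiers, minimum billing increments and the fact that one instance can serve several requests at once.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def always_on_cost(hourly_rate: float) -> float:
    # A machine that runs all month is billed for every hour, busy or idle.
    return HOURS_PER_MONTH * hourly_rate


def pay_per_use_cost(requests: int, seconds_per_request: float,
                     rate_per_second: float) -> float:
    # With scale-to-zero, only the time spent serving requests is billed.
    return requests * seconds_per_request * rate_per_second


def break_even_requests(hourly_rate: float, seconds_per_request: float,
                        rate_per_second: float) -> float:
    # Monthly request count at which both models cost the same.
    return always_on_cost(hourly_rate) / (seconds_per_request * rate_per_second)


if __name__ == "__main__":
    # Hypothetical rates, for illustration only.
    vm = always_on_cost(hourly_rate=0.10)
    serverless = pay_per_use_cost(requests=100_000, seconds_per_request=0.5,
                                  rate_per_second=0.0001)
    print(f"always-on: {vm:.2f}, pay-per-use: {serverless:.2f}")
    print(f"break-even requests: {break_even_requests(0.10, 0.5, 0.0001):,.0f}")
```

Below the break-even volume the pay-per-use model is cheaper; above it, an always-on machine may be the better deal, so the saving depends on how bursty the traffic is.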

Thanks for reading!

