Course: Cloud Hadoop: Scaling Apache Spark

Serverless Spark with Dataproc Notebook

- [Instructor] In the Spark ecosystem, there are a number of execution environments. As we've seen in other movies in this course, we can use GCP Dataproc for a managed Spark environment. A relatively new capability, one that many of my customers have found super useful, is something I wanted to preview for you here. It's called the Dataproc JupyterLab plugin for serverless batch and interactive notebook sessions. That's a lot of words. What does it mean? It means being able, from a Jupyter notebook within GCP, to scale out a workload when you need more than one computer involved in the analysis. To give you an introduction, I've shortened this rather long tutorial so you can see what it looks like and, hopefully, be compelled to try the full tutorial yourself. So the first step is to set up a Vertex AI Workbench instance in a Google Cloud demonstration project. Once that's set up, you'll access JupyterLab by clicking the link…
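Alongside the interactive notebook sessions described above, the same serverless execution environment can also run a batch workload submitted from the command line. A minimal sketch follows; the bucket name, script name, project ID, and region are illustrative placeholders, not values from the tutorial:

```shell
# Submit a PySpark script as a Dataproc Serverless batch -- no cluster
# to create or manage; Spark executors are provisioned on demand.
# PROJECT_ID, the region, and the gs:// paths are placeholder values.
gcloud dataproc batches submit pyspark gs://my-demo-bucket/analysis.py \
    --project=PROJECT_ID \
    --region=us-central1
```

When the batch finishes, its driver output and execution details appear under Dataproc > Batches in the Cloud console, which is a convenient way to compare a scripted run against the interactive notebook session shown in the video.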
