Course: Cloud Hadoop: Scaling Apache Spark

Scale Spark on the cloud by example

- [Instructor] In this section, I'm going to take you through some work that my team did in collaboration with CSIRO bioinformatics in Sydney, Australia, on moving to the cloud and scaling a real-world Spark workload. The use case is genomic analysis, or bioinformatics research, and there were several constraints for our customer here. They were research focused, and at the time we started they didn't have a dedicated DevOps or cloud person. And they really wanted to make their solution flexible enough to work across any cloud. So, as a starting point, they had written a library called VariantSpark, which runs on top of Spark and implements custom machine learning. We'll look at it in a little bit more detail in a minute. They wrote it in Scala and they had open sourced it on GitHub. When we first started working together, they were using it internally. They were using it on a shared Hadoop Spark cluster and their frustration…
