Interview with Aaron Williams, VP, MapD Technologies - Speaker at Global Data Science Conference - Santa Clara April 2018
We feature speakers at Global Data Science Conference (April 2-4, 2018, Santa Clara, CA) to catch up and find out what they are working on now and what's coming next. This week we're talking to Aaron Williams, VP, MapD Technologies (Topic: Speed Meets Scale For Predictive Analytics).
1. Tell us about yourself and your background.
I have a computer science background, including an MS from Case Western Reserve University in Cleveland. I moved out to California after school and got a job developing software for the earliest versions of Java at Sun Microsystems. Since then I’ve started two companies in the entertainment space, and invented technology that was nominated for two Emmy awards along the way. I’ve run large-scale community and open source programs at Sun, SAP and Mesosphere, and I joined MapD about 6 months ago.
2. What have you been working on recently?
I’m VP of Global Community for MapD, where I’m responsible for our core open source project and our active community of end users. Recently my team and I have been working on a project with the VW DataLab to use XGBoost to analyze churn in their customers.
3. Tell me about the right tool you used recently to solve a customer problem.
We mostly stick with the basics. We were using Jupyter and a few basic H2O machine learning algorithms this morning to make some predictions on a new dataset. Pair those tools with MapD’s Immerse visualization platform and our data scientists go from interactive feature engineering to training to visualized results (and black box testing) in seconds.
4. Where are we now today in terms of the state of Data Science, and where do you think we’ll go over the next five years?
Data Science is becoming mainstream, which means more and more areas of the business are being improved by the availability and completeness of data. As that trend continues, the challenge for the data science community will be empowering entire organizations to see benefits without needing to understand the details of how it all works. In five years, I predict that every company will still need a team of traditional data scientists to configure the tools and make the data available, but that team will be empowering entire organizations to make data (including training models and making predictions) part of their everyday jobs.
5. You’ve already hired Y number of people approximately. What would be your pitch to folks out there to join your organization? Why does your organization matter in the world?
MapD is fundamentally making the impossible possible by solving some of the world's most extreme data challenges. As exemplified by our work with Harvard to analyze the National Water Model’s flood predictions and our work with the state of Florida to improve coastal planning using LIDAR data, the billions-of-rows scale and millisecond interactivity of MapD’s platform are solving entirely new classes of problems.
6. What are some of the best takeaways that the attendees can have from your talk?
GPUs have been used by data scientists for machine learning for some time, but they also have some distinct advantages when it comes to data analytics and visualization. In my talk you’ll see how to leverage GPUs to manage a complete machine learning pipeline, and how the MapD platform plays a critical role in keeping the data in memory in the GPU.
7. What are the top 5 Data Science Use cases in enterprises?
I’m going to answer this a little differently because I think the traditional data science use cases are getting blended. The most interesting use cases we see are “hybrid” use cases, where multiple kinds of data and challenges are coming together and stressing the capabilities of traditional platforms. For instance, there has traditionally been a class of use cases that was focused on geospatial data points, and a separate class of use cases that contained billions of rows (“big data”), and another class of use cases that required interactivity with the data. Today those lines are blurring. When you look at autonomous vehicles, their data needs span all three of those classes. That’s not the biggest use case out there, but it is one of the most exciting.
8. Which company do you think is winning the global Data science race?
The cloud providers have an interesting inherent advantage because the barrier to adoption of any new tools they provide is so low. But I would still bet on the startups. We’re still miles from the end of this “race,” and that means major innovations are still to come between now and the finish line.
9. Any closing remarks?
Thank you for including me and MapD in the Global Data Science Conference! I’m excited to participate and learn from everyone in attendance.
Hurry! Register today to get a $125 discount with promo code LIN.
If you have any questions concerning Global Data Science Conference 2018, please do not hesitate to contact Leena: [email protected] or call 408-400-3769.