Episode 7: Programming languages for Data Science

Hello! And welcome to a new edition of the Data Science Now newsletter. In this session, I talked about the most important programming languages for data science. You can listen to the podcast version here:

And if you prefer, you can watch the video recording here:

Remember that we go live (almost) every Wednesday here on LinkedIn at 8 PM CST :).

Here's a short recap of what I covered in the session:

Why programming in Data Science (DS)? I've talked about several things in these sessions so far. Programming is important, but it is only a tool; the problem, the business, and adding value for the company matter more. But remember that we need to create data products, and that means building a system or a model.

Programming is important in DS because you need to understand languages to turn ideas into solutions (software solutions). You need to spend time learning to program: how to code, what the learning curve of programming looks like, and the basic structure of a language.

The most used programming languages for DS are Python and R, but which one is better? We'll leave that for a different session.

If you need to decide where to start, learn Python; it's a full, general-purpose language. We have other sessions recorded about that:

You will find thousands of resources on Python and R on the web, so I'm going to skip them for now. I want to talk about other languages that are also widely used (or that are helpful for data science).

The list of relevant programming languages for Data Science:

  • Python
  • R
  • SQL
  • C++
  • Go
  • Julia
  • Scala
  • Java
  • JavaScript

If you're ambitious, you can learn at least 5 of them in 2 or 3 years.

Note: SQL is not exactly a programming language. It's a query language for working with data in databases, and you don't usually write whole programs in it. It's still important, and a lot of companies use it. If you want to succeed as a data scientist, you need to learn SQL: when you join a company, you have to go to the database and understand the data, and that's what SQL is for. A great course on SQL is the one by my friend Kristen:
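As a small illustration of what "going to the database and understanding the data" can look like, here is a minimal sketch that runs a SQL query from Python using the standard-library sqlite3 module. The sales table, its columns, and the sample rows are made up for this example; a real company database will look different.

```python
import sqlite3


def average_amount_by_region(rows):
    """Load (region, amount) rows into an in-memory SQLite table and
    return the average amount per region, sorted by region name."""
    conn = sqlite3.connect(":memory:")  # temporary database, nothing is saved to disk
    try:
        # Hypothetical table, used only for this illustration.
        conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        # The SQL part: group the rows by region and summarize each group.
        cursor = conn.execute(
            "SELECT region, AVG(amount) AS avg_amount "
            "FROM sales GROUP BY region ORDER BY region"
        )
        return cursor.fetchall()
    finally:
        conn.close()


# Made-up sample data.
sample_rows = [("north", 10.0), ("north", 20.0), ("south", 5.0)]
result = average_amount_by_region(sample_rows)
print(result)
```

The query itself is plain SQL; Python is only used here to create the toy table and print the answer, which is roughly how the two languages tend to work together in practice.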

Below, I've put a section for each language with its name, a short description, and some resources to learn it. If you want more context on the languages, please watch or listen to the episode.

C++

C++ is everything C is, and more. It's not new, either, and it has itself been an inspiration for many languages that came after it, like Python, Perl, and PHP. It does, however, add a few modern elements, such as classes and templates, that make it a step up from C.

Free Books on C++:

Free courses on C++:

Julia

Another one that's important for Data Science is Julia. It's relatively new, only a few years old, and it's growing a lot in scientific programming. It has an ecosystem for Data Science: DataFrames, Queryverse, JuliaGraphs, and the Hadoop ecosystem. You also have libraries for ML like Flux, along with automatic differentiation, GPU support, and deep learning algorithms. It's funded by great companies and schools, and it's going to be important in the future.

Free Books on Julia:

Free courses on Julia:

Go

Another one is Go. If you know C++, it's very simple: you can create objects, it's a very good alternative to C++ when you need speed, and it has an ecosystem for Data Science as well.

Free books on Go:

Free courses on Go:

Scala

If you ask me for my preferences, one of my favorite languages is Scala. It was going to be the language of the future and everyone was talking about it, but that didn't happen. Spark is written in Scala and it has great libraries for ML, so we thought the community would go there. It's still important: if you are a data scientist and have to choose between Java and Scala, choose Scala. It's not that easy, but it's good.

Free books on Scala:

Free courses on Scala:

As for other languages, JavaScript is great for the web, but it's weird. Don't learn programming with it; you'll pick up a lot of bad practices. It's still important for creating dashboards, though.

How much time should you spend learning these languages?

One of my favorite articles, Teach Yourself Programming in Ten Years by Peter Norvig, explains that this will take you years. If you want to learn something like SQL, Python, or R, you can spend 6 months on a course and then practice to master the language. For more complicated languages, it will take you 3 or 4 years to really understand and use them correctly. I'm not saying you need to wait that long to start; in a few months you can write code. The complicated part is mastering it. If you can enroll in a computer science degree, do it!

Always Remember:

There's no easy path. You have to practice and study, and if you want to know where you're going, you need to understand where you come from. Then you will rule the world.

Thanks for reading this. Please subscribe and share this with your network; it would help us a lot :)

With love by the Closter Team:

Gabriel Erives, Héizel Vázquez, Eilén Vázquez, Favio Vázquez.



