What is Data Science, and where does Python fit into the picture?
Data Scientist and Data Analyst are undoubtedly among the hottest tech careers today. That is hardly surprising given the lucrative salaries, which range from 5k to 20k every month. However, professionals and jobseekers might be put off by not knowing what data science entails or how to get started.
In this article, we explain what data science entails and what it takes to embark on a career in it.
What is Data Science all about and why does it matter?
Data Science is an interdisciplinary field that draws on a combination of scientific methods, algorithms and computational tools to extract insights from data.
Here is a relatable example: imagine a bank teller who, after processing 100 customer requests, starts to have a good sense of which customers are likely to default on their loans. This bank teller then adds value for the employer by using this newly acquired sense to decide when to give out a loan and when not to.
In Data Science we do that too. Instead of having a human learn these patterns, we leverage computers and mathematical models.
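To make the bank teller analogy concrete, here is a minimal sketch in Python of a computer picking up the same kind of pattern from past records. The loan records, the income bands, the function names and the 0.5 risk threshold are all made up for illustration. A real model would use far more data and a proper statistical method.

```python
from collections import defaultdict

# Hypothetical past loan records: (income band, whether the customer defaulted)
past_loans = [
    ("low", True), ("low", True), ("low", False),
    ("medium", False), ("medium", True), ("medium", False), ("medium", False),
    ("high", False), ("high", False), ("high", False),
]


def learn_default_rates(records):
    """Return the observed share of defaults for each income band."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for band, defaulted in records:
        totals[band] += 1
        if defaulted:
            defaults[band] += 1
    return {band: defaults[band] / totals[band] for band in totals}


def approve_loan(band, rates, max_risk=0.5):
    """Approve only when the learned default rate is below max_risk (an assumed cut-off)."""
    return rates[band] < max_risk


rates = learn_default_rates(past_loans)
for band in ("low", "medium", "high"):
    print(band, round(rates[band], 2), approve_loan(band, rates))
```

The "learning" here is nothing more than counting how often each group defaulted, yet it mirrors what the teller does intuitively: look at past cases, spot a pattern, and use it for the next decision.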
Data Science matters because there is a limit to how much data a human can process, and to how many relationships they can uncover from it. By relying on computers and models, which can be scaled far beyond human capacity, we can go through vastly more data and uncover many more insights and hidden patterns.
The advent of data science has some parallels to the Industrial Revolution of the 18th and 19th centuries. In those days, companies that were quick to adopt industrial machines thrived, while those that relied solely on physical labor were quickly displaced. Similarly, Data Science is our generation’s “industrial revolution”, and companies that don’t adopt it risk getting displaced.
What is the relationship between Data Science and Python?
To get a computer to learn from data, we need to give it instructions on how to do so. This is much like how a teacher provides instructions for a student to learn. However, instead of giving instructions in a human language (such as English), we have to give them in a language the computer understands: a programming language.
There are many programming languages under the sun, but Python is widely regarded as one of the best where data science is concerned. There are a few reasons for this.
Extensive Libraries and Good Documentation
Firstly, the Python ecosystem has some of the most comprehensive and well-documented libraries for data science.
Libraries are pre-written code, written by other programmers, which we can use in our own code instead of writing everything from scratch.
A world without libraries is like a world where skyscrapers are built brick by brick. Eventually they do get built, but builders endure a long and tiring process. Some bricks are badly laid now and then, which could render the building structurally unsafe. Libraries, however, can be likened to prefabricated parts of the skyscraper, which can be used to build safer structures more effortlessly and in a much shorter amount of time.
In Python, developers have created many wonderful visualisation and machine learning libraries which significantly shorten the time required to create useful products. This allows us to stand on the shoulders of giants.
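As a small illustration of the "prefabricated parts" idea, the sketch below computes an average twice: once by hand and once with `statistics`, a module from Python's standard library. The sales figures are invented for the example, and `statistics` merely stands in for the larger visualisation and machine learning libraries mentioned above.

```python
import statistics

# Hypothetical monthly sales figures, for illustration only
monthly_sales = [120, 135, 150, 110, 160]


def mean_from_scratch(values):
    """Compute the average by hand, the 'brick by brick' way."""
    total = 0
    for value in values:
        total += value
    return total / len(values)


# The same result using pre-written library code
library_mean = statistics.mean(monthly_sales)

# Other common statistics come for free with the library
library_median = statistics.median(monthly_sales)

print(mean_from_scratch(monthly_sales), library_mean, library_median)
```

The hand-written version works, but every such function is one more thing to write, test and maintain. With a library, that work has already been done and checked by others.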
Readable and Easy to Pick Up
The second reason is that Python is a highly readable programming language whose code reads somewhat like English. This makes Python easier to read and write, and less daunting for beginners. As such, even someone with no formal IT or programming education can learn Python, especially with the help of structured guidance in the form of courses.
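The claim that Python reads somewhat like English can be seen in a short sketch such as the one below. The customer names and the `blocked` list are hypothetical and exist only to show the syntax.

```python
# Hypothetical customer names, for illustration only
customers = ["Aisha", "Ben", "Chloe"]
blocked = ["Ben"]

approved = []
for name in customers:
    # Reads almost like a sentence: "if name not in blocked"
    if name not in blocked:
        approved.append(name)

print(approved)
```

Even without any programming background, most readers can guess what this code does: it goes through each customer and keeps the ones who are not on the blocked list.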
Where can we pick up Python and Data Science?
Those who are disciplined can pick up Python and Data Science from free courses available on YouTube and major MOOC platforms. However, the downside of self-learning is that it is difficult to know which courses or topics to focus on, and there is no mentor to help clarify doubts.
To this end, those seeking structured guidance can check out the courses provided by Heicoders Academy, a leading tech education academy in Singapore. It has a series of courses aimed at equipping learners with Python programming skills and, subsequently, machine learning (a key topic in data science).
Heicoders Academy also offers complimentary career mentorship (resume planning and career planning customized to each learner’s profile) for those who have graduated from AI200.