Are you an aspiring Data Scientist?
SUBHASH CHANDRA (MBA, PMP, M-Tech)
Project Management | Stakeholders Management | Consultant | Data Enthusiast | Helping Businesses for sustainable growth With Innovative Approaches | PhD Candidate
Samantha works as an intern at a multidisciplinary firm whose core function is to predict the outcomes of different scenarios and advise its stakeholders. The company is very busy and needs a data scientist who can take on new projects with minimal supervision, so it decided to hire one. All the responsibilities and job functions of the position were listed in the advertisement. Samantha received a large number of applications. She carefully shortlisted the deserving candidates and put them into four categories.
Category A: Candidates who had an excellent mathematical background only. Most of them held PhD degrees and had strong academic records.
Category B: Candidates who had an excellent computer programming background, with several years of programming experience.
Category C: Candidates who had worked extensively in public relations, where their job was to communicate with stakeholders on a day-to-day basis. These candidates had more than 20 years of experience in their field and a reasonable mathematical background, but little or no programming experience.
Category D: Candidates who had a good mathematical and statistical background, reasonable programming experience, and the ability to present their project work to stakeholders with confidence.
She presented the candidate list to her boss so he could decide which category of candidates she should call for interview. Her boss looked through the list and the candidates' profiles carefully and asked Samantha to call a few candidates from Category D. She was surprised and asked why he had not chosen candidates from Category A, B or C. Her boss gave her a Venn diagram and asked her to study it carefully.
Looking at the Venn diagram, she was convinced that Category D candidates were the right ones to fill the Data Scientist position.
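The logic of the Venn diagram can be sketched with Python sets. This is only an illustration: the skill labels and the mapping of each category to skills are a simplified reading of the category descriptions, not data from the story.

```python
# Skills the story associates with the data scientist position.
# The label names are assumptions made for this illustration.
REQUIRED = {"math_stats", "programming", "communication"}

# Simplified (assumed) reading of the four shortlisted categories.
CATEGORIES = {
    "A": {"math_stats"},
    "B": {"programming"},
    "C": {"communication", "math_stats"},
    "D": {"math_stats", "programming", "communication"},
}


def missing_skills(skills):
    """Return the required skills that a category does not cover."""
    return REQUIRED - skills


def qualifying_categories():
    """Return the categories whose skills cover every required skill."""
    return sorted(name for name, skills in CATEGORIES.items() if REQUIRED <= skills)


if __name__ == "__main__":
    for name, skills in CATEGORIES.items():
        print(name, "is missing:", sorted(missing_skills(skills)))
    print("Covers everything:", qualifying_categories())
```

Only the category that sits in the overlap of all three circles covers every required skill, which is the point the boss was making.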
Data scientists work closely with business stakeholders to understand their goals and determine how data can be used to achieve them. They design data modeling processes, create algorithms (which requires a mathematical and statistical background) and predictive models (which requires a programming background) to extract the data the business needs, then help analyze the data and share insights with peers (which requires good communication skills). While each project is different, the process for gathering and analysing data generally follows these steps:
- Ask the right questions to begin the discovery process.
- Acquire data.
- Process and clean the data.
- Integrate and store data.
- Initial data investigation and exploratory data analysis.
- Choose one or more potential models and algorithms.
- Apply data science methods and techniques, such as machine learning, statistical modeling, and artificial intelligence.
- Measure and improve results.
- Present final results to stakeholders.
- Make adjustments based on feedback.
- Repeat the process to solve a new problem.
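A few of the steps above (acquire, clean, choose a model, measure, present) can be sketched as a toy pipeline. Everything here is hypothetical: the records are invented for illustration, and a straight-line least-squares fit stands in for whichever model a real project would choose.

```python
from statistics import mean


def acquire():
    """Acquire data. Hypothetical records: (hours_studied, exam_score).

    None marks a missing value. These numbers are made up for the example.
    """
    return [(1, 52), (2, 55), (3, None), (4, 61), (5, 64), (None, 70)]


def clean(records):
    """Process and clean the data: drop records with a missing field."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def fit_line(points):
    """Choose and apply a model: ordinary least-squares straight line."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in points) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_abs_error(points, slope, intercept):
    """Measure results: average absolute gap between data and the fitted line."""
    return mean(abs(y - (slope * x + intercept)) for x, y in points)


def run_pipeline():
    """Run the steps in order and return a summary to present to stakeholders."""
    data = clean(acquire())
    slope, intercept = fit_line(data)
    return {
        "n": len(data),
        "slope": slope,
        "intercept": intercept,
        "mae": mean_abs_error(data, slope, intercept),
    }


if __name__ == "__main__":
    print(run_pipeline())
```

In practice each function would be replaced by something far more substantial, and the feedback and repeat steps would wrap around the whole pipeline.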
So the question arises: can someone from any background or experience become a data scientist? The answer is 'Yes'!
Let's take the example of a person from a geoscience background who has experience as a seismologist. If that person plans to change career and become a data scientist, what attributes does he or she need to fit the role?
Seismologists are natural data scientists. They apply data science knowledge and a range of scientific techniques to earthquake (EQ) prediction.
Some of the job functions of a seismologist are as follows:
Data Acquisition: Seismologists acquire large amounts of data from past and/or ongoing EQ activity in different formats (structured and unstructured).
Pre-Processing: EQ data are recorded in digital or analogue format. They are then processed further, by digitizing analogue records or by filling gaps where data are unavailable, using different pre-processing techniques.
Processing: Depending on the question to be addressed with the EQ data, seismologists apply various processing techniques, using algorithms based on non-linear mathematical equations, either to make predictions or to form hypotheses about the past from current data (back extrapolation). For example, using a large amount of local EQ data, future EQs can be predicted approximately using the focal mechanism process. Earth science requires more data and large computing capabilities to predict future EQs accurately, so that major disasters can be mitigated. Seismologists also advise disaster management teams for better planning.
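As a small illustration of the gap-filling mentioned under Pre-Processing, here is a minimal sketch that fills missing samples in an evenly sampled series by linear interpolation. Linear interpolation is an assumption chosen for simplicity, not necessarily what a seismologist would use, and the function name `fill_gaps` is hypothetical.

```python
def fill_gaps(samples):
    """Fill None entries by linear interpolation between known neighbours.

    Assumes evenly spaced samples. Gaps that touch the start or end of the
    series are left as None instead of being extrapolated.
    """
    filled = list(samples)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i - 1  # index of the last known sample before the gap
        end = i
        while end < n and filled[end] is None:
            end += 1  # after the loop, end is the first known sample after the gap
        if start >= 0 and end < n:
            step = (filled[end] - filled[start]) / (end - start)
            for k in range(i, end):
                filled[k] = filled[start] + step * (k - start)
        i = end
    return filled


if __name__ == "__main__":
    # Hypothetical amplitude readings with two gaps.
    print(fill_gaps([0.0, None, None, 3.0, None, 5.0]))
```

Real records would usually call for more careful methods, since a straight line across a long gap can hide exactly the signal of interest.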
Summary:
In summary, anyone who is looking for a career change and aspires to be a data scientist should analyze his or her attributes and work on the skill set described above wherever gaps exist. Most reputable organizations are always ready to hire data scientist candidates who have these skills.