Applied Data Science

Having spent several months applying data science to business problems, I have realized that unified material on the subject is scarce. This is largely because the applications and tools are diverse and often customized to the specific requirements of different industries. Since the field is still in the making, new content is continuously being developed. Below, I summarize the key decision and implementation steps required to achieve business outcomes with data science. Comments are welcome.

1. Business-requirement-driven data analysis vs. data-driven business analysis: It is a common trap to jump straight into analysis based on whatever data and models are available instead of first identifying the business requirements. Developing an appropriate hypothesis should be the starting point.

2. Right model selection: After identifying the organizational needs, it is equally important to select an appropriate model for the analysis. Sometimes multiple models are selected. Two main categories are regression (predicting a numeric value) and classification (predicting a category).

3. Choosing dataset/proxy: Dataset selection goes hand in hand with model selection. If a data point that a model requires is not available, proxy data should be used. At times, data availability can dictate the choice of model.

4. Algorithm selection: Depending on data quality and available computational resources, an algorithm is selected to train the model. Algorithms differ in how they trade off bias and variance, and some are prone to over-fitting the training data set.

5. Model evaluation: Evaluating model performance is essential for judging whether the model is useful for the original business requirement. Common performance measures include gain, lift and ROC curves, and the confusion matrix.

6. Timely re-validation of model and underlying assumptions: A stellar model becoming irrelevant as the business environment changes is more common than one might imagine. Hence, the suitability of the assumptions and the effectiveness of the model should be tested periodically.

7. Stakeholder communication and change management: Having gone through several change management experiences driven by data analysis, I can testify that implementing insights requires a well-thought-out communication strategy, a roadmap and operational acumen.
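
To make the evaluation step (point 5) a little more concrete, here is a minimal sketch of three of the measures mentioned there for a binary classifier: the confusion matrix, the area under the ROC curve, and lift. It uses only the Python standard library. The function names, the toy labels and scores, and the 0.5 decision threshold are all my own assumptions for illustration, not a prescribed method; in practice a library such as scikit-learn would normally be used.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the share of
    (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def lift_at(y_true, scores, fraction):
    """Positive rate in the top-scored fraction divided by the overall rate."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    k = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Hypothetical toy data: 1 = customer responded, 0 = did not respond.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.35, 0.3, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("tp, fp, fn, tn:", confusion_matrix(y_true, y_pred))
print("ROC AUC:", round(roc_auc(y_true, scores), 3))
print("Lift in top 50%:", round(lift_at(y_true, scores, 0.5), 2))
```

A lift above 1 in the top-scored group means the model concentrates responders better than random targeting would, which is usually the figure that ties most directly back to the original business requirement.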

Image source: pexels


