12 tricks to crack your data science problems
1. Understand the business use case :
We are often part of projects that jump straight into planning or implementation without pausing to think about the goals and business drivers of the organization. With AI and data science projects, it is imperative to think through the goals first. For example, if your goal is customer retention, then reviewing your customer service logs for insights into customer feedback would be helpful.
Starting with these mission-critical attributes goes a long way toward a successful implementation:
- Business drivers and mission of the organization
- Current customers and their feedback
- Portfolio of products and services
- Success criteria for the project
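For the customer-retention example above, a first pass over service logs can be as simple as counting how often recurring complaint themes come up. The sketch below is a rough illustration only: the theme names, keyword lists, and log lines are all made up, and a real project would likely need proper text processing rather than keyword matching.

```python
import re
from collections import Counter

# Hypothetical themes and keywords, chosen only for illustration.
THEMES = {
    "billing": ["invoice", "charge", "refund", "billing"],
    "reliability": ["outage", "crash", "slow", "down"],
    "support": ["wait", "response", "agent"],
}


def count_themes(log_lines, themes=THEMES):
    """Count how many log lines mention at least one keyword of each theme."""
    counts = Counter()
    for line in log_lines:
        # Lowercase the line and split it into whole words.
        words = set(re.findall(r"[a-z]+", line.lower()))
        for theme, keywords in themes.items():
            if words & set(keywords):
                counts[theme] += 1
    return counts


# Made-up customer service log lines.
logs = [
    "I was charged twice and still have no refund",
    "The app is slow and had an outage on Monday",
    "Waited an hour for an agent to respond",
    "Second wrong invoice this quarter",
    "The checkout page is down again",
]

theme_counts = count_themes(logs)
for theme, count in theme_counts.most_common():
    print(f"{theme}: {count} of {len(logs)} tickets")
```

Even a crude tally like this gives the team something concrete to discuss with the business when agreeing on success criteria.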
2. Think broader about the solution :
Whenever we think of a project, we tend to think in terms of the trio of time, money, and scope. Thinking only in terms of these project management factors often leads to contrived solutions.
I suggest thinking more broadly, as if you had no constraints of time, money, scope, or resources. Then narrow the solution down by adding constraints one at a time. This encourages innovative thinking and brings about inventive solutions.
3. Deal with data the right way
How do you think about the data? Do you think of a spreadsheet with columns, rows, and cells? Or do you think of a database with multiple data sources?
Whatever the format of your data, its role in your solution is crucial, because machine learning algorithms detect patterns in it.
Artificial intelligence depends on data mining methods that extract useful information from large data sets, and data is the driver of the algorithm.
The 5 Vs of data (Value, Velocity, Variability, Veracity, and Volume) will determine the solution that you employ.
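As a rough illustration of how a few of these Vs can be checked early, the sketch below profiles a handful of records for row count (volume), missing values (a veracity signal), and distinct values per field (a variability signal). The function name, field names, and sample records are hypothetical, and value and velocity are not covered here because they depend on the business context and the data pipeline.

```python
def profile_records(records):
    """Summarize a list of dict records: row count, plus missing and
    distinct value counts for each field."""
    fields = sorted({key for row in records for key in row})
    profile = {"volume": len(records), "fields": {}}
    for field in fields:
        values = [row.get(field) for row in records]
        # Treat None and empty strings as missing.
        present = [v for v in values if v is not None and v != ""]
        profile["fields"][field] = {
            "missing": len(values) - len(present),  # rough veracity signal
            "distinct": len(set(present)),          # rough variability signal
        }
    return profile


# Made-up sample data, for illustration only.
sample = [
    {"customer_id": 1, "plan": "basic", "monthly_spend": 20.0},
    {"customer_id": 2, "plan": "pro", "monthly_spend": None},
    {"customer_id": 3, "plan": "basic", "monthly_spend": 35.5},
    {"customer_id": 4, "plan": "", "monthly_spend": 20.0},
]

report = profile_records(sample)
print("rows:", report["volume"])
for name, stats in report["fields"].items():
    print(name, stats)
```

A profile like this is cheap to run before any modeling and often surfaces data quality questions to take back to the business users.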
4. Plan with human-centric design :
One of the things that separates machine learning from traditional software development is that we are coding for the likelihood of an outcome based on past outcomes. Human-centric design is therefore crucial.
For example, when we recently designed and developed an intelligent bot for a doctor's office, we took time to meet with the doctor and identify their customer personas, the types of issues they address, and the outcomes they achieve for their patients.
We walked through the steps of prospecting, customer on-boarding, and finally customer results. This helped us create an intelligent agent that feels seamless to customers and holds a conversation that flows naturally.
Other things to keep in mind:
- What metrics are you measuring?
- What are the inputs and outputs of this system and where do they come from?
- What is the ROI?
5. Engage business users during the process
This seems self-explanatory, but you would be surprised at how often we forget to do it. We might be the experts at implementing machine learning algorithms, gathering insights from data, and implementing complex technologies, but the business users are the subject matter experts.
If you are working in the healthcare sector, you would spend a lot of time with the physicians and other team members so that they can explain the business process, what the data elements mean, and how they measure outcomes.
While technologists can work across diverse industry sectors, it is the business users who know their industry, their metrics, and their data.
6. Understand the process workflow:
Today's organizations are running a mile a minute and often do not discuss, document, or debate their business processes.
Walking through the workflow with them usually leads to interesting discoveries for the business, since it might be the first time that they are thinking through their processes.
For example, if the algorithm is used for medical image analysis and diagnostics, useful questions to discuss with the business users include when the images are taken, what steps they go through before arriving at a physician's desk, and what determines the diagnosis. In this case, the business users are the radiologists, the physicians, and other technicians.
7. Simple pilot implementation :
Enterprises have several legacy systems that have been brought together over the years and form a complex web that makes the data difficult to understand.
You can overcome this complexity of interconnected legacy systems with an assessment of the current technology infrastructure. A technology architecture gap evaluation can help all stakeholders understand what systems are in place, the synergies between these systems, and what meaningful data exists.
Armed with this knowledge and a clearer understanding of the successes and shortcomings of current operations, you can develop small pilot initiatives that can be validated before you proceed with an overall solution.
8. Break departmental silos :
In large organizations, the left hand usually does not talk to the right. Bringing multiple departments, teams, and sometimes vendors together helps break down the silos and moves you towards a holistic solution.
Reaching out to relevant stakeholders to ensure that misunderstandings or corporate policies don’t impede successful execution is a must.
Ensuring that all parties involved are informed about the inputs, outputs, and the success criteria will lead to an effective and efficient solution.
9. Select the right tool and the algorithm :
- Select the language: R or Python
- Select and research the algorithm: You can find plenty of open-source implementations of algorithms that you can code review, diagram, internalize and reimplement in another language.
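Reimplementing a small algorithm is a practical way to internalize it. As one possible illustration in Python, here is a minimal k-nearest-neighbors classifier written from scratch. The choice of algorithm, the function name, and the sample points are assumptions made for this sketch, not something the list above prescribes.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Predict a label for `query` by majority vote among the k closest
    training points, using Euclidean distance."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up two-cluster data, for illustration only.
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["a", "a", "a", "b", "b", "b"]

near_first_cluster = knn_predict(points, labels, (1.5, 1.5))
near_second_cluster = knn_predict(points, labels, (6.5, 6.5))
print(near_first_cluster, near_second_cluster)
```

Comparing a small version like this against an established open-source implementation on the same inputs is a quick way to check your understanding of the algorithm.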
10. Keep updated about the latest tools :
Machine learning and data science are evolving fields: every day new discoveries are made, new tools are introduced, and more open-source projects become available for reference.
Research publications, books, blogs, and GitHub repositories can help you stay on top of the learning curve and avoid rework in the long run.
11. Unit test :
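Always test to make sure your machine learning model works. The easiest way to do this is to deliberately overfit the model on a small subset of your data. If the model cannot memorize a handful of examples, something in the training code or data pipeline is probably broken. This is a quick check that the implementation is sound, not a measure of how well the model will generalize.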
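The overfitting check described above can be sketched as follows. This is a minimal example under stated assumptions: the model is a plain logistic regression trained with batch gradient descent in pure Python, the eight data points are made up and linearly separable, and the learning rate and epoch count are arbitrary choices that work for this toy data.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, learning_rate=1.0, epochs=5000):
    """Fit a logistic regression with plain batch gradient descent."""
    n_features = len(features[0])
    n = len(features)
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            error = sigmoid(score) - y
            for j in range(n_features):
                grad_w[j] += error * x[j]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def accuracy(weights, bias, features, labels):
    """Fraction of examples whose predicted class matches the label."""
    correct = 0
    for x, y in zip(features, labels):
        score = sum(w * xi for w, xi in zip(weights, x)) + bias
        correct += int((1 if score > 0 else 0) == y)
    return correct / len(labels)


# Tiny made-up subset: four examples per class.
small_x = [(0.0, 0.2), (0.3, 0.1), (0.2, 0.4), (0.1, 0.0),
           (1.0, 0.9), (0.8, 1.1), (1.2, 0.7), (0.9, 1.0)]
small_y = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = train_logistic(small_x, small_y)
train_accuracy = accuracy(weights, bias, small_x, small_y)

# The sanity check: the model should be able to memorize this subset.
assert train_accuracy == 1.0, "model could not fit a tiny subset; check the training code"
```

If the assertion fails on data this simple, the bug is more likely in the gradient computation, the label encoding, or the data loading than in the choice of model.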
You can also use test-driven model development, which helps you test in small modules. Tests can be written for functions and methods, whole classes, programs, web services, whole machine learning pipelines, neural networks, random forests, mathematical implementations, and more.
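As a small illustration of testing one module at a time, the sketch below uses Python's built-in `unittest` to test a single preprocessing step. The `min_max_scale` function is a hypothetical helper written for this example. In a test-driven workflow the three tests would be written first and the function filled in until they pass.

```python
import unittest


def min_max_scale(values):
    """Scale numbers to the range [0, 1]; a constant column maps to zeros.
    Raises ValueError on empty input (from min() on an empty list)."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]
    return [(v - low) / (high - low) for v in values]


class MinMaxScaleTest(unittest.TestCase):
    def test_scales_to_unit_range(self):
        self.assertEqual(min_max_scale([10, 20, 30]), [0.0, 0.5, 1.0])

    def test_constant_column_maps_to_zeros(self):
        self.assertEqual(min_max_scale([5, 5, 5]), [0.0, 0.0, 0.0])

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            min_max_scale([])


# Run the tests programmatically so the script does not exit afterwards.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(MinMaxScaleTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The same pattern extends to larger units, for example asserting that a full pipeline returns one prediction per input row.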
12. Present business results in a business language :
Almost all business users are concerned with the business outcomes rather than the magic behind the scenes (in this case, the machine learning models). Good business presentations show the representative data elements that were used (e.g., customer engagement, customer feedback, customers' buying patterns) and the outcomes the model provided. Also, make sure to include the accuracy levels, since AI models produce predictions rather than exact results.
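One way to report accuracy levels in business language is to turn the usual metrics into plain sentences. The sketch below assumes a churn-prediction scenario with made-up labels for ten customers. The function name, the label values, and the wording of the summary are illustrative choices only.

```python
def summarize_for_business(actual, predicted, positive_label="churned"):
    """Compute accuracy, precision, and recall, then phrase them plainly."""
    pairs = list(zip(actual, predicted))
    correct = sum(1 for a, p in pairs if a == p)
    true_pos = sum(1 for a, p in pairs if a == p == positive_label)
    flagged = sum(1 for _, p in pairs if p == positive_label)
    actual_pos = sum(1 for a, _ in pairs if a == positive_label)

    accuracy = correct / len(pairs)
    precision = true_pos / flagged if flagged else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0

    summary = (
        f"The model was right for {correct} of {len(pairs)} customers "
        f"({accuracy:.0%}). Of the {flagged} customers it flagged as likely "
        f"to leave, {true_pos} actually left ({precision:.0%}). It caught "
        f"{true_pos} of the {actual_pos} customers who left ({recall:.0%})."
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "summary": summary}


# Made-up outcomes for ten customers, for illustration only.
actual = ["churned", "churned", "stayed", "stayed", "churned",
          "stayed", "stayed", "stayed", "churned", "stayed"]
predicted = ["churned", "stayed", "stayed", "stayed", "churned",
             "churned", "stayed", "stayed", "churned", "stayed"]

metrics = summarize_for_business(actual, predicted)
print(metrics["summary"])
```

Phrasing the results as counts of customers keeps the focus on the business outcome while still being honest about how often the predictions are wrong.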
Graph Database Ultra-Enthusiast
5 years ago: I loved the article, especially the mind map of machine learning algorithms in item 9. I am going to save that off as a handy reference. I would also recommend adding Spark MLlib to your list of tools. R is great but cumbersome and slow. Python is flexible but slow. Spark MLlib is blazingly fast. Yes, you can wrap Spark MLlib and access it with Python, but it is better with Scala, or even Java or Kotlin. I would also recommend adding an item to the article about choosing the right tool to feed your algorithms. Graph databases, like Neo4j, offer the opportunity to provide a much richer data input to machine learning algorithms than other data sources. Even before passing data into a machine learning algorithm, one can run a variety of clustering, community detection, classification, and natural language processing algorithms on the data to further enrich it. If you want to turn feature engineering from a chore into a joy, just try doing feature engineering from a graph. Wow! You will love it!
Graph Database Ultra-Enthusiast
5 years ago: Very nice article! I love it!
Helping people ?? Graph Analytics | B2B Marketing | Advisor | Author & Speaker // Want to understand why seeing connections matter? Let's talk.
5 years ago: Nice article Swathi! Your first few points tie in with something I've been mulling over a lot lately: how to make sure you have a clean, responsible "AI supply chain." I'd love to chat over ideas with you sometime.