Machine Learning and NLP, a practical example
Machine learning programs are opening up a new world of analysis, especially when paired with natural language processing (NLP). This post examines a simple example solution and describes how you can get started building solutions of your own.
Knowing when to use machine learning and NLP
There are generally two reasons to choose a machine learning solution over a more traditional approach. The first is when the “rules” for a particular problem are not well known. For these problems, even when requirements are defined they never seem complete enough, usually because some aspect of the data is always changing. The second is when the solution requires a high degree of automation. For solutions that require continuous analysis and the capability to take action without human intervention, machine learning is a much better approach.
Often machine learning libraries and analysis are not enough to provide real business value by themselves. This is why it is useful to pair machine learning with natural language processing, for example through a library such as the Natural Language Toolkit (NLTK). Natural language processing can be used to build solutions by allowing the computer to “parse” human language. It’s important to note that computers do not understand the words; they simply break down language structures into logical structures that can be manipulated. It turns out that these logical structures are perfect for machine learning. The programmer can use these structures to specify a set of “goals” and then train the machine learning model to find solutions using them. Let’s take a closer look.
Focusing on tools
The tools you choose matter when building machine learning programs that solve real-world problems. Generally there are three important choices to make before getting started. The first is the programming language. In this example I’ll be using Python, which is a solid choice, especially for pure machine learning projects. Python lets you write programs with minimal “fuss” (which is why it is often recommended to beginners), but it also has an incredibly deep set of libraries that give you the flexibility to handle most programming tasks.
The next choice is the programming environment. For Python there are many options, but I recommend using something like Google’s Colaboratory. It is based on an open source project called Jupyter Notebook, which lets you break your program into separate cells and run them one at a time. This is a great fit for machine learning because it allows you to check the state of your program at each step.
The final choice is the machine learning library. This library is critical because it will handle the computational side of the project. For a simple example like this one, the choice of library is mostly a matter of convenience, so the program uses TensorFlow, which is available on Google’s Colaboratory platform without any additional setup.
All of the tools used here for development are free. However, keep in mind that once your machine learning program is running at a large scale, you will need to migrate it from a free development environment to a production environment with high computing capabilities. These production environments are costly due to the intensive computation involved.
A practical example, routing of support issues
Support functions are critical to businesses because the support team is often a primary way that customers interact with the business. However, problems start to arise when support is broken apart into various groups. For this example let’s assume there are three groups: the security group (I can’t log in. Please help), the account group (When is the next account payment due?), and the technical group (Help I got an error when I clicked this button!). When a new issue is created, the program needs to identify the problem in the issue and route it to the correct group as fast as possible.
Before diving into a machine learning solution, let’s quickly check this problem against the two reasons mentioned earlier in this post. The first reason was that the rules are not well known. In the case of routing support issues, knowing the rules means knowing all the ways a customer will ask for support. If I were constrained to traditional approaches, I would need to limit the input variations. Basically, this means making the customer fill out a form. Yuck! The 90’s called and they want their online forms back. Let’s assume this is not desirable and that our program needs to handle direct questions from customers. The second reason was a high degree of automation. For this example, the routing should happen as fast as possible in order to expedite each issue. It needs to happen in “realtime”, or at least in a time frame that meets customer expectations, and on an ongoing basis.
Based on this, both reasons appear satisfied. Time to dive into machine learning!
Building the program
First let’s create a “model”, which serves as the basis of the machine learning algorithm. Training the initial model requires some base knowledge, so we need to start with a small set of sample issues, each labeled with the route it should take. In this example I'm going to make up these sample issues; in a program designed for real use, you'll want to use real issues.
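As a rough sketch of what such a starting set might look like, the snippet below pairs a few made-up issues with route labels. The names `SAMPLE_ISSUES`, `ROUTES`, and `count_by_route` are hypothetical, and every sentence beyond the three quoted earlier in this post is invented for illustration.

```python
# Made-up sample issues as (issue text, route label) pairs.
# A real program would load real, already-routed issues instead.
SAMPLE_ISSUES = [
    ("I can't log in. Please help", "security"),
    ("I forgot my password and need to reset it", "security"),
    ("When is the next account payment due?", "account"),
    ("How do I update the billing details on my account?", "account"),
    ("Help I got an error when I clicked this button!", "technical"),
    ("The page shows an error every time I upload a file", "technical"),
]

# The distinct routes the model will learn to predict.
ROUTES = sorted({route for _, route in SAMPLE_ISSUES})


def count_by_route(issues):
    """Count how many sample issues exist for each route."""
    counts = {}
    for _, route in issues:
        counts[route] = counts.get(route, 0) + 1
    return counts


route_counts = count_by_route(SAMPLE_ISSUES)
```

Checking `route_counts` before training is a cheap way to spot a lopsided sample set, since a route with very few examples is unlikely to be predicted well.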
Next our natural language processing library needs to parse the sample sentences so that the model can determine which route to take. Initially this example uses a simple “Bag of Words” approach, which only counts the words in each issue, but it could be developed into a more complex program using additional components such as grammar and context.
The next step is to specify our model and train it. It’s important to keep in mind that machine learning is an iterative approach that requires retraining with more data over time. To make sure the model continues to make good predictions about new data, create a test data set that is separate from the training set. Having a test data set allows detection of “overfitting”. With natural language processing, overfitting often shows up when the program starts making predictions based on common linking words (words such as “to, the, in” are not useful in determining the support routing) instead of signal words (in this example, words such as "password, account, error").
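The steps above can be sketched in plain Python. This is not the TensorFlow model used in this post: to keep the sketch runnable without any setup, it swaps in a small naive Bayes classifier as a stand-in. The stop word list, the function names, and all sample sentences are assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Common linking words that carry no routing signal (an illustrative list).
STOP_WORDS = {"a", "an", "and", "i", "in", "is", "it", "my", "on", "the", "to", "when"}


def bag_of_words(text):
    """Lowercase the text, split it into words, and count the non-stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in STOP_WORDS)


def train(examples):
    """Collect per-route word counts from (text, route) pairs."""
    word_counts = {}          # route -> Counter of words seen for that route
    route_counts = Counter()  # route -> number of training issues
    for text, route in examples:
        word_counts.setdefault(route, Counter()).update(bag_of_words(text))
        route_counts[route] += 1
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, route_counts, vocabulary


def predict(model, text):
    """Return the route with the highest naive Bayes log-probability."""
    word_counts, route_counts, vocabulary = model
    total_issues = sum(route_counts.values())
    bag = bag_of_words(text)
    best_route, best_score = None, -math.inf
    for route in sorted(route_counts):
        score = math.log(route_counts[route] / total_issues)
        total_words = sum(word_counts[route].values())
        for word, count in bag.items():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids log(0) for unseen pairs.
            p = (word_counts[route][word] + 1) / (total_words + len(vocabulary))
            score += count * math.log(p)
        if score > best_score:
            best_route, best_score = route, score
    return best_route


def accuracy(model, examples):
    """Fraction of examples whose predicted route matches the label."""
    correct = sum(predict(model, text) == route for text, route in examples)
    return correct / len(examples)


TRAINING_SET = [
    ("I can't log in. Please help", "security"),
    ("I forgot my password and need to reset it", "security"),
    ("When is the next account payment due?", "account"),
    ("How do I update the billing details on my account?", "account"),
    ("Help I got an error when I clicked this button!", "technical"),
    ("The page shows an error every time I upload a file", "technical"),
]

# Held-out issues that the model never sees during training.
TEST_SET = [
    ("Please reset my password", "security"),
    ("My account payment failed", "account"),
    ("I see an error on the login page", "technical"),
]

model = train(TRAINING_SET)
train_accuracy = accuracy(model, TRAINING_SET)
test_accuracy = accuracy(model, TEST_SET)
```

Comparing `train_accuracy` with `test_accuracy` is the overfitting check: a training score far above the test score suggests the model has memorized its samples rather than learned the signal words. With a data set this tiny the numbers mean very little, so treat them as a demonstration of the workflow only.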
Taking action on the predictions
Now that the model is able to predict routes, let’s put these predictions to use by adding the prediction data to our support issues. A simple custom field called “predictions” can store our route names (security, account, technical). Now, instead of a person reading the issue and then manually routing it, the prediction field can be used to determine how routing should happen. This assumes your issue tracking system can route based on fields, but even if it can’t, our Python program can be used to fill any gaps (Python programs can call remote APIs or even send email!). During normal operation the model won’t be changing; however, over time this version of the model will become unreliable. When this happens you can simply add any newly misrouted issues to your training set and create a new model. Over time your model will become better and better. Taking an iterative approach is a critical piece of having success with tools that use machine learning.
Hopefully, if you have followed this post and its example, it is clear how machine learning can be used effectively with natural language processing. And this is just scratching the surface! With this set of tools, you can build solutions to problems that were previously thought impossible to solve.
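To make this a little more concrete, here is a minimal sketch of filling the “predictions” field, picking a queue from it, and collecting misrouted issues for the next retraining. The issue layout (`id`, `description`), the queue names, and the `placeholder_predict` function are all hypothetical; a real program would call the trained model and the API of your issue tracking system instead.

```python
# Hypothetical mapping from a predicted route to a support queue name.
QUEUES = {
    "security": "security-team",
    "account": "account-team",
    "technical": "technical-team",
}


def placeholder_predict(text):
    """Stand-in for the trained model; a real program would call the model here."""
    text = text.lower()
    if "password" in text:
        return "security"
    if "payment" in text:
        return "account"
    return "technical"


def apply_prediction(issue, predict_route):
    """Return a copy of the issue with the custom "predictions" field filled in."""
    updated = dict(issue)
    updated["predictions"] = predict_route(issue["description"])
    return updated


def queue_for(issue, default="triage"):
    """Pick a queue from the "predictions" field, falling back to manual triage."""
    return QUEUES.get(issue.get("predictions"), default)


def record_misroute(training_set, issue, correct_route):
    """Store a misrouted issue with its corrected route for the next retraining."""
    training_set.append((issue["description"], correct_route))


issue = {"id": 101, "description": "I forgot my password"}
routed = apply_prediction(issue, placeholder_predict)
queue = queue_for(routed)

# Later, a support agent reports that an issue landed in the wrong group.
new_training_examples = []
misrouted = {"id": 102, "description": "I was charged twice this month"}
record_misroute(new_training_examples, misrouted, "account")
```

Falling back to a manual triage queue when the field is missing or holds an unknown route keeps an issue from getting lost if the prediction step fails.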
References
- Tensorflow Text Classification – Python Deep Learning - https://sourcedexter.com/tensorflow-text-classification-python/
- Text Classification with TensorFlow Estimators - https://ruder.io/text-classification-tensorflow-estimators/
- When to Use Machine Learning - https://docs.aws.amazon.com/machine-learning/latest/dg/when-to-use-machine-learning.html
Copyright (c) Matthew Jackowski, 2018