Automate Classification with Machine Learning
Machine learning is an exciting field, and while it carries risks (what if the computer makes the wrong decision?!), it has the potential to free up a significant amount of human resources. Specifically, in the field of trade compliance and supply chain management, machine learning offers tremendous value in several key areas. Product classification is one area that can no longer be ignored.
Background
Artificial Intelligence (AI) is a term that’s been used for decades. As far back as the 1950s, science fiction shows frequently referred to it, even though it was little more than a concept at the time. Years later, IBM challenged the world’s greatest chess player to a match against its computer’s artificial intelligence. It was an impressive leap forward, but hardly the stuff of science fiction novels or movies where robots take control of mankind! Thankfully, we are not yet the subjects of a cruel robot army, but AI has come a long way.
A related term used lately is Machine Learning. There is confusion over whether machine learning is the same as AI, a separate phenomenon, or a “subset” of AI. For our purposes, machine learning means a computer system that learns from experience and makes predictions based on that learning.
Why is machine learning needed for product classification?
Before we continue, we should clarify what we mean by “Product Classification”. In the fields of supply chain management and trade compliance, goods typically need to be classified according to one or more classification schemes. For example, in the United States, to determine the landed cost of an imported good, you will need its tariff classification for customs purposes under the Harmonized Tariff Schedule (HTS). Similarly, when exporting, you typically need to determine the Export Control Classification Number (ECCN), which will help you determine if a license or permit is required for the export. There are countless other examples of product classification within the supply chain, such as dangerous goods classifications or freight commodity classification codes, but for now, we are referring specifically to HTS and ECCN classification.
Focusing on HTS and ECCN classification, why is machine learning important, and why are current processes and systems inadequate? To put it simply, these classifications are complex: they involve massive lists of classification options and generally require both significant time and expertise from the user. On top of the difficulty of classifying a single product, most companies have thousands of products or more that require a classification. This means one thing to a business: classification can be expensive to do correctly and thoroughly. As a brief aside, many companies do not classify all subcomponents for this reason. While that decision may save time in the short term, it can lead to unexpected costs, because the company cannot take full advantage of Free Trade Agreements (FTAs) that require a tariff shift at the component level.
Beyond the cost of manual classification, there is a risk element to consider.
Take HTS for example, the classification chosen dictates, among other things:
Importers who fail to assign the right HTS code risk significant Customs penalties if the error is discovered under audit. ECCN, if anything, is even more serious. Since the ECCN dictates whether an export license is needed, selecting the wrong ECCN can result in export regulation violations, which carry punishments up to and including prison time.
An automated solution offers the opportunity to be more efficient, accurate, and consistent than manual classification. Taking into consideration the risks cited above, machine learning has the potential to greatly reduce your cost, and increase your compliance level if it’s done properly.
Application of Machine Learning
How exactly would machine learning work for classification? To understand this, it helps to recognize that a human manually classifying products operates much like machine learning. We consider a vast number of data elements related to the product, and assign a classification based on past feedback and experience.
To illustrate, let’s imagine a situation unrelated to HTS or ECCN, where you are asked to determine if the small furry animal before you is a cat or a dog. Immediately you begin recalling which traits of a cat or dog are distinct: cats have tails, but so do dogs. Cats have retractable claws, dogs do not. Cats have whiskers, dogs do not. Dogs often have long snouts, cats never do.
That’s probably enough for most situations. Let’s review:
Our small furry animal has a short snout, retractable claws, and whiskers.
I think most people would decide, despite the short snout, that this is probably a cat. Now, imagine the next thing that happens is: a passerby says, “Oh my it’s a rare Whatsadoodle Dog! Did you know they are the only dog with retractable claws and whiskers?!”
Well, now you would reconsider your classification, and file in memory that short-snouted, retractable-clawed, whiskered dogs are called Whatsadoodles! You would also likely start checking other data elements to distinguish this rare dog from a cat (maybe see if it can land on its feet?).
Machine learning for classification works just like this: it will consider all the applicable data elements, and determine a classification based on how previous classifications were decided. Now, let’s look at how this process works with a visual aid.
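The cat-or-dog story can be sketched in a few lines of Python. This is only an illustration, assuming a simple "closest known example wins" rule; real classification systems use richer models. The function names and the extra `lands_on_feet` trait (taken from the passing suggestion above) are assumptions made for this sketch.

```python
from collections import Counter

# Traits from the cat/dog story, plus the extra "lands on feet" check.
FEATURES = ("retractable_claws", "whiskers", "long_snout", "lands_on_feet")

def animal(*values):
    """Build a trait dictionary from values given in FEATURES order."""
    return dict(zip(FEATURES, values))

def distance(a, b):
    """Count the traits on which two animals differ."""
    return sum(a[name] != b[name] for name in FEATURES)

def classify(candidate, examples):
    """Return the most common label among the closest known examples."""
    closest = min(distance(candidate, traits) for traits, _ in examples)
    votes = Counter(label for traits, label in examples
                    if distance(candidate, traits) == closest)
    return votes.most_common(1)[0][0]

# What we already "know" from past experience.
examples = [
    (animal(True, True, False, True), "cat"),
    (animal(False, False, True, False), "dog"),
    (animal(False, False, False, False), "dog"),
]

# The mystery animal: claws, whiskers, short snout. We assume for this
# sketch that it does not land on its feet.
mystery = animal(True, True, False, False)
first_guess = classify(mystery, examples)

# Feedback from the passerby: it is a Whatsadoodle dog. Remember that.
examples.append((mystery, "dog"))
second_guess = classify(mystery, examples)

# An ordinary cat is still recognized as a cat after the correction.
house_cat = classify(animal(True, True, False, True), examples)
```

The first guess is "cat" because the known cat is the closest match. After the correction is stored, the same animal comes back as "dog", while the extra trait keeps ordinary cats from being swept up in the change.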
Figure 1 above is a simple example of how machine learning works with classification, using an SAP source system as an example.
Please allow a brief walkthrough of what the above flow means. The first step is the feeding of an existing, pre-classified product database (or multiple databases) into the system. This will include as many data elements as possible about each product. Thinking back to our animal example, you want whiskers, claws, snout, etc. Using SAP as an example, you want as many elements of the material master as you can get, including but not limited to Product Code, Product Description, Material Group, Product Hierarchy, etc. If you can get more data elements beyond what you have in your ERP system, that’s even better. For example, if you use SAP GTS or TM, there may be elements you can draw from there.
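As a rough sketch of what "data elements" might look like once extracted, the snippet below models one product row and turns it into a set of features a model could compare. The class, field names, sample values, and feature format are hypothetical; they are not an SAP API or an actual export layout.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class ProductRecord:
    """One material master row (hypothetical field names)."""
    product_code: str
    description: str
    material_group: str
    product_hierarchy: str
    classification: Optional[str] = None  # None means not yet classified

def to_features(record):
    """Turn a record into a set of comparable feature strings.

    In this sketch the product code is kept as an identifier only and
    is not used as a feature.
    """
    features = {
        f"group={record.material_group}",
        f"hierarchy={record.product_hierarchy}",
    }
    features.update(f"word={word}" for word in record.description.lower().split())
    return features

def split_records(records):
    """Separate pre-classified records from those still to be classified."""
    classified = [r for r in records if r.classification is not None]
    unclassified = [r for r in records if r.classification is None]
    return classified, unclassified

# Placeholder data; "CODE-A" stands in for a real classification code.
records = [
    ProductRecord("P-100", "Steel hex bolt", "FASTENERS", "H-01", "CODE-A"),
    ProductRecord("P-200", "Steel hex screw", "FASTENERS", "H-01"),
]
classified, unclassified = split_records(records)
bolt_features = to_features(classified[0])
```

The more fields each record carries, the more features the model has to tell similar products apart, which is the point of pulling in as many data elements as possible.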
For our process flow, we have divided the data into three broad buckets:
This will allow the machine learning system to create a model, from which future decisions will be made. Following along the process flow, the next step is the introduction of unclassified products.
The system analyses the data elements for those products and compares them using the model. Using this combination of new data and the learned correlation between past data and classifications, it predicts a classification for the new products.
Along with this prediction, it also gives a confidence level. The system can automatically accept high confidence predictions, and add low confidence predictions to a worklist for user review.
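The prediction and confidence step might look roughly like the sketch below. It assumes a deliberately naive model in which each word of a description votes for the codes it has previously appeared with, and confidence is the winning code's share of the votes. The 0.8 threshold, the function names, and the placeholder codes are all assumptions for illustration.

```python
from collections import Counter, defaultdict

def tokenize(text):
    return set(text.lower().split())

def train(classified):
    """Build a word -> Counter(code) index from (description, code) pairs."""
    index = defaultdict(Counter)
    for description, code in classified:
        for token in tokenize(description):
            index[token][code] += 1
    return index

def predict(index, description):
    """Return (code, confidence), where confidence is the winner's vote share."""
    votes = Counter()
    for token in tokenize(description):
        votes.update(index.get(token, {}))
    if not votes:
        return None, 0.0
    code, count = votes.most_common(1)[0]
    return code, count / sum(votes.values())

def route(index, descriptions, threshold=0.8):
    """Auto-accept confident predictions; send the rest to a review worklist."""
    accepted, worklist = [], []
    for description in descriptions:
        code, confidence = predict(index, description)
        target = accepted if confidence >= threshold else worklist
        target.append((description, code, confidence))
    return accepted, worklist

# Placeholder codes, not real HTS or ECCN values.
index = train([
    ("steel hex bolt", "CODE-A"),
    ("steel machine screw", "CODE-A"),
    ("rubber garden hose", "CODE-B"),
    ("rubber hose fitting", "CODE-B"),
])
accepted, worklist = route(index, ["steel hex screw", "steel reinforced rubber hose"])
```

Here "steel hex screw" is accepted automatically because every vote points to one code, while "steel reinforced rubber hose" splits its votes and lands on the worklist. Where to set the threshold is a business decision that trades review effort against risk.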
After this comes the step that is essential to machine learning (and human learning for that matter) - feedback. Users will review some or all of the classification decisions, and either approve or reject them. Ideally, when rejecting a decision, the user will assign a replacement classification. This is where the “learning” comes in: the system takes the new user feedback into account and learns from it.
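A minimal sketch of this feedback loop follows, again assuming a simple word-voting model. The class and function names are hypothetical, and the codes are placeholders. The point is only that a reviewer's decision becomes a new training example.

```python
from collections import Counter, defaultdict

class WordVoteModel:
    """Toy model: each word votes for the codes it was seen with."""

    def __init__(self):
        self.index = defaultdict(Counter)

    def learn(self, description, code):
        for token in set(description.lower().split()):
            self.index[token][code] += 1

    def predict(self, description):
        votes = Counter()
        for token in set(description.lower().split()):
            if token in self.index:
                votes.update(self.index[token])
        return votes.most_common(1)[0][0] if votes else None

def apply_review(model, description, predicted, approved, replacement=None):
    """Feed the reviewer's decision back into the model; return the final code."""
    final = predicted if approved else replacement
    if final is not None:
        model.learn(description, final)
    return final

model = WordVoteModel()
for description, code in [
    ("steel hex bolt", "CODE-A"),
    ("steel machine screw", "CODE-A"),
    ("rubber garden hose", "CODE-B"),
]:
    model.learn(description, code)

# The model leans on the word "steel" and gets this one wrong.
before = model.predict("steel braided hose")

# The reviewer rejects the prediction and assigns a replacement code.
final = apply_review(model, "steel braided hose", before,
                     approved=False, replacement="CODE-B")

# The correction now shapes future predictions.
after = model.predict("steel braided hose")
```

A rejection without a replacement teaches this sketch nothing, which is why assigning a replacement classification matters.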
A final step is needed, and that is an audit trail. An attentive reader may have spotted a potential weak point in the process: the user review. If users make bad decisions when reviewing system choices, it can cause the system to make bad decisions in the future. While there are certain system protections available to mitigate this risk (such as consistency checks – ensuring a user’s decision doesn’t flagrantly oppose previous decisions of a similar nature), an audit trail is essential to ensure the system works right. This report will be periodically reviewed by experts in classification, and act as a check against poor user decisions. As the system gets trusted more and more over time, these audits will need to happen less and less.
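One way to picture the audit trail and a basic consistency check is sketched below. The rule used here (flag a decision whose code has never been used before for the same material group) is a simplistic assumption chosen for illustration, as are the class and field names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditEntry:
    product_id: str
    material_group: str
    predicted: str
    final: str
    reviewer: str
    flagged: bool
    timestamp: str

@dataclass
class ReviewLog:
    entries: list = field(default_factory=list)

    def record(self, product_id, material_group, predicted, final, reviewer):
        """Store a review decision and flag it if it looks inconsistent."""
        # Codes previously accepted for products in the same material group.
        prior = {e.final for e in self.entries
                 if e.material_group == material_group}
        # Hypothetical rule: flag a code never seen before for this group.
        flagged = bool(prior) and final not in prior
        entry = AuditEntry(product_id, material_group, predicted, final,
                           reviewer, flagged,
                           datetime.now(timezone.utc).isoformat())
        self.entries.append(entry)
        return entry

    def flagged_for_audit(self):
        """Entries an expert should look at first."""
        return [e for e in self.entries if e.flagged]

log = ReviewLog()
log.record("P-1", "FASTENERS", "CODE-A", "CODE-A", "alice")
log.record("P-2", "FASTENERS", "CODE-A", "CODE-A", "bob")
log.record("P-3", "FASTENERS", "CODE-A", "CODE-B", "carol")
suspect_ids = [e.product_id for e in log.flagged_for_audit()]
```

Every decision is kept, flagged or not, so the periodic expert review can sample the whole trail and not only the entries the check happened to catch.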
Having a machine learning model trained for a company’s products is not only valuable for classifying new products, but can also be useful in auditing existing classifications and providing an automated second review of manual classifications. The existing classified product master data can be run through the system to identify misclassifications, improving data consistency.
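As a sketch of this second-review idea, the snippet below checks each existing classification against a prediction made from all the other products and lists the disagreements. Holding each product out in turn is an assumption of this sketch, not necessarily how a production system would do it, and the word-overlap scoring and placeholder codes are illustrative only.

```python
from collections import Counter

def predict(description, examples):
    """Vote for codes by how many words each example shares with the description."""
    tokens = set(description.lower().split())
    votes = Counter()
    for other_description, code in examples:
        overlap = len(tokens & set(other_description.lower().split()))
        if overlap:
            votes[code] += overlap
    return votes.most_common(1)[0][0] if votes else None

def find_suspects(master):
    """Return (description, current_code, suggested_code) for disagreements."""
    suspects = []
    for i, (description, code) in enumerate(master):
        # Hold this product out so it cannot vote for its own code.
        others = master[:i] + master[i + 1:]
        suggestion = predict(description, others)
        if suggestion is not None and suggestion != code:
            suspects.append((description, code, suggestion))
    return suspects

# Placeholder product master; one entry is deliberately inconsistent.
master = [
    ("steel hex bolt", "CODE-A"),
    ("steel hex nut", "CODE-A"),
    ("steel flat washer", "CODE-A"),
    ("steel hex screw", "CODE-B"),
    ("rubber garden hose", "CODE-B"),
    ("rubber hose fitting", "CODE-B"),
]
suspects = find_suspects(master)
```

A disagreement is only a prompt for a human to take a second look; the existing classification may well be the correct one.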
Now let’s look at an example of this machine learning in action. It is an extremely simple example and takes only a couple of data elements into consideration. This is to help illustrate the process, but in actual use, there would be many different elements to consider, and the database of pre-classified products would likely be quite large as well. Regardless of the scale, however, the same principles will apply.
Existing classification decisions are compared against new unclassified products, resulting in a proposed classification after all relevant data elements are reviewed. User reviews further hone the machine learning, until the system makes choices as good as or better than a user would.
These machine learning concepts can also be applied to other areas of the supply chain.
Read the full KAI white paper here: https://avyay.solutions/resources/