Machine Learning Applications for BFSI Sector
Machine Learning and BFSI:
Machine Learning offers a revolutionary approach to solving problems in a way very similar to how the human brain would solve them. The approach starts with historical data: create and train a model, cross-validate it, and then use it to predict outcomes on live data. This is much like how our brain works: with increasing experience, it learns more and predicts more accurately.
The Banking, Financial Services and Insurance (BFSI) sector is quite data intensive. From customer acquisition to actual business, millions of transactions are generated each day. Combined with supporting data such as customer demographics, time and location information, this data is rich enough for Machine Learning algorithms to make sense of it and support meaningful decisions.
Here are a few of the many possible areas where Machine Learning can help the BFSI sector:
Application 1: Banking Fraud Detection: Typical banking and credit card frauds include Card not present, Counterfeit, Lost and stolen cards, Mail not received, Cheque Frauds, Identity Theft and Online Banking Frauds. Statistics show that less than 1% of all banking transactions globally are fraudulent. Reserve Bank of India data for Oct 2016 showed that across all banks in India, there were close to 145 million transactions per month totaling billions of US Dollars. Finding a few fraudulent transactions among millions is like finding a needle in a haystack.
By looking at various characteristics of a financial transaction, it is possible to judge if a transaction is fraudulent or not. These characteristics may involve aspects related to misspelled customer name, shipping address being different from billing address, email details, IP address, transaction size and frequency etc. A Machine Learning approach to identifying fraudulent transactions could involve:
1. Choose transaction characteristics x (such as those listed above) that might be indicative of fraud.
2. Using historical data (m training examples), build a Gaussian distribution model by calculating the mean μ and standard deviation σ of each transaction characteristic.
3. For a scenario with two characteristics (x1 and x2), the resulting Gaussian distribution is a bell-shaped surface that peaks at the means of x1 and x2.
4. Given a new transaction x, estimate the model value p(x) as the product of the Gaussian densities of its individual characteristics.
Transaction x is likely to be fraudulent if p(x) is less than a certain threshold value ε.
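Assuming the characteristics are treated as independent of each other, the estimates in step 2 and the model value in step 4 are commonly written as follows. Here n is the number of chosen characteristics and ε denotes the threshold; both symbols are introduced for illustration.

```latex
\mu_j = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)}, \qquad
\sigma_j^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_j^{(i)} - \mu_j\right)^2

p(x) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}
       \exp\!\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right),
\qquad \text{flag } x \text{ as suspicious if } p(x) < \varepsilon
```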
One would need to study the training data closely to see whether the chosen characteristics are correlated with each other, in which case the model is multivariate in nature and needs further tweaking. The model should also be evaluated by counting true positives, false positives, false negatives and true negatives, then calculating Precision and Recall, from which the F1-score is derived. A higher F1-score indicates a better model.
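The approach above can be sketched in a few lines of Python with NumPy. This is a minimal illustration, not a production fraud detector: the function names, the two synthetic characteristics, and the threshold value are all assumptions made for the example, and the data is randomly generated rather than real transaction data.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate the mean and variance of each characteristic (column) of X."""
    return X.mean(axis=0), X.var(axis=0)  # var divides by m

def density(X, mu, var):
    """p(x): product of independent per-characteristic Gaussian densities."""
    coef = 1.0 / np.sqrt(2.0 * np.pi * var)
    expo = np.exp(-((X - mu) ** 2) / (2.0 * var))
    return np.prod(coef * expo, axis=1)

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1-score, with 1 meaning 'fraudulent'."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Synthetic stand-in data (assumption): two characteristics per transaction,
# e.g. a transaction size and a daily frequency.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=[50.0, 2.0], scale=[10.0, 0.5], size=(1000, 2))
X_val = np.vstack([
    rng.normal(loc=[50.0, 2.0], scale=[10.0, 0.5], size=(200, 2)),
    np.array([[150.0, 6.0], [0.0, -1.0]]),  # two hand-made outliers
])
y_val = np.array([0] * 200 + [1] * 2)

mu, var = fit_gaussian(X_train)
epsilon = 1e-5  # hypothetical threshold; in practice tuned on validation data
y_pred = (density(X_val, mu, var) < epsilon).astype(int)
precision, recall, f1 = precision_recall_f1(y_val, y_pred)
```

In practice the threshold would be chosen by trying several values on a labelled validation set and keeping the one with the best F1-score.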
Application 2: Upselling Financial Products: A customer buying a certain kind of financial product, say a term deposit, may also be a good target for another type of financial product, say a Systematic Investment Plan (SIP), whereby the quarterly interest earned contributes towards the EMI for the SIP.
Any financial product has many different characteristics such as ratio of debt to equity, term length, ROI profile etc. A customer typically has a degree of preference for a financial product based on these characteristics. However, for various reasons, not every customer will have expressed an explicit preference for every product. For example, consider a table of customers and their ratings for financial products, where each product is in turn described by its characteristics.
In such data, not every customer has bought or rated every financial product, so the missing ratings are marked with a value of ?. From the financial institution's perspective, it would be beneficial to predict the likely rating of a customer for a given product, which could directly indicate an upselling opportunity.
A collaborative filtering algorithm based on linear regression trained with gradient descent can be used to predict potential customer preferences. A Machine Learning approach for such a system would be:
Parameters:
r(i, j) = 1 if customer j has rated product i
y(i, j) = rating by customer j on product i
Θ(j) = parameter vector for customer j
x(i) = feature vector for product i
m(j) = number of products rated by customer j
n = number of characteristics of the product
nu = number of customers
α = Learning rate (step size) of algorithm
λ = Regularization Parameter
We can find the customer parameters Θ by applying gradient descent to minimize a regularized squared-error cost function over all the ratings that customers have given.
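Using the parameters listed above, one common form of this cost function and the corresponding gradient descent update is shown here. This is the usual collaborative-filtering formulation and is offered as a sketch; a per-customer linear regression variant would additionally divide each customer's term by m(j).

```latex
J\left(\Theta^{(1)},\dots,\Theta^{(n_u)}\right) =
  \frac{1}{2}\sum_{j=1}^{n_u}\;\sum_{i:\,r(i,j)=1}
  \left(\left(\Theta^{(j)}\right)^{T} x^{(i)} - y^{(i,j)}\right)^{2}
  + \frac{\lambda}{2}\sum_{j=1}^{n_u}\sum_{k=1}^{n}\left(\Theta^{(j)}_{k}\right)^{2}

\Theta^{(j)}_{k} := \Theta^{(j)}_{k} - \alpha\left(
  \sum_{i:\,r(i,j)=1}\left(\left(\Theta^{(j)}\right)^{T} x^{(i)} - y^{(i,j)}\right)x^{(i)}_{k}
  + \lambda\,\Theta^{(j)}_{k}\right)
```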
Once we know Θ, we can predict the ratings of products that customers have not rated (i.e. those with value ?) using the formula (Θ(j))^T x(i).
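As a rough sketch of how this could look in Python with NumPy, the snippet below learns Θ by gradient descent and then fills in the missing ratings. The product features and ratings are invented purely for illustration (two characteristics per product, roughly "equity share" and "debt share"), and the function name and hyperparameter values are assumptions.

```python
import numpy as np

def fit_customer_params(X, Y, R, lam=0.1, alpha=0.01, iters=5000):
    """Learn one parameter vector per customer by gradient descent.

    X: (products, n) product features
    Y: (products, customers) ratings, any value where unrated
    R: (products, customers) mask, 1 where customer j rated product i
    """
    theta = np.zeros((Y.shape[1], X.shape[1]))
    for _ in range(iters):
        err = (X @ theta.T - Y) * R       # ignore unrated entries
        grad = err.T @ X + lam * theta    # gradient of the regularized cost
        theta -= alpha * grad
    return theta

def cost(theta, X, Y, R, lam=0.1):
    """Regularized squared-error cost over the rated entries only."""
    err = (X @ theta.T - Y) * R
    return 0.5 * np.sum(err ** 2) + 0.5 * lam * np.sum(theta ** 2)

# Made-up example: 4 products x 2 characteristics, 3 customers.
X = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
Y_raw = np.array([
    [5.0, 1.0, np.nan],
    [4.0, np.nan, 5.0],
    [1.0, 5.0, 1.0],
    [np.nan, 4.0, np.nan],
])  # np.nan plays the role of "?"
R = (~np.isnan(Y_raw)).astype(float)
Y = np.nan_to_num(Y_raw)

theta = fit_customer_params(X, Y, R)
predicted = X @ theta.T  # (Theta(j))^T x(i) for every product/customer pair
```

Entries of `predicted` where `R` is 0 are the estimated ratings for products a customer has not yet rated; high values there would be candidate upselling opportunities.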
Application 3: Insurance Claim Fraud: One statistic shows that insurance claim frauds amount to nearly 4% of all claims disbursed by insurance companies. Losses due to insurance claim frauds result in higher premium costs, lack of trust and additional process overhead that makes claim handling inefficient.
Given how rare fraudulent cases are relative to genuine ones, insurance claim fraud is another case of Anomaly Detection and can be detected in a fashion very similar to the banking transaction fraud detailed earlier. The transaction here refers to an insurance claim. The characteristics to be selected would come from insurance policy details, claim details, insured details, vehicle (or any other asset) details, repair details and risk profile. A Gaussian model can be created based on historical data, which can then be used to classify a given claim as normal or fraudulent.
Application 4: Predicting Price: Investment Banking divisions of banks normally deal with stock and/or commodity trading. The future price of a stock or a commodity is a function of input parameters such as earlier prices, buy and sell trading volumes, seasonal trends and many more. A very simplistic price prediction model could be built by looking at just the last 3 to 5 prices.
While a bank would typically not engage in food price trading, just for the sake of explanation we'll show how to predict the potato price at a given wholesale market. The data used is historical potato prices at a given wholesale market, extracted from the AgMarknet portal of the Government of India.
Time-series prediction using a Multi-layer Perceptron (MLP) Neural Network can be used to predict the next day's price by looking at the last few days of potato prices. Let's say we use the last 4 days' prices to predict the next day's potato price. The corresponding 3-layer MLP neural network has 4 input nodes (one per previous day), 9 nodes in the hidden layer and 1 output node for the predicted price.
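To feed such a network, the price series first has to be cut into (last 4 days, next day) pairs. The sketch below shows one way to do that in Python with NumPy. The function names are hypothetical and the prices are placeholder numbers, not actual AgMarknet data.

```python
import numpy as np

def make_windows(prices, n_lags=4):
    """Turn a price series into (previous n_lags prices, next price) pairs."""
    prices = np.asarray(prices, dtype=float)
    X = np.array([prices[i:i + n_lags] for i in range(len(prices) - n_lags)])
    y = prices[n_lags:]
    return X, y

def min_max_scale(values, lo, hi):
    """Scale values into [0, 1] so they suit a sigmoid output node."""
    return (values - lo) / (hi - lo)

# Placeholder daily prices (made up for illustration).
prices = [10.0, 11.0, 12.0, 11.5, 12.5, 13.0, 12.0]
X, y = make_windows(prices)

lo, hi = min(prices), max(prices)
X_scaled = min_max_scale(X, lo, hi)
y_scaled = min_max_scale(y, lo, hi)
```

Scaling matters here because a sigmoid output can only produce values between 0 and 1; the predicted price is recovered by inverting the scaling.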
The approach for training this neural network based potato price prediction algorithm would be as follows:
1. Randomly initialize the weights of the neural network.
2. Implement forward propagation to get the predicted values of the hidden layer and output layer nodes, hΘ(x(i)), using the sigmoid function, where x(i) is the input to the node.
3. Compute the cost function J(Θ), which measures the error between the predicted and actual prices over the training examples.
4. Implement backward propagation to compute the partial derivatives of J(Θ) with respect to each weight.
5. The first time, use numerical gradient checking to compare these derivatives with a numerical estimate.
6. Use gradient descent or advanced optimization methods/functions with backpropagation to minimize the cost and obtain the values of the weights Θ.
7. Once the model is trained, use the Cross Validation data set to test its performance.
8. Subsequently, use it to predict the next day's price.
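The training steps above can be sketched in plain NumPy as follows. This is an illustrative toy, not the exact network used for the potato data: it assumes a squared-error cost without regularization, inputs and targets already scaled to the 0 to 1 range, and randomly generated stand-in data. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_weights(n_in=4, n_hidden=9, n_out=1, eps=0.5, seed=0):
    """Step 1: small random weights; the extra column holds the bias."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-eps, eps, size=(n_hidden, n_in + 1))
    W2 = rng.uniform(-eps, eps, size=(n_out, n_hidden + 1))
    return W1, W2

def forward(W1, W2, X):
    """Step 2: forward propagation with sigmoid activations."""
    ones = np.ones((X.shape[0], 1))
    a1 = np.hstack([ones, X])
    a2 = np.hstack([ones, sigmoid(a1 @ W1.T)])
    a3 = sigmoid(a2 @ W2.T)
    return a1, a2, a3

def cost(W1, W2, X, y):
    """Step 3: squared-error cost (an assumption for this sketch)."""
    a3 = forward(W1, W2, X)[2]
    return 0.5 * np.mean((a3[:, 0] - y) ** 2)

def gradients(W1, W2, X, y):
    """Step 4: backpropagation for the cost above."""
    m = X.shape[0]
    a1, a2, a3 = forward(W1, W2, X)
    d3 = (a3 - y[:, None]) * a3 * (1 - a3)
    d2 = (d3 @ W2)[:, 1:] * a2[:, 1:] * (1 - a2[:, 1:])
    return d2.T @ a1 / m, d3.T @ a2 / m

def numerical_gradients(W1, W2, X, y, h=1e-5):
    """Step 5: central-difference estimate used for gradient checking."""
    grads = []
    for W in (W1, W2):
        g = np.zeros_like(W)
        for idx in np.ndindex(W.shape):
            old = W[idx]
            W[idx] = old + h
            up = cost(W1, W2, X, y)
            W[idx] = old - h
            down = cost(W1, W2, X, y)
            W[idx] = old
            g[idx] = (up - down) / (2 * h)
        grads.append(g)
    return grads

def train(X, y, alpha=0.5, iters=5000, seed=0):
    """Step 6: plain batch gradient descent."""
    W1, W2 = init_weights(n_in=X.shape[1], seed=seed)
    for _ in range(iters):
        g1, g2 = gradients(W1, W2, X, y)
        W1 -= alpha * g1
        W2 -= alpha * g2
    return W1, W2

# Stand-in data (assumption): 30 windows of 4 scaled prices each, with a
# smooth target so that the example has something learnable.
rng = np.random.default_rng(1)
X_demo = rng.uniform(0.0, 1.0, size=(30, 4))
y_demo = X_demo.mean(axis=1)

W1, W2 = init_weights()
g1, g2 = gradients(W1, W2, X_demo, y_demo)
n1, n2 = numerical_gradients(W1, W2, X_demo, y_demo)
max_diff = max(np.abs(g1 - n1).max(), np.abs(g2 - n2).max())

cost_before = cost(W1, W2, X_demo, y_demo)
W1, W2 = train(X_demo, y_demo)
cost_after = cost(W1, W2, X_demo, y_demo)
```

If `max_diff` is tiny, the backpropagation code agrees with the numerical estimate and the check can be switched off for the actual training run, since it is slow. Steps 7 and 8 would then call `forward` on held-out windows and on the most recent 4 days respectively.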
This simplistic model of predicting potato prices could be further enhanced with more input variables and more hidden layers to make it work for scenarios related to stock trading, where there is much higher volatility and buy and sell trading volumes generally play a key role in intra-day price prediction.
Technology Implementation:
A host of technologies exist today to implement Machine Learning algorithms, either as a bespoke application or as a cloud-based application. The choice of programming language depends on the availability of libraries capable of vectorized implementation of linear algebra.
MATLAB by MathWorks or GNU Octave (a free, largely MATLAB-compatible alternative) could be a good choice for building prototypes fairly quickly to test the concept. Python or R are good programming languages for building production-ready solutions. Neuroph is an open-source Java framework for building Neural Network applications. It comes with its own GUI, which is based on the NetBeans IDE. Alternatively, Microsoft Azure ML or Google TensorFlow can be used to create Machine Learning applications on the cloud.
Conclusion:
Machine Learning and Neural Networks offer a revolutionary new approach to solving BFSI problems such as fraud detection, upselling or stock price prediction. The key is to have sufficient training data and the right approach to building high-performing algorithms.
Software Testing, Quality Assurance, and Project Management
7 years ago: Very excellent and useful information Sanjeev. Thanks for sharing.
Co-Founder and Chief Technology Officer at Ellicium Technology Solutions
7 years ago: Nice and informative article, Sanjeev. I can very well relate to it. BFSI is indeed one of the main domains where there is great potential for implementing Machine Learning to achieve a lot of automation. In fact, the same is true for several other domains as well e.g. for one of our leading LPO clients, we recently implemented our AI tool which automated their manual process of web monitoring and document classification. It is working wonders for them. As you rightly said, using the right approach and appropriate training data was the key to achieving it.
Leading an enriched life...
7 years ago: Machine learning for Upsell seems to be a feasible idea, however, for fraud detection, the error percentage reduction needs to be implemented properly. Even in case of small error percentage, if the transaction is halted because of suspected fraud, it can lead to a substantial loss. Usage of MATLAB is undoubted, but there are others like Mathematica as well as a few open-source platforms.