Deep learning for fraud detection in the banking industry

By Servio Fernando Lima Reina

Traditional fraud protection in the banking industry has been rule based, with a human defining the rules. In fact, 90% of financial and banking institutions rely on these methods. As more people adopt new technologies, more fraud scenarios become possible, which makes rule-based methods neither scalable nor sustainable.

Moreover, false positives (i.e. non-fraudulent transactions that are classified as fraudulent) cost the banking industry millions of dollars in lost transactions and generate customer complaints, and rule-based methods are part of the problem.

In addition, fraud has no constant pattern: fraudsters change their behavior over time, which makes rule-based systems cumbersome and quickly obsolete.
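
The brittleness of hand-written rules can be illustrated with a minimal sketch. The field names and thresholds below are hypothetical, chosen only for illustration, and do not come from any real bank's rule set.

```python
# Hypothetical hand-written rules; the thresholds and field names are
# illustrative assumptions, not taken from any real rule set.
def rule_based_flag(tx):
    """Return True if any hand-written rule marks the transaction as fraud."""
    if tx["amount"] > 1000:
        return True
    if tx["country"] != tx["home_country"]:
        return True
    return False

# A legitimate customer paying for a holiday abroad is flagged
# (a false positive).
holiday = {"amount": 1500, "country": "FR", "home_country": "US"}

# A fraudster who has learned the amount threshold stays just under it
# and uses the card in its home country, so the rules miss it.
adapted_fraud = {"amount": 999, "country": "US", "home_country": "US"}

print(rule_based_flag(holiday))
print(rule_based_flag(adapted_fraud))
```

Every new fraud pattern would require a human to add or retune a rule, which is the maintenance burden described above.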

The need for a new approach is evident. In this essay we explore non-rule-based systems built on Deep Learning, in particular the family of unsupervised deep learning algorithms (i.e. algorithms that do not need labeled examples in order to be trained) that detect anomalies in a set of transactions. Autoencoders (AE) and Restricted Boltzmann Machines (RBM) are analyzed in the context of fraud detection.

Introduction

A good fraud detection system should be able to detect a fraudulent transaction in real time and with accuracy. There are two kinds of fraud detection systems: anomaly detection and misuse detection. Anomaly detection does not need labeled examples of fraud, whereas misuse detection needs a training phase on known fraud cases. This paper focuses on anomaly detection systems.

Several machine learning (ML) approaches have been implemented. Typical ML algorithms are KNN (K Nearest Neighbors), decision trees and logistic regression, but these are supervised methods: they learn from labels that indicate which transactions are fraudulent and which are not. If the company lacks this information, these algorithms cannot be trained.
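
To make the dependency on labels concrete, here is a minimal KNN sketch in plain Python. The two features (a scaled amount and a scaled time of day) and all data points are synthetic assumptions made for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, x, k=3):
    """Classify x by majority vote among its k nearest labeled neighbors."""
    # Pair each training point's distance to x with its label, nearest first.
    dists = sorted(
        (math.dist(p, x), label) for p, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Synthetic, hypothetical training set: (scaled amount, scaled time of day).
train_x = [(0.02, 0.50), (0.05, 0.60), (0.03, 0.40),
           (0.90, 0.10), (0.95, 0.05), (0.85, 0.15)]
# The labels are what makes the method supervised: without train_y
# there is nothing to vote on.
train_y = ["legit", "legit", "legit", "fraud", "fraud", "fraud"]

print(knn_predict(train_x, train_y, (0.88, 0.10)))
print(knn_predict(train_x, train_y, (0.04, 0.55)))
```

The classifier can only answer because someone has already labeled every training transaction, which is exactly the information many institutions lack.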

Deep Learning (DL) is a subset of machine learning that specializes in handling large volumes of data and can learn in a supervised or unsupervised way [1]. Several DL algorithms have been tested in the fraud detection realm with excellent results. Autoencoders (AE) and Restricted Boltzmann Machines (RBM) are the most commonly used.

Deep learning algorithms

DL and ML algorithms are typically used to solve classification or regression (prediction) problems. Fraud detection can be framed either way: as classification, where we predict whether a transaction is fraudulent or not, or as regression, where we predict a fraud score. The following sections explain the most commonly used DL algorithms.

Autoencoders (AE)

Autoencoders [2] are a special kind of DL algorithm that is trained to reproduce its input at its output. The network has a middle (hidden) layer that contains fewer neurons (i.e. perceptrons) than the input and therefore compresses the data. The layers before the middle layer encode the input and the layers after it decode it. After decoding, the output is compared with the input, and if they differ, a correction mechanism (i.e. gradient descent) adjusts the weights until a minimum error is reached. Once the network is trained, data that does not follow the learned pattern is reconstructed poorly, so a large reconstruction error marks an anomaly. These anomalies can then be flagged as potentially fraudulent transactions for further study.
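
The idea can be sketched with a deliberately tiny autoencoder written in plain Python: two inputs, one linear hidden unit, two outputs, trained by stochastic gradient descent. This is a simplified illustration under stated assumptions, not the architecture used in [3]. The synthetic data, the learning rate and the thresholding rule are all assumptions.

```python
import random

random.seed(0)

# Synthetic "normal" transactions (an assumption for illustration): two
# scaled features that move together, plus a little noise.
def make_normal():
    t = random.uniform(-1.0, 1.0)
    return [t + random.gauss(0, 0.05), t + random.gauss(0, 0.05)]

train = [make_normal() for _ in range(200)]

# Encoder weights (2 inputs -> 1 hidden unit) and decoder weights (1 -> 2).
# Small positive starting values keep this toy run well behaved.
w = [random.uniform(0.1, 0.5) for _ in range(2)]
v = [random.uniform(0.1, 0.5) for _ in range(2)]

def reconstruct(x):
    h = w[0] * x[0] + w[1] * x[1]      # encode: compress to one number
    return h, [v[0] * h, v[1] * h]     # decode: expand back to two numbers

def reconstruction_error(x):
    _, out = reconstruct(x)
    return sum((o - xi) ** 2 for o, xi in zip(out, x))

# Stochastic gradient descent on the squared reconstruction error.
lr = 0.05
for epoch in range(100):
    random.shuffle(train)
    for x in train:
        h, out = reconstruct(x)
        err = [o - xi for o, xi in zip(out, x)]
        back = err[0] * v[0] + err[1] * v[1]   # error sent back to the hidden unit
        for i in range(2):
            v[i] -= lr * 2 * err[i] * h
            w[i] -= lr * 2 * back * x[i]

# Assumed decision rule: anything reconstructed worse than the worst
# normal training example is treated as an anomaly.
threshold = max(reconstruction_error(x) for x in train)

def is_anomalous(x):
    return reconstruction_error(x) > threshold

print(is_anomalous([0.5, 0.5]))    # follows the learned pattern
print(is_anomalous([1.0, -1.0]))   # breaks the learned pattern
```

No label is used anywhere in the training loop. The network only learns what normal transactions look like, and the bottleneck is what prevents it from simply copying an unusual input through unchanged.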

AE remove the need to define every single rule for detecting a fraudulent transaction. AE for fraud detection are suitable for large datasets, as described in [3].

Restricted Boltzmann Machines (RBM)

RBM [4] is another DL algorithm that has been applied to fraud detection. It learns the probability distribution of the input data in an unsupervised mode. Its main feature is that it has no output layer, only a visible (i.e. input) layer and a hidden layer. It is called restricted because there are no connections between neurons of the same layer. What is important about RBMs is that they estimate the probability with which a particular object activates a particular feature. In fraud detection, the objects are the transactions and the features are the characteristics that make a transaction fraudulent or not.
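
The probability that a transaction activates a hidden feature can be computed directly from the RBM's weights. The sketch below uses hand-picked, hypothetical weights and binary inputs purely for illustration. In practice the weights are learned from data (for example with contrastive divergence). Using the free energy as an anomaly score is one common choice, stated here as an assumption rather than as the method of [3] or [4].

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

# Hand-picked parameters (hypothetical, for illustration only).
# Visible units: [in_store, chip_used, online, cvv_entered]
# Hidden unit 0 responds to "in store with chip",
# hidden unit 1 responds to "online with CVV".
W = [[ 4.0, -4.0],
     [ 4.0, -4.0],
     [-4.0,  4.0],
     [-4.0,  4.0]]
b = [0.0, 0.0, 0.0, 0.0]   # visible biases
c = [-6.0, -6.0]           # hidden biases

def hidden_input(v, j):
    return c[j] + sum(W[i][j] * v[i] for i in range(len(v)))

def hidden_probs(v):
    """Probability that each hidden feature is activated by transaction v."""
    return [sigmoid(hidden_input(v, j)) for j in range(len(c))]

def free_energy(v):
    """Lower free energy means the model considers v more probable."""
    visible_term = sum(bi * vi for bi, vi in zip(b, v))
    hidden_term = sum(softplus(hidden_input(v, j)) for j in range(len(c)))
    return -visible_term - hidden_term

in_store = [1, 0 + 1, 0, 0]   # in store, chip used
mixed = [1, 0, 0, 1]          # in store, no chip, but a CVV was entered

print(hidden_probs(in_store), free_energy(in_store))
print(hidden_probs(mixed), free_energy(mixed))
```

The familiar pattern strongly activates one hidden feature and receives a low free energy, while the mixed pattern activates neither feature and receives a higher free energy, so it would rank as more anomalous under this scoring.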

Commercial solutions

Several commercial companies take different approaches to the fraud problem. Jumio [5] provides trusted identity as a service, comprising government ID verification, identity verification with biometric facial recognition, and document verification that looks at bank statements, credit card statements and other documents. Iovation [6] focuses on smart devices for detecting fraud. Biocatch [7] focuses on human behavioral biometrics that help to build user profiles. Siftscience [8] offers services against bank account takeover, payment fraud, content abuse and promo abuse, among others.

There are other kinds of companies, such as Numerai [9], a hedge fund that crowdsources data scientists to make predictions about the stock market. The data scientists work on encrypted data and submit their predictions. Zestfinance [10] focuses on explainable ML solutions for credit and risk modeling.

Conclusions

Due to the increasing number of scenarios where fraud is possible, it is not practical to rely on rule-based or supervised algorithms. Modern fraud detection systems should be able to react in real time and with accuracy to new kinds of fraud. That is only possible with unsupervised algorithms that do not need any human classification and can ingest vast volumes of data, such as Deep Learning algorithms. This paper has focused on the most promising DL algorithms, Autoencoders and Restricted Boltzmann Machines, which perform anomaly detection rather than misuse detection. Further investigation is needed to see how well other DL algorithms fit the present problem.

Bibliography

[1] Complete chart of Neural networks: https://www.asimovinstitute.org/neural-network-zoo/

[2] Introduction Auto-encoder (2015, Dec. 21). Auto-encoder [Online]. Available: https://wikidocs.net/3413

[3] Apapan Pumsirirat, Liu Yan. Credit Card Fraud Detection using Deep Learning based on Auto-Encoder and Restricted Boltzmann Machine. School of Software Engineering, Tongji University, China. International Journal of Advanced Computer Science and Applications (IJACSA), Vol. 9, No. 1, 2018.

[4] A beginner’s tutorial for restricted Boltzmann machine [Online]. Available: https://deeplearning4j.org/restrictedboltzmannmachine#define

[5] Jumio: https://www.jumio.com/

[6] Iovation: https://www.iovation.com/

[7] Biocatch: https://www.biocatch.com/

[8] Siftscience: https://siftscience.com/

[9] Numerai: https://numer.ai/

[10] Zestfinance: https://www.zestfinance.com




