Enterprise Credit Risk Prediction

FinTech companies are using statistics and machine learning to make sense of financial data, which are noisy, complex and high-dimensional. Choosing which variables to use can be difficult, particularly when some are irrelevant or highly multicollinear. For example, certain financial ratios or technical indicators, such as the Simple Moving Average (SMA) and the Exponential Moving Average (EMA), are useful individually but provide little additional information when combined. With this type of data, simple ranking algorithms that select variables by how strongly they correlate with a predefined output are inadequate. Instead, suitable dimensionality reduction methods are required to extract the most informative and discriminative features.
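
As a rough illustration of the SMA/EMA point, the sketch below computes both indicators on a synthetic random-walk price series and measures how strongly they correlate. The price series, the 10-step window and the smoothing factor 2 / (window + 1) are assumptions chosen for the example; none of them come from the paper.

```python
import numpy as np

def sma(prices, window):
    """Simple moving average over a trailing window (valid part only)."""
    kernel = np.ones(window) / window
    return np.convolve(prices, kernel, mode="valid")

def ema(prices, window):
    """Exponential moving average with smoothing factor 2 / (window + 1)."""
    alpha = 2.0 / (window + 1)
    out = np.empty(len(prices), dtype=float)
    out[0] = prices[0]
    for t in range(1, len(prices)):
        out[t] = alpha * prices[t] + (1 - alpha) * out[t - 1]
    return out

# Synthetic random-walk prices (an assumption, not real market data).
rng = np.random.default_rng(0)
prices = 100 + np.cumsum(rng.normal(0, 1, size=500))

window = 10
sma_10 = sma(prices, window)
ema_10 = ema(prices, window)[window - 1:]  # align with the SMA's valid range

corr = np.corrcoef(sma_10, ema_10)[0, 1]
print(f"correlation between SMA and EMA: {corr:.3f}")
```

A correlation close to 1 means that adding the second indicator to a model that already has the first contributes very little new information, which is the kind of redundancy that ranking variables one at a time cannot detect.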


In the paper ‘Intelligent FinTech Data Mining by Advanced Deep Learning Approaches’, Huang et al. propose improvements on Canonical Correlation Analysis (CCA) to extract low-dimensional features. Traditional CCA projects the data linearly onto a lower-dimensional subspace and optimises the projection operators to maximise the correlation between the transformed input and the transformed output. In other words, it aims to make the transformed features as informative about the output as possible. However, financial data are usually non-linear and non-stationary. Kernel CCA can account for some of the non-linearity, but deep learning is better at non-linear feature extraction. Deep Canonical Correlation Analysis (DCCA) and Deep Canonically Correlated Autoencoders (DCCAE) are two such models. Each uses two separate neural networks to extract features from the inputs X and the outputs Y respectively, and then performs CCA on those features so that the most relevant ones are learned automatically. DCCAE differs slightly in that a decoder reconstructs the inputs from the extracted features, and the objective function trades off maximising the feature correlations against minimising the reconstruction error. This autoencoder regularisation can further improve the extracted features.
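
To make the traditional (linear) CCA step concrete, here is a minimal NumPy sketch. It whitens each view and takes a singular value decomposition of the cross-covariance, which is one standard way to solve linear CCA. It is not the authors' implementation, and it does not include the neural networks that DCCA and DCCAE place in front of this step. The function names, the small regularisation term `reg` and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    vals, vecs = np.linalg.eigh(mat)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T

def linear_cca(X, Y, n_components, reg=1e-6):
    """Return projection matrices A, B and the canonical correlations."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Covariance estimates, lightly regularised for numerical stability.
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    Sxx_is, Syy_is = inv_sqrt(Sxx), inv_sqrt(Syy)
    # Singular values of the whitened cross-covariance are the correlations.
    U, s, Vt = np.linalg.svd(Sxx_is @ Sxy @ Syy_is)
    A = Sxx_is @ U[:, :n_components]
    B = Syy_is @ Vt.T[:, :n_components]
    return A, B, s[:n_components]

# Synthetic views sharing one latent factor z (an assumption for the demo).
rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(1000, 1)), rng.normal(size=(1000, 3))])
Y = np.hstack([rng.normal(size=(1000, 2)), z + 0.1 * rng.normal(size=(1000, 1))])

A, B, corrs = linear_cca(X, Y, n_components=1)
print("first canonical correlation:", corrs[0])
```

In a DCCA-style model, `X` and `Y` would first pass through two neural networks, and the correlation computed here would serve as the training objective; DCCAE would add a reconstruction loss to that objective.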


The paper focuses on assessing an enterprise’s credit rating. Accurately assigning a company’s credit score has been especially important since the 2008 financial crisis, and banks need a tool that accurately predicts the credit risk of their borrowers so that they have enough capital for lending. Huang et al. use data from technology companies listed on the Taiwan Stock Exchange, consisting of cash flow, balance sheet, and profit & loss statements. Using DCCA or DCCAE, they extract low-dimensional features from the statements and feed them to simple classification models, such as Bayesian Networks, Decision Trees, Logistic Regression classifiers, Support Vector Machines and k-Nearest Neighbours (kNN) classifiers, to predict one of three ratings: ‘low risk’, ‘medium risk’ or ‘high risk’. They conclude that credit ratings predicted from these features are more accurate than those obtained with simpler feature learning methods, owing to the representational flexibility of neural networks, albeit at a higher computational cost. Huang et al. emphasise the need for GPUs running on parallel computing platforms to run their deep feature learners.
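
The second stage of that pipeline, feeding low-dimensional features to a simple classifier, can be sketched with a small kNN classifier in NumPy. The two-dimensional features below are synthetic stand-ins for DCCA or DCCAE outputs, and the cluster centres, sample sizes and `k=5` are assumptions chosen for the example. The accuracy it prints says nothing about the results reported in the paper.

```python
import numpy as np

RATINGS = ["low risk", "medium risk", "high risk"]

def knn_predict(train_X, train_y, query_X, k=5):
    """Predict labels by majority vote among the k nearest training points."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(row, minlength=len(RATINGS)).argmax()
                     for row in votes])

# Synthetic 2-D "extracted features": one Gaussian cluster per rating.
rng = np.random.default_rng(0)
centres = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
labels = rng.integers(0, 3, size=300)
features = centres[labels] + rng.normal(scale=0.5, size=(300, 2))

# Simple hold-out split: 240 companies for training, 60 for testing.
train_X, train_y = features[:240], labels[:240]
test_X, test_y = features[240:], labels[240:]

pred = knn_predict(train_X, train_y, test_X, k=5)
accuracy = (pred == test_y).mean()
print("first prediction:", RATINGS[pred[0]])
print("hold-out accuracy on synthetic data:", accuracy)
```

Because the classifier only sees the extracted features, any of the other models listed above could replace `knn_predict` without changing the feature-learning stage.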


Here is the article:

https://link.springer.com/article/10.1007/s10614-021-10118-5


For breakdowns and discussions of the latest research in Machine Learning, follow BSI's LinkedIn page https://www.dhirubhai.net/company/business-systems-international/ and website at https://www.bsi.uk.com.

