Business Intelligence as a question of Supervised Learning for the Prediction of Company Dynamics.
Balancing the use of data and the choice of ML algorithms while integrating business intelligence in a firm

Due to the increasing availability of granular yet high-dimensional firm-level data, machine learning (ML) algorithms have been successfully applied to address multiple research questions related to firm dynamics. Supervised learning (SL), the branch of ML dealing with the prediction of labelled outcomes, has been used to better predict firms’ performance. A range of SL approaches can be used for prediction tasks relevant at different stages of the company life cycle. These stages include

  1. startup and innovation,
  2. growth and performance of companies, and
  3. firms’ exit from the market.

First, consider SL implementations to predict successful startups and R&D projects, and look at how SL tools can be used to analyze company growth and performance. Second, review SL applications to better forecast financial distress and company failure. Lastly, consider the use of SL methods in the light of targeted policies, result interpretability, and causality.

In particular, SL methods improve over standard econometric tools in predicting firm success at an early stage, superior performance, and failure. High-dimensional, publicly available data sets have in recent years contributed to the applicability of SL methods in predicting early success at the firm level and, even more granularly, at the level of single products and projects. While the dimension and content of data sets vary across applications, support vector machine (SVM) and random forest (RF) algorithms are often found to maximize predictive accuracy. Even though the application of SL to predict superior firm performance in terms of returns and sales growth is still in its infancy, there is preliminary evidence, drawn sometimes from structured empirical data and sometimes from unstructured data, that RF can outperform traditional regression-based models while preserving interpretability. Moreover, shrinkage methods, such as the Lasso or stability selection, can help identify the most important drivers of firm success (a minimal sketch follows below). Turning to SL applications in bankruptcy and distress prediction, decision-tree-based algorithms and deep learning methodologies dominate the landscape, with the former widely used in economics due to their higher interpretability, and the latter more frequent in computer science, where interpretability is usually deemphasized in favor of higher predictive performance.
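To make this concrete, below is a minimal, hypothetical sketch in Python (scikit-learn) of the two ideas just mentioned: an L1-penalized (Lasso-type) model to surface candidate drivers of firm success, and a random forest as the predictive benchmark. The data set, variable names, and the binary "success" label are invented placeholders rather than anything used in the literature reviewed here.

```python
# Hypothetical sketch: Lasso-type selection of success drivers + RF prediction.
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

# Invented firm-level data: one row per firm, binary outcome "success".
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "rd_intensity": rng.gamma(2.0, 1.0, n),
    "firm_age": rng.integers(1, 40, n),
    "leverage": rng.beta(2, 5, n),
    "employees": rng.lognormal(3, 1, n),
    "sector_dummy": rng.integers(0, 2, n),
})
# Toy data-generating process, only to make the example runnable.
logit = 0.8 * df["rd_intensity"] - 2.0 * df["leverage"] + 0.02 * df["firm_age"] - 1.0
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_train, X_test, y_train, y_test = train_test_split(df, y, test_size=0.3, random_state=0)

# L1-penalized logistic regression (a Lasso-type shrinkage method) to identify drivers.
lasso = make_pipeline(
    StandardScaler(),
    LogisticRegression(penalty="l1", solver="liblinear", C=0.5),
)
lasso.fit(X_train, y_train)
coefs = pd.Series(lasso[-1].coef_.ravel(), index=df.columns)
print("Non-zero (selected) drivers:\n", coefs[coefs != 0])

# Random forest as the predictive benchmark.
rf = RandomForestClassifier(n_estimators=300, random_state=0)
print(f"RF cross-validated accuracy: {cross_val_score(rf, X_train, y_train, cv=5).mean():.3f}")
```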

In general, the predictive ability of SL algorithms can play a fundamental role in informing targeted policies at every stage of a firm's lifespan:

(1) identifying projects and companies with a high success propensity can aid the allocation of investment resources;

(2) potential high growth companies can be directly targeted with supportive measures;

(3) a greater ability to distinguish viable from non-viable firms can act as a screening device for potential lenders.

As granular firm-level data become increasingly available, many doors will open for future work focusing on SL applications for prediction tasks. The SL algorithms most commonly employed in the firm-dynamics literature, namely decision trees, random forests, support vector machines, and artificial neural networks, deserve closer attention, as they can help make sense of the vast amounts of data at firms' disposal; a minimal comparison of these algorithms is sketched below.
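As a rough illustration of what such a comparison could look like, the following hypothetical Python sketch cross-validates the four algorithm families named above on synthetic data standing in for firm-level features and a binary exit outcome; none of the figures it produces should be read as empirical results.

```python
# Hypothetical sketch: cross-validated comparison of common SL algorithms.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for firm-level predictors and a binary "exit" label.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8, random_state=0)

models = {
    "decision tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "random forest": RandomForestClassifier(n_estimators=300, random_state=0),
    "SVM (RBF)": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "neural net (MLP)": make_pipeline(StandardScaler(),
                                      MLPClassifier(hidden_layer_sizes=(32, 16),
                                                    max_iter=1000, random_state=0)),
}

# Compare out-of-sample accuracy; interpretability should be weighed alongside these scores.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name:18s} mean accuracy = {scores.mean():.3f} (+/- {scores.std():.3f})")
```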

Besides reaching high predictive power, it is important, especially for policy-makers, that SL methods deliver tractable and interpretable results. For instance, the US banking regulator has introduced the obligation for lenders to inform borrowers about the underlying factors that influenced their decision to not provide access to credit. Hence, different SL techniques should be evaluated, and firms should opt for the most interpretable method when the predictive performance of competing algorithms is not too different. This is central, as understanding which predictors matter most, or what the marginal effect of a predictor on the output is (e.g., via partial dependence plots), can provide useful insights for scholars and policy-makers. Indeed, data scientists in the firm can enhance models’ interpretability using a set of ready-to-use models and tools that are designed to shed light on the SL black box (see the sketch after the list below). These tools can be grouped into three categories: tools and models for

(1) complexity and dimensionality reduction (i.e., variable selection and regularization via Lasso, ridge, or elastic net regressions);

(2) model-agnostic variable importance techniques (i.e., permutation feature importance, based on how much accuracy decreases when a variable’s values are randomly shuffled; Shapley values and SHAP [SHapley Additive exPlanations]; the decrease in Gini impurity when a variable is chosen to split a node in tree-based methodologies); and

(3) model-agnostic marginal effects estimation methodologies (average marginal effects, partial dependence plots, individual conditional expectations, accumulated local effects).
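By way of example, the sketch below applies two of the model-agnostic tools from categories (2) and (3), permutation feature importance and partial dependence, to a random forest fitted on synthetic placeholder data; it is only meant to show how such tools are invoked, not to reproduce any study.

```python
# Hypothetical sketch: permutation importance and partial dependence for a fitted model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance, partial_dependence

# Synthetic placeholder data for a binary firm-level outcome.
X, y = make_classification(n_samples=1500, n_features=10, n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)

# Permutation importance: drop in accuracy when a feature's values are shuffled.
perm = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)
for i in np.argsort(perm.importances_mean)[::-1][:5]:
    print(f"feature {i}: mean importance = {perm.importances_mean[i]:.4f}")

# Partial dependence of the predicted probability on one feature (here, feature 0),
# averaged over the sample.
pd_result = partial_dependence(model, X_test, features=[0], kind="average")
print("partial dependence (first grid points):", np.round(pd_result["average"][0][:5], 3))
```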

Higher standards of replicability should be reached by releasing details about the choice of model hyperparameters, the code, and the software used for the analyses, as well as by releasing the training/testing data to the extent that this is possible, anonymizing them when the data are proprietary, for instance data collected by banks, financial institutions, and business analytics firms. This applies not only to proprietary data but also to data held in jurisdictions with strict privacy regulation, as in the case of the GDPR in Europe.
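One simple, hypothetical way to record such details is sketched below: the model's hyperparameters, the random seed, and the package versions are written to a JSON file that can be released alongside the code. The file name and record structure are arbitrary choices for illustration.

```python
# Hypothetical sketch: recording hyperparameters and software versions for replicability.
import json
import numpy as np
import sklearn
from sklearn.ensemble import RandomForestClassifier

model = RandomForestClassifier(n_estimators=300, max_depth=8, random_state=42)

run_record = {
    "model": type(model).__name__,
    "hyperparameters": model.get_params(),  # full hyperparameter configuration
    "random_seed": 42,
    "versions": {"scikit-learn": sklearn.__version__, "numpy": np.__version__},
}

# Write the record to disk so it can be published together with the analysis code.
with open("run_record.json", "w") as f:
    json.dump(run_record, f, indent=2, default=str)
```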

Here, I want to stress once more that SL per se is not informative about the causal relationships between the predictors and the outcome; therefore, data engineers who wish to draw causal inferences should carefully check the standard identification assumptions and inspect whether or not they hold in the scenario at hand. Besides not directly providing causal estimates, most of the reviewed SL applications focus on pointwise predictions, where inference is de-emphasized.

Providing a measure of uncertainty about the predictions, e.g., via confidence intervals, and assessing how sensitive predictions are to unobserved data points, are important directions to explore further; a rough illustration of reporting prediction uncertainty is sketched below.
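As a purely illustrative example, the sketch below derives a rough uncertainty band for random forest predictions from the spread of the individual trees' predictions; this is a heuristic, not a formal confidence interval, and the data are synthetic placeholders.

```python
# Hypothetical sketch: a heuristic uncertainty band from the spread of trees in a forest.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor

# Invented regression target, e.g., next-year sales growth.
X, y = make_regression(n_samples=1500, n_features=10, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rf = RandomForestRegressor(n_estimators=500, random_state=0).fit(X_train, y_train)

# Collect each tree's prediction for the first few test firms and report quantiles.
per_tree = np.stack([tree.predict(X_test[:5]) for tree in rf.estimators_])
lower, upper = np.percentile(per_tree, [5, 95], axis=0)
point = rf.predict(X_test[:5])

for i in range(5):
    print(f"firm {i}: prediction = {point[i]:7.2f}, "
          f"rough 5-95% band = [{lower[i]:7.2f}, {upper[i]:7.2f}]")
```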

When considering how SL algorithms can predict various firm dynamics from “intercompany data” covering information across firms, many aspects come into play. Yet, nowadays companies themselves apply ML algorithms for various clustering and prediction tasks, and this will presumably become more prominent for small and medium-sized enterprises (SMEs) in the coming years. This is because

(1) SMEs are starting to construct proprietary databases,

(2) they are developing the skills to perform in-house ML analyses on these data, and

(3) powerful methods can easily be implemented using common statistical software.

Against this background, I want to stress that applying SL algorithms and economic intuition about the business problem at hand should ideally complement each other. Economic intuition can aid the choice of the algorithm and the selection of relevant attributes, thus leading to better predictive performance. Furthermore, properly interpreting SL results and directing their use requires deep knowledge of the research question being studied, so that intelligent machines remain guided by expert human beings.
