Get your analytics division on steroids: Financial Services Data Science Framework & Use Cases!

Let me start by saying something already outdated: "Data is the new currency for Financial Services". Globally, Financial Services is today the industry sector with the largest investment in data science technologies. But many data science or analytics functions in financial services in India are a lot like startups today: either they have one good idea (one analytics use case or problem statement), or they dream of making it big without any disruptive idea, or they exist simply because the competition has invested in analytics. I have witnessed and read about many failing analytics divisions as the data science boom unfolded in India over the last decade or so. Everyone wants to join the buzz, and the noise around MLAI is deafening; everyone fears missing the bus without clearly knowing where the bus would take them!

In this paper I first cover the key challenges, myths and mistakes, then move on to recommend a few best practices. I conclude with a framework (from my research paper) for a sustainable, high-impact analytics/data science function for all financial services organizations.

Let's dive right in!

Data Science Skillset Paradox (industry agnostic)


A typical data science function in any industry needs four types of roles/skillsets, as depicted above. Unfortunately, the industry today lacks this clarity. The role profile of the so-called 'Data Scientist', the fanciest job description of the 21st century and a god's gift to mankind, is an unreasonable combination of all four skills above and a whole lot more: a JD so long and deep that one takes months to hire and then months more to figure out what the scientist should do. I remember, not many years back, a CEO (BPO) hired an M.Tech Stat/Quant person as head of the analytics division at a premium cost; the next day the CEO assigned him a sales target of millions of dollars. The head of analytics, after writing some small R scripts and improving some Excel BI reports, quit within 6 months. I sincerely urge my fellow industry colleagues to stop the rampant misuse of the JD called 'Data Scientist'. The pain is felt on both sides of the hiring table, and the industry as a whole is losing out.

Let's understand what these profiles really are and set expectations right. For all you know, 99% of the time you may not need all of these skills on day zero.

1. The Dreamers: Typically the head of analytics and business analysts who like to ask questions that data could answer for them, with no strings attached; no idea is a bad idea. Their job is to dream, to think wishfully: “I wish I could solve this problem through data…”. A problem, an industry challenge or a customer delighter, any one of these could be the starting point. Or it could just as well be an opportunity made possible by datasets that did not exist a few years back. These professionals are not only industry experts (SMEs) but are also well aware of the data economy and the regulatory landscape around the company or industry, so they can locate data sources easily. They are not expected to be data science experts, but they are well aware of the possibilities of MLAI, basic models etc.

2. The Algorithm Geeks (so-called Data Scientists): The dreamers are going to need help to translate a business ask into a mathematical or statistical problem. Machine learning algorithms of all kinds, data cleaning and management techniques, training and testing frameworks etc. are the skills they bring to the table.

3. Programmers & software engineers: This army converts a mathematical solution into software or a customer-friendly product. UX/UI skills also prove handy, along with coding expertise in multiple languages.

4. Infra / Data Engineers: This team is responsible for building the house for big data and the software: cloud or physical, its sizing, store or stream, speed/response time, RAM size, number of cores, distributed or RDBMS, Spark to Sparkling Water. Computer science, electronics or network engineers are the best fit for these roles. They are the architects of the data house.

For financial services in particular, we should follow the sequence below while hiring the best talent.

  1. Dreamers
  2. Programmers
  3. Data Engineers
  4. Algo Geeks (MLAI)

The dreamers and programmers can start small with a clear goal and a small set of data, trying out decision making based on rules and simple ML algorithms. When the data grows and its form and shape get complex, data engineers will be called for. Only for advanced MLAI do we need to hire PhDs in statistics/maths or algo geeks.

Today we are creating 2.5 exabytes of data every day (1 exabyte is 1 billion gigabytes). Yes, please read the number again! And what is more astonishing and exciting at the same time is that we are using only about 0.5% (or less) of this data to make smarter decisions. The point I am trying to make here is that data science is all about data-driven smart decisions; MLAI is just a part of it, not all of it. Imagine a bank recommending that an existing savings account holder invest in its mutual fund during one of his online banking sessions. Well, that's basic. But when the bank includes a simple comparison of the return on the customer's average quarterly balance against the same amount invested in its mutual fund, it's truly an example of a personalized smart recommendation. That's an extremely rewarding example of data science with no MLAI, by the way.
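The comparison in the savings account example above needs nothing more than arithmetic. Here is a minimal sketch in Python. The function name, the annual compounding and both rates are illustrative assumptions, not figures from any bank, and a mutual fund's return is never guaranteed, so the fund rate here stands in for whatever historical or indicative return the bank chooses to show.

```python
def compare_returns(avg_quarterly_balance, savings_rate, fund_rate, years=1):
    """Compare the same amount left in savings vs. invested in a fund.

    Assumes simple annual compounding; both rates are hypothetical inputs,
    expressed as fractions (0.03 means 3% a year).
    """
    savings_value = round(avg_quarterly_balance * (1 + savings_rate) ** years, 2)
    fund_value = round(avg_quarterly_balance * (1 + fund_rate) ** years, 2)
    return {
        "savings_value": savings_value,
        "fund_value": fund_value,
        "difference": round(fund_value - savings_value, 2),
    }


# Hypothetical customer: average quarterly balance of 100,000,
# 3% savings interest vs. an indicative 8% fund return, over one year.
result = compare_returns(100_000, 0.03, 0.08, years=1)
print(
    f"Savings: {result['savings_value']:.2f}, "
    f"Fund: {result['fund_value']:.2f}, "
    f"Difference: {result['difference']:.2f}"
)
```

The only customer-specific input is the average quarterly balance, which the bank already holds. That is what makes the recommendation personal without any model being trained.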

Keep the investment low early on if you are starting a function from scratch. There is enough support in the ecosystem once you get to explore it; I will provide detailed insight later in this paper. I know of a CTO of a sizeable financial services company in India who complained in an interview a few years back that they had bought and installed Hadoop infrastructure, but little of it was consumed by business leaders for years. That's the big correction we need to make in our approach! Use case first, always! We need to reverse the direction of investment, as depicted in the exhibit. Every such failure takes confidence away from the masses and slows adoption in India, while the rest of the world marches ahead at a rapid pace with countless data science led disruptions!

The disruptive Analytics strategy for Financial Services :

The analytics (function) factory needs to fire on all cylinders throughout the year with high-impact projects, but what's the secret recipe for sustainable, disruptive data science?

Below is a simple 10-point approach for a sustainable Data Science Innovation Factory at your own organization. If you already have a big one, it's not difficult to take a quick pause, re-validate the approach and course correct if necessary. It's not about one or two big bang projects; it's about firing high-ROI projects one after another over the long run.

  1. Prioritize 1 product at a time
  2. Start with a customer problem or Industry challenge or a business opportunity
  3. Use case to Business case
  4. Build a team of dreamers & domain experts
  5. Gain knowledge of Data economy and customer's supply chain (Framework below)
  6. Make your shopping list & fill your cart (6 by 6 as on the grid below)
  7. Pilot build
  8. Train & Refine
  9. ROI
  10. Sponsorship to GO TO 1

This is still common sense, really. The real secret lies in a framework called Data Economy and Customer's Supply Chain (DECSC) for financial services.

What is Customer's Supply Chain (CSC): Each player in the marketplace is part of its own supply chain. The core of any business is the ability to add value to a set of inputs and produce a set of outputs; this is industry agnostic. For data science to hit the bullseye every time, we need to recognize the customer's supply chain comprehensively and clearly. Knowing only your clients is not going to be enough.

Our customer's supply chain has six parties :

1. Customers

2. Customer's Core operation/Value creation

3. Customer's suppliers / partners

4. Customer's buyers

5. Customer's competition

6. Professional, Social & Personal support system

Irrespective of whether they are retail or corporate, your customers will always belong to their own supply chain, and they buy your product or service only to feed into that supply chain. Whether it is a retail customer seeking a home loan or a startup looking to go public (investment banking) does not matter; this framework will always help you unearth the true potential of data science, not only to generate leads but to create a completely new marketplace (some call them 'Blue Oceans'). Your customer's real identity and what they really want are always hidden inside their own supply chain. Take a real estate developer and someone buying a new house on loan: if you come to think of it, a Bank or an HFC may be serving both of them as customers, and they themselves may be tied to each other's supply chain. The Bank/HFC wouldn't like to finance a housing development project unless it knows for sure who would buy those houses (flats/product inventory) and how many buyers there would be. I deal with this example again later in detail using my framework.

To repeat my first example, say I would like to buy a fixed deposit or a mutual fund. The financial institution needs to exploit the supply chain I am on, meaning what I buy, how much and how I earn, what I save, my family, my values, my life goals etc. With all this data, the bank can recommend and sell its fixed deposit or mutual fund with near certainty.


What is Data Economy (DE):

A financial services company needs to be aware of the data economy that exists around it. This is extremely critical in order to exploit the unprecedented possibilities of big data and data science.

  • Creators of data / regulators
  • Aggregators / consolidators
  • Analytics partners
  • The last mile partners
  • Productizers
  • In-house data science team & infra

Recognizing useful data is the primary pillar of success: explore regulators, data custodians, data partners and fintech startups before giving up on a use case. Dreams should not be challenged by data availability. Rest assured that there is a lot more data out there than you are aware of. To share a recent incident: during a meeting at his office, one of my close data/fintech partners looked at his mobile and pulled up on his laptop screen how many companies I had worked with so far, with the exact duration of service at each. I thought that was my personal information!! No, he was not a magician; he just used an API connector to the EPFO data set, using the mapping between my unique mobile number and my UAN. Imagine: that API alone could eliminate a significant part of the HR function that is busy performing background verification across industries. That's the power of data and data science today, without any stats, maths, MLAI or algorithms. In this case I didn't know about this API, but my partner, inside my data ecosystem, did.

Also, in the early stage of the data science journey, it's advisable for companies to go for last mile partners and productizers, to enjoy the disruptive benefits of data science without being bogged down by heavy capital investments in technology and people. So just go shopping.

Finally the disruptive data science framework is built combining the above two dimensions : Data Economy (DE) and Customer's Supply Chain (CSC) to form a two dimensional grid. To put this framework to test, let us fill it up with the same example of a Bank's or a HFC's Real Estate Financing business.

The numbers in the exhibit above follow the maturity of the analytics function and the corresponding investment required. The higher the number, the higher the value. As we can see, setting up comprehensive in-house infrastructure is really a long-term objective, and definitely not one to pursue before a few years of consistent success. There is so much data, and there are so many data partners and analytics and product partners out there in the market, all of whom are key to your early success.

Along similar lines, focusing only on your customer data is passé now, hence it is ranked 1 on the analytics value chain. In order to exploit the true potential of data science, financial services companies must exploit the entire customer's supply chain (from 1 to 6).

Once the above grid is formed for each line of your business, you are ready to rock & roll. Every box in the grid is a potential treasure for cutting-edge data science. In this case, the HFC wants to study all available data sets about Lodha Amara before deciding to finance the project, and to predict the ROA. (Lodha is India's largest real estate development company; Amara is an under-construction luxury property in Thane, near Mumbai. I am using it purely as an example, https://amarabylodha-thane.in/).

  • First, they want to know the developer's background and there are truckloads of digitally consumable data available today through APIs.
  • Second, the health of the core business or project, the cash flows, order book, land asset qualities etc.
  • Third, who are its top suppliers, the relationships, total order size etc. GST is a great supply chain data set today in India. Through a customer consent API, this is increasingly being used by corporate lenders to build decision models and disburse SME/MSME loans on the spot. Other than private sector lenders and banks, all Indian public sector banks came together recently to launch the world's largest govt backed micro credit platform for MSMEs (www.psbloansin59minutes.com). GST is at the core of this 59 minutes smart loan engine.
  • Fourth, they want to evaluate the buying powers of residents in the same locality of the property (Thane & Mumbai). Nielsen India owns and maintains an advanced demographics, earning and spend prediction engine for 17000+ cities/towns in India at pin code level.
  • Then, Lodha's competitors in the same locality, social networks etc.

Build the grid for each line of business, prioritize the boxes and get rolling.
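For readers who like to see the grid as a data structure, here is a minimal Python sketch of the 6 by 6 DECSC grid for the real estate financing example. Several things in it are my assumptions for illustration only: the function names, the use of list order as the rank on each dimension, the placement of GST and Nielsen data in particular boxes, and the priority score (data economy rank plus supply chain rank, lower first, to reflect starting with low-investment boxes).

```python
from itertools import product

# The two dimensions of the grid; list order is assumed to be the rank (1 to 6).
DATA_ECONOMY = [
    "Creators of data / regulators",
    "Aggregators / consolidators",
    "Analytics partners",
    "Last mile partners",
    "Productizers",
    "In-house data science team & infra",
]
SUPPLY_CHAIN = [
    "Customers",
    "Customer's core operation / value creation",
    "Customer's suppliers / partners",
    "Customer's buyers",
    "Customer's competition",
    "Professional, social & personal support system",
]


def build_grid():
    """Create an empty grid: one box (a list of data sources) per DE x CSC pair."""
    return {box: [] for box in product(DATA_ECONOMY, SUPPLY_CHAIN)}


def add_source(grid, de, csc, source):
    """Record a data source in the box for the given DE and CSC labels."""
    grid[(de, csc)].append(source)


def prioritized_boxes(grid):
    """Return the filled boxes, lowest combined rank first (hypothetical scoring)."""
    def combined_rank(box):
        de, csc = box
        return (DATA_ECONOMY.index(de) + 1) + (SUPPLY_CHAIN.index(csc) + 1)

    filled = [box for box, sources in grid.items() if sources]
    return sorted(filled, key=combined_rank)


grid = build_grid()
# Illustrative placements only, based on the bullets above.
add_source(grid, "Aggregators / consolidators", "Customer's buyers",
           "Nielsen India pin code level demographics and spend engine")
add_source(grid, "Creators of data / regulators", "Customer's suppliers / partners",
           "GST data via customer consent API")

for de, csc in prioritized_boxes(grid):
    print(f"{de} x {csc}: {grid[(de, csc)]}")
```

An empty box is as informative as a filled one: it marks a part of the customer's supply chain for which no data partner has been identified yet.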

The biggest transformational impact of data science on our lives and businesses around the world today is not MLAI (this is a big myth), but the availability, accessibility, affordability and usability of an unprecedented amount of data around us. MLAI comes in as a second stage of maturity, when we manage to feed in historical performance data and aim to predict the future. First we need many smart Data Crusaders who unearth the data economy for the industry in focus and bring a wealth of industry use cases of impactful business decisions driven by data (simple rules). It's critical to break the myth and scale up our data science functions/divisions. Across the world the data science education sector has exploded, and the majority of skill injection has been commoditized; it covers tools, maths and algorithms. But the capability, availability, participants, construct and dynamics of the data ecosystem in any economy/country are not taught anywhere; they are only experienced through pilots and failures.

In the next part of this paper, I would like to share and discuss data science use cases in the financial services industry that are relevant in India or globally. The majority of my data ecosystem examples, however, will be relevant for India alone, as I continue my enriching professional journey in the commercial capital of India. The business lines will include Banking, Credit, Mutual Funds, AIFs, Investment banking, Equity trading, Alternate research, Insurance, Distressed assets, Private wealth, Prime broking & Custody etc.

Thanks for your readership. Share your comments, questions, thoughts, new ideas.

I am a passionate practitioner, learner and blogger in FinTech, disruptive technologies & sciences, KYC, Blockchain, AIML, Data science, Lean Six Sigma, RPA, Lateral Thinking etc.

I am reachable at [email protected].

 

Very informative piece ! I would also like to hear your recommendation on the outsourcing side. Clearly all the roles except the dreamers (?) can be outsourced to have flexibility of scaling down or up. This also can open up opportunity for starts up to provide this specialised service.

Mandar Mondkar

CA and PMP certified

6 years ago

Hi Pallab...very informative article.. I absolutely agree with you that Indian companies should not blindly jump on the bandwagon of AIML.. They should first unearth the past ( generate ample amount of use cases) before forging the future..
