Advances in the Fintech industry by efficient use of "Data" - Part 2
Geetanjali Prasad
Analytics & Strategy Leader | Transforming Businesses with Data-Driven Insights | Coached 200+Aspiring Data Scientists
I started my most challenging journey as a data science professional with PayNearby sometime in 2018, and this was when I got extensive exposure to the BFSI business.
PayNearby operates on a B2B2C model, where they partner with neighborhood retail stores and enable them with the tools to provide assisted financial and digital commerce services to their local communities across 17,600 plus PIN codes. These local communities can access PayNearby Partner stores to avail a range of services including cash withdrawal, cash deposits, money transfer, savings, insurance, travel, Digital Payments, access to government benefits, and many more.
Below are a few stats to give you all a feel of the volumes of data I got to manage as their Data Science head :)
In this newsletter, I will cover my most crucial project: setting up the entire data science practice for India's leading branchless financial service provider, which serves people at the bottom of the pyramid.
I will briefly cover the Data Platform and its different components.
Architecture Diagram
At PayNearby, we were centralizing data from multiple data sources into a cloud data warehouse that ran on AWS & GCP.
AWS was mostly used where we needed real-time data or had to integrate some of our data models for real-time use, since the development team was on AWS. GCP was used for all other analytics & data science use cases.
There were two reasons why we opted for this structure
PS: Redshift had lower latency but was far too expensive, and I believe it pays to be frugal when you can see the long-term benefits. Athena, on the other hand, was not something we could roll out to everyone across the org the way we did with BigQuery.
Let's cover each block of the Data platform briefly :
Source:
There were more than 200 data sources! So you can imagine the challenge of putting all the data in one place and creating a single source of record, which is just the first step toward building a full-fledged data science stack.
Data sources/types: MySQL, MS-SQL, PostgreSQL, operational file systems, HR system, open-source data, image files, etc.
ETL Pipelines:
At Nearby, we were ingesting ~200 million transactional entries a month with over 200 active pipelines!
Data security was at the core of Nearby Technologies, and hence we had to create our own ETL platform on GCP. This not only helped us secure data but also saved us big time on what would have been a huge cost had we gone for a third-party ETL tool.
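To give a feel for what a single pipeline on such a platform does, here is a minimal sketch of an incremental extract-transform-load step. It is an illustration only and not the platform we built: in-memory SQLite databases stand in for a source system and the warehouse, and the table names (`txns`, `fact_txns`), column names, and cleaning rule are all hypothetical.

```python
import sqlite3


def extract(source, since_id):
    """Pull rows added to the source after the last loaded id (the watermark)."""
    cur = source.execute(
        "SELECT id, pin_code, amount_paise FROM txns WHERE id > ? ORDER BY id",
        (since_id,),
    )
    return cur.fetchall()


def transform(rows):
    """Convert paise to rupees and drop rows with non-positive amounts."""
    return [(i, pin, paise / 100.0) for i, pin, paise in rows if paise > 0]


def load(warehouse, rows):
    """Append the cleaned rows to the warehouse fact table."""
    warehouse.executemany("INSERT INTO fact_txns VALUES (?, ?, ?)", rows)
    warehouse.commit()


def run_pipeline(source, warehouse, since_id=0):
    """Run one incremental batch and return the new watermark."""
    rows = extract(source, since_id)
    load(warehouse, transform(rows))
    # The watermark follows the source ids, so dropped rows are not re-read.
    return rows[-1][0] if rows else since_id


def make_demo_dbs():
    """Create hypothetical source and warehouse databases with sample data."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE txns (id INTEGER PRIMARY KEY, pin_code TEXT, amount_paise INTEGER)"
    )
    source.executemany(
        "INSERT INTO txns VALUES (?, ?, ?)",
        [(1, "400001", 15000), (2, "110001", 0), (3, "560001", 250050)],
    )
    source.commit()
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_txns (id INTEGER, pin_code TEXT, amount_rupees REAL)"
    )
    return source, warehouse


if __name__ == "__main__":
    src, wh = make_demo_dbs()
    watermark = run_pipeline(src, wh)
    print("watermark:", watermark)
    print(wh.execute("SELECT * FROM fact_txns").fetchall())
```

Tracking a watermark per pipeline keeps each run incremental, which matters when hundreds of pipelines are moving millions of rows a month.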
Data warehouse and Data Lake:
We used two types of data warehouses to solve different use cases.
Important pointers you should take into consideration while using BigQuery
Data Visualisation, Dashboards & Reporting:
We used Power BI for dashboarding, visualization, and reporting. It was the one-stop source for all product and business metrics and was used extensively across the organization to track product performance and business health. This kept senior management and CXOs aware of all the key metrics and also gave them a UI from which they could extract metadata based on filters.
We also used automailers, written in R/Python, to send many reports directly to stakeholders' mailboxes.
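As a rough illustration of the automailer idea, the sketch below builds a report email with a CSV attachment using only the Python standard library. It is an assumption-laden example and not our production code: the function name `build_report_email`, the addresses, the column names, and the attachment filename are all hypothetical.

```python
import csv
import io
from email.message import EmailMessage


def build_report_email(sender, recipients, subject, header, rows):
    """Build an email that carries the report rows as a CSV attachment."""
    # Write the report into an in-memory CSV file.
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    msg.set_content(
        f"Hi,\n\nPlease find attached the report with {len(rows)} rows.\n"
    )
    msg.add_attachment(
        buf.getvalue().encode("utf-8"),
        maintype="text",
        subtype="csv",
        filename="daily_report.csv",  # hypothetical filename
    )
    return msg


if __name__ == "__main__":
    message = build_report_email(
        sender="reports@example.com",
        recipients=["ops@example.com", "cxo@example.com"],
        subject="Daily transaction summary",
        header=["pin_code", "txn_count"],
        rows=[["400001", 120], ["560001", 87]],
    )
    print(message["Subject"])
```

Sending the message is then a matter of handing it to `smtplib.SMTP(...).send_message(message)` with your own mail server details, and scheduling the script to run at the desired frequency.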
Machine Learning:
We had a dedicated team of data scientists who worked together to build robust systems that helped us achieve our business goals easily.
Here are a few systems built by our data scientists:
Business Analytics:
"One size fits all" does not apply to any data science team: in most cases, a good machine learning specialist may not be good at quick business analysis but is a great fit for complicated business problems that cannot be solved by traditional methods. Keeping this in mind, we had a team of business analytics ninjas who were quick and always kept us on our toes.
Here are a few items built by our BA ninjas:
Data blog post: https://www.dqindia.com/digging-new-oil-well-data
Follow me on LinkedIn: www.dhirubhai.net/comm/mynetwork/discovery-see-all?usecase=PEOPLE_FOLLOWS&followMember=geetanjali-prasad
Delivery leader heading enterprise analytics projects | Program management | Account Management | Data Architecture | Data Visualization | Data Governance
2 years ago: Thanks for sharing, insightful read... Curious to know how the data org structure was set up. Was it centralized, or was there a dedicated data team for each department?