Investment Risk Design Approach
Rohan Rekhi
Servant Leadership | Data API Strategy | Data Engineering Delivery | Data Governance | Cloud Certified | Snowflake Architect | DataOps | Data Product | Asset Management | Technical Reviews | Biz Dev | Data Strategy
In one of my earlier articles, I talked about Investment Risk and the different aspects we need to consider for its data design. Let's look at the overall approach in more detail.
Our design goal should be to build a "subject-oriented, integrated, time-variant and non-volatile collection of data that supports decision making." As such, the schema should be organized to simplify querying and reading of data, with processing efficiency a secondary concern. The applicable paradigm, then, ought to be OLAP.
In principle we could create very wide, position-based records that hold Risk data coming from all our data sources. For example, we could have a single table containing columns of data from RiskMetrics, Barra and so on. In practice this can create problems for databases like Snowflake. Whenever we modify a record in Snowflake, the data is not changed in place; a new record is created. This in turn creates work for background processes that must ultimately delete the old record and reorganize data on disk to better support fast retrieval. We could work around these costly updates by having an end-of-day process create the aggregated table, but how much value does this add? We should consider our options for aggregation.
One option is to update rows as data is loaded, accepting the costs of organizing data to obtain a final state for the table; the other is the append-only, end-of-day process just described. The sketch below contrasts the two.
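To make the trade-off concrete, here is a minimal sketch of the two load patterns, using pandas as a stand-in for the warehouse tables; the source frames, column names and metric values are all hypothetical:

```python
import pandas as pd

# Hypothetical intraday deliveries for the same positions from two sources.
riskmetrics = pd.DataFrame({"position_id": [101, 102], "var_99": [1.2e6, 8.0e5]})
barra = pd.DataFrame({"position_id": [101, 102], "beta": [1.05, 0.92]})

# Pattern A: maintain one wide table and update rows as each source lands.
# In Snowflake, every such update rewrites micro-partitions behind the scenes.
wide = riskmetrics.merge(barra, on="position_id", how="outer")

# Pattern B: append each delivery untouched to a narrow landing area, then
# build the wide, aggregated table once in an end-of-day batch.
landing = pd.concat(
    [riskmetrics.melt(id_vars="position_id"), barra.melt(id_vars="position_id")]
)
end_of_day_wide = (
    landing.pivot(index="position_id", columns="variable", values="value")
    .reset_index()
)

# Both patterns arrive at the same wide shape; B avoids intraday rewrites.
print(end_of_day_wide)
```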
Although such a wide table fits the model, the high number of columns might itself be an impediment to finding data of interest. It is also worth noting that the Data Warehouse might ultimately provide a wide, joined view of the data for analytic and data-mining purposes. So, in a Data Mesh setup, the Risk Source Data Domain could be treated more as an OLTP model, leaving the aggregation into a true OLAP model to the Data Platform team. That might be a good division of labor, as illustrated below.
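Purely as an illustration of that split (every schema, table and column name here is an assumption, not an agreed design), the domain might own narrow, source-aligned tables while the platform team owns the joined, analytics-facing view:

```python
# Hypothetical Snowflake DDL, held as strings to show who would own what.

# Owned by the Risk Source Data Domain: source-aligned, append-only tables.
DOMAIN_TABLES = """
CREATE TABLE risk_source.riskmetrics_position (
    as_of_date  DATE,
    position_id NUMBER,
    var_99      FLOAT
);
CREATE TABLE risk_source.barra_position (
    as_of_date  DATE,
    position_id NUMBER,
    beta        FLOAT
);
"""

# Owned by the Data Platform team: the wide, joined, OLAP-facing view.
PLATFORM_VIEW = """
CREATE VIEW analytics.position_risk_wide AS
SELECT r.as_of_date, r.position_id, r.var_99, b.beta
FROM risk_source.riskmetrics_position r
LEFT JOIN risk_source.barra_position b
  ON b.as_of_date = r.as_of_date
 AND b.position_id = r.position_id;
"""
```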
We should also consider whether we want data tables distinguished by data source. Does this make sense? Although we need to know where data came from, I don't think it is a key dimension in our future state.
Another point to note is that we may need distinct approaches for Position-level, Portfolio-level and Security-level Risk data. The "rolled up" data can be stored in a narrow, generic table that stores one metric value per row. This has two powerful benefits: new metrics can be added as rows without any schema change, and one set of loading and query logic can serve every metric.
It also has secondary benefits, such as reducing potential issues with sparsely populated columns.
However, we need to remember it has costs. This narrow, row-centric approach could hardly be further removed from the OLAP paradigm that Snowflake aims to support. It is also likely that Business users will be less comfortable working with this kind of schema. The ease-of-use concern can be tackled by providing views for the Business to use, as sketched below. This is another point worth debating and getting the Business's input on.
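Here is a minimal sketch of that narrow, generic table and of the friendlier, wide view we could derive for the Business; the levels, metric names and values are made up for illustration:

```python
import pandas as pd

# Narrow, generic storage: one metric value per row. New metrics arrive
# as new rows, so no schema change is needed and no column sits sparse.
risk_metric_values = pd.DataFrame(
    [
        ("2024-06-28", "PORTFOLIO", "P-01", "var_99", 3.4e6),
        ("2024-06-28", "PORTFOLIO", "P-01", "tracking_error", 0.021),
        ("2024-06-28", "PORTFOLIO", "P-02", "var_99", 1.1e6),
    ],
    columns=["as_of_date", "level", "entity_id", "metric_name", "metric_value"],
)

# The "view" offered to the Business: one row per portfolio and date, one
# column per metric -- the familiar wide shape, derived rather than stored.
portfolio_view = (
    risk_metric_values[risk_metric_values["level"] == "PORTFOLIO"]
    .pivot_table(index=["as_of_date", "entity_id"],
                 columns="metric_name", values="metric_value")
    .reset_index()
)

# Missing metrics (P-02 has no tracking_error) reappear as NULLs only in
# the derived view, never as sparse columns in storage.
print(portfolio_view)
```

In Snowflake itself, the same reshaping could be expressed in a view with conditional aggregation or the PIVOT clause.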
My personal conclusion is to accept the generic approach and later provide views and tables of commonly accessed data to support simpler and more efficient business processes.
I would conclude that any PoC is most likely to succeed if it is based on the principles of starting simple, layering on complexity when a need to do so is identified, and improving iteratively. Rather than attempting to define a 100% correct and finalized model up front, this approach should serve us well.
In the next article, I will talk about very specific Schema Design points in this area.
If you found this article informative or insightful, please feel free to like it, subscribe, and share it in your network to increase its reach.
#Risk #Snowflake #olap #oltp