Data Warehouse for Asset Management

OIn this article I would like to illustrate what a data warehouse for an asset management could look like. I will explain the main fact and dimension tables, along with the potential issues.


1. Fact Tables

The main fact tables in an asset management warehouse are position, performance, risk and transaction. In the next 4 paragraphs I’ll explain what they are briefly. Later on in the article I will be explaining them in more detail.

The position fact table (aka valuation fact table) contains the daily holdings and benchmark constituents for every portfolio, both security/instrument positions and cash positions. The main issue with this fact table is the matching between the holding/portfolio positions and benchmark positions. We have the option of not attaching the benchmark, which makes it flexible particularly for portfolios having more than one benchmarks. The position fact table is mainly used to provide portfolio breakdown, and top N holdings. The other issue is lookthrough.

The performance fact table contains the performance returns of every portfolios compared with their benchmarks, for many different periods e.g. 3M, 6M, 1Y, 3Y. There is also another type of performance fact table, which contains the performance attribution to different factors such as yield curve, sector allocation and security selection. Performance can also be allocated to different countries, currencies and asset classes, each compared to the benchmark. Please see my article here.

Because of the non additive nature of risk numbers, there are many risk fact tables. They mainly hold the risk sensitivities such as interest rate sensitivities (PV01, DV01), inflation sensitivities (IE01), credit rate sensitivities (CR01) and exchange rate sensitivities. The risk fact table also stores the VAR data (value at risk) at portfolio level and at “carved-up” level (for example, industry sector, currency and asset class)

Transaction fact table stores every purchase, sale and corporate action in every portfolio, including the fees and charges. The transaction status, which changes at every stage on different dates, can be stored on an accumulated snapshot fact table. Whereas the trades data can be stored on a “transaction” fact table (as opposed to snapshot fact table).

Other business areas requiring fact tables are:

  • Fund administration e.g. redemptions, subscriptions, distribution payments
  • Liquidity management e.g. cash ladder (projected cash)
  • Collateral management and counterparty exposure
  • ESG data (Environment, Social, Governance)


2. Dimensions

In an asset/investment management DW, the following dimensions are commonly found:

Security dimension (aka Instrument dimension or Asset dimension) which contains 3 types of securities: held securities, benchmark securities and watchlist securities.

Note: cash is a financial instrument but not a security. Only equity, fixed income and money market are financial securities, including derivatives (refer to The Securities Exchange Act, section 3 part 10).

Issuer dimension (aka company dimension) containing a list of public and private companies who issue shares or bonds, had an OTC trade/deal with us, or has ESG data.

Client dimension (aka customer dimension) containing a list of clients from the CMS such as Salesforce, with their usual attributes such as client type, Mifid category, industry sector, and address.

Portfolio dimension (aka account dimension, product dimension or fund dimension) which contains a list of open funds, segregated account/mandate and trust funds that the company runs. Portfolio dimension should contain share classes too.

Benchmark dimension, containing a list of benchmarks which the portfolios are measured against, including blended benchmarks. The “pairing” between portfolios and benchmarks is not done in this dimension.

Country dimension covers two country code schemes (ISO 2 chars) and (ISO 3 chars) and 2 to 3 country name columns (one for each OMS/IMS, plus ISO).

Currency dimension is pretty static. A country column is not useful here, as the currency name already includes the country and some currencies are used by several countries. It is usually a good practice to update country and currency dimensions manually.

Rating dimension. Covers Fitch, Moody’s, and S&P. Each agency is put on different set of rows, with an “agency” column. This data structure is known as a “stacked” dimension. Also covers “major” rating and “notched” rating as two different attributes/columns (AA is a major and AA+ is a notched). Also covers cash ratings (Moody’s), which is quite different from credit rating, possibly “watch” rating and “outlook” too.

Industry Sector dimension. Covers at least ICB and GICS, along with major providers such as ML, Barclays, Bloomberg. Some are 4 levels deep, some 7 levels. Unlike rating, don’t try to do translation between providers here, so stack them up with a “provider” column, then level 1 to level 7 columns.

Asset class dimension (aka asset type, instrument type, asset group) which contains a list like: equity, fixed income, cash, property, commodity as level 1. Level 2 contains the breakdown for example for fixed income the level 2s are bond, floating rate note, CDS, bond future. For bond the level 3s are callable/puttable or government/corporate.

Performance return basis dimension: contains the fee basis (gross or net of fees), cumulative or annualised, arithmetic or geometric, pricing method (single or dual pricing), etc. We should model them as a junk dimension because most attributes only have 2-3 values. I don’t recommend the performance period to be put here (e.g. 3 month, 1 year, 5 years), best to put them as columns in the performance return fact table.

Broker dimension: stores all brokers we use in trading. Used by the transaction fact tables.

And finally, date dimension.


3. Issues

The usual issues/dilemmas with designing the dimensions are:

  • Do we include country of risk/incorporation, currency, industry sector, etc. in the security dimension? Absolutely, as type 2 attributes. Plus we also have them as separate dimensions in their own right.
  • How do you model derivative instruments which has very different attributes? By creating Option, Swaps, Futures and CDS as separate dimensions. Don’t separate the OTC Derivatives (Over The Counter, e.g. options, swaps, CDS) from ETDs (Exchange Traded Derivatives, e.g. futures, forwards). Don’t create too many dimensions for derivatives, but group them up. 4 or 5 dimensions sounds about right. Don’t worry too much if certain attributes are duplicated between the main security dimension and the derivative instrument dimension, it’s ok.
  • Do we put cash in the security dimension or not? Don’t. Only real financial instruments should be in the security dimension (including CD, Repo, OTC and ETD), but leave settled and unsettled cash out (including collateral cash for margin requirements) See my article here for more info: link. Cash in different currencies should be put directly in the position fact table, clearly labelled as settled or unsettled as many portfolio management decisions hang on this label.


The above is the summary. The full article is in my blog: https://dwbi1.wordpress.com/2018/09/25/data-warehouse-for-asset-management/

Happy reading. If you have any questions you can DM me.

要查看或添加评论,请登录

Vincent Rainardi的更多文章

  • DQ Engineering

    DQ Engineering

    DQ stands for Data Quality. If you don't have a background in data quality, read this first: https://www.

    6 条评论
  • Data Product

    Data Product

    For those of you who don't know what a data product and “data as a product” are, please read this first:…

    13 条评论
  • Snowflake vs SQL Server

    Snowflake vs SQL Server

    Sometimes we need to remind ourselves that Snowflake is not an OLTP database. I know today is the era of Hybrid tables…

    6 条评论
  • Data engineer becoming solution architect

    Data engineer becoming solution architect

    Are you a data engineer thinking about transitioning to a cloud solution architect? Data engineer are good with…

    2 条评论
  • Asset Mgt vs Fund Mgt vs Investment Mgt vs Wealth Mgt: What's the difference?

    Asset Mgt vs Fund Mgt vs Investment Mgt vs Wealth Mgt: What's the difference?

    If you work in banking or investment or any other sector in financial services, you might be wondering about the above.…

  • Data Warehousing Basics: Cost

    Data Warehousing Basics: Cost

    If you call yourself a data engineer you need to be aware of 2 additional things compared to a developer. The first one…

    2 条评论
  • My Linkedin post & articles

    My Linkedin post & articles

    The list below goes back to Nov 2024. For older than that see here.

    9 条评论
  • Data Warehousing Basics: Single Customer View

    Data Warehousing Basics: Single Customer View

    Imagine that you work for an insurance company who sell health insurance (HI), life insurance (LI), general insurance…

    2 条评论
  • Data Warehousing Basics: NFR

    Data Warehousing Basics: NFR

    What I’m about to tell you today failed a lot of data warehousing projects which is why it’s worth paying attention so…

    1 条评论
  • ML and AI - What's the difference?

    ML and AI - What's the difference?

    Machine Learning covers about 20-30 algorithms such as Logistic Regression, Decision Tree, Gradient Boosting, Random…

    5 条评论

社区洞察

其他会员也浏览了