EMIR Refit Pairing and Matching: A Machine Learning Approach

Jitender Malik

SVP | Data engineering & Science(AI/ML, Gen AI, Computer Vision) | AI Engineering Lead at NatWest Group

Published: June 22, 2024

The European Market Infrastructure Regulation (EMIR) requires EU counterparties to report their derivative transactions to trade repositories (TRs). EMIR uses double-sided reporting: the details of a trade between two EU entities are reported separately by each counterparty. Under the regulation, the two counterparties must agree on a unique trade identifier and on the characteristics of the trade itself (the so-called common data) before submitting their reports.

When both counterparties report their legs (trade events) to the TRs, they need to ensure that the two legs of the trade are paired and matched (completeness and accuracy). The challenge for a financial institution is that, when a mismatch occurs, it cannot easily tell whether the error lies on its own side or on the counterparty's side, and therefore whether it needs to take corrective action.
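To make the terms concrete, here is a minimal sketch of a deterministic pairing and matching check. The field names (`uti`, `notional`, `currency`, `maturity_date`) and the sample reports are hypothetical and chosen only for illustration; real reports carry many more fields.

```python
# Hypothetical subset of the "common data" fields both sides must agree on.
COMMON_FIELDS = ("notional", "currency", "maturity_date")


def pair_and_match(own_reports, counterparty_reports):
    """Pair reports on the trade identifier, then compare the common fields.

    Returns a dict mapping each of our trade identifiers to
    'matched', 'mismatched' or 'unpaired'.
    """
    theirs = {report["uti"]: report for report in counterparty_reports}
    status = {}
    for ours in own_reports:
        other = theirs.get(ours["uti"])
        if other is None:
            status[ours["uti"]] = "unpaired"      # no report with the same identifier
        elif all(ours[f] == other[f] for f in COMMON_FIELDS):
            status[ours["uti"]] = "matched"       # paired and all common fields agree
        else:
            status[ours["uti"]] = "mismatched"    # paired, but some field differs
    return status


own = [
    {"uti": "T1", "notional": 1_000_000, "currency": "EUR", "maturity_date": "2026-01-15"},
    {"uti": "T2", "notional": 500_000, "currency": "EUR", "maturity_date": "2025-09-30"},
    {"uti": "T3", "notional": 250_000, "currency": "GBP", "maturity_date": "2025-12-01"},
]
counterparty = [
    {"uti": "T1", "notional": 1_000_000, "currency": "EUR", "maturity_date": "2026-01-15"},
    {"uti": "T2", "notional": 550_000, "currency": "EUR", "maturity_date": "2025-09-30"},
]

result = pair_and_match(own, counterparty)
print(result)
```

A check like this shows that trade T2 is mismatched, but it cannot say which side reported the wrong notional. That is the gap the approach in this article targets.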

To address this problem, this paper proposes using a one-class SVM (OCSVM) to identify the party at fault when the two sides of the same trade, as reported to the regulator, do not match.

We first look at how OCSVM works. We then create and train a model on trade events that were paired and matched successfully. From these matched events the model learns a decision boundary (a hyperplane in the kernel feature space) that encloses the true paired-and-matched states.

New, unmatched trade events are then scored by their distance from this boundary. If an event falls inside the boundary, the fault can be attributed to the counterparty: the model has learned that events like this normally match, so the issue most likely lies on the other side. If an event falls outside the boundary, the issue is attributed to the bank itself, because the model does not recognise the event as resembling anything that has matched successfully in the past.

Method: As stated, the paper uses a one-class SVM (OCSVM), an extension of the support vector machine (SVM), a classification algorithm in machine learning.

Introduction to SVM

SVMs are used for binary or multi-class classification. They find the optimal hyperplane, the one with the maximum margin, that separates the classes. If the dataset is linearly separable, a hard margin can be used; if the classes cannot be separated cleanly, we opt for a soft margin, which tolerates some misclassified points. The hyperplane sits between the classes, and the side on which a point falls indicates the class it belongs to. SVMs can be used for both classification and regression problems.

Additionally, SVMs can efficiently perform non-linear classification using the kernel trick, which represents the data only through a set of pairwise similarity comparisons between the original observations, instead of explicitly computing their transformed coordinates in the higher-dimensional feature space.
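As an illustration of pairwise similarity comparisons, the sketch below computes a Gaussian (RBF) kernel matrix in plain Python. The choice of the RBF kernel, the `gamma` value and the sample points are assumptions made for this example; the article does not state which kernel is used.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def gram_matrix(points, gamma=0.5):
    """Matrix of pairwise similarities; this is all a kernel SVM needs to see."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


# Illustrative two-dimensional observations.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
K = gram_matrix(points)
for row in K:
    print([round(v, 4) for v in row])
```

Each entry of `K` is a similarity between two original observations: 1 on the diagonal and decaying towards 0 as points move apart. The higher-dimensional coordinates are never computed.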

One-class SVM (OCSVM)

One-class support vector machines (OCSVM) are a type of outlier, anomaly, or novelty detection algorithm. A generic SVM separates data into several classes with a hyperplane, which then decides the class of any subsequent data point; an OCSVM is instead trained on a single class. Its key working principles are as follows. First, it defines a boundary around the majority class (the normal instances) in the feature space. This boundary encapsulates the normal data points, creating a region of normalcy. Second, the algorithm strives to maximise the margin around the normal instances, which gives a more robust separation between normal and anomalous data points and is crucial for accurately identifying outliers during testing. Lastly, OCSVM has a built-in hyperparameter called “nu”, which is an upper bound on the fraction of margin errors and a lower bound on the fraction of support vectors. Tuning this parameter changes the model’s sensitivity to outliers.
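The effect of “nu” can be seen in a small experiment. The sketch below assumes scikit-learn's `OneClassSVM` and uses synthetic Gaussian data as a stand-in for scaled trade features; neither the library nor the data comes from the article.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))  # synthetic stand-in for scaled trade features

results = {}
for nu in (0.05, 0.2, 0.5):
    model = OneClassSVM(kernel="rbf", gamma="scale", nu=nu).fit(X)
    # predict() returns +1 for inliers and -1 for outliers.
    outlier_frac = float(np.mean(model.predict(X) == -1))
    sv_frac = len(model.support_) / len(X)
    results[nu] = (outlier_frac, sv_frac)
    print(f"nu={nu}: training outliers={outlier_frac:.3f}, support vectors={sv_frac:.3f}")
```

As nu grows, a larger share of the training events ends up outside the boundary and more points become support vectors. In practice the fraction of training outliers tracks nu only approximately, because of numerical tolerance in the solver.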

In the figure above, the trade events inside the blue region (the learned boundary) are the matched trades, while events outside this region can be treated as unmatched when analysing the party at fault. The boundary separates out the anomalies, which in our case are the mismatched or unmatched events.

Implementation

Before training the one-class SVM model, it is essential to normalise the trade data so that features are on similar scales. This prevents features with large values from dominating the model’s learning process. The data is then transformed using a kernel function that maps it into a higher-dimensional space. This transformation allows the algorithm to find a hyperplane that separates the normal data points from the origin, effectively capturing the distribution of normal data. Just as an SVM maximises the margin between classes, OCSVM looks for the hyperplane that maximises the margin between the normal data points and the origin in the transformed space. The hyperplane is learned during model training, under the chosen kernel and regularisation parameters.
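The normalisation step can be sketched as a z-score standardisation of each feature column. The article does not say which scaling method is used, so z-scoring is an assumption here, and the `notional` and `price` values are made up for illustration.

```python
from statistics import mean, pstdev


def standardise(column):
    """Rescale a feature column to zero mean and unit standard deviation."""
    mu = mean(column)
    sigma = pstdev(column)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(value - mu) / sigma for value in column]


# Illustrative raw features on very different scales.
notional = [1_000_000, 250_000, 5_000_000, 750_000]
price = [101.2, 99.8, 100.5, 100.1]

scaled_notional = standardise(notional)
scaled_price = standardise(price)
print([round(v, 3) for v in scaled_notional])
print([round(v, 3) for v in scaled_price])
```

After scaling, both columns have comparable ranges, so the notional no longer dominates distance calculations simply because its raw values are in the millions.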

Once training is complete, OCSVM provides a decision function that assigns an anomaly score to each new trade data point. The score reflects the distance of the point from the learned hyperplane. Points with higher anomaly scores are more likely to be anomalies in trade reporting, which helps counterparties detect mismatches at an early stage, before they report to the authorities, so that they can correct them.

A threshold is then applied to the anomaly scores. Trade data points with scores above the threshold are classified as anomalies, while those below it are considered normal. The threshold can be adjusted to reach the desired trade-off between false positives (normal points classified as anomalies) and false negatives (anomalies classified as normal).
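The trade-off can be illustrated with a few hand-picked scores. The scores and labels below are invented for the example, and the convention follows the text above: a higher score means more anomalous.

```python
def errors_at_threshold(scores, is_anomaly, threshold):
    """Count false positives and false negatives when every score
    above `threshold` is flagged as an anomaly."""
    false_pos = sum(1 for s, a in zip(scores, is_anomaly) if s > threshold and not a)
    false_neg = sum(1 for s, a in zip(scores, is_anomaly) if s <= threshold and a)
    return false_pos, false_neg


# Invented anomaly scores and ground-truth labels for six trade events.
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
is_anomaly = [False, False, True, False, True, True]

for threshold in (0.2, 0.5, 0.8):
    fp, fn = errors_at_threshold(scores, is_anomaly, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

Raising the threshold removes false positives at the cost of more false negatives. Note that some libraries use the opposite sign convention: in scikit-learn, for instance, `OneClassSVM.decision_function` returns positive values for inliers and negative values for outliers.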

Classification of mismatched trade events using OCSVM

First, we collect the matched trade events, which include fields such as TradeID, TradeTime, Buyer/Seller and price. Once the data is collected, we need to pre-process it before training the model. This stage includes data cleaning, feature engineering, normalisation, scaling, etc. We can then train the model on the pre-processed data. Once the model is trained, we can use it to detect the party at fault based on the hyperplane, as shown below.

Based on the findings, we can identify which party is at fault when a mismatch occurs:

Counterparty at fault: the points that lie within the hyperplane (all red)

Party at fault: the points that lie outside the hyperplane (all blue)
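The whole procedure can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the implementation used by the authors: it relies on scikit-learn's `StandardScaler` and `OneClassSVM`, the two features (price and log10 of notional) are hypothetical, and the matched events are synthetic.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(42)

# Synthetic stand-in for the bank's historically matched events.
# Hypothetical columns: price, log10(notional).
matched = np.column_stack([
    rng.normal(100.0, 1.0, size=500),
    rng.normal(6.0, 0.2, size=500),
])

# Scale the features, then learn a boundary around the matched events only.
model = make_pipeline(
    StandardScaler(),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
)
model.fit(matched)


def party_at_fault(own_side_events):
    """Apply the rule from the text: inside the boundary -> counterparty,
    outside the boundary -> own side."""
    labels = model.predict(own_side_events)  # +1 inside, -1 outside
    return ["counterparty" if label == 1 else "own side" for label in labels]


# Our side of two unmatched trades (invented values).
unmatched = np.array([
    [100.0, 6.0],   # looks like events that normally match
    [130.0, 9.0],   # unlike anything matched historically
])
verdicts = party_at_fault(unmatched)
print(verdicts)
```

The first event resembles the historical matched population, so the rule points to the counterparty; the second is far from it, so the rule points to the bank's own report. The verdict is a statistical hint for prioritising investigation, and the choice of `nu` and `gamma` would need tuning on real data.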

Conclusions:

This paper described the use of OCSVM to address the pairing and matching problem raised by EMIR Refit. With this algorithm, European financial institutions can better understand their trade information and correct errors in advance, rather than going through a lengthy operational process.

Follow-up work will include further refinement of this procedure and the incorporation of newer innovations into its steps.

Abhishek Dudeja

Software Development Manager @ Clearwater Analytics | AWS Expertise | Computer Vision

8 months ago

Great going Jitender & team. It is an absolute delight to witness this team's journey - you folks have come a long way employing intelligent ML algos to T&TR domain.

Gerard Fernandes

8 months ago

Interesting. But I don't really understand why a problem that can be solved deterministically, benefits from a statistical solution. Maybe I'm missing something. But to me, the first step would be a bi-party handshake to agree an identifier, and then it should be simple to confirm both pairing and matching.

Manjari Sharma

AI/ML Enthusiast | Python and C programmer

8 months ago

Very excited to see the impacts such solutions will have in banking sector

Nitin Sharma

Associate Vice President at NatWest Markets(RBS)

8 months ago

Working on this solution has been an enriching experience. Leveraging One-Class SVM (OCSVM) has enabled us to effectively identify mismatches in double-sided reporting for EMIR compliance, significantly reducing errors and improving efficiency. The precision of OCSVM in detecting anomalies before they escalate has been a game-changer. Excited to see the positive impact this will have on the industry! #OCSVM #MachineLearning #EMIRCompliance #Innovation #FinTech

Nikhil Deswal

Senior Software Engineer

8 months ago

Thank you for including me. I'm excited to work with you more on these innovative projects!
