Supersonic Payment Protector with Redis Stack

Account hijackings are a rising threat and a costly addition to the already sour online fraud cake. Because many customers also transact online while overseas, the result is some mix of frustrated legitimate users jumping through verification hoops, reputational losses, and higher review costs for the Bank's risk and prevention teams. A few recent events put this into context:

  • In the final weeks of 2021, around 790 Bank customers lost a total of S$13.7 million to scammers. Many of these victims lost their entire life savings.
  • In another incident, hackers abroad posed as 75 bank customers here to make about $500,000 in fake credit card payments.

Domino Effects on the Banking Business

While it is harder for Banks to put a monetary value on account takeover losses than on, say, direct online fraud, that does not make it a victimless crime. There are very real consequences for affected banking businesses:

  • Hacks and security issues put a strain on the Bank's risk team.
  • The Bank's support team is overwhelmed by requests from customers attempting to reclaim their accounts.
  • Regulators may impose penalties or require compensation.
  • Users turn to competitors due to a loss of reputation and brand trust.

In the worst-case scenario, the Bank's stock can even plummet after a publicized breach or a penalty imposed by the regulators.

Digital identity profiling as the defence shield of fraud prevention

The proposal here is a combination of powerful monitoring techniques, such as geo analysis, IP analysis and device fingerprinting, that enables Banks to profile users based on their activities from day one of registration.

Device fingerprinting:

A device hash/ID can be created using data from a browser, operating system, device, and network and this can flag suspicious connections. This is something that doesn’t require excessive calculations, yet can be highly effective in preventing users from logging in with unknown devices or browsers. It can also detect the use of suspicious emulators or virtual machines, which fraudsters often use to make multiple requests from the same original computer.

On the flip side, mobile apps built for a specific smartphone OS can use an SDK (software development kit) to get extra information about devices, whether they are built by Apple, Samsung or other vendors. Such mobile device fingerprinting products detect the MAC address, serial number (Android only), device time zone, battery health, CPU details and more.

IP analysis: This classic fraud prevention method can be enriched to reveal suspicious VPN proxies or TOR usage.

Logging the data obtained can also be useful to create whitelists for your users in order to reduce false positives. For instance, if a user was able to let you know that they’re traveling in advance, it could be reflected in their IP address being whitelisted.
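To make the whitelisting idea concrete, here is a minimal sketch using only the Python standard library. The function name is_whitelisted and the sample network ranges are assumptions for illustration (the addresses come from the reserved documentation ranges), not part of any Redis or Bank API.

```python
import ipaddress

def is_whitelisted(ip, allowed_networks):
    """Return True if ip falls inside any of the user's whitelisted networks."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(net) for net in allowed_networks)

# Hypothetical whitelist: the user told us in advance about a trip,
# so the range of the hotel network was added to their profile.
travel_whitelist = ["203.0.113.0/24"]

from_hotel = is_whitelisted("203.0.113.25", travel_whitelist)
from_elsewhere = is_whitelisted("198.51.100.1", travel_whitelist)
```

A login from the whitelisted range can then skip the extra verification step, while any other foreign address still goes through the normal checks.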

User Behavior Focused Payment Protection Rules

It is essential to have reinforced rules in place that let us understand what is considered safe behavior and what should raise warning flags. We can build such rules from user device fingerprinting, rule-based algorithms for whitelisting, and user behavior (UB) rules. However, from an implementation point of view, running these fingerprint lookups across historical user activity and checking the rules in real time becomes a challenge.

Key challenges of detecting and handling suspicious activities

Above all, Banks do not want to stop valid transactions unnecessarily, as they would forgo the transaction fees they could otherwise have earned and might lose the customer to another bank. The challenge of fraud detection is to have a high confidence level in the authenticity of the transaction. Payment processors try to home in on that confidence level by running multiple fraud algorithms at the same time.

CX - Customer Experience

This calls for sustaining the existing SLA for response time: NRF demands sub-second response times across a humongous number of data lookups. In fraud prevention, the faster you can make a decision with zero false hits, the better the customer experience.

One-shot hit:

If an account hack is already underway, there is one and only one chance to spot the suspicious user behavior.

Need for a heavy uplift of a widespread technology stack:

As an example, consider the AWS end-to-end solution for real-time fraud detection, which leverages the graph database Amazon Neptune, Amazon SageMaker and the Deep Graph Library (DGL) to construct a heterogeneous graph from tabular data and train a Graph Neural Network (GNN) model to detect fraudulent transactions in the IEEE-CIS dataset.

Now let us see how a simpler approach built on Redis can help achieve real-time fraud detection without affecting the customer experience.

Redis Stack - THE SUPERSONIC DATA PROCESSING PLATFORM

Redis at its core is designed around two principles:

  • Real-time speed
  • Simplicity

Redis Stack is an extension of Redis that adds modern data models and processing engines to provide a complete developer experience. Redis Stack unifies and simplifies the developer experience of the leading Redis modules and the capabilities they provide. Redis Stack bundles five Redis modules: RedisJSON, RediSearch, RedisGraph, RedisTimeSeries, and RedisBloom.

Speed

All Redis data resides in memory, which enables low latency and high throughput data access. Unlike traditional databases, in-memory data stores such as Redis do not require a trip to disk, reducing engine latency to microseconds. Because of this, in-memory data stores can support an order of magnitude more operations and faster response times. The result is blazing-fast performance, with average read and write operations taking less than a millisecond and support for millions of operations per second.

Developer Friendly: Ease of Use

Redis enables you to write traditionally complex code with fewer, simpler lines. With Redis, you write fewer lines of code to store, access, and use data in your applications. The difference is that developers who use Redis can use a simple command structure as opposed to the query languages of traditional databases. For example, you can use the Redis hash data structure to move data to a data store with only one line of code. A similar task on a data store with no hash data structures would require many lines of code to convert from one format to another. Redis comes with native data structures and many options to manipulate and interact with your data. Over a hundred open source clients are available for Redis developers. Supported languages include Java, Python, PHP, C, C++, C#, JavaScript, Node.js, Ruby, R, Go, and many others.

High availability and scalability

Redis offers a primary-replica architecture in a single node primary or a clustered topology. This allows you to build highly available solutions providing consistent performance and reliability. When you need to adjust your cluster size, various options to scale up and scale in or out are also available. This allows your cluster to grow with your demands.

Simplified Platform

It avoids building up a vast, disjointed technology stack by consolidating onto a single, simple platform.

Highly Resilient, Vendor-Neutral Platform

Redis Stack supports hybrid and multi-cloud deployment, which is fairly unique. If you choose to develop your application primarily on Redis and Redis Stack (and, as we have seen, you can do a lot with those modules), you have a system that is not locked into any one cloud provider. Because you are not using cloud-proprietary systems such as DynamoDB on AWS or Bigtable on Google Cloud, you can run Redis on any of these platforms. Redis runs out of the box on Amazon, Microsoft Azure or Google Cloud. With a button click, it spins up on the cloud provider of your choice, and it works the same way no matter where it is running.

And the really cool thing is that you could even deploy your application across all of these cloud providers if you wanted to, and you can also download Redis software to your on-premises data center. So you can imagine a system distributed across AWS, Google Cloud, Microsoft Azure and even your on-premises network, and think about the resiliency that gives you.

Payment Fraudulent Transaction Detection

With that quick intro, let us get back to our use case: supersonic fraud prevention. By design, Redis delivers sub-millisecond response times, enabling millions of requests per second for real-time applications.

Solution Overview

Digital Identity Profile Matching - RedisBloom

Powerful device fingerprinting: Instantly know when a user is connecting with a suspicious combination of software and hardware

User Behavioural Payment Protection Rules - RedisBloom

Collect and screen complete user activity on the app or web app via custom API calls carrying any data point you wish to send. It is the closest thing to conducting behavior analysis, helping you understand precisely when someone is acting suspiciously.

Automated AML Payment Screening - Redis Graph

Financial institutions (FIs) that fail to identify politically exposed persons (PEPs) or that breach sanctions put themselves at risk of fines, which can be quite significant. Between 2008 and 2018, regulators around the globe levied almost $27 billion in fines related to watchlist screening. Notable offenders include BNP Paribas (fined $9 billion in 2014), Societe Generale (settled for $1.3 billion in 2018) and Standard Chartered (fined $1.1 billion in 2019).

A thorough screening includes checks against sanctions and PEP lists. These watch lists are continually updated with new names, so both sanctions and PEP screening should be done in real time to adhere to AML requirements and to create a seamless customer payment process. PEP screening identifies politically exposed persons so that customer due diligence can be conducted on them as part of a robust Anti-Money Laundering and Know Your Customer (AML/KYC) program. To improve the effectiveness of sanctions and PEP screening processes, and to automate much of the associated workload, financial institutions can also implement AML/KYC solutions designed to help mitigate AML risks.

A single API-led solution can pull information from various sources to help screen customers against sanctions and PEP databases. With cutting-edge technologies, FIs need to reduce false positives and thereby increase efficiencies in their screening process.

Automatic watch list screening and ongoing monitoring, coupled with a global identity verification platform, is a smart and economical way to make it more difficult for corrupt individuals to launder illicit funds and thus safeguard an FI’s reputation and integrity.

Solution Details

Fingerprinting for Identity Profiling

Browser fingerprinting techniques are helpful for identifying visitors with a pattern of fraudulent behavior and then targeting only these visitors for additional security. In addition, fraudsters often use identity concealing techniques like disabling cookies, surfing through a VPN, or using browsers in incognito mode. These are all areas where fingerprinting proves to be at its best since it identifies users quickly without relying on IP addresses and site cookies.

Browser fingerprinting analyzes any given user’s software and hardware configuration, which in turn creates unique IDs that can be used to highlight suspicious behavior. It can help spot a range of potentially fraudulent activities including synthetic IDs, identity theft, CNP fraud, phishing, spoofing, account takeover, and affiliate fraud.

Hashing: All the data returned from online fingerprinting is processed through a hash function, which maps data of arbitrary size to a fixed-size value, typically rendered as a long string of letters and numbers. This makes the information easier to log, encrypt, analyze and compare.
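As a rough illustration of the hashing step, the sketch below turns a set of device attributes into one fixed-size fingerprint. The attribute names and the choice of SHA-256 over canonical JSON are assumptions made for this example; a real fingerprinting product will collect far more signals and may hash them differently.

```python
import hashlib
import json

def device_fingerprint(attributes):
    """Return a SHA-256 hex digest of the collected device attributes."""
    # Sort the keys so the same attributes always serialize to the same string.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical attribute sets: the first two describe the same device.
known = device_fingerprint({"browser": "Firefox 102", "os": "Windows 11", "timezone": "Asia/Singapore"})
same = device_fingerprint({"timezone": "Asia/Singapore", "os": "Windows 11", "browser": "Firefox 102"})
other = device_fingerprint({"browser": "Firefox 102", "os": "Linux", "timezone": "Asia/Singapore"})
```

Whatever the size of the input, the result is always 64 hexadecimal characters, so it is cheap to store and to compare against the fingerprints seen at registration.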

User device/browser activity Streams. Apart from fingerprinting info, user activity on a busy app or site creates a lot of messages. Using streams, we can create a series of real-time feeds, which include page views, clicks, searches, and so on, and allow a wide range of consumers to monitor, report on, and process this data.

Supersonic Data ingestion via Redis Streams

For data ingestion, we leverage Redis Streams, which is part of Redis Stack. No additional technology is required here.

Redis Streams

Simply put, a Redis stream is an append-only data structure. It has some similarities to other data structures such as lists, but it is more capable and more complex. When we append a message to a stream, it becomes available to consumers. It has a blocking API that lets consumers wait for new messages to arrive. It is fast and easy to implement. It also offers consumer groups, which allow different subsets of messages to be sent to different consumers.

Adding Data

The fundamental write command is XADD, which appends a new entry in the specified stream.

XADD profile-information * name "Jacky Chao" age 27 source-country baku

Here XADD is the command that tells Redis to append a new entry to the stream at the specified key. profile-information is the key name of the stream, and * tells Redis to generate a monotonically increasing ID for the entry. We can also specify the ID explicitly, but that is rarely needed; server-generated IDs are perfect for almost all cases. The rest of the command is the list of field-value pairs that make up our stream entry. Note that a value containing a space, such as the name here, must be quoted so that it is read as a single argument.

XGROUP CREATE profile-information identityprofile $

In this scenario we do not want to push the same messages to multiple clients. Instead, we want to send different messages to different clients. This is where consumer groups come into the picture.

Creating a consumer group

Assuming I have a key “profile-information” of type stream already existing, then the following command will create a consumer group.

> XGROUP CREATE profile-information profilegroup $

$ is a special ID that represents the highest ID currently in the stream. The consumer group needs to know where it should start tracking messages for its consumers. Passing $ delivers only the messages added after the group was created. If we pass 0 instead of $, the group will consume all the messages from the beginning of the stream. We can also specify any valid ID as the starting point from which the group consumes messages. profilegroup is the name of the group, and profile-information is the existing stream key. XGROUP CREATE can also create the stream if it does not exist when the MKSTREAM option is passed as the last argument:

> XGROUP CREATE newstream profilegroup $ MKSTREAM

Consuming from a group

Redis provides a command named XREADGROUP, which is very similar to XREAD and provides the same BLOCK option; without it, the call is synchronous. There is a mandatory GROUP option with two arguments: the name of the consumer group and the name of the consumer attempting to read.
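In redis-cli, a read from the group would look something like XREADGROUP GROUP profilegroup consumer-1 COUNT 1 BLOCK 2000 STREAMS profile-information > (consumer-1 is a made-up consumer name). To show what the group does with that call, here is a toy in-memory model in Python. It is not Redis and not a Redis client: the ToyStream class, its simplified IDs and its method names are assumptions that only mimic the delivery behaviour described above.

```python
from collections import defaultdict

class ToyStream:
    """In-memory stand-in for one Redis stream with one consumer group."""

    def __init__(self):
        self.entries = []                 # append-only list of (id, fields)
        self.next_index = 0               # first entry the group has not handed out yet
        self.pending = defaultdict(list)  # consumer -> ids delivered but not acknowledged

    def xadd(self, fields):
        # Real Redis IDs are <milliseconds>-<sequence>; a counter is enough here.
        entry_id = f"{len(self.entries) + 1}-0"
        self.entries.append((entry_id, fields))
        return entry_id

    def xreadgroup(self, consumer, count=1):
        # Each entry is delivered to exactly one consumer of the group.
        batch = self.entries[self.next_index:self.next_index + count]
        self.next_index += len(batch)
        self.pending[consumer].extend(entry_id for entry_id, _ in batch)
        return batch

    def xack(self, consumer, entry_id):
        # Acknowledging removes the entry from the consumer's pending list.
        self.pending[consumer].remove(entry_id)

stream = ToyStream()
for country in ("SG", "AZ", "SG"):
    stream.xadd({"source-country": country})

first = stream.xreadgroup("consumer-1")   # receives entry 1-0
second = stream.xreadgroup("consumer-2")  # receives entry 2-0, not a copy of 1-0
stream.xack("consumer-1", first[0][0])
```

The two consumers share the work instead of each seeing every message, and anything read but not yet acknowledged stays in the pending list so it can be retried.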

Redis Gears and RedisInsight

RedisGears subscribes to this stream and, in our case, is used for event transformation, map-reduce and micro-batching. RedisGears pushes the data to the RedisTimeSeries module, which feeds the visualisation tool Grafana, where trends can be analysed. Finally, the data can be persisted in the Redis database. RedisInsight is a browser-based tool for monitoring Redis. Its support for RedisGraph and RedisTimeSeries makes it a handy visualization tool as well.

Supersonic Fraud probability lookup via RedisBloom

Identify unusual behaviors by comparing to previous activity without storing massive volumes of information. RedisBloom provides Redis with support for additional probabilistic data structures. These structures allow for constant memory space and extremely fast processing while still maintaining a low error rate. It supports scalable Bloom and Cuckoo filters to determine whether an item is present or absent from a collection with a given degree of certainty, Count-min sketch to count the frequency of the different items in sub-linear space, and Top-K to count top k events in a near deterministic manner.
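To see why a Bloom filter needs so little memory, here is a small pure-Python sketch. It is an illustration of the general technique, not the RedisBloom implementation: the class name, the SHA-256 double-hashing scheme and the sample fingerprints are all assumptions. The sizing uses the standard formulas m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions.

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, expected_insertions, false_probability):
        # m bits and k hash functions for n items at false-positive rate p.
        self.size = math.ceil(
            -expected_insertions * math.log(false_probability) / (math.log(2) ** 2)
        )
        self.hash_count = max(1, round(self.size / expected_insertions * math.log(2)))
        self.bits = bytearray((self.size + 7) // 8)

    def _positions(self, item):
        # Double hashing: derive k bit positions from two parts of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.hash_count)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely never added"; True means "probably added".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))

profiles = BloomFilter(expected_insertions=99999, false_probability=0.001)
profiles.add("device-hash-of-registered-phone")
seen_before = profiles.might_contain("device-hash-of-registered-phone")
```

With these settings the filter uses roughly 1.4 million bits (about 180 KB) for close to 100,000 profiles, and a lookup touches only a handful of bits regardless of how many profiles have been loaded.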

We will use Redisson as our client library for connecting to Redis, as it has out-of-the-box support for Bloom filters.

For this PoC, we shall use a local Redis Stack instance.

From RedisInsight, we can also see all the loaded identity profiles.

Here let us create a Bloom filter of type String and initialize it with the expected data size and the probability of false positives.

RBloomFilter<String> stringRBloomFilter = redissonClient.getBloomFilter("usernames");
stringRBloomFilter.tryInit(99999, 0.001); // Set expectedInsertions and falseProbability
identityProfileBloomFilter = stringRBloomFilter;


Consumers can now call a synchronous API to check the probability of a match against the existing identity profile collected during registration. To keep it simple, we used the GET method.

If no match is found, warn the user and step up with MFA. Once the user passes, the new profile information is loaded for future checks.
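The flow just described can be sketched in a few lines of Python. The function name, the returned labels and the use of a plain set in place of the Bloom filter are assumptions for illustration; denying the login when MFA fails is also an assumption, since the text only covers the success path.

```python
def check_login(fingerprint, known_profiles, mfa_passed):
    """Decide what to do with a login attempt.

    known_profiles is any container supporting `in` and `.add` (a set here,
    standing in for the Bloom filter); mfa_passed is a callable that runs the
    step-up challenge and returns True on success.
    """
    if fingerprint in known_profiles:
        return "ALLOW"
    if mfa_passed():
        # Remember the new device so the next login is not challenged again.
        known_profiles.add(fingerprint)
        return "ALLOW_AFTER_MFA"
    return "DENY"

known_profiles = {"fp-known"}
first = check_login("fp-known", known_profiles, lambda: False)
second = check_login("fp-new", known_profiles, lambda: True)
third = check_login("fp-new", known_profiles, lambda: False)
fourth = check_login("fp-stranger", known_profiles, lambda: False)
```

Only the second attempt triggers the MFA challenge for the new device; by the third attempt that device is already part of the profile.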

Supersonic Payment Protection Rules Execution - RedisBloom

Here we can pre-check each transaction against a set of governance rules that fraudulent transactions typically fall under, and compute a score. The Bank can set the threshold that decides whether to stop the transaction or not.

Let us run through some of them to get an idea.

Detecting the Use of Emulators and Virtual Machines

Criminals employ emulators to create fraudulent virtual devices. They then use those devices to act like legitimate users to carry out various fraudulent activities. With more than six billion people worldwide (including 85% of Americans) using smartphones, mobile e-commerce spending is at an all-time high. Accordingly, fraudulent activity is growing.

Systems can tell human from non-human users, and fraudulent devices can be identified by pairing device attributes with behavioral data. Identifying whether a device has been tampered with or spoofed is then straightforward: emulators leave a footprint on the OS and have incomplete behavioral data, such as missing accelerometer or gyroscope readings, which are automatic red flags.

Unusual Activity Spikes in Accounts

This rule captures cases where the intensity of transactions on a previously inactive account suddenly increases without a plausible reason.
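One simple way to express such a rule is a sliding-window counter, sketched below in plain Python. The class name, the 60-second window and the limit of three transactions are assumptions chosen for the example; a real deployment would tune them per account and keep the counters in Redis rather than in process memory.

```python
from collections import deque

class SpikeDetector:
    def __init__(self, window_seconds, max_transactions):
        self.window_seconds = window_seconds
        self.max_transactions = max_transactions
        self.timestamps = deque()

    def record(self, timestamp):
        """Record a transaction; return True if the window now exceeds the limit."""
        self.timestamps.append(timestamp)
        # Drop transactions that have fallen out of the sliding window.
        while self.timestamps[0] <= timestamp - self.window_seconds:
            self.timestamps.popleft()
        return len(self.timestamps) > self.max_transactions

detector = SpikeDetector(window_seconds=60, max_transactions=3)
# Four transactions within 30 seconds, then one much later.
flags = [detector.record(t) for t in (0, 10, 20, 30, 200)]
```

The fourth transaction is the one that trips the rule; once the account goes quiet again the window empties and the flag clears.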

Suspicious changes in ISP, VPN Usage

A VPN, or virtual private network, can be a useful tool for the casual internet user. These software plugins can help hide your location and make it easier to be anonymous online for a whole variety of reasons. And of course, masking your location also makes VPNs the perfect tool for committing click fraud. Tor, on the other hand, is software designed to provide users with anonymity and security in some online communications. It also allows them to access certain websites and services on the Deep Web.

There are IP address checkers that can tell how likely fraud is on a suspect address.

A large number of account transfers

These are situations where a payer's account shows large or frequent wire transfers whose sums are immediately withdrawn, or where multiple accounts are used to move funds by generating offsetting losses and profits in different accounts. Abnormal settlement instructions, including payments to apparently unconnected parties, also fall into this category.

Multiple Customers using same IP or device

Many fraudsters employ residential proxy networks, a service that routes their traffic through an intermediary server, typically a home network served by an internet service provider (ISP) that is registered to serve consumers rather than businesses.
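A rule for this can be as simple as grouping logins by IP address or device hash and flagging any identifier shared by more than one customer. The sketch below uses made-up customer IDs and documentation-range IP addresses; the function name and the default limit of one customer per identifier are assumptions.

```python
from collections import defaultdict

def shared_identifiers(logins, max_customers=1):
    """Return identifiers (IP or device hash) used by more than max_customers customers."""
    customers_by_identifier = defaultdict(set)
    for customer_id, identifier in logins:
        customers_by_identifier[identifier].add(customer_id)
    return {
        identifier: customers
        for identifier, customers in customers_by_identifier.items()
        if len(customers) > max_customers
    }

# Hypothetical login events as (customer, IP address) pairs.
logins = [
    ("cust-1", "203.0.113.7"),
    ("cust-2", "203.0.113.7"),
    ("cust-3", "198.51.100.4"),
    ("cust-1", "203.0.113.7"),
]
flagged = shared_identifiers(logins)
```

Repeated logins by the same customer do not count twice, so a household sharing one address can be allowed by raising max_customers while a proxy exit node shared by dozens of accounts still stands out.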

Very large unusual purchase

This rule looks for suspicious events such as a payment that cannot be matched with the investment and income levels of the customer.

Payment Protector Back-office

On the application side, we can create a micro front end and microservices to maintain the set of Payment Protection Rules.

Serverless Scoring Engine using RedisGears

With all the data streamed in, we can leverage RedisGears to run a score calculation function based on these rules.

RedisGears is a dynamic framework for data processing in Redis. RedisGears supports transaction, batch and event-driven processing of Redis data. To use RedisGears, you write functions that describe how your data should be processed. You then submit this code to your Redis deployment for remote execution.

RedisGears enables reactive programming at the database level. It's like using lambda functions, but with a dramatically lower latency, and with much less encoding/decoding overhead.

Here's a sample script that records all commands run on keys that have a payment- prefix:

  • GearsBuilder().filter(lambda x: x['key'].startswith('payment-')).foreach(lambda x: execute('xadd', 'payment-channel', '*', 'key', x['key'])).register()

This second script then reads the payment-channel stream and increments a per-key count in a sorted set called payment-risk-score. Each StreamReader record carries the stream name in x['key'] and the message fields in x['value'], so the recorded key is read from x['value']['key']:

  • GearsBuilder('StreamReader').foreach(lambda x: execute('zincrby', 'payment-risk-score', '1', x['value']['key'])).register('payment-channel')

If you register both functions, you will see both the stream and payment-risk-score update in real time. This is a very simple example of what can be done (implementing a full score engine is beyond the scope of this article).

Payment Protection Dashboard

With all the data streamed in, we can show the global spread of all incoming requests and, of course, how many of them succeeded or were declined.

And then you can drill down to the details

Flexible Adjustment of Score Weightage

Fraud scores are based on default, custom, and machine learning rules. You can review the different rules in the Scoring Engine. By default, a score over ten is considered risky. However, every Bank has a different risk appetite and can adjust the state thresholds using the slider.

Threshold: Set which scores to associate with the APPROVE, REVIEW, and DECLINE states.

Identity Profile weight: By default, a Fraud Score is the sum of all score rules (IP, phone, and custom). You can change the weight of a rule category to emphasize its importance over other categories.
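The arithmetic behind the weights and thresholds can be sketched as follows. The review threshold of ten comes from the default mentioned above; the decline threshold of twenty, the function names and the sample rule scores are assumptions for illustration only.

```python
def fraud_score(rule_scores, weights=None):
    """Sum the per-category rule scores, scaling each by its weight (default 1.0)."""
    weights = weights or {}
    return sum(score * weights.get(category, 1.0) for category, score in rule_scores.items())

def decide(score, review_threshold=10.0, decline_threshold=20.0):
    """Map a fraud score to one of the three states."""
    if score > decline_threshold:
        return "DECLINE"
    if score > review_threshold:
        return "REVIEW"
    return "APPROVE"

# Hypothetical scores produced by the IP, phone and custom rule categories.
rule_scores = {"ip": 4, "phone": 2, "custom": 3}

default_score = fraud_score(rule_scores)                # 4 + 2 + 3 = 9
weighted_score = fraud_score(rule_scores, {"ip": 2.0})  # 8 + 2 + 3 = 13
default_state = decide(default_score)
weighted_state = decide(weighted_score)
```

The same transaction moves from APPROVE to REVIEW once the IP category is given double weight, which is how a Bank with a lower risk appetite for unfamiliar networks would express that preference.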

We can also look at fraud cases specific to each incoming channel.

Similar to the identity profile probability check, we can look up the probability of a match across the other Payment Protection Rules.

Supersonic Fraud linker - using Redis Graph

RedisGraph is based on a unique approach and architecture that translates Cypher queries to matrix operations executed over a GraphBLAS engine. According to Redis, this design allows use cases like social graph operations, fraud detection, and real-time recommendation to be executed 10x to 600x faster than other graph databases.

We can stream watch lists into Redis from the lists provided by data sources. These lists are a compilation of various regulatory and enhanced due diligence lists from major sanctioning bodies around the globe, such as the Office of Foreign Assets Control (OFAC), UN sanctions, EU sanctions and Her Majesty’s Treasury, and from thousands of other regulatory and law enforcement organizations such as Interpol.

Fraud linker Recommendations

Rapidly find connections between payers and other fraud or PEP entities by examining the relationships between them.

In this example we have persons and companies. From the list, we can see that some persons have PEP set to true and some corporates have Sanctioned set to true.

As you can see, there are not many direct connections between non-PEPs and PEPs. For example, Mark LEE is just a stakeholder in the SpaceZero entity, while John LO, whose profile has PEP set to true, also works for the sanctioned entity MetaX.

Now, at payment time, we can leverage RedisGraph to check for links from the payer or beneficiary to any of the sanctioned corporates or PEPs.

Here our payer is Mark LEE. Let us apply a match query to find any link between the person making the payment and any PEP.

The match query is written in Cypher. Cypher is a declarative graph query language that allows for expressive and efficient querying, updating and administering of the graph. It is designed to be suitable for both developers and operations professionals. Cypher is designed to be simple, yet powerful; highly complicated database queries can be easily expressed, enabling you to focus on your domain.

Some of the main features of Cypher are

  • It is a declarative query language. It specifies “what data to fetch” rather than “how to fetch the data”, much like SQL
  • Queries follow the pattern MATCH <PATTERN> COMMAND, where COMMAND can be RETURN, CREATE, MERGE, DELETE etc.
  • The main query logic for any CRUD operation is in the pattern. A pattern represents what data we are interested in
  • It also uses a bunch of common SQL-like constructs, e.g. WHERE, ORDER BY, MAX, MIN, EXISTS, IN, UNION, STARTS WITH, RANGE, DISTINCT etc.

// find what role Keanu Reeves played in the movie The Matrix
MATCH (:Person {name: 'Keanu Reeves'})-[r]->(:Movie {title: 'The Matrix'})
RETURN type(r)

Relationship length pattern matching

Variable length pattern matching is used when the level of relationship, or the number of hops between the nodes, can vary. A variable length relationship is indicated with an asterisk (*) followed by a number pattern. In its simplest form, (x)-[*2]->(y) denotes a relationship between nodes x and y with exactly 2 relationships/hops in between, which is the same as (x)-->()-->(y).

To have a variable length pattern, a range is used. For example, [*1..5] indicates a relationship with 1-5 hops. [*3..] and [*..3] indicate a minimum of 3 and a maximum of 3 hops respectively. To indicate any number of hops, [*] is used. The minimum number of hops is 1 by default, but 0 can also be used to indicate no hops or the same node, e.g. in the case of (a)-[*0..1]-(b), a and b can be the same node as well!
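To build some intuition for what a hop range means, here is a plain Python breadth-first search over a tiny undirected graph. It is only a sketch: the edge list is a guess at the example data (in particular, the link between John LO and SpaceZero is an assumption), the function name is made up, and it reports the shortest hop count only, whereas Cypher enumerates every matching path.

```python
from collections import defaultdict, deque

def reachable_within(graph, start, max_hops):
    """Return {node: hops} for every node 1..max_hops edges away from start."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in graph[node]:
            if neighbour not in hops:
                hops[neighbour] = hops[node] + 1
                queue.append(neighbour)
    del hops[start]
    return hops

# Hypothetical relationships, treated as undirected like -[...]- in Cypher.
edges = [("Mark LEE", "SpaceZero"), ("John LO", "SpaceZero"), ("John LO", "MetaX")]
graph = defaultdict(set)
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

peps = {"John LO"}
nearby = reachable_within(graph, "Mark LEE", max_hops=3)
linked_peps = {name: hops for name, hops in nearby.items() if name in peps}
```

With a limit of one hop only SpaceZero is visible from the payer; widening the range to three hops is what surfaces the PEP two hops away.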

GRAPH.QUERY corporates "MATCH p = (p1:Person {name: 'Mark LEE' })-[:STAKE_HOLDER_IN*1..3]-(m:Corporate) -[:STAKE_HOLDER_IN*1..3]-(p2:Person {PEP:true }) RETURN p,m"

From the query result, we can unearth the linkage: even though the payer Mark LEE is not connected to John LO directly, RedisGraph can easily trace the link at sub-second speed.

Relationship by degree of separation

By using the relationship length -[:LINKED*2]->, we tell Cypher that there should be exactly 2 consecutive :LINKED relationships on path between payer/payee and PEPs.

You can refer to the APIs documented at https://redis.io/docs/stack/graph/commands/#redisgraph-api

Here is a highly simplified Redis-based end-to-end fraud detection platform: single stack, single platform.

So we can build prevention shield features on the core of Redis to compose an end-to-end fraud detection platform. We also take great care to put user experience front and center, reducing the processing time to a minimum while remaining on a single stack.
