Building the next-generation Relayer Infrastructure for web3 apps at Biconomy
Our article about meta transaction flow discussed some of the features of the relayer engine, which is the key element in the whole meta transaction architecture.
In this article, we walk through the architecture of our core relayer engine, the high-level design of its components, and how we approach the engine’s scalability and stability.
Our relayer engine is feature-rich and built to relay a large volume of transactions. This involves managing hundreds of relayers, tracking each relayer’s account nonce internally, auto-funding relayer accounts, creating new relayers, optimising fees, monitoring each relayer’s pending transactions, and collecting real-time health metrics and logs.
Architecture Overview
The goal is high availability of relayers at any point in time. Whether we need 10 relayers or thousands, the system should be smart enough to manage its own scalability and adjust its state to handle more load.
To handle concurrent transactions, the first thing we did was use the eth_getTransactionCount JSON-RPC method with the pending argument, which lets us relay multiple transactions from the same relayer account.
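As a rough illustration of that call, the sketch below builds the JSON-RPC request body and decodes the hex-encoded result. The helper names are hypothetical, and no network call is made; sending the body to a provider is left out.

```python
import json


def build_nonce_request(address: str, request_id: int = 1) -> str:
    """Build the JSON-RPC body for eth_getTransactionCount with the "pending" tag."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_getTransactionCount",
        # "pending" asks the node to count transactions still in its pool too.
        "params": [address, "pending"],
        "id": request_id,
    }
    return json.dumps(payload)


def parse_nonce_response(body: str) -> int:
    """The node returns the count as a hex string such as "0x1a"."""
    return int(json.loads(body)["result"], 16)
```

The value returned for the pending tag is the nonce to use for the relayer’s next transaction, as seen by the node that answered the request.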
But we can’t just send thousands of transactions from the same account: every JSON-RPC provider limits the number of pending transactions it keeps in its transaction pool.
Also, under heavy load it is difficult to reach the same node for every request, and some nodes in the system will be out of sync with the others, so the pending count a node reports may be stale.
So an internal nonce tracker is essential. By keeping each relayer account’s nonce in an in-memory cache, we can send as many transactions to the blockchain as we need, as long as we stay within the limits of the RPC provider.
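A minimal sketch of such a tracker for a single relayer account follows. The class name, the max_pending cap, and the locking strategy are assumptions made for illustration, not details of the actual engine.

```python
import threading


class NonceTracker:
    """In-memory nonce tracker for one relayer account (hypothetical sketch)."""

    def __init__(self, starting_nonce: int, max_pending: int):
        self._next_nonce = starting_nonce
        self._pending = 0
        self._max_pending = max_pending  # assumed cap derived from the RPC provider's limits
        self._lock = threading.Lock()

    def reserve(self):
        """Return the next nonce, or None when the pending cap is reached."""
        with self._lock:
            if self._pending >= self._max_pending:
                return None
            nonce = self._next_nonce
            self._next_nonce += 1
            self._pending += 1
            return nonce

    def confirm(self):
        """Call once a transaction from this relayer is confirmed."""
        with self._lock:
            if self._pending > 0:
                self._pending -= 1
```

The starting nonce would come from the node once at startup; after that, each transaction takes its nonce from memory rather than from a fresh RPC call.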
For the most part, our relayer engine is event-driven and moves through a finite set of states. Whenever an event or state change occurs, we need to run a different set of operations and communicate the change to all of its components.
In the next section, we will be discussing relayer engine components and their communication to ensure high availability.
Relayer Engine Deep Dive
There are 5 main components of our relayer engine. Let’s discuss each one in detail.
1. Relayer-Queue
- It maintains a queue of relayer addresses that have been added to the relay hub smart contract and are responsible for relaying transactions.
- There is a dedicated queue for each blockchain network supported by Biconomy.
- It works on the first-in, first-out (FIFO) principle.
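The queue described above can be sketched as one FIFO per network. The class and method names here are hypothetical, and the network IDs in the usage note are only examples.

```python
from collections import deque


class RelayerQueue:
    """One first-in, first-out queue of relayer addresses per network (sketch)."""

    def __init__(self):
        self._queues = {}  # network id -> deque of relayer addresses

    def add(self, network_id, address):
        self._queues.setdefault(network_id, deque()).append(address)

    def next(self, network_id):
        """Pop the oldest relayer for this network, or None if none are left."""
        queue = self._queues.get(network_id)
        return queue.popleft() if queue else None

    def size(self, network_id):
        return len(self._queues.get(network_id, ()))
```

For example, after adding "0xA" and then "0xB" for network 1, next(1) returns "0xA" first, and relayers queued for another network are unaffected.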
2. Transaction Listener
- It listens for all pending transactions of relayers.
- On each confirmation, it calculates the total number of pending transactions of that relayer and sends it to the pending count manager via the “onConfirmation” event.
- It also checks whether each active relayer has enough funds to relay transactions. If funds are low, it calls the relayer manager’s method for funding relayers.
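The behaviour above can be sketched as follows. The two callbacks stand in for the “onConfirmation” event and the relayer manager’s funding method; their signatures, and the min_balance_wei threshold, are assumptions for illustration.

```python
class TransactionListener:
    """Tracks pending transaction hashes per relayer (hypothetical sketch)."""

    def __init__(self, on_confirmation, fund_relayer, min_balance_wei):
        self._pending = {}  # relayer address -> set of pending tx hashes
        self._on_confirmation = on_confirmation  # assumed: callback(relayer, pending_count)
        self._fund_relayer = fund_relayer        # assumed: callback(relayer)
        self._min_balance_wei = min_balance_wei

    def watch(self, relayer, tx_hash):
        """Start tracking a transaction that a relayer has just sent."""
        self._pending.setdefault(relayer, set()).add(tx_hash)

    def handle_confirmation(self, relayer, tx_hash, balance_wei):
        """Report the new pending count and request funding if the balance is low."""
        pending = self._pending.setdefault(relayer, set())
        pending.discard(tx_hash)
        self._on_confirmation(relayer, len(pending))
        if balance_wei < self._min_balance_wei:
            self._fund_relayer(relayer)
```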
3. Pending Count Manager
- On each “onConfirmation” event, it updates which active relayer has the minimum number of pending transactions.
- The transaction handler uses it to fetch the relayer with the minimum number of pending transactions.
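One simple way to sketch this component is a dictionary of pending counts with a lookup for the smallest value. The names are hypothetical, and a production version might use a heap or sorted structure instead of scanning.

```python
class PendingCountManager:
    """Keeps the latest pending count per relayer (hypothetical sketch)."""

    def __init__(self):
        self._counts = {}  # relayer address -> pending transaction count

    def on_confirmation(self, relayer, pending_count):
        """Record the count reported by the transaction listener."""
        self._counts[relayer] = pending_count

    def least_loaded(self):
        """Return the relayer with the fewest pending transactions, or None."""
        if not self._counts:
            return None
        return min(self._counts, key=self._counts.get)
```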
4. Relayer Manager
- It creates new relayers and adds them to the relayer queue when load increases.
- It also funds low-balance relayers automatically by monitoring each relayer’s balance.
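A sketch of these two responsibilities is shown below. Account creation and fund transfers are injected as callbacks because the article does not describe how they are performed; the callbacks, thresholds, and method names are all assumptions.

```python
class RelayerManager:
    """Creates relayers and tops up low balances (hypothetical sketch)."""

    def __init__(self, create_account, send_funds, min_balance_wei, target_balance_wei):
        self._create_account = create_account  # assumed: returns a new relayer address
        self._send_funds = send_funds          # assumed: send_funds(address, amount_wei)
        self._min_balance_wei = min_balance_wei
        self._target_balance_wei = target_balance_wei

    def create_relayers(self, count):
        """Create `count` new relayers and return their addresses."""
        return [self._create_account() for _ in range(count)]

    def fund_low_balances(self, balances):
        """Top up every relayer below the minimum; return the funded addresses."""
        funded = []
        for address, balance in balances.items():
            if balance < self._min_balance_wei:
                self._send_funds(address, self._target_balance_wei - balance)
                funded.append(address)
        return funded
```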
5. Transaction Handler
The main job of the transaction handler is to maintain the relayer with a minimum number of pending transactions out of all active relayers and communicate effectively with all the above components.
- On server startup, it sets the initial queue of relayer addresses for all blockchain networks supported by Biconomy.
- It sets the first relayer from the queue as the active relayer. The active relayer is the one that relays users’ transactions to the blockchain.
- Whenever it sets a relayer as active, it also adds that relayer to the transaction listener component, so that we can monitor each relayer’s transaction confirmations and decide the preferred relayer for the next transaction.
- The length of the queue is auto-scalable. If more relayers are needed under higher load, the relayer manager notifies this component using the “newRelayers” event, and it adds the new relayers to the queue.
- It listens for the “changeActive” event, which changes the active relayer by fetching a new relayer from the relayer queue.
- It listens for the “setActive” event from the pending count manager to calculate the most optimised relayer for the next transaction.
- On the “setActive” event, it checks whether the active relayer has reached its maximum number of pending transactions. If so, it triggers step 5 (the “changeActive” flow above) to switch to a new relayer.
- It also monitors the usage of all active relayers and is responsible for calling the relayer manager to create new relayers in case of more load.
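The event handling above can be sketched for a single network as follows. The method names mirror the “newRelayers”, “changeActive” and “setActive” events, but their payloads, the max_pending capacity, and the request_new_relayers callback are assumptions; what happens to a relayer after it is rotated out is not modelled.

```python
from collections import deque


class TransactionHandler:
    """Chooses the active relayer for one network (hypothetical sketch)."""

    def __init__(self, relayers, max_pending, request_new_relayers):
        self._queue = deque(relayers)
        self._max_pending = max_pending
        self._request_new_relayers = request_new_relayers  # assumed: asks the relayer manager for more
        self.active = None
        self.change_active()  # the first relayer in the queue becomes active

    def on_new_relayers(self, relayers):
        """'newRelayers' event: extend the queue."""
        self._queue.extend(relayers)

    def change_active(self):
        """'changeActive' event: take the next relayer from the queue."""
        if not self._queue:
            self._request_new_relayers()
        if self._queue:
            self.active = self._queue.popleft()

    def on_set_active(self, relayer, pending_count):
        """'setActive' event: adopt the least-loaded relayer, rotate if it is at capacity."""
        self.active = relayer
        if pending_count >= self._max_pending:
            self.change_active()
```

In this sketch, an empty queue leaves the current relayer active until the relayer manager supplies new ones.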
Scalability
With the help of the above architecture, we can ensure high availability of relayers to handle a large volume of transactions on a single instance.
To scale out to more machines, we can replicate the same architecture on each one, giving every instance its own pool of relayers and its own internal nonce tracker.
Everything looks great?
In the next section, let’s discuss some of the challenges of building this infrastructure.
Challenges
- Internal nonce tracking needs careful monitoring and syncing with the JSON-RPC provider, because it can easily put a relayer into an unpredictable state.
- Meta-transactions make more sense with layer 2 scaling solutions, but these solutions are still not very mature.
- There should be a standard for developing contract wallets; it would help services like ours become compatible with more wallets and relay meta-transactions at scale.
- Becoming compatible with existing wallet providers and contract wallets is a significant challenge.
- Resubmitting failed and pending transactions also needs thorough logging and monitoring. Tools like New Relic help manage these issues.
- Gas fee optimisation is another issue, especially when demand on a network suddenly increases.
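To make the resubmission and gas fee points concrete, here is one common approach, sketched under assumptions: a stuck transaction is replaced by resending it with the same nonce and a higher gas price. The 10% default bump and the legacy gasPrice field are assumptions, not figures from this article; nodes commonly require a minimum price increase before accepting a replacement, so check your provider.

```python
def bump_gas_price(gas_price_wei: int, bump_percent: int = 10) -> int:
    """Raise the gas price by bump_percent, rounding the increase up."""
    return gas_price_wei + (gas_price_wei * bump_percent + 99) // 100


def build_replacement(tx: dict, bump_percent: int = 10) -> dict:
    """Copy a pending transaction, keeping its nonce and raising its gas price."""
    replacement = dict(tx)  # shallow copy; the original transaction is left untouched
    replacement["gasPrice"] = bump_gas_price(tx["gasPrice"], bump_percent)
    return replacement
```

Because the nonce is unchanged, at most one of the two transactions can be mined, so the relayer’s internal nonce counter does not need to advance for a replacement.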
Stability
As we are developing a highly scalable meta-transaction infrastructure for developers, stability is critical. To mitigate risk, we follow engineering best practices when rolling out each update: we make sure we understand the capabilities of our web servers, have real-time monitoring and logging in place before the release, and set up fallback mechanisms.
Summary
Overall, we talked about how we designed our relayer engine infrastructure and discussed the major technical challenges involved.
The choices we made while developing the architecture are based on our core team’s strengths and on how we can leverage different technologies and frameworks to handle meta-transactions at scale.
We are continuously working on improving the architecture, and we take responsibility for building a great developer-focused relayer engine infrastructure that ensures high availability and scalability.
Wait! We are not done yet. Stay tuned for our next article, where we will talk about how we are solving these challenges.
Thanks to Sachin Tomar and Ahmed Al-Balaghi for their contribution to this article!
Github: https://github.com/bcnmy/mexa-sdk
Telegram Channel: https://t.me/biconomy
Website: https://biconomy.io
Twitter: https://twitter.com/biconomy
Email: support@biconomy.io