Not-So-DIY Leaderboard (CSSE)
"We're champions of the world, we're champions of the world! Now you're gonna believe us, now you're gonna believe us." - Football supporters chant
We love playing games. We also love comparing ourselves with other players. Hence, that's why our competition spirit was born. We're jealous and feel challenged when other players played better than us, or at least have better scores. We want to surpass their scores.
But, how do we know that other players have higher scores than us? Well, we have a technology called leaderboard.
What Is Leaderboard?
So, what is leaderboard? The simple answer is, it's a board :-) Leaderboard is a board that shows players' name and score. Also, leaderboard is sorted based on highest score. So, for example, if my score is 707 and your score is 706, my name will appear first (a.k.a. I have higher rank).
Performance Challenges
There are so many challenges when designing a leaderboard, mainly because we always want to design a system that scales to millions of users. I identified at least 2 problems.
The Classic Big O Problem
Let's talk of a performance of a single instance first. Let's assume we want to put a global leaderboard in only 1 instance in the whole world. When building a leaderboard, we have to make sure that query and insertion performance is good. So, how to measure it? We use Big O Notation, of course. (Note: I will not go to the basics of Big O Notation. You can find the basics at MIT lecture: https://web.mit.edu/16.070/www/lecture/big_o.pdf, Dev Community: https://dev.to/sarah_chima/the-big-o-notation-an-introduction-34f7, Medium: https://medium.com/swlh/time-complexity-of-algorithms-big-o-notation-explained-in-plain-english-e12a11dc4a4f. For cheatsheet, you can refer to Big-O Cheat Sheet: https://www.bigocheatsheet.com/).
Let's design a simple leaderboard first, using only array as table. Let's say you have 5 players with scores like this:
P ID | Score
0 | 10
1 | 25
2 | 13
3 | 20
4 | 21
Now, I want to query the player with highest score. We have to loop from beginning to end of the array while storing the highest score and player ID that owns the highest score. It's easy, right? But what's the complexity of 1x query? It's O(n), because you have to look through all elements.
How about getting the player in 2nd place? You do a similar query, only this time you search for the highest score below the 1st position's score. It's another O(n) query. Imagine 5 players do query at the same time just to know their position on the leaderboard. It will be O(n * n), which is okay if only 5 players. But then the world isn't filled with only 5 players.
Can we do better? Sure! Just use binary search tree. When you use binary search tree, you know exactly on which node is the player with highest score, which is the right-most node. How about the 2nd position player? Just look at the 1st position's player.
Unfortunately, leaderboard query is not just about searching the top players, but also about knowing where you are in the leaderboard. The problem is more complex than we realize. But, there is one data structure that suits our need: Order-Statistic Tree (https://en.wikipedia.org/wiki/Order_statistic_tree).
Distributed Systems Design
In the distributed systems, the problem is more than just about the performance, but also about accuracy of the leaderboard. Imagine you have multiple instances of leaderboard spread across the globe. Then, a player in a particular region updates his / her score. Now you have to think how to synchronize the other instances, so the leaderboard on other regions are accurate as well.
There are several problems, such as network issue, time accuracy (if you sort the players with same score by timestamp), system fault, etc. Even if you think that your system will be accurate all the times, there are factors come into play when designing distributed systems. Not to mention that each region could have different loads, so it could affect performance.
Well, we won't talk about distributed systems here. But it's good to learn about distributed systems design. If you want to learn more, you can check Tech Dummies (https://www.youtube.com/channel/UCn1XnDWhsLS5URXTi5wtFTA) channel.
Redis
Redis is an in-memory database, primarily used as key-value system. It is written by antirez, and now Redis has a lot of contributors on Github. Redis is used by many companies (https://redis.io/topics/whos-using-redis).
However, Redis also has an interesting feature called sorted set. Sorted set is like a set (a pair of key and values), but Redis provides a command to rank the data in that set (ZRANK and ZREVRANK). With sorted set, there's a hope that we can build a high-performance leaderboard. Well, we have to develop the endpoint and benchmark it.
Let's Develop Our Leaderboard System!
We're going to use Express framework within Node.js for our leaderboard server. There will be 4 features that we want to develop:
Note that I will not discuss about setting up Express and Typescript. If you need help, I recommend this post from LogRocket: https://blog.logrocket.com/typescript-with-node-js-and-express/. This post will be about 4 features that we want to implement.
Now, it's time to develop. To develop our simple leaderboard, we are going add handy-redis and redis dependency to our project. These dependencies allow us to access Redis to either store / retrieve our leaderboard. Why handy-redis? Because it's a wrapper to redis dependency but with extra feature, such as async-await integration. The official redis dependency didn't provide async-await integration, so handy-redis will help us with async-await.
Also, we need to add body-parser and class-transformer dependency. body-parser dependency is a middleware that directly parses request body based on content type. class-transformer dependency is very useful if we care about strict parsing and we don't want to use JSON.parse() for parsing object. Although, when we're using Express, the body is automatically parsed to JSON through body-parser. This will be explained later when we try to implement 1st feature.
1) Insert Player's Score
To insert player's score, we are going to create a route that parse request body. Unfortunately, Express parse JSON in request body into "any" type, which is considered unsafe and not strict enough in Typescript (but "any" is quite popular in Javascript, tho). So, what we're going to do, is we are going to add body-parser middleware, and then we're going to use "plainToClass" function from class-transformer, so that we can get a proper object.
Here's how to use body-parser middleware:
// index.ts
import express from "express";
import bodyParser from "body-parser";
const app = express();
const PORT = 8888;
app.use(bodyParser.json());
Yes, as simple as that. Amazing, right? :-)
领英推荐
Now, we're going to implement insert player endpoint. Here's how to do it:
// index.ts
import savePlayer from "./routes";
...
app.post("/players", savePlayer);
Note in above code that we're importing function here, just not to make it messy.
// routes.ts
import {Request, Response} from "express";
import Player from "./Player";
import {plainToClass} from "class-transformer";
import {createNodeRedisClient} from "handy-redis";
export async function savePlayer(request: Request, response: Response): Promise<void> {
const player: Player = plainToClass(Player, request.body as Player);
const client = createNodeRedisClient();
try {
await client.zadd(KEY, player.toArray());
response.sendStatus(204);
} catch (e) {
console.log(e);
response.sendStatus(500);
return;
}
}
Note in above code that we also create our own class, just to ease ourselves.
// Player.ts
export default class Player {
? public name: string;
? public score: number;
? constructor(name: string, score: number) {
? ? this.name = name;
? ? this.score = score;
? }
? public toArray(): [score: number, name: string] {
? ? return [this.score, this.name];
? }
}
So, what happens here? Look at routes.ts file. First, we convert from plain object to class instance using "plainToClass" method. Then, we add the entry to a "key" (yes, Redis use term "key" since they're key-based database. Key in Redis can hold 1 / more data). In this scenario, we insert a member along with its score, so that Redis knows what needs to be sorted (which in this case, player's score). To store player data, we use "zadd" method, which is from Redis' ZADD command (https://redis.io/commands/zadd).
2) Update Player's Score
For this feature, the command is still ZADD, but with the extra option "XX", which means that Redis only updates existing member (in this case, player), not to add new member.
// index.ts
import {savePlayer, updatePlayer} from "./routes";
...
app.put("/players", updatePlayer);
// routes.ts
...
export async function updatePlayer(request: Request, response: Response): Promise<void> {
? const player: Player = plainToClass(Player, request.body as Player);
? const client = createNodeRedisClient();
? const scoreStr: string = await client.zscore(KEY, player.name);
? if(scoreStr === null) {
? ? response.status(404).send({
? ? ? "message": "Player is not found"
? ? });
? ? return;
? }
? try {
? ? await client.zadd(KEY, "XX", [player.score, player.name]);
? ? response.sendStatus(204);
? } catch (e) {
? ? console.log(e);
? ? response.sendStatus(500);
? }
}
3) Retrieve Individual Player's Rank
For this feature, we are going to use Redis' ZREVRANK command wrapped into "zrevrank" method. Basically, we tell redis to get a reversed rank (Redis sort scores ascending by default) for our player.
// index.ts
import {savePlayer, updatePlayer, getPlayerRank} from "./routes";
app.get("/players/:name/rank", getPlayerRank);
// routes.ts
...
export async function getPlayerRank(request: Request, response: Response): Promise<void> {
? const playerName: string = request.params.name;
? const client = createNodeRedisClient();
? const rank: number | null = await client.zrevrank(KEY, playerName);
? if(rank == null) {
? ? response.status(404).send({
? ? ? "message": "Player is not found"
? ? });
? ? return;
? }
? response.json({
? ? "rank": rank + 1
? });
}
4) Retrieve Top N Players
To retrieve top N players, we are going to use ZREVRANGE command from Redis. This command is used to get members based on a certain range. The "rev" means that the order is reversed (ascending -> descending), just like individual player's rank.
// index.ts
import {savePlayer, updatePlayer, getPlayerRank, getTopNPlayers} from "./routes";
...
app.get("/players", getTopNPlayers);
app.listen(PORT, () => {
console.log(`[server]: Server is running at https://localhost:${PORT}`);
});
Note: I add "app.listen()" code, so that the Express server can actually receive request.
// routes.ts
...
export async function getTopNPlayers(request: Request, response: Response): Promise<void> {
? const N: number = Number.parseInt(request.query.N as string);
? const client = createNodeRedisClient();
? const playerNames: string[] = await client.zrevrange(KEY, 0, N - 1);
? const players: Player[] = [];
? for (const playerName of playerNames) {
? ? const score: number = Number.parseInt(await client.zscore(KEY, playerName));
? ? players.push(new Player(playerName, score))
? }
? response.json(players);
}
Full code can be seen here.
Test
Prepare Dataset
In order to test our query mechanism, we decide to generate random data set. You can see the example script here.
Benchmark Result
So, I did a little benchmark to see the performance. It didn't disappoint me, to be honest (although I don't know if my standards are low / what). Here's the result:
NOTE: Benchmark is ran on Macbook Pro Early 2015 with Mac OS X Catalina
Insert 1000 players: 21 seconds
Get top 10 players : 43 ms
Get top 100 players: 349 ms
Get 1 player's rank: 27 ms
Of course if we're looking for 1 million users, we need to do more benchmark, but I think at least for a starting point, it's already a good number.