Building a Remote Cluster with Stockfish Chess Engine
Courtesy Wolff Morrow


Part 1: The Basics and setting the context

  1. Problem statement and goal

The journey started about a year ago. I wanted a remote server connected to my ChessBase software for analyzing chess games, running tactical analysis, etc. You might ask: why not just run Stockfish or another engine on your laptop? True, but the heavy CPU usage of such an engine on a laptop (let's assume 8 cores) drains the battery fast when not connected to power. Also, a remote chess engine running on a cluster delivers much better performance.

In this first part I will show how to build a Raspberry Pi cluster running a cluster version of Stockfish. Stockfish is a free and open-source chess engine, available for various desktop and mobile platforms. It can be used in chess software through the Universal Chess Interface (UCI).

In the second part I will show how to connect it to ChessBase and other UCI software. ChessBase is a personal, stand-alone chess database that has become the international standard for top players, from the World Champion to the amateur next door.

The third part will show you how to build an Azure Compute Cluster to run a high-performance chess engine supercomputer.

But first, let's see what already exists on the market. There are several other options for using a remote engine:

  • ChessBase offers "user cloud engines", which cost "ducats", with variable pricing set by the host.


  • ChessifyMe "rents" out engines running in the cloud.


Both are costly and not sufficient (and IMHO less effective). As an example, you can easily beat ChessifyMe's Free and Amateur tiers in both price and kN/s speed with your own cluster solution.

If you don't have some Raspberry Pis around, a Libre Computer Board AML-S905X-CC (Le Potato) 2GB 64-bit mini computer costs ~$45 each. A cluster of four would be about a $200 hardware investment. I will discuss Azure pricing in the next part, but a club player/amateur would spend ~$20 per month.

Therefore, building a cluster with the Raspberry Pis I have anyway is an easy way to achieve the goal and to learn how it all works. On this journey I figured out that a) I was not the first one exploring this option, and b) it is sometimes difficult to configure ChessBase, while it is flawless with other software such as Arena, Nibbler or other UCI front ends.

ChessBase challenge:

There is no easy way to connect ChessBase to a remote engine. That is why companies like ChessifyMe build a "plug-in" to do so. ChessBase is designed to easily add a local engine, and that's it. To connect a remote engine, one option would be to start SSH or Telnet from ChessBase, connect via private key, and hand over a command such as:

mpirun -display-allocation -hostfile chessbase -npersocket 4 stockfish14

But this option doesn't exist in the ChessBase GUI. That means you need a workaround, and in this article I will show you how!

Message Passing Interface

To load-balance a cluster version of the engine I use Open MPI, an open-source implementation of the Message Passing Interface (MPI) standard. MPI connects processes running across multiple computers and allows them to communicate as they run. This is what allows a single script to run a job spread across multiple cluster nodes.
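As a minimal illustration (the hostfile name "hosts" and the program are placeholders, not part of this setup): running

$ mpirun -np 8 --hostfile hosts ./my_program

starts eight copies of the same program, spread over the machines listed in the hostfile.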

And yes, there is a Stockfish branch with an MPI cluster implementation, allowing Stockfish to run on clusters of compute nodes connected by a high-speed network.

The performance increase is pretty stunning.

Feature ask and ChessBase improvement:

Let's look into the future at what the perfect solution would be. From a user/customer standpoint it's quite simple: modify the ChessBase GUI to use SSH or PowerShell SSH commands directly.

But what to do until someone from ChessBase picks this up and modernizes the GUI?

Solution for now:

There are two options (besides a third option: ChessBase making it easier in the GUI). It's not possible to simply start the native Windows SSH client or PowerShell from ChessBase and hand over additional commands, because ChessBase needs an .exe file to act as an engine. So we need something that tricks ChessBase into thinking an engine exists locally when it really runs on a remote server.

  • PuTTY/PLINK: this option has some downsides. You need PuTTY (an older SSH client), you have to convert standard private keys to the PuTTY-specific format, and you then use PLINK to start SSH and hand the command script over to the cluster. That's kind of clunky and not efficient. However, it works.

  • SSH Engine: the source code is free and open and can be found on GitLab (Matt Plays Chess / SSH Engine · GitLab). The chess player and developer Matt has done an awesome job; we collaborated, and I think I was his first user testing SSHEngine.

Even with the above, it's not easy to get the ChessBase configuration to work. For example, ChessBase tries to "overrule" the remote engine with its "smart CPU" functionality, sending its own commands to the remote engine. None of this can be used, and it all needs to be unchecked.

But first things first!

2. Building a Raspberry Pi cluster

Let's get started. Basically, every Raspberry Pi you have can be used; however, a 3B model would then be the slowest node in the cluster. I recently replaced two 3Bs with AML-S905X-CC (Le Potato) boards, and in combination with the two existing Pi 4s they work great.

I. Getting ready

The Raspberry Pi 3 and higher are built with a 64-bit chip. Yet for a long time the Raspberry Pi Foundation only released 32-bit Linux distributions; since March 2022 there is a 64-bit Bullseye distribution. When replacing my existing Pis with Le Potato boards I switched to Ubuntu Server 64-bit images. Performance is way higher than with 32-bit images.

If you want to use Debian, start by downloading the latest version of Raspbian, the Debian distribution that runs on the Pis. Download the command-line-only "Lite" version to save space and write the image to the SD card with Raspberry Pi Imager. Before we finish with the SD card, we want to enable SSH remote access to our Pi. To do this, open the "boot" drive on the SD card and create an empty file named ssh with no extension. On Windows you can create an empty file from CMD with: copy nul "ssh". Voila, the file named ssh (no extension) is created; copy it to the SD card.

Repeat with all SD cards for each node and don't mix them up. Boot each Raspberry Pi, log in via SSH (user "pi", password "raspberry") and run the updates:

  • $ sudo apt-get update
  • $ sudo apt-get dist-upgrade -y

Setting up the ClusterMaster:

$ sudo hostnamectl set-hostname new-hostname

or

$ sudo nano /etc/hostname and reboot

Additionally, I had to change the file /etc/cloud/cloud.cfg in the Ubuntu OS: there is a setting preserve_hostname: false, and I had to change it from false to true.
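If you prefer a one-liner, something like this should flip the setting (a sketch; it assumes the default cloud-init config path and the exact "preserve_hostname: false" spelling):

$ sudo sed -i 's/preserve_hostname: false/preserve_hostname: true/' /etc/cloud/cloud.cfg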

Next step is to add a new user we want to use to run the Stockfish Cluster:

  • $ sudo adduser newuser

and add the new user to the sudo group:

  • $ sudo usermod -aG sudo newuser

It's easier to run sudo commands without a password. Therefore, log in as newuser and get rid of the password prompt for sudo with:

  • $ sudo bash -c 'echo "$(logname) ALL=(ALL:ALL) NOPASSWD: ALL" | (EDITOR="tee -a" visudo)'

Repeat this for all your nodes, keeping the target layout in mind: one ClusterMaster and three worker nodes (ClusterNode1-3).


a) ClusterMaster needs to talk to each node

1. Create ssh key on ClusterMaster:

  • $ ssh-keygen -t rsa

2. Copy the key to each node

From ClusterMaster to ClusterNode1-3

  • $ ssh-copy-id remote-user@server-ip
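With the node names used in this article, a small loop saves some typing (a sketch; "newuser" and the hostnames are the ones chosen above):

$ for host in ClusterNode1 ClusterNode2 ClusterNode3; do ssh-copy-id newuser@"$host"; done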

b) ClusterNode1-3 need to talk to ClusterMaster

1. Create ssh key on ClusterNode1-3

  • $ ssh-keygen -t rsa

2. Copy the key to each node

From ClusterNode1-3 to ClusterMaster

  • $ ssh-copy-id remote-user@server-ip

c) You want to log in from your Windows machine via SSH without a password, too.

From Windows PowerShell to the Linux machines

1. Create an SSH key (again with ssh-keygen -t rsa)

2. $ cat ~/.ssh/id_rsa.pub

Copy the key and log in to ClusterMaster and ClusterNode1-3:

3. $ cd .ssh

4. $ nano authorized_keys and paste the key from Windows on a new line
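Alternatively, you can append the key without an editor (a sketch; replace the placeholder with your actual Windows public key):

$ echo '<windows-pubkey>' >> ~/.ssh/authorized_keys
$ chmod 600 ~/.ssh/authorized_keys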

Done, the basic cluster is set up: the master can talk to the worker nodes and vice versa.

II. Install the Stockfish Cluster Engine

As said in the intro, we use Open MPI to connect processes running across multiple computers and allow them to communicate as they run. This is what allows a single script to run a job spread across multiple cluster nodes.

Install the following:

  • $ sudo apt install openmpi-bin openmpi-common libopenmpi3 libopenmpi-dev -y

Now clone the Stockfish Cluster branch:

  • $ git clone --branch cluster --single-branch https://github.com/official-stockfish/Stockfish.git cluster
  • $ cd cluster/src

For a 64-bit OS run:

  • $ make ARCH=armv8 clean build COMPILER=mpicxx

For a 32-bit OS run:

  • $ make ARCH=armv7 clean build COMPILER=mpicxx

Then copy the binary to the games directory:

  • $ sudo cp /home/pi/cluster/src/stockfish /usr/games/stockfish14
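Before wiring up the whole cluster, a quick single-process sanity check is useful (a sketch; mpirun forwards stdin to rank 0, so the engine should answer with its UCI id and options):

$ printf 'uci\nquit\n' | mpirun -np 1 /usr/games/stockfish14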

You can delete the cloned cluster directory (after changing back to your home directory) with:

  • $ rm -r cluster

Repeat for ClusterNode1-3
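Instead of rebuilding on every node, you could also copy the compiled binary over (a sketch; it assumes all nodes run the same OS and ARM architecture, so check your ARCH choice first on mixed Pi/Le Potato setups):

$ for host in ClusterNode1 ClusterNode2 ClusterNode3; do scp /usr/games/stockfish14 newuser@"$host":/tmp/ && ssh newuser@"$host" 'sudo mv /tmp/stockfish14 /usr/games/'; done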

Let's check that everything is OK with a simple script. MPI has native bindings for languages like Fortran and C, but mpiexec can launch any program, so we'll use the Python that ships with the Raspberry Pi for a quick test.

  • $ nano test.py

Paste this line inside (or whatever you want):

  • print("Hello")

Now we test it by running it with MPI on the ClusterMaster. At this point we are using a single node with 4 cores:

$ mpiexec python3 test.py

And MPI delivers the result: "Hello" printed four times, once per process.

We now want to run this on the entire cluster. For this, and to start the compute cluster later, we need to build a hostfile. I named the file "chessbase":

$ nano chessbase

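The hostfile simply lists each machine and how many processes (slots) it may run. A sketch matching this four-node, four-cores-per-node setup (hostnames as chosen above):

ClusterMaster slots=4
ClusterNode1 slots=4
ClusterNode2 slots=4
ClusterNode3 slots=4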

Create the test.py file on each node and then start MPI with:

$ mpiexec --hostfile chessbase python3 test.py

The result is then 16x Hello :-)

We are ready for the great moment. The last step is to create an executable bash script. The script is straightforward and simple; I named it cluster14:

  • $ mpirun --display-allocation --hostfile chessbase /usr/games/stockfish14
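As a complete file, it could look like this (a sketch; the hostfile path assumes the home directory of "newuser" from the steps above):

#!/bin/bash
# cluster14: start the Stockfish cluster build on every node in the hostfile
mpirun --display-allocation --hostfile /home/newuser/chessbase /usr/games/stockfish14

Make it executable with chmod +x cluster14.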


Our Stockfish cluster is ready, up and running. You can start to test it with something like:

bench


If you are looking for a cluster case, a simple stackable one works fine.

You might also want to opt for PoE HATs and a separate switch/router to improve communication between the nodes and the master.

In terms of tuning and optimizing performance, you might want to use the extra processors on your master node. If so, be sure to include it in your hostfile as well. Below is an example with the number of slots specified.

ClusterMaster slots=2
ClusterNode1 slots=3
ClusterNode2 slots=3
ClusterNode3 slots=3

mpiexec -n <number procs> -hostfile <hostfile> hostname

You might want to test with the above command, which runs MPI using the hosts/processes specified in the hostfile. The command to be run is at the very end: hostname. Running this should print the hostname of every node in the cluster, as that is the purpose of the hostname command.

The number of processes (<number procs>) should be large enough to 'hit' each node in the cluster. That is, if there are three hosts in your hostfile and the first two allow for 3 processes each, putting the value 5 after -n would not allow the third host to receive any commands from the master.
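For example, with the hostfile above (2 + 3 + 3 + 3 = 11 slots), this launches one rank per slot and every node should report its hostname:

mpiexec -n 11 -hostfile chessbase hostname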

III. Managing the cluster

There are multiple options, such as Ansible or Fabric. A very quick and easy one is parallel-ssh, an asynchronous parallel SSH library designed for large-scale automation. It differentiates itself from alternatives, other libraries and higher-level frameworks like Ansible, in several ways:

  • Scalability - Scales to hundreds, thousands, tens of thousands of hosts or more.
  • Ease of use - Running commands over any number of hosts can be achieved in as little as two lines of code.

sudo apt install parallel-ssh
nano .pssh_hosts        

Here you add your host names, such as:

ClusterNode1
ClusterNode2
ClusterNode3

You can execute commands such as:

parallel-ssh -i -h .pssh_hosts sudo apt-get update
parallel-ssh -i -h .pssh_hosts sudo apt-get upgrade -y        

or even scripts.
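For instance, to run a local script on every node (a sketch; update.sh is a placeholder, and -I feeds the file to each remote shell via standard input):

parallel-ssh -i -h .pssh_hosts -I < update.sh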

Enjoy and see you soon (update: part 2 is out here)

Egbert

