Yet more, and more, chess AI thoughts

It’s been a while since I’ve written about my chess AI work. Work continues on Haunted Chess, and we’re trying to get some funding for it as well. I’m still a big fan of the Apple Vision Pro, and I also believe Haunted Chess is a great board game by itself, so I want to make a physical version (along with a play-along app that contains the narrative elements if you want to include those). If you want to fund me to go write chess stuff all day, I'm all ears!

Chess AI programming is hard. Very hard. Elo is a measure of how good a chess player is: most of us are sub-1,000, the best humans are about 3,000 (or just below), and the best chess AIs are in the high 3,000s. Getting to 1,000 is easy with just a good evaluator. Add the basics and optimize the search so you get a decent depth, and you can write an engine that beats you at around 1,800 Elo. But that area above 2,000? That’s hard. Very hard. I think it’s about the single hardest problem a lone programmer can undertake, and it’s extremely easy to get sucked into it and learn from it.

I’ve made a couple of breakthroughs, well, for me and by my standards. The first is a very fast material evaluation function. A simple material score adds up the values of all the pieces on the chess board to give an overall score: white pieces are worth 100 to 900, and black pieces -100 to -900. The unit of measurement here is called a centipawn, a hundredth of a pawn, so a pawn is 100 and a queen at 900 is worth about 9 pawns. If my overall material score is 250, the estimation is that white is up about 2.5 pawns.

You can add up the scores easily by just going through all 64 squares. That’s a loop through your board, identifying your pieces, and then adding to a score. That might seem fast, but we can do better.
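A minimal sketch of that loop, assuming a simple mailbox layout where board[sq] holds a signed centipawn value (my illustration, not my engine's actual layout):

/* naive material count: visit all 64 squares and sum signed piece values  */
/* board[sq] is e.g. +100 for a white pawn, -900 for a black queen, 0 empty */
int material_loop(const int board[64]) {
    int score = 0;
    for (int sq = 0; sq < 64; sq++)
        score += board[sq];
    return score;
}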

The next fastest way is to use bit boards. I’ve discussed these before but basically it’s a 64 bit number that stores the occupancy of an 8x8 chess board. So with 12 bit boards, you have all the pieces for a chess game. Modern CPUs can scan VERY fast through bits and even tell you the number of set bits. There’s this great book by Henry Warren called “Hacker’s Delight” which has this method inside it:

unsigned int popcount64(unsigned long long x) {
    /* sum adjacent bits into 2-bit counts */
    x = (x & 0x5555555555555555ULL) + ((x >> 1) & 0x5555555555555555ULL);
    /* sum 2-bit counts into 4-bit counts */
    x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
    /* sum 4-bit counts into per-byte counts */
    x = (x & 0x0F0F0F0F0F0F0F0FULL) + ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
    /* multiply to gather all byte counts into the top byte, then shift down */
    return (x * 0x0101010101010101ULL) >> 56;
}

I recommend this book to anyone who wants to know more about tricks with bits than anyone has a right to know.
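For material scoring, popcount is what makes bit boards pay off: with one bit board per piece type, the whole count collapses into a dozen popcounts. A minimal sketch, assuming six bit boards per side in pawn-to-king order and common piece values (my layout for illustration):

/* material from bit boards: popcount each piece board, times its value */
int material_bitboards(const unsigned long long white[6],
                       const unsigned long long black[6]) {
    static const int value[6] = { 100, 320, 330, 500, 900, 0 }; /* P N B R Q K */
    int score = 0;
    for (int i = 0; i < 6; i++)
        score += value[i] * ((int)popcount64(white[i]) - (int)popcount64(black[i]));
    return score;
}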

Then there’s my method, which is just 16 additions from a lookup table, and only one lookup table. The trick I came up with is to store the board in nibbles, with 1-6 being the piece, +8 for black. This lets me store a whole chess board in 32 bytes, but more than that, it creates an opportunity for lookup tables for all sorts of things. By treating the board as 16 bit entries (2 bytes, so four squares at a time) I can make my material scoring just 16 additions. Cleverer, or maybe not, I can also arrange my chessboard as 2x2 nibbles, instead of 4x1 nibbles, giving some approximation for extra material scoring inside these 2x2 blocks. You can’t do much because of edge aliasing, but you can do some, and all a good chess engine needs is these small advantages. Moving to this way of representing a chess board, alongside bit boards, has been interesting. It’s not very human readable, but it’s fast.
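Here’s a minimal sketch of that idea, assuming the nibble encoding above and a 65,536-entry table built once at startup (names and exact piece values are my illustration):

#include <stdint.h>

/* one entry per possible 16-bit group of four nibbles, filled at startup */
static int16_t material_table[65536];

static const int piece_value[16] = {
    0, 100, 320, 330, 500, 900, 0, 0,        /* empty, white P N B R Q K(=0) */
    0, -100, -320, -330, -500, -900, 0, 0    /* +8: the black pieces         */
};

void init_material_table(void) {
    for (int v = 0; v < 65536; v++) {
        int s = 0;
        for (int n = 0; n < 4; n++)
            s += piece_value[(v >> (n * 4)) & 0xF];
        material_table[v] = (int16_t)s;
    }
}

/* the 32-byte nibble board viewed as 16 uint16_t groups: 16 table adds */
int material_nibbles(const uint16_t board[16]) {
    int score = 0;
    for (int i = 0; i < 16; i++)
        score += material_table[board[i]];
    return score;
}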

The other area I’ve been looking at a lot is training a neural network on board positions. I’ve been using a standard three-layer neural network (input, hidden with ReLU activation, output). Did I know what that was two weeks ago? No. But it’s not that complicated. The main thing you need to know about the math for the evaluate function is that it uses a dot product, and that you need a lot of training data. Training data for AIs is a hot subject, so I wanted to find around 5,000,000 or so positions I could evaluate using a static evaluator from something like Stockfish (you can use its built-in evaluator from the command line), but source them from games that were copyright free.

Luckily there have been many, many, many chess games over the years, and getting games from players with an Elo of 2400+ took some searching, but it turns out quite a few people want this kind of thing, and for the cost of a cup of coffee I was in. Second is the training. You want to use middlegame and endgame positions, because openings in chess AIs are often covered by “opening books” (sometimes when you play another AI via the UCI interface the opening is done completely by the host via opening books to randomize the board for you; I’m not a fan of that, but there you go). Training on 5,000,000 positions takes a few hours on my 24 core M2 Max, and the average error I encountered was about 100-200 centipawns on the positions (so, I was within 100-200 centipawns, or 1 to 2 pawns, of the Stockfish evaluation). Most evaluators disagree by between 50 and 100 centipawns anyway, so this was fine. The problem is that the evaluation is slow. That dot product.
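To see why, here’s roughly what a dense forward pass has to do at every node, one dot product per hidden unit (layer sizes and weight layout here are my own illustration, not my actual network):

/* one hidden layer with ReLU, then a single output neuron                 */
/* in: input features, W1/b1: hidden weights and biases, W2/b2: output     */
float nn_eval(const float *in, int n_in,
              const float *W1, const float *b1, int n_hid,
              const float *W2, float b2) {
    float out = b2;
    for (int h = 0; h < n_hid; h++) {
        float a = b1[h];
        for (int i = 0; i < n_in; i++)   /* the dot product, per hidden unit */
            a += W1[h * n_in + i] * in[i];
        if (a > 0.0f)                    /* ReLU: negatives contribute nothing */
            out += W2[h] * a;
    }
    return out;                          /* centipawn estimate */
}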

Not to worry, I know SIMD and NEON. I’ll speed it up. And I did speed it up, by about 8x on NEON and 12x on SIMD.
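The heart of the NEON version looks something like this (a minimal sketch assuming AArch64, float weights, and n a multiple of 4; the real kernel has more going on):

#include <arm_neon.h>

/* fused multiply-accumulate over 4 floats per iteration */
float dot_neon(const float *a, const float *b, int n) {
    float32x4_t acc = vdupq_n_f32(0.0f);
    for (int i = 0; i < n; i += 4)
        acc = vmlaq_f32(acc, vld1q_f32(a + i), vld1q_f32(b + i));
    return vaddvq_f32(acc);   /* horizontal add of the four lanes (AArch64) */
}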

The problem is that it was still slow.

Before the neural network addition to my chess AI I was running at about 16 million nodes per second. Which is pretty good. With the neural network? I was running at about 1 million nodes per second. I cut its use down to my qSearch (quiescence search, a search run at the end of an AI search that waits for the board to calm down, and really the only place you should actually use it anyway) and got to about 2 million nodes per second.
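If you haven’t met it, qSearch itself is short; the point is that evaluate() gets called a lot there. A minimal sketch (Position, Move, and the helpers are placeholders, not my engine’s API):

/* quiescence search: stand pat on the static eval, then search captures only */
int qsearch(Position *pos, int alpha, int beta) {
    int stand = evaluate(pos);           /* the expensive call at every node */
    if (stand >= beta) return beta;
    if (stand > alpha) alpha = stand;

    Move moves[256];
    int n = generate_captures(pos, moves);
    for (int i = 0; i < n; i++) {
        make_move(pos, moves[i]);
        int score = -qsearch(pos, -beta, -alpha);
        unmake_move(pos, moves[i]);
        if (score >= beta) return beta;
        if (score > alpha) alpha = score;
    }
    return alpha;
}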

So. I gave up on using it. Weeks of late nights wasted. But sometimes you get to appreciate the journey, not the destination, and that journey is one I do not regret.


Now, you might think all that learning was wasted (and certainly for a while me and a glass of whiskey agreed), but it wasn’t, because now I’m trying a new kind of neural network called an NNUE network, which was designed specifically for chess AI (you would think I would google these things a bit beforehand and maybe go straight to NNUE, and I did, but it looked complicated).

By learning about neural networks I was learning how to graduate to NNUE networks.

NNUE updates as the nodes go through their negamax iterations. That is, as you look ahead in moves you don’t recalculate the entire evaluation, you evolve it: a move only changes a couple of squares, so you only adjust those pieces’ contributions to the first layer. It also uses integers, which are faster, and its sparse inputs (a handful of set features rather than a dense vector) suit my board packing far better than the normal neural network did.
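The “efficiently updatable” trick is easy to sketch: keep a running accumulator for the first layer and adjust it as pieces move (the sizes and feature layout below are my illustration, not the actual NNUE architecture):

#include <stdint.h>

#define HIDDEN 256

/* first-layer weights: one row per (piece type, square) input feature */
extern int16_t W1[768][HIDDEN];   /* 12 piece types x 64 squares */

typedef struct {
    int16_t acc[HIDDEN];   /* running first-layer sums for this position */
} Accumulator;

/* a quiet move just swaps one feature: subtract the from-square weights,
   add the to-square weights, instead of recomputing the whole layer */
void accumulator_move(Accumulator *a, int from_feature, int to_feature) {
    for (int i = 0; i < HIDDEN; i++)
        a->acc[i] += W1[to_feature][i] - W1[from_feature][i];
}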

And this is where my next breakthrough happens. Normally for positions we use something called a Zobrist hash, which is a really fast way of making a really unique hash code for a chess position (it was originally for Go, but it works really well for chess). It returns a 64 bit code that you can pretty much trust (pinkie finger promise) will be unique to your game.
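Zobrist hashing itself is only a few lines: XOR together one pre-generated random number per (piece, square) pair. A minimal sketch (names are mine; a full version also XORs in side to move, castling rights, and en passant):

#include <stdint.h>

/* one random 64 bit key per piece type per square, generated once at startup */
extern uint64_t zobrist_keys[12][64];

uint64_t zobrist_hash(const int board[64]) {   /* board[sq]: 0-11, or -1 if empty */
    uint64_t h = 0;
    for (int sq = 0; sq < 64; sq++)
        if (board[sq] >= 0)
            h ^= zobrist_keys[board[sq]][sq];
    return h;
}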

But what if I could pack a whole board into 192 bits? Use a 64 bit board for occupancy (an 8x8 array of 1s and 0s that shows where every piece is on the board), and then 128 bits for the pieces, packed into the aforementioned nibbles laid out in the bit order of the occupancy bit board.
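A minimal sketch of that 192 bit packing, built from the nibble board described earlier (the struct and names are my illustration):

#include <stdint.h>

typedef struct {
    uint64_t occupancy;   /* one bit per occupied square */
    uint64_t pieces[2];   /* up to 32 four-bit piece codes, in occupancy bit order */
} PackedBoard;

PackedBoard pack(const uint8_t nibbles[64]) {   /* 0 empty, 1-6 white, piece+8 black */
    PackedBoard p = { 0, { 0, 0 } };
    int n = 0;                                   /* index of the next piece slot */
    for (int sq = 0; sq < 64; sq++) {
        if (nibbles[sq]) {
            p.occupancy |= 1ULL << sq;
            p.pieces[n >> 4] |= (uint64_t)nibbles[sq] << ((n & 15) * 4);
            n++;
        }
    }
    return p;
}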

This is unique because it lets me create a hash table effectively without computing a hash: I can use the occupancy bit board as the key, and the piece map as verification. So this is actually faster for transposition tables (big hash tables that tell you if you’ve been to a position before, which help with alpha beta pruning so you don’t look at a quintillion positions every AI search).
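In table form that becomes a direct index plus one 192 bit compare, and unlike a Zobrist key it can never give a false match. A sketch using the PackedBoard above (table size and entry layout are mine):

#define TT_SIZE (1 << 22)   /* power of two, so we can mask instead of modulo */

typedef struct {
    PackedBoard key;
    int score;
    int depth;
} TTEntry;

extern TTEntry tt[TT_SIZE];

TTEntry *tt_probe(const PackedBoard *b) {
    TTEntry *e = &tt[b->occupancy & (TT_SIZE - 1)];   /* occupancy acts as the hash */
    if (e->key.occupancy == b->occupancy &&
        e->key.pieces[0] == b->pieces[0] &&
        e->key.pieces[1] == b->pieces[1])
        return e;   /* exact 192 bit match: definitely the same position */
    return 0;
}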

This speeds up my nodes per second BY A LOT.

My NNUE network isn’t quite working yet. Something about math, but my copy of Hacker’s Delight is at my side and Bella has been great at these long debugging sessions.

I’m hoping this latest outing has a destination rather than just being a journey. I can see it there.



Steven Lu

Command Line Junkie

2 months

It would definitely be a lot of fun to hack on chess solver internals with compute shaders and CUDA and poke around GPU architectures that way. We are fast approaching and will blow past petascale in a box under your desk. And soon after that you’ll have 1 petaflop/s sitting in a svelte box ON your desk. I used to dream about fluid simulations with millions of particles running in realtime. Now that’s hardly even a challenge for these silicon beefcakes. Wish I was actually any good at chess so that I would be able to appreciate galaxy brain moves from a smart solver. As it is, once iterative deepening gets dropped in there I’ve lost all hope of beating it :D

Michael Guerrero, GDC

Graphics Educator & Game Programmer

2 months

Cool journey. I used something like bit boards for poker in Desperate Housewives. It was wicked fast and allowed me to calculate rough probabilities by simulating 1,000 hands. I originally saw the concept from Mick West in the wonderful Game Developer magazine. SIMD also makes for a good time. What about compute shaders? Have you gone down that route?

Matz Johansson Bergström

3D Technical Director (Automotive)

2 months

"you would think I would google these things a bit beforehand to maybe go straight to NNUE, and I did, but it looked complicated" I like that you are being so frank there, I respect that. Thankfully you don't have Henry Stauf mocking you while trying so solve a puzzle ?? .
