Machine Learning AI in a Hack 'n' Slash Game - Final year University Project

Part 2 of this article, detailing the finished project and results, is available here -

I've been working on Machine Learning-based AI for a 3D Hack 'n' Slash game, for my 4th year Honours project at Abertay University. I'm posting a short update of my progress here, and I'd appreciate any feedback I can get about it. 


I am developing this game in Unity 3D for my project. The gameplay is modelled on Mount and Blade: Warband. The idea for this project came from an interest in RPGs and Hack 'n' Slash games such as Mount and Blade and Skyrim, where the melee combat system is central.

My main goals for this project -

1) Apply Machine Learning techniques to create AI for the game.

2) Refine the Machine Learning AI to produce interesting and challenging behaviours.

3) Improve the game design and gameplay mechanics as much as I can, in an iterative process.

I started out with a character model and animations that I bought from the Unity Asset Store. That turned out to be a bad idea - the attack animations weren't planned well and had a variety of lengths and attack ranges, which made one attack superior to all the others. To compensate, I animated a capsule collider for the sword myself using Mecanim. The legs also don't move in the attack animations, so the character slides on the floor while attacking. Lesson learnt - don't buy cheap assets from the Unity store!

The gameplay was also quite bland with just an attack combo to spam. It was at that point that I chose to model the game after Mount and Blade, with 4 directional attacks and blocks.

The first version of the AI was a simple State Machine setup. The AI randomly picked one of 11 states at regular intervals - Idle, Walk, Jump, 4 attacks, and 4 directional blocks. I later added two more states, Seek and Evade, removed the Jump action, and changed Walk to Wander. This setup could easily be interfaced with a Machine Learning-based script that would handle action selection in the future.
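The random action selection described above can be sketched roughly as follows. This is a minimal Python sketch of the idea (the actual project is written in Unity C#); the state names follow the article, and the function names are illustrative.

```python
import random

# The 12 states after the revisions described above: Jump removed,
# Walk changed to Wander, Seek and Evade added.
ACTIONS = [
    "Idle", "Wander", "Seek", "Evade",
    "AttackUp", "AttackDown", "AttackLeft", "AttackRight",
    "BlockUp", "BlockDown", "BlockLeft", "BlockRight",
]

def pick_action(rng=random):
    """Baseline AI: pick a state uniformly at random at regular intervals."""
    return rng.choice(ACTIONS)
```

A learning component can later replace `pick_action` with a policy lookup while keeping the same interface, which is what makes the swap-in straightforward.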

In the end, I had to spend a week or two replacing the character model, animations and animation State Machine. Thankfully, Mixamo had much better animations and models. I got the "Pro Sword and Shield" set and mixed in a few animations of a Viking with an Axe. They could all be applied to the same character model. Still stuck with the feet sliding on the floor during attacks though.

Q-Learning (Reinforcement Learning)

I programmed a Q-Learning based AI for the game a few weeks ago. Q-Learning uses the opponent's state to choose an action for the AI. A 2D array holds State-to-Action mappings, where each entry is a Q-value. Successful actions are rewarded with an increased Q-value. After the training process, the algorithm chooses the action with the highest Q-value for each state.
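In tabular form, the scheme above looks roughly like this. This is a minimal generic Q-Learning sketch, not the project's actual code; the table sizes, learning rate, discount, and exploration rate are all illustrative.

```python
import random

N_STATES, N_ACTIONS = 12, 12
# The 2D array of State-to-Action mappings; each entry is a Q-value.
Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # learning rate, discount, exploration

def choose_action(state):
    # Epsilon-greedy: explore occasionally during training, otherwise pick
    # the action with the highest Q-value for the opponent's current state.
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    row = Q[state]
    return row.index(max(row))

def update(state, action, reward, next_state):
    # Standard one-step Q-value update: nudge the old estimate toward the
    # observed reward plus the discounted best value of the next state.
    best_next = max(Q[next_state])
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])
```

During training, a positive reward after a successful action raises that state-action entry, so the greedy lookup gradually favours moves that worked.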

I've been refining the Q-Learning algorithm, as well as the state machine/rule-based system it operates with, to examine the different results produced. I added some rules on top of the State Machine to let the AI react immediately (in 0.2 seconds, to simulate human-like reaction times) when the opponent comes within range or takes an offensive action. Also, to dissuade the AI from spamming attacks, there is a chance that the AI will wait for the opponent to make a move.
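The rule layer described above might look something like this sketch. The delay constant comes from the article; the wait probability, the generic "Block" response, and the function name are assumptions for illustration.

```python
import random

REACTION_DELAY = 0.2   # seconds, to simulate human-like reaction time
WAIT_CHANCE = 0.3      # illustrative probability of waiting rather than attacking

def react(opponent_in_range, opponent_attacking, rng=random):
    """Return (delay, action) if a rule fires; None means defer to Q-Learning."""
    if opponent_attacking:
        # React to an offensive action after a short human-like delay.
        return REACTION_DELAY, "Block"
    if opponent_in_range and rng.random() < WAIT_CHANCE:
        # Sometimes hold back instead of attacking, to avoid attack spam.
        return 0.0, "Wait"
    return None  # no rule fired; let the learned policy choose the action
```

Returning `None` when no rule applies keeps the rule layer cleanly separated from the learner underneath it.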

I train the Q-Learning AI by making it fight another AI, which has a simple state machine and some basic rules so that its behaviours aren't completely random.

The Q-Learning reward values, training time, and training data (in this case, the actions the scripted AI takes against the Q-Learning AI) all lead to slight variations in how the AI behaves after training. I prefer to run the training process at 5-10 times normal game speed to save time; increasing the simulation speed beyond that seems to affect the accuracy of the physics calculations.

My latest experiment was running two instances of Q-Learning on the same AI - one controlling the AI's actions based on the opponent's state, and the other controlling the AI's movement based on the chosen action. Both Q-Learning instances only receive positive rewards when they work in a synergistic manner. This does seem to work and produces some smart behaviour, but I'm not yet sure whether it works better than controlling the movement with simple rules.
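The two-instance idea can be sketched as two tables sharing one reward signal. This is a simplified illustration, not the project's implementation: it uses a one-step update without discounting, and all names and table sizes are assumptions.

```python
N_STATES, N_ACTIONS, N_MOVES = 12, 12, 4
ALPHA = 0.1  # learning rate

# First learner: opponent state -> action.
action_Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
# Second learner: chosen action -> movement.
move_Q = [[0.0] * N_MOVES for _ in range(N_ACTIONS)]

def step(state, action, move, shared_reward):
    # One outcome (e.g. landing a hit) updates both tables with the same
    # reward, so the pair only improves when action and movement cooperate.
    action_Q[state][action] += ALPHA * (shared_reward - action_Q[state][action])
    move_Q[action][move] += ALPHA * (shared_reward - move_Q[action][move])
```

Because the reward is shared, an action that only pays off with the right movement drags both tables toward that combination together.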

Future goals - 

- Further refinements to the Q-Learning AI, with the aim of better and more challenging gameplay.

- Programming an Artificial Neural Network based AI.

- User testing with different Machine Learning AI examples with the aim of providing different levels of challenge.

Divij, I enjoyed your video and like what you're creating here. Some questions: What is the slide move that occurs when the knight is somewhat lunged forward? It seems very unrealistic, which detracts from the typical person's view of the effort. You wrote "Still stuck with the feet sliding on the floor during attacks though." - I hope you may reconsider that. Have you pitted an agent that has considerable learning against one that does not? What does that outcome look like? I'm curious whether you've also tried evolutionary learning for this (along the lines of Blondie24). Please keep me posted on your results. Regards, David

Filippo Ceffa

Senior Graphics Engineer at Sony Interactive Entertainment

8y

Nice job! From the article I think I understood that you let the AI build up its knowledge of the opponent from scratch at every iteration. If that is the case, it might be worth keeping track of the global knowledge gathered in every match, and starting from that instead of from no data. Basically, keep track of a "strategy against a generic human", and refine it at every match into a "strategy against this specific opponent". The result should be a more precise AI that takes less time to get accurate results.

Raunak Sett

Software Engineer

8y

Good read. Great to see the progress you're making. Good luck!
