BEM Gamification: Playtesting is weird
Javier Velasquez
Award-winning Gamification and Engagement Expert | Change specialist | L&D Engagement Manager
Every project you embark upon carries a certain amount of uncertainty. As a game designer I have learned to love that kind of uncertainty, because it means you are experimenting with new things in your project, which creates discovery, research and learning. But the corporate world hates uncertainty: companies are efficiency-driven machines oiled by competition, profits and recognition. For a Gamification Consultant this can be great, as it means companies will spend a fair amount of money on hiring experts to reduce that uncertainty. At the same time, it increases the responsibility in your work, and, if you have read your Flow theory with care, you know that higher stakes make it harder to enter the Flow state. This tension is partly reduced by testing practices, which is why design thinking, agile and every methodology built on iterative design loops is always most welcome. In essence, prototyping and testing are a great way to fail in a controlled environment before spending all your money on failed ideas. But testing and playtesting don't work the same way and require different expertise, because, let's face it, playtesting games is just weird.
The Paradox of Early Testing
Let's consider building a digital product (which is what we deal with in Gamification most of the time). Analogue games are easy to prototype and test, but digital products are a pain. Why? Because explaining a difficult algorithm to a human being so a game can be played is not as hard as coding that algorithm into a machine. Humans can read between the lines and use shorthand to grasp difficult instructions, so you can explain Catan in 15 minutes, but coding the same rules into a machine can take weeks: each instruction needs to consider every possible variation or the game will simply behave wrong.
In this sense, an early test of a board game usually involves about 80% of the game rules, and the first prototype usually "feels" like a broken, but complete, playable game. In the digital world, however, early testing means chunking mechanics and rules down into their minimum expressions so you can test a simple idea fast (and cheap). If you are testing Catan you might just create a board, grab some pieces, quickly explain the initial draft of the game and start playing; but if you are coding a game like Catan you might start by testing only the rolling-for-resources mechanic. Of course, this means digital testing will privilege simple ideas over complex ones, which is what you find in most digital products, except for games, of course.
And beware! You need to test, which is not the same as presenting concepts and mechanics. You might be tempted to gather some potential players and explain your game without a playable prototype: that is always a bad idea. It is like explaining how Catan works (our example game for today) with a PowerPoint presentation and hoping your players will, first, understand how to play (doubtful) and, second, be able to give you an accurate sense of what they think the game will feel like (really doubtful!).
But there are some core problems in the nature of early testing that are unsolvable, yet predictable, and that, if you lack experience or start from the wrong axioms, might damage your strategy in the long run.
First, early prototypes are stripped of one of the things that drives the most engagement and attraction: the aesthetics. I have seen first hand the difference between testing a product that is functional but has no sound, art or animations, and testing the same core mechanics and functionalities with all those elements brought together into a cohesive experience. Of course, early prototypes are there to validate other things, but sometimes humans will be drawn towards functionality they would not find appealing per se if there is an amazing interface driving exploration: playtesting is weird because people are weird. And the leap is enormous! Some ideas will only work if they come with an amazing interface. Imagine playing Zelda: Breath of the Wild with cubes and polygons, MIDI sounds and no soundtrack... the mechanics can be great, but it will not test well. It will not be "delightful" to play the game, so only the players who get hooked by the abstract mechanics will seem to like it (or professional playtesters, who are rare!).
Secondly, in complex systems the experience is not the sum of its parts. Mash together a racing mechanic, a deck-building mechanic, a worker-placement mechanic and a roll-and-move mechanic, and what will you get? Nothing predictable. These are all tested mechanics, and they might work great in their own right, but the way they create a whole experience can only be seen when they are put together. Watch game reviews and you will see critics say that an RPG was unfulfilling because the character-building mechanic was not great. Final Fantasy XIII was criticized for its linear world; Final Fantasy X was linear as well, but its metagame was so interesting that people did not mind. The repeated formula did not work the second time because the game's other elements pulled it down.
But on the other hand, you can have mechanics that feel awful when tested separately but that shine when included in a grander game. The challenge mechanics in the mobile version of Mario Kart would be awful if you played them without the core game, but making those boards progress while playing the racing game gives them another meaning. Or play a game with Mario Bros physics for moving the character, but with an unpolished design and no proper level design, and it will flop! The thing is, these synergies are hard to read in early digital prototypes, so you must be clear about what you are aiming to learn from your prototypes, because your players might give you awful feedback on things that are unrelated to the experience as a whole, and you might need to learn to pick the signal from the noise.
The Learning Curve and the Onboarding
There is another aspect of traditional testing principles that might drive your design in undesirable ways. The one I struggle with the most is the learning curve. Some amazing games have steep learning curves, because they create engagement through complexity. If you are among those who believe that simple things are usually best, then you have not worked in the gaming industry. Complexity arises in game design precisely because of those iterative loops. Each time you playtest a game and find something broken, you need to add additional rules to patch the game mechanic. And in this process players will probably start suggesting things to add to make the game better, for example "why don't you add random events?" or "what if each resource yielded different advantages when delivered to the king?". Gameplay is driven by complexity, and simple elegance is a rare thing to come by (that's why it's considered the holy grail of game design). In my experience, simple gamification systems (like PBLs) are less engaging than more complex systems, without having to create full-fledged games, of course.
But each time you add a rule, you need to think about how you will teach it. The best onboarding experiences are so hard, time-consuming and costly to design that they are often done in a rush. And the worst part is that these "tutorials" usually mean coding "temporary game rules", that is, rules that only apply in the first minutes of the game and then disappear forever! This is horrible, because when you estimate how long your development pipeline will take, you will almost always underestimate the workload of creating the onboarding experience. So what will predictably happen is this: when you finish the main functionalities, which the whole team is by now expert at using, you will take the product to your users and they will feel overwhelmed and lost by the whole thing. You will then take double the time you expected tuning that experience, or, in the worst case, ship the product as it is and hope your users will endure the horrible first few minutes of gameplay.
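Those "temporary game rules" can be sketched in code. Here is a minimal, hypothetical example (the class names and actions are my own invention, not from any real engine): the onboarding rules wrap the permanent rules, restrict them during the first turns, and then vanish.

```python
class Rules:
    """The permanent game rules."""

    def allowed_actions(self, turn):
        return {"roll", "trade", "build"}


class OnboardingRules(Rules):
    """Temporary rules that exist only for the first few turns."""

    TUTORIAL_TURNS = 3

    def allowed_actions(self, turn):
        if turn == 0:
            return {"roll"}            # first turn: teach rolling only
        if turn == 1:
            return {"roll", "trade"}   # second turn: introduce trading
        return super().allowed_actions(turn)  # tutorial is over

    def is_active(self, turn):
        return turn < self.TUTORIAL_TURNS
```

Everything in `OnboardingRules` exists solely for the first minutes of play, which is exactly why the onboarding workload is so easy to underestimate in a development plan.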
Why didn't you catch this problem in early tests? Because you cannot spend resources building a tutorial for your prototypes, so you teach the mechanics the old-fashioned way, like teaching a board game. The person conducting the test acts as the tutorial, and humans are great at explaining things to each other, because they can answer unexpected questions with ease and re-explain only the pieces that are confusing. Automated tutorials, by contrast, must follow strict rules and sequences that, if not optimized properly, might leave your players lost. For instance, I have tried coding interactive tutorials for my board games, and for one tutorial I estimated 2 to 4 weeks of work for a game I could explain in a 10-minute video (of course, learning a board game from a video is suboptimal). Coding onboarding experiences is hard! But not impossible. I actually use prototyping tools like Adobe XD to create interactive storyboards that simulate the tutorials (this is also how I pitch game ideas to my clients). So a regular navigation map of 5 screens might become a 16-screen project showing how the player would learn to play. This helps, but it's not enough.
Why? Because things get more complicated when you get feedback from your users. Remember, playtesting is weird! During playtesting, I have found, early players will ask you to be more precise in every explanation, so you will keep adding pages to your virtual rulebook, so to speak. They will want everything explained in detail, even if that means hand-holding them through the first 30 minutes of the gamification experience. Why? Because they are just testing the tutorial! It is the only thing they are evaluating, and they are unable to imagine that players actually learn the game over time, and that not everything must be said in the tutorial.
This is awful, because you end up with a long tutorial explaining everything, and when you start playtesting the game as a whole, players will hate it! They will start asking you to make the tutorial simpler and more enjoyable, because they don't want to evaluate how you teach your game, they just want to play! And hand-holding is a horrible experience when playing games. In playtesting, your testers wanted to be sure they got all the rules of the game during the testing session, and each gap in their mental model was an actual pain, but that principle doesn't apply to actual players. In general, players will start asking for a more "intuitive" learning experience, which is a weird concept in itself, so let's talk about that for a bit, and why it's impossible to give it to them perfectly.
What is intuition?
I hate the UX experts who came up with the idea of "intuitive" interfaces and design, even though I'm not really sure who they might be. With every client I have to struggle to kill the idea of "intuition" as a key indicator of the design. Why? Because human intuition is horribly inaccurate and unreliable! In psychology, the only reliable type of intuition is called "expert intuition", the kind of fast thinking experts develop after many hours of working in the same field. But even this "expert intuition" has a big chance of failing. Part of the reason many people distrust experts in fields like politics is that their predictions tend to be inaccurate and poor. The best experts are the ones who never trust their intuition and will draw up a research plan to validate those gut feelings, because they know the effects of heuristics and biases. The thing is, our minds love to make predictions, but we actually suck quite a bit at it.
So, what are UX designers really talking about when they use that horrible word? They are talking about affordances and familiarity. Affordances are semiotic elements in a design that cue your brain into a way of acting upon an interface, because it fits your mental and physiological models. It is like the knob of a door, designed so you can only grab it with your hand: there is just no other sensible way of using it! So, yes, you can make a button "look" like a button in your interface, so people will "tap" it and not "drag" it, for example. However, affordances only work fine in simple contexts: there are no affordances when explaining game rules... but there can be familiarity.
Familiarity is the idea that you have seen a pattern repeated elsewhere that can be applied to a new interface or rule set. Think of how we name game mechanics: a "market mechanic" is a system that "works like" a real-world market, where you trade goods for currency. Creating a market-like system and calling it a "market" creates familiarity, and you will not need a tutorial on how the market works. Think also of how shooter games use the controller's right trigger for firing: it is familiar and kind of an affordance! I hated that in Zelda: Breath of the Wild the button for jumping was not the traditional one; it was not familiar. There is no affordance rule that says you must jump with X (in a PlayStation sense), that is just a convention. However, familiarity can actually backfire! If a game feels too familiar, it will seem like a copy of an original. If all we design are familiar PBL systems, they lose their "interest factor" really fast. Thus, UX design is full of affordances and familiarities, as they help take a lot of complexity away, but sometimes testing becomes a futile quest for intuition where none will ever, or should ever, exist.
If you are planning on using games for learning or engagement, remember that games are abstractions with "weird" rules that imply "leaps of logic". Games are not simulations; they are designed artifacts meant to improve an experience. Why can't you place two settlements next to each other in Catan? This is game logic, and that kind of game logic cannot be explained through intuition, familiarity or affordances: it's actually counterintuitive! I have played a lot of board games, and one of the things I love about them is that it's impossible to learn a new game just by looking at the board, no matter how many I have played. Yes, glancing at the board gives me some ideas about how it might be played, because of "expert intuition", but I'm nowhere near being able to play a game without reading the rules.
And in this sense, playtesting is weird again, because it defies the logic of testing for simplicity. Finding the balance between explicit rule explanation and what can be left to familiarity or affordances requires another kind of mindset, another kind of question.
Short-Term vs Long-Term Enjoyment, and Saving the Player from Herself
And I finish with the paradox I care about most, but one that required the others to be laid down first to appreciate its beautiful complexity. I have worked on some projects where I'm not in charge of the testing procedures, and I always suffer through those. Not because the people doing the testing are not experts in their field, but because they are proficient in testing, not in playtesting. In these scenarios I have seen great mechanics stripped out and awful ones included against my counsel.
When you test, especially when you test early, you will find yourself evaluating "short-term enjoyment". Imagine a focus group where you have a captive audience for 40 minutes. 40 minutes in gamification time is almost nothing, as your design is probably meant to work for hours or days or weeks or... anyhow, in this kind of testing you will be dealing with short-term pleasure mechanisms, which work very differently from long-term enjoyment mechanisms. Let me give you an example: if you ask a focus group whether gathering points for completing different activities and earning some badges is enjoyable, your testers will say yes! Now, if you are a responsible gamification designer with a sense of how humans react to extrinsic rewards, you know that kind of enjoyment will last a couple of months at most. But testers will take this signal as a reason to include this kind of design in your project: you will almost never test a product for three months anyway! Many enjoyable mechanics that work in the short term fail to capture long-term engagement, but you will never see that in playtesting. And your client or UX testing company will gravitate toward quick wins, as they remove the horrible sense of uncertainty, unknowingly digging a long-term grave, so it is your duty to try and stop them!
Furthermore, your players might start giving suggestions that damage the experience overall. My girlfriend, who designs our gamification and board game interfaces, is, for example, the worst tester in this sense, because she will always push towards changing the balance of the game to reduce its difficulty, or to give her more power over the outcomes. And I know she really enjoys difficult games, but the tester brain is in a different mode than the player brain. The tester brain is thinking about how others might react to difficulty and how it could kill the experience, while the player brain is embracing the difficulty and welcomes frustration. So playtesting requires a particular skill set that allows the game designer to know when a suggestion might improve the game, and when it's just the tester brain making incorrect hypotheses or overreacting to the empowerment of the evaluation process. Natalia now knows when she is inclined to make that kind of suggestion, but still can't help herself and makes them anyway. Again, this is weird!
But the hardest thing to measure is long-term enjoyment associated with mastery acquisition. Remember I told you that too much hand-holding in your tutorial can be bad and that some things will be learned through repetition and pattern recognition? Well, this is the kind of thing that is hard to measure in controlled testing scenarios. Right now I'm in the middle of a beta release of a gamification app that teaches service models to a company's front-line employees, and it has been quite a challenge. Why? Because the company decided to collect experience data from day one, and the first comments were that the game was too complex, but the tutorial was too long. I have had to calm the team down, because it seems like an impossible scenario: if it's too complex, should we explain more? But if the tutorial is too long, should we strip things away? My take: it was still too early to know. After a few more days the comments have become more like this: "Learning to play the game was hard and complex, but after playing some more it becomes clear and easy". The whole learning problem revolves around the feeling that the game is hard during the first 15 minutes of gameplay; after a couple of sessions it seems to lose that sense of complexity, and for a game meant to be played for four months that doesn't seem bad.
Still, this is a beta version and there is room for improvement. We have been working on that tutorial for 2 months now, so, of course, the frustration of it not being well received runs high. But as I have worked with this kind of scenario for a long time, I already have my take on possible improvements: it is neither about shortening the tutorial nor about adding pages to it; it is about spreading the learning curve further. Right now, a new mechanic is explained each time the players start a level, so we are teaching too much too fast. We should probably let the players "play" a little longer with some mechanics before increasing the complexity.
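The idea of spreading the learning curve can be sketched as a simple unlock schedule. This is a hypothetical illustration (the mechanic names and the `spacing` parameter are invented for the example, not from the actual app): instead of introducing one new mechanic per level, a new one appears only every few levels, so players get to practice before the complexity grows.

```python
# Hypothetical mechanic list, in the order they are taught.
MECHANICS = ["move", "trade", "build", "events", "alliances"]


def unlocked_mechanics(level, spacing=3):
    """Mechanics available at a given level: one new mechanic
    every `spacing` levels instead of one per level."""
    count = 1 + level // spacing
    return MECHANICS[:min(count, len(MECHANICS))]
```

With `spacing=1` you get the "one mechanic per level" schedule we are currently using; raising `spacing` stretches the curve without touching the tutorial's length at all.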
How do I know this? By learning how to read the players in the playtesting environment, and by having the right theoretical framework: I am not so much listening to the players' suggestions as understanding why those suggestions appear, so I can use the right design framework to correct the underlying problem. To save players from themselves means listening to their suggestions, not to apply them, but to understand their roots. And for that, keep in mind that you are playing the long game, and that in playtesting the word of the player is not Gospel, which can be mistaken for stubbornness on your part. You must understand their pains and desires, but frame them in the right psychological framework and be creative. How can we increase short-term engagement without damaging long-term results? How can you make sure that a mechanic that might carry your players through mastery in the mid game is not removed at an early stage because players find it confusing at first? This requires embracing a whole new level of uncertainty, but having the right knowledge, theoretical background and experience working on games will never hurt.
So be wary of the paradigms you use to test your gamification project. It is far more similar to playtesting games than to testing traditional digital products. Each rule you add to your tutorial might kill a bit of uncertainty for your player, but might create pain in other ways ("I still don't understand the rule", or "the learning process is too long", or "I have to read too much information"). And removing a key feature because players found it confusing in early testing might kill hours of future enjoyment! So give a lot of thought to each piece of feedback, and even more to how you create the testing environment, but that is a theme for a whole new article.
Happy testing!