How Big is Big Data in the Gaming Industry?
Hariom Nayani
Masters in Computer Science
Stevens Institute of Technology
Hoboken, New Jersey, United States
1. Abstract
Next-generation microprocessors and graphics cards have made gaming a very accessible source of entertainment for a wide range of age groups. This paradigm shift has opened new doors for gaming companies to harness the large volume of data that is now readily available to them. Game developers have a plethora of information on their players at their disposal and can therefore take advantage of reliable models that accurately predict user behavior. This paper goes over several ways in which gaming giants leverage this data to gain insights into user behavior and trends, which developers use to optimize marketing strategies and let the data guide decisions instead of relying on guesswork. It also covers a small case study of how Big Data was leveraged by one of the leaders in the gaming industry, Electronic Arts (EA).
2. Introduction
a. History of Gaming
Gaming has been a steadily growing industry and one of the most profitable forms of entertainment. It was born at a small science fair, where computer scientists experimented with mainframes to create mini games [1]. From there, there was no stopping it. Arcade cabinets came out, and Atari launched consoles and its own games. In the 90s came the era of multiplayer games, first introduced by Sega, in which multiple users could play on different consoles over a LAN. Fast forward to 2022, and people play games on their way home from work and in their spare time, and some make a living by streaming and competing in tournaments.
An estimated 3.4 billion people are predicted to engage in gaming through devices such as computers, PlayStations, mobile phones and Nintendo consoles. The mobile gaming industry alone is predicted to reach an estimated value of $138 billion. With the arrival of new Virtual Reality (VR) devices and the hype surrounding the metaverse, there is only one way this industry is going to go.
All these statistics point to one thing: an abundance of data is available to leverage and make use of.
b. Motivation
Big data analytics helps organizations harness their data and use it to identify new opportunities. That, in turn, leads to smarter business moves, more efficient operations, higher profits and happier customers. The gaming industry, estimated at $300 billion, is leveraging Big Data like no other industry, which is genuinely helping companies serve gamers better and make innovative decisions that maximize profit without harming the gameplay experience.
3. Why is it necessary?
a. Enhancing gaming experience
User experience is at the forefront of any technological product. It is important that the user not only plays the game but also comes back to play it again, which is why customer retention is a crucial factor in decision-making. Big data should provide insights from the get-go: from the very first time a player plays the game, the churn model should identify the type of player and place them at the correct level of difficulty.
b. Enhancing Technological Development
Gaming has accelerated the growth of cyberspace and pushed forward the development of the next generation of processors. This has led to extremely high demand for speed, performance, and low response times [4].
c. Monitoring Gaming Information
A large compilation of gaming data can be collected: what time the player plays the game, how much time is spent playing, the number of times they have visited the store, the number of friends they have, how often they play with them, and so on. The ways in which this data can be used are boundless. It also helps companies track down bugs and other issues that might crop up.
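As a rough illustration of this kind of monitoring, the sketch below rolls raw telemetry events up into per-player metrics. The `Event` record, the event kinds `"session"` and `"store_visit"`, and the metric names are hypothetical and chosen only for this example; a real pipeline would define its own schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    """One telemetry event (hypothetical schema for illustration)."""
    player_id: str
    kind: str             # "session" or "store_visit" (assumed event kinds)
    minutes: float = 0.0  # session length; unused for store visits


def summarize(events):
    """Aggregate raw events into per-player counters."""
    summary = defaultdict(
        lambda: {"sessions": 0, "minutes_played": 0.0, "store_visits": 0}
    )
    for event in events:
        stats = summary[event.player_id]
        if event.kind == "session":
            stats["sessions"] += 1
            stats["minutes_played"] += event.minutes
        elif event.kind == "store_visit":
            stats["store_visits"] += 1
    return dict(summary)


if __name__ == "__main__":
    sample = [
        Event("p1", "session", 30.0),
        Event("p1", "store_visit"),
        Event("p1", "session", 45.0),
        Event("p2", "session", 10.0),
    ]
    for player, stats in summarize(sample).items():
        print(player, stats)
```

The same roll-up could feed the retention and monetization models discussed in the following sections.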
3. How are they leveraging it?
a. To fix glitches
Every gaming move and every button push can be recorded by the companies if necessary, and these inputs give them an idea of how to fix issues that come up. Gameplay glitches recorded in log files or reported by users are reproduced from these recordings, so that developers can work on and fix the issues.
b. To create a safer environment to game
For all the good gaming brings to the world, as a place where millions of gamers find a sense of belonging, it also has a few pitfalls. Among them is that some people abuse and discriminate against one another. Systems that analyze text and speech are put in place to detect such behavior and ban offending players temporarily or permanently from the platform. Sentiment analysis is used to keep track of complaints registered through the various communication channels in the game [5].
c. For technical tests and beta versions
Big data helps gaming companies with required updates, whether the latest content that needs to be pushed or bugs and errors that need to be patched. Each update relies on data gathered in the past. Beta versions of an update require a lot of analysis and usually run in cycles. A beta version is released before the launch of the game and usually lasts a couple of months or more, depending on the breadth of features available. User behavior captured during the beta gives developers insight into how a large group of players uses the game's UI.
d. For monetization of their store
In-game purchase data, in-game options selected by users, and any other data users generate in-game outside of actual gameplay can be used to understand specific user behavior and hence to target advertisements. Examples include the number of times the store was visited, how much money a player spent in the past (to adjust discounts), how often to give a new user an attractive offer so they come back to buy more items, which items they have browsed the most, and which area of the world they reside in. Many such usage patterns can help video game developers build a more successful and profitable monetization strategy.
e. Matchmaking and AI difficulty
Player retention is crucial for gaming companies: they want users not only to play their games but also to come back for more. For this reason, it is paramount that the player finds the right balance, with the game neither so difficult that they give up nor so easy that they lose interest. Many intelligent systems and algorithms are used to decide which players should be matched with one another and how difficult the game's AI bots should be. Take Valorant, a currently popular First-Person Shooter (FPS) developed by Riot Games, which has one of the best matchmaking algorithms. It follows a Match Making Rating (MMR) system that ranks players based on in-game skills such as ability usage, aim, and the other skills required to reach a higher MMR. This is a high-level description, of course; the point is that the process of rating a player is highly data-driven. A large amount of data was collected in the beta phase to create this model, which is constantly being optimized.
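To make the idea of a data-driven rating concrete, here is a minimal sketch of the classic Elo update, a well-known rating scheme. It is not Riot's MMR algorithm, whose details are not described here; the function names, the K-factor of 32, and the sample ratings are assumptions for illustration only.

```python
def expected_score(rating_a, rating_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, a_won, k=32.0):
    """Return both players' new ratings after one match.

    k controls how quickly ratings move (32 is an assumed value).
    """
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


def closest_opponent(player_rating, candidates):
    """Pick the candidate whose rating is nearest to the player's."""
    return min(candidates, key=lambda name: abs(candidates[name] - player_rating))


if __name__ == "__main__":
    queue = {"alice": 1400.0, "bob": 1520.0, "carol": 1700.0}  # made-up ratings
    opponent = closest_opponent(1500.0, queue)
    print("matched with", opponent)
    print(update_ratings(1500.0, queue[opponent], a_won=True))
```

Matching on the nearest rating keeps the expected win probability close to 50%, which is one simple way to aim for games that are neither too hard nor too easy.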
f. Making their game secure
Every day, hackers find new ways to gain a competitive advantage in a game or simply to disrupt someone else's gaming experience. Gaming companies try to stay on top of these hacks, constantly working to write secure code and build hard-to-penetrate systems. Some exploits are nonetheless always found, so it is important to have a large volume of data available to build a system that detects such hackers. Most games, especially online multiplayer games, have a feature that detects players whose actions deviate strongly from the average player's behavior and resemble those of a computer or an automated system.
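One simple way to express "deviates strongly from the average player" is a z-score test on a single metric. The sketch below is an illustrative assumption, not a description of any real anti-cheat system: the metric (headshot ratio), the threshold of 3 standard deviations, and the sample values are all hypothetical.

```python
from statistics import mean, pstdev


def flag_outliers(metric_by_player, threshold=3.0):
    """Return players whose metric lies more than `threshold`
    standard deviations away from the population mean."""
    values = list(metric_by_player.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # everyone behaves identically; nothing to flag
    return [
        player
        for player, value in metric_by_player.items()
        if abs(value - mu) / sigma > threshold
    ]


if __name__ == "__main__":
    # Hypothetical headshot ratios: ten ordinary players and one suspect.
    ratios = {f"player{i}": 0.30 for i in range(10)}
    ratios["suspect"] = 0.99
    print(flag_outliers(ratios))
```

A production system would combine many signals and need far more data, which is exactly why the volume of data matters for this use case.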
4. Case Study – EA [6]
EA was founded by Trip Hawkins, a director of Product Marketing at Apple who left his job to join the boom of the gaming industry in the 80s.
EA was facing problems in the early 2010s and was not making its projected revenue on its most popular games, which spanned genres such as sports, shooters and fantasy. On top of that, new games were launching free of charge with in-game purchase options, which gave EA a lot to think about. Chief Technology Officer (CTO) Rajat Taneja had progressive plans to bring the company back to where it belonged, and the way he saw to accomplish that was to use the terabytes of data EA had gathered in the past.
In a speech at the Strata 2013 conference, Taneja said that gaming is just another form of social media, and that it can be thought of as always being connected to your friends through multiple devices. His team at EA was working with so much data that they had to reject their own past assumptions: data and data mining algorithms won out over the experience of these people. Data and the numbers never lie.
Taneja knew that the only way to catch up with other big companies like Activision, Microsoft, Tencent and Rockstar was to leverage the data EA already had. Geographical data, devices used to sign in, how often players shared their achievements on social media, and their in-game actions were all being collected constantly in the back end while players enjoyed the game. EA built a recommender system to analyze the game data and found complex relationships that it was able to use in its Origin platform (its online game store). This data was used to match players to games based on their profiles, to offer in-game discounts and promotions, and to recommend friends based on the social media applications connected to the platform.
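The internals of EA's recommender are not described here, so the following is only a toy sketch of one common approach, neighbor-based collaborative filtering: games owned by players with similar libraries are scored by how much those libraries overlap with the target player's. The player names, game labels, and the `recommend` function are hypothetical.

```python
from collections import Counter


def recommend(libraries, player, top_n=3):
    """Suggest games the player does not own, weighted by how many
    games each other player has in common with them."""
    owned = libraries[player]
    scores = Counter()
    for other, games in libraries.items():
        if other == player:
            continue
        overlap = len(owned & games)
        if overlap == 0:
            continue  # no shared taste, so ignore this player
        for game in games - owned:
            scores[game] += overlap
    return [game for game, _ in scores.most_common(top_n)]


if __name__ == "__main__":
    # Made-up libraries keyed by player.
    libraries = {
        "ana": {"soccer", "shooter"},
        "ben": {"soccer", "shooter", "racing"},
        "cy": {"soccer", "life_sim"},
        "dee": {"puzzle"},
    }
    print(recommend(libraries, "ana"))
```

The same overlap score could also rank other players as friend suggestions, in the spirit of the friend recommendations mentioned above.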
What were the results?
In 2013, the company lost 7% of its revenue, but revenue shot up by 23% the following year, the highest the company had ever earned. Much of this can be traced to developers using data in the right way to win back a customer base that was being lost.
In 2013 itself, after the shift to data-centric decisions, a single EA game, Battlefield, generated about 1 terabyte of data per day. That is a lot of data for one game. The framework built by EA uses technologies like Hadoop and Apache Spark, which accelerate and ease the process of managing and modeling this data.
This growing amount of data was a challenge for everyone working with data at EA. So much data was pouring in that it was difficult to keep track of it with existing systems. Taneja said that the best way to do this analysis was to take a small amount of data and store it smartly, so that actions can be taken in the game either while the player is playing or after the game is over, when promotional elements are displayed to generate higher revenue.
To do this, the company rebuilt its data pipeline from scratch, implementing Hadoop to run the machine-learning and adaptive, predictive algorithms it had created to analyze the data.
One study of player behavior used the Cloudera Distribution for Hadoop (CDH) on a cluster of four virtual machines (VMs) in a public cloud environment, along with components such as HBase, HDFS (Hadoop Distributed File System), Hive, Impala, HUE (Hadoop User Experience), Pig, Spark, YARN (MR2) and ZooKeeper. Using the CDH Big Data platform, it was possible to map players' gameplay patterns by game usage time and relate them to game modifications such as expansion releases. Using virtual geographic coordinates, a heatmap was created to identify frequented zones according to the characteristics of each avatar or group of avatars. In addition, illegal activities by players were detected. Depending on the modeling, different analyses combining these two methods can be made, yielding inputs and insights that empower and support developers' decision-making [7].
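The heatmap idea can be sketched without any cluster at all: bin each avatar's virtual coordinates into grid cells and count the visits per cell. This is a simplified, single-machine illustration and not the cited study's implementation; the cell size and the sample positions are assumptions.

```python
from collections import Counter


def build_heatmap(positions, cell_size=10.0):
    """Count how many recorded positions fall into each grid cell.

    positions: iterable of (x, y) virtual world coordinates.
    Returns a Counter keyed by (column, row) cell indices.
    """
    heat = Counter()
    for x, y in positions:
        cell = (int(x // cell_size), int(y // cell_size))
        heat[cell] += 1
    return heat


if __name__ == "__main__":
    # Made-up avatar positions in world units.
    samples = [(1.0, 2.0), (3.0, 9.0), (15.0, 2.0), (25.5, 31.0)]
    heat = build_heatmap(samples)
    for cell, visits in heat.most_common():
        print(cell, visits)
```

Building one such counter per avatar class would give the per-group view of frequented zones that the study describes.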
5. Key Challenges
a. High volume and velocity of data
It is never easy to manage an ecosystem in which terabytes of data are flowing in. It requires advanced hardware or next-generation cloud solutions that can produce results quickly and efficiently, without a wait of hours for a job to finish. Amid all this, people at higher levels who handle the business side are demanding insights that may take a while to arrive.
b. Problems with merging data from multiple sources
A large and varied pool of data is required for the analytics behind gaming. Now imagine a cross-platform game that a player can play on a PlayStation, a PC, a mobile phone or an iPad. The data coming from these different devices does not necessarily arrive in the same format, and it needs to be converted into a common format before models can be run on it. This is one example of how merging data can be an issue. In addition, different countries have different laws governing data, so the data must be cleaned to remove entries that cannot be used in the final model because of those countries' restrictions.
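The conversion step can be sketched as one small normalizer per source, each mapping that platform's record into a shared schema. Every field name below (`psn_id`, `play_secs`, `ts`, `user`, `minutes`, `time`, `region`) and the restricted-region filter are hypothetical, invented only to show the shape of the problem.

```python
from datetime import datetime, timezone


def from_console(record):
    """Console records (assumed): seconds played, Unix epoch timestamp."""
    return {
        "player_id": record["psn_id"],
        "platform": "console",
        "minutes_played": record["play_secs"] / 60,
        "timestamp": datetime.fromtimestamp(record["ts"], tz=timezone.utc),
        "region": record.get("region"),
    }


def from_mobile(record):
    """Mobile records (assumed): minutes played, ISO 8601 timestamp."""
    return {
        "player_id": record["user"],
        "platform": "mobile",
        "minutes_played": float(record["minutes"]),
        "timestamp": datetime.fromisoformat(record["time"]),
        "region": record.get("region"),
    }


NORMALIZERS = {"console": from_console, "mobile": from_mobile}


def merge(sources, restricted_regions=frozenset()):
    """Normalize (source_name, record) pairs into one list, dropping
    records from regions whose data may not be used (assumed rule)."""
    merged = []
    for source, record in sources:
        row = NORMALIZERS[source](record)
        if row["region"] not in restricted_regions:
            merged.append(row)
    return merged


if __name__ == "__main__":
    raw = [
        ("console", {"psn_id": "p1", "play_secs": 1800, "ts": 0, "region": "US"}),
        ("mobile", {"user": "p1", "minutes": 12,
                    "time": "2022-11-06T12:00:00+00:00", "region": "XX"}),
    ]
    for row in merge(raw, restricted_regions={"XX"}):
        print(row)
```

Keeping the per-source logic in separate functions means a new platform only needs a new entry in `NORMALIZERS`.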
c. Time Consuming Cluster Management and High-performance auto scaling
As data grows, managing clusters becomes increasingly challenging. An automated approach is therefore needed to ease the configuration, deployment, and management of Hadoop clusters. Companies also face challenges in mashing up structured and unstructured data, where data of different types must be analyzed under the same model.
6. Conclusion
With enormous amounts of data coming in from multiple sources and in multiple formats, leveraging such data is not an easy task for the gaming industry. However, modern technologies like the Hadoop ecosystem, data mining tools and the visual informatics that follows have made it possible for gaming companies to make informed decisions and become profitable for their stakeholders: the investors and, most important of all, the gamers playing the game. The gaming industry is ever growing, and more data is being pumped into its log files every passing second, which only means that users are in for an even more enhanced gaming experience.
References:
[1] Riad Chikhani, “The History Of Gaming: An Evolving Community,” TechCrunch, Oct. 31, 2015. https://techcrunch.com/2015/10/31/the-history-of-gaming-an-evolving-community/
[2] Ivan, “Gaming Statistics - TrueList 2022,” TrueList, May 24, 2022. https://truelist.co/blog/gaming-statistics/
[3] “U.S. video game device usage reach 2021,” Statista. https://www.statista.com/statistics/219679/household-penetration-of-gaming-devices-in-the-united-states/ (accessed Nov. 06, 2022).
[4] J. Hampton, “The Use of Big Data in the Gaming Industry - IT Supply Chain,” itsupplychain.com, Jun. 21, 2021. https://itsupplychain.com/the-use-of-big-data-in-the-gaming-industry/ (accessed Nov. 06, 2022).
[5] M. Bonenfant, F. Richert, and P. Deslauriers, “Using Big Data Tools and Techniques to Study a Gamer Community: Technical, Epistemological, and Ethical Problems,” Loading..., vol. 10, no. 16, Feb. 2017, Accessed: Nov. 08, 2022. [Online]. Available: https://journals.sfu.ca/loading/index.php/loading/article/view/174/0
[6] “Electronic Arts: Big Data in Video Gaming,” Big Data in Practice, pp. 273–279, Apr. 2016, doi: 10.1002/9781119278825.ch43.
[7] V. P. Barros and P. Notargiacomo, “Big data analytics in cloud gaming: Players’ patterns recognition using artificial neural networks,” IEEE Xplore, Dec. 01, 2016. https://ieeexplore.ieee.org/abstract/document/7840782/ (accessed Nov. 08, 2022).