Game of Thrones Book Analysis
Top 10 Characters based on the number of times the Character was mentioned in the book.

My friend, who has read the Game of Thrones books, often talks about how different they are from the TV series. I have never read the books myself and don’t plan to, but I thought it would be fun to analyze them.

So, welcome to this project. It should be fun!

In this analysis, we’ll discover the top characters mentioned in the books, their sentiment, and their longevity. We’ll use various data processing and analysis techniques to uncover interesting patterns and trends within the text.

Winter is coming soon; let’s dive in.

Data Acquisition and Preparation

We downloaded the “Game of Thrones Boxed” book series in EPUB format from pdfdrive.com and converted it to a plain text (TXT) file to make it easier to analyze. We used an external converter, but there is also a Python library, EbookLib, that can read EPUB files.
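For readers who would rather stay in Python, here is a minimal sketch of the conversion using only the standard library. It relies on the fact that an EPUB file is a ZIP archive of (X)HTML documents. The function name `epub_to_text` and the choice to read files in name order (rather than the reading order declared in the book's package manifest) are assumptions made for illustration; this is not the external converter used for this project.

```python
import zipfile
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes from an (X)HTML document, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag in ("p", "div", "br", "h1", "h2", "h3"):
            # Keep paragraph and heading boundaries as line breaks.
            self.parts.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def epub_to_text(epub_file):
    """Return the text of every (X)HTML file inside an EPUB archive.

    Assumption: files are read in sorted name order, which only
    approximates the real chapter order of the book.
    """
    chunks = []
    with zipfile.ZipFile(epub_file) as archive:
        for name in sorted(archive.namelist()):
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                parser = _TextExtractor()
                parser.feed(archive.read(name).decode("utf-8", errors="ignore"))
                parser.close()
                chunks.append("".join(parser.parts))
    return "\n".join(chunks)
```

Calling `epub_to_text("books.epub")` (a hypothetical file name) returns one long string that can be written to a TXT file for the later steps.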

Our main goals for this project are:

1. Identify the most frequently mentioned characters in the book series and compare them with the top characters of the TV series.

2. Analyze the sentiment of the text to understand its emotional tone.

3. Perform a survival analysis to estimate the “lifespan” of characters based on their mentions.

Word Frequency Analysis

To identify the most frequently mentioned characters, we performed a word frequency analysis. We counted capitalized words, which are likely to be character names. Initially, we compiled a list of frequently occurring capitalized words and then refined this list by excluding common sentence-starting words such as “The,” “He,” “She,” etc.
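The counting step can be sketched roughly as follows. The stoplist here is a small hand-picked example, and the function name `top_capitalized_words` is hypothetical; the exact list of excluded words used in the analysis is not reproduced here.

```python
import re
from collections import Counter

# Assumption: an illustrative stoplist of common sentence-starting words.
SENTENCE_STARTERS = {
    "The", "He", "She", "It", "They", "But", "And", "His", "Her",
    "When", "There", "You", "We", "That", "This", "Then", "What",
}


def top_capitalized_words(text, n=10, exclude=SENTENCE_STARTERS):
    """Return the n most common capitalized words that are not in `exclude`."""
    # A capital letter followed by one or more lowercase letters.
    words = re.findall(r"\b[A-Z][a-z]+\b", text)
    counts = Counter(word for word in words if word not in exclude)
    return counts.most_common(n)
```

In practice the stoplist grows over a few passes: run the count, spot the words that are clearly not names, add them to the set, and repeat.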

1. Top 10 Most Mentioned Characters

We then found the top 10 characters based on their mentions, which highlights the central figures of the narrative in the book.


Here's a poster of the series adaptation of the book. Do you see faces from our list?

From this, we see how closely the top characters from the books match those of the TV series. For context, Dany, the 8th name on our list, is Daenerys Targaryen, the Mother of Dragons and Breaker of Chains. I enjoyed this character :)


2. Sentiment Analysis

Next, we analyzed the sentiment of the text using the VADER sentiment analyzer from the NLTK library. VADER returns positive, neutral, and negative scores for each piece of text, plus a single compound score, which we used to categorize the emotional tone as positive, neutral, or negative.

The text was preprocessed to remove stopwords and non-alphabetic tokens and then lemmatized. Sentiment polarity scores were calculated for each line, and the results were categorized based on the compound score.
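Here is a minimal sketch of the scoring and categorization step. The ±0.05 cutoff on the compound score is the conventional VADER threshold and is an assumption here, since the exact cutoff used in the analysis is not stated. The helper names `label_sentiment` and `score_lines` are hypothetical, and the sketch skips the stopword removal and lemmatization described above.

```python
def label_sentiment(compound, threshold=0.05):
    """Map a VADER compound score (-1 to 1) to a sentiment label.

    Assumption: the conventional +/-0.05 cutoff.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_lines(lines):
    """Score each non-empty line with VADER and attach a label.

    Requires NLTK and its lexicon: nltk.download("vader_lexicon").
    """
    # Imported lazily so label_sentiment works without NLTK installed.
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    results = []
    for line in lines:
        if line.strip():
            compound = analyzer.polarity_scores(line)["compound"]
            results.append((line, compound, label_sentiment(compound)))
    return results
```

Counting the labels returned by `score_lines` gives the kind of distribution shown in the plot.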

Here’s a plot showing the sentiment distribution throughout the book:


Characters Sentiment Analysis

Going one level deeper, we conducted sentiment analysis on the dialogues of the top characters and calculated the average sentiment score for each character. This analysis revealed differences in how characters are perceived emotionally based on their roles and experiences in the story.

The average sentiment scores for the characters’ dialogues are visualized in the bar plot below. Characters associated with positive sentiment have bars above zero, characters associated with negative sentiment have bars below zero, and characters close to zero are neutral.
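The averaging itself is simple. As a sketch, assume each dialogue line has already been scored and paired with its character; the function name `average_sentiment` and the input shape are illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def average_sentiment(scored_dialogues):
    """Map each character to the mean compound score of their dialogue lines.

    scored_dialogues: iterable of (character, compound_score) pairs.
    """
    by_character = defaultdict(list)
    for character, score in scored_dialogues:
        by_character[character].append(score)
    return {name: mean(scores) for name, scores in by_character.items()}
```

The resulting dictionary maps directly onto a bar plot, with one bar per character.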


3. Character Lifespan and Survival Analysis

To understand how often characters appear in the text and their prominence, we used regular expressions to extract the number of mentions (lifespan) and dialogues for each character. Then, we performed a survival analysis using the Kaplan-Meier estimator to gain insights into the prominence and longevity of characters within the narrative.
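A rough sketch of the extraction step is shown below. The dialogue rule (a line that mentions the character and contains a quoted passage) is a heuristic assumed for illustration, and both function names are hypothetical; the actual patterns used in the analysis may differ.

```python
import re


def count_mentions(text, name):
    """Count whole-word occurrences of a character's name."""
    return len(re.findall(rf"\b{re.escape(name)}\b", text))


def dialogue_lines(text, name):
    """Return lines that mention `name` and contain a quoted passage.

    Assumption: a line with both a name and a quote is treated as
    that character's dialogue, which will sometimes misattribute speech.
    """
    quote = re.compile(r'["“][^"”]+["”]')
    name_re = re.compile(rf"\b{re.escape(name)}\b")
    return [
        line
        for line in text.splitlines()
        if name_re.search(line) and quote.search(line)
    ]
```

The word boundaries (`\b`) keep a short name such as "Jon" from matching inside a longer word.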

The Kaplan-Meier plot shows the survival probability of characters based on the number of mentions. Characters with higher survival probabilities are mentioned more frequently throughout the series, indicating their importance and continued presence in the narrative.
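For the curious, the Kaplan-Meier estimate can be computed by hand. The sketch below is a plain-Python version of the estimator, S(t) = Π (1 − d_i / n_i), where d_i is the number of events at time t_i and n_i is the number still at risk. Treating each character's mention count as the "duration" follows the setup described above; the function name `kaplan_meier` and the default that every event is observed are assumptions. A library such as lifelines offers a ready-made fitter for the same estimate.

```python
from collections import Counter


def kaplan_meier(durations, observed=None):
    """Return [(t, S(t))] for each duration with at least one observed event.

    durations: one number per character (here, the mention count).
    observed:  optional booleans; False marks a censored observation.
               Assumption: every event is observed when omitted.
    """
    if observed is None:
        observed = [True] * len(durations)

    events = Counter(t for t, seen in zip(durations, observed) if seen)
    leaving = Counter(durations)  # everyone who exits the risk set at t

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = events.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve
```

Plotting the returned points as a step function gives the familiar Kaplan-Meier curve.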


Overall Insights

  • Top Characters: The most mentioned characters closely match the top characters from the TV series, which suggests that mention frequency is a useful proxy for a character’s prominence.
  • Sentiment Distribution: The TV series was heavy on dialogue, so it is no surprise that the sentiment analysis revealed a balanced distribution of positive, neutral, and negative sentiments, reflecting the complex emotional tone of the series.
  • In a twist, the characters with the most negative sentiment scores, Jon Snow and Arya Stark, are actually protagonists and among the most loved by fans of the TV series.
  • Character Longevity: We lost some of our favorite characters in the TV series. The survival analysis highlights the prominence of key characters based on their mentions, with some characters maintaining a high presence throughout the books.
  • Character Sentiment: The sentiment analysis of dialogues shows how characters are perceived emotionally, with varying sentiment scores reflecting their roles and experiences. Again, no surprise: Arya was bent on avenging her family, and Jon Snow kept warning us, “Winter is coming!”

Conclusion

This analysis of the “Game of Thrones” book series shows that text processing and statistical analysis can help us understand the narrative structure and emotional tone of qualitative data and uncover hidden trends and patterns.

© Obinna Nweke, 2024.
