Word Analysis: The Bible

During the election, I spent a lot of time analyzing the word usage of speeches and debates, and I found it interesting how much information you could glean just by analyzing the most commonly used words. As I was doing those analyses, I began to wonder what I could learn by performing a similar analysis on the scriptures of various religious traditions. As I’ve noted in previous posts, I’ve always been very interested in the world’s different religions: their histories, traditions, belief systems, differences, and similarities. So much of this comes from each faith’s scriptures, so I thought that analyzing the word usage at a macro level could be particularly insightful. So, I’m going to start a series of posts (and visualizations) analyzing the scriptures of the world’s religions, including both Western and Eastern traditions.

The Bible

I’m going to start the series with the scriptures of Christianity, the Bible. The Bible, as most of us know, is broken into two major Testaments, which Christians refer to as “Old” and “New”. The Old Testament is also, more or less, the same set of books as the Hebrew Bible (though some interpretations are a bit different and the books are ordered differently), so this analysis will largely encompass the Jewish scriptures as well.

The Bible is also generally broken up into eight subdivisions, the first five of which fall within the Old Testament.

  1. The Law (Pentateuch) – Genesis, Exodus, Leviticus, Numbers, Deuteronomy
  2. Historical Books – Joshua, Judges, Ruth, 1st Samuel, 2nd Samuel, 1st Kings, 2nd Kings, 1st Chronicles, 2nd Chronicles, Ezra, Nehemiah, Esther
  3. Poetical Books – Job, Psalms, Proverbs, Ecclesiastes, Song of Solomon
  4. Major Prophets – Isaiah, Jeremiah, Lamentations, Ezekiel, Daniel
  5. Minor Prophets – Hosea, Joel, Amos, Obadiah, Jonah, Micah, Nahum, Habakkuk, Zephaniah, Haggai, Zechariah, Malachi
  6. Gospels and Acts – Matthew, Mark, Luke, John, Acts
  7. Epistles of Paul – Romans, 1st Corinthians, 2nd Corinthians, Galatians, Ephesians, Philippians, Colossians, 1st Thessalonians, 2nd Thessalonians, 1st Timothy, 2nd Timothy, Titus, Philemon
  8. General Epistles and Revelation – Hebrews, James, 1st Peter, 2nd Peter, 1st John, 2nd John, 3rd John, Jude, Revelation

I decided it would be best to work with the New International Version (NIV) of the Bible, a modern English translation of the Protestant Bible (the Catholic Bible includes some additional Old Testament texts) that dispenses with the thee’s and thou’s of the more traditional King James Version. I was fortunate enough to find a copy of the NIV text on www.godoor.net. I split the text into its sixty-six books, then used a tool from www.writewords.org to count the occurrences of each word in each book and compiled the results in Excel. From there, I was able to begin my analysis.
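The counting step could also be scripted. The following is a minimal Python sketch, not the writewords.org tool or the Excel workbook described above. The tokenizer pattern, the function names, and the sample text are all assumptions made for illustration. Words are lowercased, so “Lord” and “lord” land in the same bucket, which matches how the counts are reported later in this post.

```python
import re
from collections import Counter

# Assumed tokenizer: runs of letters, keeping internal apostrophes ("god's").
WORD_RE = re.compile(r"[a-z]+(?:['’][a-z]+)*")


def count_words(text):
    """Return a Counter mapping each lowercased word to its occurrences."""
    return Counter(WORD_RE.findall(text.lower()))


def count_by_book(books):
    """Take a dict of book name -> text and return book name -> Counter."""
    return {name: count_words(text) for name, text in books.items()}


# Tiny made-up input for illustration; this is not the NIV text.
sample_books = {
    "Book A": "The lord is good. The LORD is great.",
    "Book B": "God's people went out.",
}
counts = count_by_book(sample_books)
```

Each per-book `Counter` can then be written out to a spreadsheet or fed straight into the later steps.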

Unfortunately, as was the case with my previous analyses of political debates and speeches, the most common words were ones like “the” and “and”, which give virtually no insight into the meaning of the text. (I’d be willing to bet that “the” is the most common word in pretty much every text ever written in or translated into English.) But, of course, these are not the only common words that would rob the analysis of its insight, so I needed some way to remove them. As a starting point, I obtained a list of the 5,000 most common words from www.wordfrequency.info. I then pruned the list myself so that words which are common but still very meaningful would stay in the analysis. In the end, I was able to create a fairly comprehensive list of words to be removed, which mostly consisted of pronouns (he, she, it), prepositions (at, from, after), conjunctions (and, but, so), and determiners (this, the, those). Of course, this process is a bit subjective and I may have made choices others would not have, but the result, I think, is much better than simply leaving all of these commonly used words in the analysis. (If you’re interested, I’d be happy to share the list of words I’ve excluded.)
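As a rough illustration of this filtering step, here is a short Python sketch. The stop-word set contains only the example words named above, not the full exclusion list, and the function name and sample counts are hypothetical.

```python
from collections import Counter

# Illustrative only: the example words from the paragraph above.
# The full exclusion list used for the analysis is not reproduced here.
STOP_WORDS = {
    "he", "she", "it",        # pronouns
    "at", "from", "after",    # prepositions
    "and", "but", "so",       # conjunctions
    "this", "the", "those",   # determiners
}


def remove_stop_words(counts, stop_words=STOP_WORDS):
    """Return a new Counter without the words found in stop_words."""
    return Counter({w: n for w, n in counts.items() if w not in stop_words})


# Made-up counts for demonstration.
raw = Counter({"the": 50, "and": 40, "lord": 12, "god": 9, "from": 7})
filtered = remove_stop_words(raw)
```

Keeping the exclusion list in one place makes the subjective choices easy to review and change.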

Analysis & Visualization

Once I had excluded these common words, I began to visualize the word usage in Tableau (you can find the full visualization here). I started by analyzing the most commonly used words in the Bible as a whole. The following visualization shows the Top 10 words.

Top 10 Words in the Bible (List starts on the left and goes down)

The number 1 word is “Lord” (this includes both the upper case “Lord”, which refers to God, and the lower case “lord”, which refers to someone’s superior), with 7,759 occurrences, followed by “God”, at number 2, with 3,977 occurrences. Interestingly, “Jesus” makes the list (ranked #10 with 1,273 occurrences), despite the fact that the New Testament accounts for less than 23% of the 274,000+ words. (Remember, this is a filtered list of words; without filters, the total word count is over 725,000, but the Old and New Testament percentages remain roughly the same.)
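The two figures in this paragraph, a top-N list and one group’s share of the total word count, are simple to compute once per-book counts exist. The sketch below is a Python illustration rather than the Tableau workbook itself; the function names and the tiny sample counts are made up and do not reproduce the real numbers.

```python
from collections import Counter


def top_words(book_counts, n=10):
    """Sum the per-book counts and return the n most common (word, count) pairs."""
    total = Counter()
    for counts in book_counts.values():
        total.update(counts)
    return total.most_common(n)


def share_of_words(book_counts, group):
    """Return the fraction of all counted words that fall in the given set of books."""
    overall = sum(sum(c.values()) for c in book_counts.values())
    in_group = sum(
        sum(c.values()) for name, c in book_counts.items() if name in group
    )
    return in_group / overall if overall else 0.0


# Made-up filtered counts for two books.
book_counts = {
    "Genesis": Counter({"lord": 6, "god": 3}),
    "Matthew": Counter({"jesus": 2, "lord": 1}),
}
top_two = top_words(book_counts, n=2)
new_testament_share = share_of_words(book_counts, {"Matthew"})
```

Passing the set of New Testament book names as `group` gives the kind of percentage quoted above.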

I also created a Word Cloud, which allows you to select a specific Testament, Division, or Book. For example, here’s the cloud for the book of Job.

Working with this view of the data and experimenting with different filters can be quite insightful.

Finally, I created a single visualization showing bubble charts of each book’s word usage.

This is an interesting view of the data because it shows the most commonly used words and also gives some perspective on the number of words in each book relative to the others. For example, you can see that the Epistles of Paul are, more or less, organized by size, with the largest book, Romans, appearing first, and the smallest, Philemon, appearing last.

But other insights can be gained here as well. For example, the term “Christ” is incredibly common in the Epistles of Paul and Acts (typically ranked first or second), but it appears far less often in the other books of the New Testament. In Hebrews, for example, “Christ” is only the 12th most commonly used word. In James, which is traditionally attributed to Jesus’s brother, “Christ” is the 84th-ranked word, appearing only twice.
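A rank like “12th most commonly used word” can be read off a book’s filtered counts. This is a small Python sketch with a hypothetical helper name and made-up counts. How ties are broken is an assumption here (words with equal counts keep the order in which they were first counted), so ranks near a tie may differ from those shown in Tableau.

```python
from collections import Counter


def word_rank(counts, word):
    """Return the 1-based rank of word by descending count, or None if absent."""
    for rank, (candidate, _) in enumerate(counts.most_common(), start=1):
        if candidate == word:
            return rank
    return None


# Made-up counts for a single book.
sample = Counter({"faith": 5, "works": 4, "christ": 2})
christ_rank = word_rank(sample, "christ")
missing_rank = word_rank(sample, "moses")
```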

There is likely a lot more insight to be gained, but I’ll leave it there for now. Please take a look at the visualization and let me know if you discover anything interesting. Again, you can find the Tableau visualization, which includes separate tabs for each of the screenshots above, here.

Stay tuned. I’ll be back soon with another analysis of the scriptures of one of the world’s religions.

Ken Flerlage, December 12, 2016

Website: www.kenflerlage.com

Tableau Public: https://public.tableau.com/profile/ken.flerlage#!/


Amy Fields

EHSCN Sr. Specialist at Toyota Boshoku America

7 years ago

I'd really love to talk with you about the software you used for visualization.

Osaghae N. Irianan

Visual Designer | Napstone Multimedia

7 years ago

Amazing! Well done, I look forward to future posts. I just started using Tableau recently and found it quite interesting and productive. Just a heads-up, you are going to find something common in all your further analysis of pretty much every sacred text/book out there. They might use different phraseology, but the underlying principle is the same. Ironically, there's no place in any of these sacred texts, including the Bible, where it actually tells you plainly or writes out what this something actually is lol, but rather the reader is left to decipher it for themselves, based on their own level of understanding or shall I say awakened consciousness. Once they do, everything becomes clear, all the parables, symbols, allegories and all the seeming contradictions, of which there are very many, instantly disappear and they 'll see this something "written" everywhere, from the first verse in Genesis to very last in Revelation. Not only will they see this everywhere in the Bible, but also in every sacred text ever written and also basically in everything. It's the simplest thing and you'll laugh once u "figure it out", but at the same time the most complicated and most elusive thing to grasp to a mind that isn't ready for it.

Adam Gabardy

Training Specialist

7 years ago

Very interesting! I think it would be very enlightening to separate the unique usages of Lord to distinguish between a reference to God and a reference to a man.

Krishna Kumar Vishwakarma

Tech Enthusiast | Senior Software Engineer at Oracle | Ex-Freshworks| Ex-Quadratyx

7 years ago

very nice
