Sherlock Holmes and his Statistical Inference

The story begins when Hilton Cubitt of Ridling Thorpe Manor in Norfolk visits Sherlock Holmes and gives him a piece of paper with the following mysterious sequence of stick figures.

[Image: the sequence of dancing stick figures]

Cubitt explains to Holmes and Dr Watson that he has recently married an American woman named Elsie Patrick. Before the wedding, she had asked her husband-to-be never to ask about her past, as she had had some "very disagreeable associations" in her life, although she said that there was nothing that she was personally ashamed of. Their marriage had been a happy one until the messages began to arrive, first mailed from the United States and then appearing in the garden.

The messages had made Elsie very afraid but she did not explain the reasons for her fear, and Cubitt insisted on honouring his promise not to ask about Elsie's life in the United States. Holmes examines all of the occurrences of the dancing figures, and they provide him with an important clue.

In fact, Holmes used an important statistical method to get to the bottom of the matter.

Holmes' Solution

Holmes observed that there are exactly 26 distinct figures. This led him to guess that the figures are in one-to-one correspondence with the letters of the English alphabet. The difficulty was working out which figure stands for which letter. Drawing on the fact that "e" is the most frequent letter in English text, he reasoned that the most frequent figure in the messages should correspond to "e". It did. Then, using the lady's name "Elsie" as a known word and tracking which figure follows which, he could work out the remaining letters. As more messages reached him, the full correspondence fell into place. Brilliant, right?
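The two steps described above (guess "e" from frequency, then use "Elsie" as a known word) can be sketched in a few lines of Python. This is only an illustration, not Holmes' or the author's actual procedure: the sample messages, the random cipher built by make_cipher, and the helper names encode, most_frequent_symbol and apply_crib are all assumptions made for this sketch. Figures are represented as integers 0 to 25, and word boundaries are assumed to be known.

```python
from collections import Counter
import random
import string

# Illustrative plaintext messages (an assumption made for this sketch).
MESSAGES = [
    "am here abe slaney",
    "at elriges",
    "come elsie",
    "never",
    "elsie prepare to meet thy god",
]

def make_cipher(seed=0):
    """Build a hypothetical one-to-one map from letters to symbol ids 0..25."""
    symbols = list(range(26))
    random.Random(seed).shuffle(symbols)
    return dict(zip(string.ascii_lowercase, symbols))

def encode(message, cipher):
    """Encode a message as a list of words, each a list of symbol ids."""
    return [[cipher[ch] for ch in word] for word in message.split()]

def most_frequent_symbol(encoded_messages):
    """Return the symbol that occurs most often across all messages."""
    counts = Counter(s for msg in encoded_messages for word in msg for s in word)
    return counts.most_common(1)[0][0]

def apply_crib(encoded_messages, known, crib="elsie"):
    """Extend `known` (symbol -> letter) with the first word that fits the crib."""
    for msg in encoded_messages:
        for word in msg:
            if len(word) != len(crib):
                continue
            candidate = dict(known)
            fits = True
            for sym, letter in zip(word, crib):
                # A symbol may stand for only one letter.
                if candidate.get(sym, letter) != letter:
                    fits = False
                    break
                candidate[sym] = letter
            # A letter may be written with only one symbol.
            if fits and len(set(candidate.values())) == len(candidate):
                return candidate
    return known

cipher = make_cipher()
encoded = [encode(m, cipher) for m in MESSAGES]

# Step 1: the most frequent symbol is the guess for "e".
guess_for_e = most_frequent_symbol(encoded)

# Step 2: a five-symbol word shaped like e _ _ _ e reveals l, s and i.
decoded = apply_crib(encoded, {guess_for_e: "e"})
print(sorted(decoded.values()))
```

With so little text, the frequency guess works here only because "e" dominates these particular messages. On a short or unusual message the most frequent figure need not be "e", which is why a known word such as "Elsie" is so useful as a check.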

The Statistics inside Holmes' Mind Palace

Further Exploration

I have also shared my personal thoughts on how this relates to Markov chains.

The idea of modelling which letter tends to follow which goes back to Andrey Markov, who studied letter sequences in Pushkin's Eugene Onegin. Claude Shannon later used the same concept to develop his theory of communication systems and information theory. It is also a foundation stone of Natural Language Processing, where it is used to model how languages operate. You can learn more about this here.
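As a rough illustration of the Markov chain idea, the sketch below estimates, from a piece of text, the probability that one letter is followed by another. It assumes a first-order chain over letters, with spaces and punctuation dropped so that pairs run across word boundaries. The function name transition_probabilities and the sample sentence are choices made for this sketch, not something taken from the talk.

```python
from collections import Counter, defaultdict

def transition_probabilities(text):
    """Estimate P(next letter | current letter) from adjacent letter pairs."""
    # Keep letters only; spaces and punctuation are dropped (a simplification).
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = defaultdict(Counter)
    for current, following in zip(letters, letters[1:]):
        counts[current][following] += 1
    # Normalise each row of counts so that it sums to 1.
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }

sample = "The messages had made Elsie very afraid"
probs = transition_probabilities(sample)

# Letters that follow "e" in the sample, with their estimated probabilities.
print(probs["e"])
```

Each row of the result is one row of the chain's transition matrix. A sentence this short gives very noisy estimates; a usable table of letter transitions would need a much larger body of text.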

Acknowledgement

I want to thank Professor Bimal Kumar Roy for sharing this in his talk. The talk was in Bengali so I have translated the idea and added my own thoughts to this article with a special perspective on Probability and Statistics. I also want to thank Sarupyo Chatterjee for sharing the talk with me. This story is from Sherlock Holmes: The Adventure of the Dancing Men. I have copied the initial story from the Wikipedia page. You can watch the episode here.

The link to the full article is given in the comments.

Shubham Samant

Machine Learning Engineer

1 year ago

Very interesting.

Sai Kumar Popuri, PhD

Data, Analytics and Engineering leader | Retail | Finance | Experienced C-suite Partner | PhD (Stats / AI) | FRM

1 year ago

Jeremy Brett's depiction of Holmes. A Scandal in Bohemia is probably the very first episode in that TV series.

Srijit Mukherjee

Medical AI Researcher at Penn State

1 year ago

Read the full article and the mathematics behind it at https://www.srijitmukherjee.com/sherlock-holmes-and-his-statistical-inference/ Let me know how you like it.
