
How to Estimate Chance with Dice Rolls Using Convolutions and Recursion

Matt Rosinski

Senior Data Scientist | Business Insights | Causal AI

Published: May 23, 2023

In today's edition of Data Science Code in Python + R, we're going to tackle a fascinating concept. We're diving into the world of convolutions and recursion, two powerful tools to add to your data science toolkit. And we're going to do it with something fun: a dice-rolling exercise!

To start off, let's break down these two terms. Convolution is a mathematical operation on two functions that produces a third function. It’s a way of combining two sets of information. For data scientists, this comes up often in the context of image and signal processing, but it has other interesting applications as well.

On the other hand, recursion is a method of solving problems by having functions call themselves. In other words, a recursive function solves a problem by solving smaller instances of the same problem.
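As a small illustration of recursion in the dice setting used in this article, here is a minimal Python sketch that counts the ordered ways a number of dice can add up to a given total. The function name `count_ways` and its parameters are made up for this example and are not from the original code.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_ways(num_dice: int, total: int, num_sides: int = 6) -> int:
    """Count the ordered rolls of num_dice dice whose faces add up to total."""
    if num_dice == 0:
        # Base case: with no dice left, only a total of 0 is reachable.
        return 1 if total == 0 else 0
    if total < num_dice or total > num_dice * num_sides:
        # No combination of faces can reach this total.
        return 0
    # Recursive case: fix the face of one die, then solve the smaller
    # problem for the remaining num_dice - 1 dice.
    return sum(
        count_ways(num_dice - 1, total - face, num_sides)
        for face in range(1, num_sides + 1)
    )


print(count_ways(2, 7))  # 6 of the 36 two-dice rolls sum to 7
```

The cache only avoids recomputing repeated subproblems; the structure of the function (a base case plus a call on a smaller instance) is what makes it recursive.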

Let's now talk about how we can apply these concepts to our dice rolling scenario.

Step-by-Step Convolution and Recursion:

Consider a single roll of a six-sided die. The outcomes range from 1 to 6, each with an equal probability of 1/6. We can represent this as a probability vector (single_die_prob) in which each element is the probability of the corresponding outcome. With a single die, the "sum" is just the face value, so every entry in the vector is 1/6.

# Generate the probability vector for a single die in R
num_sides = 6  # a standard six-sided die
single_die_prob = rep(1/num_sides, num_sides)

But what happens when you roll more than one die and want to know the probability of getting a particular sum? This is where convolutions come into play!

The convolution of the distributions of two independent random variables gives us the distribution of their sum. So, if we roll two dice and sum the results, we can get the probability distribution of the sum by convolving the distributions of the individual rolls.
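Written out for two independent dice X and Y (symbols introduced here only for illustration), the statement above reads as follows, where P(Y = j) is taken to be 0 whenever j falls outside 1 to 6:

```latex
P(X + Y = s) = \sum_{k=1}^{6} P(X = k)\, P(Y = s - k), \qquad s = 2, \dots, 12

% Worked example for s = 7: every k from 1 to 6 pairs with a valid face 7 - k
P(X + Y = 7) = \sum_{k=1}^{6} \frac{1}{6} \cdot \frac{1}{6} = \frac{6}{36} = \frac{1}{6}
```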

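# Generate the probability vector for the sum of two dice rolls in R
# Note: the usual convolution in base R is convolve(x, rev(y), type = "open");
# a fair die's vector is symmetric, so rev() makes no difference here.

two_dice_sum_prob = convolve(single_die_prob, single_die_prob, type = "open")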
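For readers following along in Python, here is a minimal sketch of the same step with the convolution written out by hand, so the operation itself is visible. The helper name `convolve_pmf` is an assumption for this example, and `fractions.Fraction` is used only to keep the probabilities exact.

```python
from fractions import Fraction


def convolve_pmf(p, q):
    """Full ("open") discrete convolution of two probability vectors."""
    result = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            # Outcome i of the first variable plus outcome j of the second
            # contributes to position i + j of the sum's distribution.
            result[i + j] += p_i * q_j
    return result


num_sides = 6
single_die_prob = [Fraction(1, num_sides)] * num_sides
two_dice_sum_prob = convolve_pmf(single_die_prob, single_die_prob)

# Index 0 corresponds to a sum of 2, so a sum of 7 sits at index 5.
print(two_dice_sum_prob[7 - 2])  # 1/6
```

The result has 11 entries, one for each possible sum from 2 to 12.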

This gives us the probability distribution for the sum of two dice. The process can be repeated for more dice. For three dice, we take the convolution of the distribution for two dice with the distribution for one die:

# Generate the probability vector for the sum of three dice rolls in R

three_dice_sum_prob = convolve(two_dice_sum_prob, single_die_prob, type = "open")

If we have many dice, we don't want to manually write out each step like this. This is where recursion comes in!

Recursion allows us to automate this process: convolving the distribution for n dice with the distribution for one die gives the distribution for n + 1 dice. In R this can be done with the `Reduce` function, which folds a two-argument function over a list, feeding each intermediate result back in as the first argument of the next call.

# Generate the probability vector for the sum of multiple dice rolls in R

dice_sum_prob = Reduce(function(x, y) convolve(x, y, type = "open"), rep(list(single_die_prob), num_dice))
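A rough Python counterpart, offered as a sketch rather than the article's own code, can use `functools.reduce` in the same way. It is shown next to an explicitly recursive version to make the "n dice from n - 1 dice" step visible. All function names here are hypothetical, and both versions assume at least one die.

```python
from fractions import Fraction
from functools import reduce


def convolve_pmf(p, q):
    """Full ("open") discrete convolution of two probability vectors."""
    result = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            result[i + j] += p_i * q_j
    return result


def dice_sum_prob_reduce(num_dice, num_sides=6):
    """Fold the convolution over num_dice copies of the single-die vector."""
    single = [Fraction(1, num_sides)] * num_sides
    return reduce(convolve_pmf, [single] * num_dice)


def dice_sum_prob_recursive(num_dice, num_sides=6):
    """Same result, written as an explicit recursion on the number of dice."""
    single = [Fraction(1, num_sides)] * num_sides
    if num_dice == 1:
        # Base case: one die is just the single-die vector.
        return single
    # Recursive case: distribution for n - 1 dice convolved with one more die.
    return convolve_pmf(dice_sum_prob_recursive(num_dice - 1, num_sides), single)


dice_sum_prob = dice_sum_prob_reduce(5)
print(len(dice_sum_prob))  # 26 possible sums, from 5 to 30
```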

And that's it! We've calculated the probability distribution for the sum of dice rolls using convolutions and recursion.

Lastly, we can sum up the probabilities for sums greater than a target value:

# Calculate the probability of the sum being greater than target_sum in R

favorable_prob = sum(dice_sum_prob[(target_sum - num_dice + 2):length(dice_sum_prob)])
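The index arithmetic is the easy part to get wrong, so here is a small Python sketch of the same tail sum. Python lists are 0-indexed, which shifts the start index by one compared with the R line above. To keep the check independent of any convolution code, the distribution is built by brute-force enumeration. The name `prob_sum_greater_than` is an assumption for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def prob_sum_greater_than(dice_sum_prob, num_dice, target_sum):
    """Sum the probabilities of all totals strictly greater than target_sum."""
    # Index i holds P(sum == num_dice + i), so the first total above the
    # target, target_sum + 1, lives at index target_sum + 1 - num_dice.
    first_index = max(target_sum + 1 - num_dice, 0)
    return sum(dice_sum_prob[first_index:])


num_dice, num_sides, target_sum = 5, 6, 20

# Enumerate all 6**5 = 7776 ordered rolls and count each total.
counts = Counter(
    sum(roll) for roll in product(range(1, num_sides + 1), repeat=num_dice)
)
total_rolls = num_sides ** num_dice
dice_sum_prob = [
    Fraction(counts[s], total_rolls)
    for s in range(num_dice, num_dice * num_sides + 1)
]

favorable_prob = prob_sum_greater_than(dice_sum_prob, num_dice, target_sum)
print(favorable_prob)  # 287/1296, about 0.2215
```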

Putting it Together with Python and R

Performing convolutions in R to establish the probability distribution for 5 dice

The output for the R code is shown in the cover image for this article. We can use a similar approach in Python with a few modifications.

Performing convolutions in Python to establish the probability distribution for 5 dice

# Calling our Python function
estimate_dice_roll_chances(num_dice=5, sum_greater_than=20)

We see a similar result rendered with Seaborn and Matplotlib using our Python function for the case where we have 5 standard six-sided dice and want to know the chances a roll will sum to greater than 20.

Probability distribution for dice roll sums using convolutions and recursion

Other Applications

The beauty of this approach is that it isn't limited to dice rolls. Anytime you're dealing with independent random variables and you're interested in the probability distribution of their sum, convolution can come into play. This can be applied in a wide variety of scenarios:

Signal Processing: In the field of digital signal processing, convolutions are used to apply filters to signals.
Image Processing: Convolution is at the heart of convolutional neural networks (CNNs), a class of deep learning models most commonly applied to analyzing visual imagery. They're used in feature extraction which is vital for image recognition tasks.
Natural Language Processing (NLP): Convolutional neural networks can also be used in text analysis, where filters slide over sequences of words to capture local patterns in phrases and sentences.
Time series analysis: The moving average of a time series can be obtained through the convolution of the time series data and a sequence of weights.
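To make the time series item concrete, here is a minimal Python sketch of a simple moving average computed as a convolution of the data with equal weights. The helper names `convolve_valid` and `moving_average` are assumptions for this example. Only positions where the window fits entirely inside the series are returned.

```python
from fractions import Fraction


def convolve_valid(series, weights):
    """Convolve series with weights, keeping only fully overlapping positions."""
    m = len(weights)
    return [
        # Convolution pairs weights[k] with the series in reverse order.
        sum(weights[k] * series[n + m - 1 - k] for k in range(m))
        for n in range(len(series) - m + 1)
    ]


def moving_average(series, window):
    """Simple moving average: convolution with `window` equal weights."""
    weights = [Fraction(1, window)] * window
    return convolve_valid(series, weights)


smoothed = moving_average([1, 2, 3, 4, 5], window=3)
print([float(x) for x in smoothed])  # [2.0, 3.0, 4.0]
```

Because the weights are all equal, reversing them has no effect here. With unequal weights (for example, a weighted moving average) the order matters.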

Wrapping Up

So, there you have it, a step-by-step walkthrough of convolution and recursion, set in the context of a dice rolling problem. I hope this helped you grasp these fundamental concepts. With these tools in your data science toolkit, you can now approach many problems in a new and more efficient way.

Remember, as with anything in data science, practice is key. Try experimenting with these concepts in different scenarios and exploring further applications.

Happy coding and remember to like and leave a comment if you found this article interesting or helpful!

~ Matt

Better Decisions with Data

Hussain Mohammed Dipu Kabir

Programming, Research and Development

5 months ago

Hi Matt, thank you for sharing. I have shared functions in the Python programming language for computing the probabilities of different sums when rolling an N-sided die K times. Link: https://github.com/dipuk0506/dice The readme file describes the reasons. In the functions, I did convolutions over loops. There are two functions, one for fair dice and another for unfair dice.
