Probability and Likelihood in R
Probability
What is probability vs. a probability distribution?
Probability is a measure of how likely an event is to occur.
A probability distribution is a mathematical function or model that gives the probability of each possible value in an experiment.
Flip a coin
Consider a coin toss (an experiment). If we toss a fair coin, the probability of getting a head or a tail is 50%, i.e. p = 0.5.
For the probability mass function (PMF) of a single coin flip, the random variable x can be either 1 (head) or 0 (tail). The number of tosses is n = 1 and the theoretical probability of heads is θ = 0.5.
PMF
#dbinom computes the binomial probability mass function
#x = target value of the random variable
#N = number of attempts (coin tosses), here 1
#θ = theoretical probability of success (getting a head)
#0 = tail, 1 = head
dbinom(1,size = 1, prob =0.5) #P(x=1|N=1)
[1] 0.5
dbinom(0,size=1, prob = 0.5) #P(x=0|N=1)
[1] 0.5
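As a quick sanity check on the PMF above, the probabilities of the two possible outcomes of a single toss must sum to 1:

```r
# PMF of one fair coin toss: P(x=0) + P(x=1) must equal 1
dbinom(0, size = 1, prob = 0.5) + dbinom(1, size = 1, prob = 0.5)  # 1
```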
Using the binomial probability distribution (another type of probability function) we can find the probability of different possible outcomes, e.g. tossing 2 heads out of 8 tosses.
Here the random variable x can take multiple values, from 0 to 8.
set.seed(123)
# Number of coin tosses
n <- 8
# Probability of getting a head on a single toss
p <- 0.5
# Generate a sequence of possible outcomes (number of heads) or no of random variables
x <- 0:8
# Calculate the probability of getting exactly x heads using the binomial distribution
probability <- dbinom(x, size = n, prob = p)
# Plot the probability distribution
barplot(probability, names.arg = x, xlab = "Number of Heads (k)", ylab = "Probability", main = "Probability of Getting Heads in 8 Coin Tosses")
The probability is 0.27343750 when x = 4. Using the likelihood function, we can also evaluate how plausible a parameter value (θ = 0.5) is given an observed outcome (4 heads).
probability
[1] 0.00390625 0.03125000 0.10937500 0.21875000 0.27343750 0.21875000 0.10937500
[8] 0.03125000 0.00390625
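Two quick checks on this vector, assuming the `x` and `probability` objects from the code above: the nine probabilities must sum to 1, and the most probable count of heads is x = 4.

```r
# Binomial PMF for 8 fair coin tosses
x <- 0:8
probability <- dbinom(x, size = 8, prob = 0.5)

sum(probability)            # all outcomes together have probability 1
x[which.max(probability)]   # most probable number of heads: 4
max(probability)            # its probability: 0.2734375
```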
Likelihood
For the same events, we can calculate the binomial likelihood function.
The probability of an event is determined by θ and N.
probability mass function: P(x|θ)
likelihood function: L(θ|x)
The likelihood tells us how likely a value of θ is, given x and N.
First Method:
Using dbinom directly, we fix the random variable and let θ vary.
In this example we fix x = 4 and let θ range from 0 to 1 in steps of 0.01.
# Number of coin tosses
n <- 8
# Number of heads observed (you can vary the number of heads to see for yourself)
x <- 4
theta <- seq(0,1,0.01)
f <- dbinom(x,n,theta)
plot(theta,f,type="l", xlab="θ",ylab="LF",main="LF,x=4,N=8")
At θ = 0.5 the likelihood is highest (about 0.27 on the y axis).
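The peak can also be read off numerically instead of from the plot. A small sketch, using the same grid as above:

```r
# Likelihood of theta on a grid, for x = 4 heads in N = 8 tosses
theta <- seq(0, 1, 0.01)
f <- dbinom(4, 8, theta)

theta[which.max(f)]   # grid value with the highest likelihood: 0.5
max(f)                # likelihood there: 0.2734375
```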
Second Method:
You can create a function.
Here we fix x = 6.
# Create a LF function
# inputs: theta, x and N
# output: Likelihood
likelihood <- function(theta, x, N){
  dbinom(x, N, theta)
}
# Set up the inputs for LF you created
# The most important thing is "varying probability of success on each trial"
x <- 6
N <- 8
theta <- seq(0, 1, 0.0001) # varied probability of success
f <- likelihood(theta, x, N)
# plot the likelihood as function of probability of success
plot(theta, f, type="l", xlab="θ", ylab="LF", main="LF, x=6, N=8")
We can see in the plot above that θ is not fixed and the distribution has shifted to the right: the maximum is now at θ = x/N = 6/8 = 0.75, where the likelihood is about 0.31 (y-axis).
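Locating the peak numerically confirms this. The block below is self-contained and re-defines the same likelihood() helper used in the document:

```r
# Likelihood helper, mirroring the function defined earlier
likelihood <- function(theta, x, N){
  dbinom(x, N, theta)
}

theta <- seq(0, 1, 0.0001)
f <- likelihood(theta, x = 6, N = 8)

theta[which.max(f)]   # maximizer: 0.75, i.e. x/N
max(f)                # likelihood at the maximum: ~0.3115
```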
Maximum Likelihood
The maximum of the likelihood is obtained at θ = 0.5 when the number of coin tosses is 8 and 4 heads are observed.
The value of θ that maximizes the likelihood is known as the maximum likelihood estimate.
Remember: likelihood L(θ|x) = P(x|θ).
For discrete events, the likelihood of θ = 0.5 given the observed event (4 heads) is 0.27,
which is also the probability of getting 4 heads in 8 tosses.
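This identity can be checked directly in R. A minimal sketch, re-defining the same likelihood() helper so the block stands alone:

```r
# Likelihood helper, mirroring the function defined earlier
likelihood <- function(theta, x, N){
  dbinom(x, N, theta)
}

# Same number either way:
dbinom(4, 8, 0.5)       # probability of 4 heads in 8 tosses, theta = 0.5
likelihood(0.5, 4, 8)   # likelihood of theta = 0.5 given 4 observed heads
```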
x <- 4
N <- 8
theta <- seq(0, 1, 0.0001) # varied probability of success
f <- likelihood(theta, x, N)
# plot the likelihood as function of probability of success
plot(theta, f, type="l", xlab="θ", ylab="LF", main="LF, x=4, N=8")
Maximizing Likelihood
There are several ways to maximize a likelihood function in R.
Our coin toss example is simple, but in real-world scenarios we need numerical methods to find the θ that maximizes the likelihood.
For example, we can create a negative-likelihood function.
nlm performs non-linear minimization, but we can use it to maximize the likelihood function,
because maximizing a function is the same as minimizing that function multiplied by minus one.
#x = 4
#n=8
negative_likelihood <- function(p){
  -dbinom(4, 8, p)
}
# Test that our function is behaving as expected
negative_likelihood(0.5)
[1] -0.2734375
nlm(negative_likelihood,0.5,stepmax=0.5)
$minimum
[1] -0.2734375
$estimate
[1] 0.5
$gradient
[1] 0
$code
[1] 1
$iterations
[1] 1
nlm finds the estimate θ = 0.5 ($estimate); the maximum likelihood there is 0.2734375 (the reported $minimum with its sign flipped back).
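nlm is not the only option. Base R's optimize() searches a bounded interval and can maximize directly, without negating the function. A sketch, assuming the same data (4 heads in 8 tosses):

```r
# Maximize the binomial likelihood over theta in [0, 1]
result <- optimize(function(p) dbinom(4, 8, p),
                   interval = c(0, 1), maximum = TRUE)

result$maximum    # MLE of theta, ~0.5
result$objective  # likelihood at the maximum, ~0.2734375
```

Since θ is a probability, a bounded search over [0, 1] is a natural fit here.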