Normal Distribution
What is it?
It's one of the most widely used distributions in statistics (also known as the Gaussian, Gauss or Laplace-Gauss distribution). Its density is shaped like a bell, hence the name bell curve.
The Gaussian distribution is defined by two parameters: μ (the mean of the distribution) and σ (the standard deviation of the distribution).
Characteristics of the normal distribution:
Laplace-Gauss distribution PDF:
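For reference, the standard probability density function of a normal distribution with mean μ and standard deviation σ can be written in LaTeX as follows. The symbol f is only a label chosen here for the density.

```latex
f(x \mid \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^{2} \right)
```

Note that the mean is subtracted from x before dividing by σ. The curve peaks at x = μ, and a larger σ makes it wider and flatter.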
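This formula lets us draw a normal distribution from its mean and its standard deviation. For example, X ~ N(10, 2): here 10 is the mean and 2 is the standard deviation. (Many textbooks write the second parameter as the variance σ², in which case the same distribution would be written N(10, 4).)
Our normal distribution would then be: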
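As a worked substitution, assuming the 2 is read as the standard deviation as stated above, plugging μ = 10 and σ = 2 into the normal density gives:

```latex
f(x) = \frac{1}{2 \sqrt{2\pi}} \exp\!\left( -\frac{1}{2} \left( \frac{x - 10}{2} \right)^{2} \right)
     = \frac{1}{2 \sqrt{2\pi}} \exp\!\left( -\frac{(x - 10)^{2}}{8} \right)
```

The bell is centered at x = 10, and most of its area lies within a few multiples of σ = 2 on either side.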
Let's take an example to see where we use it.
Example:
Photo by Philip Myrtorp on Unsplash
You have a business appointment to schedule, but you don't know what time to set it for, given that you must take a plane and then a taxi to get there.
You know that flight times from point A to point B follow this distribution:
From the distribution, which is based on past flights, you know that the trip will most likely take between 3.5 and 4.5 hours, with 4 hours being the single most likely duration, and that there is only a low chance of arriving in 2.5 hours, or in 5 hours or more. So you decide to schedule your appointment 4.5 hours after your departure so as not to arrive late!
You can see in the graph that the average is 4 hours, with an approximate standard deviation (σ) of 0.5 hours.
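The reading of the graph can be checked numerically. The sketch below assumes the flight time is exactly normal with the mean of 4 hours and standard deviation of 0.5 hours read off the graph. It ignores the taxi ride, and the variable names are illustrative only.

```python
from scipy.stats import norm

# Assumed parameters, read off the graph: mean 4 h, standard deviation 0.5 h
flight_time = norm(loc=4.0, scale=0.5)

# Probability that the flight takes between 3.5 h and 4.5 h (within one sigma of the mean)
p_within_one_sigma = flight_time.cdf(4.5) - flight_time.cdf(3.5)

# Probability that the flight takes 4.5 h or less
p_on_time = flight_time.cdf(4.5)

# Probability that the flight takes 5 h or more (sf is the survival function, 1 - cdf)
p_very_late = flight_time.sf(5.0)

print(f"P(3.5 <= T <= 4.5) = {p_within_one_sigma:.3f}")
print(f"P(T <= 4.5)        = {p_on_time:.3f}")
print(f"P(T >= 5)          = {p_very_late:.3f}")
```

Under these assumptions, roughly 68% of flights land between 3.5 and 4.5 hours, about 84% land within 4.5 hours, and only about 2% take 5 hours or more.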
In the next article we will explore the standard normal distribution and how to compute probabilities. For instance, what are the chances that the trip takes only 3 hours?
Let's build it in Python
from scipy.stats import norm
import numpy as np
import matplotlib.pyplot as plt
# PDF
# We need x, mu and sigma for the normal distribution
def normal_dis_pdf(x, mu, sigma):
    # Standardize first: subtract the mean, then divide by sigma
    z = (x - mu) / sigma
    return (1 / (sigma * np.sqrt(2 * np.pi))) * np.exp(-0.5 * z ** 2)
# Let's test it
# x will be a np.array, but a single value works too
# μ is 0 and σ is 1
x = np.array([1, 2, 3])
normal_dis_pdf(x, 0, 1)
# No need to hand-code it: scipy gives the same result
norm.pdf(x, loc=0, scale=1)
# By default loc is 0 and scale is 1, so this call is equivalent
norm.pdf(x)
# Let's do some visualization
# 10000 evenly spaced values from -7 to 7
x = np.linspace(-7, 7, 10000)
fig, ax = plt.subplots(figsize=(20, 8))
ax.plot(x, normal_dis_pdf(x, 0, 1))
plt.show()
# CDF
norm.cdf(x, loc=0, scale=1)
fig, ax = plt.subplots(figsize=(20, 8))
ax.plot(x, norm.cdf(x, 0, 1))
plt.show()
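A hand-written density is easy to get subtly wrong, and a mistake in the (x - μ) / σ term does not show up when μ = 0 and σ = 1. As a quick sanity check, the sketch below compares a hand-written PDF against scipy's norm.pdf for the earlier example with mean 10 and standard deviation 2. The helper name normal_pdf and the grid of x values are illustrative choices.

```python
import numpy as np
from scipy.stats import norm

def normal_pdf(x, mu, sigma):
    """Hand-written normal density (hypothetical helper for this check)."""
    z = (x - mu) / sigma  # subtract the mean first, then divide by sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2 * np.pi))

# Evaluate both versions on a grid around the mean
x = np.linspace(0, 20, 201)
manual = normal_pdf(x, mu=10, sigma=2)
reference = norm.pdf(x, loc=10, scale=2)

# True when the two agree up to floating-point tolerance
matches = np.allclose(manual, reference)

# The density should peak at the mean
peak_x = x[np.argmax(manual)]

print("matches scipy:", matches)
print("peak located at x =", peak_x)
```

Testing with a non-zero mean and a non-unit standard deviation is a deliberate choice, since it exercises the parts of the formula that the standard normal case hides.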