Introduction to seaborn! (Part 2)
Hi Guys! Welcome to the next part of the seaborn series.
If you missed Part 1, please read it first: this article continues directly from it and will be hard to follow otherwise.
Here's a small reminder: seaborn is a statistical visualization library with flexible plotting functions. Every feature (independent variable) needs to be visualized properly to get useful insights from the data, and seaborn makes that easier.
In the last article, we took a deeper look at the reg and cat plots. Today we will look at the hist and violin plots.
Before going into detail about seaborn, let me introduce a useful tool that makes the plots easier to read. IPython can render notebook figures in SVG (Scalable Vector Graphics) format. Because SVG is a vector format, the plot stays sharp at any zoom level, which makes it easier to interpret. Here is the same plot before and after switching the notebook to SVG.
If you look carefully at the image before, the plot looks blurred and is hard to read. After switching to SVG, the plot looks clear and crisp, which makes the data easier to understand.
Code to activate SVG in a notebook:
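# Install the module (run in a terminal, or prefix with ! in a notebook cell)
pip install IPython
# Import the module
from IPython import display
# Render figures as SVG. Note: this call is deprecated in recent IPython versions,
# where matplotlib_inline.backend_inline.set_matplotlib_formats('svg') replaces it.
display.set_matplotlib_formats('svg')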
Histogram:
A histogram shows how often the values of a feature fall into each range. It is also called a frequency diagram.
Data types in hist plots?
It is like a bar plot:
Ex:
Consider this dataset:
Class interval (price ranges of pens): independent feature
Frequency (number of pens sold in each range): dependent feature
We are going to make a histogram from this data.
Let's plot these values:
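As a minimal sketch of such a plot, assuming made-up price ranges and sales counts (the numbers below are invented for illustration only, not real data), the bars can be drawn directly from the class intervals:

```python
import matplotlib.pyplot as plt

# Hypothetical class intervals (pen price ranges) and frequencies (pens sold).
# These numbers are invented for illustration only.
bin_edges = [0, 10, 20, 30, 40, 50]   # boundaries of the price ranges
pens_sold = [12, 30, 45, 22, 8]       # one count per price range

# Each bar starts at the left edge of its interval and spans the whole interval,
# so neighbouring bars touch, as they do in a histogram.
widths = [right - left for left, right in zip(bin_edges[:-1], bin_edges[1:])]

fig, ax = plt.subplots()
bars = ax.bar(bin_edges[:-1], pens_sold, width=widths, align='edge', edgecolor='black')
ax.set_xlabel('Price range of pens')
ax.set_ylabel('Number of pens sold')
ax.set_title('Histogram from class intervals (illustrative data)')
fig.tight_layout()
```

In a notebook the figure is displayed automatically; in a plain script, add plt.show() at the end.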
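So simple, right? But for continuous numbers, choosing the bins is an arduous task.
To choose the bins, we use the Freedman-Diaconis rule. In simple words, it calculates the bins from the data itself.
Using this simple formula (bin width = 2 × IQR × n^(-1/3), where IQR is the interquartile range and n is the number of data points), we can calculate the bins for our histogram: the number of bins is the data range divided by the bin width, rounded up.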
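As a rough sketch of that calculation in NumPy (the helper name fd_bins and the seeded sample are assumptions made for this example), the result can be compared with NumPy's built-in 'fd' option:

```python
import numpy as np

def fd_bins(values):
    """Return (bin_width, number_of_bins) from the Freedman-Diaconis rule.

    Hypothetical helper for illustration; assumes the interquartile range is > 0.
    """
    values = np.asarray(values, dtype=float)
    q75, q25 = np.percentile(values, [75, 25])
    # bin width = 2 * IQR * n^(-1/3)
    width = 2.0 * (q75 - q25) * values.size ** (-1.0 / 3.0)
    # number of bins = data range / bin width, rounded up
    n_bins = int(np.ceil((values.max() - values.min()) / width))
    return width, n_bins

# Seeded log-normal sample so the example is reproducible.
rng = np.random.default_rng(0)
sample = np.exp(rng.standard_normal(1000) / 2)

width, n_bins = fd_bins(sample)

# NumPy applies the same rule when asked for bins='fd'.
numpy_edges = np.histogram_bin_edges(sample, bins='fd')
print(f"bin width = {width:.3f}, bins = {n_bins}, NumPy 'fd' bins = {len(numpy_edges) - 1}")
```

The manual count and NumPy's count should agree, which is why passing bins='fd' to a histogram function is usually all that is needed.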
Let's see it in code: (Matplotlib code)
# import libraries
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
## create some data
# number of data points
n = 1000
# number of histogram bins
k = 40
# generate log-normal distribution
data = np.exp(np.random.randn(n) / 2)
# one way to show a histogram
plt.hist(data, k)
plt.xlabel('Value')
plt.ylabel('Count')
plt.show()
output:
Using Freedman-Diaconis rule:
## try the Freedman-Diaconis rule
# bin width: one way to work out how many bins the histogram needs
r = 2 * stats.iqr(data) * n**(-1/3)
# number of bins: data range divided by bin width, rounded up
b = np.ceil((max(data) - min(data)) / r)
plt.hist(data, int(b))
# or directly from the hist function
#plt.hist(data, bins='fd')
plt.xlabel('Value')
plt.ylabel('Count')
plt.title('F-D "rule" using %g bins' % b)
plt.grid()
plt.tight_layout()
plt.show()
output:
Seaborn code:
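# distplot picks its bins with the Freedman-Diaconis rule internally, so no manual calculation is needed.
import seaborn as sns
sns.distplot(data)  # uses the FD rule by default
# Note: distplot is deprecated since seaborn 0.11; in newer versions use
# sns.histplot(data, bins='fd') instead.
# that's all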
That covers the basics of the histogram. Now let's look at the violin plot.
Violin Plot:
It is a more elegant version of the histogram. We will look at it in two steps:
Visual interpretation:
Real-World Data Interpretation:
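One way to see the link between the two plots is to build the violin shape by hand: smooth the data into a density curve, then mirror that curve around an axis. The following is only a sketch, assuming a Gaussian kernel density estimate from scipy.stats.gaussian_kde with its default settings; plotting libraries may use different smoothing choices, so their violins will not match this outline exactly.

```python
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt

# Seeded, skewed sample so the example is reproducible.
rng = np.random.default_rng(0)
data = np.exp(rng.standard_normal(1000))

# Step 1: estimate a smooth density curve (a "smoothed histogram").
kde = stats.gaussian_kde(data)
grid = np.linspace(data.min(), data.max(), 200)
density = kde(grid)

# Step 2: mirror the curve around a vertical axis to get the violin outline.
fig, ax = plt.subplots()
ax.fill_betweenx(grid, -density, density, alpha=0.5)
ax.set_xticks([])
ax.set_ylabel('Value')
ax.set_title('Violin shape built from a mirrored density')
fig.tight_layout()
```

The wide parts of the shape are where the data is dense, which is the same information a histogram gives through tall bars.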
Let's see it in code: (Matplotlib code)
# import libraries
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as stats
# Install the module
#pip install IPython
# Import the module
from IPython import display
display.set_matplotlib_formats('svg')
## create the data
n = 1000
thresh = 5  # threshold for cropping data
data = np.exp(np.random.randn(n))
data[data > thresh] = thresh + np.random.randn(sum(data > thresh)) * .1
# show histogram
plt.hist(data, 30)
plt.title('Histogram')
plt.grid()
plt.tight_layout()
plt.show()
# show violin plot
plt.violinplot(data)
plt.title('Violin')
plt.grid()
plt.tight_layout()
plt.show()
Output:
Seaborn Code:
import seaborn as sns
sns.violinplot(data, orient='v')
plt.tight_layout()
Output:
That is all about the hist and violin plots.
Name: R.Aravindan
Company: Artificial Neurons.AI
Position: Content Writer