Course: Complete Guide to AI and Data Science for SQL: From Beginner to Advanced

Checking the distribution of the variables

- [Instructor] Now that you've explored the summary statistics of your dataset in the last step, in this step you are going to visualize your data to gain a deeper understanding of its distribution. Data visualization is like putting on special glasses that allow you to see patterns and insights in your data. To do this, you'll be using the Python libraries Matplotlib and Seaborn. Let's dive right in. You see this code? When you run it, you're creating a histogram for each of your columns, and each histogram represents the distribution of a specific attribute. Here's an example. Take a look at the crime rate column. The distribution of crime rate appears to be highly skewed to the right, with a mean of 3.61 and a maximum value of 88.98. When you look at the histogram, you'll see that most houses have a crime rate below 20, and there are fewer houses with higher crime rates. This means that the majority of houses in the…
