Is This Normal?
A loose rendering of the normal distribution curve, done in Adobe Illustrator

Now that I have your attention, I'll point out that "this" refers to data points, not people or behavior. "Normal" in this scenario refers to the normal distribution of values for a single field, like the one we see in the chart for this article's cover image. A normal distribution has the following characteristics:

  • Bell-shaped
  • Symmetrical
  • Unimodal (bell has one peak)
  • Mean, median, and mode are equal and at the center of the distribution

Describing the characteristics of a normal distribution in words still only captures a fraction of what we can see when we look at an actual normal distribution curve. In the graphic below, we can see the impact of visualizing standard deviations on a normal distribution curve as well.

Normal Distribution with Standard Deviations
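
To put numbers on those standard deviation bands, here's a minimal Python sketch using the standard library's NormalDist class. It isn't taken from the graphic; the helper name share_within is my own, purely for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.1%}")
# prints 68.3%, 95.4%, and 99.7% for k = 1, 2, 3
```

Because the curve is symmetrical, half of each share sits on either side of the mean.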

Everyday examples of data that roughly follow a normal distribution include things like height, weight, and shoe size.

Power BI Weekly

Expecting a normal distribution is one thing, but how can we calculate it and compare it to existing data values? In one of the latest videos of the Power BI Weekly series, I cover how to do all of this using DAX measures in tandem with a standard combination line/stacked column chart.

Power BI Weekly: Normal Distribution with DAX
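
The video does this with DAX measures, which I won't reproduce here. As a rough Python analogue of the same idea, the sketch below fits a normal distribution to some data and compares the observed count in each bin (the columns) with the count the fitted curve would predict (the line). The sample values, the bin width, and the function name compare_to_normal are all hypothetical.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample values (say, heights in inches); not data from the video.
values = [61, 63, 64, 65, 65, 66, 66, 66, 67, 67,
          67, 67, 68, 68, 68, 69, 69, 70, 71, 73]
BIN_WIDTH = 2

# Fit a normal distribution using the sample mean and standard deviation.
fitted = NormalDist(mu=mean(values), sigma=stdev(values))


def compare_to_normal(data, dist, bin_width):
    """Return (bin_start, observed_count, expected_count) for equal-width bins."""
    rows = []
    start = min(data)
    while start <= max(data):
        end = start + bin_width
        observed = sum(1 for v in data if start <= v < end)
        # Expected count = sample size times the probability mass in the bin.
        expected = len(data) * (dist.cdf(end) - dist.cdf(start))
        rows.append((start, observed, expected))
        start = end
    return rows


for bin_start, observed, expected in compare_to_normal(values, fitted, BIN_WIDTH):
    print(f"{bin_start}-{bin_start + BIN_WIDTH}: observed={observed}, expected={expected:.1f}")
```

The closer the observed counts track the expected ones, the more reasonable it is to treat the field as normally distributed.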

Analyzing normal distribution is a key part of the statistical analysis that goes into AI and machine learning algorithms, for example.

100 Days, 100 AI Courses

I'm so excited to be part of the latest LinkedIn initiative for learning AI skills! You can read more about it in an insightful recent blog post that came out on March 15th: 100 (Free) AI Courses to Help You Navigate the Future of Work (linkedin.com)

LinkedIn 100 AI courses in 100 days

Learning that I had one course available in these 100 free courses was exciting, but I was even more excited to learn that I had three courses in the list! If you're wondering what these courses cover, keep on reading below.

Power BI

I love writing code, but I also love tools and applications that make my life easier. Power BI is an interesting tool because it combines both of those facets. While there are many ways we can incorporate ETL frameworks, algorithms, or data visualizations by writing R or Python code of our own within the application, this course doesn't do any of that directly.

Instead, it breaks down using AI (and machine learning, which incorporates feedback loops into these algorithms) into three different AI model categories:

  1. Power BI automatically runs the model for us.
  2. We connect to a pre-built algorithm or function within Power BI that runs an AI model.
  3. We build the model ourselves (in this case using DAX measures for linear regression and outlier detection).

Power BI: Integrating AI and Machine Learning
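
To give a feel for the third category, here's a small Python sketch of least-squares linear regression with a simple residual-based outlier flag. The course builds its versions as DAX measures, so treat this as an illustration only: the function names, the sample data, and the two-standard-deviation threshold are my own assumptions.

```python
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def flag_outliers(xs, ys, threshold=2.0):
    """Flag points whose residual exceeds `threshold` residual standard deviations."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    spread = stdev(residuals)
    return [abs(r) > threshold * spread for r in residuals]


# Hypothetical data: y is roughly 2x, with one suspicious point at x = 6.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [2.1, 3.9, 6.2, 8.0, 9.8, 30.0, 14.1, 16.0, 18.2, 19.9]

flags = flag_outliers(xs, ys)
print([x for x, is_outlier in zip(xs, flags) if is_outlier])  # prints [6]
```

Note that the outlier itself pulls the fitted line toward it, which is one reason a fixed threshold like this is only a starting point.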

Logistic Regression

Linear and logistic are both types of regression models. While we see linear regression as a straight line on a two-dimensional plane, logistic regression is an S-shaped curve in the same space. The course explores why we would choose one over the other. It also explores how to understand logistic regression models at a detailed level in Excel, then move the code into R and then Power BI to make it a scalable process.

Machine Learning with Logistic Regression in Excel, R, and Power BI
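
The straight line versus S-shaped curve difference is easy to see in code. In this minimal Python sketch (the function names and default coefficients are placeholders, not values from the course), the logistic model simply passes the same linear expression through the sigmoid function, which squeezes the output into the range between 0 and 1.

```python
import math


def linear(x, slope=1.0, intercept=0.0):
    """Straight line: output is unbounded."""
    return slope * x + intercept


def logistic(x, slope=1.0, intercept=0.0):
    """S-shaped curve: the linear expression passed through the sigmoid."""
    return 1.0 / (1.0 + math.exp(-(slope * x + intercept)))


for x in (-6, -2, 0, 2, 6):
    print(f"x={x:>2}: linear={linear(x):>5.1f}, logistic={logistic(x):.3f}")
```

That bounded output is why logistic regression suits yes/no outcomes, where the prediction is read as a probability.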

Data Reduction

This course focuses on reducing dimensionality in data, which includes clustering algorithms (like KMeans and hierarchical clustering), as well as algorithms for anomaly detection like Principal Components Analysis (PCA for short). Like the logistic regression course above, this course also explores how to understand these models first at a detailed level in Excel, then move the code into R and then Power BI to make it a scalable process.

Machine Learning with Data Reduction in Excel, R, and Power BI
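
As a taste of the clustering side, here's a bare-bones k-means sketch in Python for a single numeric field. It's an illustration under simplifying assumptions rather than the course's implementation: it handles one dimension only, needs k to be at least 2, and starts the centers evenly spread across the sorted values. The spend figures are made up.

```python
from statistics import mean


def kmeans_1d(values, k=2, iterations=10):
    """Group 1-D values into k >= 2 clusters; returns (centers, labels)."""
    ordered = sorted(values)
    # Assumed initialization: starting centers spread evenly across the sorted values.
    centers = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centers[c] = mean(members)
    return centers, labels


# Hypothetical customer spend values with two obvious groups.
spend = [12, 15, 14, 13, 95, 102, 98, 11, 99]
centers, labels = kmeans_1d(spend, k=2)
print(centers)  # cluster centers: 13 and 98.5
print(labels)   # prints [0, 0, 0, 0, 1, 1, 1, 0, 1]
```

Nine rows end up summarized by two cluster centers, which is the "reduction" in data reduction.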

What's New?

Three months into the year and a lot's happened so far, including the release of a Power BI Weekly serial course video every week on Thursdays.

New Time Series Course!

In addition to my weekly video releases, I had another full course come out in the LinkedIn Learning library this past week focusing on time series models.

Time Series Modeling in Excel, R, and Power BI

Houston Power BI User Group

If you live in the Houston area (or you're stopping through), check out our user group meeting on the third Thursday of each month. You can check out the meeting details and sign up for the actual meeting on Meetup.

Greater Houston Power BI Users Group (Houston, TX) | Meetup

Thanks for subscribing to my newsletter! The next edition will cover the new time series modeling course, with other topics to come after that!

-HW

Monika Wahi

Epidemiology & Biostatistics Consultant a/k/a Data Scientist | Exclusive and innovative solutions for data science challenges in public health, research and education

1 year ago

I had a great statistics teacher tell me that when I ask if a distribution is normal, a better way of asking is, "Is it definitely NOT normal?" If a distribution is so skewed or bimodal it is definitely not normal, then it's an easier call. But if you don't rule out normalness, then what are you going to do? My solution is to get big data so the distribution doesn't matter as much - and since you are using PowerBI, that's probably your solution, too! I'll check out these courses when I have a moment!

Is this normal? When talking about the distribution of residuals in a linear regression model, the question isn't even asked!

Taiwo Adegbayi

Analytics Engineer @ FirstBank Nigeria

1 year ago

Nice intro for the 3 courses. I will surely look them up.
