Course: Data Analytics for Business Professionals

Predictive analytics

- Predictive analytics answers the question, "What might happen in the future?" Amazon has offered a unique spin on that concept with its "Recommended for You" feature. This personalized recommendation system uses something called a collaborative filtering engine. It looks at what's in your cart, what's on your wish list, and what you've bought recently, and it recommends other products based on what other consumers bought along with those items. The key here is that predictive analytics does not predict the future. There are too many variables to ever safely know what is going to happen. Rather, predictive analytics attempts to determine which future events are the most likely. Amazon does not know exactly what you will buy when you put peanut butter into your cart. However, past behavior from other customers suggests it'll probably be jelly. This is by no means a certainty, which is why Amazon gives you more than one recommended product. They want you to buy at least one and increase their revenue.

So let's add to our list of descriptive statistics. Another potentially useful one is the standard deviation, a measure of the relative spread of the data. The standard deviation tells us whether the sample as a whole sits close to the mean or whether many observations lie far from the average. A large standard deviation means that a great deal of the data lies far from the mean. A small standard deviation means that the data are clustered closely around the mean.

Without getting overly technical, we often assume large data sets exhibit a normal distribution, what you might have heard of as the bell curve. With this assumption, the 68-95-99.7 rule is often used as a way to put standard deviations into perspective. About 68% of the observations are within one standard deviation above or below the mean, about 95% are within two standard deviations above or below the mean, and about 99.7% are within three standard deviations above or below the mean.
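To make the standard deviation and the 68-95-99.7 rule a little more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two small lists, the simulated sample (mean 100, standard deviation 15), and the helper name `shares_within` are made up and are not part of the course material.

```python
import random
import statistics

# Two made-up data sets with the same mean (50) but very different spread.
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

# pstdev treats the list as the whole population.
tight_sd = statistics.pstdev(tight)
wide_sd = statistics.pstdev(wide)
print(f"tight: mean={statistics.mean(tight)}, sd={tight_sd:.2f}")
print(f"wide:  mean={statistics.mean(wide)}, sd={wide_sd:.2f}")


def shares_within(data, ks=(1, 2, 3)):
    """Return the fraction of values within k standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)
    return {
        k: sum(1 for x in data if abs(x - mean) <= k * sd) / len(data)
        for k in ks
    }


# Simulated, roughly normal data (hypothetical parameters).
random.seed(42)
sample = [random.gauss(100, 15) for _ in range(100_000)]
shares = shares_within(sample)
for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.1%}")
```

On simulated normal data like this, the three shares should land close to 68%, 95%, and 99.7%. On real business data that is skewed or has heavy tails, they can differ noticeably, which is one reason the rule is framed as an assumption.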
Paying close attention to the standard deviation of a data set gives us insight into the spread of the data and an idea of how we should interpret the other statistics.
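Returning to the peanut butter and jelly example from the start of this section, the idea of recommending what other customers bought together can be sketched in a few lines of Python. This is a deliberately simplified stand-in (plain co-purchase counting), not Amazon's actual engine. The order data and the function names `build_co_purchase_counts` and `recommend` are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Made-up order histories: each set is one customer's basket.
orders = [
    {"peanut butter", "jelly", "bread"},
    {"peanut butter", "jelly"},
    {"peanut butter", "bread"},
    {"peanut butter", "jelly", "milk"},
    {"milk", "cereal"},
]


def build_co_purchase_counts(orders):
    """Count how often each pair of items appears in the same basket."""
    counts = {}
    for basket in orders:
        for item, other in permutations(basket, 2):
            counts.setdefault(item, Counter())[other] += 1
    return counts


def recommend(cart, counts, top_n=3):
    """Rank items most often bought alongside anything already in the cart."""
    scores = Counter()
    for item in cart:
        for other, n in counts.get(item, {}).items():
            if other not in cart:
                scores[other] += n
    return [item for item, _ in scores.most_common(top_n)]


counts = build_co_purchase_counts(orders)
suggestions = recommend({"peanut butter"}, counts)
print(suggestions)
```

Returning several ranked suggestions rather than a single item mirrors the point made earlier: the counts only say which purchase is most likely, not which one will happen.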
