Measures of Central Tendency
There are three main measures of central tendency: mean, median, and mode.
The mean is the arithmetic average of a set of values and is computed by summing all the values and dividing by the number of values.
The median is the middle value in a set of values. To find the median, you must first arrange the values in numerical order and then determine which value falls in the middle. If there is an odd number of values, the median is the middle value. If there is an even number of values, the median is the mean of the two middle values.
The mode is the value that occurs most frequently in a set of values. A set of values can have more than one mode or no mode.
Which measure of central tendency is most appropriate depends on the data set's characteristics and the question you are trying to answer.
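As a quick illustration, the three measures can be computed with Python's standard `statistics` module. This is a minimal sketch; the list `values` is made-up sample data, not taken from any real data set.

```python
import statistics

# Made-up sample data for illustration only
values = [2, 3, 3, 5, 7, 10]

mean_value = statistics.mean(values)      # sum of the values divided by their count
median_value = statistics.median(values)  # middle of the sorted values
mode_value = statistics.mode(values)      # most frequent value

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
```

On this sample the three measures differ (5, 4, and 3), which is why the choice of measure matters.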
What is the Mean (Arithmetic)?
The mean, also known as the arithmetic mean or average, is a measure of central tendency that represents the average value of a set of numbers. It is computed by adding all the values in a set and then dividing by the number of values. For example, the mean of the set {1, 3, 4, 7} is (1+3+4+7)/4 = 15/4 = 3.75.
The mean is often used to describe the central tendency of a set of continuous or numerical data, and it is sensitive to every value in the set. This means that the mean can be influenced by outliers or extreme values in the data. For example, if we add the value 100 to the set {1, 3, 4, 7}, the mean becomes (1+3+4+7+100)/5 = 115/5 = 23, which is significantly higher than the mean of the original set.
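The two calculations above can be reproduced with a small helper. This is a sketch; the function name `arithmetic_mean` is chosen here for illustration.

```python
def arithmetic_mean(values):
    """Sum the values and divide by how many there are."""
    if not values:
        raise ValueError("the mean of an empty collection is undefined")
    return sum(values) / len(values)


original = [1, 3, 4, 7]
with_outlier = original + [100]  # the same set with one extreme value added

print(arithmetic_mean(original))      # 3.75
print(arithmetic_mean(with_outlier))  # 23.0
```

A single extreme value moves the mean from 3.75 to 23, which is larger than four of the five values in the set.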
In summary, the mean is a measure of central tendency that is calculated by summing all the values in a set and dividing by the number of values. It is often used to describe the central tendency of continuous or numerical data and can be influenced by outliers or extreme values in the data.
When not to use the mean?
There are several situations in which the mean may not be an appropriate measure of central tendency:
The data is skewed: If the data is skewed, the mean may not accurately represent the central tendency of the data. For example, if the data has a long tail on one side, the mean will be influenced by the outliers in the tail and may not be a good representation of the majority of the data.
The data is ordinal: If the data is ordinal (i.e., it has a natural order but the intervals between values are not necessarily equal), the mean may not be meaningful. For example, if you are measuring satisfaction on a scale of 1 to 5, the mean may not be a good representation of the central tendency because the step from 1 to 2 does not necessarily represent the same change in satisfaction as the step from 4 to 5.
The data is categorical: If the data is categorical (i.e., it consists of categories rather than numerical values), the mean is not defined. For example, if you are collecting data on the favorite colors of a group of people, the mean is not a meaningful statistic because the data consists of categories (e.g., red, blue, green, etc.) rather than numerical values.
The data has missing values: If the data has missing values, the mean can only be computed from the values that were actually observed. If the missing values differ systematically from the observed ones, the resulting mean may be biased or misleading.
In summary, the mean may not be a suitable measure of central tendency when the data is skewed, ordinal, or categorical, or when it has missing values. In these cases, other measures of central tendency such as the median or mode may be more appropriate.
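Two of the situations above, skewed data and missing values, can be sketched in a few lines. The numbers are invented for illustration: `incomes` stands for hypothetical incomes in thousands, and `readings` uses `None` as an assumed marker for a missing value.

```python
import statistics

# Hypothetical incomes (in thousands) with a long right tail
incomes = [30, 32, 35, 38, 40, 45, 400]
skewed_mean = statistics.mean(incomes)      # pulled upward by the 400
skewed_median = statistics.median(incomes)  # stays with the bulk of the data

# Hypothetical readings where None marks a missing value
readings = [4.0, None, 5.0, 6.0, None]
observed = [x for x in readings if x is not None]
observed_mean = statistics.mean(observed)   # describes only the observed values

print(skewed_mean, skewed_median, observed_mean)
```

Here the mean of `incomes` is larger than every value except the outlier, while the median is 38. Dropping missing values, as done for `readings`, is only one possible choice, and the resulting mean describes the observed values alone.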
What is the Median?
The median is a measure of central tendency that represents the middle value in a set of numbers. It is the value that separates the higher half of the data from the lower half. To find the median of a set of values, you must first arrange the values in numerical order and then determine which value falls in the middle.
For example, consider the following set of values: {3, 7, 5, 2, 1, 6, 4}. To find the median, we first arrange the values in numerical order: {1, 2, 3, 4, 5, 6, 7}. The median is the middle value, which in this case is 4.
If there is an odd number of values in the set, the median is the middle value. If there is an even number of values, the median is the mean of the two middle values. For example, consider the following set of values: {3, 7, 5, 2, 1, 6}. To find the median, we again arrange the values in numerical order: {1, 2, 3, 5, 6, 7}. There are two middle values (3 and 5), so the median is the mean of these two values, which is (3+5)/2 = 4.
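The odd and even cases can be written out directly. This is a minimal sketch of the procedure described above, not a replacement for `statistics.median` in the standard library.

```python
def median(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)  # step 1: arrange in numerical order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty collection is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two middle values


print(median([3, 7, 5, 2, 1, 6, 4]))  # 4
print(median([3, 7, 5, 2, 1, 6]))     # 4.0
```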
The median is a useful measure of central tendency in situations where the data is skewed or has outliers because it is not influenced by extreme values. It is also a good choice for ordinal data because it preserves the order of the values.
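For ordinal data with an even number of values, averaging the two middle values can produce a number that is not on the scale. One way around this, sketched below, is to report the lower or upper middle value with `statistics.median_low` and `statistics.median_high`. The `ratings` list is hypothetical survey data on a 1 to 5 scale.

```python
import statistics

# Hypothetical satisfaction ratings on a 1-5 scale
ratings = [1, 2, 2, 4, 5, 5]

interpolated = statistics.median(ratings)     # mean of the two middle values
lower = statistics.median_low(ratings)        # lower of the two middle values
upper = statistics.median_high(ratings)       # upper of the two middle values

print(interpolated, lower, upper)  # 3.0 2 4
```

Nobody in this sample answered 3, so reporting 2 or 4 may be easier to interpret than the interpolated 3.0.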
When not to use the median?
There are a few situations when it may not be appropriate to use the median as a measure of central tendency:
The data is continuous: The median is well defined for continuous data (i.e., data that can take on any value within a range), but it depends only on the rank order of the values and ignores how far apart they are. If the data is roughly symmetric and free of outliers, the mean, which uses every value, may be a more informative measure of central tendency.
The data is normally distributed: If the data is normally distributed (i.e., it follows a bell-shaped curve), the median and mean will be similar, and either one could be used to represent the central tendency of the data. However, if the data is not normally distributed, the median may be a better choice because it is not influenced by extreme values.
The data is categorical: If the data is categorical (i.e., it consists of categories rather than numerical values), the median is not defined. For example, if you are collecting data on the favorite colors of a group of people, the median is not a meaningful statistic because the data consists of categories (e.g., red, blue, green, etc.) rather than numerical values.
In summary, the median may not be a suitable measure of central tendency in situations where the data is continuous, normally distributed, or categorical. In these cases, other measures of central tendency such as the mean or mode may be more appropriate.
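The point about normally distributed data can be checked with a small simulation. This is a sketch under assumed parameters: the seed, the sample size, and the distribution settings (mean 50, standard deviation 10, and an exponential distribution with mean 50 for contrast) are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Symmetric, bell-shaped sample
normal_sample = [random.gauss(50, 10) for _ in range(10_000)]
# Right-skewed sample for contrast
skewed_sample = [random.expovariate(1 / 50) for _ in range(10_000)]

normal_gap = abs(statistics.mean(normal_sample) - statistics.median(normal_sample))
skewed_gap = abs(statistics.mean(skewed_sample) - statistics.median(skewed_sample))

print(f"gap between mean and median (normal): {normal_gap:.3f}")
print(f"gap between mean and median (skewed): {skewed_gap:.3f}")
```

For the bell-shaped sample the two measures nearly coincide, while for the skewed sample the mean sits well above the median.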
What is the Mode?
The mode is a measure of central tendency that represents the most common value in a set of numbers. It is the value that occurs most frequently in the data.
For example, consider the following set of values: {1, 2, 2, 3, 3, 3, 4, 4, 5}. The mode of this set is 3 because it occurs more frequently than any other value.
A set of values can have more than one mode or no mode at all. If a set has two or more modes, it is called multimodal (bimodal when there are exactly two). A set has no mode when every value occurs equally often, for example when no value repeats; such data is sometimes described as uniform.
The mode is a useful measure of central tendency for categorical data or data that is not continuous (i.e., it takes on only a limited set of values). It can also be applied to data with a large number of unique values, because it depends only on how often each value occurs and not on the magnitude of the values, provided that some values repeat often enough for a clear peak to emerge.
However, the mode is not a good choice for continuous data, where exact values rarely repeat, or for small data sets in which no value occurs more than once, because it may not accurately represent the central tendency of the data. In these cases, other measures of central tendency such as the mean or median may be more appropriate.
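The single-mode and multiple-mode cases can be explored with `statistics.mode` and `statistics.multimode` (the latter requires Python 3.8 or later). The second and third lists are made-up examples.

```python
import statistics

single = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # the example set from the text
two_modes = [1, 1, 2, 2, 3]           # made-up set with two modes
no_repeats = [1, 2, 3, 4]             # made-up set where no value repeats

print(statistics.mode(single))            # 3
print(statistics.multimode(two_modes))    # [1, 2]
print(statistics.multimode(no_repeats))   # [1, 2, 3, 4]
```

Note that `multimode` returns every value when all of them occur equally often, so a result as long as the set of distinct values is a sign that the data has no meaningful mode.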
When to use the mode?
There are a few situations in which the mode is an appropriate measure of central tendency:
The data is categorical: If the data is categorical (i.e., it consists of categories rather than numerical values), the mode is a useful measure of central tendency because it represents the most common category. For example, if you are collecting data on the favorite colors of a group of people, the mode would be the most common favorite color.
The data is not continuous: If the data is not continuous (i.e., it takes on only a limited set of values), the mode is a good choice because it represents the most common value. For example, if you are collecting data on the number of children in a household, the mode would be the most common number of children.
The data has a large number of unique values: If the data has a large number of unique values, the mode may still be a good choice, as long as some values repeat, because it is not influenced by the magnitude of the values. For example, if you are collecting data on the ages (in whole years) of a group of people, the mode would be the most common age.
You are interested in the most common value: If you are interested in the value that occurs most frequently in the data, the mode is a good choice. For example, if you are collecting data on the types of cars in a city, the mode would be the most common type of car.
In summary, the mode is a useful measure of central tendency for categorical data, data that is not continuous, data with a large number of unique values, or when you are interested in the most common value in the data.
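The categorical and discrete cases above can be sketched with `collections.Counter` and `statistics.mode`. Both lists are invented survey responses used only for illustration.

```python
from collections import Counter
import statistics

# Hypothetical survey responses (categorical data)
favorite_colors = ["red", "blue", "green", "blue", "red", "blue"]
color_counts = Counter(favorite_colors)
most_common_color, count = color_counts.most_common(1)[0]

# Hypothetical number of children per household (discrete data)
children_per_household = [0, 1, 2, 2, 3, 2, 1]
typical_children = statistics.mode(children_per_household)

print(most_common_color, count)  # blue 3
print(typical_children)          # 2
```

`Counter` also exposes the full frequency table, which helps when you want to see how close the runner-up categories are to the mode.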
Here is an example of a graph that illustrates the difference between the mean, median, and mode of a set of values:
In this graph, the mean is represented by the horizontal line, the median is represented by the vertical line, and the mode is represented by the highest peak.
The mean is the arithmetic average of the values and is computed by summing all the values and dividing by the number of values. It is represented by a horizontal line on the graph.
The median is the middle value in a set of values and is represented by a vertical line on the graph.
The mode is the value that occurs most frequently in a set of values and is represented by the highest peak on the graph.
The mean, median, and mode are all measures of central tendency that describe the center or typical value of a set of numbers. They are used to summarize and analyze data, and each one is appropriate in different situations. The mean is an average value and is sensitive to every value in the set, the median is the middle value and is not influenced by extreme values, and the mode is the most common value and is a good choice for categorical data or data with a large number of unique values.