For Statistics Beginners - Types of data

Question: A supermarket collects its cardholders' monthly purchase data. Which of the following is a quantitative variable?

  1. Number of items purchased
  2. Purchasing store name
  3. Day of purchase

Data types are an important concept in statistics; you need to understand them to apply statistical measures to your data correctly. There are various kinds of data. Because the appropriate graph and method of analysis differ for each type, it is very important to understand what features each type has.

The quantitative data (quantitative variables)

Quantitative data is data that can be counted or expressed numerically. It is commonly used to ask “how much” or “how many” and can be used to study events or levels of occurrence. Because it is numerical in nature, quantitative data is both definitive and objective. It also lends itself to statistical analysis and mathematical computation, and is therefore typically illustrated in charts or graphs.

There are two main types of quantitative data: discrete and continuous. Discrete data can take only separate, countable values. For example, if a teacher gives an exam that has 100 questions, the exam scores reflect the number of answers that were correct out of the 100 possible questions. Discrete data may also be defined as data where there is space between values on a number line, so the values are typically whole numbers. For example, if a study examined the number of vehicles owned by households in America, the data collected would be whole numbers.

Continuous data is defined as data where the values fall on a continuum and it is possible to have fractions or decimals. Continuous data is usually a physical measurement. Examples may include measurements of height, age, or distance.

  • Continuous data: Data that varies without interruption, such as height, time, or temperature, and that can be measured to any degree of fineness. Example) The next value after 175.0 cm could be 175.000...001 cm.
  • Discrete data (discontinuous data): Data that cannot be subdivided in this way, such as a number of people or a number of times. Example) When counting people, the next value after one person is two people, not 1.00...001 people.
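As a rough illustration, the variables from the opening question can be sorted by how they would typically be stored. The sketch below is a heuristic only, not a statistical rule: the record, its field names, and the `rough_data_type` helper are all hypothetical and were made up for this example.

```python
# A hypothetical purchase record; field names and values are illustrative only.
record = {
    "items_purchased": 12,          # a count
    "basket_weight_kg": 7.35,       # a measurement
    "store_name": "Central",        # a label
    "day_of_purchase": "Saturday",  # a label
}


def rough_data_type(value):
    """Heuristic only: guess the statistical data type from the Python type."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so handle it first.
        return "qualitative"
    if isinstance(value, int):
        return "quantitative (discrete)"
    if isinstance(value, float):
        return "quantitative (continuous)"
    return "qualitative"


types = {name: rough_data_type(value) for name, value in record.items()}
for name, kind in types.items():
    print(f"{name}: {kind}")
```

The heuristic breaks down as soon as labels are stored as numbers (a store ID such as 17 is still only a label), so the storage type should be treated as a hint and not as the data type itself.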

The qualitative data (qualitative variables)

Qualitative variables are those whose data is indicated by category. As the name implies, the data differ in "quality" rather than in quantity. Examples include favorite color, room layout, sex, and name. Since these are not numerical data, they cannot be used in calculations as they are; special measures are required first. Favorite sport, blood type, car number, and so on serve only to distinguish categories and types.
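One common "special measure" is one-hot (dummy) encoding, which turns each category into its own 0/1 column. The original text does not say which measure it has in mind, so the following is just one possible sketch; the sample answers and the `one_hot` helper are hypothetical.

```python
# Hypothetical survey answers: blood type is a qualitative (categorical) variable.
blood_types = ["A", "O", "B", "A", "AB", "O"]

# Fix a category order so every answer is encoded the same way.
categories = sorted(set(blood_types))


def one_hot(value, categories):
    """Return a list with 1 at the position of `value` and 0 elsewhere."""
    return [1 if value == c else 0 for c in categories]


encoded = [one_hot(v, categories) for v in blood_types]

# Summing each 0/1 column gives the number of answers per category,
# which is a calculation that makes sense for qualitative data.
counts = {c: sum(row[i] for row in encoded) for i, c in enumerate(categories)}
print(counts)
```

The 0/1 columns carry no order or magnitude, so they avoid suggesting that one blood type is "larger" than another, which a single numeric code would do.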

The flow data (flow)

Flow data shows the amount of change over a certain period of time. Example) The amount of water flowing into a tub minus the amount of water flowing out of it (liters per minute).
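The tub example can be made concrete with a small sketch. It assumes hypothetical per-minute readings and an initially empty tub; the net flow is the change during each minute, and adding up the net flows gives the amount held in the tub at each point in time, which is the stock.

```python
from itertools import accumulate

# Hypothetical readings, in liters per minute, for five consecutive minutes.
inflow = [10, 10, 8, 0, 0]
outflow = [2, 2, 2, 2, 2]

# Flow: the net change during each one-minute period.
net_flow = [i - o for i, o in zip(inflow, outflow)]

# Stock: the amount in the tub at the end of each minute.
# Assumption: the tub starts empty.
initial_stock = 0
stock = list(accumulate(net_flow, initial=initial_stock))[1:]

print("net flow per minute:", net_flow)
print("stock at end of each minute:", stock)
```

A flow always refers to a period ("during minute 3"), while a stock always refers to an instant ("at the end of minute 3"), which is why the two should not be plotted or summed as if they were the same kind of quantity.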

The stock data (stock)

Stock data shows the amount accumulated at a certain point in time. Example) The amount of water accumulated in a tub (liters at 1:00 pm).

Qualitative data and quantitative data are each further divided into two scales. First, qualitative data is classified into the "nominal scale" and the "ordinal scale".

The nominal scale

It is a scale whose values serve only to distinguish categories. For example, coding men as 1 and women as 2 merely labels the two groups; the order of the choices has no meaning.

The ordinal scale

It is a scale in which the order of the choices has meaning, such as (1 very good, 2 somewhat good, 3 somewhat bad, 4 very bad). The median is a meaningful measure for this scale.

The scale level

Based on the nature of the information, data can be roughly divided into two types, quantitative and qualitative, and these can be further classified into four scale levels: nominal, ordinal, interval, and ratio.

The proportional scale (ratio scale)

On a proportional scale, zero means "nothing"; that is, zero has a special meaning as a base point (origin). Example) Elapsed time, speed, height, weight, blood pressure, etc. Data on this scale is a quantitative variable. Because ratios are meaningful, all four arithmetic operations (addition, subtraction, multiplication, and division) are possible, so most statistics are meaningful and many analysis methods can be used.

Nominal scale

Nominal scales are used for labeling variables, without any quantitative value. “Nominal” scales could simply be called “labels.” Typical examples, such as blood type or favorite color, have categories that are mutually exclusive (no overlap), and none of them has any numerical significance. A good way to remember all of this is that “nominal” sounds a lot like “name,” and nominal scales are kind of like “names” or labels.

Note: a sub-type of nominal scale with only two categories (e.g. male/female) is called “dichotomous.” If you are a student, you can use that to impress your teacher.

Ordinal scale

With ordinal scales, it is the order of the values that is important and significant, but the differences between them are not really known. Take a survey whose answers are ranked, for example from “Unhappy” through “OK” and “Happy” to “Very Happy.” We know that a #4 is better than a #3 or a #2, but we don’t know, and cannot quantify, how much better it is. For example, is the difference between “OK” and “Unhappy” the same as the difference between “Very Happy” and “Happy?” We can’t say.

Ordinal scales are typically measures of non-numeric concepts like satisfaction, happiness, discomfort, etc.

“Ordinal” is easy to remember because it sounds like “order,” and that’s the key thing to remember about ordinal scales: it is the order that matters, but that’s all you really get from these.

Advanced note: The best way to determine central tendency on a set of ordinal data is to use the mode or median; the mean cannot be defined from an ordinal set.
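To see what this looks like in practice, here is a minimal sketch that finds the median and mode of a set of ordinal answers. The response list and the 1 to 4 coding are hypothetical. `statistics.median_low` is used because it always returns one of the observed values, so the result maps back to a real category and no averaging of ranks is needed.

```python
from statistics import median_low, mode

# Hypothetical ordered scale, from worst to best.
levels = ["Unhappy", "OK", "Happy", "Very Happy"]
rank = {label: i for i, label in enumerate(levels, start=1)}

# Hypothetical survey responses.
responses = ["Happy", "OK", "Very Happy", "Happy", "Unhappy", "OK", "Happy"]

# The ranks are used only for ordering, never for arithmetic.
ranks = sorted(rank[r] for r in responses)

median_label = levels[median_low(ranks) - 1]
mode_label = mode(responses)

print("median:", median_label)
print("mode:", mode_label)
```

Computing a mean of the ranks would silently assume that the step from “Unhappy” to “OK” is the same size as the step from “Happy” to “Very Happy,” which is exactly what an ordinal scale does not tell us.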

Interval scales

Interval scales are numeric scales in which we know not only the order, but also the exact differences between the values. The classic example of an interval scale is Celsius temperature because the difference between each value is the same. For example, the difference between 60 and 50 degrees is a measurable 10 degrees, as is the difference between 80 and 70 degrees. Time of day on a clock is another good example of an interval scale in which the increments are known, consistent, and measurable.

Interval scales are nice because the realm of statistical analysis on these data sets opens up. For example, central tendency can be measured by mode, median, or mean; standard deviation can also be calculated.

Like the others, you can remember the key points of an “interval scale” pretty easily. “Interval” itself means “space in between,” which is the important thing to remember–interval scales not only tell us about order, but also about the value between each item.

Here’s the problem with interval scales: they don’t have a “true zero.” For example, 0 degrees Celsius does not mean “no temperature.” Without a true zero, it is impossible to compute ratios. With interval data, we can add and subtract, but cannot meaningfully multiply or divide. Confused? OK, consider this: 10 degrees + 10 degrees = 20 degrees. No problem there. 20 degrees is not twice as hot as 10 degrees, however, because zero on the Celsius scale does not mean “no temperature.” I hope that makes sense. Bottom line, interval scales are great, but we cannot calculate ratios, which brings us to our last measurement scale…
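A quick way to check the "twice as hot" claim is to convert the same two temperatures to a scale that does have a true zero. The sketch below uses the Kelvin scale, which is not mentioned in the text above and is brought in here only for illustration (K = °C + 273.15); the two temperature values are arbitrary.

```python
def celsius_to_kelvin(celsius):
    """Convert degrees Celsius to kelvins (Kelvin has a true zero)."""
    return celsius + 273.15


a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences survive the change of scale: 10 degrees either way.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios do not: the "2x" in Celsius comes only from where zero was placed.
ratio_c = b_c / a_c
ratio_k = b_k / a_k

print(f"difference: {diff_c:.2f} C vs {diff_k:.2f} K")
print(f"ratio: {ratio_c:.3f} in Celsius vs {ratio_k:.3f} in Kelvin")
```

The difference is the same on both scales, while the ratio drops from 2 to roughly 1.04, which is why differences are meaningful on an interval scale and ratios are not.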

Ratio scales

Ratio scales are the ultimate nirvana when it comes to measurement scales because they tell us about the order, they tell us the exact value between units, AND they also have an absolute zero, which allows for a wide range of both descriptive and inferential statistics to be applied. At the risk of repeating myself, everything above about interval data applies to ratio scales, and in addition ratio scales have a clear definition of zero. Good examples of ratio variables include height and weight.

Ratio scales provide a wealth of possibilities when it comes to statistical analysis. These variables can be meaningfully added, subtracted, multiplied, and divided (ratios). Central tendency can be measured by mode, median, or mean; measures of dispersion, such as standard deviation and coefficient of variation, can also be calculated from ratio scales.
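As a small worked example, the statistics named above can be computed with Python's standard `statistics` module. The height values are made up for illustration, and the sample standard deviation (`stdev`) is an assumption; a population standard deviation (`pstdev`) would be the choice if the data covered the whole population.

```python
from statistics import mean, median, stdev

# Hypothetical heights in centimeters (a ratio-scale variable).
heights_cm = [160.0, 165.0, 170.0, 175.0, 180.0]

avg = mean(heights_cm)
mid = median(heights_cm)
sd = stdev(heights_cm)  # sample standard deviation

# Coefficient of variation: dispersion relative to the mean.
# It only makes sense when zero is a true zero, as it is for height.
cv = sd / avg

# Ratios between values are meaningful on a ratio scale.
ratio = heights_cm[-1] / heights_cm[0]

print(f"mean={avg:.1f} median={mid:.1f} sd={sd:.2f} cv={cv:.3f}")
print(f"tallest / shortest = {ratio:.3f}")
```

Here the tallest value really is 1.125 times the shortest, a statement that could not be made about two Celsius temperatures.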

