Statistics Basic-Data Classification |Statistical Data Types-(Ultralearning_ML_2)

There are only 4 types of data in statistics => Nominal, Ordinal, Interval, Ratio

What is the difference between Quantitative and Qualitative Data?

Ultimately there are two classes of data in statistics that can be further sub-divided into 4 statistical data types. You may have heard phrases such as 'Ordinal Data', 'Nominal Data', 'Discrete Data' and so on.

What are they, and what have they got to do with data?

Maybe they are all just fancy words made up by mathematicians and statisticians to make them sound important?

Well, actually they are pretty important, because if you know what types of data you have, then you know which Maths and Stats operations you're allowed to use on your data.

Quantitative and Qualitative Data: Points to Remember

  • Quantitative Data is measured; Qualitative Data is categorised

Quantitative Data:

  • Interval Data => Properties: Measured, Ordered, Equidistance, Meaningful Zero(no)
  • Ratio Data => Properties: Measured, Ordered, Equidistance, Meaningful Zero

Qualitative Data:

  • Nominal Data => Properties: Measured(no), Ordered(no), Equidistance(no), Meaningful Zero(no)
  • Ordinal Data => Properties: Measured(no), Ordered, Equidistance(no), Meaningful Zero(no)

Types of Data in Statistics - Lot of Confusion:

In Data Analysis and Statistics you'll come across lots of different ways to refer to different types of data, and it can get really confusing at times. See:

Quantitative Data, Qualitative Data, Numerical Data, Categorical Data, Discrete Data, Continuous Data, Ratio Data, Interval Data, Ordinal Data, Nominal Data, Dichotomous Data

I will try to clear up the confusion by explaining each of these types of data.

Let's start at the beginning, with the highest level of data classification: Quantitative Data and Qualitative Data.

What Are Quantitative and Qualitative Data Types in Statistics?

Quantitative or Numerical Data => data that can be measured, like age, dates, distance, IQ, weight

Qualitative or Categorical Data => observed data that are placed into categories rather than measured, like Socioeconomic status (Lower class, Middle class, Upper class) or Opinion (Agree, Neutral, Disagree)

What are the types of Quantitative Data?

  1. Discrete Data: => information that can only take certain values and can't be made more precise. Like the numbers on a die: a roll can't be 1.2, it is always a whole number from 1 to 6
  2. Continuous Data: => can take any value, usually within certain limits, and can be divided into finer and finer parts. Your height (say 6.7 ft) is continuous data, as it can be measured in metres and fractions of metres (centimetres, millimetres, nanometres)

How one can Tell if Data is Quantitative or Qualitative?

Simply put:

  • Quantitative Data is measured
  • Qualitative Data is observed and placed into categories

In practical terms:

  • When data is text-based, such as Types of Animals [Sheep, Cow, Ox], consider it Qualitative Data
  • When data is number-based, like Length (2.13 metres or 5.72 miles), consider it Quantitative Data

One Exception to this:

  • When categories have been numbered for practical purposes, such as Types of Animals [1, 2, 3] instead of [Sheep, Cow, Ox]

In the above case the numbers must be treated as names of categories. Keep in mind that you're not allowed to do any arithmetic with them
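
As a rough illustration of this exception, here is a minimal Python sketch. The coding scheme `ANIMAL_LABELS` and the list of observations are made up for the example; the point is only that the codes are decoded back to names and counted, never averaged.

```python
from collections import Counter

# Hypothetical coding scheme: the numbers are names, not quantities.
ANIMAL_LABELS = {1: "Sheep", 2: "Cow", 3: "Ox"}

# Made-up observations recorded as codes.
codes = [1, 3, 2, 1, 1, 3]

# Decode first, so nobody is tempted to do arithmetic on the codes.
animals = [ANIMAL_LABELS[code] for code in codes]

# Counting per category is the kind of operation that still makes sense.
counts = Counter(animals)
print(counts)

# This number can be computed, but it does not describe an "average animal".
meaningless_mean = sum(codes) / len(codes)
```

Decoding to labels before analysis is one simple way to make sure the numbers are never mistaken for measurements.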

Still not clear how to tell if data is Quantitative or Qualitative? Then ask yourself these questions:

  • Have the data been measured? If yes, then Quantitative Data
  • Were the data observed and placed into categories? If yes, then Qualitative Data

(if quantitative): Meaningful Zero - Does the scale of measurement include a unique, non-arbitrary zero value? (answer: Yes = Ratio Data, No = Interval Data), else...

(if qualitative): Order - Can some sort of progression be detected between adjacent data points or categories? Can the data be ordered meaningfully? (answer: Yes = Ordinal Data, No = Nominal Data)

If you have identified that your variable is Discrete Data, it is still Quantitative Data, so remind yourself of the questions above and ask yourself whether your data have a true zero. If they do, they are Ratio Data, and if not, they are Interval Data
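
The questions above can be sketched as a small decision function. This is only an illustration of the reasoning; the function name `classify_data_type` and its parameters are invented for this example.

```python
def classify_data_type(measured: bool,
                       meaningful_zero: bool = False,
                       ordered: bool = False) -> str:
    """Return one of the 4 statistical data types.

    measured        -- True if the data were measured (quantitative),
                       False if observed and categorised (qualitative)
    meaningful_zero -- only consulted for quantitative data
    ordered         -- only consulted for qualitative data
    """
    if measured:
        # Quantitative branch: a true zero separates Ratio from Interval.
        return "Ratio" if meaningful_zero else "Interval"
    # Qualitative branch: a meaningful order separates Ordinal from Nominal.
    return "Ordinal" if ordered else "Nominal"


print(classify_data_type(measured=True, meaningful_zero=True))   # weight
print(classify_data_type(measured=True, meaningful_zero=False))  # temperature in °C
print(classify_data_type(measured=False, ordered=True))          # opinion scale
print(classify_data_type(measured=False, ordered=False))         # favourite colour
```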

What Type of Data is Nominal Data?

'nominal' (from the Latin nomen, meaning 'name')

  • Nominal Data => Properties: Measured(no), Ordered(no), Equidistance(no), Meaningful Zero(no)

Examples:

Nationality (Indian, American, British), Genre/Style (Classical, Jazz, Rock, Hip-Hop), Favourite Colour (red, black, blue). These categories have no order.

What Can You Calculate With Nominal Data?

Nominal Data have no order, so you can't sort them meaningfully. Neither can you apply operations like Difference (add, subtract) or Magnitude (multiply, divide), because these are reserved for numerical data.

The only logical operation you can perform on Nominal Data is Grouping based on (equality | inequality), to see whether items are the same or different.

What Descriptive Statistics You Can Do With Nominal Data?

One can calculate following:

  • Frequencies - count how many you have in each category
  • Proportions - determine how often something happens by dividing the frequency by the total number of events
  • Percentages - transform proportions to percentages by multiplying by 100
  • Central Point - you can determine the most common item by finding the mode

Other ways of finding the middle of the class such as Median or Mean make no sense because ranking is meaningless for Nominal Data.
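
As a minimal sketch of these four calculations, the example below uses only Python's standard library. The list of genres is made-up sample data.

```python
from collections import Counter
from statistics import mode

# Made-up nominal observations (music genres have no order).
genres = ["Rock", "Jazz", "Rock", "Classical",
          "Hip-Hop", "Rock", "Jazz", "Rock"]

# Frequencies: how many observations fall in each category.
frequencies = Counter(genres)

# Proportions: frequency divided by the total number of observations.
total = len(genres)
proportions = {genre: n / total for genre, n in frequencies.items()}

# Percentages: proportions multiplied by 100.
percentages = {genre: 100 * p for genre, p in proportions.items()}

# Central point: the mode is the only average that makes sense here.
most_common = mode(genres)

print(frequencies)
print(percentages)
print(most_common)
```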

If you want to know more about what you can do with Nominal Data, including Descriptive Statistics, Data Visualisations, Creating Dummy Variables and which statistical tests you can use with them, I will write one more article or post on this in future.

What Type of Data is Ordinal Data?

It is a Statistical Data Type with the following characteristics:

Ordered, cannot be Measured, not Equidistant, no Meaningful Zero

Example: Opinion (agree, mostly agree, neutral, mostly disagree, disagree), Time of day (morning, noon, night)

What Can You Calculate With Ordinal Data?

Ordinal Data can be Grouped (same | different) and Sorted (greater | smaller than).

Difference (add | subtract) and Magnitude (multiply | divide) are not possible.

What Descriptive Statistics You Can Do With Ordinal Data?

You can calculate precisely the same things as with Nominal Data, with a couple of extra things:

  • Frequencies - count how many you have in each category
  • Proportions - determine how often something happens by dividing the frequency by the total number of events
  • Percentages - transform proportions to percentages by multiplying by 100
  • Central Point - since there is an order to the data you can rank them and compute the median (or mode, but not mean) to find the central value
  • Summary Statistics - as the data are ordered, you can use percentiles and inter-quartile range to summarise your data
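
Here is one possible sketch of sorting ordinal data and finding its median. The category order `OPINION_ORDER` and the responses are made up, and `median_low` is used (an assumption of this example, not the only option) so that the result is always one of the actual categories.

```python
from statistics import median_low

# The order of the categories has to be stated explicitly;
# alphabetical sorting would give the wrong answer.
OPINION_ORDER = ["disagree", "mostly disagree", "neutral",
                 "mostly agree", "agree"]
RANK = {label: position for position, label in enumerate(OPINION_ORDER)}

# Made-up survey responses.
responses = ["agree", "neutral", "disagree", "mostly agree",
             "neutral", "agree", "mostly disagree"]

# Sorting is allowed because the categories have a meaningful order.
sorted_responses = sorted(responses, key=lambda label: RANK[label])

# Median: take the middle rank and map it back to a category.
# The ranks are used only for their order, never added or averaged.
median_label = OPINION_ORDER[median_low(RANK[r] for r in responses)]

print(sorted_responses)
print(median_label)
```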

What Else Can You Do With Ordinal Data?

If you need to know more about Ordinal Data and what you can do with them, including descriptive statistics, data visualisations, creating dummy variables and which statistical tests you can use with them, I will write one more article or post in future.

What is the Difference Between Ordinal and Nominal Data?

  • Both are Qualitative Data
  • Nominal Data can only be Classified - arranged into classes or categories
  • Ordinal Data can be Classified and Ordered

One assumption of Ordinal Data is that although the categories are ordered, they do not have equal intervals.

What Type of Data is Interval Data?

It has the following characteristics:

Ordered, can be Continuous (have an infinite number of steps) or Discrete (restricted to certain values), and the degree of difference between items is meaningful (their intervals are equal) but not their ratio.

The key point of an Interval scale is that the word 'interval' means 'space in between', which is the important thing to remember: interval scales tell us not only about order but also about the size of the gap between items

  • Examples: Temperature (°C or °F, but not Kelvin), Dates (2021, 2022, 1776, etc), Time on a 12 hour clock (7am, 7pm)

Crucially, Interval data can be negative, whereas Ratio data generally cannot.

Interval Data can appear very similar to Ratio Data; the difference is in their defined zero-points.

  • If the zero-point of the scale has been chosen arbitrarily (such as the freezing point of water, or an arbitrary epoch such as AD) then the data cannot be Ratio Data and must be Interval Data

What Can You Calculate With Interval Data?

You can compare degrees of data (equality | inequality, more | less) and you can also add | subtract values

Example: with Interval Data you can say things such as '30°C is 10 degrees hotter than 20°C' (30 - 20 = 10) or '6pm is 4 hours later than 2pm' (2 + 4 = 6)

However, you cannot multiply or divide the numbers because the zero is arbitrary, so you can't say 'a person with an IQ of 200 is 2 times as smart as a person with an IQ of 100' or '6pm is twice as late as 3pm'
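
To see why the arbitrary zero matters, here is a small sketch using the temperature example above. It converts the same two temperatures from °C to °F with the standard formula: the difference survives the change of scale (up to the constant factor 9/5), but the ratio does not. The helper name `celsius_to_fahrenheit` is just for this example.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Standard conversion: F = C * 9/5 + 32."""
    return celsius * 9 / 5 + 32


warm_c, cool_c = 30.0, 20.0
warm_f = celsius_to_fahrenheit(warm_c)
cool_f = celsius_to_fahrenheit(cool_c)

# Differences are meaningful: 10 degrees C is always 18 degrees F.
difference_c = warm_c - cool_c
difference_f = warm_f - cool_f

# Ratios are not: the answer depends on where the scale puts its zero.
ratio_c = warm_c / cool_c
ratio_f = warm_f / cool_f

print(difference_c, difference_f)
print(ratio_c, ratio_f)
```

Because the two ratios disagree, a statement like '30°C is 1.5 times as hot as 20°C' says something about the scale, not about the temperatures.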

What Descriptive Statistics Can You Do With Interval Data?

Interval Data are Quantitative Data (numerical data), so the descriptive statistics for them are very different to those for Qualitative Data.

The central value of Interval Data is typically the mean (but could be the median or mode). You can also express the spread or variability of the data using measures such as range, standard deviation, variance and/or confidence interval.

The Descriptive Statistics you can calculate for Interval Data are:

  • Central Point - Mean, Median or Mode
  • Range - Minimum and maximum
  • Spread - Percentiles, Inter-Quartile Range and Standard Deviation
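
The statistics listed above can be computed with Python's built-in `statistics` module. The sketch below uses made-up temperatures in °C; note that `quantiles` uses its default method here, and other tools may compute quartiles slightly differently.

```python
from statistics import mean, median, mode, quantiles, stdev

# Made-up daily temperatures in °C (an interval scale, so negatives are fine).
temperatures_c = [18.0, 21.5, 19.0, 23.0, 21.5, -2.0, 20.0]

# Central point: mean, median or mode.
central = {
    "mean": mean(temperatures_c),
    "median": median(temperatures_c),
    "mode": mode(temperatures_c),
}

# Range: minimum and maximum.
lowest, highest = min(temperatures_c), max(temperatures_c)
value_range = highest - lowest

# Spread: quartiles, inter-quartile range and sample standard deviation.
q1, q2, q3 = quantiles(temperatures_c, n=4)
iqr = q3 - q1
spread = stdev(temperatures_c)

print(central)
print(lowest, highest, value_range)
print(q1, q3, iqr, spread)
```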

What Else Can You Do With Interval Data?

If you need to know more about Interval Data and what you can do with them, including descriptive statistics, data visualisations, creating dummy variables and which statistical tests you can use with them, I will write one more article or post in future.

What Type of Data is Ratio Data?

It has the following characteristics:

Can be Measured, Ordered, is Equidistant, has a Meaningful Zero

Interval Data can take negative values, whereas Ratio data cannot be negative.

Examples:

Age (from 0 years to 100+), Temperature (in Kelvin, but not °C or °F), Time interval (measured with a stop-watch or similar)

For each of these examples of Ratio Data there is a real meaningful zero-point. Age of a person, absolute zero, distance measured from a pre-determined point or time all have real zeros.

What Can You Calculate With Ratio Data?

You can compare the data (equal or not), you can sort the values (greater | less than), add | subtract them and you can also (multiply | divide) them.
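
A quick way to convince yourself that multiplying and dividing is meaningful for Ratio Data: the ratio of two values does not change when you change the unit. The sketch below uses made-up stop-watch readings and converts them from seconds to minutes.

```python
import math

SECONDS_PER_MINUTE = 60

# Made-up stop-watch readings, in seconds.
long_run_s, short_run_s = 120.0, 60.0

# The same two time intervals expressed in minutes.
long_run_min = long_run_s / SECONDS_PER_MINUTE
short_run_min = short_run_s / SECONDS_PER_MINUTE

ratio_in_seconds = long_run_s / short_run_s
ratio_in_minutes = long_run_min / short_run_min

# With a true zero, "twice as long" holds whatever unit is used.
same_ratio = math.isclose(ratio_in_seconds, ratio_in_minutes)

print(ratio_in_seconds, ratio_in_minutes, same_ratio)
```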

What Descriptive Statistics Can You Do With Ratio Data?

With Ratio data you can calculate precisely the same things as you can with Interval data. That is:

  • Central Point - Mean, Median or Mode
  • Range - Minimum and maximum
  • Spread - Percentiles, Inter-Quartile Range and Standard Deviation

What Else Can You Do With Ratio Data?

If you need to know more about Ratio Data and what you can do with them, including descriptive statistics, data visualisations, creating dummy variables and which statistical tests you can use with them, I will write one more article or post in future.


What is Difference Between Ratio Data and Interval Data?

Both Ratio Data and Interval Data are Quantitative Data (numerical data)

The only difference between them is that while both Ratio Data and Interval Data have equal spacing between adjacent values (so you can add and subtract their values), only Ratio Data have a true zero.

Because of that true zero, Ratio Data can be multiplied and divided, but Interval Data cannot. Interval Data can also take negative values, whereas Ratio Data generally cannot.
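
To pull the four data types together, the operations permitted on each can be written down as a small lookup table. This is only a summary sketch of the rules described in this article; the names `PERMITTED_OPERATIONS` and `is_permitted` are invented for the example.

```python
# Each data type allows everything the previous one does, plus one more thing.
PERMITTED_OPERATIONS = {
    "Nominal": {"equality"},
    "Ordinal": {"equality", "ordering"},
    "Interval": {"equality", "ordering", "add/subtract"},
    "Ratio": {"equality", "ordering", "add/subtract", "multiply/divide"},
}


def is_permitted(data_type: str, operation: str) -> bool:
    """Return True if the operation is meaningful for the given data type."""
    return operation in PERMITTED_OPERATIONS[data_type]


print(is_permitted("Ordinal", "ordering"))          # sorting opinions
print(is_permitted("Interval", "multiply/divide"))  # "twice as hot" in °C
print(is_permitted("Ratio", "multiply/divide"))     # "twice as old"
```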
