
Population vs sample

Jesper Martinsson

From Oceans to Dashboards: Marine Ecologist | Data Wrangler | BI Leader

Published: June 6, 2023

Population and sample are two fundamental concepts of statistical theory. In every statistical test, you deal with at least one population and an associated sample.

Before you even think about collecting data, you need to define the population(s) involved in the test. A population is the group you want to make generalizations about. You want to make some sort of statement about this group, such as: “Oak trees are on average x meters tall”. When performing a statistical test, you often want to see whether there is a difference between two potential populations, such as: “Oak trees in area x are on average taller than oak trees in area y”. If the test detects no difference, you have no evidence that the two areas are separate populations, and it is reasonable to treat the heights of all the oak trees as belonging to the same population. You need to be sure about which group you are actually making generalizations about.

In practice, it is impossible to gather information about the heights of all oak trees in the world, or even in Sweden. Therefore, you need to collect a subset of all the heights in the population. This is called a sample. The sample is a random subset that represents the population. The sample needs to be random; otherwise it does not really represent the population. Let’s say you are interested in describing the height of all oak trees in the world. But since you don't like to fly, you only collect random subsets from nearby countries that you can reach by train. The problem with this study is that the samples are not randomly drawn from the population of all oak trees in the world. In fact, the samples represent the heights of oak trees in a small part of Scandinavia.
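
The train example can be illustrated with a small simulation. This is only a sketch with made-up numbers: the four regions, their mean heights, and the sample size of 200 are all assumptions chosen for illustration, not real oak tree data. It compares a sample drawn at random from the whole population with a sample drawn at random from the nearby region only.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: 2,500 oak trees in each of four made-up regions,
# each region with a different typical height in meters (assumed values).
region_means = {"nearby": 18.0, "region_b": 22.0, "region_c": 26.0, "region_d": 30.0}
population = [
    (region, random.gauss(mean, 3.0))
    for region, mean in region_means.items()
    for _ in range(2500)
]

all_heights = [height for _, height in population]
population_mean = statistics.mean(all_heights)

# Random sample: every tree in the population has the same chance of selection.
random_sample = random.sample(all_heights, 200)

# Convenience sample: still drawn at random, but only from the nearby region.
nearby_heights = [height for region, height in population if region == "nearby"]
convenience_sample = random.sample(nearby_heights, 200)

random_error = abs(statistics.mean(random_sample) - population_mean)
convenience_error = abs(statistics.mean(convenience_sample) - population_mean)

print(f"Population mean:          {population_mean:.2f} m")
print(f"Random sample mean:       {statistics.mean(random_sample):.2f} m")
print(f"Convenience sample mean:  {statistics.mean(convenience_sample):.2f} m")
```

Under these assumptions, the convenience sample describes the nearby region well but lands far from the mean of the whole population, while the random sample lands close to it.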

When you are working with a dataset, it is important to know whether you are dealing with a population or a sample. In most cases the data come from a sample, but sometimes it is actually possible to collect data from the entire population. The equations used to describe a population differ depending on whether you have observations from all the units in the population or from a random sample.
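
A common example of such a difference is the variance, offered here as an illustration rather than as the specific equations the article has in mind. With observations from the whole population, the sum of squared deviations is divided by n. With a sample, it is divided by n - 1. The sketch below uses Python's standard `statistics` module and five made-up heights.

```python
import statistics

# Five made-up oak heights in meters (illustrative values only).
heights = [21.0, 23.5, 19.0, 24.5, 22.0]

# If these five trees ARE the entire population, divide by n.
population_variance = statistics.pvariance(heights)

# If they are a random sample from a larger population, divide by n - 1.
sample_variance = statistics.variance(heights)

print(f"Treated as a population: {population_variance:.3f}")  # 3.700
print(f"Treated as a sample:     {sample_variance:.3f}")      # 4.625
```

The mean is computed the same way in both cases. The spread is where the two formulas differ, and the sample version is always slightly larger for the same data.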

To wrap up:

Be sure to define the population that you want to make generalizations about.
The sample of a population needs to be representative, which means it has to be randomly drawn from the population.
Be sure that you know whether your data is from the entire population or a sample.

Ilaf Hashim

Business Intelligence Developer at Voyado

1 year ago

Love it. So pedagogically explained!
