Hypothesis testing

Jesper Martinsson

From Oceans to Dashboards: Marine Ecologist | Data Wrangler | BI Leader

Published: June 13, 2023

In my opinion, hypothesis testing is analogous to the scientific method. It follows a logical structure that provides an objective procedure through which science can progress. Hypothesis testing is essential to the planning, execution, analysis and interpretation of the results of a research project.

The first step in this procedure is to construct a null-hypothesis (H0). This is the opposite of the researcher’s hypothesis (H1), which should represent a theory that may explain a specific observation. H1, the alternative hypothesis, can be viewed as a prediction of what will occur if the theory is correct. It is the null-hypothesis that is tested with a statistical test.

The reason for this rests on philosophical grounds: Popper’s falsificationism. In short, this concept says that you cannot claim that something is true unless you have gathered all possible observations, which in practice is impossible. However, a single observation is enough to falsify a hypothesis. Once a hypothesis has been falsified, it remains false; it cannot be true.

For example, suppose you believe that all flowers on the planet are red. Every day you see only red flowers, so this is your “truth”. But one day you go for a stroll beyond the limits of your own garden and suddenly see a blue flower. Your belief is false; it only took one blue flower. Your hypothesis holds only until the day it is falsified, which means you can never say that something is true. Once the null-hypothesis is falsified, you receive support for the opposite: your alternative hypothesis, H1. This is only support, not the truth. It remains “true” until the day an experiment is unable to falsify the null-hypothesis.

The null-hypothesis is often set to zero, as in no effect. But you can also see it as a baseline from which your alternative hypothesis deviates, in which case the null-hypothesis does not have to be zero. The null and alternative hypotheses can be expressed in words but also in mathematical terms. For instance, say that your alternative hypothesis is “Pine trees are taller than oak trees”. The null-hypothesis is then “Oak trees are as tall as or taller than pine trees”. In mathematical terms, the alternative and null-hypothesis can be expressed as:

H1: μPINE > μOAK

H0: μPINE ≤ μOAK

Where μ represents the population mean of tree heights.

The second step is to determine which kind of experiment you have to do or which kind of data you need to test the null-hypothesis. In this phase, sampling is important in order to get a representative subset of the population in question. Before sampling starts, you should invest time in determining which statistical test you need to perform, which level of significance to use and which sample size you need in order to detect a difference if one exists. The latter relates to the power of the test.
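To make the link between significance level, power and sample size concrete, here is a minimal sketch in Python. It is not from the original article: it assumes a one-sided comparison of two means, equal group sizes and a known common standard deviation, and it uses the standard normal approximation n = 2 · ((z_alpha + z_power) · sigma / delta)². The function name and the numbers passed to it are purely illustrative.

```python
import math
from statistics import NormalDist


def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group for a one-sided two-sample
    comparison of means (normal approximation, equal group sizes).

    delta: smallest difference in means you want to be able to detect
    sigma: assumed common standard deviation (an assumption, not data)
    alpha: significance level of the one-sided test
    power: desired probability of detecting a true difference of delta
    """
    if delta <= 0 or sigma <= 0:
        raise ValueError("delta and sigma must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)      # quantile matching the power
    n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    return math.ceil(n)


# Hypothetical planning values: detect a 2 m difference in mean height
# when individual tree heights are assumed to vary with sigma = 4 m.
n_per_group = sample_size_per_group(delta=2.0, sigma=4.0)
print(f"Sample at least {n_per_group} trees per species")
```

Halving the difference you want to detect roughly quadruples the required sample size, which is why it pays to settle this before sampling starts.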

The third step is to execute the statistical test of the null-hypothesis.

In the fourth step, you interpret the results of the test, which should not be very difficult if you have specified an alternative and null-hypothesis in advance. For example, suppose you have sampled heights from the pine and oak tree populations and performed a statistical test (in this case a t test) in which the mean height of pine trees was statistically significantly larger than the mean height of oak trees (μPINE > μOAK). This means that the null-hypothesis can be falsified, so we get support for the alternative hypothesis and can conclude, for the moment, that pine trees are taller than oak trees. If the hypothesis is a prediction of a theory, the interpretation can go further and claim support for the theory.
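The pine and oak example can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's own analysis: the article only says "a t test", so the sketch picks Welch's unequal-variance version; the tree heights are invented; and the p-value is obtained by numerically integrating the Student t density with Simpson's rule so that only the standard library is needed. The helper names `t_upper_tail` and `welch_t_one_sided` are hypothetical.

```python
import math
from statistics import mean, variance


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t with df degrees of freedom.

    Integrates the density from 0 to |t| with Simpson's rule and uses
    the symmetry of the distribution around zero.
    """
    if t == 0:
        return 0.5
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    const = math.exp(log_c) / math.sqrt(df * math.pi)

    def pdf(x):
        return const * (1 + x * x / df) ** (-(df + 1) / 2)

    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # P(0 < T < |t|)
    return 0.5 - area if t > 0 else 0.5 + area


def welch_t_one_sided(a, b):
    """Test H0: mean(a) <= mean(b) against H1: mean(a) > mean(b)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, t_upper_tail(t, df)


# Invented heights in metres, for illustration only
pine = [24.1, 26.3, 25.0, 27.8, 23.9, 26.5, 25.7, 28.2]
oak = [21.4, 23.0, 22.7, 20.9, 24.1, 22.3, 21.8, 23.5]

alpha = 0.05  # significance level chosen before sampling
t_stat, df, p_value = welch_t_one_sided(pine, oak)
print(f"t = {t_stat:.2f}, df = {df:.1f}, one-sided p = {p_value:.4f}")
if p_value < alpha:
    print("H0 is falsified: support for H1 (pine taller than oak)")
else:
    print("H0 could not be falsified")
```

Because H0 and H1 were fixed in advance, the interpretation reduces to comparing the one-sided p-value with the chosen significance level.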

The fifth step is very important in scientific and philosophical terms but cannot always be carried out because of practical constraints. It is the continuous work of trying to falsify the alternative hypothesis. We only receive support for it when the null-hypothesis has been falsified; that support is not the truth.

My experience is that you have to be somewhat pragmatic about the procedure of hypothesis testing. It is not always possible to specify hypotheses based on a theory. Doing so requires the hard work of conducting descriptive studies, from which observations are made that can be explained by some theory. The descriptive studies, however, also involve statistical tests in order to reveal patterns. In those tests, the null-hypothesis is routinely set to zero, that is, no effect or no difference. The key to using hypotheses as predictions of a theory is a priori work. That means you need to know in advance what you are after and plan your project accordingly. Otherwise you are wasting your time.

Key takeaways

Construct an alternative (H1) and null-hypothesis (H0)
Plan the experiment/sampling and decide which test to use
Perform the statistical test
Interpret the results of the test
Continue your research trying to falsify your alternative hypotheses that support your theory
Be pragmatic
