Hypothesis testing
Jesper Martinsson
From Oceans to Dashboards: Marine Ecologist | Data Wrangler | BI Leader
Hypothesis testing is, in my opinion, analogous to the scientific method. It follows a logical structure that gives science an objective procedure for making progress. Hypothesis testing is essential to the planning, execution, analysis and interpretation of a research project.
The first step in this procedure is to construct a null-hypothesis (H0). This is the opposite of the researcher’s hypothesis (H1), which should represent a theory that may explain a specific observation. H1, the alternative hypothesis, can be viewed as a prediction of what will occur if the theory is correct. It is the null-hypothesis that is tested with a statistical test.
The reason for this rests on philosophical grounds: Popper’s falsificationism. In short, this concept says that you cannot claim that anything is true unless you have gathered all possible observations, which in practice is impossible. However, a single observation is enough to falsify a hypothesis. Once a hypothesis has been falsified, it remains false; it cannot be true.
For example, suppose you believe that all flowers on the planet are red. Every day you only see red flowers, so this is your “truth”. But one day you go out for a stroll beyond the limits of your own garden, and suddenly you see a blue flower. Your belief is false. It only required one blue flower. Your hypothesis is only true until the day it is falsified. This means you can never say that something is true. Once the null-hypothesis is falsified you receive support for the opposite: your alternative hypothesis, H1. This is only support, not the truth. It is “true” until the day an experiment is unable to falsify the null-hypothesis.
The null-hypothesis is often set to zero, as in no effect. But you can also see it as a baseline from which your alternative hypothesis deviates. In this case, the null-hypothesis does not have to be zero. The null and alternative hypotheses can be expressed in words but also in more mathematical terms. For instance, say that your alternative hypothesis is that “Pine trees are taller than oak trees”. The null-hypothesis is then: “Oak trees are as tall as or taller than pine trees”. In mathematical terms the alternative and null-hypothesis can be expressed as:
H1: μPINE > μOAK
H0: μPINE ≤ μOAK
Where μ represents the population mean of tree heights.
The second step is to determine which kind of experiment you have to do, or which kind of data you need, to test the null-hypothesis. In this phase, the concept of sampling is important in order to get a representative subset of the population in question. Before sampling starts, you should take the time to decide which statistical test to perform, which level of significance to use and which sample size you need in order to detect a difference if it is there. The latter has to do with the power of the test.
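The link between significance level, power and sample size can be sketched with a few lines of Python. This is a rough illustration only: it uses the common normal approximation for a one-sided comparison of two means with equal group sizes, and the smallest difference worth detecting (2 m) and the standard deviation (4 m) are made-up numbers, not values from any real study. The function name is hypothetical as well.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate number of observations needed in EACH group.

    Normal approximation for a one-sided two-sample comparison of means,
    assuming both groups share the same standard deviation `sd`.
    `effect` is the smallest difference in means you want to detect.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)      # quantile matching the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return ceil(n)


# Made-up planning values, for illustration only (heights in metres).
n_needed = sample_size_per_group(effect=2.0, sd=4.0)
print(f"Trees to measure per species: {n_needed}")
```

A smaller difference, a stricter significance level or a higher power all push the required sample size up, which is why these choices belong in the planning phase and not after the data are collected.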
The third step involves the execution of the statistical test of the null-hypothesis.
In the fourth step, you interpret the results of the test, which should not be very difficult if you have specified an alternative and a null-hypothesis in advance. For example, suppose you have sampled the heights of pine and oak trees and performed a statistical test (in this case a t-test), in which the mean height of pine trees was statistically significantly larger than the mean height of oak trees (μPINE > μOAK). This means that the null-hypothesis can be falsified and we get support for the alternative hypothesis and can conclude, for the moment, that pine trees are taller than oak trees. If the hypothesis is a prediction of a theory, the interpretation can go further, claiming support for the theory.
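As a minimal sketch of steps three and four, the code below tests H0: μPINE ≤ μOAK against H1: μPINE > μOAK. Note that it swaps in a one-sided permutation test for the t-test mentioned above, so that it runs with the Python standard library alone. The tree heights are made up for illustration, and the function name is hypothetical.

```python
import random


def one_sided_permutation_test(pine, oak, n_permutations=10_000, seed=0):
    """Estimate the p-value for H0: mean(pine) <= mean(oak) vs H1: mean(pine) > mean(oak).

    Under H0 the species labels carry no information, so we shuffle the
    pooled heights and count how often a random split gives a difference
    in means at least as large as the observed one.
    """
    rng = random.Random(seed)
    n_pine = len(pine)
    n_oak = len(oak)
    observed = sum(pine) / n_pine - sum(oak) / n_oak
    pooled = list(pine) + list(oak)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = sum(pooled[:n_pine]) / n_pine - sum(pooled[n_pine:]) / n_oak
        if diff >= observed:
            at_least_as_large += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Made-up heights in metres, for illustration only.
pine_heights = [24.1, 26.3, 25.0, 27.8, 23.9, 26.5, 25.7, 28.2]
oak_heights = [20.4, 22.1, 19.8, 23.0, 21.5, 20.9, 22.7, 21.2]

ALPHA = 0.05  # significance level, chosen before looking at the data
difference, p_value = one_sided_permutation_test(pine_heights, oak_heights)
reject_h0 = p_value < ALPHA

print(f"Observed difference in means: {difference:.2f} m")
print(f"One-sided p-value: {p_value:.4f}")
print("H0 falsified, support for H1" if reject_h0 else "H0 not falsified")
```

Because the hypotheses and the significance level were fixed in advance, the interpretation is a single comparison: if the p-value is below the chosen level, the null-hypothesis is falsified and the alternative hypothesis receives support, nothing more.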
The fifth step is in scientific and philosophical terms very important but cannot always be done because of practical issues. It is the continuous work of trying to falsify the alternative hypothesis. We only receive support for it when the null-hypothesis has been falsified. It is not the truth.
My experience is that you have to be somewhat pragmatic about the procedure of hypothesis testing. It is not always possible to specify hypotheses based on a theory. Doing so requires the hard work of conducting descriptive studies, from which observations are made that can be explained by some theory. The descriptive studies, however, also involve statistical tests in order to reveal patterns. In those cases the null-hypothesis is routinely set to zero, or no effect/no difference. The key to using hypotheses as predictions of a theory is a priori work. That is, you need to know in advance what you are after and plan your project accordingly. Otherwise you are wasting your time.
Key takeaways