Randomization needs numbers
The histogram shows how the sample mean (solid red line) and bootstrapped confidence limits (dotted red lines) fail to contain the population mean (solid blue line). This can happen when you randomize and trust randomization to save your day.

Truth be told, the original version of this post looked different. But then I messed up changing text and code snippets, LinkedIn autosave did its magic, and I lost 80% of my original article. To some, this is Murphy's law. To me, this is just me messing up. No randomness, no chance. Just good old me.

This post is not about what randomness is or is not. It is just a show and tell on the limitations of randomization, and why randomization is actually a big numbers game. Over the course of my PhD I (co-)authored about seven publications on the quality of Randomized Controlled Trials in the field of cancer. After that, I taught experimental design for 5+ years to 75+ PhDs in the animal science industry. Whereas randomization is the panacea in the medical field, the animal sciences have long since augmented randomization with the art of blocking (called stratified randomization in the medical world), because trials there are often small due to farm limitations. Randomization, after all, is a big numbers game.
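Blocking is easy to sketch in R. Below is a minimal, hypothetical example (the farms and trial size are invented for illustration): sixteen animals on four farms, assigned once by simple randomization and once by randomizing within each farm.

# A minimal sketch of blocking. Simple randomization can leave a farm with
# a lopsided split; blocked randomization guarantees 2/2 within every farm.
set.seed(1)
farms <- rep(paste0("farm", 1:4), each = 4)
simple <- sample(rep(c("control", "treatment"), 8))  # ignores the farms
blocked <- unlist(lapply(split(seq_along(farms), farms),
                         function(i) sample(rep(c("control", "treatment"), 2))))
table(farms, simple)   # per-farm splits can come out 4/0 or 3/1
table(farms, blocked)  # always 2/2 within every farm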

Today, I want to show you via simulation in R what happens if you cannot get the numbers needed to make randomization work, and why it is perhaps best not to expect that randomization will solve all your problems. Randomization, after all, is put in place to spread known and unknown confounders equally across the control and treatment groups, and nobody knows what they do not know.
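Here is a small simulation sketch of that point (mine, with an invented confounder and invented trial sizes): randomize subjects 1:1 to two arms and measure how far apart the arms end up on a confounder nobody recorded.

# How imbalanced does an unknown confounder get under simple 1:1
# randomization at different trial sizes?
set.seed(42)
imbalance <- function(n) {
  confounder <- rnorm(n)              # never measured by the analyst
  arm <- sample(rep(c(0, 1), n / 2))  # simple 1:1 randomization
  abs(mean(confounder[arm == 1]) - mean(confounder[arm == 0]))
}
mean(replicate(1000, imbalance(20)))    # roughly 0.35 sd units apart
mean(replicate(1000, imbalance(2000)))  # roughly 0.035 sd units apart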

Or, to quote William Briggs in his book "Uncertainty":

"Randomness is not a cause. Neither is chance. It is?always?a mistake to say things like “explainable by chance”, “random change”, “the differences are random”, “unlikely to be due to chance”, “due to chance”, “sampling error”, and so forth. Mutations in biology are said to be “random”; quantum events are called “random”; variables are “random”. An entire theory in statistics is built around the erroneous idea that chance is a cause. This theory has resulted in much grief, as we shall see.


First off, I will show that a simple sample inherently deviates from the 'true mean', or the 'population mean'. Hence, all the work conducted in those expensive RCTs has limited external validity, and repetition is needed. This is why meta-analyses are so important.

require(tidyverse)
set.seed(11223)

# Simulate a 'population' of 1000 observations with mean 100 and sd 10,
# and plot its density.
x <- rnorm(1000, 100, 10)
plot(density(x), lwd = 3)

# Draw one 5% and three 10% fractions and overlay their densities.
df <- data.frame(x)
x2 <- df %>% sample_frac(0.05)
lines(density(x2$x), col = "red", lty = 2)
x3 <- df %>% sample_frac(0.10)
lines(density(x3$x), col = "blue", lty = 2)
x4 <- df %>% sample_frac(0.10)
lines(density(x4$x), col = "green", lty = 2)
x5 <- df %>% sample_frac(0.10)
lines(density(x5$x), col = "orange", lty = 2)
[Figure: density of the simulated population (black) with the sampled fractions overlaid in red, blue, green, and orange]

It is not hard to see that each sample, extracted as a 5% or 10% fraction of the original observations, only mimics the population at best and can seriously deviate from it. This is not bad. In fact, it should be expected.
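A quick numeric check (my addition) makes the deviation visible without the plot:

# Mean and sd of the population and of each sampled fraction; the samples
# drift around the population values of 100 and 10.
sapply(list(pop = x, s5 = x2$x, s10a = x3$x, s10b = x4$x, s10c = x5$x),
       function(v) c(mean = mean(v), sd = sd(v)))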

Most of you know the sample-from-distribution trick, but I am not sure how many know the population-from-sample trick, also called bootstrapping. What bootstrapping does is resample the sample, with replacement, to approximate the population from which the sample hypothetically comes. Let me show you.

# Resample the 100-observation sample (x3) 5000 times, storing the mean
# of each bootstrap resample.
my.bootstrap <- numeric(5000)
for (i in 1:length(my.bootstrap)) {
  my.bootstrap[i] <- mean(sample(x3$x, replace = TRUE))
}
quantile(my.bootstrap, 0.025)
quantile(my.bootstrap, 0.975)

# Population histogram with the population mean (blue), bootstrapped
# mean (solid red), and bootstrapped 95% limits (dotted red).
hist(x, freq = FALSE, ylim = c(0, 0.06), breaks = 48, col = "white")
abline(v = mean(x), lty = 1, lwd = 2, col = "blue")
abline(v = mean(my.bootstrap), lty = 1, lwd = 2, col = "red")
abline(v = quantile(my.bootstrap, 0.975), lty = 2, lwd = 1, col = "red")
abline(v = quantile(my.bootstrap, 0.025), lty = 2, lwd = 1, col = "red")
[Figure: histogram of the population with the bootstrapped mean (solid red), bootstrapped confidence limits (dotted red), and population mean (blue)]

It is not perfect, but the bootstrapped confidence limits (dotted red lines) contain the theoretical population mean (100), even though the simulated population mean is not exactly 100. Hence, one could say that a small sample, say 100 observations large, is enough to approximate the unknown population via bootstrapping.
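If you would rather test that claim than take my word for it, a coverage check is a natural follow-up (this sketch is mine, not part of the original analysis): repeat the sample-then-bootstrap exercise many times and count how often the interval captures the simulated population mean.

# Coverage check: 200 hypothetical '100-observation trials', each
# bootstrapped 1000 times; count how often the 95% percentile interval
# contains the simulated population mean.
set.seed(314)
covers <- replicate(200, {
  s  <- sample(df$x, 100)
  bm <- replicate(1000, mean(sample(s, replace = TRUE)))
  quantile(bm, 0.025) <= mean(x) & mean(x) <= quantile(bm, 0.975)
})
mean(covers)  # should land somewhere near the nominal 0.95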

However, I already mentioned that drawing a sample (and thus randomization) is a big numbers game, and the example above drew a 100-observation sample and resampled it 5000 times. What will happen if I make all this sampling a lot smaller? Say, a 5% fraction of the population (50 observations) and 10 resamples? This means I will redo the trial just ten times. Not many get to redo a trial even once.

# The same exercise with the 50-observation sample (x2) and only 10
# bootstrap resamples.
my.bootstrap <- numeric(10)
for (i in 1:length(my.bootstrap)) {
  my.bootstrap[i] <- mean(sample(x2$x, replace = TRUE))
}
quantile(my.bootstrap, 0.025)
quantile(my.bootstrap, 0.975)

hist(x, freq = FALSE, ylim = c(0, 0.06), breaks = 48, col = "white")
abline(v = mean(x), lty = 1, lwd = 2, col = "blue")
abline(v = mean(my.bootstrap), lty = 1, lwd = 2, col = "red")
abline(v = quantile(my.bootstrap, 0.975), lty = 2, lwd = 1, col = "red")
abline(v = quantile(my.bootstrap, 0.025), lty = 2, lwd = 1, col = "red")
[Figure: the same histogram for the 50-observation sample; the bootstrapped limits now miss the population mean]

This time, the bootstrapped confidence interval does not even contain the population mean, and the bootstrapped mean is quite off. This was to be expected, though, as sampling is a big numbers game. For instance, to get the blue line to hit exactly 100, we would probably need to create a 10k or 100k observation database.
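A one-liner illustrates that big-numbers claim (my sketch, reusing the same Normal(100, 10) setup):

# The simulated population mean settles on the theoretical 100 only as the
# population grows; the drift shrinks roughly like 10 / sqrt(n).
set.seed(11223)
sapply(c(1e2, 1e3, 1e4, 1e5), function(n) mean(rnorm(n, 100, 10)))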

Now, let's move from a one-sample example to a two-sample example: a difference between groups.

set.seed(11223)

# Two independent groups of 1000 observations: means 56 and 50, sd 5.
sample1 <- data.frame(sample = 'first sample',  values = rnorm(1000, 56, 5))
sample2 <- data.frame(sample = 'second sample', values = rnorm(1000, 50, 5))
samples <- rbind(sample1, sample2)

g1 <- ggplot(samples,
             aes(x = values, col = sample, fill = sample, group = sample)) +
  geom_density(alpha = 0.2) + theme_bw() + theme(legend.position = "none")

# Pair the (unrelated) values and compute their difference.
samples2 <- as.data.frame(cbind(sample1$values, sample2$values,
                                sample1$values - sample2$values))

g2 <- ggplot(samples2, aes(x = V3)) +
  geom_density() +
  geom_vline(xintercept = mean(samples2$V3), col = "red", lty = 2) +
  theme_bw()

require(viridis)
g3 <- ggplot(samples2, aes(x = V1, y = V2, col = V3)) +
  geom_point(size = 5, alpha = 0.8) +
  scale_color_gradient2(low = "red", mid = "darkblue", high = "red",
                        midpoint = mean(samples2$V3), guide = "none") +
  theme(legend.position = "none") +
  theme_bw()

require(grid)
require(gridExtra)
grid.arrange(g1, g2, g3, nrow = 3)

What I am asking R to do is create two equally sized groups, both coming from a Normal distribution, but having no connection whatsoever. Comparing the groups will show a mean difference of around six, but this is not because of any mechanism whatsoever. The graphs below show the density of both groups, the density of the difference between the groups, and a scatterplot of the paired values. The cloud shape of the differences reveals that there is no underlying mechanism, except 'randomness'.
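For completeness, a plain t-test (my addition) reports that difference of roughly six, and it reflects nothing more than the two simulated means of 56 and 50:

# Welch two-sample t-test on the two unrelated groups; the ~6 difference
# is built into the simulation, not caused by any treatment.
t.test(sample1$values, sample2$values)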

I am sure William Briggs would argue that randomness only exists in simulations, and I dare not disagree.

[Figure: densities of both groups, density of their difference, and the scatterplot of paired values]

I will now do the same for this two-sample example as I did for the one-sample example: bootstrapping. Let me start small.

require(simpleboot)
require(boot)

# Studentized bootstrap of the difference in means: only R = 10 outer
# resamples, each studentized with M = 50 inner resamples.
bootstrapped <- two.boot(samples2$V1,
                         samples2$V2,
                         mean,
                         R = 10,
                         student = TRUE,
                         M = 50)
hist(bootstrapped)
abline(v = bootstrapped$t0[1], lty = 1, lwd = 2, col = "blue")        # observed difference
abline(v = mean(bootstrapped$t[, 1]), lty = 1, lwd = 2, col = "red")  # bootstrapped mean
abline(v = boot::boot.ci(bootstrapped, type = "stud")$student[4:5],
       lty = 2, lwd = 1, col = "red")                                 # studentized 95% CI
[Figure: histogram of the bootstrapped mean difference, R = 10]

Let me finish bigger.

# The same bootstrap, now with R = 1000 outer and M = 1000 inner resamples.
bootstrapped <- two.boot(samples2$V1,
                         samples2$V2,
                         mean,
                         R = 1000,
                         student = TRUE,
                         M = 1000)
hist(bootstrapped)
abline(v = bootstrapped$t0[1], lty = 1, lwd = 2, col = "blue")
abline(v = mean(bootstrapped$t[, 1]), lty = 1, lwd = 2, col = "red")
abline(v = boot::boot.ci(bootstrapped, type = "stud")$student[4:5],
       lty = 2, lwd = 1, col = "red")
[Figure: histogram of the bootstrapped mean difference, R = 1000]

Don't get too optimistic just because the bootstrapped mean is very close to the population mean. The bootstrapped confidence intervals are quite wide, which means that any single trial you run could deviate that far from the population mean. Nobody will be able to replicate a trial 1000 times, or find 1000 publications of trials that are extremely similar. The more data you get, the more variance you often attain.
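To see how much the interval width is a numbers game, here is a small sketch of mine that reuses the same two.boot() machinery on ever smaller subsets of the data:

# Studentized CI width as a function of the underlying sample size.
set.seed(99)
widths <- sapply(c(50, 200, 1000), function(n) {
  idx <- sample(nrow(samples2), n)
  b <- two.boot(samples2$V1[idx], samples2$V2[idx], mean,
                R = 1000, student = TRUE, M = 200)
  diff(boot::boot.ci(b, type = "stud")$student[4:5])
})
widths  # expect visibly wider intervals for the smaller samples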

Now, let's make it even more interesting and move towards regression. I will first fit a simple linear model, then one with a stronger main effect, and finally extend the setup with multiple interaction effects. The goal here is to see if bootstrapping picks up the population effects from the sample when the model does, or does not, contain all necessary effects.

require(car)
set.seed(11223)

# A simple linear relationship: y = x + noise, with n = 1000.
n <- 1000
x <- rnorm(n)
y <- x + rnorm(n)
population.data <- data.frame(x, y)

m1 <- lm(y ~ x, population.data)
betahat.boot <- Boot(m1, R = 199)  # case-resampling bootstrap of the coefficients
summary(betahat.boot)              # default summary
confint(betahat.boot)
hist(betahat.boot)
[Figure: bootstrap distributions of the intercept and slope of m1]
# A stronger effect (2.5) but a smaller sample (n = 100).
n <- 100
x <- rnorm(n)
y <- 2.5 * x + rnorm(n)
population.data <- data.frame(x, y)
m1 <- lm(y ~ x, population.data)
betahat.boot <- Boot(m1, R = 199)  # 199 bootstrap samples--too small to be useful
hist(betahat.boot)
[Figure: bootstrap distributions for the n = 100 model]
require(jtools)
require(interactions)

# Five independent predictors, five main effects, three two-way
# interactions, and one three-way interaction.
n <- 1000
a <- rnorm(n)
b <- rnorm(n)
c <- rnorm(n)
d <- rnorm(n)
e <- rnorm(n)
y <- 2.5 * a +
  1.3 * b +
  0.7 * c +
  2.1 * d +
  1.9 * e +
  -0.5 * a * b +
  -0.9 * a * c +
  1.1 * c * d +
  0.8 * a * d * c +
  rnorm(n)
population.data <- data.frame(a, b, c, d, e, y)

# m0: all main effects, all two-way interactions, and one three-way term.
m0 <- lm(y ~ a + b + c + d + e +
           a:b + a:c + a:d + a:e +
           b:c + b:d + b:e +
           c:d + c:e + d:e +
           a:b:c, population.data)
# m1: only three of the main effects.
m1 <- lm(y ~ a + b + c, population.data)

i0 <- interact_plot(m0, pred = a, modx = c)
i1 <- interact_plot(m1, pred = a, modx = c)
grid.arrange(i0, i1, ncol = 2)
plot_summs(m0, m1, scale = TRUE, plot.distributions = TRUE)
[Figure: interaction plots of m0 and m1]

The above graph compares the interaction plot of a model that contains main effects and two-way interactions (m0) with a model that contains only some of the main effects (m1). This resembles a situation in which the analyst does not know that additional effects play a role. The example may be a bit extreme, but people do not know what they do not know, and this is where simulation becomes handy. The graphs clearly show the second model (m1) failing to capture the existing interaction.

Comparing the coefficients from the m0 and m1 models shows what happens to the estimates of some of the main effects as well.
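Pulling out the three shared main effects makes that comparison concrete (my addition; the population values were a = 2.5, b = 1.3, c = 0.7):

# Side-by-side coefficients of the full (m0) and sparse (m1) model. In
# this sample, the sparse model's 'c' drifts upward, absorbing part of
# the omitted interactions.
round(cbind(full = coef(m0)[c("a", "b", "c")],
            sparse = coef(m1)[c("a", "b", "c")]), 2)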

[Figure: coefficient plot comparing m0 and m1]

Let's see if bootstrapping can help us out. First, let's take another look at the coefficients from the sparse model. The real issue with these results is that you have NO IDEA that you are missing at least two main effects (d and e), and a lot of interactions.

plot_summs(m1, omit.coefs = NULL,
           scale = TRUE,
           plot.distributions = TRUE,
           model.names = c("m1"))
require(sjPlot)
require(sjmisc)
plot_model(m1, show.values = TRUE, value.offset = .3) + theme_bw()
[Figures: coefficient plots of the sparse model m1]

So, if we bootstrap the data, will we then obtain estimates much closer to the population effects? Based on what I see below, I am not worried as the bootstrapped confidence intervals seem to contain the coefficients of the population.

betahat.m1.boot <- car::Boot(m1, R = 1000)
hist(betahat.m1.boot)
summary(betahat.m1.boot)

Number of bootstrap replications R = 1000
            original    bootBias  bootSE bootMed
(Intercept)  0.10076 -0.00258513 0.11154 0.10175
a            2.52591  0.00011154 0.12425 2.52161  # 2.52 - 2.5 = 0.02
b            1.23634 -0.00140071 0.10736 1.23401  # 1.24 - 1.3 = 0.07
c            0.86689  0.00414945 0.11381 0.87044  # 0.87 - 0.7 = 0.17 --> due to missing interaction
[Figure: bootstrap distributions of the m1 coefficients]

But what we really forgot to do is take a tiny single sample from the original big database and see how it behaves. Of course, the bootstrapped results above mimic the population, because I built 1000 resamples from the full set of original observations. But what if I can only sample 50? Once?

# Take a single 5% sample (50 observations) and refit the full model.
x2 <- population.data %>% sample_frac(0.05)
m2 <- lm(y ~ a + b + c + d + e +
           a:b + a:c + a:d + a:e +
           b:c + b:d + b:e +
           c:d + c:e + d:e +
           a:b:c, x2)
plot_summs(m0, m2, scale = TRUE, model.names = c("m0", "m2"))
[Figure: coefficient plot comparing m0 and m2]

And what if I did not include all the important variables I should have included?

m3 <- lm(y ~ a + b + c, x2)
plot_summs(m0, m2, m3, scale = TRUE, model.names = c("m0", "m2", "m3"))


[Figure: coefficient plot comparing m0, m2, and m3]

Can bootstrapping help me now? As you can see, it cannot. Just like randomization, bootstrapping needs a sufficient base to get started.

betahat.m2.boot <- car::Boot(m2, R = 1000)
hist(betahat.m2.boot)
summary(betahat.m2.boot)

Number of bootstrap replications R = 1000
              original   bootBias  bootSE   bootMed
(Intercept)  0.2194352 -0.0140810 0.26017 0.2227485
a            2.4550501 -0.0191784 0.27917 2.4178270
b            1.3595962  0.0400222 0.27590 1.3863846
c            1.0580434  0.0542109 0.33912 1.1051985

betahat.m3.boot <- car::Boot(m3, R = 1000)
hist(betahat.m3.boot)
summary(betahat.m3.boot)

Number of bootstrap replications R = 1000
            original   bootBias  bootSE bootMed
(Intercept)  0.90742 -0.0097986 0.52050 0.88945
a            2.72682  0.0229260 0.44768 2.74583
b            1.45078  0.0166067 0.45787 1.47188
c            1.30921 -0.0068593 0.51522 1.31287
[Figure: bootstrap distributions of the m2 and m3 coefficients]

The histograms coming from m3 show curves nice enough to yield normally distributed bootstrapped confidence intervals, but the bias is too large. However, you will never know, because you did not include all relevant covariates in your model.
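One last sketch of mine puts the m3 bootstrap next to the population values to make that bias explicit:

# Bootstrap intervals of m3 against the true coefficients (2.5, 1.3, 0.7).
# The intervals are wide, the estimates are off, and nothing in the output
# flags the omitted terms.
ci <- confint(betahat.m3.boot)
round(cbind(truth = c(2.5, 1.3, 0.7), ci[c("a", "b", "c"), ]), 2)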

Or to finish with Briggs:

Pick up a pencil and let it go mid air. What happened? It fell, because why? Because of gravity, we say, a cause with which we are all familiar. But the earth's gravity isn't the only force operating on the pencil; just the predominant one. We don't consider the pencil falling to be "random" because we know the nature or essence of the cause and deduce the consequences. We need to speak more of what makes a causal versus probabilistic model, but a man standing in the middle of a field flipping a coin is thinking more probabilistically than the man dropping a pencil. Probabilities become substitutes for knowledge of causes, they do not become causes themselves."



UPDATE

Below the article, comments started to point out that what I was showing resembled sampling deviation, and that sampling from a covariance matrix does not change its inherent structure. That is true, and perhaps I should have changed the title of this article to: "Randomization does not safeguard against incorrect model specification".

The following code comes from Pieter Koppelaar, and I augmented it to make my case again; his comments are extremely useful.

What I will show via the following code snippets and graphs is that a covariance matrix does maintain its structure under sampling, but that not including all the relevant influences in the model limits coefficient estimation, regardless of the sampling used. And the less data you have, the worse it gets.

library(tidyverse)
library(corrplot)
require(jtools)
require(interactions)
require(grid)
require(gridExtra)
set.seed(1)

variables <- 3
population <- 1e7
samplesize <- 1000
samplingsize <- 1000
z_std <- function(observed) (observed - mean(observed)) / sd(observed)

# Unobservable (latent) variables U are matrix-multiplied with B twice
# to obtain observable variables O. We consider this our population.
U <- matrix(data = rnorm(variables * population), ncol = variables)
B <- matrix(data = rnorm(variables^2), ncol = variables)
O <- U %*% B %*% B

Udf <- as.data.frame(U)
Udf %>% sample_frac(0.05) %>%
  ggplot(.) +
  geom_density(aes(x = V1)) +
  geom_density(aes(x = V2)) +
  geom_density(aes(x = V3)) +
  theme_bw()
Udf %>% sample_frac(0.05) %>% pairs(.)  # no relationship whatsoever
[Figure: densities and pairs plot of the latent variables U]
Odf <- as.data.frame(O)
Odf %>% sample_frac(0.05) %>%
  ggplot(.) +
  geom_density(aes(x = V1)) +
  geom_density(aes(x = V2)) +
  geom_density(aes(x = V3)) +
  theme_bw()
Odf %>% sample_frac(0.05) %>% pairs(.)
[Figure: densities and pairs plot of the observed variables O]
U <- as_tibble(U) %>% mutate_all(z_std)
O <- as_tibble(O) %>% mutate_all(z_std)

# Draw 1000 samples of 1000 observations each and store the column means.
samplingU <- U %>% filter(FALSE)
samplingO <- O %>% filter(FALSE)
for (i in 1:samplingsize) {
  sample <- sample(1:population, samplesize)
  samplingU[i, ] <- U %>% slice(sample) %>% summarize_all(mean)
  samplingO[i, ] <- O %>% slice(sample) %>% summarize_all(mean)
}
ggplot(samplingO) +
  geom_density(aes(x = V1)) +
  geom_density(aes(x = V2)) +
  geom_density(aes(x = V3)) +
  theme_bw()

cor(samplingO)
           V1         V2         V3
V1  1.0000000 -0.5526532 -0.5681778
V2 -0.5526532  1.0000000  0.5949642
V3 -0.5681778  0.5949642  1.0000000

cor(O)
           V1         V2         V3
V1  1.0000000 -0.5625197 -0.5802865
V2 -0.5625197  1.0000000  0.6126717
V3 -0.5802865  0.6126717  1.0000000

cor(Odf)
           V1         V2         V3
V1  1.0000000 -0.5625197 -0.5802865
V2 -0.5625197  1.0000000  0.6126717
V3 -0.5802865  0.6126717  1.0000000

Odf %>%
  sample_frac(0.00005) %>%
  cor(.)
           V1         V2         V3
V1  1.0000000 -0.5280985 -0.5912621
V2 -0.5280985  1.0000000  0.5840342
V3 -0.5912621  0.5840342  1.0000000
[Figure: densities of the sampled means of O]
# Build an outcome from three main effects and three two-way interactions.
Odf$Y <- with(Odf, 2 * V1 - 3 * V2 + 0.8 * V3 +
                1.9 * V1 * V2 +
                -7 * V2 * V3 +
                3.4 * V1 * V3)
Odf %>%
  sample_frac(0.0005) %>%
  ggplot(., aes(x = V1, y = Y, col = V3, size = V2)) +
  geom_point(alpha = 0.5) + theme_bw()
[Figure: Y against V1, colored by V3 and sized by V2]
# m0: the correctly specified model; m1: V2 and its interactions omitted.
m0 <- lm(Y ~ V1 + V2 + V3 +
           V1:V2 + V2:V3 + V1:V3,
         data = Odf)
m1 <- lm(Y ~ V1 + V3 +
           V1:V3,
         data = Odf)
i0 <- interactions::interact_plot(m0, pred = V1, modx = V3)
i1 <- interactions::interact_plot(m1, pred = V1, modx = V3)
grid.arrange(i0, i1, ncol = 2)
plot_summs(m0, m1, scale = TRUE, model.names = c("m0", "m1"))
[Figures: interaction plots and coefficient plot for m0 and m1]
# Refit both models on single 0.05% samples (5000 observations each).
m0_s <- Odf %>%
  sample_frac(0.0005) %>%
  lm(Y ~ V1 + V2 + V3 +
       V1:V2 + V2:V3 + V1:V3,
     data = .)
m1_s <- Odf %>%
  sample_frac(0.0005) %>%
  lm(Y ~ V1 + V3 +
       V1:V3,
     data = .)
i0_s <- interactions::interact_plot(m0_s, pred = V1, modx = V3)
i1_s <- interactions::interact_plot(m1_s, pred = V1, modx = V3)
grid.arrange(i0, i0_s,
             i1, i1_s,
             ncol = 2, nrow = 2)
[Figures: interaction plots from the full data (left) and the 0.05% samples (right)]



Pieter Koppelaar
Architect Data & Analytics

"Hi Marc, could you perhaps add an explanation of why bootstrapping is a relevant topic here? At the moment its relevance does not come across anywhere. The same goes for regression: what is its relevance to clinical trials? These usually concern a test-and-control-group design in which one is interested in comparing proportions or means. What goes wrong there, in your view? If we stay with the example of the vaccine studies: how can it be that reasonably consistent, large differences in proportions (symptomatic, severely ill) are found between two double-blinded groups? And if randomization does not have the effect one expects because of possible unobservable differences between groups, why do we never find these effects reversed? Perhaps you could write up the problems you see on the basis of such a simple design, without resampling and without linear regression on non-linear effects."
