Pagina's

Friday, 19 October 2018

Custom contrasts for the one-way repeated measures design using Lmer

Here is some code for doing one-way repeated measures analysis with lme4 and custom contrasts. We will use a repeated measures design with three conditions of the factor Treat and 20 participants. The contrasts are Helmert contrasts, but they differ from the built-in Helmert contrasts in that the sum of the absolute values of the contrasts weights equals 2 for each contrast.
The standard error of each contrast equals the square root of the product of the sum of the squared contrast weight w and the residual variance divided by the number of participants n.

$$\sigma_{\hat{\psi}} =  \sqrt{\sum{w_i}\sigma^2_e/n}$$

The residual variance equals the within treatment variance times 1 minus the correlation between conditions. (Which equals the within treatment variance minus 1 times the covariance $\rho\sigma^2_{within}$) .

$$\sigma^2_e = \sigma^2_{within}(1 - \rho)$$

In the example below, the within treatment variance equals 1 and the covariance 0.5 (so the value of the correlation is .50 as well). The residual variance is therefore equal to .50.

For the first contrasts, the weights are equal to {-1, 1, 0}, so the value of the standard error of the contrasts should be equal to the square root of 2*.50/20 = 0.2236. 
 

library(MASS)
library(lme4)

# setting up treatment and participants factors
nTreat = 3
nPP = 20
Treat <- factor(rep(1:nTreat, each=nPP))
PP <- factor(rep(1:nPP, nTreat))

# generate some random 
# specify means

means = c(0, .20, .50)

# create variance-covariance matrix
# var = 1, cov = .5
Sigma = matrix(rep(.5, 9), 3, 3)
diag(Sigma) <- 1

# generate the data; using empirical = TRUE
# so that variance and covariance are known
# set to FALSE for "real" random data

sco = as.vector(mvrnorm(nPP, means, Sigma, empirical = TRUE))

#setting up custom contrasts for Treatment factor 
myContrasts <- rbind(c(-1, 1, 0), c(-.5, -.5, 1))
contrasts(Treat) <- ginv(myContrasts)

#fit linear mixed effects model: 
myModel <- lmer(sco ~ Treat + (1|PP))

summary(myModel)
## Linear mixed model fit by REML ['lmerMod']
## Formula: sco ~ Treat + (1 | PP)
## 
## REML criterion at convergence: 157.6
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -2.6676 -0.4869  0.1056  0.6053  1.9529 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev.
##  PP       (Intercept) 0.5      0.7071  
##  Residual             0.5      0.7071  
## Number of obs: 60, groups:  PP, 20
## 
## Fixed effects:
##             Estimate Std. Error t value
## (Intercept)   0.2333     0.1826   1.278
## Treat1        0.2000     0.2236   0.894
## Treat2        0.4000     0.1936   2.066
## 
## Correlation of Fixed Effects:
##        (Intr) Treat1
## Treat1 0.000        
## Treat2 0.000  0.000

Thursday, 4 October 2018

Distribution of the difference between two binomial variables

For a project I'm working on, I needed to work with the probabilities associated with the difference between two binomial variables. I thought I'd share the code for four functions for calculating the probability mass and the cumulative probabilites.

The functions d.diff.binom (probability mass) and p.diff.binom (cumulative distribution) are functions for calculating the distribution of the differences for equal sample sizes N and equal "succes" probability.  The arguments of the functions are the quantiles k of the distribution of the differences, the number of trials N and the probability of succes p. 


The functions d.diff.binom.un (probability mass) and p.diff.binom.un (cumulative distribution) are functions for calculating the distribution of the differences for sample sizes and succes probabilities that may (or may not) differ between the two populations. The arguments of the functions are the quantiles k of the distribution of the differences, the number of trials N.1 and N.2 for the two groups and the probability of succes p.1 and p.2 for each group. 



#equal N and p:
#probability mass:
d.diff.bin = function(k, N, p) {
diff = outer(0:N, 0:N, "-")
prob = outer(dbinom(0:N, N, p, log=TRUE),
dbinom(0:N, N, p, log=TRUE), "+")
p = sum(exp(prob[diff == k]))
return(p)
}
#cumulative probability:
p.diff.bin = function(k, N, p) {
diff = outer(0:N, 0:N, "-")
prob = outer(dbinom(0:N, N, p, log=TRUE),
dbinom(0:N, N, p, log=TRUE), "+")
p = sum(exp(prob[diff <= k]))
return(p)
}
#examples
d.diff.bin(0, 30, .50)
## [1] 0.1025782
p.diff.bin(0, 30, .50)
## [1] 0.5512891
#mean of distribution:
N = 30
m = sum(sapply(-N:N, d.diff.bin, N = 30, p= .50)*(-N:N))
m
## [1] -3.084642e-20
all.equal(0, m)
## [1] TRUE
#(un)equal N and p:
#probability mass:
d.diff.bin.un = function(k, N.1, N.2, p.1, p.2) {
diff = outer(0:N.1, 0:N.2, "-")
prob = outer(dbinom(0:N.1, N.1, p.1, log=TRUE),
dbinom(0:N.2, N.2, p.2, log=TRUE), "+")
p = sum(exp(prob[diff == k]))
return(p)
}

#cumulative distribion: 
p.diff.bin.un = function(k, N.1, N.2, p.1, p.2) {
diff = outer(0:N.1, 0:N.2, "-")
prob = outer(dbinom(0:N.1, N.1, p.1, log=TRUE),
        dbinom(0:N.2, N.2, p.2, log=TRUE), "+")
p = sum(exp(prob[diff <= k]))
return(p)
}

#examples
d.diff.bin.un(0, 30, 20, .60, .70)
## [1] 0.05896135
p.diff.bin.un(0, 30, 20, .60, .70)
## [1] 0.1497043