Multivariate tests - 驚異のアニヲタ社会復帰への道

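I came across an analysis that looked interesting.

It deals with the case where the outcome is multivariate.

> Apparently the image misalignment in groups O and C can be measured along the x-axis and the y-axis

If so, the origin is the ideal point, so I figured that the distance from the origin would be a clinically meaningful measure. With that, the comparison reduces to an ordinary single-outcome test.

As a first step, let's test using the distance from the origin.

The data and the plot are borrowed from the original article.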
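To make the single outcome concrete: for an observation with offsets $(x_i, y_i)$, the distance used here is the ordinary Euclidean norm. The symbol $d_i$ is my own notation, not something from the original article.

```latex
d_i = \sqrt{x_i^2 + y_i^2},
\qquad \text{e.g. } (x, y) = (3, -5) \;\Rightarrow\; d = \sqrt{9 + 25} = \sqrt{34} \approx 5.83
```

Note that this collapses direction: $(3, -5)$ and $(-3, 5)$ get the same value, so a group that is consistently shifted one way and a group scattered evenly around the origin can look alike on this measure.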

## Create

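# stringsAsFactors = TRUE keeps group a factor (R >= 4.0 reads it as character by default)
dat <- read.table(header = TRUE, stringsAsFactors = TRUE, text = "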
group   x       y
o      -2       1
o       0       4
o       1       1
o       2       2
o       3      -5
o       3       2
o       3       4
o       4       2
o       5       3
o       6       2
c      -5      -2
c      -4       0
c      -4       0
c      -1      -2
c      -1       0
c      -1      -2
c      -1       0
c       0      -1
c       0       0
c       0       2
")

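# Scatter plot of the offsets, with colour and symbol set by group
plot(dat$x, dat$y, col = dat$group, pch = as.numeric(dat$group))
abline(h = 0, v = 0, lty = 2)  # dashed reference lines through the origin
legend("topleft", legend = paste("Group", c("O", "C")), bty = "n", col = 2:1, pch = 2:1, cex = 1.5)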



# Test using the distance from the origin
dat$dist <- sqrt(dat$x^2 + dat$y^2)
t.test(dist ~ group, data = dat)

## 
##  Welch Two Sample t-test
## 
## data:  dist by group
## t = -2.503, df = 17.99, p-value = 0.0222
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.4373 -0.2998
## sample estimates:
## mean in group c mean in group o 
##           2.286           4.154
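As a sanity check on what `t.test` is doing here, the Welch statistic can be rebuilt by hand. This is only a sketch: the coordinates are retyped from the data above, and the names `grp_o`, `grp_c`, `t_welch`, `df_welch` and `p_welch` are my own.

```r
# Coordinates retyped from the data above (hypothetical object names)
grp_o <- data.frame(x = c(-2, 0, 1, 2, 3, 3, 3, 4, 5, 6),
                    y = c(1, 4, 1, 2, -5, 2, 4, 2, 3, 2))
grp_c <- data.frame(x = c(-5, -4, -4, -1, -1, -1, -1, 0, 0, 0),
                    y = c(-2, 0, 0, -2, 0, -2, 0, -1, 0, 2))

# Distance from the origin for each observation
d_o <- sqrt(grp_o$x^2 + grp_o$y^2)
d_c <- sqrt(grp_c$x^2 + grp_c$y^2)

# Squared standard error of each group mean (variances are not pooled)
se2_o <- var(d_o) / length(d_o)
se2_c <- var(d_c) / length(d_c)

# Welch t statistic, in the same direction as the output (c minus o)
t_welch <- (mean(d_c) - mean(d_o)) / sqrt(se2_o + se2_c)

# Welch-Satterthwaite approximation to the degrees of freedom
df_welch <- (se2_o + se2_c)^2 /
  (se2_o^2 / (length(d_o) - 1) + se2_c^2 / (length(d_c) - 1))

# Two-sided p-value
p_welch <- 2 * pt(-abs(t_welch), df_welch)

print(c(t = t_welch, df = df_welch, p = p_welch))
```

The three numbers should agree with the `t`, `df` and `p-value` lines in the output above, up to rounding.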

plot(dist ~ group, data = dat)



# Wilks' test and the other MANOVA statistics
library(car)
multi.fit <- lm(cbind(x, y) ~ group, data = dat)
summary(multi.fit)

## Response x :
## 
## Call:
## lm(formula = x ~ group, data = dat)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -4.50  -1.70   0.60   1.55   3.50 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   -1.700      0.677   -2.51  0.02187 *  
## groupo         4.200      0.958    4.38  0.00036 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.14 on 18 degrees of freedom
## Multiple R-squared:  0.516,  Adjusted R-squared:  0.49 
## F-statistic: 19.2 on 1 and 18 DF,  p-value: 0.000358
## 
## 
## Response y :
## 
## Call:
## lm(formula = y ~ group, data = dat)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##   -6.6   -0.6    0.4    0.5    2.5 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   -0.500      0.636   -0.79    0.442  
## groupo         2.100      0.900    2.33    0.031 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.01 on 18 degrees of freedom
## Multiple R-squared:  0.232,  Adjusted R-squared:  0.19 
## F-statistic: 5.44 on 1 and 18 DF,  p-value: 0.0314

multi.fit

## 
## Call:
## lm(formula = cbind(x, y) ~ group, data = dat)
## 
## Coefficients:
##              x     y   
## (Intercept)  -1.7  -0.5
## groupo        4.2   2.1
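A note on reading this coefficient table: assuming R's default treatment contrasts with `c` as the reference level, the `(Intercept)` row should be the mean of group c and the `groupo` row the difference "o minus c". A minimal check, with the coordinates retyped from the data above and object names of my own choosing:

```r
# Coordinates retyped from the data above (hypothetical object names)
grp_o <- cbind(x = c(-2, 0, 1, 2, 3, 3, 3, 4, 5, 6),
               y = c(1, 4, 1, 2, -5, 2, 4, 2, 3, 2))
grp_c <- cbind(x = c(-5, -4, -4, -1, -1, -1, -1, 0, 0, 0),
               y = c(-2, 0, 0, -2, 0, -2, 0, -1, 0, 2))

mean_c  <- colMeans(grp_c)   # reference group mean: the (Intercept) row
mean_o  <- colMeans(grp_o)
diff_oc <- mean_o - mean_c   # shift of group o relative to c: the groupo row

print(rbind("(Intercept)" = mean_c, groupo = diff_oc))
```

So group c is centred slightly to the lower left of the origin, and group o is shifted up and to the right of it.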

res <- Anova(multi.fit, type = 3)
res

## 
## Type III MANOVA Tests: Pillai test statistic
##             Df test stat approx F num Df den Df  Pr(>F)    
## (Intercept)  1     0.269     3.13      2     17 0.06953 .  
## group        1     0.562    10.91      2     17 0.00089 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

summary(res)

## 
## Type III MANOVA Tests:
## 
## Sum of squares and products for error:
##      x    y
## x 82.6  6.5
## y  6.5 72.9
## 
## ------------------------------------------
##  
## Term: (Intercept) 
## 
## Sum of squares and products for the hypothesis:
##      x   y
## x 28.9 8.5
## y  8.5 2.5
## 
## Multivariate Tests: (Intercept)
##                  Df test stat approx F num Df den Df Pr(>F)  
## Pillai            1    0.2692    3.131      2     17 0.0695 .
## Wilks             1    0.7308    3.131      2     17 0.0695 .
## Hotelling-Lawley  1    0.3684    3.131      2     17 0.0695 .
## Roy               1    0.3684    3.131      2     17 0.0695 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## ------------------------------------------
##  
## Term: group 
## 
## Sum of squares and products for the hypothesis:
##      x     y
## x 88.2 44.10
## y 44.1 22.05
## 
## Multivariate Tests: group
##                  Df test stat approx F num Df den Df   Pr(>F)    
## Pillai            1    0.5622    10.91      2     17 0.000893 ***
## Wilks             1    0.4378    10.91      2     17 0.000893 ***
## Hotelling-Lawley  1    1.2841    10.91      2     17 0.000893 ***
## Roy               1    1.2841    10.91      2     17 0.000893 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

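Either way, the two groups differ, and group O appears to have the larger misalignment (mean distance from the origin 4.15 vs. 2.29 for group C).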

Source article: r-statistics-fanの日記 (something like a statistics enthusiast's notebook), "アウトカムが多変量の場合の解析" (Analysis when the outcome is multivariate)
