Example 2 for Wright, Strubler & Vallano
(2011) Legal and Criminological Psychology
This page goes through the R code and the R output in more detail than the paper. As time goes on, if people want more done we can add more here. Following the practice in the paper, dark will be R code and light will be output. Comments are written in purple. Here is a brief outline which allows you to go where you want in this page.
NOTE: This was run in R2.81 to match with the output in the paper. A couple of the models are having problems in R2.90, but I assume that will be sorted when the add-on packages are updated.
accessing data and rounding data
Precision of theta comparing Poisson and negative binomial hurdle models
Model selection of hurdle models
Finding predicted values of hurdle model
Estimating theta for zero inflated negative binomial models
Predicted values for zero inflated models
> # Example 2 for Wright et al., 2011, Legal and Criminological Psychology
I like this graph. It shows the three approaches and what each assumes the underlying processes are. As far as creating it in R it took time to get the sizes about right and everything to about where I wanted it. There is a lot of turning axes on and off, printing things in white, and stuff like that, but nothing very tricky.
par(mfrow=c(1,3))
dollars <- seq(0,30000)
plot(c(-10000,30000),c(0,.3),xlab=" Dollars awarded",cex.lab=1.6,cex.axis=1.6,
col="white",ylab="",yaxt="n",bty="n",yaxs="i")
#lines(dollars,model1)
lines(spline(seq(0,30000,1000),dnbinom(seq(0,30),3,.3)))
# yes, the polygon does not look like a diamond. Changing the graphics
# window will adjust this.
polygon(c(3900,600,3900,7200),c(.15,.18,.21,.18),lwd=1.5)
text(3800,.18,"?",cex=1.6)
text(4000,.23,"Is liable?",cex=1.6)
arrows(-2800,.18,200,.18,length=.1)
arrows(7750,.18,10750,.18,length=.1)
arrows(3900,.145,3900,.11,length=.1)
text(900,.135,"Yes",cex=1.6)
text(20000,.18,"No, 0$ award",cex=1.6)
lines(c(0,0),c(0,dnbinom(0,3,.3)))
#title("Hurdle model")
# requires erasing the -10000 in some graphics tool.
dollars <- seq(0,30000)
plot(c(-3000,35000),c(0,.3),xlab=" Dollars awarded",cex.lab=1.6,cex.axis=1.6,
col="white",ylab="",yaxt="n",bty="n",yaxs="i")
#lines(dollars,model1)
lines(spline(seq(200,30200,1000),dnbinom(seq(0,30),3,.3)))
lines(c(-200,-200),c(0,.1))
lines(c(-200,200),c(.1,.1))
lines(c(200,200),c(.1,.027))
rect(-4000,.15,4000,.178)
text(100,.163,"Pop 1",cex=1.6)
arrows(0,.145,0,.11,length=.1)
rect(8900,.15,16900,.178)
text(13000,.163,"Pop 2",cex=1.6)
arrows(11000,.145,8000,.09,length=.1)
#title("Zero-inflated model")
dollars2 <- seq(-10000,30000)
dollars2b <- seq(-10000,0)
model2 <- dnorm(dollars2,10000,7000)
model2b <- dnorm(dollars2b,10000,7000)
plot(c(-10000,30000),c(0,.0001),xlab=" Dollars awarded",cex.lab=1.6,cex.axis=1.6,
col="white",ylab="",yaxt="n",bty="n",yaxs="i")
lines(dollars2,model2)
polygon(c(dollars2b,0,-10000),c(model2b,0,0),density=20)
abline(v=0,lty=3)
text(-12000,.00004,"Not liable\n0$ award",pos=4,cex=1.6)
text(0,.00007,"Amount awarded",pos=4,cex=1.6)
rect(-500,.00006,500,.00014,col="white",border=NA)
#title("Tobit model")

> library(pscl);library(AER)
> library(e1071);library(boot)
> library(Hmisc)
You will need to install.packages if you have not already done so. pscl is for hurdle and zero-inflated models, AER is for Tobit models (and it just sends stuff to the slots in the surv function). e1071 is for skewness, boot for rugged shoes, and Hmisc is for errbar functions.
> zeroes <-
+ read.table("http://www.fiu.edu/~dwright/jurstat/ex2zeroes.DAT",
+ header=TRUE)
> attach(zeroes)
> names(zeroes)
[1] "Participant" "Expert" "Injury" "Econ" "NonEcon"
[6] "TotComp" "Punative"
This rounds values up to the next thousand. Why you ask? Well, it means all 0s are really 0s.
> TotComp <- round((TotComp+499)/1000)
[1] 1.001835
So, the value is fairly skewed. Part of this is because of the zeroes, but below skewness is found for the non-zero data and it has skewness around .8.
> skboot <- boot(TotComp,function(x,i) skewness(x[i]),R=2000)
> boot.ci(skboot)
BOOTSTRAP CONFIDENCE INTERVAL
CALCULATIONS
Based on 2000 bootstrap replicates
CALL :
boot.ci(boot.out = skboot)
Intervals :
Level Normal Basic
95% ( 0.734, 1.276 ) ( 0.718, 1.254 )
Level Percentile BCa
95% ( 0.750, 1.286 ) ( 0.755, 1.298 )
Calculations and Intervals on Original Scale
If you have been wondering about the warning below, estimates of the variance are required for studentized intervals. This is applicable for certain types of bootstraps (e.g., of stratified data), but not those done here. Chapter 5 of Davison and Hinkley (the book on which boot is based) discusses these, as do Chapters 12-14 of Efron and Tibshirani.
Warning message:
In boot.ci(skboot) : bootstrap variances needed for studentized intervals
>
> par(mfrow=c(1,2))
> hist(TotComp,xlab="Total Compensation in $1000s",cex.axis=1.3,cex.lab=1.3,
+ main="",breaks=0:max(TotComp))
> mx <- tapply(TotComp,list(Expert,Injury),mean)
> ebarvals <- unlist(tapply(TotComp,list(Expert,Injury),smean.cl.boot))
> dim(ebarvals) <- c(3,2,3)
> errbar(c("Mild-Extreme","Severe-Extreme","Mild-Minimum","Severe-Minimum",
+ "Mild-Moderate","Severe-Moderate"),
+ ebarvals[1,,],ebarvals[3,,],ebarvals[2,,],cex.axis=1.3,cex.lab=1.3,
+ ylab="Total Compensation in $1000s")
> mtext(expression(underline("Expert -
Injury")),side=2,las=2,padj=-6,adj=.5,cex=1.3)
> par(mfrow=c(1,1))
>
If you are interested in using R and making within subject confidence intervals, see here. There was some trial and error used making this graph to figure out the unlist was necessary. There is not much detail on the bootstrap method used in smean.cl.boot, but this could be replaced by those found with boot. If you are interested in a very introductory introduction to bootstrapping, see here.

>
> summary(aov(TotComp ~ Expert*Injury))
Df
Sum Sq Mean Sq F value Pr(>F)
Expert 1
86952 86952 11.0728 0.001046 **
Injury 2
110004 55002 7.0042
0.001153 **
Expert:Injury 2 29855
14928 1.9009 0.152172
Residuals 196 1539133
7853
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
So, you could say their are statistically significant main effects for Expert, F(1,196) = 11.07, p = .001, and for Injury, F(2,196) = 7.00, p = .001, but a non-significant interaction, F(2,196) = 1.90, p = .15. Summary does not automatically print partial eta squared, but you can divide the Sum of Squares by it plus the residual sum of squares. So, for Expert this is 86952/(86952+1539133) = .05.
>
>
> htot1p <- hurdle(TotComp ~ Expert*Injury|Expert*Injury,dist="poisson")
> htot1nb <- hurdle(TotComp ~ Expert*Injury|Expert*Injury,dist="negbin")
> summary(htot1p)
So this shows when everything is included in the model lots of things predict the count variable but nothing predicts the hurdle variable.
Call:
hurdle(formula = TotComp ~ Expert * Injury | Expert * Injury, dist = "poisson")
Count model coefficients (truncated poisson with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept)
4.47896 0.02013 222.510 < 2e-16 ***
ExpertSevere
0.54763 0.02446 22.391 < 2e-16 ***
InjuryMinimal
-0.29324 0.03112 -9.422 < 2e-16 ***
InjuryModerate
-0.32588 0.03142 -10.372 < 2e-16 ***
ExpertSevere:InjuryMinimal -0.30695 0.04006 -7.663
1.81e-14 ***
ExpertSevere:InjuryModerate 0.02165 0.03843
0.564 0.573
Zero hurdle model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept)
2.2336
0.6075
3.677
0.000236 ***
ExpertSevere
1.2928
1.1826 1.093
0.274310
InjuryMinimal
-0.5472
0.7785 -0.703
0.482138
InjuryModerate
-0.8837
0.7409 -1.193
0.232994
ExpertSevere:InjuryMinimal
-1.7628
1.3407 -1.315
0.188579
ExpertSevere:InjuryModerate
-0.8509
1.3460
-0.632
0.527256
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Number of iterations in BFGS optimization: 12
Log-likelihood: -7716 on 12 Df
> summary(htot1nb)
With the negative binomial the significance goes down because the standard errors are much higher.
Call:
hurdle(formula = TotComp ~ Expert * Injury | Expert * Injury, dist = "negbin")
Count model coefficients (truncated negbin with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept)
4.44886 0.22021 20.203 <2e-16 ***
ExpertSevere
0.55785 0.29694 1.879 0.0603 .
InjuryMinimal
-0.30079 0.31402 -0.958 0.3381
InjuryModerate
-0.33439 0.31403 -1.065 0.2870
ExpertSevere:InjuryMinimal -0.31085 0.43425 -0.716
0.4741
ExpertSevere:InjuryModerate 0.02502 0.42840
0.058 0.9534
Log(theta)
-0.30229 0.12325 -2.453 0.0142 *
Zero hurdle model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept)
2.2336 0.6075 3.677 0.000236 ***
ExpertSevere
1.2928 1.1826 1.093 0.274310
InjuryMinimal
-0.5472 0.7785 -0.703 0.482138
InjuryModerate
-0.8837 0.7409 -1.193 0.232994
ExpertSevere:InjuryMinimal -1.7628 1.3407
-1.315 0.188579
ExpertSevere:InjuryModerate -0.8509 1.3460
-0.632 0.527256
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Theta: count = 0.7391
Number of iterations in BFGS optimization: 15
Log-likelihood: -1030 on 13 Df
>
>
> lowerlogtheta <- log((htot1nb)$theta) - 2* htot1nb$SE.logtheta
> upperlogtheta <- log((htot1nb)$theta) + 2* htot1nb$SE.logtheta
> ltheta <- exp(lowerlogtheta)
> utheta <- exp(upperlogtheta)
> print(c(ltheta,htot1nb$theta,utheta))
This is the theta (θ) from the text and its asymptotic approx. 95% confidence interval.
count count
count
0.5776477 0.7391271 0.9457475
> thevars <- as.data.frame(cbind(TotComp,Expert,Injury))
> theta <- function(x,i)
+ hurdle(x$TotComp[i]~ x$Expert[i]*x$Injury[i]
+ |x$Expert[i]*x$Injury[i],dist="negbin")$theta
> thetaboot <- boot(thevars,theta, R=2000)
> boot.ci(thetaboot,type="bca")
The bootstrap confidence interval is slightly smaller.
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 2000 bootstrap replicates
CALL :
boot.ci(boot.out = thetaboot, type = "bca")
Intervals :
Level BCa
95% ( 0.5533, 0.8807 )
Calculations and Intervals on Original Scale
>
>
>
> htot2nb <- hurdle(TotComp ~ Expert*Injury|Expert+Injury,dist="negbin")
> htot3nb <- hurdle(TotComp ~ Expert+Injury|Expert+Injury,dist="negbin")
> htot4nb <- hurdle(TotComp ~ Expert+Injury|Injury,dist="negbin")
> htot5nb <- hurdle(TotComp ~ Expert|Injury,dist="negbin")
> htot6nb <- hurdle(TotComp ~ Expert|1,dist="negbin")
> htot7nb <- hurdle(TotComp ~ 1|Injury,dis="negbin")
We'll only print for the model reported in the paper. The others can be printed if you so desire.
> summary(htot5nb); logLik(htot5nb);
This is an interesting result (in our humble opinions). If somebody has experienced injury, they should get an award, but if they haven't then they should not. The amount of the award should be based on what is typical. Thus, the defendant should pay for the typical reaction.
Call:
hurdle(formula = TotComp ~ Expert | Injury, dist = "negbin")
Count model coefficients (truncated negbin with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.2478
0.1314 32.32 < 2e-16 ***
ExpertSevere 0.5031 0.1803
2.79 0.00527 **
Log(theta) -0.3385
0.1244 -2.72 0.00653 **
Zero hurdle model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.7408
0.5159 5.313 1.08e-07 ***
InjuryMinimal -1.3168 0.6013
-2.190 0.0285 *
InjuryModerate -1.1827 0.6058
-1.952 0.0509 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Theta: count = 0.7129
Number of iterations in BFGS optimization: 9
Log-likelihood: -1034 on 6 Df
'log Lik.' -1033.790 (df=6)
> chis <- 2*c(logLik(htot1p)-logLik(htot1nb),
+ logLik(htot1nb)-logLik(htot2nb),
+ logLik(htot2nb)-logLik(htot3nb),
+ logLik(htot3nb)-logLik(htot4nb),
+ logLik(htot4nb)-logLik(htot5nb),
+ logLik(htot5nb)-logLik(htot6nb),
+ logLik(htot5nb)-logLik(htot7nb))
> chis
[1] -1.337263e+04 2.262532e+00 7.132639e-01 1.655285e-01
4.730927e+00 6.327748e+00
[7] 7.481845e+00
> pchisq(10.5907839,1,lower.tail=F)
[1] 0.001136528
> pchisq(6.3277484,2,lower.tail=F)
This shows that the injury level effect is significant.
[1] 0.04226169
The following are just the predicted cell and marginal values.
> tapply(predict(htot5nb,type="zero"),Injury,mean)
Extreme
Minimal Moderate
0.9702428 0.8325145 0.8534447
> tapply(predict(htot5nb,type="count"),Expert,mean)
Mild Severe
69.95031 115.68160
> tapply(predict(htot5nb,type="response"),list(Expert,Injury),mean)
Extreme Minimal Moderate
Mild 68.28922 58.58998 60.05237
Severe 111.62340 95.76933 98.15971
> zs2 <- tapply(predict(htot5nb,type="zero"),list(Expert,Injury),mean)
> zs2
Extreme Minimal Moderate
Mild 0.9762533 0.8375943 0.8585005
Severe 0.9649192 0.8278700 0.8485334
> cs2 <- tapply(predict(htot5nb,type="count"),list(Expert,Injury),mean)
> cs2
Extreme Minimal Moderate
Mild 69.95031 69.95031 69.95031
Severe 115.68160 115.68160 115.68160
>
> ztot1p <- zeroinfl(TotComp ~ Expert*Injury|Expert*Injury,dist="poisson")
> ztot1nb <- zeroinfl(TotComp ~ Expert*Injury|Expert*Injury,dist="negbin")
> summary(ztot1p)
The zero inflated models. As discussed in the paper the results are similar to the hurdle model.
Call:
zeroinfl(formula = TotComp ~ Expert * Injury | Expert * Injury, dist = "poisson")
Count model coefficients (poisson with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept)
4.47896 0.02013 222.510 < 2e-16 ***
ExpertSevere
0.54763 0.02446 22.391 < 2e-16 ***
InjuryMinimal
-0.29324 0.03112 -9.422 < 2e-16 ***
InjuryModerate
-0.32588 0.03142 -10.372 < 2e-16 ***
ExpertSevere:InjuryMinimal -0.30695
0.04006 -7.663 1.81e-14 ***
ExpertSevere:InjuryModerate 0.02166
0.03843 0.564 0.573
Zero-inflation model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept)
-2.2336 0.6075 -3.677 0.000236 ***
ExpertSevere
-1.2928 1.1826 -1.093 0.274311
InjuryMinimal
0.5472 0.7785 0.703 0.482142
InjuryModerate
0.8837 0.7409 1.193 0.232997
ExpertSevere:InjuryMinimal 1.7628
1.3407 1.315 0.188582
ExpertSevere:InjuryModerate 0.8509
1.3460 0.632 0.527258
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Number of iterations in BFGS optimization: 11
Log-likelihood: -7716 on 12 Df
> summary(ztot1nb)
Call:
zeroinfl(formula = TotComp ~ Expert * Injury | Expert * Injury, dist = "negbin")
Count model coefficients (negbin with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept)
4.44885 0.22021 20.203 <2e-16 ***
ExpertSevere
0.55784 0.29694 1.879 0.0603 .
InjuryMinimal
-0.30080 0.31402 -0.958 0.3381
InjuryModerate
-0.33439 0.31403 -1.065 0.2870
ExpertSevere:InjuryMinimal -0.31084 0.43425
-0.716 0.4741
ExpertSevere:InjuryModerate 0.02502 0.42840
0.058 0.9534
Log(theta)
-0.30228 0.12325 -2.453 0.0142 *
Zero-inflation model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept)
-2.5995 0.8669 -2.999 0.00271 **
ExpertSevere
-2.0953 3.4067 -0.615 0.53852
InjuryMinimal
0.6432 1.0536 0.610 0.54153
InjuryModerate
1.0463 0.9928 1.054 0.29195
ExpertSevere:InjuryMinimal 2.6902
3.5014 0.768 0.44229
ExpertSevere:InjuryModerate 1.6668
3.4892 0.478 0.63287
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Theta = 0.7391
Number of iterations in BFGS optimization: 22
Log-likelihood: -1030 on 13 Df
'log Lik.' -1029.854 (df=13)
> chisq <- -2*(logLik(ztot1p)[1]-logLik(ztot1nb)[1])
> pchisq(chisq,1,lower.tail=FALSE)
[1] 0
>
> lowerlogtheta <- log((ztot1nb)$theta) - 2* ztot1nb$SE.logtheta
> upperlogtheta <- log((ztot1nb)$theta) + 2* ztot1nb$SE.logtheta
> ltheta <- exp(lowerlogtheta)
> utheta <- exp(upperlogtheta)
> thevars <- as.data.frame(cbind(TotComp,Expert,Injury))
> theta <- function(x,i)
+ zeroinfl(x$TotComp[i]~ x$Expert[i]*x$Injury[i]
+ |x$Expert[i]*x$Injury[i],dist="negbin")$theta
> thetaboot <- boot(thevars,theta, R=2000)
> boot.ci(thetaboot,type="bca")
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 2000 bootstrap replicates
CALL :
boot.ci(boot.out = thetaboot, type = "bca")
Intervals :
Level BCa
95% ( 0.5642, 0.8885 )
Calculations and Intervals on Original Scale
>
>
> ztot2nb <- zeroinfl(TotComp ~ Expert*Injury|Expert+Injury,dist="negbin")
> ztot3nb <- zeroinfl(TotComp ~ Expert+Injury|Expert+Injury,dist="negbin")
> ztot4nb <- zeroinfl(TotComp ~ Expert+Injury|Injury,dist="negbin")
> ztot5nb <- zeroinfl(TotComp ~ Expert|Injury,dist="negbin")
> ztot6nb <- zeroinfl(TotComp ~ Expert|1,dist="negbin")
> ztot7nb <- zeroinfl(TotComp ~ 1|Injury,dis="negbin")
Again, we'll just print the information for the best fitting model.
> summary(ztot5nb)
Call:
zeroinfl(formula = TotComp ~ Expert | Injury, dist = "negbin")
Count model coefficients (negbin with log link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) 4.2404 0.1315
32.249 < 2e-16 ***
ExpertSevere 0.5143 0.1801
2.855 0.00431 **
Log(theta) -0.3437 0.1251
-2.746 0.00603 **
Zero-inflation model coefficients (binomial with logit link):
Estimate Std. Error z value Pr(>|z|)
(Intercept) -3.606
1.269 -2.842 0.00448 **
InjuryMinimal 2.007
1.295 1.550 0.12118
InjuryModerate 1.834
1.297 1.414 0.15725
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Theta = 0.7092
Number of iterations in BFGS optimization: 13
Log-likelihood: -1034 on 6 Df
> summary(ztot6nb); logLik(ztot6nb); summary(ztot7nb); logLik(ztot7nb)
> chisz <- 2*c(logLik(ztot1p)-logLik(ztot1nb),
+ logLik(ztot1nb)-logLik(ztot2nb),
+ logLik(ztot2nb)-logLik(ztot3nb),
+ logLik(ztot3nb)-logLik(ztot4nb),
+ logLik(ztot4nb)-logLik(ztot5nb),
+ logLik(ztot5nb)-logLik(ztot6nb),
+ logLik(ztot5nb)-logLik(ztot7nb))
> chisz
[1] -1.337263e+04 1.970750e+00 8.865592e-01 9.063249e-03
4.615409e+00 6.584113e+00
[7] 7.872315e+00
>
> chis <- 2*c(logLik(ztot5nb)-logLik(ztot6nb),
+ logLik(ztot5nb)-logLik(ztot7nb))
> pchisq(6.628847,2,lower.tail=F)
[1] 0.036355
> pchisq(11.041633,1,lower.tail=F)
[1] 0.0008908835
>
>
> tapply(predict(ztot5nb,type="zero"),Injury,mean)
Extreme Minimal
Moderate
0.02644098 0.16812630 0.14534052
> zs <- c(exp(-3.606)/(1+exp(-3.606)),
+ exp(1.834-3.606)/(1+exp(1.834-3.606)),
+ exp(2.007-3.606)/(1+exp(2.007-3.606)))
> zs
[1] 0.02644210 0.14529379 0.16812143
> tapply(predict(ztot5nb,type="count"),Expert,mean)
Mild
Severe
69.43599 116.12768
> cs <- exp(c(4.2404,4.2404+.5143))
> cs
[1] 69.43562 116.12881
> tapply(predict(ztot5nb,type="response"),list(Expert,Injury),mean)
Extreme Minimal Moderate
Mild 67.60003 57.76197
59.34412
Severe 113.05715 96.60356 99.24962
>
>
>
> ttot1 <- tobit(TotComp ~ Expert*Injury)
> ttot2 <- tobit(TotComp ~ Expert+Injury)
> ttot3 <- tobit(TotComp ~ Expert)
> ttot4 <- tobit(TotComp ~ Injury)
Just printing main effects model.
> summary(ttot2); logLik(ttot2)
Call:
tobit(formula = TotComp ~ Expert + Injury)
Observations:
Total Left-censored Uncensored Right-censored
202 29
173 0
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 89.7357 14.4056
6.229 4.69e-10 ***
ExpertSevere 43.4374 14.1915
3.061 0.002207 **
InjuryMinimal -64.4723 17.4431 -3.696
0.000219 ***
InjuryModerate -48.6291 17.2794 -2.814
0.004889 **
Log(scale) 4.5939
0.0551 83.378 < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Scale: 98.88
Gaussian distribution
Number of Newton-Raphson Iterations: 3
Log-likelihood: -1067 on 5 Df
Wald-statistic: 24.62 on 3 Df, p-value: 1.8573e-05
'log Lik.' -1067.388 (df=5)
> chist <- 2*c(logLik(ttot4)-logLik(ttot2),
+ logLik(ttot3)-logLik(ttot2),
+ logLik(ttot2)-logLik(ttot1))
> chist
[1] -9.135286 -14.497530 -4.014352
> pchisq(4.014352,2,lower.tail=FALSE)
[1] 0.1343676
> skewness(TotComp[TotComp != 0])
[1] 0.816546
>
> sqTotComp <- round(sqrt(TotComp))
> hist(sqTotComp,breaks=0:max(sqTotComp))

>
>
> TotComp3 <- round(TotComp^(1/3))
And other transformations were also tried. Splines can be handy for the predictor variables, but it was decided that introducing splines would have been too much for an already long paper.