WonderLand

Linear Regression Assignment 8

The eighth assignment for Linear Regression. It is written in R Markdown, a format supported by RStudio that combines formulas, plots, and runnable code chunks in one document.

Recommended: click here for the HTML version of the assignment, which shows the code as well as the plots.

A PDF version of this assignment is also available on GitHub. If you can get past the firewall, you can view it below:

1

a

# Read the data (the working directory is the folder that holds hw8.csv)
setwd('~/Desktop/三春/5线性回归分析/作业/HW8/')
dat <- read.csv("hw8.csv")
X1 <- dat$x1
X2 <- dat$x2
X3 <- dat$x3
X4 <- dat$x4
Y <- dat$y
# Stem-and-leaf plots of the four predictors
stem(X1)
stem(X2)
stem(X3)
stem(X4)

X3 appears to be the most tightly concentrated of the four predictors, and the box plot below supports this. X1 has two outliers, and X2 is asymmetric.

library(vioplot)
vioplot(X1,X2,X3,X4,col="gold")
boxplot(X1,X2,X3,X4)

b

pairs(~X1+X2+X3+X4+Y,data=dat, 
   main="Scatterplot Matrix")
cor(dat)

X3 and X4 are clearly highly correlated.
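Reading a correlation matrix by eye gets harder as the number of variables grows. The following is a minimal sketch of listing the highly correlated pairs programmatically. Because `hw8.csv` is not reproduced here, it uses the built-in `mtcars` data as a stand-in, and the 0.8 cutoff is an arbitrary assumption, not a value from the assignment.

```r
# Stand-in data: built-in mtcars (assumption; the post itself uses hw8.csv).
vars <- mtcars[, c("mpg", "disp", "hp", "wt", "qsec")]
r <- cor(vars)

# Look only at the upper triangle so each pair is reported once.
# The 0.8 threshold is an illustrative choice.
idx <- which(upper.tri(r) & abs(r) > 0.8, arr.ind = TRUE)

high_pairs <- data.frame(
  var1 = rownames(r)[idx[, "row"]],
  var2 = colnames(r)[idx[, "col"]],
  r    = r[idx]
)
print(high_pairs)
```

With the assignment data, the same pattern could be applied to `cor(dat)` directly.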

c

Fit = lm(Y~X1+X2+X3+X4, data=dat)
anova(Fit)
summary(Fit)

$\hat Y = -124.38 + 0.30 x_1 + 0.05 x_2 + 1.31 x_3 + 0.52 x_4$

It seems X2 should be excluded from the model, since its p-value is 0.4038.
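One way to check whether a single predictor can be dropped is a partial F-test between the reduced and the full model. The sketch below is illustrative only: it uses the built-in `mtcars` data as a stand-in for `hw8.csv`, and the model terms (`wt`, `hp`, `qsec`, `drat`) are hypothetical choices, not the assignment's variables.

```r
# Stand-in data: built-in mtcars (assumption; the post itself uses hw8.csv).
full    <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
reduced <- update(full, . ~ . - drat)

# Partial F-test: H0 says the coefficient of the dropped term is zero.
cmp <- anova(reduced, full)
print(cmp)

# When exactly one term is dropped, the F-test p-value equals the
# t-test p-value that summary(full) reports for that term.
p_f <- cmp[2, "Pr(>F)"]
p_t <- coef(summary(full))["drat", "Pr(>|t|)"]
print(all.equal(p_f, p_t))
```

For the assignment's model, the analogous call would compare `update(Fit, . ~ . - X2)` against `Fit`.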

2

a

library(leaps)
best <- function(model, ...) 
{
  subsets <- regsubsets(formula(model), model.frame(model), ...)
  subsets <- with(summary(subsets),
                  cbind(p = as.numeric(rownames(which)), which, adjr2))

  return(subsets)
}  
round(best(Fit, nbest = 6), 4)

The four best subset regression models are:

| Subset | $R^2_{a,p}$ |
|---|---|
| x1, x3, x4 | 0.956 |
| x1, x2, x3, x4 | 0.955 |
| x1, x3 | 0.927 |
| x1, x2, x3 | 0.925 |

b

Mallows' $C_p$ criterion, $AIC_p$, and $SBC_p$ can also be used as criteria to select the best model. All of them penalize adding predictors.
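For reference, here is a sketch of how these criteria are usually defined in textbooks. The notation is an assumption, not taken from the assignment: $n$ is the number of observations, $p$ is the number of parameters in the candidate model (intercept included), $SSE_p$ is its error sum of squares, $SSTO$ is the total sum of squares, and $MSE(X_1,\dots,X_{P-1})$ is the mean squared error of the full model with all $P-1$ candidate predictors.

$$
\begin{aligned}
R^2_{a,p} &= 1 - \frac{n-1}{n-p}\cdot\frac{SSE_p}{SSTO} \\
C_p &= \frac{SSE_p}{MSE(X_1,\dots,X_{P-1})} - (n - 2p) \\
AIC_p &= n \ln SSE_p - n \ln n + 2p \\
SBC_p &= n \ln SSE_p - n \ln n + p \ln n
\end{aligned}
$$

Smaller values of $AIC_p$ and $SBC_p$ are preferred, and a good model has a small $C_p$ that is close to $p$. The $SBC_p$ penalty exceeds the $AIC_p$ penalty whenever $\ln n > 2$, that is, for $n \ge 8$. Note that the `p` column returned by the `best` helper counts predictors only, so it is one less than the $p$ used here. The object returned by `summary()` on a `regsubsets` fit also carries `cp` and `bic` components, which could be bound next to `adjr2` in that helper.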

3

a

library(MASS)
Null = lm(Y ~ 1, dat)
addterm(Null, scope = Fit, test="F")
NewMod = update( Null, .~. + X3)
addterm( NewMod, scope = Fit, test="F" )
NewMod = update( NewMod, .~. + X1)
dropterm(NewMod , test = "F")
addterm( NewMod, scope = Fit, test="F" )
NewMod = update( NewMod, .~. + X4)
dropterm( NewMod, test = "F" )
addterm( NewMod, scope = Fit, test="F" )

b

Forward stepwise regression selects X1, X3, and X4, the same variables chosen earlier under the adjusted $R^2$ criterion.
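The manual `addterm()`/`dropterm()` sequence can also be automated. The sketch below uses base R's `step()`, with two caveats: `step()` ranks candidate models by AIC rather than by the F-tests used above, so it may not always pick the same terms, and the built-in `mtcars` data and its variables stand in for `hw8.csv`, which is not reproduced here.

```r
# Stand-in data: built-in mtcars (assumption; the post itself uses hw8.csv).
null_fit <- lm(mpg ~ 1, data = mtcars)
full_fit <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)

# direction = "both" adds one term at a time and, after each addition,
# rechecks whether a previously added term should be removed.
# Selection is by AIC, not by F-tests.
sel <- step(null_fit,
            scope = list(lower = formula(null_fit),
                         upper = formula(full_fit)),
            direction = "both",
            trace = 0)  # trace = 0 suppresses the step-by-step log

print(formula(sel))
```

For the assignment's models, the analogous call would start from `Null` with `formula(Fit)` as the upper scope.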

