This is the fourth assignment for Linear Regression. It is written in R Markdown, a format supported by RStudio that combines formulas, plots, and executable code chunks in one document.
Recommended: click here for the HTML version of the assignment, which shows the code alongside the plots.
You can also find the PDF version of this assignment on GitHub. If you can get past the firewall, you can read it below:
a
boxplot(resid(hardness.fit), main = "Box Plot of Hardness Data", ylab = "Residuals")
The mean of the residuals is zero.
b
yhat <- fitted(hardness.fit); resid <- resid(hardness.fit)
One residual, 5.575, is somewhat larger than the others.
c
hardness.stdres <- rstandard(hardness.fit)
StdErr <- summary(hardness.fit)$sigma
With n = 16, Table B.6 gives a critical value of 0.941 (at the 0.05 significance level) for the coefficient of correlation between the ordered residuals and their expected values under normality. Since 0.9916733 > 0.941, the assumption of normally distributed error terms appears reasonable.
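The correlation used in this test can be sketched in R as follows. This is a minimal sketch, assuming the expected value of the k-th ordered residual under normality is $\sqrt{MSE}\,z[(k-0.375)/(n+0.25)]$; the helper name `normal_corr` and the simulated residuals are illustrative and not part of the assignment.

```r
# Hypothetical helper: correlation between the ordered residuals and their
# expected values under normality, for a model with p estimated coefficients.
normal_corr <- function(res, p = 2) {
  n <- length(res)
  root_mse <- sqrt(sum(res^2) / (n - p))   # residual standard error
  k <- seq_len(n)                          # ranks 1..n of the ordered residuals
  expected <- root_mse * qnorm((k - 0.375) / (n + 0.25))
  cor(sort(res), expected)
}

# Illustrative residuals only (simulated, not the hardness data).
set.seed(1)
fake_res <- rnorm(16, sd = 3)
r <- normal_corr(fake_res)
r > 0.941   # compare with the Table B.6 critical value for n = 16
```

With the fitted model from this assignment, the call would presumably be `normal_corr(resid(hardness.fit))`.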
4.5
a
interval <- function(dat){
From the R output, $b_0 = 168.6$, $s(b_0) = 2.65702$, $b_1 = 2.03438$, and $s(b_1) = 0.09039$. With $g = 2$ intervals and a 90% family confidence coefficient, the Bonferroni multiplier is $t(1 − 0.10/4; 14) = t(0.975; 14) = 2.145$. The Bonferroni joint confidence intervals are therefore $168.6 ± 2.145(2.65702) = [162.901, 174.299]$ for $β_0$ and $2.03438 ± 2.145(0.09039) = [1.840, 2.228]$ for $β_1$. In repeated samples, both intervals will contain their coefficients at least 90% of the time.
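The arithmetic above can be reproduced directly from the reported estimates and standard errors. The following is a small sketch using only those numbers; the variable names are illustrative.

```r
# Estimates and standard errors as reported in the text above.
est <- c(b0 = 168.6, b1 = 2.03438)
se  <- c(b0 = 2.65702, b1 = 0.09039)

alpha <- 0.10          # family significance level
g     <- length(est)   # number of simultaneous intervals
df    <- 14            # n - 2 with n = 16

# Bonferroni multiplier t(1 - alpha / (2g); df)
B <- qt(1 - alpha / (2 * g), df)

ci <- cbind(lower = est - B * se, upper = est + B * se)
print(round(ci, 3))
```

With the fitted model available, `confint(hardness.fit, level = 1 - alpha / g)` should give the same limits, since each individual interval is an ordinary 95% interval.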
c
The 90% family confidence coefficient means that, in repeated samples, both intervals will contain their respective parameters at least 90% of the time.
Equivalently, at least one of the intervals will miss its parameter no more than 10% of the time. The Bonferroni bound does not allow a more specific statement than this.
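The "at least 90%" wording follows from the Bonferroni inequality. As a short worked step, write $A_1$ and $A_2$ (labels introduced here for illustration) for the events that the individual 95% intervals cover $β_0$ and $β_1$:

$$P(A_1 \cap A_2) \ge 1 - P(\bar A_1) - P(\bar A_2) = 1 - 0.05 - 0.05 = 0.90$$

The bound holds whatever the dependence between the two intervals, which is why only a lower limit on the joint coverage can be stated.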
4.9
a
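For the Bonferroni procedure with $g = 3$ mean responses and a 90% family confidence coefficient, use $\hat Y_h ± B\,s\{\hat Y_h\}$, where $\hat Y_h = b_0 + b_1 X_h$ and $B = t(1 − 0.10/6; 14) = 2.35982$.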
meaninterval <- function(X_h){
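Using this function, the intervals reported for $X_h$ = 20, 30, 40 are [215.1106, 228.3704], [245.9163, 250.7052], and [265.1384, 284.6236], respectively. These limits should be rechecked: with $b_0 = 168.6$ and $b_1 = 2.03438$ from 4.5, the fitted values $b_0 + b_1 X_h$ are 209.29, 229.63, and 249.98, and each interval should be centered on its fitted value.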
The 90% family confidence coefficient means that all three mean hardness values will lie in their respective intervals at least 90% of the time. Equivalently, at least one of them will fall outside its interval no more than 10% of the time. We cannot get more specific than this.
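Bonferroni intervals for several mean responses can be sketched with `predict()` and its `se.fit` output. This is a minimal sketch under stated assumptions: the helper name `bonferroni_mean_ci`, the toy data, and `toy.fit` are hypothetical and not from the assignment. With the real model one would pass `hardness.fit` and a data frame whose column name matches the predictor.

```r
# Hypothetical helper: Bonferroni intervals for g mean responses,
# Y_hat +/- t(1 - alpha / (2g); df) * s{Y_hat}.
bonferroni_mean_ci <- function(fit, newdata, family_level = 0.90) {
  g    <- nrow(newdata)
  pred <- predict(fit, newdata = newdata, se.fit = TRUE)
  B    <- qt(1 - (1 - family_level) / (2 * g), pred$df)
  cbind(newdata,
        fit   = pred$fit,
        lower = pred$fit - B * pred$se.fit,
        upper = pred$fit + B * pred$se.fit)
}

# Illustrative data only (simulated, not the hardness data).
set.seed(42)
toy <- data.frame(x = seq(10, 40, by = 2))          # 16 observations
toy$y <- 170 + 2 * toy$x + rnorm(nrow(toy), sd = 3)
toy.fit <- lm(y ~ x, data = toy)

out <- bonferroni_mean_ci(toy.fit, data.frame(x = c(20, 30, 40)))
print(out)
```

Each interval is centered on its fitted value, and the result coincides with ordinary confidence intervals at level `1 - 0.10 / 3`, which gives a quick way to check any hand-written version.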
4.12
galleys.x <- c(7, 12, 10, 10, 14, 25, 30, 25, 18, 10, 4, 6)
a
typos.lm <- lm(cost.y ~ galleys.x - 1)
b
plot(galleys.x, cost.y, xlab = "Galleys", ylab = "Cost")
The regression through the origin appears to fit the data well.
c
historical.norm <- data.frame(galleys.x = 1)
With $X_h = 1$, the mean response $E[Y_h]$ equals $β_1$, the cost per galley.
Alternatives: $H_0: E[Y_h] = β_{10} = 17.50$
$H_A: E[Y_h] ≠ β_{10} = 17.50$
CI: $17.81226≤E[Y_h]≤18.24435$
Decision rule:
If $β_{10}$ falls within the confidence interval for $E[Y_h]$, conclude $H_0$;
If $β_{10}$ does not fall within the confidence interval for $E[Y_h]$, conclude $H_A$
Conclusion:
Since 17.50 < 17.81226, $β_{10}$ falls outside the confidence interval; therefore, conclude $H_A$.
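The confidence limits used in this test can be reproduced from summary quantities. This is a rough sketch with two loudly stated assumptions that are inferred from the reported numbers rather than stated in the assignment: that 4.506806 (quoted in part d) is the residual standard error of the through-origin fit, and that the interval uses a 98% level. The slope 18.0283 is the value implied by the fitted value 180.283 at $X_h = 10$ in part d.

```r
# Galleys as given in the data for this problem.
galleys.x <- c(7, 12, 10, 10, 14, 25, 30, 25, 18, 10, 4, 6)

b1       <- 18.0283     # slope implied by the fitted value 180.283 at X = 10
root_mse <- 4.506806    # ASSUMPTION: residual standard error of the fit
level    <- 0.98        # ASSUMPTION: level implied by the reported limits
df       <- length(galleys.x) - 1   # one parameter, no intercept

# For regression through the origin, s{b1} = sqrt(MSE / sum(X^2)).
se_b1  <- root_mse / sqrt(sum(galleys.x^2))
t_crit <- qt(1 - (1 - level) / 2, df)
ci     <- b1 + c(-1, 1) * t_crit * se_b1

beta10 <- 17.50
reject <- beta10 < ci[1] || beta10 > ci[2]   # TRUE means conclude H_A
print(round(ci, 4))
```

Under these assumptions the limits agree with the interval quoted above to about three decimal places.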
d
newdata.galleys <- data.frame(galleys.x = 10)
$\hat Y_h=180.283$
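$\sqrt{MSE} = 4.506806$, so $s[pred] = 4.506806\sqrt{1 + 10^2/3215} = 4.5764$, where $\sum X_i^2 = 3215$
$180.283 ± 2.718(4.5764)$, with $t(0.99; 11) = 2.718$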
$167.8441≤Y_{h(new)}≤192.722$
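The prediction limits can be checked from summary quantities. This is a minimal sketch, assuming that 4.506806 is the residual standard error $\sqrt{MSE}$ of the through-origin fit and that the interval uses a 98% level; both are inferred from the reported numbers, not stated in the assignment.

```r
# Galleys as given in the data for this problem.
galleys.x <- c(7, 12, 10, 10, 14, 25, 30, 25, 18, 10, 4, 6)

x_new    <- 10
y_hat    <- 180.283     # fitted value reported above
root_mse <- 4.506806    # ASSUMPTION: residual standard error of the fit
t_crit   <- qt(0.99, length(galleys.x) - 1)   # ASSUMPTION: 98% level, 11 df

# For regression through the origin:
# s{pred}^2 = MSE * (1 + X_h^2 / sum(X_i^2))
s_pred   <- root_mse * sqrt(1 + x_new^2 / sum(galleys.x^2))
pred_int <- y_hat + c(-1, 1) * t_crit * s_pred
print(round(c(s_pred = s_pred, lower = pred_int[1], upper = pred_int[2]), 4))
```

With the fitted model available, `predict(typos.lm, newdata.galleys, interval = "prediction", level = 0.98)` should presumably return the same limits.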