-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathHW05emailFall2018Key.Rmd
355 lines (216 loc) · 14.2 KB
/
HW05emailFall2018Key.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
---
title: "Homework 05 Fall 2018"
author: "YourNameHere"
output: html_document
editor_options:
chunk_output_type: console
---
## Homework : ANOVA in CRD and RCBD
Use this homework as an opportunity to make sure you understand the calculations in ANOVA. In this homework you can use R functions or simply do calculations using R as a calculator. For example if you need to calculate the average for th eset of numbers 1, 2, 3, 4, you can use `(1 + 2 + 3 + 4) / 4` or `mean(c(1, 2, 3, 4))`. You will have to store the results in the objects whose names you are given. For example, you maybe given code similar to: `my.avg1 <- # enter the average here`, in which case you would complete this in any of the following ways:
`my.avg1 <- 2.5 # enter the average here`
`my.avg1 <- (1 + 2 + 3 + 4) / 4 # enter the average here`
`my.avg1 <- mean(c(1, 2, 3, 4)) # enter the average here`
`my.avg1 <- sum(c(1, 2, 3, 4)) / lenght(c(1, 2, 3, 4)))) # enter the average here`
We will grade the value in `my.avg1`
If you type numbers as input, keep inmind that your results will be graded as correct if they are within 2% of the true correct result obtained without rounding. If you want to use R to do all calculations without rounding, look for the code in the textbook. Keep in mind that the computer shows you a properly truncated version of any quantity, because frequently the actual values have many many figures to the right of the decimal point.
Run the following chunk before you proceed to do the rest. If you do not have the packages installed, go to the Packages tab, click Install and follow the instructions therein.
```{r echo=FALSE, message=FALSE}
library(emmeans)
library(knitr)
library(dplyr)
library(kableExtra)
library(agricolae)
```
### Completely Randomized Design [50]
A CRD experiment was conducted to study the effect of nitrogen fertilization on sugarbeet yield. There were 6 treatments of increasing fertilization rate (lb/ac): A (0), B (50), C (100), D (150), E (200), F (250) and yields were measured in lb/ac. The data are in the file "BeetRootYield.csv."
#### Read in the data and calculate treatment averages, treatment effects and residuals for all observations. [15]
```{r}
# Download the data from the website and read it into R. Instead of
# "BeetRootYield.csv" you should enter the complete path to wherever you
# downloaded the file. To find the complete path you can enter file.choose()
# and hit Enter in the console. Then follow the file navigation system of your
# OS and double click on the data file. The complete path will appear in the
# console. Do the same for any files you need to read into R to avoid problems
# in the future.
(yield.d <- read.csv("BeetRootYield.csv", header = TRUE)) # read in data
yield.d$rep <- factor(yield.d$rep) # run this line as it is.
## add any code you want to get the averages etc. here. For example:
(averages <- aggregate(yield ~ treat, data = yield.d, FUN = mean))
rownames(averages) <- averages$treat
#### Treatment averages ######## no need to modify this
(YbarA. <- averages["A", "yield"]) # this puts average of treatment A in YbarA.
(YbarB. <- averages["B", "yield"]) # etc.
(YbarC. <- averages["C", "yield"]) # etc.
(YbarD. <- averages["D", "yield"]) # etc.
(YbarE. <- averages["E", "yield"]) # etc.
(YbarF. <- averages["F", "yield"]) # etc.
(Ybar.. <- mean(yield.d$yield))
#### Treatment effects ########
# We use the letter T to represent the greek letter Tau
ThatA. <- # estimated effect of treat. A goes here; ThatA. = YbarA. - Ybar..
ThatB. <- YbarB. - Ybar.. # put estimated effect of treatment B in here
ThatC. <- YbarC. - Ybar.. # put estimated effect of treatment C in here
ThatD. <- YbarD. - Ybar.. # put estimated effect of treatment D in here
ThatE. <- YbarE. - Ybar.. # put estimated effect of treatment E in here
ThatF. <- YbarF. - Ybar.. # put estimated effect of treatment F in here
(Thats <- c(ThatA., ThatB., ThatC., ThatD., ThatE., ThatF.)) # leave as it is
(SST <- length(unique(yield.d$rep)) * sum(Thats^2))
m1 <- lm(yield ~ treat, yield.d)
(my.res <- residuals(m1))
resdls <- my.res # c() # enter the residuals separated by commas and in the order in
# which observations appear in the original data. You can use any method to
# fill in the vector; use the easiest way for you to create the vector resdls.
# There is one residual for each observation. eij = Yij - Ybari., for example,
# eA1 = YA1 - YbarA.
```
#### Create the complete ANOVA table and interpret the F test for treatments. [15]
```{r}
(a1 <- anova(m1))
(SourceOfSS <- c("Treatments", "Residual", "Total")) # leave as it is
(SumOfSquares <- c(a1$'Sum Sq'[1], a1$'Sum Sq'[2], sum(a1$'Sum Sq'))) # enter the SS for treatment, residual and total separated
# by commas and in the order listed.
(degreesOfFreedom <- c(a1$Df[1], a1$Df[2], sum(a1$Df))) # enter the df for treatment, residual and total separated
# by commas and in the order listed.
(MST <- a1$'Mean Sq'[1]) # enter the mean square for treatments here.
(MSE <- a1$'Mean Sq'[2]) # enter the mean square for error or residual here.
(Fcalc <- MST / MSE) # enter the calculated F here.
(Fcrit <- qf(0.05, df1 = a1$Df[1], df2 = a1$Df[2], lower.tail = FALSE)) # enter the value of the critical F here.
## Leave the following lines as they are and run them ########
anovaTable <- data.frame(source = SourceOfSS,
SS = SumOfSquares,
df = degreesOfFreedom,
MS = c(MST, MSE, NA),
Fcalc = c(Fcalc, NA, NA))
kable(anovaTable, digits = c(0, 2, 0, 2, 3)) %>%
kable_styling(full_width = FALSE)
# Check your values against the following table.
# If your values are different, you made a mistake.
anova(m1)
```
Interpret the results and state your null hypothesis and the conclusion.
The calculated F is smaller than the critical F, therefore, we cannot reject the null hypothesis that all means are equal.
#### Calculate 95% confidence intervals for treatments A and C [10]
```{r}
(tcrit <- qt(p = 0.975, df = a1$Df[2])) # enter the t critical here
(S.Ybari. <- sqrt(MSE / length(unique(yield.d$rep)))) # enter the standard error of the averages here
(CIloA <- YbarA. - tcrit * S.Ybari.) # enter the lower boundary of the CI for A
(CIupA <- YbarA. + tcrit * S.Ybari.) # enter the upper boundary of the CI for A
(CIloC <- YbarC. - tcrit * S.Ybari.) # enter the lower boundary of the CI for C
(CIupC <- YbarC. + tcrit * S.Ybari.) # enter the upper boundary of the CI for C
```
#### Use the LSD to determine if treatments A and C differ in yield. [10]
```{r}
(tcrit.AC <- tcrit) # enter the t critical here
(Sdbar.AC <- sqrt(2 * MSE / length(unique(yield.d$rep)))) # enter the standard deviation of the difference in averages.
(LSD.AC <- tcrit.AC * Sdbar.AC) # enter the LSD here.
(dbar.AC <- YbarA. - YbarC.) # enter the difference between treatments here
```
Interpret the results and state your null hypothesis and the conclusion.
The difference between treatment averages is smaller than the LSD, therefore, we cannot reject the null hypothesis that the means of treatments A and C are equal
### Randomized Complete Block Design [50]
The following experiment is presented in [@OstleMensing1975]. Ten treatments consisting of different rations were tested in a feeding trial with steers. The 40 steers available were separated into 4 blocks with 10 steers in each block. The blocks were made on the basis of steer initial weight, prior to the start of the feeding trial. The first block had the heaviest steers, the second block had the next heaviest group and so on. Within blocks, steers received the treatments randomly.
#### Read in the data and calculate treatment averages, treatment effects and residuals for all observations. [15]
```{r}
(gain.d <- read.csv("/Users/eal1/Google Drive/Documents/TeachServ/CLASSES/PLS120/PLS120Book2018a/Datasets/SteerGain.csv", header = TRUE))
# read the file with the data, named SteerGain.csv. See the first exercise to
# find out the path for your file.
# The following line make the block variable into a categorical variable.
(gain.d$block <- factor(gain.d$block)) # run this line as it is.
## add any code you want to get the averages etc. here. For example:
(trt.avgs <- aggregate(gain ~ diet, data = gain.d, FUN = mean))
rownames(trt.avgs) <- tolower(trt.avgs$diet)
#### Treatment averages ########
(Ybara. <- trt.avgs["a", "gain"]) # put average of treatment a in here
(Ybarb. <- trt.avgs["b", "gain"]) # put average of treatment b in here
(Ybarc. <- trt.avgs["c", "gain"]) # put average of treatment c in here
(Ybard. <- trt.avgs["d", "gain"]) # put average of treatment d in here
(Ybare. <- trt.avgs["e", "gain"]) # put average of treatment e in here
(Ybarf. <- trt.avgs["f", "gain"]) # put average of treatment f in here
(Ybarg. <- trt.avgs["g", "gain"]) # put average of treatment f in here
(Ybarh. <- trt.avgs["h", "gain"]) # put average of treatment f in here
(Ybari. <- trt.avgs["i", "gain"]) # put average of treatment f in here
(Ybarj. <- trt.avgs["j", "gain"]) # put average of treatment f in here
(Ybar..rcbd <- mean(gain.d$gain))
#### Treatment effects ########
# We use the letter T to represent the greek letter Tau
Thata. <- Ybara. - Ybar..rcbd # put estimated effect of treatment a in here
Thatb. <- Ybarb. - Ybar..rcbd # put estimated effect of treatment b in here
Thatc. <- Ybarc. - Ybar..rcbd # put estimated effect of treatment c in here
Thatd. <- Ybard. - Ybar..rcbd # put estimated effect of treatment d in here
Thate. <- Ybare. - Ybar..rcbd # put estimated effect of treatment e in here
Thatf. <- Ybarf. - Ybar..rcbd # put estimated effect of treatment f in here
Thatg. <- Ybarg. - Ybar..rcbd # put estimated effect of treatment g in here
Thath. <- Ybarh. - Ybar..rcbd # put estimated effect of treatment h in here
Thati. <- Ybari. - Ybar..rcbd # put estimated effect of treatment i in here
Thatj. <- Ybarj. - Ybar..rcbd # put estimated effect of treatment j in here
# Leave this line as it is
(Thats2 <- c(Thata., Thatb., Thatc., Thatd., Thate., Thatf., Thatg., Thath., Thati., Thatj.))
#### Block averages ##########
blk.avgs <- aggregate(gain ~ block, data = gain.d, FUN = mean)
rownames(blk.avgs) <- blk.avgs$diet
(Ybar.1 <- blk.avgs["1", "gain"]) # put average of block 1 in here
(Ybar.2 <- blk.avgs["2", "gain"]) # put average of block 1 in here
(Ybar.3 <- blk.avgs["3", "gain"]) # put average of block 1 in here
(Ybar.4 <- blk.avgs["4", "gain"]) # put average of block 1 in here
### Block effects ##########
Bhat.1 <- Ybar.1 - Ybar..rcbd # put estimated effect of block 1 in here
Bhat.2 <- Ybar.2 - Ybar..rcbd # put estimated effect of block 1 in here
Bhat.3 <- Ybar.3 - Ybar..rcbd # put estimated effect of block 1 in here
Bhat.4 <- Ybar.4 - Ybar..rcbd # put estimated effect of block 1 in here
# Leave this line as it is
(Bhats <- c(Bhat.1, Bhat.2, Bhat.3, Bhat.4))
m2 <- lm(gain ~ diet + block, gain.d)
my.res2 <- residuals(m2)
(resdls2 <- my.res2) # c() # enter the residuals separated by commas and in the order in
# which observations appear in the original data. You can use any method to
# fill in the vector; use the easiest way _for you_ to create the vector resdls.
```
#### Create the complete ANOVA table and interpret the F test for treatments. [15]
```{r}
a2 <- anova(m2)
SourceOfSS2 <- c("Diets", "Block", "Residual", "Total")
SumOfSquares2 <- c(a2$'Sum Sq'[1], a2$'Sum Sq'[2], a2$'Sum Sq'[3], sum(a2$'Sum Sq')) # enter the SS for diets, block, residual and total separated
# by commas and in the order listed.
degreesOfFreedom2 <- c(a2$Df[1], a2$Df[2], a2$Df[3], sum(a2$Df)) # enter the df for diets, block, residual and total separated
# by commas and in the order listed.
(MST2 <- a2$'Mean Sq'[1]) # enter the mean square for diets here.
(MSB <- a2$'Mean Sq'[2]) # enter the mean square for blocks here.
(MSE2 <- a2$'Mean Sq'[3]) # enter the mean square for error or residual here.
(Fcalc.diet <- MST2 / MSE2) # enter the calculated F for diet here.
(Fcrit.diet <- qf(0.05, df1 = a2$Df[1], df2 = a2$Df[3], lower.tail = FALSE)) # enter the value of the critical F here.
(Fcalc.blk <- MSB / MSE2) # enter the value of the calculated F for blocks here.
(Fcrit.blk <- qf(0.05, df1 = a2$Df[2], df2 = a2$Df[3], lower.tail = FALSE)) # enter the value of the critical F here.
## Leave the following lines as they are and run them ########
anovaTable2 <- data.frame(source = SourceOfSS2,
SS = SumOfSquares2,
df = degreesOfFreedom2,
MS = c(MST2, MSB, MSE2, NA),
Fcalc = c(Fcalc.diet, Fcalc.blk, NA, NA))
kable(anovaTable, digits = c(0, 2, 0, 2, 3)) %>%
kable_styling(full_width = FALSE)
# Check your values against the following table.
# If your values are different, you made a mistake.
anova(m2)
```
Interpret the results and state your null hypothesis and the conclusion.
Calculated F is greater than critical F, therefore, the null hypothesis is rejected. We conclude that the gain in at least one diet differs from that in another diet.
#### Calculate 95% confidence intervals for diets a and h [10]
```{r}
(tcrit.a <- qt(p = 0.975, df = a2$Df[3])) # enter the t critical here
(S.Ybara. <- sqrt(MSE2 / length(unique(gain.d$block)))) # enter the standard error of the averages here
(CIloa <- Ybara. - tcrit.a * S.Ybara.) # enter the lower boundary of the CI for a
(CIupa <- Ybara. + tcrit.a * S.Ybara.) # enter the upper boundary of the CI for a
(tcrit.h <- qt(p = 0.975, df = a2$Df[3])) # enter the t critical here
(S.Ybarh. <- sqrt(MSE2 / length(unique(gain.d$block)))) # enter the standard error of the averages here
(CIloh <- Ybarh. - tcrit.h * S.Ybarh.) # enter the lower boundary of the CI for h
(CIuph <- Ybarh. + tcrit.h * S.Ybarh.) # enter the upper boundary of the CI for h
```
#### Use the LSD to determine if diets a and h differ in gain [10]
```{r}
(tcrit.ah <- qt(p = 0.975, df = a2$Df[3])) # enter the t critical here
(Sdbar.ah <- sqrt(2 * MSE2 / length(unique(gain.d$block)))) # enter the standard deviation of the difference in averages.
(LSD.ah <- tcrit.ah * Sdbar.ah) # enter the LSD here.
(dbar.ah <- Ybara. - Ybarh.) # enter the difference between treatments here
```
Interpret the results and state your null hypothesis and the conclusion.
The difference between treatments is greater than the LSD so we reject the null hypothesis that they are equal.