HW05emailFall2018Key.Rmd

---
title: "Homework 05 Fall 2018"
author: "YourNameHere"
output: html_document
editor_options:
  chunk_output_type: console
---
## Homework : ANOVA in CRD and RCBD

Use this homework as an opportunity to make sure you understand the calculations in ANOVA. In this homework you can use R functions or simply do calculations using R as a calculator. For example if you need to calculate the average for th eset of numbers 1, 2, 3, 4, you can use `(1 + 2 + 3 + 4) / 4` or `mean(c(1, 2, 3, 4))`. You will have to store the results in the objects whose names you are given. For example, you maybe given code similar to: `my.avg1 <-  # enter the average here`, in which case you would complete this in any of the following ways:

`my.avg1 <- 2.5 # enter the average here`

`my.avg1 <- (1 + 2 + 3 + 4) / 4 # enter the average here`

`my.avg1 <- mean(c(1, 2, 3, 4)) # enter the average here`

`my.avg1 <- sum(c(1, 2, 3, 4)) / lenght(c(1, 2, 3, 4)))) # enter the average here`

We will grade the value in `my.avg1`

If you type numbers as input, keep inmind that your results will be graded as correct if they are within 2% of the true correct result obtained without rounding. If you want to use R to do all calculations without rounding, look for the code in the textbook. Keep in mind that the computer shows you a properly truncated version of any quantity, because frequently the actual values have many many figures to the right of the decimal point.

Run the following chunk before you proceed to do the rest. If you do not have the packages installed, go to the Packages tab, click Install and follow the instructions therein.

```{r echo=FALSE, message=FALSE}
library(emmeans)
library(knitr)
library(dplyr)
library(kableExtra)
library(agricolae)
```


### Completely Randomized Design [50]

A CRD experiment was conducted to study the effect of nitrogen fertilization on sugarbeet yield. There were 6 treatments of increasing fertilization rate (lb/ac): A (0), B (50), C (100), D (150), E (200), F (250) and yields were measured in lb/ac. The data are in the file "BeetRootYield.csv."

#### Read in the data and calculate treatment averages, treatment effects and residuals for all observations. [15]

```{r}

# Download the data from the website and read it into R. Instead of
# "BeetRootYield.csv" you should enter the complete path to wherever you
# downloaded the file. To find the complete path you can enter file.choose()
# and hit Enter in the console. Then follow the file navigation system of your
# OS and double click on the data file. The complete path will appear in the
# console. Do the same for any files you need to read into R to avoid problems
# in the future.

(yield.d <- read.csv("BeetRootYield.csv", header = TRUE)) # read in data

yield.d$rep <- factor(yield.d$rep) # run this line as it is.

## add any code you want to get the averages etc. here. For example:

(averages <- aggregate(yield ~ treat, data = yield.d, FUN = mean))

rownames(averages) <- averages$treat

#### Treatment averages ######## no need to modify this

(YbarA. <- averages["A", "yield"]) # this puts average of treatment A in YbarA.
(YbarB. <- averages["B", "yield"]) # etc.
(YbarC. <- averages["C", "yield"]) # etc.
(YbarD. <- averages["D", "yield"]) # etc.
(YbarE. <- averages["E", "yield"]) # etc.
(YbarF. <- averages["F", "yield"]) # etc.

(Ybar.. <- mean(yield.d$yield))

#### Treatment effects ########

# We use the letter T to represent the greek letter Tau

ThatA. <-   # estimated effect of treat. A goes here; ThatA. = YbarA. - Ybar..
ThatB. <- YbarB. - Ybar.. # put estimated effect of treatment B in here
ThatC. <- YbarC. - Ybar.. # put estimated effect of treatment C in here
ThatD. <- YbarD. - Ybar.. # put estimated effect of treatment D in here
ThatE. <- YbarE. - Ybar.. # put estimated effect of treatment E in here
ThatF. <- YbarF. - Ybar.. # put estimated effect of treatment F in here

(Thats <- c(ThatA., ThatB., ThatC., ThatD., ThatE., ThatF.)) # leave as it is

(SST <- length(unique(yield.d$rep)) * sum(Thats^2))

m1 <- lm(yield ~ treat, yield.d)
(my.res <- residuals(m1))

resdls <- my.res # c() # enter the residuals separated by commas and in the order in
# which observations appear in the original data. You can use any method to
# fill in the vector; use the easiest way for you to create the vector resdls.
# There is one residual for each observation. eij = Yij - Ybari., for example,
# eA1 = YA1 - YbarA.


```

#### Create the complete ANOVA table and interpret the F test for treatments. [15]

```{r}
(a1 <- anova(m1))

(SourceOfSS <- c("Treatments", "Residual", "Total"))  # leave as it is

(SumOfSquares <- c(a1$'Sum Sq'[1], a1$'Sum Sq'[2], sum(a1$'Sum Sq'))) # enter the SS for treatment, residual and total separated
# by commas and in the order listed.

(degreesOfFreedom <- c(a1$Df[1], a1$Df[2], sum(a1$Df))) # enter the df for treatment, residual and total separated
# by commas and in the order listed.

(MST <- a1$'Mean Sq'[1]) # enter the mean square for treatments here.

(MSE <- a1$'Mean Sq'[2]) # enter the mean square for error or residual here.

(Fcalc <- MST / MSE) # enter the calculated F here.

(Fcrit <- qf(0.05, df1 = a1$Df[1], df2 = a1$Df[2], lower.tail = FALSE)) # enter the value of the critical F here.

## Leave the following lines as they are and run them ########
   anovaTable <- data.frame(source = SourceOfSS,
                            SS = SumOfSquares,
                            df = degreesOfFreedom,
                            MS = c(MST, MSE, NA),
                            Fcalc = c(Fcalc, NA, NA))

kable(anovaTable, digits = c(0, 2, 0, 2, 3)) %>%
   kable_styling(full_width = FALSE)

# Check your values against the following table.
# If your values are different, you made a mistake.

anova(m1)

```

Interpret the results and state your null hypothesis and the conclusion.

The calculated F is smaller than the critical F, therefore, we cannot reject the null hypothesis that all means are equal.

#### Calculate 95% confidence intervals for treatments A and C [10]

```{r}

(tcrit <- qt(p = 0.975, df = a1$Df[2])) # enter the t critical here

(S.Ybari. <- sqrt(MSE / length(unique(yield.d$rep)))) # enter the standard error of the averages here

(CIloA <- YbarA. - tcrit * S.Ybari.) # enter the lower boundary of the CI for A

(CIupA <- YbarA. + tcrit * S.Ybari.) # enter the upper boundary of the CI for A

(CIloC <- YbarC. - tcrit * S.Ybari.) # enter the lower boundary of the CI for C

(CIupC <- YbarC. + tcrit * S.Ybari.) # enter the upper boundary of the CI for C

```


#### Use the LSD to determine if treatments A and C differ in yield. [10]

```{r}

(tcrit.AC <- tcrit) # enter the t critical here

(Sdbar.AC <- sqrt(2 * MSE / length(unique(yield.d$rep)))) # enter the standard deviation of the difference in averages.

(LSD.AC <- tcrit.AC * Sdbar.AC) # enter the LSD here.

(dbar.AC <- YbarA. - YbarC.) # enter the difference between treatments here

```

Interpret the results and state your null hypothesis and the conclusion.

The difference between treatment averages is smaller than the LSD, therefore, we cannot reject the null hypothesis that the means of treatments A and C are equal


### Randomized Complete Block Design [50]

The following experiment is presented in [@OstleMensing1975]. Ten treatments consisting of different rations were tested in a feeding trial with steers. The 40 steers available were separated into 4 blocks with 10 steers in each block. The blocks were made on the basis of steer initial weight, prior to the start of the feeding trial. The first block had the heaviest steers, the second block had the next heaviest group and so on. Within blocks, steers received the treatments randomly.

#### Read in the data and calculate treatment averages, treatment effects and residuals for all observations. [15]

```{r}

(gain.d <- read.csv("/Users/eal1/Google Drive/Documents/TeachServ/CLASSES/PLS120/PLS120Book2018a/Datasets/SteerGain.csv", header = TRUE))
# read the file with the data, named SteerGain.csv. See the first exercise to
# find out the path for your file.

# The following line make the block variable into a categorical variable.
(gain.d$block <- factor(gain.d$block)) # run this line as it is.

## add any code you want to get the averages etc. here. For example:

(trt.avgs <- aggregate(gain ~ diet, data = gain.d, FUN = mean))

rownames(trt.avgs) <- tolower(trt.avgs$diet)


#### Treatment averages ########

(Ybara. <- trt.avgs["a", "gain"]) # put average of treatment a in here
(Ybarb. <- trt.avgs["b", "gain"]) # put average of treatment b in here
(Ybarc. <- trt.avgs["c", "gain"]) # put average of treatment c in here
(Ybard. <- trt.avgs["d", "gain"]) # put average of treatment d in here
(Ybare. <- trt.avgs["e", "gain"]) # put average of treatment e in here
(Ybarf. <- trt.avgs["f", "gain"]) # put average of treatment f in here
(Ybarg. <- trt.avgs["g", "gain"]) # put average of treatment f in here
(Ybarh. <- trt.avgs["h", "gain"]) # put average of treatment f in here
(Ybari. <- trt.avgs["i", "gain"]) # put average of treatment f in here
(Ybarj. <- trt.avgs["j", "gain"]) # put average of treatment f in here

(Ybar..rcbd <- mean(gain.d$gain))

#### Treatment effects ########

# We use the letter T to represent the greek letter Tau

Thata. <- Ybara. - Ybar..rcbd # put estimated effect of treatment a in here
Thatb. <- Ybarb. - Ybar..rcbd # put estimated effect of treatment b in here
Thatc. <- Ybarc. - Ybar..rcbd # put estimated effect of treatment c in here
Thatd. <- Ybard. - Ybar..rcbd # put estimated effect of treatment d in here
Thate. <- Ybare. - Ybar..rcbd # put estimated effect of treatment e in here
Thatf. <- Ybarf. - Ybar..rcbd # put estimated effect of treatment f in here
Thatg. <- Ybarg. - Ybar..rcbd # put estimated effect of treatment g in here
Thath. <- Ybarh. - Ybar..rcbd # put estimated effect of treatment h in here
Thati. <- Ybari. - Ybar..rcbd # put estimated effect of treatment i in here
Thatj. <- Ybarj. - Ybar..rcbd # put estimated effect of treatment j in here

# Leave this line as it is
(Thats2 <- c(Thata., Thatb., Thatc., Thatd., Thate., Thatf., Thatg., Thath., Thati., Thatj.))

#### Block averages ##########

blk.avgs <- aggregate(gain ~ block, data = gain.d, FUN = mean)

rownames(blk.avgs) <- blk.avgs$diet

(Ybar.1 <- blk.avgs["1", "gain"]) # put average of block 1 in here
(Ybar.2 <- blk.avgs["2", "gain"]) # put average of block 1 in here
(Ybar.3 <- blk.avgs["3", "gain"]) # put average of block 1 in here
(Ybar.4 <- blk.avgs["4", "gain"]) # put average of block 1 in here

### Block effects ##########

Bhat.1 <- Ybar.1 - Ybar..rcbd # put estimated effect of block 1 in here
Bhat.2 <- Ybar.2 - Ybar..rcbd # put estimated effect of block 1 in here
Bhat.3 <- Ybar.3 - Ybar..rcbd # put estimated effect of block 1 in here
Bhat.4 <- Ybar.4 - Ybar..rcbd # put estimated effect of block 1 in here

# Leave this line as it is
(Bhats <- c(Bhat.1, Bhat.2, Bhat.3, Bhat.4))

m2 <- lm(gain ~ diet + block, gain.d)
my.res2 <- residuals(m2)

(resdls2 <- my.res2) # c() # enter the residuals separated by commas and in the order in
# which observations appear in the original data. You can use any method to
# fill in the vector; use the easiest way _for you_ to create the vector resdls.


```

#### Create the complete ANOVA table and interpret the F test for treatments. [15]

```{r}

a2 <- anova(m2)

SourceOfSS2 <- c("Diets", "Block", "Residual", "Total")

SumOfSquares2 <- c(a2$'Sum Sq'[1], a2$'Sum Sq'[2], a2$'Sum Sq'[3], sum(a2$'Sum Sq')) # enter the SS for diets, block, residual and total separated
# by commas and in the order listed.

degreesOfFreedom2 <- c(a2$Df[1], a2$Df[2], a2$Df[3], sum(a2$Df)) # enter the df for diets, block, residual and total separated
# by commas and in the order listed.

(MST2 <- a2$'Mean Sq'[1]) # enter the mean square for diets here.

(MSB <- a2$'Mean Sq'[2]) # enter the mean square for blocks here.

(MSE2 <- a2$'Mean Sq'[3]) # enter the mean square for error or residual here.

(Fcalc.diet <- MST2 / MSE2) # enter the calculated F for diet here.

(Fcrit.diet <- qf(0.05, df1 = a2$Df[1], df2 = a2$Df[3], lower.tail = FALSE)) # enter the value of the critical F here.

(Fcalc.blk <- MSB / MSE2) # enter the value of the calculated F for blocks here.

(Fcrit.blk <- qf(0.05, df1 = a2$Df[2], df2 = a2$Df[3], lower.tail = FALSE)) # enter the value of the critical F here.


## Leave the following lines as they are and run them ########
   anovaTable2 <- data.frame(source = SourceOfSS2,
                            SS = SumOfSquares2,
                            df = degreesOfFreedom2,
                            MS = c(MST2, MSB, MSE2, NA),
                            Fcalc = c(Fcalc.diet, Fcalc.blk, NA, NA))

kable(anovaTable, digits = c(0, 2, 0, 2, 3)) %>%
   kable_styling(full_width = FALSE)

# Check your values against the following table.
# If your values are different, you made a mistake.

anova(m2)

```

Interpret the results and state your null hypothesis and the conclusion.

Calculated F is greater than critical F, therefore, the null hypothesis is rejected. We conclude that the gain in at least one diet differs from that in another diet.

#### Calculate 95% confidence intervals for diets a and h [10]

```{r}

(tcrit.a <- qt(p = 0.975, df = a2$Df[3])) # enter the t critical here

(S.Ybara. <- sqrt(MSE2 / length(unique(gain.d$block)))) # enter the standard error of the averages here

(CIloa <- Ybara. - tcrit.a * S.Ybara.) # enter the lower boundary of the CI for a

(CIupa <- Ybara. + tcrit.a * S.Ybara.) # enter the upper boundary of the CI for a

(tcrit.h <- qt(p = 0.975, df = a2$Df[3])) # enter the t critical here

(S.Ybarh. <- sqrt(MSE2 / length(unique(gain.d$block)))) # enter the standard error of the averages here

(CIloh <- Ybarh. - tcrit.h * S.Ybarh.) # enter the lower boundary of the CI for h

(CIuph <- Ybarh. + tcrit.h * S.Ybarh.) # enter the upper boundary of the CI for h
```


#### Use the LSD to determine if diets a and h differ in gain [10]

```{r}

(tcrit.ah <- qt(p = 0.975, df = a2$Df[3])) # enter the t critical here

(Sdbar.ah <- sqrt(2 * MSE2 / length(unique(gain.d$block)))) # enter the standard deviation of the difference in averages.

(LSD.ah <- tcrit.ah * Sdbar.ah) # enter the LSD here.

(dbar.ah <- Ybara. - Ybarh.) # enter the difference between treatments here

```

Interpret the results and state your null hypothesis and the conclusion.

The difference between treatments is greater than the LSD so we reject the null hypothesis that they are equal.