## InClass Activity

```{r inclass-setup1, echo=FALSE, warning=FALSE, message=FALSE}
library(tidyverse)
ponzo_data <- read_csv("data/01-s01/inclass/PonzoAgeData.csv")
```

<span style="font-size: 16px; font-weight: bold; color: var(--green);">For all the inclass activities, you will need to have completed the PreClass Activities. You may also need some cheatsheets such as <a href="https://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf" target = "_blank">The R Markdown Cheatsheet</a>. Everything you need to complete this inclass activity can be found in those documents. We will tell you which additional material might help where we think it is appropriate.</span>

### R Markdown and The Experimental Design Portfolio

In the preclass activity, we asked you to start playing about with R Markdown. Now, in the lab, we are going to continue learning about R Markdown, as creating your own files from scratch is a great start to creating reproducible science! We will also start you off in creating your own **Experimental Design and Analysis Portfolio** through R Markdown. The aim of this portfolio is to consolidate your learning in experimental design and analysis, allowing you to reflect on how your learning has progressed. You should add to it whenever you think "Oh that is a good tip!" or "That is something I want to remember!". Do this after a lab or a lecture, when you are collating all your notes together. Your portfolio is for you, unless you choose to share it of course, and will not be assessed or marked in any way. It is your learning aid to help you develop your understanding of research methods and analysis in Psychology.

By the end of these labs you should be able to use your portfolio for exam revision, particularly if you incorporate lecture material, so it's worth maintaining a clear structure to your portfolio! In this first lab, **across 9 tasks**, we will help you to understand how to structure and format R Markdown files; you can then apply what you learn here to your portfolio in your own time. We suggest having your portfolio open in every lab so you can add to it as you go along. Let's begin!

`r hide("Portfolio Point - Things you could include")`
```{block, type = "info"}
Throughout the semester you will see these Portfolio Points. They aren't always necessary to complete the lab but sometimes they will be and sometimes they are just things you might consider putting in your Portfolio to remember. What you keep in your portfolio is up to you but here are some examples of the kind of things we would recommend you include:

- Key points about classic experiments
    - their main goal, outcome, authors, year
    - a top tip is to write a short summary after every paper you read, including the authors' names to help you consolidate that information
- Aspects of your Reports' designs and analyses
    - what decisions you made and why; how they compare to other studies.
- Glossary points for R code functions
    - For codes you find more challenging to understand the function of
    - For codes you might use more frequently in future activities
    - We are developing a glossary which you can send us items to include or get involved with. It is still in development but you can see it here <a href="https://psyteachr.github.io/glossary/" target = "_blank">https://psyteachr.github.io/glossary/</a>.
- Reflection Points on what you have learned in your labs each week.
```
`r unhide()`

### The Ponzo Illusion and Age

The activities in this lab will make use of an **open dataset**.

`r hide("Explain This - What is an open dataset?")`
```{block, type = "info"}
An open dataset is made available for everyone to see and is stored on the internet for other researchers to use. In the PreClass activity, you saw an example of this at the very start in the PLOS One article. Many journals now ask researchers to make their data available or to post it somewhere accessible like the <a href="https://www.osf.io" target = "_blank">Open Science Framework</a>.

Interestingly, the art of making your data available was standard in classic older articles. The data we are using today comes from 1967. Sometime between then and more recent times, data started being made unavailable - closed. We believe all data should be made available and will encourage you to do that over the coming years. Transparent science is Open Science!
```
`r unhide()`
<br>
The data we will use today is from a paper looking at the **Ponzo illusion** and Age:

**Leibowitz, H. W. & Judisch, J. M. (1967). The Relation between Age and the Magnitude of the Ponzo Illusion. The American Journal of Psychology, 80(1), 105-109.** It can be accessed on campus (University of Glasgow) through <a href="https://www.jstor.org/stable/1420548?seq=1#page_scan_tab_contents" target = "_blank">this link</a>. Off campus you can sign in to read it through the University of Glasgow library if you are a student at Glasgow.

The basic idea of the Ponzo illusion (<a href="https://en.wikipedia.org/wiki/Ponzo_illusion" target = "_blank">Wikipedia page</a>) is that two lines of the same size are perceived as being of different lengths because of surrounding information, like sleepers on a train track. See Figure 1 of Leibowitz and Judisch (1967) for an example (p. 106). The authors showed people two vertical lines surrounded by differing horizontal lines running at angles behind the main vertical lines. The authors varied the size of one of the vertical lines (the left line) and asked the participants to judge which of the two vertical lines was bigger or longer: the left line (variable) or the right one (standard). The paper also tested how this illusion was influenced by age. For more info, see the paper. To operationalise the dependent variable, Leibowitz & Judisch measured what size the left line had to be to be considered the same size as the standard line on the right. The data we will be using can be seen on page 107, and includes:

* Which Group participants were assigned to according to age, with each group being made of 10 participants of the same sex
* The Sex of the Group
* The Mean Age of the Group
* The Mean Length of the left vertical line

### Task 1: Setting up Your R Markdown Portfolio {#Ch1InClassQueT1}

As mentioned above, our overall goal is to make a reproducible "report" summarising the data in the Leibowitz and Judisch (1967) paper. As we go along, remember to refer back to the PreClass activity and cheatsheets to help you. Let's begin!

1. Create a new R Markdown document.
2. Give it a title, e.g. **My Psychology Research Methods Portfolio**
3. Enter your **GUID** or name as the author
4. Set the output as **HTML**.

`r hide("Helpful Hint")`
```{block, type = "info"}
Throughout the labs you will see these Helpful Hints. Usually the solutions are nearby or at the end of the chapter to prevent temptation.

In setting up this Rmd file, if you have followed these steps correctly, you will probably see a new R Markdown file with a header containing the title, author, date and output information as shown in the PreClass activity.

If you don't see the document header, then you've probably created an R Script instead. Refer back to the PreClass activity and try again. Look further down the list of File options on the top menu. If stuck, speak to a tutor or ask on an appropriate forum.
```
`r unhide()`
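
As a rough sketch of what to expect, the header at the top of a new R Markdown file looks something like the block below. The title is the example suggested above; the author and date values are placeholders, and the exact fields RStudio generates may differ slightly between versions.

```yaml
---
title: "My Psychology Research Methods Portfolio"
author: "Your GUID or name"   # placeholder
date: "2024-01-01"            # placeholder date
output: html_document
---
```
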
<br>
You can now remove the parts of the generic R Markdown code that we do not need; anything **after** the `setup` code chunk can be removed (see Figure \@ref(fig:img-setup-chunk)). That is, anything after line 11 can be removed. Leave the first code chunk, however (lines 8 to 10), as these lines make R Markdown show code chunks unless otherwise specified; note the `echo = TRUE`.
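
For reference, the body of the `setup` chunk in the default template typically contains a single call like the one sketched below. This assumes the standard RStudio template; your file may differ slightly.

```r
# Set the default for every later chunk: show the code in the knitted document.
# Individual chunks can still override this with their own echo rule.
knitr::opts_chunk$set(echo = TRUE)
```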

`r hide("Portfolio Point - Code Chunk Reminders")`
```{block, type = "info"}
After the lab it might be handy to write a reminder somewhere in your portfolio about what a code chunk is. Writing it in your own notes somewhere accessible to you will mean you can find it more easily than searching through all the labs for the right information. This is something to keep in mind for any information you come across that you might need to recall later on!
```
`r unhide()`

### Task 2: Give your Report a Heading {#Ch1InClassQueT2}

We are going to start off your portfolio with creating a brief report on Leibowitz and Judisch (1967), so we should give it a heading.

1. After the `setup` code chunk, give your report a heading, e.g. **Lab 1 - The Magnitude of the Ponzo Illusion varies as a function of Age**.
2. Using hashtags, give this heading a **Header 1** size.

`r hide("Helpful Hint - hashtags are not just for social media")`
```{block, type = "info"}
Remember that the fewer the number of hashtags the larger the heading size.
```
`r unhide()`
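
A minimal sketch of the heading syntax, using placeholder text rather than the heading for this task:

```markdown
# A Header 1 heading (largest)
## A Header 2 heading
### A Header 3 heading (smaller again)
```

Leave a space between the last hashtag and the heading text.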

### Task 3: Creating a Code Chunk {#Ch1InClassQueT3}

We are going to need the data soon so best to bring it in at the start of our code.

1. Set your working directory: **`Session >> Set Working Directory >> Choose Directory`**

`r hide("Portfolio Point - Set Working Directory")`
```{block, type = "info"}
One of the most common issues we see with people using RStudio is that they forget to set their working directory to the folder containing the data file they are working on. This means that when you try to knit or run a code line it won't work because RStudio doesn't know where the data is. Remember to set your working directory at the start of each session, using **`Session >> Set Working Directory >> Choose Directory`**

Avoid using code to set your working directory as often this will only work on your machine and not others and is therefore not fully reproducible without editing the script.
```
`r unhide()`

2. Download the data for this lab in a zip file by clicking [this link](./data/01-s01/inclass/PonzoAgeData.zip). Unzip it and save it to the folder you are working in.
3. Create a new code chunk in your R Markdown script, give this code chunk the name `load_data`.
4. Copy and paste the code below into your code chunk.  Spend a couple of minutes with a partner reminding yourself what the code does. The answer is in the hint below.
5. Now, add or change the `echo` rule in your code chunk so that when you knit the file, the code **will not be included** in the final document.

```{r Ch1InClassT3, echo = TRUE, warning = TRUE, message = FALSE, eval = FALSE}
library("tidyverse")
ponzo_data <- read_csv("PonzoAgeData.csv")
```

**Knit the document now and see what the output looks like. It will ask you to save the file somewhere. Remember that on the Boyd Orr Lab PCs this is best done on your M: drive, given available space.**

**Important**: There is a good chance that, on the webpage that you have knitted, you will see either some warnings or messages. You can suppress these using the `message` and `warning` rules within the code chunks as well. Try this now - the PreClass Activities and the R-Markdown cheatsheet will help.

`r hide("Helpful Hint - what was all that?")`
```{block, type = "info"}

**Hints:**

* Step 4 - echo can equal TRUE or FALSE.
* Remember to separate rules in the code chunk with commas. E.g. {r, rule1 = FALSE, rule2 = TRUE}

**What does the code do?**

1. Line 1 loads the `tidyverse` package and all associated packages, e.g. `dplyr`, `readr` and `ggplot2`. You have used these in the Level 1 Grassroots book - we will recap a lot of that in the coming labs.
2. Line 2 loads in the data using the `read_csv()` function and stores it in `ponzo_data`.

**Important points to note:**

* `ponzo_data` could have been called anything, but it is best to call it something that makes it clear what it is. The only rule is no spaces in the name. `ponzo_data` and `ponzo.data` are both acceptable, and different from each other. `ponzo data` is not acceptable and will crash the code.
* `read_csv()` is actually in the `readr` package and is available to you only after you have loaded in `tidyverse` through `library(tidyverse)`. We will **always** tell you to use `read_csv()` to read in data from a csv file. There are other functions that load in data; one very similar one is `read.csv()`. They work differently. Only ever use `read_csv()` in your Psychology labs unless otherwise instructed.
* Remember `<-` essentially means `assign this to that`. Assigning the ponzo data to the table `ponzo_data` can actually be written the other way around, `read_csv("PonzoAgeData.csv") -> ponzo_data`, but convention usually puts it the way we have in the code.
```
`r unhide()`
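
The points about assignment and naming can be tried out with a small self-contained sketch. The object names below are made up for illustration and are not part of the lab data.

```r
# Both arrows assign a value to a name; only the direction differs.
x <- c(1, 2, 3)      # conventional form: name on the left
c(1, 2, 3) -> y      # same result: name on the right

identical(x, y)      # TRUE

# Underscores and dots are both allowed in names,
# but they create two separate objects.
my_data <- "first object"
my.data <- "second object"

identical(my_data, my.data)  # FALSE
```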


### Task 4: Writing your Report {#Ch1InClassQueT4}

Let's start giving this brief report some information and structure as we would a full report.

1. Underneath the code chunk you entered, put a new heading called **Introduction** and give it a **Header 2** size.
2. Next, do a little research with your group on the **Ponzo Illusion** and write a sentence or two describing how it works and what it tells us; include a citation to support your research. There is a link to the Wikipedia page on the illusion at the top of this lab which might help.
3. Finally, copy the text in the box below into your report and finish the text by putting the names of two hypotheses behind the illusion below the sentence in an **ordered list** style; i.e. 1... 2..., etc. The two hypotheses are **The Framing hypothesis** and **The Perspective hypothesis**.

```
"There are two underlying hypotheses that may explain the Ponzo Illusion. These are: ..."
```

`r hide("Helpful Hint - Lists")`
```{block, type = "info"}
Lists can be tricky to begin with but are very straightforward once you know the key points.

* The list begins after a blank line after any text. If you start the list without leaving a blank line at the top it won't work.
* Each point starts with an asterisk (\*) or by an integer and full-stop (e.g. `1.`)
* You must have a space after the \* or 1. before writing your point.
* Each point is a new line.
* To stagger points on a list (i.e. indent), leave 4 blank spaces (two tabs) and then put your \* etc.
```
`r unhide()`
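
A short sketch of these rules in action, with placeholder text rather than the hypotheses themselves:

```markdown
Some introductory text, followed by a blank line.

1. First point
2. Second point
    * A sub-point, indented by four spaces
```
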
<br>

<span style="font-size: 22px; font-weight: bold; color: var(--green);">Quickfire Question</span>

Here are a couple of questions to try out in your group to remind you about using citations:

When writing a report, how would you cite:

- Papers with **five** authors on the **first** mention? `r mcq(c("Author 1, Author 2, Author 3, Author 4, & Author 5", "Author 1, Author 2, Author 3, Author 4, & Author 5, Year", answer = "Author 1 et al., Year"))`
- Papers with **five** authors on the **second** mention? `r mcq(c("Author 1, Author 2, Author 3, Author 4, & Author 5", "Author 1, Author 2, Author 3, Author 4, & Author 5, Year", answer = "Author 1 et al., Year"))`
- Papers with **seven** authors on the **first** mention? `r mcq(c("Author 1, Author 2, Author 3, Author 4, Author 5, Author 6, & Author 7", "Author 1, Author 2, Author 3, Author 4, Author 5, Author 6, & Author 7, Year", answer = "Author 1 et al., Year"))`
- Papers with **two** authors in a citation? `r mcq(c(answer = "(Author 1 & Author 2)", "(Author 1 et al., Year)"))`
- **Two** papers in **one** set of parentheses? `r mcq(c("Order chronologically according to year, separated by a semi-colon", answer = "Order alphabetically according to first author surname, separated by a semi-colon"))`
- **Two** papers of the **same** author? `r mcq(c("Order chronologically according to year, separated by a semi-colon", answer = "Order chronologically according to year, separated by a comma", "Order alphabetically by adding a letter to each year"))`

Those questions about citations may throw you a little. In late 2019, the APA Style Guide 7th edition was released. One of the biggest changes here was that for papers of three or more authors, the citation would only show the first author and the year, as opposed to listing all authors on first citation. Two author papers and single author papers still always show all authors. For more information on how to format citations, you can look at the highly recommended <a href = "https://apastyle.apa.org/style-grammar-guidelines/citations/basic-principles/author-date" target = "_blank">APA Style blog</a>.

### Task 5: Making Text Bold or Italicized {#Ch1InClassQueT5}

Sometimes we want to add some emphasis to text.

1. In your report, format the line `There are two underlying hypotheses...` in bold. Answering the below question might help you remember how.

<span style="font-size: 22px; font-weight: bold; color: var(--green);">Quickfire Question</span>

- Bold text and italicized text are created similarly, how do you create italicized text? `r mcq(c("* (before text)", "** (before and after text)", answer = "* (before and after text)", "** (before text)"))`

**It's a good idea to `knit` the file at this point to make sure the codes are all working correctly.**

### Task 6: Adding Links to the Data in your Methods {#Ch1InClassQueT6}

Good practice in a Report is to include information about where we got the data from.

1. Create a new heading below your list of the two hypotheses and call it **Methods**. Set it as **Header 2** size.
2. Below **Methods** write a new heading called **Data** and set it as **Header 3** size.
3. Underneath the **Data** heading, copy and paste in the below sentence and turn the citation into an internet link to the paper.

> "The data in this report was obtained from within the original paper (Leibowitz and Judisch, 1967)."

**Now knit your document again to make sure your formatting is working. Titles should be bigger than normal text and the list should be indented and have numbers at the start of each line.**

`r hide("Helpful Hint - adding links?")`
```{block, type = "info"}
You can get the web address by following the link to the paper shown towards the beginning of this lab activity. Include the https part.

Use the R Markdown cheatsheet to see how to insert links. It has something to do with square brackets [] and circular brackets () next to each other.
```
`r unhide()`
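
As a hedged illustration of the general pattern, here is a link written with a placeholder address (not the address of the paper):

```markdown
[the text readers will click on](https://www.example.com)
```

The square brackets hold the visible text and the circular brackets hold the full web address.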

### Task 7: Adding an Image to your Methods {#Ch1InClassQueT7}

For certain studies, you may want to add an image to the Methods section, either of the stimuli, of the materials, or of the procedure. If you look at the R Markdown cheatsheet you'll see that adding an image is very similar to adding a link, the only difference is the exclamation mark, `!`, beforehand. Surprising, I know!

For now we will just add an image of the illusion taken from the internet to illustrate how to add images to our documents.

1. Below the sentence you added for Task 6, add a new heading called **Stimuli** and set it as **Header 3** size.
2. Below the **Stimuli** heading, insert the image at the following web address:

```
https://upload.wikimedia.org/wikipedia/en/8/89/Ponzo_Illusion.jpg
```
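
The general shape of the image syntax looks like the sketch below. Both the description and the address are placeholders, so swap in the address given for this task.

```markdown
![A short description of the image](https://www.example.com/picture.jpg)
```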

`r hide("Portfolio Point - A good methods section")`
```{block, type = "info"}
Remember that a good methods section will contain all the necessary information that would be required for another researcher to replicate your experiment exactly! It would normally be split into three or four sections including Ethics, Participants, Stimuli, and Procedure.

This may sound very obvious but you would be surprised at how many Methods sections don't give enough information for replicating the study. Articles tend to have word counts - just like your assignments. Authors have tended to cut words where they can to fit in further discussion or more results. Methods sections have suffered as a result. But no more!
```
`r unhide()`

### Task 8: Adding a Table to your Results {#Ch1InClassQueT8}

Another benefit of R Markdown is that you can insert tables of results directly into your report without having to format them - though for aesthetics you will want to learn how to format tables eventually. But for now...

1. Create a new heading below your methods sentence, called **Results** and format it as **Header 2** size.
2. Add a new code chunk and give it the name `table`, and include the code shown below. The first part of the code `my_table <- group_by %>% summarise` creates the table and stores it in `my_table`.  The second part of the code `my_table` calls the table. Calls means `to display or to show me` in this sense.
3. Add an `echo` rule so that the code **IS NOT** included in the final document but the output table **is** included.

```{r Ch1InClassT8, results='hide', warning = FALSE}
my_table <- group_by(ponzo_data, Sex) %>%
  summarise(NofGroups=n(), mean_length = mean(ComparisonLength))

my_table
```

**Now, knit your document to see what you have produced. You should not see the above code, just the output table.**
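
If the `group_by()` and `summarise()` pattern is unfamiliar, the sketch below shows it on a tiny made-up dataset. The column names match the lab data, but the values are invented for illustration and are **not** taken from the paper.

```r
library(dplyr)

# Hypothetical toy data: four groups, two per sex
toy_data <- tibble(
  Sex = c("F", "F", "M", "M"),
  ComparisonLength = c(1, 3, 2, 6)
)

# Count the groups and average the comparison length within each sex
toy_table <- toy_data %>%
  group_by(Sex) %>%
  summarise(NofGroups = n(), mean_length = mean(ComparisonLength))

# Call the table to display it: one row per sex
toy_table
```

Here each sex has two groups, and the mean lengths are 2 for `F` and 4 for `M`.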

### Task 9: Adding a Figure to your Results {#Ch1InClassQueT9}

Nearly all research reports have a figure so we will want to add one as well.

1. Underneath your `table` code chunk, add a new code chunk and give it the name `plot`.
2. Add the below code to the chunk and set the `include` rule so that both the code and the plot are included in the final report.

```{r Ch1InClassT9, eval = FALSE}
ggplot(ponzo_data,
       aes(x = Mean_Age, y = ComparisonLength, color = Sex)) +
  geom_point()
```


`r hide("Portfolio Point - to assign or to not assign")`
```{block, type = "info"}
You may notice above that we assigned our data in the table to `my_table` and then called `my_table` to show it.  However, we didn't do that for the figure. We just put the code for the figure but did not assign it. Why?

There is no great answer and you could assign both or not assign either, and we will chop and change throughout the labs to show you the difference, but the tendency is to assign tables but not assign figures. This is simply because we usually create figures in order to show them, so assigning them and then calling them requires more code. Tables, on the other hand, are often stored to work on later, so it makes sense to assign them.

Again this is not a hard and fast rule and often we will assign figures but it just makes it quicker not to. If you ever do assign a figure remember to call it, or your figure won't be displayed!

```
`r unhide()`
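
For comparison, here is a minimal sketch of what assigning a figure and then calling it might look like. The data values and the name `my_plot` are made up for illustration.

```r
library(ggplot2)

# Hypothetical toy data with the same column names as the lab data
toy_data <- data.frame(
  Mean_Age = c(5, 10, 20),
  ComparisonLength = c(3, 1, 2),
  Sex = c("F", "M", "F")
)

# Assign the plot to a name: nothing is displayed at this point
my_plot <- ggplot(toy_data,
                  aes(x = Mean_Age, y = ComparisonLength, color = Sex)) +
  geom_point()

# Call the plot to display it
my_plot
```
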
<br>

**Again, knit your document to make sure it is working correctly. Below your table you should now have the ggplot code followed by the nice scatterplot.**

<span style="font-size: 22px; font-weight: bold; color: var(--purple);"> **Group Discussion Point** </span>

In your group, have a brief discussion about the figure to answer the following question.

"Based on the distribution of the data, shown in the above Figure, ..." `r mcq(c(answer = "as age increases, people perceive a shorter vertical line to be of the same length as the standard vertical line", "as age increases, people perceive a longer vertical line to be of the same length as the standard vertical line", "There is no relationship between Age and the Ponzo illusion", "This figure tells me nothing about the relationship between Age and the Ponzo illusion"))`

`r hide("Helpful Hint")`
```{block, type = "info"}
What does each dot represent in the Figure, and what is the pattern of the dots?
```
`r unhide()`
<br>

We will learn more about how to improve the visualisations as we progress but for now you have completed the bones of your first report! Compare your report to the one we have created to see if they match, which can be found at the end of this chapter using the menu on the left (i.e. InClass Comparison) or [click here to download the .Rmd file in a zip folder](data/01-s01/inclass/ProducedPonzoTemplate.zip). **Fix anything that is not formatted as in our template.**

`r hide("Portfolio Point - The Power of R Markdown and the ggplot Package")`
```{block, type = "info"}
Here is a real-world scenario of why plotting in R Markdown can save a lot of effort. Say you carried out an experiment, made a figure of the results using an R Script, and wrote up the report using Microsoft Word. Then you realised you forgot to include 2 participants. To fix this, you would have to re-run the R script, make a new plot, save the plot, and **then** transfer that to your Word document. However, had you used **R Markdown** to begin with and both analysis and report were in the same place, then you can simply update the code within the document and a new figure will be created in the exact same place as the old one. **Magic!**

The code above uses the `ggplot2` package you used in Level 1. This is the main package we use for plots, figures, visualisations, or whatever you like to call them. It can be called into the library by itself, or is automatically called in when you call in the `tidyverse` package. Later this semester we will revisit `ggplot2` in more detail. For now, we are using it to make a scatterplot (`geom_point`) of Age (`Mean_Age`) and Comparison Length (`ComparisonLength`), and splitting the data for males and females.

```
`r unhide()`
<br>

<span style="font-size: 22px; font-weight: bold; color: var(--blue);">Job Done - Activity Complete!</span>

**Great work!** We have now created a rough layout of a report. The only section we are missing is the Discussion where you relate the information from previous research to what your study showed. Feel free to add one in your own time; read the short summary at the end of the actual paper to help get your thoughts together. We will talk more about the structuring of reports all throughout the year so you will have a great idea of how to write one by the end of Level 2. Well done on successfully creating your own R Markdown file!

You should now be ready to complete the Homework Assignment for this lab. **The assignment for this Lab is FORMATIVE and is NOT to be submitted and will NOT count towards the overall grade for this module**. However you are strongly encouraged to do the assignment as it will continue to boost the skills which you will need in future assignments. If you have any questions, please post them on the forums. Finally, don't forget to add any useful information to your Portfolio before you leave it too long and forget.