# Chart: Ridgeline Plots {#ridgeline}

*This chapter originated as a community contribution created by [nehasaraf1994](https://github.com/nehasaraf1994){target="_blank"}*

*This page is a work in progress. We appreciate any input you may have. If you would like to help improve this page, consider [contributing to our repo](contribute.html).*

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library("ggridges")
library("tidyverse")
library("cluster")
```

## Overview

This section covers how to make ridgeline plots.

## tl;dr

I want a nice example and I want it NOW!

Here's a look at the oral dose of theophylline (mg/kg) administered to each subject in the `Theoph` dataset:

```{r tldr-show-plot, echo=FALSE, message=FALSE}
library("ggridges")
library("tidyverse")
Theoph_data <- Theoph
ggplot(Theoph_data, aes(x = Dose, y = Subject, fill = Subject)) +
  geom_density_ridges_gradient(scale = 4, show.legend = FALSE) +
  theme_ridges() +
  scale_y_discrete(expand = c(0.01, 0)) +
  scale_x_continuous(expand = c(0.01, 0)) +
  labs(x = "Dose of theophylline (mg/kg)", y = "Subject #") +
  ggtitle("Density estimation of dosage given to various subjects") +
  theme(plot.title = element_text(hjust = 0.5))
```
Here is the code:
```{r tldr-code, eval=FALSE}
library("ggridges")
library("tidyverse")
Theoph_data <- Theoph
ggplot(Theoph_data, aes(x = Dose, y = Subject, fill = Subject)) +
  geom_density_ridges_gradient(scale = 4, show.legend = FALSE) +
  theme_ridges() +
  scale_y_discrete(expand = c(0.01, 0)) +
  scale_x_continuous(expand = c(0.01, 0)) +
  labs(x = "Dose of theophylline (mg/kg)", y = "Subject #") +
  ggtitle("Density estimation of dosage given to various subjects") +
  theme(plot.title = element_text(hjust = 0.5))
```

For more info on this dataset, type `?datasets::Theoph` into the console.

## Simple examples

Okay...much simpler please.

Let's use the `Orange` dataset from the `datasets` package:

```{r}
library("datasets")
head(Orange, n=5)
```

## Ridgeline Plots using ggridges

```{r message=FALSE}
library("ggridges")
library("tidyverse")
ggplot(Orange, aes(x = circumference, y = Tree, fill = Tree)) +
  geom_density_ridges(scale = 2, alpha = 0.5) +
  theme_ridges() +
  scale_fill_brewer(palette = 4) +
  scale_y_discrete(expand = c(0.8, 0)) +
  scale_x_continuous(expand = c(0.01, 0)) +
  labs(x = "Circumference at Breast Height", y = "Tree with ordering of max diameter") +
  ggtitle("Density estimation of circumference of different trees") +
  theme(plot.title = element_text(hjust = 0.5))
```

`ggridges` provides two main geoms for ridgeline plots: `geom_ridgeline` and `geom_density_ridges`. `geom_ridgeline` draws height values that you supply directly, while `geom_density_ridges` estimates a density from the data first. Both show how a continuous variable is distributed within each level of a categorical variable.

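
To make the difference between the two geoms more concrete, here is a minimal sketch of `geom_ridgeline`, which draws the heights it is given rather than estimating a density. The data frame `ridge_data` and all of its values are made up for illustration only.

```{r message=FALSE}
library("ggplot2")
library("ggridges")

# Hypothetical toy data: three ridgelines, each with five hand-picked heights.
ridge_data <- data.frame(
  x = rep(1:5, times = 3),
  y = rep(c(0, 1, 2), each = 5),
  height = c(0, 1, 3, 1, 0,
             0, 2, 1, 2, 0,
             1, 3, 1, 0, 0)
)

# y is numeric here, so group = y tells ggplot2 to draw one ridgeline per y value.
p_ridgeline <- ggplot(ridge_data, aes(x = x, y = y, height = height, group = y)) +
  geom_ridgeline(fill = "lightblue")
p_ridgeline
```

The examples in the rest of this chapter use `geom_density_ridges` (or its gradient variant) instead, because they start from raw observations rather than precomputed heights.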

## When to Use

Ridgeline plots are useful when several distributions have to be plotted on the same horizontal scale. The distributions are stacked vertically with slight overlap, which makes them well suited to visualizing how the distribution of a variable changes across categories, over time, or over space.

A good example is visualizing the distribution of salary across the different departments in a company.

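
The salary example could look roughly like the sketch below. There is no real salary dataset behind it: the department names, means, and standard deviations are assumptions chosen only to make the plot readable, and the data are simulated with `rnorm()`.

```{r message=FALSE}
library("ggplot2")
library("ggridges")

set.seed(1)  # make the simulated salaries reproducible

# Simulated (not real) salaries in thousands, 200 employees per department.
salaries <- data.frame(
  department = rep(c("Engineering", "Marketing", "Sales"), each = 200),
  salary = c(rnorm(200, mean = 95, sd = 15),
             rnorm(200, mean = 70, sd = 10),
             rnorm(200, mean = 60, sd = 20))
)

p_salary <- ggplot(salaries, aes(x = salary, y = department, fill = department)) +
  geom_density_ridges(alpha = 0.5, show.legend = FALSE) +
  theme_ridges() +
  labs(x = "Salary in thousands (simulated)", y = "Department")
p_salary
```

Because all departments share one horizontal axis, differences in both the typical salary and the spread are visible at a glance.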

## Considerations

The overlap between ridgelines is controlled by the `scale` parameter. A value of 1 means the tallest density curve just touches the baseline of the curve above it; larger values produce more overlap, and smaller values separate the curves.

```{r, fig.width=10, message=FALSE}
library("ggridges")
library("tidyverse")
OrchardSprays_data <- OrchardSprays

# scale = 3: moderate overlap between neighboring ridgelines
ggplot(OrchardSprays_data, aes(x = decrease, y = treatment, fill = treatment)) +
  geom_density_ridges_gradient(scale = 3) +
  theme_ridges() +
  scale_y_discrete(expand = c(0.3, 0)) +
  scale_x_continuous(expand = c(0.01, 0)) +
  labs(x = "Response in repelling honeybees", y = "Treatment") +
  ggtitle("Density estimation of response by honeybees to a treatment for scale=3") +
  theme(plot.title = element_text(hjust = 0.5))

# scale = 5: same data, stronger overlap
ggplot(OrchardSprays_data, aes(x = decrease, y = treatment, fill = treatment)) +
  geom_density_ridges_gradient(scale = 5) +
  theme_ridges() +
  scale_y_discrete(expand = c(0.3, 0)) +
  scale_x_continuous(expand = c(0.01, 0)) +
  labs(x = "Response in repelling honeybees", y = "Treatment") +
  ggtitle("Density estimation of response by honeybees to a treatment for scale=5") +
  theme(plot.title = element_text(hjust = 0.5))
```
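
For comparison with the two plots above, the sketch below uses small `scale` values so that the ridgelines barely touch or do not overlap at all. The helper `plot_with_scale()` is a hypothetical convenience function written for this example and is not part of `ggridges`; the values 0.9 and 1 are arbitrary illustrative choices.

```{r message=FALSE}
library("ggplot2")
library("ggridges")

# Hypothetical helper: draws the same OrchardSprays plot, varying only `scale`.
plot_with_scale <- function(scale_value) {
  ggplot(OrchardSprays, aes(x = decrease, y = treatment, fill = treatment)) +
    geom_density_ridges(scale = scale_value, show.legend = FALSE) +
    theme_ridges() +
    ggtitle(paste("scale =", scale_value))
}

p_separated <- plot_with_scale(0.9)  # curves stay below the next baseline
p_touching <- plot_with_scale(1)     # tallest curve just touches the next baseline

p_separated
p_touching
```

Low values keep every distribution fully visible at the cost of vertical space, while high values give a more compact plot in which tall curves can hide parts of their neighbors.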

Ridgeline plots can also show histograms on the common horizontal axis rather than density plots. With only a few observations per group, though, the bars tend to be sparse and hard to compare:

```{r}
library("ggridges")
library("tidyverse")
# after_stat(density) replaces the ..density.. notation, deprecated in ggplot2 3.4.0
ggplot(InsectSprays, aes(x = count, y = spray, height = after_stat(density), fill = spray)) +
  geom_density_ridges(stat = "binline", bins = 20, scale = 0.7, draw_baseline = FALSE)
```
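
One reason the histogram version above looks sparse is the amount of data per group. The sketch below counts the observations for each spray and then redraws the histogram with fewer bins. The choice of 8 bins is an arbitrary assumption for illustration, not a recommended setting.

```{r message=FALSE}
library("ggplot2")
library("ggridges")

# Number of observations per spray: each group's counts are spread over all the bins.
obs_per_spray <- table(InsectSprays$spray)
obs_per_spray

# Same binline plot with a coarser, arbitrarily chosen number of bins.
p_coarse <- ggplot(InsectSprays, aes(x = count, y = spray, fill = spray)) +
  geom_density_ridges(stat = "binline", bins = 8, scale = 0.7, draw_baseline = FALSE)
p_coarse
```

With so few observations in each group, 20 bins leave many bins empty, so coarser bins or a smoothed density are usually easier to read.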

Plotting the same data as density ridgelines gives a smoother result that is easier to compare across sprays:

```{r message=FALSE}
library("ggridges")
library("tidyverse")
ggplot(InsectSprays, aes(x = count, y = spray, fill = spray)) +
  geom_density_ridges_gradient() +
  theme_ridges() +
  labs(x = "Count of Insects", y = "Types of Spray") +
  ggtitle("The counts of insects treated with different insecticides.") +
  theme(plot.title = element_text(hjust = 0.5))
```

## External Resources

- [Introduction to ggridges](https://cran.r-project.org/web/packages/ggridges/vignettes/introduction.html){target="_blank"}: An excellent collection of code examples on how to make ridgeline plots with `ggplot2`. Covers the main parameters of `ggridges` and how to modify them for better visualization. If you want a ridgeline plot to look a certain way, this article will help.
- [Article on ridgeline plots with ggplot2](https://rdrr.io/cran/ggridges/man/geom_density_ridges.html){target="_blank"}: The reference page for `geom_density_ridges`, with a few examples using different datasets. Great for getting started with ridgeline plots.
- [History of Ridgeline plots](https://blog.revolutionanalytics.com/2017/07/joyplots.html){target="_blank"}: For the history and background of ridgeline plots.