Commit a24e6e6

Edits
1 parent 275be3b commit a24e6e6

3 files changed: +33 -27 lines changed

DESCRIPTION

+7 -7
@@ -2,22 +2,22 @@ Package: bioc2022tidytranscriptomics
 Title: Tidy Transcriptomics For Single-Cell RNA Sequencing Analyses
 Version: 0.13.1
 Authors@R: c(
-person("Maria", "Doyle", email="[email protected]",
-role = c("aut"),
-comment = c(ORCID = "0000-0003-4847-8436")),
 person("Stefano", "Mangiola", email="[email protected]",
 role = c("aut","cre"),
-comment = c(ORCID = "0000-0001-7474-836X")))
-Maintainer: Maria Doyle <[email protected]>, Stefano Mangiola <[email protected]>
-Description: This workshop will present how to perform analysis of RNA sequencing data following the tidy data paradigm, using the tidySingleCellExperiment and tidyverse packages.
+comment = c(ORCID = "0000-0001-7474-836X")),
+person("Maria", "Doyle", email="[email protected]",
+role = c("aut"),
+comment = c(ORCID = "0000-0003-4847-8436")))
+Maintainer: Stefano Mangiola <[email protected]>, Maria Doyle <[email protected]>
+Description: This workshop will showcase analysis of single-cell RNA sequencing data following the tidy data paradigm, using the tidySingleCellExperiment, tidySummarizedExperiment, tidybulk and tidyverse packages.
 License: CC BY-SA 4.0 + file LICENSE
 Encoding: UTF-8
 LazyData: true
 LazyDataCompression: xz
 Roxygen: list(markdown = TRUE)
 RoxygenNote: 7.2.0
 Depends:
-R (>= 4.0.0)
+R (>= 4.1.0)
 Imports:
 tidySingleCellExperiment,
 tidySummarizedExperiment,

README.md

+1 -1
@@ -3,7 +3,7 @@
 [![.github/workflows/basic_checks.yaml](https://github.com/tidytranscriptomics-workshops/bioc2022_tidytranscriptomics/workflows/.github/workflows/basic_checks.yaml/badge.svg)](https://github.com/tidytranscriptomics-workshops/bioc2022_tidytranscriptomics/actions)
 <!-- badges: end -->

-# Introduction to Tidy Transcriptomics
+# Tidy Transcriptomics For Single-Cell RNA Sequencing Analyses
 <p float="left">
 <img style="height:100px;" alt="BioC2022" src="https://bioc2022.bioconductor.org/img/carousel/BioC2022.png"/>
 <img style="height:100px;" alt="tidybulk" src="https://github.com/Bioconductor/BiocStickers/blob/master/tidybulk/tidybulk.png?raw=true"/>

vignettes/tidytranscriptomics_case_study.Rmd

+25 -19
@@ -30,9 +30,9 @@ knitr::opts_chunk$set(echo = TRUE)

 ## Description

-This tutorial will present how to perform analysis of single-cell RNA sequencing data following the tidy data paradigm. The tidy data paradigm provides a standard way to organise data values within a dataset, where each variable is a column, each observation is a row, and data is manipulated using an easy-to-understand vocabulary. Most importantly, the data structure remains consistent across manipulation and analysis functions.
+This tutorial will showcase analysis of single-cell RNA sequencing data following the tidy data paradigm. The tidy data paradigm provides a standard way to organise data values within a dataset, where each variable is a column, each observation is a row, and data is manipulated using an easy-to-understand vocabulary. Most importantly, the data structure remains consistent across manipulation and analysis functions.

-This can be achieved with the integration of packages present in the R CRAN and Bioconductor ecosystem, including [tidySingleCellExperiment](https://stemangiola.github.io/tidySingleCellExperiment/) and [tidyverse](https://www.tidyverse.org/). These packages are part of the tidytranscriptomics suite that introduces a tidy approach to RNA sequencing data representation and analysis. For more information see the [tidy transcriptomics blog](https://stemangiola.github.io/tidytranscriptomics/).
+This can be achieved with the integration of packages present in the R CRAN and Bioconductor ecosystem, including [tidySingleCellExperiment](https://stemangiola.github.io/tidySingleCellExperiment/), [tidySummarizedExperiment](https://stemangiola.github.io/tidySummarizedExperiment/), [tidybulk](https://stemangiola.github.io/tidybulk/) and [tidyverse](https://www.tidyverse.org/). These packages are part of the tidytranscriptomics suite that introduces a tidy approach to RNA sequencing data representation and analysis. For more information see the [tidy transcriptomics blog](https://stemangiola.github.io/tidytranscriptomics/).

 ### Pre-requisites

@@ -59,7 +59,7 @@ This can be achieved with the integration of packages present in the R CRAN and
 - The fundamentals of single-cell data analysis
 - The fundamentals of tidy data analysis

-This workshop will demonstrate a real-world example of using tidy transcriptomics packages, such as tidySingleCellExperiment and tidybulk, to perform a single cell analysis. This workshop is not a step-by-step introduction in how to perform single-cell analysis. For an overview of single-cell analysis steps performed in a tidy way please see the [ISMB2021 workshop](https://tidytranscriptomics-workshops.github.io/ismb2021_tidytranscriptomics/articles/tidytranscriptomics.html).
+This workshop will demonstrate a real-world example of using tidy transcriptomics packages to analyse single cell data. This workshop is not a step-by-step introduction in how to perform single-cell analysis. For an overview of single-cell analysis steps performed in a tidy way please see the [ISMB2021 workshop](https://tidytranscriptomics-workshops.github.io/ismb2021_tidytranscriptomics/articles/tidytranscriptomics.html).

 ## Getting started

@@ -82,7 +82,7 @@ Alternatively, you can view the material at the workshop webpage [here](https://

 ## Slides

-*The embedded slides below may take a minute to appear. You can also view or download [here](https://github.com/tidytranscriptomics-workshops/bioc2022_tidytranscriptomics/blob/master/inst/bioc2022_tidytranscriptomics.pdf)*
+*The embedded slides below may take a minute to appear. You can also view or download [here](https://github.com/tidytranscriptomics-workshops/bioc2022_tidytranscriptomics/blob/master/inst/bioc2022_tidytranscriptomics.pdf).*

 <iframe
 src="https://docs.google.com/gview?url=https://raw.githubusercontent.com/tidytranscriptomics-workshops/bioc2022_tidytranscriptomics/master/inst/bioc2022_tidytranscriptomics.pdf&embedded=true"
@@ -145,12 +145,20 @@ We can use `filter` to choose rows, for example, to see just the rows for the ce
 sce_obj |> filter(Phase == "G1")
 ```

-We can use `select` to choose columns, for example, to see the sample, cell, total cellular RNA
+We can use `select` to view columns, for example, to see the filename, total cellular RNA abundance and cell phase.

 ```{r}
-sce_obj |> select(.cell, nCount_RNA, Phase)
+sce_obj |> select(file, nCount_RNA, Phase)
 ```

+> As we did not output the .cell column we get a tibble instead of a SingleCellExperiment object and a message to let us know: "tidySingleCellExperiment says: Key columns are missing. A data frame is returned for independent data analysis." This is ok as it's what we want here when exploring the data.
+
+> If we use select to output the .cell (key) column we will also get any view-only columns returned, such as the UMAP columns generated during the preprocessing.
+
+>```{r}
+> sce_obj |> select(.cell, nCount_RNA, Phase)
+>```
+
 We can use `mutate` to create a column. For example, we could create a new `Phase_l` column that contains a lower-case version of `Phase`.

 ```{r}
@@ -211,20 +219,18 @@ The object `sce_obj` we've been using was created as part of a study on breast c

 ## Analyse custom signature

-The researcher analysing this dataset wanted to to identify gamma delta T cells using a gene signature from a published paper [@Pizzolato2019].
+The researcher analysing this dataset wanted to identify gamma delta T cells using a gene signature from a published paper [@Pizzolato2019]. We'll show how that can be done here.

-With tidySingleCellExperiment's `join_features` the counts for the genes could be viewed as columns.
+With tidySingleCellExperiment's `join_features` we can view the counts for genes in the signature as columns joined to our single cell tibble.

 ```{r}
-
 sce_obj |>
 join_features(c("CD3D", "TRDC", "TRGC1", "TRGC2", "CD8A", "CD8B"), shape = "wide")
 ```

-They were able to use tidySingleCellExperiment's `join_features` to select the counts for the genes in the signature, followed by tidyverse `mutate` to easily create a column containing the signature score.
+We can use tidyverse `mutate` to create a column containing the signature score. To generate the score, we scale the sum of the 4 genes, CD3D, TRDC, TRGC1, TRGC2, and subtract the scaled sum of the 2 genes, CD8A and CD8B. `mutate` is powerful in enabling us to perform complex arithmetic operations easily.

 ```{r}
-
 sce_obj |>
 join_features(c("CD3D", "TRDC", "TRGC1", "TRGC2", "CD8A", "CD8B"), shape = "wide") |>

@@ -360,14 +366,14 @@ sce_obj_gamma_delta |> select(batch, cluster, everything())
 ```

 It is also possible to visualise the cells as a 3D plot using plotly.
-The example data used here only contains a few genes, for the sake of time and size in this demonstration, but below is how you could generate the 3 dimensions needed for 3D plot with a full dataset.
+The example data used here only contains a few genes, for the sake of time and size in this demonstration, but below is how you could generate the 3 dimensions needed for 3D plot with a full dataset.

 ```{r eval = FALSE}
 single_cell_object |>
 RunUMAP(dims = 1:30, n.components = 3L, spread = 0.5, min.dist = 0.01, n.neighbors = 10L)
 ```

-We'll demonstrate creating a 3D plot using some data that has 3 UMAP dimensions.
+We'll demonstrate creating a 3D plot using some data that has 3 UMAP dimensions. This is a fantastic way to visualise both reduced dimensions and metadata in the same representation.

 ```{r umap plot 2, message = FALSE, warning = FALSE}
 pbmc <- bioc2022tidytranscriptomics::sce_obj_UMAP3
@@ -385,22 +391,22 @@ pbmc |>

 # Exercises

-Using the `sce_obj`
+Using the `sce_obj`:

-1. What proportion of all cells are gamma-delta T cells? Use signature_score > 0.7 to identify gamma-delta T cells.
+1. What proportion of all cells are gamma-delta T cells? Use signature_score > 0.7 to identify gamma-delta T cells.

-2. There is a cluster of cells characterised by a low RNA output (nCount_RNA < 100). Identify the cell composition (cell_type) of that cluster.
+2. There is a cluster of cells characterised by a low RNA output (nCount_RNA < 100). Identify the cell composition (cell_type) of that cluster.

 # Pseudobulk analyses

-Next we want to identify genes whose transcription is affected by treatment in this dataset, comparing treated and untreated patients. We can do this with pseudobulk analysis. We aggregate cell-wise transcript abundance into pseudobulk samples and can then perform hypothesis testing using very well established bulk RNA sequencing tools. For example, we can use edgeR in tidybulk to perform differential expression testing. For more details on pseudobulk analysis see [here](https://hbctraining.github.io/scRNA-seq/lessons/pseudobulk_DESeq2_scrnaseq.html).
+Next we want to identify genes whose transcription is affected by treatment in this dataset, comparing treated and untreated patients. We can do this with pseudobulk analysis. We aggregate cell-wise transcript abundance into pseudobulk samples and can then perform hypothesis testing using the very well established bulk RNA sequencing tools. For example, we can use edgeR in tidybulk to perform differential expression testing. For more details on pseudobulk analysis see [here](https://hbctraining.github.io/scRNA-seq/lessons/pseudobulk_DESeq2_scrnaseq.html).

 We want to do it for each cell type and the tidy transcriptomics ecosystem makes this very easy.


 ## Create pseudobulk samples

-To create pseudobulk samples from the single cell data, we will use a helper function called `aggregate_cells`, available in this workshop package. This function will combine the single cells into groups for each cell type for each sample.
+To create pseudobulk samples from the single cell data, we will use a helper function called `aggregate_cells`, available in this workshop package. This function will combine the single cells into a group for each cell type for each sample.

 ```{r warning=FALSE, message=FALSE, echo=FALSE}
 library(glue)
@@ -489,7 +495,7 @@ pseudo_bulk <-
 mutate(data = map(data, ~ filter(.x, FDR < 0.5))) |>

 # Filter cell types with no differential abundant gene-transcripts
-# map_int is map that returns integer
+# map_int is map that returns integer instead of list
 filter(map_int(data, ~ nrow(.x)) > 0) |>

 # Plot
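
The comment changed in the last hunk contrasts `map_int` with `map`. A minimal sketch of that difference follows, assuming only the purrr package. It uses toy data frames in place of the workshop's nested `data` column; the names `toy_results`, `n_hits_list`, `n_hits` and `keep` are illustrative and do not appear in the commit.

```r
library(purrr)

# Toy stand-in for the nested `data` column: one data frame per cell type.
# (Hypothetical values, not taken from the workshop dataset.)
toy_results <- list(
  b_cell = data.frame(gene = c("G1", "G2"), FDR = c(0.01, 0.20)),
  t_cell = data.frame(gene = character(0), FDR = numeric(0))
)

# map() always returns a list, with one element per input.
n_hits_list <- map(toy_results, ~ nrow(.x))

# map_int() returns a plain integer vector instead, so the result can be
# compared element-wise, as in filter(map_int(data, ~ nrow(.x)) > 0).
n_hits <- map_int(toy_results, ~ nrow(.x))

# Logical vector marking the cell types that still have at least one row.
keep <- n_hits > 0
```

Comparing a list to `0` directly would not give the logical vector that `filter` needs, which is presumably why the vignette uses `map_int` at this step.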
