@@ -75,20 +75,16 @@ To install them you use either
* **BiocManager::install** (if it comes from [Bioconductor](http://www.bioconductor.org/)):
```{r}
#| label: install packages
install.packages(c("BiocManager", "quarto"))
BiocManager::install("pheatmap")
```
Once a package is installed, you need to load it into your session with the command **library**:
```{r}
#| label: load packages
library(BiocManager)
```
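Before installing, it can be useful to check which packages are already available. The following is a minimal sketch; `missing_packages` is a hypothetical helper name introduced here for illustration and is not part of the course material:

```{r}
# Hypothetical helper: return the names of the packages that are not installed.
# requireNamespace() returns TRUE if the package can be loaded, FALSE otherwise.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# "stats" and "utils" ship with R, so nothing should be reported as missing.
missing_packages(c("stats", "utils"))
```

Any package name returned by this function still needs to be installed before **library** can load it.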
{{< pagebreak >}}
## Exercise 1
The purpose of this exercise is to observe the effect of some common operations in R,
...
@@ -97,30 +93,29 @@ and familiarize yourself with the language and the interface.
Try to change some of the commands and see the effect.
1. Open RStudio.
2. Create a "New project" (from the File menu), choose "Version Control" and "Git", paste the URL [https://gitlab.epfl.ch/genomics-and-bioinformatics/course-data-2025.git](https://gitlab.epfl.ch/genomics-and-bioinformatics/course-data-2025.git) and choose the location on your computer to save it.
3. Alternatively, you can clone the [same GitLab repository](https://gitlab.epfl.ch/genomics-and-bioinformatics/course-data-2025.git) into your working directory and open the directory from RStudio.
4. Open the file [ExercisesWeek1.qmd](https://gitlab.epfl.ch/genomics-and-bioinformatics/course-data-2025/-/blob/main/week1/ExercisesWeek1.qmd) in RStudio (this is the file used to generate the document you are currently reading).
5. Run the following code blocks and understand what they are doing.
Read the data from the tab-delimited file *GeneExpressionData.txt* (open the file as well to have a look at its content):
```{r}
#| label: load data
data = read.delim("GeneExpressionData.txt", row.names=1)
```
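To see what `row.names=1` does, here is a small sketch on a made-up table written to a temporary file. The file content, the column names (`id`, `C1`, `C2`) and the gene names are assumptions chosen for illustration, not the actual course data:

```{r}
# Write a tiny tab-delimited table (made-up values) to a temporary file.
toy_file <- tempfile(fileext = ".txt")
writeLines(c("id\tC1\tC2", "geneA\t1.5\t3.0", "geneB\t2.0\t0.0"), toy_file)

# row.names = 1 turns the first column into row names instead of a data column.
toy <- read.delim(toy_file, row.names = 1)
rownames(toy)
colnames(toy)
toy["geneB", "C1"]
```

Because the first column is consumed as row names, it is no longer available as a regular column of the table.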
If the file is not found, check your path and use **setwd()** to change to your working directory:
```{r}
#| label: path functions
getwd()
## setwd("/YOUR/PATH/TO/GITLAB/REPO")
dir()
```
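A quick way to confirm that R can see the file from the current working directory is `file.exists()`. This is only a sketch; the variable name `data_file` is introduced here for illustration:

```{r}
# file.exists() resolves relative paths against the current working directory.
data_file <- "GeneExpressionData.txt"
if (file.exists(data_file)) {
  message("Found ", data_file, " in ", getwd())
} else {
  message(data_file, " is not in ", getwd(), "; use setwd() or give the full path")
}
```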
First look at the data (notice that rows and columns have names!):
```{r}
#| label: data check
dim(data)
head(data)
data[1:4, ]
data$id
```
```{r}
data$C1[1]
data$C2[3:10]
data["ATP2A3",]
...
@@ -129,7 +124,6 @@ vector[4]
```
Compute some basic statistics:
```{r}
#| label: summary stats
summary(data)
summary(data$C1)
mean(data$C2)
...
@@ -143,7 +137,6 @@ apply(data, 2, sd)
```
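The second argument of `apply` selects the direction of the computation: `1` applies the function to each row, `2` to each column. A small sketch on made-up values (the table `toy_expr` and its gene names are assumptions for illustration):

```{r}
# Made-up expression table with three genes (rows) and two samples (columns).
toy_expr <- data.frame(C1 = c(1, 2, 3), C2 = c(4, 6, 8),
                       row.names = c("geneA", "geneB", "geneC"))

col_means <- apply(toy_expr, 2, mean)  # one value per column: C1 = 2, C2 = 6
row_means <- apply(toy_expr, 1, mean)  # one value per row: 2.5, 4, 5.5
col_means
row_means
```

For means specifically, `colMeans()` and `rowMeans()` give the same results and are usually faster on large tables.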
Elementary data transformation (are all ratios well-defined?):
...
@@ -162,22 +154,18 @@ If you would like to learn more about R, we suggest two online courses that are
* [UCDavis Introduction to R](https://ucdavis-bioinformatics-training.github.io/2021-March-Introduction-to-R-for-Bioinformatics/R/Intro2R_main)
* [SIB first steps with R](https://github.com/sib-swiss/first-steps-with-R-training)
{{< pagebreak >}}
## Exercise 2
In this exercise we will perform a typical gene expression analysis based on a dataset from leukemia cells:
1. Load the dataset *leukemiaExpressionSubset.rds* (it is in compressed [RDS format](https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/readRDS)):
```{r}
#| label: load leukemia
library(pheatmap)
data = readRDS("leukemiaExpressionSubset.rds")
```
2. In the file, samples (table columns) are named according to cell type and experiment number.
Let us create an annotation table by splitting the sample type and the sample number into different columns: