This repository will be used as the website for Parallelism and Concurrency CS-206. It will be updated weekly throughout the semester. This README contains general information about the class.
- [previous-exams](previous-exams) contains PDFs for the previous exams.
- [exercises](exercises) contains markdown documents for exercises and solutions.
- [slides](slides) contains the slides presented in class.
- [labs](labs) contains markdown documents for the labs.
We will use GitLab's issue tracker as a discussion forum. Feel free to [open an issue](https://gitlab.epfl.ch/lamp/cs206/issues) if you have any comments or questions.
# First-week tasks
1. Join [the Discord](https://discord.gg/hJsreTeZjc)
1. Log into gitlab: https://gitlab.epfl.ch/users/sign_in
1. Please fill in [this table](https://docs.google.com/spreadsheets/d/1rcq_UMgR6bAH-iK1L2I6WoClZtCqUoIOLqQO3NJKdgg) with your GASPAR and SCIPER number
* Choose the group for the exercises
* This will initialize your GitLab repository for the course
1. Follow the [Tools Setup](labs/tools-setup.md) page.
1. Do the [example lab](labs/example-lab.md).
1. Watch all videos under *Parallelism 1: Introduction to Parallel Programming* below
1. Do the [first graded lab](labs/lab1-parallel-box-blur-filter/).
# Grading
The grading of the course is divided between exercises (5%), labs (25%), the midterm exam (30%), and the final exam (40%).
# Staff
| Role | People |
| :--- | :--- |
| Professors | [Martin Odersky](https://people.epfl.ch/martin.odersky), [Viktor Kunčak](https://people.epfl.ch/viktor.kuncak) |
| TAs | [Aleksander Boruch-Gruszecki](https://people.epfl.ch/aleksander.boruch-gruszecki), [Dragana Milovancevic](https://people.epfl.ch/dragana.milovancevic), [Guillaume Martres](https://people.epfl.ch/guillaume.martres), [Nicolas Stucki](https://people.epfl.ch/nicolas.stucki), [Olivier Blanvillain](https://people.epfl.ch/olivier.blanvillain) |
| Student TAs | [Antoine Masanet](https://people.epfl.ch/antoine.masanet), [Lucas Giordano](https://people.epfl.ch/lucas.giordano), [Kajetan Pyszkowski](https://people.epfl.ch/kajetan.pyszkowski), [Marco Vögeli](https://people.epfl.ch/marco.vogeli), [Quentin Guignard](https://people.epfl.ch/quentin.guignard), [Sara Djambazovska](https://people.epfl.ch/sara.djambazovska) |
# Course Schedule
Lectures are partially live (on Zoom) and partially prerecorded (on YouTube).
Live sessions will be held on Wednesdays from 14:15 to 16:00.
Weekly Discord sessions will be held on Wednesdays from 14:15 to 16:00 for exercises (if it is a week with exercises) and 16:15 to 18:00 for labs.
You should watch the prerecorded lectures before doing the exercises.
In the first week of the semester, there will be a live Zoom session on Wednesday at 14:15 to welcome you to the class and answer questions you might have, followed by a Discord session.
<!-- seq 0 7 100 | xargs -i date -d "02/24/2021 {} days" +"%d.%m.%Y" -->
| Week | Date | Topic | Lectures (14:15-16:00) | Exercises (14:15-16:00) | Labs (16:15-18:00) |
| :-- | :-- | :-- | :-- | :-- | :-- |
| 1 | 24.02.21 | Parallelism 1 | Prerecorded | Welcome Zoom session | Lab 1 |
| 2 | 03.03.21 | Parallelism 2 | Prerecorded | Exercise 1 | Lab 1 & 2 |
| 3 | 10.03.21 | Parallelism 3 | Prerecorded | Exercise 2 | Lab 2 & 3 |
| 4 | 17.03.21 | Parallelism 4 | Prerecorded | Exercise 3 | Lab 3 & 4 |
| 5 | 24.03.21 | Concurrency 1 | Live | | Lab 4 & 5 |
| 6 | 31.03.21 | Concurrency 2 | Live | | Lab 5 & 6 |
| 7 | 07.04.21 | _Easter_ | | | |
| 8 | 14.04.21 | **Midterm Exam** | | | |
| 9 | 21.04.21 | Concurrency 3 | Live | | Lab 6 |
| 10 | 28.04.21 | Actors 1 | Prerecorded | Exercise 4 | Lab 7 |
| 11 | 05.05.21 | Actors 2 | Prerecorded | Exercise 5 | Lab 7 |
| 12 | 12.05.21 | Spark 1 | Prerecorded | Exercise 6 | Lab 8 |
| 13 | 19.05.21 | Spark 2 | Prerecorded | Exercise 7 | Lab 8 & 9 |
| 14 | 26.05.21 | Spark 3 | Prerecorded | Exercise 8 | Lab 9 |
| 15 | 02.06.21 | **Review for Final Exam** | | | |
Solutions to the exercises are released after each deadline. We do not provide solutions for the labs.
Before each Discord session, students should watch videos corresponding to that week's topic:
### Intro
- [Welcome Zoom session][Zoom1]
### Parallelism 1: Introduction to Parallel Programming
- [Introduction to Parallel Computing](https://www.youtube.com/watch?v=94O72nyNFY0)
- [Parallelism on the JVM I](https://www.youtube.com/watch?v=I8w-q1TPtjA)
- [Parallelism on the JVM II](https://www.youtube.com/watch?v=BbVWGWTNAXw)
- [Running Computations in Parallel](https://www.youtube.com/watch?v=KkMZGJ3M2-o)
- [Monte Carlo Method to Estimate Pi](https://www.youtube.com/watch?v=VBCf-aTgpPU)
- [First-Class Tasks](https://www.youtube.com/watch?v=mrVVaXCuhBc)
- [How Fast are Parallel Programs?](https://www.youtube.com/watch?v=Lpnexp_Qxgo)
- [Benchmarking Parallel Programs](https://www.youtube.com/watch?v=LvS_kjCssfg)
### Parallelism 2: Basic Task Parallel Algorithms
- [Parallel Sorting](https://www.youtube.com/watch?v=AcuvVgQbphg)
- [Data Operations and Parallel Mapping](https://www.youtube.com/watch?v=ghYtMLrphZw)
- [Parallel Fold (Reduce) Operation](https://www.youtube.com/watch?v=hEBgyhIoWww)
- [Associativity I](https://www.youtube.com/watch?v=q-Cl3whISCY)
- [Associativity II](https://www.youtube.com/watch?v=XBjqYavDUB8)
- [Parallel Scan (Prefix Sum) Operation](https://www.youtube.com/watch?v=CYr3YaQiMwo)
### Parallelism 3: Data-Parallelism
- [Data-Parallel Programming](https://www.youtube.com/watch?v=WW7TabCiOV8)
- [Data-Parallel Operations I](https://www.youtube.com/watch?v=Vd35YQ8DEO4)
- [Data-Parallel Operations II](https://www.youtube.com/watch?v=dcMgKtuAh3s)
- [Scala Parallel Operations](https://www.youtube.com/watch?v=NjkxjAT7ohE)
- [Splitters and Combiners](https://www.youtube.com/watch?v=Redz85Nlle4)
### Parallelism 4: Data-Structures for Parallel Computing
- [Implementing Combiners](https://www.youtube.com/watch?v=dTP0ntniB2I)
- [Parallel Two-phase Construction](https://www.youtube.com/watch?v=XcMtq3OdjQ0)
- [Conc-Tree Data Structure](https://www.youtube.com/watch?v=cUXHXKL8Xvs)
- [Amortized, Constant-Time Append Operation](https://www.youtube.com/watch?v=Ic5DUZLITVI)
- [Conc-Tree Combiners](https://www.youtube.com/watch?v=aLfFlCC1vjc)
### Concurrency 1, 2 & 3
- Live lectures
- Concurrency 1 ([Zoom Recording][ZoomConcurrency1]) ([YouTube version][YouTubeConcurrency1])
- Concurrency 2 ([Zoom Recording][ZoomConcurrency2]) ([YouTube version][YouTubeConcurrency2])
- Concurrency 3 ([Zoom Recording][ZoomConcurrency3]) ([YouTube version][YouTubeConcurrency3])
### Actors 1
- [Introduction: why actors?](https://www.youtube.com/watch?v=ZQAe9AItH8o)
- [The Actor Model](https://www.youtube.com/watch?v=c49tDZuFtPA)
### Actors 2
- [Message Processing Semantics](https://www.youtube.com/watch?v=Uxn1eg6R0Fc)
- [Designing Actor Systems](https://www.youtube.com/watch?v=uxeMJLo3h9k)
- [Testing Actor Systems](https://www.youtube.com/watch?v=T_2nwLr-H2s)
### Spark 1: Spark Basics
- [From Parallel to Distributed](https://www.youtube.com/watch?v=bfMbJ8NzTZI)
- [Latency](https://www.youtube.com/watch?v=igNIz2Ent5E)
- [RDDs, Spark's Distributed Collection](https://www.youtube.com/watch?v=EuVmW62aIXI)
- [RDDs: Transformations and Actions](https://www.youtube.com/watch?v=qJlfATheS38)
- [Evaluation in Spark: Unlike Scala Collections!](https://www.youtube.com/watch?v=0pVYuuUrN74)
- [Cluster Topology Matters](https://www.youtube.com/watch?v=lS4vRzwrmtU)
### Spark 2: Reduction Operations & Distributed Key-Value Pairs
- [Reduction Operations](https://www.youtube.com/watch?v=JhF0_Ka_iqU)
- [Pair RDDs](https://www.youtube.com/watch?v=kIUzgweDMUs)
- [Transformations and Actions on Pair RDDs](https://www.youtube.com/watch?v=ovf0GFbnp5g)
- [Joins](https://www.youtube.com/watch?v=kYpaZpj4qTM)
### Spark 3: Partitioning and Shuffling
- [Shuffling: What it is and Why it's important](https://www.youtube.com/watch?v=LrgA4PrKrks)
- [Partitioning](https://www.youtube.com/watch?v=sTcki6mxjcA)
- [Optimizing with Partitioners](https://www.youtube.com/watch?v=4Vfp5kp2jnE)
- [Wide vs Narrow Dependencies](https://www.youtube.com/watch?v=L9BnaYp10c8)
# Labs
Labs are individual assignments where you get to write Scala programs using the concepts learned during lectures.
Labs are submitted by pushing your code on GitLab, see details in the [grading and submission](labs/grading-and-submission.md) page.
| Labs | Name | Start date | Due date (23:59 [AoE](https://en.wikipedia.org/wiki/Anywhere_on_Earth)) |
| :-- | :-- | :-- | :-- |
| Lab 1 | Parallel Box Blur Filter | 24.02.21 | 07.03.2021 |
| Lab 2 | Reductions and Prefix Sums | 01.03.21 | 14.03.2021 |
| Lab 3 | K-Means | 08.03.21 | 21.03.2021 |
| Lab 4 | Barnes-Hut Simulation | 15.03.21 | 28.03.2021 |
| Lab 5 | Bounded Buffer | 22.03.21 | 04.04.2021 |
| Lab 6 | Lock-free Sorted List | 29.03.21 | 25.04.2021 |
| Lab 7 | Actors Binary Tree | 26.04.21 | 09.05.2021 |
| Lab 8 | Wikipedia | 10.05.21 | 23.05.2021 |
| Lab 9 | StackOverflow | 17.05.21 | 30.05.2021 |
# Exercises
Exercises are pen and paper style questions that will help you consolidate the knowledge learned during lectures.
Exercises should be done in groups and submitted on GitLab.
You should form groups of up to five students for each exercise, solve the exercise remotely with your group (using Discord, Hangouts, Zoom, ...), and write your solutions in a text file.
The first line of your solution file should list all the group members' SCIPER numbers.
After you solve the exercise with your group, **each member should submit a copy of this file** to their GitLab repository following the instructions given in the problem statement.
Exercises will be given a participation grade at the end of the semester, which accounts for 5% of the overall course grade.
| Exercises | Start date | Due date (23:59 [AoE](https://en.wikipedia.org/wiki/Anywhere_on_Earth)) |
| :-- | :-- | :-- |
| Exercise 1 | 01.03.2021 | 07.03.2021 |
| Exercise 2 | 08.03.2021 | 14.03.2021 |
| Exercise 3 | 15.03.2021 | 21.03.2021 |
| Exercise 4 | 26.04.2021 | 02.05.2021 |
| Exercise 5 | 03.05.2021 | 09.05.2021 |
| Exercise 6 | 10.05.2021 | 16.05.2021 |
| Exercise 7 | 17.05.2021 | 23.05.2021 |
| Exercise 8 | 24.05.2021 | 30.05.2021 |
# Exams
The midterm exam will take place on 14.04.21. The midterm exam will cover all the material seen in the class up to week 6 (included).
~~The final exam will take place on 02.06.21.~~ The final exam has been rescheduled to Saturday, July 10th, from 8:15 to 11:15, in rooms SG1, AAC137, CO1, and CE6. The final exam will cover all material seen during the semester.
Information about exam organization will be communicated by email.
[Zoom1]: https://epfl.zoom.us/rec/share/wKUHvD1vNyczKpb2e161QCnqi3BtURL5D9zfVie3iXqXmrsIbJ2lrl4cCO_1FgSx.ZfvYlrRfSAFFdMcq?startTime=1614170578000
[ZoomConcurrency1]: https://epfl.zoom.us/rec/play/LzV8eccIFbEgHMi-DLJqynUus7chng8NYyR7XeA5Jjn_NGqkeAhIiTNsOLmIOp0XJHfjrNeB-XS6F5Fw.47WW7-mux-mqQZN8?startTime=1616586685000
[ZoomConcurrency2]: https://epfl.zoom.us/rec/play/nIeJO-yRcbqmjzIZWU4yU3gQC5h8rQlebVxbWSUEP426NvwgSsmis9lOOyV-m7SjVqFV9ML0vggMDg5N.01wR77_gQbSIUv8b?autoplay=true
[ZoomConcurrency3]: https://epfl.zoom.us/rec/share/Y-CKNw08e88aeAhRhApCePes-1PWj97C6x6imLRM_X4kepGNjMjZimwDXCd4wGuk.hEUUleJEvt8MClDx?startTime=1619005278000
[YouTubeConcurrency1]: https://www.youtube.com/watch?v=5oUpSoUoII4
[YouTubeConcurrency2]: https://www.youtube.com/watch?v=Jvo-vrxaGnk
[YouTubeConcurrency3]: https://www.youtube.com/watch?v=t4tqMzfvclk
Moved to [Moodle](https://moodle.epfl.ch/course/view.php?id=14388).
# Exercise 1
Use the following commands to make a fresh clone of your repository:
```
git clone -b exercise-1 git@gitlab.epfl.ch:lamp/student-repositories-s21/cs206-GASPAR.git exercise-1
```
Update the README.md file with your solutions. Don't forget to list the group members' SCIPER numbers.
# Problem 1: Introduction to Concurrency
Freshly graduated from EPFL, you all have been hired as contractors for a successful and rapidly growing bank. The bank has recently been experiencing problems with their money management system, coded in Scala, and so they hired the best and brightest young engineers they could find: you! The system has been working perfectly fine so far, they tell you. In the past days, due to an increased number of customers, they had to switch from a single threaded sequential execution environment to a multithreaded concurrent environment, in which multiple threads may perform transactions concurrently. That's when problems started, your manager says…
Below is the code responsible for withdrawing money from the account `from` and transferring it to the account `to`, within the same bank.
```scala
def transfer(from: Account, to: Account, amount: BigInt): Unit = {
  require(amount >= 0)
  val balanceFrom = from.balance
  if (balanceFrom >= amount) {
    from.balance = balanceFrom - amount
    val balanceTo = to.balance
    to.balance = balanceTo + amount
  }
}
```
For the bank, it is very important that the following two properties hold after any sequence of completed transfer transactions:
1. The balance of an account never goes below 0.
2. The total sum of money held by the bank is constant.
## Question 1
Does the above transfer method respect the two properties in a *sequential* execution environment, that is, when there is only one thread in the program?
## Question 2
What can go wrong in a setting where multiple threads can execute the `transfer` method concurrently? For each of the two desired properties of the system, check whether it holds in this concurrent environment. If not, come up with an example execution which exhibits a violation of the property.
## Question 3
For each of the proposed implementations of `transfer` below, check which of the properties hold. Additionally, check if the system is vulnerable to *deadlocks*.
Variant 1
```scala
def transfer(from: Account, to: Account, amount: Long): Unit = {
  require(amount >= 0)
  val balanceFrom = from.balance
  if (balanceFrom >= amount) {
    from.synchronized {
      from.balance = balanceFrom - amount
    }
    to.synchronized {
      val balanceTo = to.balance
      to.balance = balanceTo + amount
    }
  }
}
```
Variant 2
```scala
def transfer(from: Account, to: Account, amount: Long): Unit = {
  require(amount >= 0)
  from.synchronized {
    val balanceFrom = from.balance
    if (balanceFrom >= amount) {
      from.balance = balanceFrom - amount
      to.synchronized {
        val balanceTo = to.balance
        to.balance = balanceTo + amount
      }
    }
  }
}
```
Variant 3
```scala
object lock // Global object

def transfer(from: Account, to: Account, amount: Long): Unit = {
  require(amount >= 0)
  lock.synchronized {
    val balanceFrom = from.balance
    if (balanceFrom >= amount) {
      from.balance = balanceFrom - amount
      val balanceTo = to.balance
      to.balance = balanceTo + amount
    }
  }
}
```
# Problem 2: Parallel Reductions
## Question 1
As a group, write a function called `minMax`, which should take a non-empty array as input and return a pair containing the smallest and the largest element of the array.
```scala
def minMax(a: Array[Int]): (Int, Int) = ???
```
Now write a parallel version of the function. You may use the constructs `task` and/or `parallel`, as seen in the lectures.
## Question 2
Imagine that the data structure you are given, instead of an `Array[A]`, is one called `ParSeq[A]`. This class offers the two following methods, which work in parallel:
```scala
def map[B](f: A => B): ParSeq[B]
def reduce(f: (A, A) => A): A
```
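For instance (a toy illustration of ours, not part of the exercise), the sum of the squares of all elements can be computed with one `map` followed by one `reduce`:
```scala
// Both steps run in parallel on the ParSeq.
def sumOfSquares(data: ParSeq[Int]): Int =
  data.map(x => x * x).reduce(_ + _)
```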
Can you write the following `minMax` function in terms of `map` and/or `reduce` operations?
```scala
def minMax(data: ParSeq[Int]): (Int, Int) = ???
```
## Question 3
What property does the function `f` passed to `reduce` need to satisfy in order to have the same result regardless of how `reduce` groups the applications of the operation `f` to the elements of the data structure? Prove that your function `f` indeed satisfies that property.
# Exercise 2
Use the following commands to make a fresh clone of your repository:
```
git clone -b exercise-2 git@gitlab.epfl.ch:lamp/student-repositories-s21/cs206-GASPAR.git exercise-2
```
Update the README.md file with your solutions. Don't forget to list the group members' SCIPER numbers.
# Problem 1: Aggregate
In this week's lecture, you have been introduced to the `aggregate` method of `ParSeq[A]` (and other parallel data structures...). It has the following signature:
```scala
def aggregate[B](z: B)(f: (B, A) => B, g: (B, B) => B): B
```
Discuss, as a group, what `aggregate` does and what its arguments represent.
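As a concrete illustration (ours, not part of the exercise), counting vowels in a parallel sequence of characters: `z` is the initial value for each chunk, `f` folds one element into a chunk's partial result, and `g` combines the partial results of two chunks:
```scala
def countVowels(data: ParSeq[Char]): Int =
  data.aggregate(0)(
    (count, c) => if ("aeiou".contains(c)) count + 1 else count, // f
    _ + _                                                        // g
  )
```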
## Question 1
Consider the parallel sequence `xs` containing the three elements `x1`, `x2` and `x3`. Also consider the following call to aggregate:
```scala
xs.aggregate(z)(f, g)
```
The above call might potentially result in the following computation:
```scala
f(f(f(z, x1), x2), x3)
```
But it might also result in other computations. Come up with at least 2 other computations that may result from the above call to `aggregate`.
## Question 2
Below are other examples of calls to aggregate. In each case, check if the call can lead to different results depending on the strategy used by `aggregate` to aggregate all values contained in `data` down to a single value. You should assume that `data` is a parallel sequence of values of type `BigInt`.
Variant 1
```scala
data.aggregate(1)(_ + _, _ + _)
```
Variant 2
```scala
data.aggregate(0)((acc, x) => x - acc, _ + _)
```
Variant 3
```scala
data.aggregate(0)((acc, x) => acc - x, _ + _)
```
Variant 4
```scala
data.aggregate(1)((acc, x) => x * x * acc, _ * _)
```
## Question 3
Under which condition(s) on `z`, `f`, and `g` does `aggregate` always lead to the same result?
Come up with a formula on `z`, `f`, and `g` that implies the correctness of `aggregate`.
*Hint*: You may find it useful to use calls to `foldLeft(z)(f)` in your formula(s).
## Question 4
Implement `aggregate` using the methods `map` and/or `reduce` of the collection you are defining aggregate for.
## Question 5
Implement `aggregate` using the `task` and/or `parallel` constructs seen in the first week and the `Splitter[A]` interface seen in this week's lecture. The `Splitter` interface is defined as:
```scala
trait Splitter[A] extends Iterator[A] {
  def split: Seq[Splitter[A]]
  def remaining: Int
}
```
You can assume that the data structure you are defining aggregate for already implements a `splitter` method which returns an object of type `Splitter[A]`.
Your implementation of `aggregate` should work in parallel when the number of remaining elements is above the constant `THRESHOLD`, and sequentially when it is below.
*Hint*: `Iterator`, and thus `Splitter`, implements the `foldLeft` method.
## Question 6
Discuss the implementations from questions 4 and 5. Which one do you think would be more efficient?
# Problem 2: Depth
Review the notion of depth seen in the lecture. What does it represent?
Below is a formula for the depth of a *divide and conquer* algorithm working on an array segment of *size L*, as a function of *L*. The values *c*, *d* and *T* are constants. We assume that *L>0* and *T>0*.
![](images/2-1.png)
Below the threshold *T*, the algorithm proceeds sequentially and takes time *c* to process each single element. Above the threshold, the algorithm is applied recursively over the two halves of the array. The results are then merged using an operation that takes *d* units of time.
## Question 1
Is it the case that for all *1 ≤ L1 ≤ L2* we have *D(L1) ≤ D(L2)*?
If it is the case, prove the property by induction on *L*. If it is not the case, give a counterexample showing values of *L1*, *L2*, *T*, *c*, and *d* for which the property does not hold.
## Question 2
Prove a logarithmic upper bound on *D(L)*. That is, prove that *D(L)* is in *O(log(L))* by finding specific constants *a*, *b* such that *D(L) ≤ a &ast; log2(L) + b*.
*Hint:* The proof is more complex than it might seem. One way to make it more manageable is to define and use a function *D'(L)* that has the property described in question 1, and is greater than or equal to *D(L)*. We suggest you use:
![](images/2-2.png)
Note that computing *D'(L)* when *L* is a power of 2 is easy. Also remember that there always exists a power of 2 between any positive integer and its double.
# Exercise 3
Use the following commands to make a fresh clone of your repository:
```
git clone -b exercise-3 git@gitlab.epfl.ch:lamp/student-repositories-s21/cs206-GASPAR.git exercise-3
```
Update the README.md file with your solutions. Don't forget to list the group members' SCIPER numbers.
# Problem 1: Parallel Encoding
In this exercise, your group will devise a parallel algorithm to encode sequences using the run-length encoding scheme. The encoding is very simple. It transforms sequences of letters such that all subsequences of the same letter are replaced by the letter and the sequence length. For instance:
```
"AAAAATTTGGGGTCCCAAC" ⇒ "A5T3G4T1C3A2C1"
```
Your goal in this exercise is to come up with a parallel implementation of this algorithm. The function should have the following shape:
```scala
def rle(data: ParSeq[Char]): Buffer[(Char, Int)] =
data.aggregate(???)(???, ???)
```
The Buffer class is already given to you. A buffer of type `Buffer[A]` represents sequences of elements of type `A`. It supports the following methods, all of which are efficient:
```scala
def isEmpty: Boolean // Checks if the buffer is empty.
def head: A // Returns the first element of the buffer.
def tail: Buffer[A] // Returns the buffer minus its first element.
def last: A // Returns the last element of the buffer.
def init: Buffer[A] // Returns the buffer minus its last element.
def ++(that: Buffer[A]): Buffer[A] // Concatenate two buffers.
def append(elem: A): Buffer[A] // Appends a single element to the right.
Buffer.empty[A]: Buffer[A] // Returns an empty buffer.
Buffer.singleton[A](element: A): Buffer[A] // Single element buffer.
```
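As a quick illustration of these semantics (ours, assuming the methods behave as documented above):
```scala
// A buffer holding two runs, built with the API described above.
val b = Buffer.singleton(('A', 2)).append(('T', 1)) // [(A,2), (T,1)]
b.head                                              // ('A', 2)
b.last                                              // ('T', 1)
b.init ++ Buffer.singleton(('C', 3))                // [(A,2), (C,3)]
```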
# Problem 2: Parallel Two Phase Construction
In this exercise, you will implement an array `Combiner` backed internally by a doubly linked list (DLL). Below is a minimal implementation of the `DLLCombiner` class and the related `Node` class. Your goal for this exercise is to complete the implementation of the (simplified) `Combiner` interface of the `DLLCombiner` class.
```scala
class DLLCombiner[A] extends Combiner[A, Array[A]] {
  var head: Node[A] = null // null for empty lists.
  var last: Node[A] = null // null for empty lists.
  var size: Int = 0

  // Implement these three methods...
  override def +=(elem: A): Unit = ???
  override def combine(that: DLLCombiner[A]): DLLCombiner[A] = ???
  override def result(): Array[A] = ???
}

class Node[A](val value: A) {
  var next: Node[A] = null     // null for last node.
  var previous: Node[A] = null // null for first node.
}
```
**Question 1:** What computational complexity do your methods have? Are the actual complexities of your methods acceptable according to the `Combiner` requirements?
**Question 2:** One of the three methods you have implemented, `result`, should work in parallel according to the `Combiner` contract. Can you think of a way to implement this method efficiently using 2 parallel tasks?
**Question 3:** Can you, given the current internal representation of your combiner, implement `result` so that it executes efficiently using 4 parallel tasks? If not, can you think of a way to make it possible?
*Hint:* This is an open-ended question; there might be multiple solutions. In your solution, you may want to add extra information to the `Node` and/or `DLLCombiner` classes.
# Problem 3: Pipelines
In this exercise, we look at pipelines of functions. A pipeline is simply a function which applies each function of a sequence successively to its argument. To illustrate this, consider the following pipeline of 4 functions:
```scala
val p: Int => Int = toPipeline(ParSeq(_ + 1, _ * 2, _ + 3, _ / 4))
```
The pipeline `p` is itself a function. Given a value `x`, the pipeline above performs the following computation to process it:
```scala
p(x) = (((x + 1) // Application of the first function
          * 2)   // Application of the second function
         + 3)    // Application of the third function
        / 4      // Application of the fourth function
```
In this exercise, we will investigate the possibility of processing such pipelines in parallel.
**Question 1:** Implement the following `toPipeline` function, which turns a parallel sequence of functions into a pipeline. You may use any of the parallel combinators available on `ParSeq`, such as the parallel `fold` or the parallel `reduce` methods.
```scala
def toPipeline[A](fs: ParSeq[A => A]): A => A = ???
```
*Hint:* Functions have a method called `andThen`, which implements function composition: it takes another function as argument and returns a function. The returned function first applies the original function, and then applies the argument function to the result. You may find it useful in your implementation of `toPipeline`.
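For example (a quick illustration of ours with plain Scala functions):
```scala
val addOne: Int => Int = _ + 1
val double: Int => Int = _ * 2
val both = addOne andThen double
both(3) // == double(addOne(3)) == 8
```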
**Question 2:** Given that your `toPipeline` function works in parallel, would the pipelines it returns also work in parallel? Would you expect pipelines returned by a sequential implementation of toPipeline to execute any slower? If so, why?
Discuss those questions with your group and try to get a good understanding of what is happening.
**Question 3:** Instead of arbitrary functions, we will now consider functions that are constant everywhere except on a finite domain. We represent such functions in the following way:
```scala
class FiniteFun[A](mappings: immutable.Map[A, A], default: A) {
  def apply(x: A): A =
    mappings.get(x) match {
      case Some(y) => y
      case None    => default
    }

  def andThen(that: FiniteFun[A]): FiniteFun[A] = ???
}
```
Implement the andThen method. Can pipelines of such finite functions be efficiently constructed in parallel using the appropriately modified `toPipeline` method? Can the resulting pipelines be efficiently executed?
**Question 4:** Compare the *work* and *depth* of the following two functions, assuming infinite parallelism. For which kind of input would the parallel version be asymptotically faster?
```scala
def applyAllSeq[A](x: A, fs: Seq[FiniteFun[A]]): A = {
  // Applying each function sequentially.
  var y = x
  for (f <- fs)
    y = f(y)
  y
}

def applyAllPar[A](x: A, fs: ParSeq[FiniteFun[A]]): A =
  if (fs.isEmpty) x
  else {
    // Computing the composition in parallel.
    val p = fs.reduce(_ andThen _)
    // Applying the pipeline.
    p(x)
  }
```
# Exercise 4
Use the following commands to make a fresh clone of your repository:
```
git clone -b exercise-4 git@gitlab.epfl.ch:lamp/student-repositories-s21/cs206-GASPAR.git exercise-4
```
Update the README.md file with your solutions. Don't forget to list the group members' SCIPER numbers.
# Problem 1: Implementing map and filter on Futures
In this exercise, you will come up with an implementation of the `map` and `filter` methods of Futures. First of all, spend some time as a group to make sure that you understand what those methods are supposed to do. Then, complete the following code to implement the two methods:
```scala
trait Future[T] { self =>

  def map[S](f: T => S): Future[S] =
    new Future[S] {
      def onComplete(callback: Try[S] => Unit): Unit = ???
    }

  def filter(f: T => Boolean): Future[T] =
    new Future[T] {
      def onComplete(callback: Try[T] => Unit): Unit = ???
    }
}
```
In the case of `filter`, if the original `Future` successfully returns a value which does not satisfy the predicate, the new `Future` should return a `Failure` containing a `NoSuchElementException`.
# Problem 2: Coordinator / Worker
In this exercise, you will have to implement a Coordinator / Worker actor system, in which one actor, the coordinator, dispatches work to other actors, the workers. Between the coordinator and the workers, only two kinds of messages are sent: `Request` and `Ready` messages.
```scala
case class Request(computation: => Unit)
case object Ready
```
The coordinator actor sends `Request` messages to workers to request them to perform some computation (passed as an argument of `Request`). Upon reception of a `Request`, a worker should perform the computation. Workers should send a `Ready` message to their coordinator whenever they finish executing the requested computation, and also right after they are created.
The coordinator actor itself receives requests through `Request` messages from clients. The coordinator actor should then dispatch the work to worker actors. The coordinator should however never send a request to a worker which has not declared itself ready via a `Ready` message beforehand.
Implement the `Coordinator` and `Worker` classes.
```scala
class Coordinator extends Actor {
  ???
  override def receive = ???
}

class Worker(coordinator: Coordinator) extends Actor {
  ???
  override def receive = ???
}
```
An example system using the Coordinator and Worker actors is shown below.
```scala
object Main extends App {
  val coordinatorProps: Props = Props(new Coordinator())
  def workerProps(coord: Coordinator): Props = Props(new Worker(coord))

  val system = ActorSystem("coordinator/worker")
  val coordinator = system.actorOf(coordinatorProps)
  val workers = Seq.fill(10) {
    system.actorOf(workerProps(coordinator))
  }

  // Now, clients should be able to send requests to the coordinator…
  coordinator ! Request(println(3 + 5))
  coordinator ! Request(println(67 * 3))
  // And so on...
}
```
*Hint*: In order to fulfill its job, the coordinator should remember which workers are ready and what requests are still to be allocated to a worker.
# Problem 1: Introduction to Concurrency
## Question 1
Yes. In a sequential execution, the check `balanceFrom >= amount` guarantees that the balance of `from` never goes below 0, and every completed transfer adds to `to` exactly the amount it removes from `from`, so the total sum is unchanged.
## Question 2
#### Property 1 holds
Assuming that the execution of two concurrent threads only interleaves instructions and that reads and writes are executed atomically, it can be shown that property 1 always holds. Unfortunately, such strong guarantees are not offered by the Java Memory Model. If you are interested, have a look at the note below on the Java Memory Model.
#### Violation of property 2
Consider 2 threads that execute `transfer(from, to, amount)` concurrently with the exact same parameters. Assume that the account `from` has sufficient funds for at least one transfer.
Thread 1 executes until it has computed the value `balanceFrom - amount` and then stops. Thread 2 then executes the call to `transfer(from, to, amount)` in its entirety. Then thread 1 resumes its execution and completes the call to `transfer`.
At the end of this execution, the total amount of money held by the bank has changed: it has in fact increased by `amount`.
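For concreteness, here is one such interleaving, with example numbers of our choosing: `from.balance == 100`, `to.balance == 0`, `amount == 50`.
```
Thread 1: balanceFrom = from.balance   // reads 100
Thread 2: full transfer(from, to, 50)  // from = 50, to = 50
Thread 1: from.balance = 100 - 50      // from = 50 (overwrites with the same value)
Thread 1: balanceTo = to.balance       // reads 50
Thread 1: to.balance = 50 + 50         // to = 100
Result: from = 50, to = 100; the total went from 100 to 150, i.e. up by amount.
```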
#### Note on the Java Memory Model
Assuming the Java Memory Model, both of the two properties can potentially be violated. Indeed, the model only ensures that the execution of each thread appears sequential to the thread itself, and not to any other concurrently running threads. Seemingly atomic instructions can be arbitrarily decomposed by the underlying virtual machine. Sequences of instructions can also be reordered at will by the VM, as long as the execution of a single thread appears as if it were executed sequentially. In these settings, both properties can be violated.
## Question 3
Variant 1
In this variant, property 2 can be violated. It is not vulnerable to deadlocks.
Variant 2
In this variant, none of the two properties can be violated. However, it is susceptible to deadlocks.
Variant 3
In this last variant, none of the two properties can be violated and no deadlock can occur. It is however still not entirely satisfactory, since no two threads can execute transfers in parallel, even when the accounts involved are totally disjoint. Can you think of a better solution?
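One classic answer, sketched below under the assumption (ours, not part of the original statement) that every `Account` carries a unique, immutable `id` field: always acquire the two locks in a fixed global order. Lock ordering prevents deadlocks, while transfers touching disjoint accounts can still proceed in parallel.
```scala
// Sketch, assuming Account has a unique immutable `id: Long` (our assumption).
def transfer(from: Account, to: Account, amount: Long): Unit = {
  require(amount >= 0)
  // Lock the account with the smaller id first, so that all threads
  // acquire locks in the same global order and no deadlock can occur.
  val (first, second) = if (from.id < to.id) (from, to) else (to, from)
  first.synchronized {
    second.synchronized {
      val balanceFrom = from.balance
      if (balanceFrom >= amount) {
        from.balance = balanceFrom - amount
        to.balance = to.balance + amount
      }
    }
  }
}
```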
# Problem 2: Parallel Reductions
## Question 1
```scala
// `parallel` is the construct from the lectures; min and max come
// from the standard library.
import scala.math.{min, max}

def minMax(a: Array[Int]): (Int, Int) = {
  val threshold = 10

  def minMaxPar(from: Int, until: Int): (Int, Int) = {
    if (until - from <= threshold) {
      // Sequential base case: scan the segment once.
      var i = from
      var min = a(from)
      var max = a(from)
      while (i < until) {
        val x = a(i)
        if (x < min) min = x
        if (x > max) max = x
        i = i + 1
      }
      (min, max)
    } else {
      // Recursive case: process the two halves in parallel.
      val mid = from + ((until - from) / 2)
      val ((xMin, xMax), (yMin, yMax)) =
        parallel(minMaxPar(from, mid), minMaxPar(mid, until))
      (min(xMin, yMin), max(xMax, yMax))
    }
  }

  minMaxPar(0, a.size)
}
```
## Question 2
```scala
def minMax(data: ParSeq[Int]): (Int, Int) =
  data
    .map((x: Int) => (x, x))
    .reduce({ case ((mn1, mx1), (mn2, mx2)) =>
      (min(mn1, mn2), max(mx1, mx2))
    })
```
Or:
```scala
def minMax(data: ParSeq[Int]): (Int, Int) =
(data.reduce(min), data.reduce(max))
```
## Question 3
The function `f` must be associative. That is, for any `x`, `y`, `z`, it should be the case that:
```
f(x, f(y, z)) == f(f(x, y), z).
```
Both the `min` and `max` functions are associative. In addition, the pairwise application of associative functions is itself associative, from which it follows that `f` is indeed associative.
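To spell out the pairwise-application step (our expansion of the argument above), write `f((a1, b1), (a2, b2)) == (min(a1, a2), max(b1, b2))`. Then:
```
f((a1,b1), f((a2,b2), (a3,b3)))
  == (min(a1, min(a2, a3)), max(b1, max(b2, b3)))
  == (min(min(a1, a2), a3), max(max(b1, b2), b3))   // min and max are associative
  == f(f((a1,b1), (a2,b2)), (a3,b3))
```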
# Problem 1: Aggregate
## Question 1
- g(f(z, x1), f(f(z, x2), x3))
- g(f(f(z, x1), x2), f(z, x3))
- g(g(f(z, x1), f(z, x2)), f(z, x3))
- g(f(z, x1), g(f(z, x2), f(z, x3)))
## Question 2
Variant 1
This might lead to different results.
Variant 2
This might lead to different results.
Variant 3
This always leads to the same result.
Variant 4
This always leads to the same result.
## Question 3
A property that implies the correctness is:
```
forall xs, ys. g(xs.F, ys.F) == (xs ++ ys).F (split-invariance)
```
where we define
```
xs.F == xs.foldLeft(z)(f)
```
The intuition is the following. Take any computation tree for `xs.aggregate`. Such a tree has internal nodes labelled by `g` and segments processed using `foldLeft(z)(f)`. The split-invariance law above says that any internal `g`-node can be removed by concatenating the segments. By repeating this transformation, we obtain that the entire result equals `xs.foldLeft(z)(f)`.
The split-invariance condition uses `foldLeft`. The following two conditions together are a bit simpler and imply split-invariance:
```
forall u. g(u,z) == u (g-right-unit)
forall u, v. g(u, f(v,x)) == f(g(u,v), x) (g-f-assoc)
```
Assume g-right-unit and g-f-assoc. We wish to prove split-invariance, and do so by induction on the length of `ys`. If `ys` has length zero, then `ys.foldLeft` gives `z`, so by g-right-unit both sides reduce to `xs.foldLeft(z)(f)`. Now let `ys` have length `n > 0` and assume, by the induction hypothesis, that split-invariance holds for all sequences of length strictly less than `n`. Let `ys == ys1 :+ y` (that is, `y` is the last element of `ys`). Then:
```
g(xs.F, (ys1 :+ y).F) == (foldLeft definition)
g(xs.F, f(ys1.F, y)) == (by g-f-assoc)
f(g(xs.F, ys1.F), y) == (by I.H.)
f((xs++ys1).F, y) == (foldLeft definition)
((xs++ys1) :+ y).F == (properties of lists)
(xs++(ys1 :+ y)).F
```
## Question 4
```scala
def aggregate[B](z: B)(f: (B, A) => B, g: (B, B) => B): B =
  if (this.isEmpty) z
  else this.map((x: A) => f(z, x)).reduce(g)
```
## Question 5
```scala
def aggregate[B](z: B)(f: (B, A) => B, g: (B, B) => B): B = {
  def go(s: Splitter[A]): B = {
    if (s.remaining <= THRESHOLD)
      s.foldLeft(z)(f)
    else {
      val children = s.split
      val subtasks = children.map((t: Splitter[A]) => task { go(t) })
      subtasks.map(_.join()).reduce(g)
    }
  }
  go(splitter)
}
```
## Question 6
The version from question 4 may require two traversals (one for `map`, one for `reduce`) and does not benefit from the (potentially faster) sequential operator `f`. The version from question 5 avoids both issues, so we would expect it to be the more efficient of the two.
# Problem 2: Depth
## Question 1
Somewhat counterintuitively, the property doesn't hold. To show this, let's take the following values for *L1*, *L2*, *T*, *c*, and *d*.
```
L1 = 10, L2 = 12, T = 11, c = 1, and d = 1.
```
Using those values, we get that:
```
D(L1) = 10
D(L2) = max(D(6), D(6)) + 1 = 7
```
## Question 2
*Proof sketch*
Define the following function D'(L).
![](images/2-2.png)
Show that *D(L) ≤ D'(L)* for all *1 ≤ L*.
Then, show that, for any *1 ≤ L1 ≤ L2* we have *D'(L1) ≤ D'(L2)*. This property can be shown by induction on *L2*.
Finally, let *n* be such that *L ≤ 2^n < 2L*. We have that:
```
D(L) ≤ D'(L)                                    Proven earlier.
     ≤ D'(2^n)                                  Monotonicity of D', also proven earlier.
     ≤ log2(2^n) (d + cT) + cT                  Computation of D' on a power of 2.
     < log2(2L) (d + cT) + cT
     = log2(L) (d + cT) + log2(2) (d + cT) + cT
     = log2(L) (d + cT) + d + 2cT
```
Done.
# Problem 1: Parallel Encoding
```scala
def rle(data: ParSeq[Char]): Buffer[(Char, Int)] = {
  // Combines two partial encodings, merging the boundary runs
  // if they encode the same character.
  def g(as: Buffer[(Char, Int)], bs: Buffer[(Char, Int)]) =
    if (as.isEmpty || bs.isEmpty || as.last._1 != bs.head._1)
      as ++ bs
    else
      as.init.append((as.last._1, as.last._2 + bs.head._2)) ++ bs.tail

  // Folds one character into a partial encoding, either extending
  // the last run or starting a new one.
  def f(acc: Buffer[(Char, Int)], x: Char) =
    if (acc.isEmpty || acc.last._1 != x)
      acc.append((x, 1))
    else
      acc.init.append((x, acc.last._2 + 1))

  val z: Buffer[(Char, Int)] = Buffer.empty
  data.aggregate(z)(f, g)
}
```
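As a quick sanity check (ours, assuming a `ParSeq` can be built from the characters of a string), running `rle` on the example from the statement:
```scala
// Hypothetical usage; the ParSeq construction below is our assumption.
val dna = ParSeq("AAAAATTTGGGGTCCCAAC": _*)
rle(dna)
// => Buffer((A,5), (T,3), (G,4), (T,1), (C,3), (A,2), (C,1))
// i.e. the encoding "A5T3G4T1C3A2C1".
```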
# Problem 2: Parallel Two Phase Construction
```scala
class DLLCombiner[A] extends Combiner[A, Array[A]] {
  var head: Node[A] = null
  var last: Node[A] = null
  var size: Int = 0

  override def +=(elem: A): Unit = {
    val node = new Node(elem)
    if (size == 0) {
      head = node
      last = node
      size = 1
    } else {
      last.next = node
      node.previous = last
      last = node
      size += 1
    }
  }

  override def combine(that: DLLCombiner[A]): DLLCombiner[A] =
    if (this.size == 0)
      that
    else if (that.size == 0)
      this
    else {
      this.last.next = that.head
      that.head.previous = this.last
      this.size = this.size + that.size
      this.last = that.last
      this
    }

  // This is not implemented in parallel yet.
  override def result(): Array[A] = {
    val data = new Array[A](size)
    var current = head
    var i = 0
    while (i < size) {
      data(i) = current.value
      i += 1
      current = current.next
    }
    data
  }
}
```
## Question 1
The complexity of `+=` is constant, as is the complexity of `combine`. This is well within the desired complexity range. The `result` method takes time linear in the size of the data, which is acceptable according to the `Combiner` requirements. However, according to the contract, `result` should work in parallel, which is not the case here.
## Question 2
The idea is to copy the doubly linked list to the array from both ends at the same time. For this, we create a task that handles the second half of the array, while the current thread copies the first half.
```scala
override def result(): Array[A] = {
  val data = new Array[A](size)
  val mid = size / 2

  // This is executed on a different thread.
  val taskEnd = task {
    var i = size - 1
    var current = last
    while (i >= mid) {
      data(i) = current.value
      current = current.previous
      i -= 1
    }
  }

  // This is executed on the current thread.
  var i = 0
  var current = head
  while (i < mid) {
    data(i) = current.value
    current = current.next
    i += 1
  }

  taskEnd.join()
  data
}
```
## Question 3
The actual answer to this question is: *it depends*. To see why, we first make the following observation:
All implementations of the result function must consist of primarily two operations:
1. Moving to the next node in the list, and,
2. Copying the value of the node to the array.
Depending on the actual cost of the two operations, one may devise schemes that can make efficient use of more than two threads. For instance, assume for a moment that copying a value to the array is significantly costlier than moving to the next node in the list. In this case, we could execute the function efficiently in parallel by spawning multiple threads starting from the head of the list, each handling a disjoint set of indexes (for instance, one thread takes indexes of the form 4n, another 4n + 1 and so on).
On the other hand, if we assume that moving to the next node in the list has a cost comparable to the one of copying a value to the array, then finding such a strategy is more challenging, or even impossible.
However, there are ways to circumvent this problem by modifying the data structure used. One way could be to keep track of the middle of the doubly linked list. The `result` method could then execute in parallel on 4 different threads by copying the array from both ends and from the middle (in both directions) simultaneously. The problem would then be to efficiently maintain the pointer to the middle of the list, which might not be a trivial task when combining arbitrary lists together. If you are interested in learning more about such data structures, we encourage you to look up the skip list data structure, which generalises this idea.
Another solution would be to modify the nodes so that they also point to their successor's successor and their predecessor's predecessor. This way, two threads could start from the start of the list and two from the end. In each case, one thread would be responsible for odd indexes and the other for even ones. This solution does not change at all the complexity of the various Combiner operations, but requires a bit more bookkeeping.
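A sketch of the augmented node described in the last paragraph (the extra field names are ours):
```scala
class Node[A](val value: A) {
  var next: Node[A] = null     // Successor, null for the last node.
  var previous: Node[A] = null // Predecessor, null for the first node.
  // Assumed extra links (our naming): jump two positions at a time, so
  // that one thread can traverse the even positions and another the odd
  // ones, starting from either end of the list.
  var nextNext: Node[A] = null
  var previousPrevious: Node[A] = null
}
```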
# Problem 3: Pipelines
## Question 1
```scala
def toPipeline[A](fs: ParSeq[A => A]): A => A =
  if (fs.isEmpty)
    (x: A) => x
  else
    fs.reduce(_ andThen _)
```
## Question 2
Even though the pipeline is constructed in parallel, *it will not itself execute in parallel*. The resulting pipeline still has to apply its argument to all the functions it contains, one after the other: the `andThen` method simply returns a function that applies the first function and then the second, sequentially.
## Question 3
```scala
def andThen(that: FiniteFun[A]): FiniteFun[A] = {
  val newDefault = that(default)
  val newMappings = for {
    (x, y) <- mappings
    z = that(y)
    if z != newDefault
  } yield (x, z)
  new FiniteFun(newMappings, newDefault)
}
```
Pipelines of such functions can be efficiently constructed in parallel, as was the case for "normal" functions. Interestingly, the resulting pipeline can also be executed efficiently: its execution time does not depend on the number of functions in the pipeline, only on the lookup time in the finite map `mappings` (which can be nearly constant if the underlying map is a hash table). The size of this map is upper bounded by the sizes of the mappings of the functions in the pipeline.
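For instance (the example values are ours):
```scala
import scala.collection.immutable

val f = new FiniteFun[Int](immutable.Map(1 -> 2, 2 -> 3), 0)
val g = new FiniteFun[Int](immutable.Map(2 -> 10, 0 -> 5), 7)
val h = f andThen g

h(1)  // == g(f(1)) == g(2) == 10
h(42) // == g(f(42)) == g(0) == 5, the new default
```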
## Question 4
To simplify the analysis, we will assume that lookup in `mappings` takes constant time, and thus that applying a `FiniteFun` also takes constant time. Let's also assume that `fs` has size `n` for both functions.
Since the function is purely sequential, the work and depth of `applyAllSeq` are equal. They amount to `n` applications of a finite function, which is linear in `n`.
For `applyAllPar`, things are a bit more complex. Let's denote by `d` the size of the largest domain among the functions passed as argument.
The depth of the function is simply the depth of computing the pipeline (`fs.reduce(_ andThen _)`), plus a constant for applying the pipeline. Assuming infinite parallelism, this results in a depth in `O(log2(n) ⋅ d)`.
The work of `applyAllPar` is significantly larger than its depth, and can be upper bounded by `O(n ⋅ d)`: there are `n` applications of the `andThen` method, each of which takes `O(d)` time.
When `d` is a constant, the parallel version will be asymptotically faster than its sequential counterpart. If `d` is exponentially larger than `n`, the sequential version is expected to perform better.
```diff
@@ -14,52 +14,31 @@ stages:
 compile:
   stage: build
-  timeout: 15m
-  image: lampepfl/moocs:cs206-2021-02-22
+  timeout: 10m
+  image: ${DOCKER_REPO}/compile
   tags:
     - cs210
   except:
     - tags
-    - master
-    - exercise-1
-    - exercise-2
-    - exercise-3
-    - exercise-4
-    - exercise-5
-    - exercise-6
-    - exercise-7
-    - exercise-8
+    - main
   script:
     - sbt packageSubmission
   # Output of this job that needs to be stored to be reused in the grade job below
   artifacts:
-    expire_in: 1 day
+    expire_in: 24h
     paths:
       - submission.jar

 grade:
   stage: grade
-  timeout: 15m
+  timeout: 10m
   except:
     - tags
-    - master
-    - exercise-1
-    - exercise-2
-    - exercise-3
-    - exercise-4
-    - exercise-5
-    - exercise-6
-    - exercise-7
-    - exercise-8
+    - main
   tags:
     - cs210
-  image:
-    name: lampepfl/moocs:cs206-2021-${CI_COMMIT_BRANCH}
-    entrypoint: [""]
-  variables:
-    JAR_PATH: "${CI_PROJECT_DIR}/submission.jar"
+  image: ${DOCKER_REPO}/grader-${CI_COMMIT_BRANCH}
   allow_failure: true
   before_script:
     - mkdir -p /shared/submission/
     - cp submission.jar /shared/submission/submission.jar
-  script:
-    - cd /grader
-    - /grader/grade | /grader/feedback-printer
+  script: "😎"
```
# Example lab
The goal of this lab is to familiarize yourself with the infrastructure and tools used in this class. Even though the grade in this lab won't influence your grade for the course, it is important that you work through this lab carefully.
## Part 1: Obtaining the Project Files
First, make sure you've followed the [Tools Setup](tools-setup.md) page.
**At this point, we strongly encourage you to take the time to read at least the first three chapters of the [Git Book](https://git-scm.com/book/en/v2). If you just copy-paste the commands we give you without understanding them, it's likely that you'll make a mistake somewhere and waste time. Git can be a huge productivity enhancer when used correctly, so it's definitely worth the investment!**
We'll start by cloning the repository containing the example lab. Make sure to replace `GASPAR` with your EPFL username (the one with letters, not the one with numbers) in the following command.
```shell
git clone -b example git@gitlab.epfl.ch:lamp/student-repositories-s21/cs206-GASPAR.git cs206-example
```
**If this command fails, make sure you've [logged into
gitlab](https://gitlab.epfl.ch/users/sign_in) and [registered in a
group](https://gitlab.epfl.ch/lamp/cs206/-/blob/master/exercises/Group%20workspaces.md),
then wait a few minutes.
If it still doesn't work, it's likely that you didn't correctly upload your ssh
key to gitlab; look at the last part of the [Tools Setup](tools-setup.md) page again.**
```shell
cd cs206-example
```
Now that we've obtained the project, let's take a look at its structure:
```shell
.
├── build.sbt
├── project
│   ├── ...
└── src
    ├── main
    │   └── scala
    │       └── example
    │           └── Lists.scala
    └── test
        └── scala
            └── example
                └── ListsSuite.scala
```
- All the files ending with `.sbt` or in the `project/` directory are build tool configuration files: you don't need to modify them or look at them for any of the labs
- The project sources are in `src/main/scala/`
- The sources of the unit tests are in `src/test/scala/`. You will need to make all the tests pass to complete the labs, and you should write additional tests to check for cases that our tests do not cover.
## Part 2: Using sbt
Start sbt by running:
```shell
sbt
```
Once it's finished starting (this may take a while), you'll be able to enter sbt commands. You can compile your project using `compile` and run the tests with `test` (this automatically compiles your code first if needed). Note that if compilation fails, no tests will be run. The first time you run `test` in a lab you should see many errors: that's normal, your job is to make the tests pass! But first, let's look at a failed test in detail:
![](images/sbt-test-error.png)
This tells us several things:
- There's a test named `sum of a few numbers (10pts)` in the class `ListsSuite` in the package `example`
- The test failed (that's why it's in red and starts with `==> X`) with an exception: `NotImplementedError`.
- This exception was thrown from the method `???` in `scala.Predef` in the file `Predef.scala`. This file is not part of our project (that's why it's in grey); to find the actual error in our code, we have to look at where this method was called from.
- This method was called from the method `max` in `example.Lists` in the file `Lists.scala` at line 40: this is where the bug is!
- It's also important to see where in our test this was called from; here it's line 102 of `ListsSuite.scala`.
Time to go fix that bug! The next section will show you how to do that using the IDE.
## Part 3: Using the IDE
### Setup
Let's upgrade the IDE support first, close VSCode if it's open and run:
```shell
code --force --install-extension scalameta.metals
```
### Startup
To start Code, run the following in the project directory (the same directory where you previously ran `sbt`). It's important to run Code in the correct directory, or it won't be able to import your project:
```shell
code .
```
(In this command the `.` is important: it's how we tell Code to run in the current directory.)
(If you see an error `Expected ';'`, it means you're inside sbt; open a new terminal in the same directory.)
The first time the IDE starts, it will take some time to download more components. Eventually it will ask you to import the build; please click "Import build":
![](images/metals-import.png)
You'll need to wait a bit for the import to finish. If an error appears, try closing and restarting Code in the same way we started it above.
### Usage
It's now time to dig in! Earlier we talked about a failing test. The stack trace told us that it was failing on line 102 of the file `ListsSuite.scala`, so let's open that file:
![](images/open-test.png)
Here's the source code of the test:
![](images/test-source.png)
The first line gives a name to the test; the second line runs `sum(List(1, 2, 0))` and tests that it equals 3. In our case we never got to this point, because an exception was thrown first. Recall that the second line of the stack trace was:
```scala
at example.Lists$.sum(Lists.scala:25)
```
This tells us that the crash happened when calling `sum`. We can hover over the call to `sum` in the test method to get more information on it:
![](images/hover.png)
**If hovering doesn't show this see the [Troubleshooting](#troubleshooting) section.**
The hover is split into two parts: the first part is:
```scala
def sum(xs: List[Int]): Int
```
This means that `sum` is a method that takes a `List` of `Int` as argument and returns an `Int`. The second part is the documentation of `sum`. We can jump to the definition of `sum` with `Ctrl+click` (`Cmd+click` on Mac) or with `right click -> Go to Definition`. Once there, we see:
![](images/sum-def.png)
Now we know why the test failed: `sum` calls `???`, a method defined in the Scala standard library that simply crashes your program. Whenever you see it in a lab, it means that this is something you need to replace with your own implementation.
Once you've implemented this method, you can run `test` from sbt again to see if the test passes. If you want to run a single test instead of all tests, you can use `testOnly` and specify part of the name of the test:
```scala
testOnly -- "--tests=.*max of a few.*"
```
You now know enough to be able to work with the IDE, here are some additional tips:
- When you press `Enter` to make a new line, the IDE will automatically indent the
line if needed (for example, if the last word on the previous line was
`then`), however it will never unindent code for you (for example, when
writing `else`). You can indent code manually by pressing `Tab` and
unindent it by pressing `Backspace` or `Shift + Tab`.
- When working on a lab, you are free to create as many methods, classes, and objects as you want. **But you shouldn't change the name of existing methods, classes, or objects, because that may break the automated grading system. This is important!**
- You can see a list of all warnings and errors reported by the compiler by clicking on ![](images/warnings-errors.png) at the bottom left of Code.
- The IDE can show you on hover the documentation of classes, defs, and vals defined in the current project, but support for external projects is currently missing. To compensate for this, you can consult the documentation online:
- The documentation for the Scala standard library is at [https://www.scala-lang.org/files/archive/api/2.13.3/](https://www.scala-lang.org/files/archive/api/2.13.3/)
- The documentation for the Java standard library is at [https://docs.oracle.com/en/java/javase/15/docs/api/index.html](https://docs.oracle.com/en/java/javase/15/docs/api/index.html)
- You can customize Code as much as you want, including installing additional extensions, but please avoid installing other Scala-related extensions: they may conflict with the one we use for this course.
- While working on your lab, you will regularly want to go back to the sbt console to run the tests. You could simply run the command `test` every time, but you can take advantage of the watch mode instead: if a command is prefixed by `~`, sbt will watch the source directory for changes and re-run the command every time a file is saved. So a possible workflow is:
1. Start the IDE
2. Start sbt in the terminal (protip: you can start a terminal inside Code from the
menu: `Terminal -> New Terminal`)
3. Inside sbt, run `~test`
4. Work in the IDE, and check the output of the sbt console from time to time
## Part 4: Running your code
Writing code and running tests is nice, but sometimes more direct feedback is useful, like when you want to experiment with Scala, or try out some methods that you implemented. You can do this using the Scala REPL (Read/Eval/Print Loop) or with a worksheet.
### The REPL
After having started sbt, you can start the REPL by typing `console`, you will see the following prompt:
```scala
scala>
```
At this point you can write any Scala expression you want, for example:
```scala
scala> val l = List(3,7,2)
val l: List[Int] = List(3, 7, 2)
```
(If you write an expression without wrapping it in a `val` or a `def`, the REPL will give it a name for you, starting with `res`)
```scala
scala> l.isEmpty
val res0: Boolean = false
scala> println(res0)
false
scala> l.tail.head
res1: Int = 7
scala> List().isEmpty
res2: Boolean = true
```
The classes of the lab are available inside the REPL, so you can for instance import all the methods from `object Lists` and start using `max`:
```scala
scala> import example.Lists._
import example.Lists._
scala> max(List(1,3,2))
res1: Int = 3
```
You can enter a multiline expression in the REPL by using `Alt+Enter` instead of `Enter`:
```scala
scala> if 1 == 1 then
| "a"
| else
| "b"
val res0: String = a
```
(on macOS, first go to `Terminal -> Preference -> Profiles -> Keyboard` and then
select `Use Option as Meta key`, then `Option+Enter` will work for multiline
expressions.)
In order to exit the Scala REPL and go back to sbt, type `Ctrl+D`.
### The worksheet mode
A *worksheet* is a file where every line of code written in the IDE is executed and its output displayed as a comment.
#### Creating a worksheet
Any file that ends in `.worksheet.sc` is considered to be a worksheet by the IDE. Once Code is launched in a project, all you have to do is create a new file and save it (`Ctrl+N, Ctrl+S`) using any name as long as it ends in `.worksheet.sc`.
#### Using the worksheet mode
Inside this file, you can type any line of code you would type in the REPL. The
worksheet will be automatically run when the code is saved and auto-save is
enabled by default. Each line of code will be executed one by one and its output
will be shown in green on the right.
## Part 5: Submitting your Solution
[Click here to learn how to submit your lab.](grading-and-submission.md)
# Troubleshooting
## sbt fails to start
If you see any kind of error when sbt starts that prevents you from using it,
try cleaning the project cache by running:
```shell
git clean -Xdf
```
Then restart `sbt`. If this still doesn't work, try deleting the global sbt cache:
```shell
rm -r ~/.sbt
```
## IDE features like type-on-hover or go-to-definition do not work
It's likely that the build wasn't imported; we can import it manually:
Click on the "m" logo in the left bar (where the red arrow is in the screenshot below):
![](images/menu-metals.png)
In the sidebar that appears, click on "Import build", then wait a bit:
![](images/import-build.png)
If things still don't work, try restarting Code (launch it in the same way you
started it before, using `code .` from the project directory). If you're still
having issues, try clicking on "Clean compile workspace" from the same
sidebar.
## Warning about Bloop installed version
If you get a warning like this:
![](images/bloop-update.png)
Please click the first button "Turn off old server".
# Grading and submission
## Local tests and grading
Each lab is shipped with two sets of tests:
* The tests defined under `src/test/scala`. These are usually incomplete: we strongly encourage you to write additional tests to exercise every aspect of your code. You can run these tests from `sbt` using the following command:
```shell
test
```
* The **grading tests** that we use to grade your labs. These are provided as a "black box": you can run them, but you are not allowed to see their content, to encourage you to write your own tests. You can run them from `sbt` using the following command:
```shell
grading:test
```
Your grade is determined by the number of these tests that pass; see the next section to learn how to actually get a grade.
(Remember that you can always put `~` before an sbt command to automatically re-run it when you save a file.)
## Committing and pushing your code
**You need to push your code to gitlab before the deadline for the lab to
receive a grade. If you forget, your grade will be zero.**
If you've read the first few chapters of [Git
book](https://git-scm.com/book/en/v2) as recommended, you must already be
familiar with committing and pushing, but let's go over the basics once again:
At any point while working on a lab, you can see which changes you haven't committed by writing:
```shell
git status
```
(If you get the error `Not a valid command: git`, it means you're inside sbt; you can't run git from sbt itself, you need to open another terminal in the same directory.)
This will display the list of files you have modified since the last commit. To see the exact changes you made, run:
```shell
git diff
```
To save these changes in a commit, run the following, replacing `"My message"` with a message describing the changes since the last commit (see [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/) for some tips on writing good commit messages):
```shell
git commit -am "My message"
```
(It's good practice to commit your code regularly to document your changes)
You can then synchronize your local git repository with the gitlab server by
first running:
```shell
git pull --no-rebase
```
This will merge into your local repository any change we made to the lab (we only change labs to fix critical bugs found after release, so most of the time this will not do anything).
Finally, you can push your changes to gitlab:
```shell
git push
```
Note that there are also graphical user interfaces to interact with git, for
example [VSCode has built-in git
support](https://code.visualstudio.com/docs/editor/versioncontrol#_git-support).
Once you've pushed your code, you can see the result online by going to
`gitlab.epfl.ch/lamp/student-repositories-s21/cs206-GASPAR/pipelines` where
`GASPAR` is your username (you can also access this page from the main page of
your repository by clicking on the rocket icon on the left side of the gitlab
interface, then clicking on "Pipelines").
![](images/pipeline-tab.png)
The grading pipeline contains two steps:
* *compile*: if this job fails, your code does not compile and cannot be graded.
* *grade*: this job results in a warning if you do not get the maximum grade (10.00/10.00).
If a job is marked "pending", it means there are other students running jobs at
the same time and you'll have to wait a bit for your job to run.
You can click on the `grade` job to see its output:
![](images/pipeline-details.png)
![](images/pipeline-logs.png)
Your grade is given on the line that starts with `Your overall score for this
lab is`
**If you push to gitlab multiple times, your final grade will be based on the
last commit you pushed before the deadline.**