From fc9a90a29d7f71228b6ef8b4470932c98847b77f Mon Sep 17 00:00:00 2001
From: Olivier Blanvillain <olivier.blanvillain@epfl.ch>
Date: Mon, 17 May 2021 16:00:02 +0200
Subject: [PATCH] Add exercises/solutions-6.md

---
 exercises/solutions-6.md | 54 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
 create mode 100644 exercises/solutions-6.md

diff --git a/exercises/solutions-6.md b/exercises/solutions-6.md
new file mode 100644
index 0000000..157961f
--- /dev/null
+++ b/exercises/solutions-6.md
@@ -0,0 +1,54 @@
+# Exercise 1: Spark Fundamentals
+
+## Question 1
+
+```scala
+rdd.map(text => text.split(" ").length).reduce(_ + _)
+```
+
+## Question 2
+
+```scala
+// Cache the error logs: they are reused by both actions below.
+val errorLogs = rawLogs.map(toLog(_)).filter(isError(_)).cache()
+val numberErrors = errorLogs.count()
+val messages = errorLogs.filter(isRecent).map(message(_)).collect()
+```
+
+## Question 3
+
+In the first case, all the strings are collected and printed on the driver (master). In the second case, records are printed to the standard output of the worker nodes, where they are typically only visible in the executors' logs.
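+
+Assuming the two cases contrast printing after a `collect()` with a distributed `foreach` (the question text is not reproduced in this file), a minimal sketch of the difference:
+
+```scala
+// Case 1 (assumed): collect() gathers the RDD on the driver,
+// so println runs locally and the output appears on the master.
+rdd.collect().foreach(println)
+
+// Case 2 (assumed): println runs inside the executors,
+// so the output goes to each worker's stdout.
+rdd.foreach(println)
+```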
+
+## Question 4
+
+In case 2, the map operations are pipelined and applied to only 10 elements.
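+
+Assuming case 2 is a transformation followed by `take(10)` (the numbered cases are not reproduced here), a sketch:
+
+```scala
+// Case 2 (assumed): take(10) only needs 10 results, so the
+// pipelined map function is evaluated for just those elements
+// rather than for the whole RDD.
+rdd.map(expensiveFunction).take(10)
+```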
+
+# Exercise 2: Demographics
+
+## Question 1
+
+```scala
+// Cache the adult ages: the RDD is filtered and counted once per group.
+val adultAges = people.map(_.age).filter(_ >= 18).cache()
+groups.map {
+  case (lower, upper) => adultAges.filter { (age: Int) =>
+    lower <= age && age <= upper
+  }.count()
+}
+```
+
+## Question 2
+
+```scala
+val groups: Array[(Int, Int)] =
+  people.map(_.age)
+        .filter(_ >= 18)
+        .map((age: Int) => (groupOf(age), 1))
+        .reduceByKey(_ + _)
+        .collect()
+
+// Now, we have a small collection
+// and can work on a single machine.
+groups.sortBy(_._1)
+      .map(_._2)
+      .toList
+```
-- 
GitLab