Add labs/lab7-actors-binary-tree

1efabc09 · Olivier Blanvillain · 9fc4195d · 1efabc09
Commit 1efabc09 authored 4 years ago by Olivier Blanvillain
--- a/labs/lab7-actors-binary-tree/README.md
+++ b/labs/lab7-actors-binary-tree/README.md
+## Binary Trees
+
+Use the following commands to make a fresh clone of your repository:
+
+```
+git clone -b actorbintree git@gitlab.epfl.ch:lamp/student-repositories-s21/cs206-GASPAR.git cs206-actorbintree
+```
+
+## Useful links
+
+  * [The API documentation of Akka](https://doc.akka.io/api/akka/current/akka/index.html)
+  * [The API documentation of the Scala standard library](https://www.scala-lang.org/files/archive/api/2.13.4)
+  * [The API documentation of the Java standard library](https://docs.oracle.com/en/java/javase/15/docs/api/index.html)
+
+**If you have issues with the IDE, try [reimporting the
+build](https://gitlab.epfl.ch/lamp/cs206/-/blob/master/labs/example-lab.md#ide-features-like-type-on-hover-or-go-to-definition-do-not-work),
+if you still have problems, use `compile` in sbt instead.**
+
+## Introduction
+
+Binary trees are tree based data structures where every node has at most two children (left and right).
+In this exercise, every node stores an integer element.
+From this we can build a binary search tree by requiring for every node that
+
+ * values of elements in the left subtree are strictly smaller than the node's element
+ * values of elements in the right subtree are strictly bigger than the node's element
+
+In addition, there should be no duplicates, hence we obtain a binary tree set.
+
+Your task in this assignment is to implement an actor-based binary tree set where each node is
+represented by one actor. The advantage of such an actor-based solution is that it can execute
+fully asynchronously and in parallel.
+
+
+#### The API
+
+You can find the message-based API for the actor-based binary tree to be implemented in the supplied `BinaryTreeSet` object
+in the file `BinaryTreeSet.scala`.
+
+The operations, represented by actor messages, that the implementation should support are the following:
+
+ * `Insert`
+ * `Remove`
+ * `Contains`
+
+All three of the operations expect an `ActorRef` representing the requester of the operation, a numerical identifier of
+the operation and the element itself. `Insert` and `Remove` operations should result in an `OperationFinished` message sent
+to the provided requester `ActorRef` reference including the id of the operation. `Insert` and `Remove` should return an
+`OperationFinished` message even if the element was already present in the tree or was not found, respectively.
+`Contains` should result in a `ContainsResult` message containing the result of the lookup (a Boolean which is
+true if and only if the element is in the tree when the query arrives) and the identifier of the `Contains` query.
+
+#### Handling of Removal
+
+You should observe that both the `Insert` and `Contains` operations share an important property, namely, they only
+traverse a linear path from the root of the tree to the appropriate inner node or leaf. Since the tree nodes are actors
+which process messages one-by-one, no additional synchronization is needed between these operations. Removal in a
+binary tree unfortunately results in tree restructuring, which means that nodes would need to communicate and coordinate
+between each other (while additional operations arrive from the external world!).
+
+Therefore, instead of implementing the usual binary tree removal, in your solution you should use a flag that is stored
+in every tree node (`removed`) indicating whether the element in the node has been removed or not. This will result in a very
+simple implementation that is concurrent and correct with minimal effort. Unfortunately this decision results in the
+side effect that the tree set accumulates "garbage" (elements that have been removed) over time.
+
+#### Garbage Collection
+
+As we have seen, removal of entries can be implemented simply by using a removal flag with the
+added cost of growing garbage over time. To overcome this limitation you will need to implement a "garbage collection"
+feature. Whenever your binary tree set receives a `GC` message, it should clean up all the removed elements, while
+additional operations might arrive from the external world.
+
+The garbage collection task can be implemented in two steps. The first subtask is to implement an internal
+`CopyTo` operation on the binary tree that copies all its non-removed contents from the binary tree to a provided new one. This
+implementation can assume that no operations arrive while the copying happens (i.e. the tree is protected from modifications
+while copying takes places).
+
+The second part of the implementation is to implement garbage collection in the manager (`BinaryTreeSet`) by using the copy operation.
+The newly constructed tree should replace the old one and all actors from the old one should be stopped.
+Since copying assumes no other concurrent operations, the manager should handle the case when operations arrive while still
+performing the copy in the background. It is your responsibility to implement the manager in such a way that the fact
+that garbage collection happens is invisible from the outside (of course additional delay is allowed).
+For the sake of simplicity, your implementation should ignore GC requests that arrive while garbage collection is taking place.
+
+#### Ordering Guarantees
+
+Replies to operations may be sent in any order but the contents of `ContainsResult` replies must obey the order of the
+operations. To illustrate what this means observe the following example:
+
+Client sends:
+
+    Insert(testActor, id=100, elem=1)
+    Contains(testActor, id=50, elem=2)
+    Remove(testActor, id=10, elem=1)
+    Insert(testActor, id=20, elem=2)
+    Contains(testActor, id=80, elem=1)
+    Contains(testActor, id=70, elem=2)
+
+Client receives:
+
+    ContainsResult(id=70, true)
+    OperationFinished(id=20)
+    OperationFinished(id=100)
+    ContainsResult(id=80, false)
+    OperationFinished(id=10)
+    ContainsResult(id=50, false)
+
+While the results seem "garbled", they actually strictly correspond to the order of the original operations. On
+closer examination you can observe that the order of original operations was [100, 50, 10, 20, 80, 70]. Now if you
+order the responses according to this sequence the result would be:
+
+    Insert(testActor, id=100, elem=1) -> OperationFinished(id=100)
+    Contains(testActor, id=50, elem=2) -> ContainsResult(id=50, false)
+    Remove(testActor, id=10, elem=1) -> OperationFinished(id=10)
+    Insert(testActor, id=20, elem=2) -> OperationFinished(id=20)
+    Contains(testActor, id=80, elem=1) -> ContainsResult(id=80, false)
+    Contains(testActor, id=70, elem=2) -> ContainsResult(id=70, true)
+
+As you can see, the responses the client received are the same, hence they must have been executed sequentially,
+and only the responses have arrived out of order. Thus, the responses obey the semantics of sequential operations
+ -- it is simply their arrival order is not
+defined. You might find it easier for testing to use sequential identifiers for the operations, since that makes it
+easier to follow the sequence of responses.
+
+You might also note that out-of-order responses can only happen if the client does not wait for each individual answer
+before continuing with sending operations.
+
+While this loose ordering guarantee on responses might look strange at first, it will significantly simplify the
+implementation of the binary tree and you are encouraged to make full use of it.
+
+#### Your task
+
+You can find code stubs in the file `BinaryTreeSet.scala` which provides you with the API as described above,
+the `BinaryTreeSet` and `BinaryTreeNode` classes. The `BinaryTreeSet` represents the whole binary tree.
+This is also the only actor that is explicitly created by the user and the only actor the user sends messages to.
+
+You can implement as many or as few message handlers as you like and you can add additional variables or helper
+functions. We provide suggestions in your code stub, marked with the comment `optional`, but you are free to use
+it fully or partially; the optional elements are not part of the tested API.
+
+To see a binary tree in operation check our provided tests in `BinaryTreeSuite.scala`. Note in particular
+that it is the user who triggers garbage collection by sending a `GC` message (for the sake of simplicity of this exercise).
+
+Don't forget to make sure that no `Operation` messages interfere during garbage collection and that the user does
+not receive any messages that may result from the copying process. To achieve this, put any incoming operation message
+into a queue during the garbage collection process. When the copying process is finished, re-send the queued messages
+to the tree root and clear the queue.
+
+The following may be useful for your implementation:
+
+ * Another way to stop an actor, besides the `stop` method you have seen, is to send it a `PoisonPill` message.
+ * `context.parent` returns the ActorRef of the actor which created the current actor (i.e. its parent).
+ * If you see a log message like the following
+
+       [INFO] [11/21/2013 14:04:13.237] [PostponeSpec-akka.actor.default-dispatcher-2] [akka://PostponeSpec/deadLetters] Message [actorbintree.BinaryTreeSet$OperationFinished] from Actor[akka://PostponeSpec/user/$e/my-actor#-1012560631] to Actor[akka://PostponeSpec/deadLetters] was not delivered. [1] dead letters encountered.
+
+   it means that one of your messages (here the OperationFinished) message was not delivered from actor `my-actor` to actor `deadLetters`—the latter is where actors forward their messages after they terminate.
+   You should check that you do not stop actors prematurely.
+
+*A word on Actor counts:*
+
+  * The grader verifies that enough Actors are created when inserting elements.
+  * The grader also verifies that enough Actors are stopped in response to a `GC` command (i.e. for those elements that were previously marked removed from the set).