August 25, 2015

# The Tree of Life: Reading the Input

This is one of the more involved and more interesting HackerRank challenges I did. These posts will describe how I have arrived at my solution. I don’t claim to have found the perfect solution. In fact I have found some weak points in my code and will note them in these posts. Needless to say you should give it a try yourself before reading this.

## Problem Definition

Details can be found in the problem page but here is an executive summary:

• We are dealing with cellular automata defined on a binary tree instead of a grid.
• Evolutions are performed based on rules that are encoded as an integer. Each program execution deals with only one rule.
• A current state is encoded using the node’s, parent’s and chilren’s values.
• This state code and the rule is used to determine the node’s next state.
• Program needs to be able to rewind as well as advance evolutions. But never rewind before the initial state.
• After the setup, a number of queries are performed. Each query is a number of steps to advance or rewind and a path. Program must return the value of the node path is pointing to at the target generation.

I have split the problem into four parts:

1. Build trees from string inputs.
2. Evolve trees based on rules.
3. Find the value of a node.
4. Put it all together.

Let’s get started with the first part.

## Data Model

How do we represent a binary tree? One way to do it is the following:

```case class Node(value: Boolean, left: Option[Node], right: Option[Node])
```

This is actually a pretty bad design. Trees described in the challenge are either leaves or branches with two children[1]. So let’s encode exactly that in a data structure:

```object Solution {
sealed trait TreeNode
case class BranchNode(value: Boolean, left: TreeNode, right: TreeNode) extends TreeNode
case class LeafNode(value: Boolean) extends TreeNode
}
```

Now we can’t create a node with one branch. This is good, we prevented creation of invalid trees. It’s even better having this safety compile time.[2]

Here sealing TreeNode doesn’t do much because the entire program will be inside Solution. It’s there to communicate humans reading the code that TreeNode implementations are a closed set.

TreeNode is currently a marker trait. Let’s encode the information that all nodes have a value:

```sealed trait TreeNode {
val value: Boolean
}
```

Challenge represents on values with "X" and off values with ".". Let’s override toString methods of node classes to match this:

```sealed trait TreeNode {
implicit def bool2String(v: Boolean): String = if (v) "X" else "."

val value: Boolean
}
case class BranchNode(value: Boolean, left: TreeNode, right: TreeNode) extends TreeNode {
override def toString(): String = "(%s %s %s)".format(left, value: String, right)
}
case class LeafNode(value: Boolean) extends TreeNode {
override def toString(): String = value
}
```

First take a look at LeafNode.toString. It just returns value as a String. Normally a value of true would result in "true" and false as "false". But because the implicit function bool2String in TreeNode takes precedence and converts the Boolean value into either "X" or "."

At this point we can create a tree and check how it is printed:

```scala> BranchNode(true, LeafNode(false), BranchNode(false, LeafNode(true), LeafNode(false)))
res0: Solution.BranchNode = (. X (X . .))
```

Of course what we are trying to active is the opposite; to turn "(. X (X . .))" into a TreeNode. We should have a function like this:

```def readTree(input: List[Char]): TreeNode = ???
```

readTree seems like a fold but there are two differences. Firstly, we can’t represent an empty tree with TreeNode. Secondly, we can’t represent a BranchNode with a single branch. We need to read the node’s value and its children if any before we can generate that node.

I wasn’t sure if HackerRank’s servers had scala-parser-combinators installed. So I thought long and hard about the inputs. Assuming valid inputs, a closing paren either followed a closing paren or a right value. In other words if I followed closing parens of the lowest levels I was guaranteed to find a form like "(? ? ?)". I came up with the following code:

```def readTree(input: List[Char], stack: List[TreeNode] = List()): TreeNode = input match {
case 'X' :: rest => readTree(rest, LeafNode(true) :: stack)
case '.' :: rest => readTree(rest, LeafNode(false) :: stack)
case ')' :: rest => stack match {
case r :: LeafNode(v) :: l :: restOfStack =>
readTree(rest, BranchNode(v, l, r) :: restOfStack)
case _ => throw new RuntimeException()
}
case _ :: rest => readTree(rest, stack)
}
```

First two cases push on and off leaves onto the stack. Actually we don’t know if those are leaf nodes yet. I could have made stack a List[Boolean] but then I would lose the ability to push BranchNode’s. It is a hack, but not too dirty. I can live with it. We take advantage of this in the third case:

```case ')' :: rest => stack match {
case r :: LeafNode(v) :: l :: restOfStack =>
readTree(rest, BranchNode(v, l, r) :: restOfStack)
case _ => throw new RuntimeException()
}
```

When we encounter a ')', as I said above, it is always after a right branch and lowest levels are processed first; so it’s a "(? ? ?)" form. If we have just processed a left value, node value and right value of a lowest level (innermost parens) we should have three LeafNode’s on the stack, in reverse order. So we pop them out and push a BranchNode we have created from them. This is the lowest levels. In levels above we may have LeafNode’s or BranchNode’s for the branches and (always) LeafNode’s for node values. So when we have a ')' we always combine the top three items in the stack and push the combination back onto it.

Fourth is the termination case. For valid inputs we should have a single node on the stack, we return it. With the last case we ignore spaces and opening parens.

The case of _ is to take care of the non-exhaustive match warning, it should never be matched.

This is not a particularly elegant solution but in the absence of a parser generator this is as much as I would invest during a programming challenge.

Next post will be about evolving trees.

## Added on September 2, 2015

Implicit conversions are powerful but very dangerous, especially when using between common types.

• Someone might use a Boolean in place of an Int (or String) by mistake (compiler will not indicate an error).
• Easy to write but hard to read

I agree with this. In hindsight an implicit bool2String was not necessary, TreeNode’s can be implemented more cleanly this way:

```sealed trait TreeNode {
val value: Boolean

protected def bool2String(v: Boolean): String = if (v) "X" else "."
}
case class BranchNode(value: Boolean, left: TreeNode, right: TreeNode) extends TreeNode {
override def toString(): String = "(%s %s %s)".format(left, bool2String(value), right)
}
case class LeafNode(value: Boolean) extends TreeNode {
override def toString(): String = bool2String(value)
}
```

He also said:

Throwing exceptions is not functional solution.

This is related to case _ => throw new RuntimeException() snippet I’ve used in a few places. I agree that throwing exceptions to do control flow is not functional. But I disagree that this is such a case. For valid inputs, these cases will never be triggered and this code assumes valid inputs. So getting an exception here means there is a bug in our code, or that we got an invalid input. In my opinion the exception doesn’t violate the function’s contract. Using Either or Try or Option would unnecessarily complicate the code.

 [1] (X X) would be ambiguous, is the left branch missing or is it the right branch.
 [2] If single child branches were allowed, I argue that having a LeftBranch and RightBranch is still better than the one class with Option’s.

If you have any questions, suggestions or corrections feel free to drop me a line.