muhuk's blog

Nature, to Be Commanded, Must Be Obeyed

August 27, 2015

The Tree of Life: Rules & Evolution

This is the second part of The Tree of Life series. You can read the first post here.

Rules

A rule codifies next states of nodes for all combinations of states. Note that the topology of trees don’t change, only the values of nodes. Let’s start by defining a data type for rules:

case class Rule(ruleNumber: Int)

A state, here, means a combination of a node’s value, its parent’s value and its children’s values:

case class Neighbourhood(parentValue: Boolean, leftValue: Boolean, value: Boolean, rightValue: Boolean)

We need to encode these booleans into an integer:

implicit def bool2int(v: Boolean): Int = if (v) 1 else 0

case class Neighbourhood(parentValue: Boolean, leftValue: Boolean, value: Boolean, rightValue: Boolean) {
    val mask: Int = 1 << (parentValue << 3 | leftValue << 2 | value << 1 | rightValue)
}

Basically we are encoding a 4-bit value and using it as the exponent to calculate a specific power of two. Shift operation << is essentially:

require(scala.math.pow(2, 7) == 1 << 7)

So mask is a number, with a single bit on in its binary representation. We will use it to query a single bit in our rule number:

case class Rule(ruleNumber: Int) {
    def evolve(neighbourHood: Neighbourhood): Boolean = (ruleNumber & neighbourHood.mask) != 0
}

In other words, each bit in a Rule.ruleNumber corresponds to a specific state, each bit in a Neighbourhood.mask corresponds to the current value of a node.

We will pass the state (a Neighbourhood) of a node to Rule.evolve and get back its next state. Note how we don’t expose any binary arithmetic. Only Rule and Neighbourhood knows about the dark secrets involved.

Node State

Here is one way to calculate Neighbourhood‘s:

sealed trait TreeNode {
    implicit def bool2String(v: Boolean): String = if (v) "X" else "."

    val value: Boolean

    def neighbourhood(parentValue: Boolean): Neighbourhood
}

case class BranchNode(value: Boolean, left: TreeNode, right: TreeNode) extends TreeNode {
    def neighbourhood(parentValue: Boolean): Neighbourhood = Neighbourhood(parentValue, left.value, value, right.value)

    override def toString(): String = "(%s %s %s)".format(left, value: String, right)
}

case class LeafNode(value: Boolean) extends TreeNode {
    def neighbourhood(parentValue: Boolean): Neighbourhood = Neighbourhood(parentValue, false, value, false)

    override def toString(): String = value
}

For a simple application like this it doesn’t make a huge difference how we do this. But I would still like to mention there are other ways:

  • We could have a neighbourhood(node: TreeNode): Neighbourhood function that matches on TreeNode instances. Since TreeNode is sealed compiler could even warn us if the match is not exhaustive. This is a bad choice when dealing with open sets. Adding a new case would break the function.
  • We could implement a function as an alternate constructor or as apply on TreeNode‘s companion object. This is interesting because it’s kind of like saying TreeNode implementations doesn’t know anything about Neighbourhood‘s but TreeNode itself does. Whereas the first alternative doesn’t imply any knowledge of Neighbourhood‘s on TreeNode and its implementations.
  • We could create an implicit method, but I think this is not a good alternative at all. Because we are not really doing a conversion (like we did with bool2int), we are deriving data.
  • TreeNode implementations could be extended with a type class to provide a neighbourhood method for each implementation from outside their body.

I should also mention two special cases. The root node’s parent is assumed to be always false and leaf nodes’ branches are also assumed to be false. This is unfortunately not mentioned in the problem definition.

Evolution

Evolution is implemented with quite a straightforward recursive function:

def evolve(rule: Rule, tree: TreeNode): TreeNode = {
    def evolveNode(parentValue: Boolean, node: TreeNode): TreeNode = {
        val newValue = rule.evolve(node.neighbourhood(parentValue))
        node match {
              case BranchNode(v, l, r) => BranchNode(newValue, evolveNode(v, l), evolveNode(v, r))
              case LeafNode(v) => LeafNode(newValue)
        }
    }
    evolveNode(false, tree)
}

Perhaps you have seen this pattern a few times but didn’t pay too much attention:

def outerFunc(...): ? = {
    def innerFunc(input: ?) = input match {
        case ... => ???  // Recursively call innerFunc.             (2)
        case ... => ???  // Termination clause.                     (3)
    }
    innerFunc(...)       // Do one time setup and call innerFunc.   (1)
}

The recursion can fan out (like evolve above) or it can work like a fold. In the latter case it might even read better if it is refactored into a fold.

In the next post we will implement querying trees to find values of specific nodes.


Added on September 2, 2015

Enpassant had some comments on using implicits:

Implicit conversions are powerful but very dangerous, especially when using between common types.

  • Someone might use a Boolean in place of an Int (or String) by mistake (compiler will not indicate an error).
  • Easy to write but hard to read

I agree with this. I have updated first part regarding bool2String. Here is Enpassant’s version that replaces bool2int with an implicit class:

implicit class Bool2Int(v: Boolean) {
    def asInt = if (v) 1 else 0
}

case class Neighbourhood(parentValue: Boolean, leftValue: Boolean, value: Boolean, rightValue: Boolean) {
    val mask: Int = 1 << (parentValue.asInt << 3 | leftValue.asInt << 2 | value.asInt << 1 | rightValue.asInt)
}

This is better because the conversion is explicit. As far as I know if Bool2Int extends AnyVal object allocation can be optimized away.

If you have any questions, suggestions or corrections feel free to drop me a line.