muhuk's blog

Nature, to Be Commanded, Must Be Obeyed

December 22, 2015

Laziness in Clojure

Lazy evaluation means delaying known operations until they are forced. What I mean by laziness in this post is a bit more general than that: for some constructs the delayed operations are not known in advance, and some are evaluated before a value is forced. When used correctly these differences should not matter. A significant detail, however, is that they all evaluate at most once.

Delay

The strict definition of lazy evaluation above fits delay:

(let [delayed (delay (println "...nooot!"))]
  (println "This is the last line.")
  @delayed)

When you run this code it will print the following:

This is the last line.
...nooot!

Laziness is accomplished by storing the expression in a function of zero parameters and wrapping this function in a class that implements clojure.lang.IDeref. If we did not care about at-most-once evaluation and thread safety, we could have accomplished the same thing with:

(let [delayed (fn [] (println "...nooot!"))]
  (println "This is the last line.")
  (delayed))

The expression we used above is just a side effect, println returns nil. delay is more meaningful when we actually care about the result of the expression contained. That is, instead of using deref[1] just to force the side effects, doing something useful with the value returned.
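To illustrate, here is a small sketch where the delayed value is actually used; the println inside the body is only there to show when evaluation happens:

```clojure
;; The body runs at most once, on the first deref; the result is cached.
(def answer
  (delay
    (println "computing...")
    (* 6 7)))

@answer ;; prints "computing..." and returns 42
@answer ;; returns the cached 42; nothing is printed this time
```

The second deref returns immediately without re-running the body, which is exactly the at-most-once guarantee mentioned above.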

I tried to find a real world usage of delay but could not find anything except for some throwaway code and some usages that could have been expressed better differently. In my own code I found that delay, because it supports IDeref, makes a good atom substitute in tests. Plus, once evaluated its value cannot change, and calling swap! and friends on it would fail.
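A quick sketch of that substitution, with a made-up greeting function standing in for code under test that only derefs its dependency:

```clojure
;; Both atom and delay implement clojure.lang.IDeref, so a function
;; that only derefs cannot tell them apart.
(defn greeting [state]
  (str "Hello, " (:name @state)))

(greeting (atom {:name "Rich"}))   ;; => "Hello, Rich"
(greeting (delay {:name "Rich"})) ;; => "Hello, Rich"

;; But the delay is immutable once realized; mutation attempts throw:
;; (swap! (delay {}) assoc :x 1) ;; => ClassCastException
```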

Promise

It might not look like much, but promise is a powerful control tool. Instead of delaying the result of a set of expressions, promise delays a value. The value can be realized in one place or another depending on the application state. An example to help make this distinction clear:

(declare intense-add)


(def p (promise))


(defn computational-add [a b result]
  {:pre [(integer? a)
         (integer? b)
         (not (realized? result))]
   :post [(realized? result)]}
  (if (or (= a 1) (= b 1))
    (deliver result (+ a b))
    (intense-add a b result)))


(defn intense-add [a b result]
  (deliver result (+ a b)))


(computational-add 5 7 p)
@p ;; --> 12

This is a contrived example and it would be better written without any promises, just returning values. But if intense-add were a complex computation running on multiple threads and/or doing a significant amount of I/O (perhaps calling REST APIs), using plain values would block. Using promise we decouple how the value is computed from how it is used.

Attentive readers will notice the post-condition (realized? result) would fail if intense-add delivered after computational-add returned, using a future for instance. Also care must be taken to ensure promises are delivered, or dereferencing would block forever[2]. These are consequences of the freedom of control promise provides.
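The three-argument arity of deref is the standard escape hatch for a promise that might never be delivered; here is a minimal sketch:

```clojure
(def never-delivered (promise))

;; The extra arguments are a timeout in milliseconds and a fallback
;; value to return if the promise is not delivered in time.
(deref never-delivered 100 :gave-up) ;; => :gave-up
```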

Future

future is similar to delay with two differences: it runs the expression contained in a different thread, and it may start execution before the result is dereferenced. It is more of a concurrency primitive than an abstraction for laziness. The reason I mention future is that it offers an alternative way to lazily compute things and avoid the nobody-called-deliver problem of promises.
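The eager-start difference is easy to demonstrate with a pair of printlns; this is just a sketch to show the timing:

```clojure
;; Both cache their result, but the future's body starts running on a
;; pooled thread immediately, before anyone derefs it.
(def d (delay  (println "delay runs")))
(def f (future (println "future runs")))

(Thread/sleep 100)
;; By now "future runs" has printed; "delay runs" has not.
@d ;; forces the delay, printing "delay runs"
@f ;; already realized; returns nil (println's result) immediately
```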

Usually futures are used to distribute and collect computations on multiple threads. Laziness is more significant when futures dereference other futures. Consider the following code:

(defn cpu-intensive-proc [input other]
  ...
  output)


(defn io-proc-1 []
  ...
  output)


(defn io-proc-2 []
  ...
  output)


(let [io-1 (future (io-proc-1))
      io-2 (future (io-proc-2))]
  (future (cpu-intensive-proc @io-1 @io-2)))

It is not valid code, but assume the outputs are just values and the only future calls are the ones inside the let block. All future calls will return immediately with an immutable object that can be derefed to get the value. The third call in the last line will block its thread (not the main thread, where let runs) until io-1 and io-2 are realized. So futures are not completely safe from blocking forever[3].

These characteristics provide an alternative (more restricted, but somewhat easier to reason about) way to model lazy computations than promises. I think it is worth mentioning a second time that side effects are non-deterministic with futures because of their concurrent behavior.
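For the curious, here is a runnable version of the sketch above, with placeholder bodies (sleeps and arithmetic of my own invention) standing in for the real I/O and CPU work:

```clojure
;; Placeholder implementations; the sleeps simulate I/O latency.
(defn io-proc-1 [] (Thread/sleep 50) 3)
(defn io-proc-2 [] (Thread/sleep 80) 4)
(defn cpu-intensive-proc [a b] (* a b))

(let [io-1 (future (io-proc-1))
      io-2 (future (io-proc-2))]
  ;; The two I/O futures run concurrently; the third future blocks
  ;; (on its own pool thread) until both are realized.
  @(future (cpu-intensive-proc @io-1 @io-2))) ;; => 12
```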

Lazy Sequences

Lazy sequences are also, well, lazy. But they are the least malleable for flow control. Some lazy sequences are chunked and some are not. They are also quite dependent on who holds their head (first, in Clojure parlance). And that who is a plural who. It is best to treat lazy sequences as immutable and lazy values. That means their evaluation should involve no side effects other than (preferably local) state changes[4]. Having said that, I am not sure whether the Clojure best practices police will arrest me for sanctioning local state.

Laziness in sequences is more often a source of pain and suffering, and of StackOverflow entries that get resolved with “a-ha!”s after a couple of comments and no answers. In other words, laziness is rarely intended when working with sequences. But consider the following example:
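The classic gotcha, in two lines, is side effects that never happen because nobody forces the sequence:

```clojure
;; Nothing is printed here: map only builds a lazy seq.
(def s (map println [1 2 3]))

;; The side effects happen only when the seq is realized:
(dorun s) ;; prints 1, 2 and 3
```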

(->> (range 10)
     (filter odd?)
     (map (partial * 10))
     (map inc))
;; '(11 31 51 71 91)

Because both filter and map return lazy sequences, the intermediary lists do not get fully realized[5]. If you find this property fascinating, wait till you get to work with transducers.
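As a teaser, the same pipeline can be written as a transducer, which makes a single pass over the input with no intermediate sequences at all:

```clojure
;; comp composes the transducer stages; into drives a single reduction.
(into []
      (comp (filter odd?)
            (map (partial * 10))
            (map inc))
      (range 10))
;; => [11 31 51 71 91]
```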

Sequences that are not lazy can be used anywhere lazy sequences are used. But the opposite is not true, and this is the source of the trouble I mentioned above. Therefore, making a habit of using doall where a non-lazy sequence is expected is a good idea.
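The textbook case for doall is reading from a resource that gets closed before the lazy seq escaping it is realized; "data.txt" below is a hypothetical file of my own invention:

```clojure
;; Broken: with-open closes the reader on exit, so consuming the
;; returned lazy seq later throws an IOException.
;; (with-open [r (clojure.java.io/reader "data.txt")]
;;   (line-seq r))

;; Fixed: doall realizes the whole seq while the reader is still open.
(with-open [r (clojure.java.io/reader "data.txt")]
  (doall (line-seq r)))
```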

If you liked this post, I have also written about infinite sequences, monads, macros and how not to reinvent protocols.


[1] @ is a reader macro that expands to a deref call.
[2] There is a three-parameter arity of deref for when something can possibly never be realized. (see deref)
[3] I have just checked and Clojure seems to use an unbounded thread pool, so it seems blocking in futures does not cause resource starvation. But this is an implementation detail and can change in the future.
[4] Think of a lazy version of The Sieve of Eratosthenes, hiding its list of candidates in an atom. Here is a purely functional version of the algorithm though (see sieveprimes).
[5] Even the first sequence is lazy, but I think (range 10) would still be fully realized due to chunking.

If you have any questions, suggestions or corrections feel free to drop me a line.