I don't really blog anymore. Click here to go to my main website.

muhuk's blog

Nature, to Be Commanded, Must Be Obeyed

November 10, 2015

Infinite Sequences in Clojure

I consider infitine sequences to be one of the building blocks of functional programming. Though not a fundamental one like first class functions, immutability or laziness. Actually infinite sequences are derived from laziness. All infinite sequences are lazy, but not all lazy sequences are infinite. For example (map inc coll) returns a lazy sequence, regardless of coll being infinite or not.

This post is written with newbies in mind. If you are Clojure guru you will probably not learn anything new.

This post will explore different ways of creating infinite sequences in Clojure. I will not go too much into the implementation details but just mention a few things about Clojure sequences:

  • Sequences are conteptually similar to immutable linked lists. But different data structures have specialized, more efficient implementations. In other words, calling seq on a vector does not convert it to a linked list.
  • If you are handed a sequence, chances are it is a lazy one. It should not change the way you write your code, except that it should. Often times just assuming lazy sequences is the way to go. (You can call realized? to check.)

Streams of Values

Let us get started with the simplest infinite sequence:

(first (repeat 1))
;; => 1
(take 10 (repeat 1))
;; => '(1 1 1 1 1 1 1 1 1 1)

As you can see above repeat produces an infinite sequence that consist of a single element[1]. What is the point of doing this? Consider the following code:

(def index
  {:animal #{:cat :dog}
   :furniture #{:table :chair :couch}
   :food #{:pizza}})

(def reverse-index
  (->> index
       (mapcat (fn [[k vs]]
                 (zipmap vs (repeat k))))
       (into {})))
;; {:cat :animal,
;;  :dog :animal,
;;  :table :furniture,
;;  :chair :furniture,
;;  :couch :furniture,
;;  :pizza :food}

If it is not clear what is happening inside ->> here is the breakdown:

;; The anonymous function is called
;; three times with these parameters:
(zipmap #{:cat :dog} (repeat :animal))
;; => {:cat :animal, :dog :animal}
(zipmap #{:table :chair :couch} (repeat :furniture))
;; => {:table :furniture, :chair :furniture, :couch :furniture}
(zipmap #{:food} (repeat :pizza))
;; => {:food :pizza}

;; mapcat is like concat • map.
(concat {:cat :animal, :dog :animal}
        {:table :furniture, :chair :furniture, :couch :furniture}
        {:food :pizza})
;; => '([:cat :animal] [:dog :animal] [:table :furniture] [:chair :furniture] [:couch :furniture] [:food :pizza])

;; Notice concat returns a sequence, the
;; following call converts it back into a map:
(into {}
      '([:cat :animal]
        [:dog :animal]
        [:table :furniture]
        [:chair :furniture]
        [:couch :furniture]
        [:food :pizza]))
;; => {:cat :animal, :dog :animal, :table :furniture, :chair :furniture, :couch :furniture, :food :pizza}

Repeating a value can get dull quickly. cycle takes a collection[2] and returns an infinite sequence repeating its elements:

(take 5 (cycle [:winter :spring :summer :autumn]))
;; => '(:winter :spring :summer :autumn :winter)

Did you notice infinite sequences and functions, as abstractions, have some kind of resemblance:

;; Building a computation pipeline using streams:
(take 10 (map vector (range) (cycle [:even :odd])))
;; => '([0 :even] [1 :odd] [2 :even] [3 :odd] [4 :even] [5 :odd] [6 :even] [7 :odd] [8 :even] [9 :odd])

;; Building the same pipeline, replacing
;; one of the sequences with a function::
(take 10 (map (fn [i] [i (if (even? i) :even :odd)]) (range)))
;; => '([0 :even] [1 :odd] [2 :even] [3 :odd] [4 :even] [5 :odd] [6 :even] [7 :odd] [8 :even] [9 :odd])

This is not to say functions and infinite sequences are interchangeable. I think the sequence version is significantly better above. And there would be instances where using a function yields better code. It’s just that they both represent infinite results not yet realized.

Streams of Computation

So far infinite sequences we have seen were all deterministic. Below is an infinite sequence of dice rolls:

(->> (repeatedly #(rand-int 6))
     (map inc) ;; convert [0, 6) to [1, 6]
     (take 10))
;; => '(1 2 6 5 2 4 3 6 3 6)

repeatedly takes a function of zero arity and returns an infinite sequence[3] consisting of successive calls to this function. It only makes sense if our function has side effects, and specifically when it returns different values for each call. Mixing laziness and side effects is always tricky. So repeatedly should be used with care.

This was probably the first program I have written:

10 PRINT "HELLO!"
20 GOTO 10

This infinite loops is actually at the heart of many applications:

(loop [state initial-state]
  (when-not (quit-signaled?)
    (let [new-state (do-things state)]
      (sleep)
      (recur new-state))))

We have four things happening here:

  • State is kept in state local var[4].
  • do-things is called to cause side effects and to get the new state.
  • sleep is called to give other threads a chance to execute their code.
  • The loop - recur wraps the entire thing in an infinite loop, until quit-signaled? returns True.

Notice how the most significant call (do-things) is buried deep within control. Perhaps we can use iterate to tell the story in a better way:

(doseq [_ (iterate (do-things) initial-state)
        :while (not (quit-signaled?))]
  (sleep))

This version is shorter but more importantly the idea of state being an infinite series of changes is packaged into one form ((iterate ...)). If the last example was too abstract, here is a more concrete example:

(defn sum [[a b]] (+ a b))

(defn pascal [row]
  (-> row
      (conj 0)
      (->> (into [0])
           (partition 2 1)
           (map sum))
      vec))

(pascal [1])
;; => [1 1]
(pascal [1 1])
;; => [1 2 1]
(pascal [1 2 1])
;; => [1 3 3 1]

(def pascals-triangle (iterate pascal [1]))

(take 5 pascals-triangle)
;; => '([1] [1 1] [1 2 1] [1 3 3 1] [1 4 6 4 1])

This post only covers functions that generate infinite sequences from non-sequences. But we have also used functions that generate sequences from other sequences, like map, zipmap & mapcat. I hope you find it useful.


[1]There is also a two parameter arity of repeat that produces finite sequences.
[2]Collections are always finite.
[3]Like repeat, repeatedly also has a two parameter arity.
[4]A Clojure var, not short for variable.

If you have any questions, suggestions or corrections feel free to drop me a line.