
muhuk's blog

Nature, to Be Commanded, Must Be Obeyed

September 28, 2014

Is Clojure Homoiconic?

Homoiconicity means you can express programs in a programming language’s primitive data types. In other words, your code is data and data is code.

Clojure is, like other dialects of Lisp, a homoiconic programming language. Code is represented as lists, symbols, vectors and other Clojure data types. When the reader (parser) reads a source file, the abstract syntax tree it builds looks exactly like the source[1].
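One way to see this for yourself is to hand the reader a string and inspect what comes back. The sketch below assumes the standard read-string function from clojure.core; the names source-text and form are only there for illustration.

```clojure
;; Source text as it might appear in a file (illustrative name).
(def source-text "(defn add-one [x] (+ x 1))")

;; The reader turns the text into ordinary Clojure data; nothing is evaluated.
(def form (read-string source-text))

(list? form)            ;=> true, the whole form is a list
(first form)            ;=> defn, a symbol
(symbol? (second form)) ;=> true, add-one is a symbol too
(vector? (nth form 2))  ;=> true, the parameter list [x] is a vector
(count form)            ;=> 4
```

Note that add-one is never defined here. The reader only produced the data structure that describes the definition.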

What Does Code Is Data Mean?

Data is useful as long as you can do interesting things with it. When you can access a piece of code as data, you can analyze and modify it just like any other data. Your programming language doesn’t need to know this data represents code.

In Clojure you use quoting to turn code into data. Strictly speaking it is not a transformation, but rather a way of delaying evaluation (execution). Let me illustrate this with an example:

user=> (quote (quote (quote (+ 1 2))))
(quote (quote (+ 1 2)))

user=> (quote (quote (+ 1 2)))
(quote (+ 1 2))

user=> (quote (+ 1 2))
(+ 1 2)

user=> (+ 1 2)
3

Each time we press enter in the REPL, we evaluate an expression. That is why feeding the result of each expression back to the REPL eventually gives us the expression we were quoting. We start with the expression (+ 1 2) quoted three times, and we get (+ 1 2) itself back after evaluating three times. Up to this point (+ 1 2) is data, because it is quoted. The fourth evaluation runs (+ 1 2), which produces the result 3.
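Because a quoted expression is just a list, the usual sequence functions work on it. Here is a small sketch of analyzing code as data; the names expr and symbols-used are made up for this illustration.

```clojure
;; A quoted, nested expression: a list containing a symbol, a number and another list.
(def expr (quote (+ 1 (* 2 3))))

(first expr)  ;=> +
(count expr)  ;=> 3
(last expr)   ;=> (* 2 3)

;; Walk the nested lists and collect every symbol that appears in the code.
(def symbols-used
  (filter symbol? (tree-seq seq? seq expr)))
;; symbols-used => (+ *)
```

Nothing was added or multiplied along the way. The expression was only inspected.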

Here is an example of producing code:

user=> (range 10)
(0 1 2 3 4 5 6 7 8 9)

user=> (vector 0 1 2 3 4 5 6 7 8 9)
[0 1 2 3 4 5 6 7 8 9]

user=> (cons (quote vector) (range 10))
(vector 0 1 2 3 4 5 6 7 8 9)

Nothing fancy. But note that the last result is valid Clojure code. If it’s still not clear what quote does, perhaps this last example will shed some light:

user=> (def foo (vector 0 1 2 3 4 5 6 7 8 9))
#'user/foo

user=> (def bar (cons (quote vector) (range 10)))
#'user/bar

user=> foo
[0 1 2 3 4 5 6 7 8 9]

user=> bar
(vector 0 1 2 3 4 5 6 7 8 9)

What Does Data Is Code Mean?

What happens when we execute the (def foo ...) expression above? First, the function vector is called with those 10 arguments. Then the resulting vector is bound to the var foo. This is the same as (def foo [0 1 2 3 4 5 6 7 8 9]). What happens with the other expression, (def bar ...), is different. bar is bound to a list whose first element is the symbol vector and whose remaining elements are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 in order. When we define bar, the function vector is not called. Its value is merely a list that happens to be valid Clojure code:

user=> (first foo)
0

user=> (rest foo)
(1 2 3 4 5 6 7 8 9)

user=> (first bar)
vector

user=> (rest bar)
(0 1 2 3 4 5 6 7 8 9)

Data is code when it can be executed:

user=> (eval foo)
[0 1 2 3 4 5 6 7 8 9]

user=> (eval bar)
[0 1 2 3 4 5 6 7 8 9]

(eval foo) is not really interesting, since a vector of numbers evaluates to itself. When we evaluate bar though, we run the code it contains and get a vector as the result.

So, quote and eval complement each other (quote is a special form, eval is a function). quote prevents an expression from being evaluated, so that we can manipulate it like any other data. Then we call eval on the new data to have it evaluated. This brings me to the main point of this post.
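The full round trip of quote, manipulate, eval can be sketched in a few lines. The names sum-expr and product-expr are illustrative only.

```clojure
;; Quoted, so this is a list of one symbol and two numbers.
(def sum-expr (quote (+ 2 3)))

;; Build a new expression by swapping the operator; this is still just data.
(def product-expr (cons (quote *) (rest sum-expr)))

product-expr         ;=> (* 2 3)
(eval sum-expr)      ;=> 5
(eval product-expr)  ;=> 6
```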

What Is Metaprogramming?

Metaprogramming is manipulating code in code. Let me try to make it clear by giving two examples of what it is not:

  1. Control is not metaprogramming.

    By using control structures, you can change the runtime behaviour of your code. This is not metaprogramming, since the code that runs is still exactly the code you have written:

    (defn foo [x]
      (if (bar? x)
        (baz x)
        (foobar x)))
    
  2. Preprocessing is not metaprogramming.

    Some people accept source code preprocessors (like C’s preprocessor) as metaprogramming tools. I disagree with this view. Preprocessors work with raw source files as text. Since they don’t parse your code and construct an abstract syntax tree, we can say they don’t see your code at all. They only see a soup of text with preprocessor directives scattered around. If a preprocessor parsed the source and worked on the abstract syntax tree, we would call it a macro facility.
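The difference between working on text and working on structure can be sketched in Clojure itself. The first call below is deliberately naive character-level substitution, which is cruder than what a real preprocessor does (C’s operates on tokens), so treat it as an illustration of the principle rather than a model of any particular tool. The names text-result and form-result are made up.

```clojure
(require '[clojure.string :as string]
         '[clojure.walk :as walk])

;; Text-level substitution has no idea where one symbol ends and another begins.
(def text-result
  (string/replace "(max x 1)" "x" "10"))
;; text-result => "(ma10 10 1)"

;; Structure-level substitution replaces the symbol x and nothing else.
(def form-result
  (walk/postwalk-replace {(quote x) 10} (quote (max x 1))))
;; form-result => (max 10 1)
```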

The ability to evaluate arbitrary code enables metaprogramming. You can, for instance, evaluate an fn form and then assign the resulting function to a var. This way you can change software while it’s running. This is also called code hot swapping.

Another, more common way to do metaprogramming is macros. Macros run during compilation. Since Clojure compiles directly to JVM bytecode, you can write succinct high-level code and still take advantage of compiled low-level code.

An example of such a high-level abstraction is clojure.core/cond:
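A minimal sketch of that idea, assuming a function greet that is already in use and a replacement that arrives as a quoted fn form (both names are hypothetical):

```clojure
;; A function already in use by the running program.
(defn greet [who] (str "Hello, " who))

(greet "Ada") ;=> "Hello, Ada"

;; New behaviour arrives as data: a quoted fn form.
(def new-greet-form
  (quote (fn [who] (str "Hi, " who "!"))))

;; Evaluate the form to get a function, then make it the var's new root value.
(alter-var-root (var greet) (constantly (eval new-greet-form)))

(greet "Ada") ;=> "Hi, Ada!"
```

Callers that look greet up through its var see the new behaviour immediately. Code that captured the old function value earlier, or that was compiled with direct linking enabled, would keep calling the old one.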

(defmacro cond
  "Takes a set of test/expr pairs. It evaluates each test one at a
  time.  If a test returns logical true, cond evaluates and returns
  the value of the corresponding expr and doesn't evaluate any of the
  other tests or exprs. (cond) returns nil."
  {:added "1.0"}
  [& clauses]
    (when clauses
      (list 'if (first clauses)
            (if (next clauses)
                (second clauses)
                (throw (IllegalArgumentException.
                        "cond requires an even number of forms")))
            (cons 'clojure.core/cond (next (next clauses))))))

Here, the quoted if, (list 'if ...), ends up in the resulting code. The unquoted if, (if (next clauses) ...), is evaluated at macro-expansion time, during compilation. If its test, (next clauses), is false, an IllegalArgumentException is thrown, halting compilation. A well-formed cond expands without trouble:

user=> (cond
  #_=>   false :false
  #_=>   nil :nil
  #_=>   true :true)
:true

user=> (require 'clojure.walk)
nil

user=> (clojure.walk/macroexpand-all
  #_=>   '(cond
  #_=>     false :false
  #_=>     nil :nil
  #_=>     true :true))
(if false :false (if nil :nil (if true :true nil)))

No throws in the resulting code, as you can see.

We use cond because we don’t want to write those nested ifs by hand. If we needed to write a lot of conds and none of the existing variants did the job, we would write a macro that generates those conds for us.
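The failing case can be observed without running any generated code, by asking for the expansion of a cond with an odd number of forms. The helper expansion-error below is hypothetical. It looks at the exception’s cause first, on the assumption that newer Clojure versions wrap errors raised during macro expansion in another exception.

```clojure
(defn expansion-error
  "Hypothetical helper: returns the message of the error raised while
  expanding form, or nil if expansion succeeds."
  [form]
  (try
    (macroexpand form)
    nil
    (catch Throwable t
      ;; The original exception may be wrapped, so prefer the cause if there is one.
      (.getMessage (or (.getCause t) t)))))

(expansion-error (quote (cond true :ok)))
;=> nil

(expansion-error (quote (cond :lonely-test)))
;=> "cond requires an even number of forms"
```

The error comes from the macro body itself, not from any if in the expansion, because no expansion was ever produced.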
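As a sketch of what such a cond-generating macro might look like, here is a hypothetical classify macro, written in the same list and cons style as cond above. It takes a value followed by predicate/result pairs and expands into a cond that applies each predicate to the value. The macro and its argument names are inventions for this example.

```clojure
(defmacro classify
  "Hypothetical macro: expands into a cond that applies each predicate to x
  and returns the result paired with the first predicate that matches."
  [x & pred-result-pairs]
  (let [v       (gensym "v")  ; fresh symbol, so x is evaluated only once
        clauses (mapcat (fn [[pred result]] [(list pred v) result])
                        (partition 2 pred-result-pairs))]
    (list 'clojure.core/let [v x]
          (cons 'clojure.core/cond clauses))))

(classify -7 neg? :negative zero? :zero pos? :positive) ;=> :negative
(classify 0 neg? :negative zero? :zero pos? :positive)  ;=> :zero
```

Calling macroexpand-1 on a classify form shows a let wrapping a plain cond, with each predicate applied to the generated symbol. One design caveat: partition silently drops a trailing predicate that has no result, so a stricter version would check the argument count the way cond does.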

Conclusion

Clojure is a homoiconic language. Homoiconicity, runtime evaluation and macros are not the only means to achieve metaprogramming. But these three features provide the most powerful kind of metaprogramming. You don’t just get to change part of the language. You can change the entire language to your liking.

“The ability to represent procedures as data also makes Lisp an excellent language for writing programs that must manipulate other programs as data, such as the interpreters and compilers that support computer languages.”

Building Abstractions with Procedures, SICP

Thanks for reading. Feel free to drop me an email if you have any comments or questions.


[1] There are some exceptions to this. Macros are expanded and EDN is extensible. These exceptions, however, do not change the fact that Clojure doesn’t have a special syntax divorced from its AST altogether.
