Matheus Eduardo Machado Moreira

11/9/16 2:19 PM

Functional programming paradigm and languages are usually associated with core features like pure functions and immutable data (structures)¹, and other properties like referential transparency, higher-order (first order) functions, lazy evaluation, and recursion². They arguably provide programmers with better tools and abstractions to deal with difficult actual problems that involve concurrency, parallelism, big data, etc.

Despite all those important characteristics, the most commom remark I hear about functional languages when talking about them with friends and co-workers is: "What a strange syntax! And look at all those parentheses, they must make the code very hard to read and write!" So let's try to demystify this "strange" syntax and see what benefits it can give to developers.

S-expressions

The kind of "strange" syntax we are talking about is commonly associated with LISP³ and derived languages. It is easily recognized by the usage of parenthesis to denote function call/application (together with prefix notation, in which the operator is placed before its operands) and as scope delimiters (instead of { and }, for example).

LISP uses a notation called S-expression (for Symbolic expression). It is a notation for nested list data and is used for source code as well as data representation⁴. The authors of the excellent book SICP (Structure and Interpretation of Computer Programs)⁵ introduce s-expressions in a simple way in section 1.1.1 Expressions.

Expressions representing numbers may be combined with an expression representing a primitive procedure (such as + or *) to form a compound expression that represents the application of the procedure to those numbers. For example:

(+ 137 349)
486
(- 1000 334)
666
(* 5 99)
495
(/ 10 5)
2
(+ 2.7 10)
12.7

The authors call those expressions combinations. Combinations can accomodate procedures that may take an arbritary number of arguments, can be nested, and do not suffer from ambiguity:

(+ 21 35 12 7)
75
(+ (* 3 5) (- 10 6))
19

Those characteristics are already an improvement over languages that use infix notation: it is not necessary to repeat the operator between operands (e.g. 21 + 35 + 12 + 7) and to remember operator precedence (e.g. multiplication is evaluated before addition).

Homoiconicity

Wikipedia defines homoiconicity (also know as code-as-data) as "a property of some programming languages in which the program structure is similar to its syntax, and therefore the program's internal representation can be inferred by reading the text's layout."⁶ Put another way, a homoiconic language uses the same (homo) representations (icons) for source code and data structures.

Most languages need to read a program's textual representation and parse it into something that can be compiled and/or evaluated. This "something" is known as AST (Abstract Syntax Tree), a data structure that represents formally what is expressed in the source code. The following image is a sample syntax tree for the expression requiresRole && userHasRole("MODIFY", assetId)⁷.

Most languages do not provide a straightforward way to manipulate the syntax tree - it is an internal representation only accessible to the compiler during compilation, for example. Homoiconic languages, on the other hand, have programs written using data structures that represent the syntax tree directly. The same requiresRole... code can be written as (and requiresRole (userHasRole "MODIFY" assetId)). Note how the code structure resembles the syntax tree.

Powerful metaprogramming

Homoiconicity enables higher level metaprogramming facilities that cannot be attained by languages that do not exhibit this property. Higher level metaprogramming means higher level abstractions, which frequently translate to less code, more work done, and a lower margin for mistakes. Let's take a look at some interesting examples of what can be accomplished through this powerful feature. (The Clojure⁸ language will be used in the examples, but the ideas and capabilities are shared among other LISP-like languages.)

Java 5's enhanced `for` loop

Java 5 added a new syntax, know as enhanced for loop, to the language. It enabled developers to rewritte an "old style" for loop

for (int i = 0; i < collection.size(); i++) {
    SomeType element = (SomeType) collection.get(i);
    doSomething(element);
}

for (SomeType element : collection) {
    doSomething(element);
}

The new syntax is easier to read and write, and hides complexities (such as index manipulation). Java programmers had to wait until Java 5 to use the enhanced for loop because this addition required a change at the compiler level, and most programmers don't have the knowledge or ability to make it.

Clojure's metaprogramming workhorse is the macro⁹. A macro is a special function that is executed during compilation and whose arguments are unevaluated code. The arguments are data structures that represent code! Given that m is a macro with a single parameter, the call (m (+ 1 2 3)) will result in the parameter's value being (+ 1 2 3)¹⁰, not the value 6.

With macros a programmer could define in a few lines a foreach¹¹ construct, similar to Java 5's enhanced for loop, without touching the compiler or waiting years for it¹²:

(defmacro foreach
  [[sym coll] & body]
  `(loop [coll# ~coll]
    (when-let [[~sym & xs#] (seq coll)]
      ~@body
      (recur xs#))))

(foreach [x [1 2 3]]
  (println x))
=> 1
=> 2
=> 3

Code simplification

S-expressions sometimes make writing and reading chained function calls a bit cumbersome because we need to write and read the expression from the inside out:

(prn (conj (reverse [1 2 3]) 4))
; (4 3 2 1)
=> nil

In natural language we would say: "Reverse the vector [1 2 3], append 4 to it and print the result. It would be nice if we could write code similar to what we just wrote in natural language:

(thread [1 2 3] reverse (conj 4) prn)
; (4 3 2 1)
=> nil

thread would be a form that passes the first value ([1 2 3]) through a series of method calls, passing the result of a previous call (e.g. (reverse [1 2 3])) as the first argument of the next call (e.g. (conj [3 2 1] 4)).

Clojure indeed has a macro that performs what was described: -> (thread first). But the point here is that -> is not a privileged language construct, it is a macro that we could have written ourselves:

(defmacro thread
  [x & forms]
  (loop [x x, forms forms]
    (if forms
      (let [form (first forms)
            threaded (if (seq? form)
                       (list* (first forms) x (next forms))
                       (list form x))]
        (recur threaded (next forms)))
      x)))

Final thoughts

S-expressions, homoiconicity, and prefix notation are not weird characteristics of esoteric programming languages. They are well-thought out features that give rise to powerful tools at a programmer's disposal. With proper care, we are able to write code at a higher abstraction level, leading to less verbosity, more meaning (think DLSs¹³), more maintainability, and better code structure in general.

If you are interested in learning the functional programming paradigm, my humble recommendation is the SICP book. The authors use the Scheme programming language¹⁴ and present the most important functional paradigm features and characteristics through several examples and exercises of increasing difficulty¹⁵. Those who have an object-oriented background, particularly with Java, may find Scala¹⁶ and Clojure very appealing.

```
(loop [coll3531 [1 2 3]]
  (when-let [[x & xs3534] (seq coll3531)]
    (println x)
    (recur xs3534)))
```

12. Clojure programmers do not need to define the foreach macro, since the language already has iteration facilities such as doseq. ↩

13. https://www.gnu.org/software/mit-scheme/ ↩

14. There are a lot of references and resources about SICP online. Particularly useful are sites where it is possible to find solutions for most of the exercises, sometimes with different implementations. But don't cheat! Try hard to solve the exercises by yourself before searching for someone's else solution. It will pay off! ↩

15. http://scala-lang.org/ ↩

Author

Matheus Eduardo Machado Moreira

Matheus Eduardo Machado Moreira is a Senior Consultant at Avenue Code. He has 13+ years of experience with software development and has worked primarily with Java backend technologies. He loves his wife and family, likes to play games, and enjoys riding motorcycles.

((All Those Parens!))

S-expressions

Homoiconicity

Powerful metaprogramming

Java 5's enhanced `for` loop

Code simplification

Final thoughts

Author

Matheus Eduardo Machado Moreira

Leave a Comment!

Popular Snippets

About Us

Our Offices

We're Hiring!

((All Those Parens!))

S-expressions

Homoiconicity

Powerful metaprogramming

Java 5's enhanced for loop

Code simplification

Final thoughts

Author

Matheus Eduardo Machado Moreira

Related Posts

Asymmetric Cryptography

Improving the developer experience when documenting

How to Run Rust from Python

How to Create a Tic-Tac-Toe Board Using React.js

Leave a Comment!

Avenue Code Social

Newsletter

Popular Snippets

About Us

Our Offices

We're Hiring!

Java 5's enhanced `for` loop