Demystifying Elixir Macros

Macros in Elixir are one of the more challenging concepts to understand, and because of this many people simply say "don't use them."

The key to understanding macros is to also understand a little of how compilers work.

Compilers, Bytecode, Transpiling

Code comes in several levels, but first let's define two levels programmers generally understand:

  1. Human readable code: Elixir, Erlang, C, Python, Java
  2. Machine code: the binary instructions the processor executes directly when a program runs (aka native code).

The act of "compiling" code is taking it from the human readable syntax into the machine syntax.

But that's not the whole story, because compilers go through several stages on the way from the first level to the second.

For example, C code, which many think of as very "low level" because it compiles all the way to machine code, typically goes through an intermediate step where it's compiled to "assembly," which is still human readable, though much less abstract, and then that assembly code is assembled into machine code.

Where it gets interesting is with "interpreted" languages, where the code is not compiled to machine code but to an intermediate form, much like C code is compiled to assembly. This intermediate form is usually called bytecode. An "interpreter," which is itself compiled to machine code, then runs and interprets the bytecode at runtime (in a "virtual machine"). This is how Python, Elixir, Java and many other languages work.

It's also not as efficient as running machine native code, as it's going through a virtual interpreted layer.

Some languages introduce "Just In Time" (aka JIT) compiling for this reason, to help optimize the process. With a JIT compiler, the runtime takes portions of the bytecode and compiles them into native machine code as the program runs. However, that's a different discussion altogether.

Erlang Beam AST

Where it gets interesting with Elixir is the Abstract Syntax Tree (AST), which is a middle stage between the human readable code and the bytecode. The compiler works on the AST, and because Elixir's AST is translated into the same abstract format the Erlang compiler consumes, Elixir and Erlang can easily co-exist on the same BEAM and share the same compiler back end.

The AST is conceptually almost like "assembly for the erlang beam" if you want to think of it that way.

So with Elixir, when code is compiled and then executed, it goes through several stages:

  1. Elixir code (human readable)
  2. Compiled to AST
  3. Compiled to BEAM bytecode (by the Erlang compiler)
  4. Bytecode is interpreted by the BEAM, which is itself native machine code.
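
The first two stages above can be observed directly from a script or an iex session. The following is a minimal sketch, not a description of the compiler's internals: it uses Code.string_to_quoted!/1 to parse a string into AST, Macro.to_string/1 to turn AST back into source text, and Code.eval_quoted/2 to run the AST. The expression and the variable name x are made up for illustration.

```elixir
# Stage 1: human readable Elixir code, held here as a plain string.
source = "1 + 2 * x"

# Stage 2: parse the source into AST without running it.
ast = Code.string_to_quoted!(source)

# Every non-literal AST node is a three-element tuple:
#   {name, metadata, arguments}
# Literals such as 1 and 2 represent themselves.
{:+, _meta, [1, {:*, _, [2, {:x, _, nil}]}]} = ast

# The AST is ordinary data, so it can be inspected and printed.
IO.inspect(ast, label: "ast")
IO.puts(Macro.to_string(ast))

# The AST can also be evaluated, here with x bound to 10.
{result, _binding} = Code.eval_quoted(ast, x: 10)
IO.inspect(result, label: "result")
```

Because the AST is just tuples, lists and atoms, Elixir code can build and transform it like any other data, which is what macros do.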

Macros, Quote, Unquote

So now, macros. The easiest way to approach them is to keep the following in mind when you see these in Elixir code:

  • quote: think precompile-to-AST. The Elixir code inside the block is not run where it is written; it is turned into its AST and returned as data, ready to be handed to the compiler later.
  • unquote: think inject-into-AST. The expression is evaluated, and its result (a plain value or another piece of AST) is spliced into the surrounding quoted code (only usable from within a quote block).
  • use: think embed-AST-code. The referenced module's __using__ macro is called, and the AST it returns (only the bits in its quote block) is embedded into the current module, before the overall combined AST is compiled to BEAM bytecode.
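
As a small sketch of quote and unquote in isolation (the variable names n, a and b are hypothetical, chosen only for illustration), the snippet below shows what each one produces. The pattern matches double as checks: the script would raise if the AST looked different.

```elixir
# quote: the code is not run; its AST is returned as data.
quoted_sum = quote do: 1 + 2
{:+, _meta, [1, 2]} = quoted_sum

n = 5

# Without unquote, n stays a variable reference inside the AST.
without_unquote = quote do: n * 2
"n * 2" = Macro.to_string(without_unquote)

# With unquote, the current value of n is spliced into the AST.
with_unquote = quote do: unquote(n) * 2
"5 * 2" = Macro.to_string(with_unquote)

# unquote also accepts a piece of AST, not only plain values.
inner = quote do: a + b
outer = quote do: unquote(inner) * 2
"(a + b) * 2" = Macro.to_string(outer)

# Quoted code only runs once something evaluates or compiles it.
{value, _binding} = Code.eval_quoted(with_unquote)
IO.inspect(value, label: "evaluated")
```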

Now, when you look at macros, hopefully this helps you understand what is happening.
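
To tie the pieces together, here is a minimal sketch of use. The module names Greeter, English and Spanish, the greet/1 function and the :greeting option are all hypothetical, invented for this example. The use line calls Greeter's __using__ macro at compile time, and the quoted AST it returns is embedded into the calling module.

```elixir
defmodule Greeter do
  # Called at compile time by: use Greeter, greeting: "..."
  defmacro __using__(opts) do
    # This line runs while the *calling* module is being compiled.
    greeting = Keyword.get(opts, :greeting, "Hello")

    # The quoted AST below is what gets embedded into the caller.
    quote do
      def greet(name) do
        # unquote splices the compile-time value into the AST.
        unquote(greeting) <> ", " <> name
      end
    end
  end
end

defmodule English do
  use Greeter
end

defmodule Spanish do
  use Greeter, greeting: "Hola"
end

# greet/1 was compiled into each calling module, not into Greeter.
IO.puts(English.greet("Ada"))
IO.puts(Spanish.greet("Ada"))
```

Note that Greeter itself never defines greet/1. It only holds the recipe (the AST) that each calling module compiles into its own bytecode.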

Brandon Gillespie

Brandon loves Tech and has a breadth of experience including hands-on Implementation and Leadership Roles across many companies including small startups, enterprises, and the USAF.