Elixir 101 - Introducing the Syntax
I’ve been posting a lot of articles in the last few weeks; check out the “Elixir” tag to read all of them.
Many tutorial series start introducing a new language by its syntax. I subverted the order. Elixir is not interesting because of its syntax. Erlang is interesting all by itself, because of its very mature, highly reliable, highly concurrent, distributed nature. But Erlang’s syntax is not for the faint of heart. It’s not “ugly”, it’s just too different for those of us from the C school to easily digest. It derives from Prolog, and this is one small example of a Prolog exercise:
% P03 (*): Find the K'th element of a list.
% The first element in the list is number 1.
% Example:
% ?- element_at(X,[a,b,c,d,e],3).
% X = c
% element_at(X,L,K) :- X is the K'th element of the list L
% (element,list,integer) (?,?,+)
% Note: nth1(?Index, ?List, ?Elem) is predefined
element_at(X,[X|_],1).
element_at(X,[_|L],K) :- K > 1, K1 is K - 1, element_at(X,L,K1).
Erlang has a similar syntax, with the idea of phrases divided by commas and ending with a dot.
José Valim played it very smart: he chose the best of the available mature platforms and coated it with a layer of modern syntax and easier-to-use standard libraries. This is the same problem implemented in Elixir:
defmodule Exercise do
  def element_at([found|_], 1), do: found
  def element_at([_|rest], position) when position > 1 do
    element_at(rest, position - 1)
  end
end
If I copy and paste the code above in an IEx shell I can test it out like this:
iex(7)> Exercise.element_at(["a", "b", "c", "d", "e"], 3)
"c"
This simple exercise shows us some of the powerful bits of Erlang that Elixir capitalizes upon, such as pattern matching and recursion.
First of all, every function must be defined inside a module, which you name with defmodule My.Module do .. end. Internally it becomes the atom “Elixir.My.Module”. Nesting modules is just a longer name concatenated with dots.
Then you can define a public function with the def my_function(args) do .. end block, which is just a macro for the same def my_function(args), do: … construct. Private functions are declared with defp.
A function is actually identified by the pair of its name and its arity. So above we have element_at/2, which means it accepts 2 arguments. But we defined element_at/2 twice: these are two clauses of the same function, and the difference between them is the pattern matching.
def element_at([found|_], 1), do: found
Here we are saying: the first argument must be a list, so decompose it. The first element of the list will be stored in the “found” variable, and the rest, matched by “_”, will be ignored. The second argument must be the number 1. This is the description of the so-called “pattern”, and it should “match” the input arguments received. This is “call-by-pattern” semantics.
But what if we want to pass a position other than “1”? That’s why we have this second definition:
def element_at([_|rest], position) when position > 1 do
Now, the first argument again needs to be a list, but this time we don’t care about the first element, just the rest of the list without it. And any position other than 1 will be stored in the “position” variable.
But this function is special: it is guarded to only allow a position that is larger than 1. What if we try a negative position?
iex(8)> Exercise.element_at(["a", "b", "c", "d", "e"], -3)
** (FunctionClauseError) no function clause matching in Exercise.element_at/2
    iex:7: Exercise.element_at(["a", "b", "c", "d", "e"], -3)
It says that the arguments we passed don’t match any of the clauses defined above. We could have added a third definition just to catch those cases:
def element_at(_list, _position), do: nil
Adding the underscore “_” before the variable name is the same as having just the underscore, but we are naming it just to make it more readable. Any arguments passed will just be ignored. This is the most generic clause, so it must come last: it is only tried when the previous 2 don’t match.
The previous line is the same as writing:
def element_at(_list, _position) do
  nil
end
I won’t dive into macros for now. Just know that there is more than one way of doing things in Elixir, and those different ways are defined using Elixir’s built-in support for macros: code that generates code and is expanded at compile time. It’s the way of doing metaprogramming in Elixir.
Now, going back to the implementation, the recursive clause can still look weird; let’s review it:
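To see the three clauses working together, here is a minimal sketch. The module name SafeExercise is hypothetical (chosen only so it does not clash with Exercise above); the clauses themselves are the ones discussed in the text.

```elixir
defmodule SafeExercise do
  # Base case: position 1 returns the head of the list.
  def element_at([found | _], 1), do: found

  # Recursive case: drop the head and count the position down.
  def element_at([_ | rest], position) when position > 1 do
    element_at(rest, position - 1)
  end

  # Catch-all: empty lists, positions below 1, or positions past the end.
  def element_at(_list, _position), do: nil
end

IO.inspect(SafeExercise.element_at(["a", "b", "c", "d", "e"], 3))  # "c"
IO.inspect(SafeExercise.element_at(["a", "b", "c", "d", "e"], -3)) # nil
IO.inspect(SafeExercise.element_at(["a", "b"], 5))                 # nil
```

Clauses are tried from top to bottom, which is why the catch-all sits at the end.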
def element_at([_|rest], position) when position > 1 do
  element_at(rest, position - 1)
end
What happens is: when we call Exercise.element_at(["a", "b", "c", "d", "e"], 3) the first argument will pattern match with [_|rest]. The first element "a" is discarded and the new list ["b", "c", "d", "e"] is stored as “rest”.
Finally, we recurse the call, decrementing the “position” variable. So it becomes element_at(["b", "c", "d", "e"], 2). And it repeats until position becomes 1, in which case the pattern matching falls to the other clause, defined as:
def element_at([found|_], 1), do: found
In this case the remaining list ["c", "d", "e"] is pattern matched: the first element, "c", is stored in the “found” variable and the rest of the list is discarded. It only got here because the position matched as 1, so it just returns the variable “found”, which contains the 3rd element of the original list, "c".
This is all nice and fancy, but in Elixir we could have just done this other version:
defmodule Exercise do
  # Enum.at/2 is zero-based, so subtract 1 to keep the 1-based positions used above
  def element_at(list, position), do: Enum.at(list, position - 1)
end
And we are done! Several tutorials will talk about how recursion and pattern matching to decompose lists solve a lot of problems, but Elixir gives us the convenience of treating lists as Enumerables and provides us a rich Enum module with very useful functions such as at/2, each/2, take/2, and so on. Just pick what you need and you’re managing lists like a boss.
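As a quick illustration of the Enum functions just mentioned, here is a small sketch; the `letters` variable is only sample data. Note that Enum.at/2 counts from zero, unlike the 1-based exercise above.

```elixir
letters = ["a", "b", "c", "d", "e"]

# Enum.at/2 uses a zero-based index and returns nil when out of range.
IO.inspect(Enum.at(letters, 2))   # "c"
IO.inspect(Enum.at(letters, 10))  # nil

# Enum.take/2 returns the first n elements as a new list.
IO.inspect(Enum.take(letters, 2)) # ["a", "b"]

# Enum.each/2 runs a function for its side effects and returns :ok.
Enum.each(letters, fn letter -> IO.puts(letter) end)
```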
Oh, and by the way, there is something called a Sigil in Elixir. Instead of writing the List of Strings explicitly, we could have done it like this:
iex(8)> ~w(a b c d e f)
["a", "b", "c", "d", "e", "f"]
Or, if we wanted a List of Atoms, we could do it like this:
iex(9)> ~w(a b c d e f)a
[:a, :b, :c, :d, :e, :f]
Lists, Tuples and Keyword Lists
Well, this was too simple. You really need the idea of pattern matching and basic types in your mind to make it flow. Let’s get another snippet from the Ex Manga Downloadr:
defp parse_args(args) do
  parse = OptionParser.parse(args,
    switches: [name: :string, url: :string, directory: :string],
    aliases: [n: :name, u: :url, d: :directory]
  )
  case parse do
    {[name: manga_name, url: url, directory: directory], _, _} -> process(manga_name, url, directory)
    {[name: manga_name, directory: directory], _, _} -> process(manga_name, directory)
    {_, _, _ } -> process(:help)
  end
end
The first part may puzzle you:
OptionParser.parse(args,
  switches: [name: :string, url: :string, directory: :string],
  aliases: [n: :name, u: :url, d: :directory]
)
OptionParser.parse/2 receives just 2 arguments, and both are Lists. If you come from Ruby it feels like the second one is a Hash with optional braces, translating to something similar to this:
# this is wrong
OptionParser.parse(args,
  { switches: {name: :string, url: :string, directory: :string},
    aliases: {n: :name, u: :url, d: :directory} }
)
This works in Ruby, but it is not the case in Elixir. There are optional brackets, but not where you think they are:
# this is the correct, more explicit version
OptionParser.parse(args,
  [
    {
      :switches,
      [
        {:name, :string}, {:url, :string}, {:directory, :string}
      ]
    },
    {
      :aliases,
      [
        {:n, :name}, {:u, :url}, {:d, :directory}
      ]
    }
  ]
)
WHAT!?!?
Yep, the second argument is actually a List whose elements are two-element Tuples, each pairing an atom key with a value, and some of the values are themselves Lists of Tuples.
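in Elixir, Lists are what we would reach for where other languages use an Array, but they are Linked Lists of elements, delimited by square brackets “[]”. Linked Lists, as you know from your Computer Science classes, make it cheap to add or remove elements at the head, while reaching the n-th element means walking the list.
in Elixir, Tuples are fixed-size collections with fixed positions, with elements delimited by curly braces “{}”. Like every value in Elixir they are immutable, so “changing” an element returns a new tuple.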
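A short sketch of the difference, using only Kernel functions; the variable names and values are just sample data.

```elixir
# Lists: prepending with [head | tail] is cheap, and the same syntax decomposes them.
list = [2, 3]
longer = [1 | list]
[head | tail] = longer
IO.inspect({longer, head, tail})  # {[1, 2, 3], 1, [2, 3]}

# Tuples: fixed size, direct access by zero-based position.
tuple = {:ok, "value", 42}
IO.inspect(elem(tuple, 1))        # "value"
IO.inspect(tuple_size(tuple))     # 3

# put_elem/3 returns a new tuple; the original is left untouched.
updated = put_elem(tuple, 2, 43)
IO.inspect({tuple, updated})      # {{:ok, "value", 42}, {:ok, "value", 43}}
```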
If the previous example was just too much, let’s step back a little:
defmodule Teste do
  def teste(opts) do
    [{:hello, world}, {:foo, bar}] = opts
    IO.puts "#{world} #{bar}"
  end
end
Now we can call it like this:
iex(13)> Teste.teste hello: "world", foo: "bar"
world bar
Which is the same as calling like this:
iex(14)> Teste.teste([{:hello, "world"}, {:foo, "bar"}])
world bar
This may confuse you, but it’s very intuitive. You can just think of this combination of Lists ("[]") with Tuple elements containing a pair of atom and value ("{:key, value}") as behaving almost like Ruby Hashes being used for optional named arguments.
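The equivalence between the two notations can be checked directly. In this sketch the Greeter module and its :greeting option are hypothetical, made up only to show a keyword list used as optional named arguments.

```elixir
defmodule Greeter do
  # opts is a keyword list; Keyword.get/3 falls back to a default value.
  def greet(name, opts \\ []) do
    greeting = Keyword.get(opts, :greeting, "Hello")
    "#{greeting}, #{name}!"
  end
end

sugar = [hello: "world", foo: "bar"]
explicit = [{:hello, "world"}, {:foo, "bar"}]

IO.inspect(sugar == explicit)                  # true
IO.inspect(Keyword.get(sugar, :foo))           # "bar"
IO.inspect(Greeter.greet("José"))              # "Hello, José!"
IO.inspect(Greeter.greet("José", greeting: "Olá")) # "Olá, José!"
```

Reading options with Keyword.get/3 instead of matching the whole list means the caller can pass them in any order.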
Then, we have the Pattern Match section in both previous examples:
case parse do
  {[name: manga_name, url: url, directory: directory], _, _} ->
    process(manga_name, url, directory)
  {[name: manga_name, directory: directory], _, _} ->
    process(manga_name, directory)
  {_, _, _ } ->
    process(:help)
end
And
[{:hello, world}, {:foo, bar}] = opts
The last example is just decomposition. The previous example is pattern matching and decomposition. You match based on the atoms and positions within the tuples within the list. You order the clauses from the narrowest case to the most generic one. And in the process, the variables bound in the pattern are available for you to use in the body of the matching case clause.
Let’s understand the meaning of this line:
{[name: manga_name, url: url, directory: directory], _, _} -> process(manga_name, url, directory)
It is saying: the result of the OptionParser.parse/2 function must be a tuple with 3 elements. The second and third elements don’t matter. But the first element must be a List with exactly 3 tuples, whose keys are the atoms :name, :url, and :directory, in that order. If they’re there, store the values of each tuple in the variables manga_name, url, and directory, respectively.
This may really confuse you in the beginning, but this combination of a List of Tuples is what’s called a Keyword List, and you will find this pattern many times, so get used to it.
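Because the pattern matches the list element by element, the order of the options matters. The sketch below reuses the switches and aliases from the snippet above; the `describe` function, its return tuples, and the sample arguments are hypothetical stand-ins for the process calls.

```elixir
switches = [name: :string, url: :string, directory: :string]
aliases = [n: :name, u: :url, d: :directory]

# Hypothetical helper: returns a tagged tuple instead of calling process/3.
describe = fn args ->
  case OptionParser.parse(args, switches: switches, aliases: aliases) do
    {[name: name, url: url, directory: dir], _, _} -> {:with_url, name, url, dir}
    {[name: name, directory: dir], _, _} -> {:without_url, name, dir}
    {_, _, _} -> :help
  end
end

# Options in the expected order match the second clause.
IO.inspect(describe.(["-n", "onepunch", "-d", "/tmp/manga"]))
# {:without_url, "onepunch", "/tmp/manga"}

# The same options in a different order fall through to the catch-all.
IO.inspect(describe.(["-d", "/tmp/manga", "-n", "onepunch"]))
# :help
```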
Keyword Lists feel like a Map, but a Map has a different syntax:
list = [a: 1, b: 2, c: 3]
map = %{:a => 1, :b => 2, :c => 3}
This should summarize it:
iex(1)> list = [a: 1, b: 2, c: 3]
[a: 1, b: 2, c: 3]
iex(2)> map = %{:a => 1, :b => 2, :c => 3}
%{a: 1, b: 2, c: 3}
iex(3)> list[:a]
1
iex(4)> map[:a]
1
iex(5)> list.a
** (ArgumentError) argument error
:erlang.apply([a: 1, b: 2, c: 3], :a, [])
iex(5)> map.a
1
iex(6)> list2 = [{:a, 1}, {:b, 2}, {:c, 3}]
[a: 1, b: 2, c: 3]
iex(7)> list = list2
[a: 1, b: 2, c: 3]
Keyword Lists are convenient as function arguments or return values. But if you want to process a collection of key-value pairs, use a dictionary-like structure, in this case a Map, especially if you need to search the collection using the key. They look similar, but the internal structures are not the same. A Keyword List is not a Map; it’s just a convenience for a static list of tuples.
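A few behaviors make the difference concrete. This is a minimal sketch with sample data, using functions from the Keyword and Map modules.

```elixir
# A keyword list keeps insertion order and allows duplicate keys.
list = [a: 1, b: 2, a: 3]
IO.inspect(Keyword.get(list, :a))         # 1 (first match wins)
IO.inspect(Keyword.get_values(list, :a))  # [1, 3]

# A map holds each key once and is built for lookups by key.
map = %{a: 1, b: 2}
IO.inspect(Map.fetch(map, :a))            # {:ok, 1}
IO.inspect(Map.fetch(map, :missing))      # :error
IO.inspect(Map.put(map, :c, 3))           # %{a: 1, b: 2, c: 3}

# Converting a keyword list to a map keeps the last value of a duplicate key.
IO.inspect(Map.new(list))                 # %{a: 3, b: 2}
```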
Finally, if this pattern matches the parse variable passed to the case block, it executes the statement process(manga_name, url, directory), passing the 3 variables captured in the match. Otherwise it proceeds to try the next pattern in the case block.
The idea is that the “=” operator is not an “assignment”, it’s a matcher: you match one side with the other. Read the error message when a pattern is not matched:
iex(15)> [a, b, c] = 1
** (MatchError) no match of right hand side value: 1
This is a matching error, not an assignment error. But if it succeeds this is what we have:
iex(15)> [a, b, c] = [1, 2, 3]
[1, 2, 3]
iex(16)> a
1
iex(17)> c
3
This is List decomposition. It so happens that in the simple case it feels like a variable assignment, but it’s much more complex than that.
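The same matching works on any shape of data, not only flat lists. In this sketch the `to_int` function is hypothetical; it relies on Integer.parse/1 returning a `{number, rest}` tuple on success and the atom :error otherwise, so a failed match raises a MatchError that carries the unmatched value.

```elixir
# Hypothetical helper: matches the tuple returned by Integer.parse/1.
to_int = fn text ->
  try do
    {number, _rest} = Integer.parse(text)
    {:ok, number}
  rescue
    # The right-hand side that failed to match is available as error.term.
    error in MatchError -> {:no_match, error.term}
  end
end

IO.inspect(to_int.("42"))   # {:ok, 42}
IO.inspect(to_int.("abc"))  # {:no_match, :error}

# Head and tail decomposition uses the very same operator.
[first | rest] = [1, 2, 3]
IO.inspect({first, rest})   # {1, [2, 3]}
```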
Pipelines
We use exactly those concepts of pattern matching on the elements returned from the HTML parsed by Floki in my Manga Downloadr:
Floki.find(html, "#listing a")
|> Enum.map(fn {"a", [{"href", url}], _} -> url end)
The find/2 function takes an HTML string from the fetched page and matches it against the CSS selector in the second argument. The result is a List of Tuples representing the structure of each HTML Node found, in this case, this pattern: {"a", [{"href", url}], _}
We can then call Enum.map/2. A map is a function that receives each element of a list and returns a new list with new elements. The first argument is the original list and the second argument is a function that receives each element and returns a new one.
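To try the mapping step without fetching a page, the sketch below uses a hand-written `nodes` list. It is sample data shaped like the pattern above, not real Floki output.

```elixir
# Sample nodes in the {tag, attributes, children} shape described above.
nodes = [
  {"a", [{"href", "/chapter/1"}], ["Chapter 1"]},
  {"a", [{"href", "/chapter/2"}], ["Chapter 2"]}
]

# The anonymous function pattern matches each node and returns only the URL.
urls = Enum.map(nodes, fn {"a", [{"href", url}], _children} -> url end)
IO.inspect(urls)  # ["/chapter/1", "/chapter/2"]
```

A node that does not fit the pattern, for example a link with more than one attribute, would raise a FunctionClauseError, so this style assumes the markup is predictable.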
One of the main features of the Elixir language that most languages don’t have is the Pipe operator ("|>"). It behaves almost like UNIX’s pipe operator “|” in any shell.
In UNIX we usually do stuff like "ps -ef | grep PROCESS | grep -v grep | awk '{print $2}' | xargs kill -9"
This is essentially the same as doing:
ps -ef > /tmp/ps.txt
grep PROCESS /tmp/ps.txt > /tmp/grep.txt
grep -v grep /tmp/grep.txt > /tmp/grep2.txt
awk '{print $2}' /tmp/grep2.txt > /tmp/awk.txt
xargs kill -9 < /tmp/awk.txt
Each UNIX process can receive something from standard input (STDIN) and output something to standard output (STDOUT). We can redirect the output using “>”. But instead of doing all those extra steps and creating all those extra garbage temporary files, we can simply “pipe” the STDOUT of one command to the STDIN of the next command.
Elixir uses the same principles: we can simply use the returning value of a function as the first argument of the next function. So the first example of this section is the same as doing this:
results = Floki.find(html, "#listing a")
Enum.map(results, fn {"a", [{"href", url}], _} -> url end)
In the same ExMangaDownloadr project we have this snippet:
defp process(manga_name, url, directory) do
  File.mkdir_p!(directory)
  url
    |> Workflow.chapters
    |> Workflow.pages
    |> Workflow.images_sources
    |> Workflow.process_downloads(directory)
    |> Workflow.optimize_images
    |> Workflow.compile_pdfs(manga_name)
    |> finish_process
end
And we just learned that it’s the equivalent of doing the following (I’m cheating a bit because the 3 final functions of the workflow are not transforming the input “directory”, just passing it through):
defp process(manga_name, url, directory) do
  File.mkdir_p!(directory)
  chapters = Workflow.chapters(url)
  pages = Workflow.pages(chapters)
  sources = Workflow.images_sources(pages)
  Workflow.process_downloads(sources, directory)
  Workflow.optimize_images(directory)
  Workflow.compile_pdfs(directory, manga_name)
  finish_process(directory)
end
Or this much uglier version that we must read in reverse:
defp process(manga_name, url, directory) do
  File.mkdir_p!(directory)
  finish_process(
    Workflow.compile_pdfs(
      Workflow.optimize_images(
        Workflow.process_downloads(
          Workflow.images_sources(
            Workflow.pages(
              Workflow.chapters(url)
            )
          ), directory
        )
      ), manga_name
    )
  )
end
We can easily see how the Pipe Operator “|>” makes any transformation pipeline much easier to read. Anytime you are starting from a value and passing the results through a chain of transformations, you will use this operator.
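Here is a self-contained pipeline you can paste into IEx. The TextPipeline module and its functions are hypothetical, built only from standard String and Enum functions; both versions do the same work, one piped and one nested.

```elixir
defmodule TextPipeline do
  # Piped version: read top to bottom, each result feeds the next call.
  def slugify(title) do
    title
    |> String.downcase()
    |> String.split(" ", trim: true)
    |> Enum.join("-")
  end

  # Nested version: same calls, but read from the inside out.
  def slugify_nested(title) do
    Enum.join(String.split(String.downcase(title), " ", trim: true), "-")
  end
end

IO.inspect(TextPipeline.slugify("One Punch Man"))         # "one-punch-man"
IO.inspect(TextPipeline.slugify_nested("One Punch Man"))  # "one-punch-man"
```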
Next Steps
The concepts presented in this article are the ones I think most people will find the most challenging upon first glance. If you understand Pattern Matching and Keyword Lists, you will understand all the rest.
The official website offers a great Getting Started that you must read entirely.
From intuition you know most things already. You have “do .. end” blocks, but you may not yet know that they are just convenience macros to pass a list of statements as an argument inside a Keyword List. The following blocks are equivalent:
if true do
  a = 1 + 2
  a + 10
end

if true, do: (
  a = 1 + 2
  a + 10
)

if true, [{:do, (
  a = 1 + 2
  a + 10
)}]
Mind-blowing, huh? There are many macros that add syntactic sugar using the primitives behind it.
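The same holds when there is an else branch: it is just one more key in the Keyword List. A small sketch, with `n` and the returned atoms as sample values:

```elixir
n = 3

# Block form.
with_block =
  if n > 2 do
    :big
  else
    :small
  end

# Keyword form: do: and else: are entries of a keyword list.
with_keywords = if n > 2, do: :big, else: :small

# Fully explicit form: a list of {key, value} tuples.
with_tuples = if(n > 2, [{:do, :big}, {:else, :small}])

IO.inspect({with_block, with_keywords, with_tuples})  # {:big, :big, :big}
```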
For the most part, Valim made the powerful Erlang primitives more accessible (Lists, Atoms, Maps, etc.) and added higher abstractions using macros (do .. end blocks, the pipe operator, keyword lists, shortcuts for anonymous functions, etc.). This precise combination is what makes Elixir very enjoyable to learn. It’s like peeling an onion: you start with the higher abstractions and discover macros of simpler structures underneath. You see a Keyword List first and discover Lists of Tuples. You see a block and discover another Keyword List disguised by a macro. And so on.
So you have a low barrier to entry and you can go as deep as you want into the rabbit hole, all the way to extending the language.
Elixir provides a very clever language design on top of the 25-year-old mature Erlang core. This is not just clever, it’s the intelligent choice. Keep learning!