Macros to the Rescue!
Each source module has 3 sub-modules: ChapterPage, IndexPage and Page. All of them have a main function that resembles this piece of code:
    defmodule ExMangaDownloadr.Mangafox.ChapterPage do
      require Logger
      ...
      def pages(chapter_link) do
        Logger.debug("Fetching pages from chapter #{chapter_link}")
        case HTTPotion.get(chapter_link, [headers: ["User-Agent": @user_agent, "Accept-encoding": "gzip"], timeout: 30_000]) do
          %HTTPotion.Response{ body: body, headers: headers, status_code: 200 } ->
            body = ExMangaDownloadr.Mangafox.gunzip(body, headers)
            { :ok, fetch_pages(chapter_link, body) }
          _ ->
            { :err, "not found"}
        end
      end
      ...
    end
(Listing 1.1)
It calls "HTTPotion.get/2" sending a bunch of HTTP options and receives a "%HTTPotion.Response" struct that is then decomposed to get the body and headers. It gunzips the body if necessary and goes to parse the HTML itself.
Similar code exists in 6 different modules, with different links and different parser functions. It's a lot of repetition, but what about making the above code look like the snippet below?
    defmodule ExMangaDownloadr.Mangafox.ChapterPage do
      require Logger
      require ExMangaDownloadr

      def pages(chapter_link) do
        ExMangaDownloadr.fetch chapter_link, do: fetch_pages(chapter_link)
      end
      ...
    end
(Listing 1.2)
That turns 9 lines into just 1. And by the way, this same line can be written like this:
    ExMangaDownloadr.fetch chapter_link do
      fetch_pages(chapter_link)
    end
Look familiar? It works like every block in the Elixir language: you can write it in the "do/end" block format or the way it really is under the covers, a keyword list with a key named ":do". The macro is defined like this:
    defmodule ExMangaDownloadr do
      ...
      defmacro fetch(link, do: expression) do
        quote do
          Logger.debug("Fetching from #{unquote(link)}")
          case HTTPotion.get(unquote(link), ExMangaDownloadr.http_headers) do
            %HTTPotion.Response{ body: body, headers: headers, status_code: 200 } ->
              { :ok, body |> ExMangaDownloadr.gunzip(headers) |> unquote(expression) }
            _ ->
              { :err, "not found"}
          end
        end
      end
      ...
    end
(Listing 1.3)
There are a lot of details to consider when writing a macro, and I recommend reading the documentation on Macros. The code is basically the function body of "ChapterPage.pages/1" (Listing 1.1) copied and pasted into the "quote do .. end" block (Listing 1.3).
Inside that code we have "unquote(link)" and "unquote(expression)". You should also read the documentation on "Quote and Unquote". A "quote" block does not run its code; it returns the code as data (an abstract syntax tree). "unquote" injects the "external" code the caller passed in, also as data, into that tree. Nothing inside the quote runs while the macro is being expanded; the combined code only executes later, when the function that called the macro actually runs. I know, it's tricky to wrap your head around this the first time.
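To make the quote/unquote mechanics a bit more concrete, here is a minimal sketch that can be run as a script. The "link" value and the quoted calls are arbitrary illustrations and not part of the downloader; only "quote", "unquote", "Macro.to_string/1" and "Code.eval_quoted/1" come from Elixir itself.

```elixir
# Hypothetical value, used only for illustration.
link = "http://example.com/chapter/1"

# quote returns the code as data (the AST) instead of running it.
ast = quote do: String.upcase("abc")
IO.inspect(ast)

# unquote injects the current value of `link` into the quoted code.
with_value = quote do: String.length(unquote(link))
IO.puts(Macro.to_string(with_value))

# Nothing has been executed so far; evaluating the AST is what runs it.
{len, _bindings} = Code.eval_quoted(with_value)
IO.inspect(len)
```

The first "IO.inspect" prints a nested tuple rather than "ABC", which is the point: quoted code is just a data structure until something evaluates it.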
The bottom line is: whatever code is inside the "quote" block will be "inserted" where we called "ExMangaDownloadr.fetch/2" in the "pages/1" function in Listing 1.2, together with the unquoted code you passed as a parameter.
The resulting code will resemble the original code in Listing 1.1.
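The same shape can be tried without any HTTP dependency. The sketch below is an assumption-laden stand-in, not the project's code: "FakeHTTP", "Fetcher", "Parser" and "parse/2" are hypothetical names, and "FakeHTTP.get/1" merely imitates a response so the macro can run offline. Note how the piped body becomes the first argument of the call given in the block.

```elixir
defmodule FakeHTTP do
  # Hypothetical stand-in for an HTTP client so the sketch runs without network access.
  def get("http://ok.example/" <> _rest), do: %{status_code: 200, body: "<html>page</html>"}
  def get(_other), do: %{status_code: 404, body: ""}
end

defmodule Fetcher do
  # Same shape as the fetch macro in Listing 1.3: the caller's block is
  # spliced in at the end of the pipeline, so it receives the body.
  defmacro fetch(link, do: expression) do
    quote do
      case FakeHTTP.get(unquote(link)) do
        %{status_code: 200, body: body} ->
          {:ok, body |> unquote(expression)}
        _ ->
          {:err, "not found"}
      end
    end
  end
end

defmodule Parser do
  require Fetcher

  def pages(link) do
    # After expansion this call reads: body |> parse(link), i.e. parse(body, link).
    Fetcher.fetch link do
      parse(link)
    end
  end

  # Hypothetical parser: returns the link together with the body size.
  defp parse(body, link), do: {link, String.length(body)}
end

IO.inspect(Parser.pages("http://ok.example/chapter/1"))
IO.inspect(Parser.pages("http://missing.example/chapter/1"))
```

The macro is used inside a module on purpose: a module defined in a script file is only compiled when the script runs, so its macros are available to modules defined after it, not to top-level code in the same file.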
To make it simpler, if you were in JavaScript, similar code would look like this:
    function fetch(url) {
      eval("doSomething('" + url + "')");
    }
    function pages(page_link) {
      fetch(page_link);
    }
"Quote" would be like the string body in an eval, and "unquote" like concatenating the value you passed into the code being eval-ed. This is a crude metaphor, as "quote/unquote" is way more powerful and cleaner than the ugly "eval" (which you shouldn't be using, by the way!). But the metaphor should be enough to help you understand the code above.
Another place I used a macro was to save the images list in a dump file and load it later if the tool crashes for some reason, in order not to have to start over from scratch. The original code was like this:
    dump_file = "#{directory}/images_list.dump"
    images_list = if File.exists?(dump_file) do
        :erlang.binary_to_term(File.read!(dump_file))
      else
        list = [url, source]
          |> Workflow.chapters
          |> Workflow.pages
          |> Workflow.images_sources
        File.write(dump_file, :erlang.term_to_binary(list))
        list
      end
(Listing 1.4)
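The dump itself relies on Erlang's term serialization. As a small sketch of just that round trip, assuming a made-up list of images and a throwaway file in the system temp directory (both hypothetical, chosen so the snippet cleans up after itself):

```elixir
# Hypothetical data standing in for the real images list.
images = [{"page1.jpg", "http://example.com/1"}, {"page2.jpg", "http://example.com/2"}]

# A unique temp file so repeated runs do not collide.
path = Path.join(System.tmp_dir!(), "images_list_#{System.unique_integer([:positive])}.dump")

# Serialize any Elixir term to a binary and write it to disk.
File.write!(path, :erlang.term_to_binary(images))

# Read the binary back and rebuild the original term.
restored = path |> File.read!() |> :erlang.binary_to_term()
IO.inspect(restored == images)

File.rm!(path)
```

Because any term survives this round trip, the list of tuples comes back exactly as it was written, with no custom file format to maintain.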
And now that you understand macros, you will understand what I did here:
    defmodule ExMangaDownloadr do
      ...
      defmacro managed_dump(directory, do: expression) do
        quote do
          dump_file = "#{unquote(directory)}/images_list.dump"
          images_list = if File.exists?(dump_file) do
              :erlang.binary_to_term(File.read!(dump_file))
            else
              list = unquote(expression)
              File.write(dump_file, :erlang.term_to_binary(list))
              list
            end
        end
      end
      ...
    end

    defmodule ExMangaDownloadr.CLI do
      alias ExMangaDownloadr.Workflow
      require ExMangaDownloadr
      ...
      defp process(manga_name, directory, {_url, _source} = manga_site) do
        File.mkdir_p!(directory)

        images_list =
          ExMangaDownloadr.managed_dump directory do
            manga_site
              |> Workflow.chapters
              |> Workflow.pages
              |> Workflow.images_sources
          end
        ...
      end
      ...
    end
And there you have it! Now you can also see how "do .. end" blocks are implemented: the block is simply passed as the value of the ":do" key in the keyword list that the macro receives. Let's define a dumb macro:
    defmodule Foo do
      defmacro foo(do: expression) do
        quote do
          unquote(expression)
        end
      end
    end
And now the following calls are all equivalent:
    require Foo
    Foo.foo do
      IO.puts(1)
    end
    Foo.foo do: IO.puts(1)
    Foo.foo(do: IO.puts(1))
    Foo.foo([do: IO.puts(1)])
    Foo.foo([{:do, IO.puts(1)}])
This is macros combined with Keyword Lists, which I explained in previous articles. A Keyword List is simply a List of tuples where each tuple has an atom key and a value.
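One way to check this for yourself is to write a macro that hands back whatever it received. The sketch below is only an illustration: "Show", "ShowDemo" and "received/1" are hypothetical names, and "Macro.escape/1" is used so the received AST can be returned as plain data at runtime.

```elixir
# A keyword list literal and an explicit list of {atom, value} tuples are the same data.
short = [do: :something, else: :other]
long = [{:do, :something}, {:else, :other}]
IO.inspect(short == long)

defmodule Show do
  # Hypothetical macro: returns its argument so it can be inspected at runtime.
  # Macro.escape/1 turns the received AST into code that rebuilds that AST as data.
  defmacro received(opts) do
    Macro.escape(opts)
  end
end

defmodule ShowDemo do
  require Show

  # Called with a do/end block.
  def with_block do
    Show.received do
      1 + 2
    end
  end

  # Called with an explicit list of tuples.
  def with_keyword do
    Show.received([{:do, 1 + 2}])
  end
end

IO.inspect(ShowDemo.with_block())
IO.inspect(ShowDemo.with_keyword())
```

Both calls should print a keyword list whose ":do" entry is the quoted form of "1 + 2"; the two results differ only in line metadata.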
More Macros
Another opportunity to refactor was the pair of "mangareader.ex" and "mangafox.ex" modules, which were only used in the unit tests "mangareader_test.ex" and "mangafox_test.ex". This is the old "mangareader.ex" code:
    defmodule ExMangaDownloadr.MangaReader do
      defmacro __using__(_opts) do
        quote do
          alias ExMangaDownloadr.MangaReader.IndexPage
          alias ExMangaDownloadr.MangaReader.ChapterPage
          alias ExMangaDownloadr.MangaReader.Page
        end
      end
    end
And this is how it was used in "mangareader_test.ex":
    defmodule ExMangaDownloadr.MangaReaderTest do
      use ExUnit.Case
      use ExMangaDownloadr.MangaReader
      ...
It was just a shortcut to alias the modules in order to use them directly inside the tests. I just moved the entire module, as quoted code, into the "ex_manga_downloadr.ex" module:
      ...
      def mangareader do
        quote do
          alias ExMangaDownloadr.MangaReader.IndexPage
          alias ExMangaDownloadr.MangaReader.ChapterPage
          alias ExMangaDownloadr.MangaReader.Page
        end
      end

      def mangafox do
        ...
      end

      defmacro __using__(which) when is_atom(which) do
        apply(__MODULE__, which, [])
      end
    end
And now I can use it like this in the test file:
    defmodule ExMangaDownloadr.MangaReaderTest do
      use ExUnit.Case
      use ExMangaDownloadr, :mangareader
      ...
The special "__using__" macro is called when I "use" a module, and I can even pass arguments to it. The implementation then uses "apply/3" to dynamically call the function that returns the right quoted code. This is exactly how Phoenix imports the proper behaviors for Models, Views, Controllers and the Router, for example:
    defmodule Pxblog.PageController do
      use Pxblog.Web, :controller
      ...
Those definitions are out in the open in a Phoenix project, in the "web/web.ex" module, so I just copied the same behavior. And now I have two fewer files to worry about.
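For anyone who wants to try the dispatch trick in isolation, here is a self-contained sketch. All module names ("Sources", "ReaderClient", "FoxClient" and the two "IndexPage" modules) are hypothetical and exist only for this example; the structure of "__using__/1" follows the code shown earlier.

```elixir
defmodule Sources.MangaReader.IndexPage do
  def name, do: "mangareader index"
end

defmodule Sources.Mangafox.IndexPage do
  def name, do: "mangafox index"
end

defmodule Sources do
  # Each function only returns quoted code; nothing is aliased
  # until __using__/1 injects that code into the calling module.
  def mangareader do
    quote do
      alias Sources.MangaReader.IndexPage
    end
  end

  def mangafox do
    quote do
      alias Sources.Mangafox.IndexPage
    end
  end

  # `use Sources, :mangareader` ends up here with which == :mangareader.
  defmacro __using__(which) when is_atom(which) do
    apply(__MODULE__, which, [])
  end
end

defmodule ReaderClient do
  use Sources, :mangareader
  def index_name, do: IndexPage.name()
end

defmodule FoxClient do
  use Sources, :mangafox
  def index_name, do: IndexPage.name()
end

IO.puts(ReaderClient.index_name())
IO.puts(FoxClient.index_name())
```

The short name "IndexPage" resolves to a different module in each client, depending only on the atom passed to "use".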
Tiny refactorings
In the previous code I used "String.to_atom/1" to convert the string with the module name into an atom, to be used later in "apply/3" calls:
    defp manga_source(source, module) do
      case source do
        "mangareader" -> String.to_atom("Elixir.ExMangaDownloadr.MangaReader.#{module}")
        "mangafox" -> String.to_atom("Elixir.ExMangaDownloadr.Mangafox.#{module}")
      end
    end
I changed it to this:
    "mangareader" -> :"Elixir.ExMangaDownloadr.MangaReader.#{module}"
    "mangafox" -> :"Elixir.ExMangaDownloadr.Mangafox.#{module}"
It's just a shortcut to do the same thing.
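A quick way to convince yourself that the two forms are equivalent is to compare the resulting atoms. In this sketch the "module" value is just a sample string; "Module.concat/2" is not used in the project and is shown only as a third standard-library option for building the same module name.

```elixir
# Sample value; in the tool this comes from the caller.
module = "ChapterPage"

via_function = String.to_atom("Elixir.ExMangaDownloadr.MangaReader.#{module}")
via_literal = :"Elixir.ExMangaDownloadr.MangaReader.#{module}"

# Not part of the original code: Module.concat/2 adds the "Elixir." prefix for you.
via_concat = Module.concat(ExMangaDownloadr.MangaReader, module)

IO.inspect(via_function == via_literal)
IO.inspect(via_literal == via_concat)
```

Whichever form is used, every distinct string creates a new atom, so all of them should only be fed values from a known, fixed set such as the source names above.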
In the parser I was also not using Floki correctly. So take a look at this piece of old code:
    defp fetch_manga_title(html) do
      Floki.find(html, "#mangaproperties h1")
      |> Enum.map(fn {"h1", [], [title]} -> title end)
      |> Enum.at(0)
    end

    defp fetch_chapters(html) do
      Floki.find(html, "#listing a")
      |> Enum.map fn {"a", [{"href", url}], _} -> url end
    end
Now using the better helper functions that Floki provides:
    defp fetch_manga_title(html) do
      html
      |> Floki.find("#mangaproperties h1")
      |> Floki.text
    end

    defp fetch_chapters(html) do
      html
      |> Floki.find("#listing a")
      |> Floki.attribute("href")
    end
This was a case of not reading the documentation as I should have. Much cleaner!
I did other bits of cleanup but I think this should cover the major changes. And finally, I bumped up the version to "1.0.0" as well!
     def project do
       [app: :ex_manga_downloadr,
    -   version: "0.0.1",
    +   version: "1.0.0",
        elixir: "~> 1.1",
And speaking of versions, I'm using Elixir 1.1 but pay attention as Elixir 1.2 is just around the corner and it brings some niceties. For example, that macro that aliased a few modules could be written this way now:
    def mangareader do
      quote do
        alias ExMangaDownloadr.MangaReader.{IndexPage, ChapterPage, Page}
      end
    end
And this is just one feature among many other syntax improvements, plus support for the newest Erlang 18. Keep an eye on both!