Webpage Generation for Lazy People

A demonstration of pandoc

If you are someone like me, you quickly get bored of repetitive tasks, and if they involve computers, you look for ways to automate them. Writing HTML by hand was fun the first few times, but it got boring as the amount of boilerplate per webpage grew. Thankfully, there are numerous utilities that make the lives of lazy people like me easier without being insanely bloated. One such utility is Pandoc.

What is Pandoc?

Pandoc is a Haskell library and a command-line utility that converts between various markup formats. It is often used to convert Markdown files to other formats such as LaTeX, HTML and DOCX.

Markdown?

You might have used Markdown without ever hearing its name. It is widely used to indicate formatting on sites and platforms such as GitHub, Reddit and Discord.

Markdown was originally created to produce text that is easily readable by humans and formattable by machines. You can check out this guide by GitHub to learn the basics of Markdown.

Installing Pandoc

Refer to the instructions here to install Pandoc for your platform. The rest of this guide assumes that you are running a distribution of GNU/Linux, but it should work on Windows and macOS with minor changes, if any.

Modifying the HTML Template

By default, pandoc only creates a “chunk” of HTML that you can add to an existing site. For standalone web pages, there is a standalone mode in which pandoc inserts the HTML equivalent of your Markdown text and various data (date, title, stylesheet names etc.) into an HTML template. You can specify a custom template file to have more control over your page’s layout.

You can modify the default template instead of writing one from scratch, since most people will need few deviations from the defaults. To do so, first tell pandoc to write the default HTML template into a file by typing

pandoc -D html > template.html

into your terminal/command prompt. You can now open the file in your preferred text editor and modify it to your heart’s content.

Pandoc templates have an extensive syntax, but we don’t need to understand all of it.

What I customized

For my template, I modified three things:

  1. I removed the part that loads the default HTML styling, because I already have a stylesheet for my website. (I have since restored this section, as it enables syntax highlighting in code blocks, but the change is still worth mentioning. More on that later.)

      <style>
      $styles.html()$
      </style>

  2. I removed the automatically generated paragraphs for all the information except the title, namely:

    $if(subtitle)$
    <p class="subtitle">$subtitle$</p>
    $endif$
    $for(author)$
    <p class="author">$author$</p>
    $endfor$
    $if(date)$
    <p class="date">$date$</p>
    $endif$

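    I think this is the right time to discuss basic template syntax. In pandoc templates, blocks enclosed in dollar signs are either variables or statements. A variable such as $subtitle$ is replaced by its value. Statements such as $if(date)$ ... $endif$ and $for(author)$ ... $endfor$ include the enclosed text only if the variable is set, or once for each of its values.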

  3. Lastly, I added some common links and info at the end of the template, before the closing body tag:

    <a href="/index.html">Back to the homepage</a>
    <div class="bottom">
    <h3><a href="mailto:berkan.sahin@ug.bilkent.edu.tr">Contact me at:berkan.sahin[at]ug.bilkent.edu.tr</a></h3>
    <h5><a href="/about.html">About Me</a> | <a href="https://github.com/berkan-sahin/">GitHub: berkan-sahin</a></h5>
    $if(date)$
    <h6>Last update: $date$</h6>
    $endif$
    <h6>Generated with <a href="https://pandoc.org">Pandoc</a></h6>
    </div>
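
To make the template syntax from the steps above more concrete, here is a minimal sketch of a complete template, written out from the shell. The file name minimal-template.html is made up for this example, and the layout is only an illustration, not pandoc's default template. It uses plain variables ($pagetitle$, $title$, $body$) together with the two statements seen earlier ($if$ and $for$).

```shell
# Write a tiny pandoc HTML template to a file.
# The quoted 'EOF' keeps the shell from expanding the $...$ markers.
# "minimal-template.html" is a hypothetical name used only in this sketch.
cat > minimal-template.html <<'EOF'
<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8" />
  <title>$pagetitle$</title>
$for(css)$
  <link rel="stylesheet" href="$css$" />
$endfor$
</head>
<body>
<h1>$title$</h1>
$if(date)$
<p class="date">$date$</p>
$endif$
$body$
</body>
</html>
EOF
```

A file like this could then be handed to pandoc with --template=minimal-template.html.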

Metadata blocks

Metadata blocks are one of the most important features of Pandoc. They contain information such as the page title, the author(s), the date and the stylesheet(s) for the page. They are not rendered directly in the final page, but they influence its look and structure by setting variables that are then placed into the template accordingly.

Metadata blocks are usually at the top of the document. They must begin with a line of three hyphens (---) and end with a line of three hyphens (---) or three dots (...). They are written in a language called YAML.

Some examples

Here is the metadata block I used for this page at the time of writing this:

---
title: Webpage Generation for Lazy People
author: Berkan Şahin
date: 02/12/2020
css: 
- /styles.css
---
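
As a rough sketch of how such a block sits in a file, the snippet below writes a small Markdown file with the metadata block at the top. The file name content.md matches the one assumed in the next section; the body text under the block is invented for the example.

```shell
# Create a Markdown file that starts with a YAML metadata block.
# The text below the block is placeholder content for this sketch.
cat > content.md <<'EOF'
---
title: Webpage Generation for Lazy People
author: Berkan Şahin
date: 02/12/2020
css:
- /styles.css
---

## What is Pandoc?

Pandoc converts between markup formats.
EOF

# Show the first line to confirm that the block opens the file.
head -n 1 content.md
```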

Generating the HTML Document

Once you are satisfied with your Markdown file, you can call pandoc to generate the final HTML page.

For this example, suppose that your markdown file is named content.md. We will also assume that your custom template is in the same directory as your markdown file and is named template.html. In this case, you tell pandoc to generate your webpage by typing

pandoc -s content.md --template=template.html -o content.html

Explaining the flags
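
Each flag in the command above has a job: -s (short for --standalone) makes pandoc produce a complete page instead of a fragment, --template=template.html points it at our custom template, and -o (short for --output) names the file to write. As a sketch, here is the same command spelled with the long option names. The shell variable names are made up for readability, and the command only runs if pandoc and both input files are present.

```shell
# Same command as above, written with pandoc's long option names.
# content.md and template.html are the file names assumed in this guide.
input=content.md
template=template.html
output="${input%.md}.html"   # content.md -> content.html

if command -v pandoc >/dev/null 2>&1 && [ -f "$input" ] && [ -f "$template" ]; then
    pandoc --standalone "$input" --template="$template" --output="$output"
else
    # Nothing to run here, so just show what the command would be.
    echo "would run: pandoc --standalone $input --template=$template --output=$output"
fi
```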

Automating the build process with make

You probably noticed that this command is pretty long, and that typing it every time you want to generate a webpage is tedious and contradicts our goals. You are right. Thankfully, we can use tools such as make, which are designed to make the tedious process of building software easier while still being very flexible.

What is make?

The GNU project defines make as “a tool which controls the generation of executables and other non-source files of a program from the program's source files”. Make is mostly used to build large programs from their source code, but we can employ a small fraction of make’s capabilities to make the webpage generation process almost as easy as saying “make me a webpage”.

Getting make

Make is a tool with a relatively long history and therefore has many different implementations. The one we are interested in is GNU make, as it has become the de facto standard among them.

Makefiles

Makefiles are simply the equivalent of recipes, but for programs instead of food. A makefile for a robot that makes a cake would be:

Apple Cake : Apples, flour, sugar, eggs etc.
    preheat oven
    mix flour and eggs
    ...

Using this example, we can examine the general form of a makefile:

target : prerequisites
    recipe

The target is the file to be made (the cake), the prerequisites are the files it is made from (the ingredients), and the recipe is the list of commands that turns one into the other.

In our apple cake example, the specific type of apple that is used did not matter. The recipe wouldn’t change if, say, you used Granny Smith apples instead of Amasya apples. Make actually allows for flexibility like that, and we can utilize this behavior to write a generic makefile for any pandoc-compliant Markdown file.

With some ingenuity and a look at the GNU Make documentation, we arrive at this generic form, which you can download here.

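# $< is the first prerequisite (the .md file), $@ is the target (the .html file).
# The recipe line must start with a tab character, not spaces.
%.html : %.md template.html
	pandoc -s $< --template=template.html -o $@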

This rule builds x.html from x.md using template.html as a template, where x is any valid filename sans extension, provided that x.md exists. Because template.html is listed as a prerequisite, make also rebuilds the page when the template changes.

Please keep in mind that your makefile should be named Makefile, which is one of the names make looks for by default, and that it must be stored in the directory where make is run. Now you can type make mypage.html and your computer will automatically generate mypage.html from mypage.md, saving precious minutes which can be used for other activities (e.g. sleeping). Magic, right?
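
If you have several pages, a small shell loop can ask make for each of them in turn. This is only a sketch: the function names target_for and build_all and the DRY_RUN switch are made up for this example and are not part of make or pandoc.

```shell
# Map a Markdown file name to its make target, mirroring %.md -> %.html.
target_for() {
    printf '%s\n' "${1%.md}.html"
}

# Ask make to build the page for every .md file in the current directory.
# With DRY_RUN=1 the make commands are only printed, not run.
build_all() {
    for src in *.md; do
        [ -e "$src" ] || continue   # skip the literal "*.md" when nothing matches
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "make $(target_for "$src")"
        else
            make "$(target_for "$src")"
        fi
    done
}
```

Call build_all from the directory that holds the Makefile. Since make only rebuilds targets that are older than their prerequisites, pages that are already up to date are left alone.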

Further reading

[06/05/2021]: It’s been 4 months and this tutorial is still not complete, I know. This method will probably be obsoleted by panbash in a month or so anyway. Maybe I’ll adapt this tutorial to that software instead (It will be a lot easier and less janky than my Makefile solution). Stay tuned for that!

Contact me at: berkan[at]bsahin.xyz

Last update: 04/01/2021
Generated with Pandoc