To make this static website render correctly on both desktop and mobile, I’ve decided to ‘upgrade’ my setup to use the Bulma CSS framework. This introduced a problem I did not anticipate.
For example, consider the following “raw” HTML tag to create a level 1 title:
<h1>Title</h1>
However, in Bulma, headings must be of a specific class, like so 1:
<!-- Level-1 title -->
<h1 class="title is-1">Title</h1>
<!-- Level-2 title -->
<h2 class="title is-2">Title</h2>
<!-- Level-1 subtitle -->
<h1 class="subtitle is-1">Title</h1>
Problem is, a lot of headings included on my website are generated from Markdown to HTML using Pandoc. Predictably, Markdown headings like # Title are converted to “raw” HTML headings like <h1>Title</h1>, and not the <h1 class="title is-1">Title</h1> that I need to use.
This is a textbook example of a problem that can be solved with a Pandoc filter.
During the conversion from Markdown to HTML, Pandoc constructs an abstract syntax tree representing the document. A Pandoc filter applies a transformation to this abstract syntax tree. This is precisely what we want: to transform headings into a slightly different kind of heading that plays nicely with Bulma.
There are some examples in the Pandoc documentation on filters, but I would like to document the process I used to create this filter.
We’ll be writing the filter in Haskell, because I can then include it directly in the website code generation (more info here).
The Pandoc abstract syntax tree
We need to familiarize ourselves with the Pandoc abstract syntax tree (AST). This is defined in the pandoc-types package, most importantly in the Text.Pandoc.Definition module (see here).
We’re using Haskell, so let’s look at the data types. A Pandoc document is converted from some source format (in our case, Markdown) to the Pandoc type:
data Pandoc = Pandoc Meta [Block]
Without looking at the details, we can see that a document is a list of blocks as well as some metadata. The Block datatype is more interesting:
data Block
    = Plain [Inline]            -- ^ Plain text, not a paragraph
    | Para [Inline]             -- ^ Paragraph
    ...                         -- (omitted)
    | Header Int Attr [Inline]  -- ^ Header - level (integer) and text (inlines)
    ...                         -- (omitted)
(source here)
There we go! One of the possible types of blocks is a header. This header has a level (a level 1 header is the largest title), some attributes, and [Inline] represents the content of the header. We’re interested in modifying the header attributes, so let’s look at Attr:
-- | Attributes: identifier, classes, key-value pairs
type Attr = ( String              -- Identifier. Not important
            , [String]            -- ^ class (e.g. ["a", "b"] -> class="a b" in HTML)
            , [(String, String)]) -- Not important
The “classes” part of the attribute is precisely what we’d like to modify. Recall that to get Bulma to work, we want to have headings looking like <h3 class="title is-3">Title</h3>.
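To make the mapping from Attr to HTML concrete, here is a minimal sketch that renders a heading by hand. It uses only base; renderHeading and the local Attr alias are hypothetical stand-ins for illustration, not Pandoc's actual HTML writer, and key-value pairs are ignored.

```haskell
-- Local stand-in mirroring the String-based Attr shown above.
type Attr = (String, [String], [(String, String)])

-- Hypothetical helper, not Pandoc's HTML writer: renders a heading by hand
-- to show where each part of Attr ends up. Key-value pairs are ignored, and
-- the content is assumed to be already-escaped text.
renderHeading :: Int -> Attr -> String -> String
renderHeading level (identifier, classes, _keyvals) content =
    "<" <> tag <> idAttr <> classAttr <> ">" <> content <> "</" <> tag <> ">"
  where
    tag = "h" <> show level
    -- An empty identifier produces no id attribute at all.
    idAttr
        | null identifier = ""
        | otherwise       = " id=\"" <> identifier <> "\""
    -- The class list is joined with spaces, as in class="a b".
    classAttr
        | null classes = ""
        | otherwise    = " class=\"" <> unwords classes <> "\""
```

With this helper, renderHeading 3 ("", ["title", "is-3"], []) "Title" evaluates to <h3 class="title is-3">Title</h3>, which is exactly the shape Bulma expects.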
Modifying one AST node
Let’s write a function that modifies a single Block (i.e. one tree node) the way we want 2:
-- This is from the pandoc-types package
import Text.Pandoc.Definition (Block(..), Attr)

toBulmaHeading :: Block -> Block
-- Pattern matching on the input
-- Any Block that is actually a header should be changed
toBulmaHeading (Header level attrs xs) = Header level newAttrs xs
    where
        (identifier, classes, keyvals) = attrs
        -- We leave identifier and key-value pairs unchanged
        newAttrs = ( identifier
                   -- We extend header classes to have the Bulma classes "title" and "is-*"
                   -- where * is the header level
                   , classes <> ["title", "is-" <> show level]
                   , keyvals)
-- We leave any non-header blocks unchanged
toBulmaHeading x = x
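One caveat if you try this with a recent toolchain: as far as I know, since pandoc-types 1.20 the fields of Attr are Text rather than String, so show level no longer fits in directly. Here is a minimal sketch of the same function under that assumption. Using OverloadedStrings for the literals is my choice for this sketch, not something the setup above requires.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text as T
import Text.Pandoc.Definition (Block (..))

-- Assumes pandoc-types >= 1.20, where Attr is (Text, [Text], [(Text, Text)]).
toBulmaHeading :: Block -> Block
toBulmaHeading (Header level (identifier, classes, keyvals) xs) =
    Header level newAttrs xs
  where
    -- Append the Bulma classes; identifier and key-value pairs are untouched.
    newAttrs =
        ( identifier
        , classes <> ["title", "is-" <> T.pack (show level)]
        , keyvals
        )
-- Any non-header block is returned unchanged.
toBulmaHeading x = x
```

Existing classes are kept and the Bulma ones are appended after them, so a header that already had the class "old" at level 2 ends up with ["old", "title", "is-2"].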
Modifying the entire AST
All we need now is to traverse the entire syntax tree, and modify every block according to the toBulmaHeading function. This is trivial using the Text.Pandoc.Walk.walk function (also from pandoc-types). Thanks to typeclasses, walk works on many types, but the one specialization I’m looking for is:
walk :: (Block -> Block) -- ^ A function that modifies the abstract syntax tree
     -> Pandoc           -- ^ A syntax tree
     -> Pandoc           -- ^ Our modified syntax tree
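To see that walk really reaches every block and not only the top-level ones, here is a small sketch on a hand-built document. The demote transformation, sampleDoc and headerLevels are hypothetical names made up for this illustration; query is the companion function of walk in Text.Pandoc.Walk, used here only to inspect the result.

```haskell
import Text.Pandoc.Definition (Block (..), Pandoc (..), nullAttr, nullMeta)
import Text.Pandoc.Walk (query, walk)

-- Hypothetical transformation, used only to make the traversal visible:
-- push every header one level down.
demote :: Block -> Block
demote (Header level attrs xs) = Header (level + 1) attrs xs
demote x = x

-- A hand-built document with one top-level header and one header nested
-- inside a block quote. Header text is left empty to keep the sketch short.
sampleDoc :: Pandoc
sampleDoc =
    Pandoc nullMeta
        [ Header 1 nullAttr []
        , BlockQuote [Header 2 nullAttr []]
        ]

-- Collect the level of every header, wherever it sits in the tree.
headerLevels :: Pandoc -> [Int]
headerLevels = query level
  where
    level (Header n _ _) = [n]
    level _              = []
```

Here headerLevels sampleDoc gives [1,2], and headerLevels (walk demote sampleDoc) gives [2,3]: the header inside the block quote was transformed too, without any explicit recursion on our side.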
Our filter then becomes:
-- This is from the pandoc-types package
import Text.Pandoc.Definition (Pandoc, Block(..), Attr)
import Text.Pandoc.Walk (walk)

toBulmaHeading :: Block -> Block
toBulmaHeading (Header level attrs xs) = Header level newAttrs xs
    where
        (identifier, classes, keyvals) = attrs
        -- We leave identifier and key-value pairs unchanged
        newAttrs = ( identifier
                   -- We extend header classes to have the Bulma classes "title" and "is-*"
                   -- where * is the header level
                   , classes <> ["title", "is-" <> show level]
                   , keyvals)
-- We leave any non-header blocks unchanged
toBulmaHeading x = x

-- | Pandoc filter that changes headings to play nicely with Bulma
bulmaHeadingTransform :: Pandoc -> Pandoc
bulmaHeadingTransform = walk toBulmaHeading
Hooking into Hakyll
To include this filter in my Hakyll pipeline, I only need to provide this filter to the pandocCompilerWithTransform function. Hakyll will then apply the Pandoc filter after the AST has been generated from Markdown, but before HTML rendering happens.
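As a rough sketch of what that wiring can look like, assuming Hakyll's default reader and writer options are good enough: compilerWithFilter is a hypothetical name, and it takes the transformation as an argument so that it works with any Pandoc -> Pandoc function, including bulmaHeadingTransform.

```haskell
import Hakyll
    ( Compiler
    , Item
    , defaultHakyllReaderOptions
    , defaultHakyllWriterOptions
    , pandocCompilerWithTransform
    )
import Text.Pandoc.Definition (Pandoc)

-- Hypothetical name: a stand-in for Hakyll's pandocCompiler that runs the
-- given transformation between parsing and HTML rendering.
-- Assumption: the default Hakyll reader and writer options are what you want.
compilerWithFilter :: (Pandoc -> Pandoc) -> Compiler (Item String)
compilerWithFilter =
    pandocCompilerWithTransform
        defaultHakyllReaderOptions
        defaultHakyllWriterOptions
```

In a Hakyll rule, this would be used wherever pandocCompiler normally appears, for example compile (compilerWithFilter bulmaHeadingTransform), followed by the usual template steps.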
If you want to know how to integrate all of this you can shoot me an e-mail.
Closing remarks
I hope this example has shown you the process behind writing Pandoc filters. Without modifying the content of my posts, I have been able to integrate Bulma in my static website.
I could also have done it by replacing Markdown headers with inline HTML. However, this would have been less fun.
You can take a look at the source code used to generate this website.