How I blog about math: an update on KaTeX with Jekyll
When I started this blog 4 years ago, I picked a blog template that felt nice, and since this template was using the Jekyll blog engine, I picked it as well. I didn’t know much about Jekyll (nor about Ruby, the language it’s implemented in), but it’s the engine powering GitHub Pages, so it seemed like a safe bet.
I must say that it was a good choice that didn’t require much maintenance, as I use Jekyll only when writing new posts. Once ready to publish, I generate HTML files to upload on my server, on which no Jekyll runs. So I don’t have to deal with any Jekyll maintenance when I’m not writing.
Yet, I recently decided to update my development setup, thereby migrating to the latest version of Jekyll (currently 4.0.1).
This migration broke my integration with KaTeX, a library that lets you seamlessly insert math formulas written in TeX syntax into web pages.
My blog isn’t full of math formulas, but I occasionally use them, for example for cryptography.
Here is an example formula rendered with KaTeX (if it doesn’t render well in your browser please send me feedback).
Given that I had to tweak a few things when migrating to the latest Jekyll version, I think it’s a good opportunity to present what’s behind the scenes of my blog, and how you can use KaTeX too.
About Markdown and TeX
My blog posts are written in Markdown, a lightweight markup language that’s become ubiquitous nowadays. But there are multiple engines to convert Markdown into HTML code that can be displayed in a browser. For my blog, I use kramdown, because it was the default for the theme I picked.
What does writing math formulas in a blog post look like?
With kramdown, inserting a math formula into my blog is quite simple: write your formula in TeX syntax and put it between a pair of double dollar signs ($$).
Here is an example of Markdown code, using the formula on KaTeX’s home page (a Fourier transform).
You can insert some formula $$f(x) = \int_{-\infty}^\infty \hat f(\xi)\,e^{2 \pi i \xi x} \,d\xi$$ in the middle of a sentence.
You can also put large formulas in their own paragraph.
$$f(x) = \int_{-\infty}^\infty \hat f(\xi)\,e^{2 \pi i \xi x} \,d\xi$$
That's it!
This will render as follows.
You can insert some formula in the middle of a sentence.
You can also put large formulas in their own paragraph.
That’s it!
As you can see, the rendering is strikingly similar to a PDF written in LaTeX, except that here everything happens in the browser without any PDF.
How does KaTeX work?
KaTeX is an engine to convert math formulas written in the TeX syntax as above into nicely rendered symbols. This conversion consists of two steps.
- Converting the formula into HTML tags. This is implemented with JavaScript.
- Rendering the HTML into a set of nice-looking symbols. This is implemented with CSS and some specific fonts to render the symbols.
The second step happens entirely in the browser on the client side.
I’ll now focus on the first step, which I had to tweak when upgrading to the newer Jekyll. As described in KaTeX’s documentation, creating the HTML tags is done in JavaScript with code like the following.
katex.render("c = \\pm\\sqrt{a^2 + b^2}", element, {
    throwOnError: false
});
You can either choose to integrate this JavaScript with Node.js on the server side, or directly load it in the browser. Since I’m already using Jekyll to render my blog, Node.js wouldn’t be very convenient, so I used the browser integration.
Converting formulas to HTML with kramdown
To use the kramdown parser in Jekyll, I simply put the following line in my _config.yml.
markdown: kramdown
Up until version 2.1.0, kramdown would generate a simple HTML script wrapping each math formula.
For example, $$2 + 2$$ would generate the following code.
<script type="math/tex">2 + 2</script>
However, since version 2.2.0, in particular due to this commit, the output of kramdown has changed and now looks like the following.
\(2 + 2\)
It turns out that by default, kramdown generates math code suitable for MathJax, an alternative to KaTeX.
But in order to be compatible with recent versions of MathJax, kramdown doesn’t wrap formulas into nice script tags anymore.
I hadn’t previously paid attention to this, as my template came with a JavaScript shim to transform the math/tex scripts into KaTeX on the client side.
// Copyright (c) 2014 Rohan Chandra
// SPDX-License-Identifier: MIT
var elements = document.getElementsByTagName('script')
Array.prototype.forEach.call(elements, function(element) {
    if (element.type.indexOf('math/tex') != -1) {
        // Extract math markdown
        var textToRender = element.innerText || element.textContent;
        // Create span for KaTeX
        var katexElement = document.createElement('span');
        // Support inline and display math
        if (element.type.indexOf('mode=display') != -1){
            katexElement.className += "math-display";
            textToRender = '\\displaystyle {' + textToRender + '}';
        } else {
            katexElement.className += "math-inline";
        }
        katex.render(textToRender, katexElement);
        element.parentNode.insertBefore(katexElement, element);
    }
});
As you can see, this was looking for <script type="math/tex"> tags in the HTML, and replacing the raw formula with symbols via katex.render.
But modifying this shim to look for matching \( and \) instead would be quite cumbersome.
What if, for example, some unrelated code in your blog legitimately contains this combination of parentheses?
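To make the problem concrete, here is a minimal sketch of what such a delimiter-based scan could look like. The findInlineMath helper and the two sample strings are hypothetical, written only for this illustration; they are not part of KaTeX, kramdown or my template.

```javascript
// Hypothetical helper (not part of KaTeX or kramdown): collect every
// "\( ... \)" span in a piece of text, the way a naive shim would.
function findInlineMath(text) {
  const pattern = /\\\((.+?)\\\)/g;
  const formulas = [];
  for (const match of text.matchAll(pattern)) {
    formulas.push(match[1]);
  }
  return formulas;
}

// A genuine formula, in the \( ... \) form emitted by recent kramdown versions.
const mathParagraph = "The sum \\(2 + 2\\) is easy.";
// Unrelated prose explaining how to escape parentheses in a regex.
const regexParagraph = "Escape them as \\( and \\) in your pattern.";

console.log(findInlineMath(mathParagraph));  // one match: "2 + 2"
console.log(findInlineMath(regexParagraph)); // one spurious match: " and "
```

The second call shows the false positive: the scan has no way to tell that the text between the two delimiters is ordinary prose, so a shim built on it would hand " and " to katex.render.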
On-server rendering with kramdown-math-katex
I also realized that kramdown comes with KaTeX integration, so I decided to try that instead of hacking a bridge between MathJax and KaTeX.
Turning on the KaTeX engine is done by adding the following to your Jekyll _config.yml.
kramdown:
  math_engine: katex
On the Ruby side, this requires installing the kramdown-math-katex package, e.g. with gem install kramdown-math-katex.
This package in turn requires a JavaScript engine of your choice.
When targeting the KaTeX math engine directly, kramdown calls the server-side rendering API of KaTeX, which generates the HTML tags to insert in the page. For example, the generated HTML for our formula $$2 + 2$$ looks like the following.
<span class="katex-display">
<span class="katex">
<span class="katex-mathml">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<semantics>
<mrow>
<mn>2</mn>
<mo>+</mo>
<mn>2</mn>
</mrow>
<annotation encoding="application/x-tex">2 + 2</annotation>
</semantics>
</math>
</span>
<span class="katex-html" aria-hidden="true">
<span class="base">
<span class="strut" style="height:0.72777em;vertical-align:-0.08333em;"></span>
<span class="mord">2</span>
<span class="mspace" style="margin-right:0.2222222222222222em;"></span>
<span class="mbin">+</span>
<span class="mspace" style="margin-right:0.2222222222222222em;"></span>
</span>
<span class="base">
<span class="strut" style="height:0.64444em;vertical-align:0em;"></span>
<span class="mord">2</span>
</span>
</span>
</span>
</span>
The size of the HTML generated for such a simple formula can be surprising. It’s organized as follows.
- A <span class="katex-mathml"> block that contains a MathML description of the formula. It contains the semantics of the formula, but not the rendering itself. It also contains an annotation tag containing the raw TeX formula that we wrote.
- A <span class="katex-html" aria-hidden="true"> block that contains a set of HTML elements that form the actual rendering of the formula. These are decorated with CSS classes, which select the suitable style and fonts. Some elements also have explicit styling, especially regarding layout (size, margins, alignment).
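One practical consequence of this layout is that the original TeX source stays recoverable from the generated page. As a small sketch, assuming the annotation tag has exactly the encoding attribute shown in the example, a hypothetical extractTexSource helper could read it back with a regular expression:

```javascript
// Hypothetical helper: pull the original TeX source out of pre-rendered
// KaTeX HTML by reading the content of the <annotation> tag.
function extractTexSource(html) {
  const match = html.match(
    /<annotation encoding="application\/x-tex">([\s\S]*?)<\/annotation>/
  );
  return match ? match[1] : null;
}

// Shortened version of the generated HTML for the formula 2 + 2.
const rendered =
  '<span class="katex"><span class="katex-mathml"><math><semantics>' +
  "<mrow><mn>2</mn><mo>+</mo><mn>2</mn></mrow>" +
  '<annotation encoding="application/x-tex">2 + 2</annotation>' +
  "</semantics></math></span></span>";

console.log(extractTexSource(rendered));      // "2 + 2"
console.log(extractTexSource("<p>text</p>")); // null, no formula found
```

Keep in mind that the annotation content is HTML-escaped, so a formula containing characters such as < or & would need to be unescaped after extraction.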
As another example, the Fourier transform at the top of this page generates the following HTML.
You can see that it creates a lot of span tags.
<span class="katex-display">
<span class="katex">
<span class="katex-mathml">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<semantics>
<mrow>
<mi>f</mi>
<mo stretchy="false">(</mo>
<mi>x</mi>
<mo stretchy="false">)</mo>
<mo>=</mo>
<msubsup>
<mo>∫</mo>
<mrow>
<mo>−</mo>
<mi mathvariant="normal">∞</mi>
</mrow>
<mi mathvariant="normal">∞</mi>
</msubsup>
<mover accent="true">
<mi>f</mi>
<mo>^</mo>
</mover>
<mo stretchy="false">(</mo>
<mi>ξ</mi>
<mo stretchy="false">)</mo>
<mtext> </mtext>
<msup>
<mi>e</mi>
<mrow>
<mn>2</mn>
<mi>π</mi>
<mi>i</mi>
<mi>ξ</mi>
<mi>x</mi>
</mrow>
</msup>
<mtext> </mtext>
<mi>d</mi>
<mi>ξ</mi>
</mrow>
<annotation encoding="application/x-tex">f(x) = \int_{-\infty}^\infty \hat f(\xi)\,e^{2 \pi i \xi x} \,d\xi</annotation>
</semantics>
</math>
</span>
<span class="katex-html" aria-hidden="true">
<span class="base">
<span class="strut" style="height:1em;vertical-align:-0.25em;"></span>
<span class="mord mathdefault" style="margin-right:0.10764em;">f</span>
<span class="mopen">(</span>
<span class="mord mathdefault">x</span>
<span class="mclose">)</span>
<span class="mspace" style="margin-right:0.2777777777777778em;"></span>
<span class="mrel">=</span>
<span class="mspace" style="margin-right:0.2777777777777778em;"></span>
</span>
<span class="base">
<span class="strut" style="height:2.384573em;vertical-align:-0.970281em;"></span>
<span class="mop">
<span class="mop op-symbol large-op" style="margin-right:0.44445em;position:relative;top:-0.0011249999999999316em;">∫</span>
<span class="msupsub">
<span class="vlist-t vlist-t2">
<span class="vlist-r">
<span class="vlist" style="height:1.414292em;">
<span style="top:-1.7880500000000001em;margin-left:-0.44445em;margin-right:0.05em;">
<span class="pstrut" style="height:2.7em;"></span>
<span class="sizing reset-size6 size3 mtight">
<span class="mord mtight">
<span class="mord mtight">−</span>
<span class="mord mtight">∞</span>
</span>
</span>
</span>
<span style="top:-3.8129000000000004em;margin-right:0.05em;">
<span class="pstrut" style="height:2.7em;"></span>
<span class="sizing reset-size6 size3 mtight">
<span class="mord mtight">∞</span>
</span>
</span>
</span>
<span class="vlist-s"></span>
</span>
<span class="vlist-r">
<span class="vlist" style="height:0.970281em;">
<span></span>
</span>
</span>
</span>
</span>
</span>
<span class="mspace" style="margin-right:0.16666666666666666em;"></span>
<span class="mord accent">
<span class="vlist-t vlist-t2">
<span class="vlist-r">
<span class="vlist" style="height:0.9578799999999998em;">
<span style="top:-3em;">
<span class="pstrut" style="height:3em;"></span>
<span class="mord mathdefault" style="margin-right:0.10764em;">f</span>
</span>
<span style="top:-3.26344em;">
<span class="pstrut" style="height:3em;"></span>
<span class="accent-body" style="left:-0.08332999999999999em;">^</span>
</span>
</span>
<span class="vlist-s"></span>
</span>
<span class="vlist-r">
<span class="vlist" style="height:0.19444em;">
<span></span>
</span>
</span>
</span>
</span>
<span class="mopen">(</span>
<span class="mord mathdefault" style="margin-right:0.04601em;">ξ</span>
<span class="mclose">)</span>
<span class="mspace" style="margin-right:0.16666666666666666em;"></span>
<span class="mord">
<span class="mord mathdefault">e</span>
<span class="msupsub">
<span class="vlist-t">
<span class="vlist-r">
<span class="vlist" style="height:0.8991079999999999em;">
<span style="top:-3.1130000000000004em;margin-right:0.05em;">
<span class="pstrut" style="height:2.7em;"></span>
<span class="sizing reset-size6 size3 mtight">
<span class="mord mtight">
<span class="mord mtight">2</span>
<span class="mord mathdefault mtight" style="margin-right:0.03588em;">π</span>
<span class="mord mathdefault mtight">i</span>
<span class="mord mathdefault mtight" style="margin-right:0.04601em;">ξ</span>
<span class="mord mathdefault mtight">x</span>
</span>
</span>
</span>
</span>
</span>
</span>
</span>
</span>
<span class="mspace" style="margin-right:0.16666666666666666em;"></span>
<span class="mord mathdefault">d</span>
<span class="mord mathdefault" style="margin-right:0.04601em;">ξ</span>
</span>
</span>
</span>
</span>
A third example is a formula including a square root, like $$\sqrt{2}$$. The square root line is generated as an embedded SVG.
<span class="katex-display">
<span class="katex">
<span class="katex-mathml">
<math xmlns="http://www.w3.org/1998/Math/MathML">
<semantics>
<mrow>
<msqrt>
<mn>2</mn>
</msqrt>
</mrow>
<annotation encoding="application/x-tex">\sqrt{2}</annotation>
</semantics>
</math>
</span>
<span class="katex-html" aria-hidden="true">
<span class="base">
<span class="strut" style="height:1.04em;vertical-align:-0.08390500000000001em;"></span>
<span class="mord sqrt">
<span class="vlist-t vlist-t2">
<span class="vlist-r">
<span class="vlist" style="height:0.956095em;">
<span class="svg-align" style="top:-3em;">
<span class="pstrut" style="height:3em;"></span>
<span class="mord" style="padding-left:0.833em;">
<span class="mord">2</span>
</span>
</span>
<span style="top:-2.916095em;">
<span class="pstrut" style="height:3em;"></span>
<span class="hide-tail" style="min-width:0.853em;height:1.08em;">
<svg width='400em' height='1.08em' viewBox='0 0 400000 1080' preserveAspectRatio='xMinYMin slice'>
<path d='M95,702
c-2.7,0,-7.17,-2.7,-13.5,-8c-5.8,-5.3,-9.5,-10,-9.5,-14
c0,-2,0.3,-3.3,1,-4c1.3,-2.7,23.83,-20.7,67.5,-54
c44.2,-33.3,65.8,-50.3,66.5,-51c1.3,-1.3,3,-2,5,-2c4.7,0,8.7,3.3,12,10
s173,378,173,378c0.7,0,35.3,-71,104,-213c68.7,-142,137.5,-285,206.5,-429
c69,-144,104.5,-217.7,106.5,-221
l0 -0
c5.3,-9.3,12,-14,20,-14
H400000v40H845.2724
s-225.272,467,-225.272,467s-235,486,-235,486c-2.7,4.7,-9,7,-19,7
c-6,0,-10,-1,-12,-3s-194,-422,-194,-422s-65,47,-65,47z
M834 80h400000v40h-400000z'/>
</svg>
</span>
</span>
</span>
<span class="vlist-s"></span>
</span>
<span class="vlist-r">
<span class="vlist" style="height:0.08390500000000001em;">
<span></span>
</span>
</span>
</span>
</span>
</span>
</span>
</span>
</span>
Effect on the HTML size
As we have seen, each formula is rendered into a lot of HTML code.
Even though in reality this HTML is minified without spaces (I added indentation in the previous section for readability), you may still wonder whether it’s worth it compared to the setup with simple script tags and rendering in the browser.
One thing to note is that the browser will have roughly the same DOM representation in both cases, so the benefit of a more compact HTML file is only in the transmission from the server to the browser. Also, even if the pre-rendered HTML contains a lot of code, there is a lot of redundancy, so it can compress relatively well with gzip, and most servers will indeed use some compression mechanism in the HTTP layer.
To quantify the difference, I measured the size of the HTML files generated for my blog before and after this change, and also measured their compressed versions.
Post | Before (bytes) | After (bytes) | Before (gzipped, bytes) | After (gzipped, bytes) |
---|---|---|---|---|
welcome.html | 5912 | 5910 | 1926 | 1924 |
pdf-basics.html | 23731 | 23850 | 7738 | 7738 |
pdf-parsing-pitfalls.html | 27050 | 28487 | 8706 | 9035 |
pdf-encryption.html | 34465 | 81272 | 8732 | 11025 |
caradoc-version-0-3.html | 20236 | 20517 | 6322 | 6321 |
pdf-graphics.html | 23826 | 23911 | 7633 | 7617 |
plaidctf-2017-fhe.html | 19684 | 83436 | 5326 | 9259 |
conferences.html | 11334 | 11335 | 4007 | 4001 |
dpkt-parsing-http2.html | 23939 | 24232 | 5663 | 5690 |
googlectf-2017-rubik.html | 42859 | 108081 | 8690 | 11891 |
rubiks-cube.html | 9864 | 12237 | 3066 | 3426 |
rust-vs-cpp-ocaml-part1.html | 37583 | 37782 | 9174 | 9156 |
rust-vs-cpp-ocaml-part2.html | 46317 | 48007 | 10065 | 10437 |
rust-best-features-of-2017-and-beyond.html | 41409 | 42910 | 10331 | 10647 |
rust-advent-of-code.html | 46796 | 47232 | 11878 | 11912 |
profiling-rust-docker-perf.html | 46806 | 49122 | 13424 | 13626 |
rust-2020.html | 26850 | 29708 | 8213 | 8627 |
As you can see, even if the raw HTML size increased for some pages – roughly 4 times for the most math-heavy page – the increase is much more modest in the gzipped versions. Some pages are even smaller after gzip compression!
In any case, the gzipped HTML on my blog just exceeds 10 KB for the larger pages, so the bandwidth requirements are quite modest anyway. As a comparison point, font files used by KaTeX are generally several KB each.
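For readers who want to run a similar comparison, here is a minimal Node.js sketch of the idea. The measure helper and the two synthetic strings are assumptions made for this illustration, not the files or the exact method behind the table above. Repeating one formula 50 times is also an extreme case of redundancy, so real pages will compress less dramatically.

```javascript
const zlib = require("node:zlib");

// Return the raw and gzipped sizes (in bytes) of an HTML string.
function measure(html) {
  const raw = Buffer.byteLength(html, "utf8");
  const gzipped = zlib.gzipSync(html).length;
  return { raw, gzipped };
}

// Stand-ins for real pages (assumption): the same formula repeated 50 times,
// once as a compact script tag and once as verbose pre-rendered spans.
const compact = '<script type="math/tex">2 + 2</script>'.repeat(50);
const prerendered = (
  '<span class="katex"><span class="base">' +
  '<span class="mord">2</span><span class="mbin">+</span>' +
  '<span class="mord">2</span></span></span>'
).repeat(50);

const before = measure(compact);
const after = measure(prerendered);
console.log("before:", before);
console.log("after:", after);
```

To measure actual pages, the synthetic strings can be replaced with the contents of the generated HTML files, read for example with fs.readFileSync.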
KaTeX integration in the browser
As mentioned above, once we have obtained the HTML tags that represent a formula, via the katex.render API, there is still some final rendering happening in the browser.
This is done via some CSS to lay out the formula, as well as various fonts to render math symbols.
In this section, I’ll explain how to add these resources to relevant blog posts.
Which resources to add?
In KaTeX’s documentation, the starter template includes links to CSS and JavaScript from a CDN. There are two changes that we can make to this basic setup.
- Given that we already convert the formulas into HTML when generating the site, we don’t need the JavaScript. The script was only useful to actively call katex.render in the page, but we now generate the HTML ahead of time.
- Instead of using a CDN, I like to include all resources directly on my website. There are trade-offs here, but the main benefit for me is to make sure they are always available locally when I’m writing my blog, even if I don’t have an Internet connection, or not a good one. So I can write on my blog even when traveling in a train or a plane for example. The other benefit is that there is potentially less maintenance, as I don’t have to worry about the CDN shutting down or something like that. If I don’t touch it, my blog remains functional.
To obtain the relevant resources, I downloaded the latest release of KaTeX from GitHub.
I only kept the katex.min.css file, as well as the fonts/ folder.
$ tree assets/plugins/katex.0.11.1/
assets/plugins/katex.0.11.1/
├── fonts
│ ├── KaTeX_AMS-Regular.ttf
│ ├── KaTeX_AMS-Regular.woff
│ ├── KaTeX_AMS-Regular.woff2
│ ├── KaTeX_Caligraphic-Bold.ttf
│ ├── KaTeX_Caligraphic-Bold.woff
│ ├── KaTeX_Caligraphic-Bold.woff2
│ ├── KaTeX_Caligraphic-Regular.ttf
│ ├── KaTeX_Caligraphic-Regular.woff
│ ├── KaTeX_Caligraphic-Regular.woff2
│ ├── KaTeX_Fraktur-Bold.ttf
│ ├── KaTeX_Fraktur-Bold.woff
│ ├── KaTeX_Fraktur-Bold.woff2
│ ├── KaTeX_Fraktur-Regular.ttf
│ ├── KaTeX_Fraktur-Regular.woff
│ ├── KaTeX_Fraktur-Regular.woff2
│ ├── KaTeX_Main-BoldItalic.ttf
│ ├── KaTeX_Main-BoldItalic.woff
│ ├── KaTeX_Main-BoldItalic.woff2
│ ├── KaTeX_Main-Bold.ttf
│ ├── KaTeX_Main-Bold.woff
│ ├── KaTeX_Main-Bold.woff2
│ ├── KaTeX_Main-Italic.ttf
│ ├── KaTeX_Main-Italic.woff
│ ├── KaTeX_Main-Italic.woff2
│ ├── KaTeX_Main-Regular.ttf
│ ├── KaTeX_Main-Regular.woff
│ ├── KaTeX_Main-Regular.woff2
│ ├── KaTeX_Math-BoldItalic.ttf
│ ├── KaTeX_Math-BoldItalic.woff
│ ├── KaTeX_Math-BoldItalic.woff2
│ ├── KaTeX_Math-Italic.ttf
│ ├── KaTeX_Math-Italic.woff
│ ├── KaTeX_Math-Italic.woff2
│ ├── KaTeX_SansSerif-Bold.ttf
│ ├── KaTeX_SansSerif-Bold.woff
│ ├── KaTeX_SansSerif-Bold.woff2
│ ├── KaTeX_SansSerif-Italic.ttf
│ ├── KaTeX_SansSerif-Italic.woff
│ ├── KaTeX_SansSerif-Italic.woff2
│ ├── KaTeX_SansSerif-Regular.ttf
│ ├── KaTeX_SansSerif-Regular.woff
│ ├── KaTeX_SansSerif-Regular.woff2
│ ├── KaTeX_Script-Regular.ttf
│ ├── KaTeX_Script-Regular.woff
│ ├── KaTeX_Script-Regular.woff2
│ ├── KaTeX_Size1-Regular.ttf
│ ├── KaTeX_Size1-Regular.woff
│ ├── KaTeX_Size1-Regular.woff2
│ ├── KaTeX_Size2-Regular.ttf
│ ├── KaTeX_Size2-Regular.woff
│ ├── KaTeX_Size2-Regular.woff2
│ ├── KaTeX_Size3-Regular.ttf
│ ├── KaTeX_Size3-Regular.woff
│ ├── KaTeX_Size3-Regular.woff2
│ ├── KaTeX_Size4-Regular.ttf
│ ├── KaTeX_Size4-Regular.woff
│ ├── KaTeX_Size4-Regular.woff2
│ ├── KaTeX_Typewriter-Regular.ttf
│ ├── KaTeX_Typewriter-Regular.woff
│ └── KaTeX_Typewriter-Regular.woff2
└── katex.min.css
1 directory, 61 files
Adding resources to a blog post
HTML pages for my blog posts look like the following (using Liquid templates).
<!DOCTYPE html>
<html>
  {% include head-blog.html %}
  <body>
    {% include navbar.html %}
    <div class="page-content">
      {{ content }}
    </div>
    {% include footer-blog.html %}
  </body>
</html>
This page depends on the _includes/head-blog.html template, which contains the <head></head> section of the page, itself containing links to the various resources.
To integrate KaTeX, I added the relevant resources to this section.
As mentioned previously, I only need to include the CSS (the font files are automatically referenced from it).
I also made it conditional on a katex property of the current page.
<head>
  ...
  {% if page.katex %}
  <!-- KaTeX -->
  <link rel="stylesheet" href="/assets/plugins/katex.0.11.1/katex.min.css">
  {% endif %}
</head>
Last, I can enable KaTeX in a blog post by setting the katex: True property in the post’s header, as follows.
This will include links to the relevant resources.
---
layout: post
title: My awesome blog post
katex: True
---
Insert blog post content here...
Did you know that $$1 + 1 = 2$$?
That way, we only load the KaTeX-specific resources on posts that use it. The other posts produce more lightweight pages, rendering faster in the browser of our viewers.
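The downside of an opt-in flag is that it's easy to forget on a new post, in which case the formulas are still converted to HTML but show up unstyled. As a rough sketch of a safeguard, a hypothetical missingKatexFlag helper (not something Jekyll or kramdown provides) could flag posts that contain $$ formulas without the property:

```javascript
// Hypothetical lint helper: does a post use $$...$$ without enabling KaTeX
// in its front matter? Assumes the front matter is delimited by "---" lines.
function missingKatexFlag(post) {
  const frontMatter = post.match(/^---\n([\s\S]*?)\n---\n/);
  const header = frontMatter ? frontMatter[1] : "";
  const body = frontMatter ? post.slice(frontMatter[0].length) : post;
  const enabled = /^katex:\s*true\s*$/im.test(header);
  const usesMath = /\$\$[\s\S]+?\$\$/.test(body);
  return usesMath && !enabled;
}

const withFlag =
  "---\nlayout: post\nkatex: True\n---\nDid you know that $$1 + 1 = 2$$?\n";
const withoutFlag =
  "---\nlayout: post\n---\nDid you know that $$1 + 1 = 2$$?\n";

console.log(missingKatexFlag(withFlag));    // false
console.log(missingKatexFlag(withoutFlag)); // true
```

This is only a heuristic: a post that shows literal $$ inside a code sample would be reported too, so the result is a hint to check rather than a hard error.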
Conclusion
As we’ve seen, KaTeX allows you to render beautiful math easily on your website. No need to take a screenshot of a PDF made with LaTeX: you can get formulas in vector graphics directly in HTML. I hope that you’ve found this blog post useful to understand what happens behind the scenes.
I’ve now migrated from a KaTeX rendering setup relying on client-side JavaScript – as well as some hack to convert MathJax <script type="math/tex"> tags into KaTeX – to a simpler setup relying only on CSS and fonts on the client side.
As a side note, thanks to this new setup I’ve also upgraded Jekyll and kramdown to newer versions, and in particular syntax highlighting now works better on my blog. Things like ranges, lifetimes and attributes in Rust, YAML files, Dockerfiles, sudo and flags in Bash now render better!