Using eqn for Website Equation Formatting

January 14, 2025

Last year, I began the process of writing a static site generator to replace Hugo for my personal website. One major goal of this process was to remove all JavaScript [1]. This meant that I could not continue to use MathJax for rendering mathematics, and instead had to devise a way of formatting equations statically. I had the beginnings of an idea to use eqn(1) for this, and published a YouTube video going over a first pass at a script for doing just that. The basic idea was to embed eqn code into the page and use a script to replace that code with a rendered SVG file during site generation.

Since then, I have completed a functioning version of my generator, which is what was used to create the page you are reading right now. However, the technique that I'm using to render equations is actually a bit different from the one discussed in that video. Rather than using SVGs, I'm instead using a feature of eqn that I happened to stumble across while reading its man page: MathML generation. In this article, I'm going to discuss why I ultimately went this route, how it all works, and some annoying quirks I've discovered along the way.

A Quick Summary of the Options

We'll begin with a brief summary of the options that are available for typesetting equations on the web. Broadly speaking, there are two standard approaches for this: MathML and images. Commonly used JavaScript libraries like MathJax convert the equation from a language like LaTeX into one of these two options, based on the capabilities of the web browser being used. So, in principle, I should be able to pick one of them and do the conversion myself during HTML generation.

MathML is, in many ways, the most "correct" option. It's an XML-based language for expressing equations that can be embedded directly into an HTML page and be rendered by the web browser itself. However, it has historically been poorly supported and standardized. While MathML Core is pretty well supported in mainstream web browsers at this point, using it would prevent my equations from rendering properly in oddball ones like netsurf or mothra.

A desire to have my site work well in odd browsers led me to instead look into the second option: images. I could, using a couple of commands, render my equation into a picture and drop it into the HTML of my page. This would mean that basically any browser would work just fine--in the worst case, even terminal browsers could let the user download the picture to look at it if need be. So, this was the route that I initially decided to go. I wrote a simple script that would render embedded eqn equations into an SVG image file, and then replace the code with an image tag in the generated HTML.

As an example, I can specify an equation in eqn like,

<SSG_EQN>
x = 3 + sum from i=0 to 10 i over 5
</SSG_EQN>

and have it render in my site like so,

For context, here is what the same equation looks like rendered using MathML instead,

$x = 3 + \sum_{i=0}^{10} \frac{i}{5}$

The appearance of the MathML equation will vary depending upon your web browser. At least on Firefox, I find the SVG version to be quite a bit prettier. However, the MathML has the advantage of containing semantic markup for each character in the equation. In fact, you can even select and copy the text of the MathML equation as plaintext.

Problems with SVGs

Using SVGs works, but it has a few issues that ultimately led me to abandon it. I mentioned these problems in my video demonstrating the SVG solution:

The SVG images were of different heights, depending on the specific details of the equation being rendered. This meant that it was difficult to scale the SVGs, and so there were inconsistent font sizes from equation to equation.
I hadn't worked out a reasonable way to do inline equations.
SVGs are a bit of an accessibility nightmare. Screen readers, for example, can't really do anything with them. My "solution" to this problem was to use the eqn code as alt text for the images, but this wasn't a particularly good one.

After considering these problems and working on ways to solve them, I decided that it would be far simpler to revisit my original decision to use SVG files in the first place. eqn supports generating MathML directly, which would instantly address all of these issues, as well as vastly simplify the process of generating the equations. My original idea of using SVGs to ensure support for oddball and terminal-based browsers was very niche; it would be easier to abandon it. And that's what I did: I'm now using MathML.

How I Generate Equations

The eqn command natively supports MathML output via the -T MathML option. So, in principle, all that is needed is to use the same syntax as in any other groff file to specify equations, and then run the preprocessor on the file. This would support block equations using fences,

.EQ
.EN

as well as inline math with whatever delimiters you like.

I did make things a little more complex than this; for reasons I'll get into in a minute, I wanted to retain the SVG generation capabilities of the system. So I use a two-pass approach. I first use a script similar to the one I originally proposed to handle equation blocks, using the same <SSG_EQN> tags. This lets me specify whether I want to use MathML or an SVG for each equation (by adding an fmt="svg" attribute to the tag to specify the latter), as well as makes it a bit easier to wrap the equation in a div for CSS styling. Then, I handle inline math by running the entire HTML file through eqn directly in a second pass.
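As a rough sketch of the per-equation format choice: the helper below inspects an opening tag and reports which path to take. The function name eqn_format is hypothetical, and it assumes that MathML is the default whenever no fmt="svg" attribute is present.

```shell
# Hypothetical helper: decide the output format from an <SSG_EQN> opening tag.
# Assumes fmt="svg" selects the SVG path and anything else means MathML.
eqn_format() {
    case "$1" in
        *'fmt="svg"'*) echo svg ;;
        *)             echo mathml ;;
    esac
}
```

Calling eqn_format '<SSG_EQN fmt="svg">' prints svg, while a bare <SSG_EQN> tag prints mathml.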

Inline Math

The second pass, for inline math, is pretty simple, so we'll start there. I use,

# Quote the delimiters: an unquoted $$ expands to the shell's process ID.
eqn -T MathML -d'$$' < "$page_html"

to process any inline equations. This lets me dump inline math into the document in exactly the same way as one would with MathJax, by using $ delimiters (specified by the -d option) containing eqn code.

This command on its own isn't enough. eqn emits some groff code even in MathML mode. It leaves the .EQ and .EN fences in its output, as well as adding some of its own. For example,

eqn -T MathML
.EQ
${4 x} over 2$
.EN
.do if !dEQ .ds EQ
.do if !dEN .ds EN
.EQ
<math><mfrac><mrow><mn>4</mn><mi>x</mi></mrow><mn>2</mn></mfrac></math>
.EN

As a result, I post-process the output with sed to filter out any remaining groff directives from the file. These directives always begin with a period in the first character of the line, so this is very straightforward.
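The exact sed expression isn't shown here, but one plausible form of the filter is sketched below. The function name strip_groff_directives is made up for illustration.

```shell
# Drop every line whose first character is a period (a groff request).
# Caveat: this also drops any HTML or CSS line that happens to start with ".".
strip_groff_directives() {
    sed '/^\./d'
}
```

It would sit at the end of the inline pass, for example: eqn -T MathML -d'$$' < "$page_html" | strip_groff_directives. The caveat in the comment matters if a page carries an inline stylesheet with class selectors at the start of a line.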

The end result of this is that I can write equations like ${4x} over 2$ anywhere in the document, and have them replaced inline with MathML like this: $\frac{4 x}{2}$.

Equation Blocks

In principle, you could handle equation blocks using the same eqn pass as for inline math. However, the MathML that eqn emits makes no distinction between inline and block equations, so you'd also need to do a pass and insert divs around the block equations to allow you to style them. Because eqn leaves the .EQ and .EN fences in place, it wouldn't be terribly complicated to do a find/replace on those to accomplish that task. I ultimately decided to not do this, though, and have a different approach for handling block equations.
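For completeness, here is a minimal sketch of that find/replace route, which is not the one this site uses. It assumes the fences appear alone on their own lines, and the class name "equation" is invented for the example.

```shell
# Hypothetical alternative: turn the leftover fences into a wrapping div.
# The class name "equation" is made up for illustration.
wrap_block_equations() {
    sed -e 's|^\.EQ$|<div class="equation">|' \
        -e 's|^\.EN$|</div>|'
}
```

This would have to run before any filter that strips lines beginning with a period, and the extra .do lines that eqn adds would still need to be removed afterwards.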

I'm not going to dump my full equation block processing script here just yet--it's fairly ugly and I'm still working on cleaning it up--but I will go over the high points. There are two paths--one for generating an SVG, and the other for creating MathML blocks.

The SVG path looks basically the same as the one from the video,

groff -Tps -e -s - << EOF | ps2eps -l 2> /dev/null | \
 epstopdf --filter 2> /dev/null | \
 pdftocairo -svg - "${FILE_DIR}/eqn/eqn_${eqn_num}.svg" > /dev/null 2>&1
.so roff/colors.rf
.gcolor fgwhite
.fcolor fgwhite
.EQ
 $eqn
.EN
EOF
 
# Write the corresponding image tag to the output
echo "<img src=\"eqn/eqn_${eqn_num}.svg\" alt=\"$eqn\">"

It uses a bodged-together pipeline of programs to convert the PostScript output of groff into a usable SVG file, which gets numbered and dumped into a sub-directory under the page being generated. The actual input to groff is provided in the form of a heredoc, which wraps the relevant groff code around the eqn code extracted from the input file and stored in the $eqn variable. In addition to including the necessary fences around the eqn code, this also allows me to include some roff files for specifying the colors (to match the website). Note that the current "production" version of the script doesn't actually use $eqn as alt text, because the eqn code itself can contain " characters that mess things up and I haven't bothered to figure out how to escape them yet.
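One possible way to handle the escaping, offered as a sketch rather than what the production script does, is to convert the handful of characters that are special inside a double-quoted HTML attribute. The helper name html_attr_escape is hypothetical.

```shell
# Hypothetical helper: escape text for use inside a double-quoted HTML attribute.
# "&" is replaced first so the entities added afterwards are not re-escaped.
html_attr_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' \
                           -e 's/"/\&quot;/g' \
                           -e 's/</\&lt;/g' \
                           -e 's/>/\&gt;/g'
}
```

The image tag could then be written as echo "<img src=\"eqn/eqn_${eqn_num}.svg\" alt=\"$(html_attr_escape "$eqn")\">".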

The MathML portion is significantly simpler, as it doesn't need the awful processing pipeline, and the colors of the equation can be directly controlled with CSS. Because this call is only returning MathML for the equation itself, not filtering the entire document, stripping out the excess groff code can be done easily with sed during MathML generation.

eqn -T MathML << EOF | sed -n '/<math/p'
.EQ
$eqn
.EN
EOF

Limitations and Workarounds

This approach does have a few issues that require working around occasionally, however. These aren't particularly annoying for me and the sort of writing that I do, but they might be relevant to you, so I want to discuss them and provide workarounds.

Text Encoding Woes

The inline math pass of eqn can garble certain Unicode characters. The Unicode support within groff and its preprocessors isn't great, and running the entire HTML document through this pipeline can result in some characters being corrupted when they are read and then rewritten. I've found that curly quotation marks, ellipses, and dashes can cause problems. This is particularly obvious if you use pandoc to generate HTML, as it will automatically insert these characters into the output. eqn will then eat them and replace them with the dreaded � character (that one was supposed to be a ‘).

For me, the solution is simply not to use those characters. I don't actually use pandoc as part of my generator. I did use it as a one-time pass to move my original markdown files from my Hugo site over to straight HTML, which is how I stumbled across this problem. Once I had removed the offending characters that one time, it stopped being something I had to worry about, as I mostly just use standard ASCII characters for my own writing here.

If you do rely on some of these characters, this problem can be worked around by using HTML named character references for the symbols that eqn clobbers. It should also be possible to fix this by using the preconv preprocessor, which is designed to resolve these issues, but I haven't taken the time to set that up yet. If I do get around to it, I'll update this section later with a discussion of that.
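The named character reference workaround can be sketched as a small filter that runs before eqn ever sees the document. The function name protect_typography and the particular list of characters are illustrative assumptions. Other characters (other dashes, for instance) can be added in the same way.

```shell
# Sketch: replace typographic characters with named character references
# before the document reaches eqn. The list is illustrative, not exhaustive.
protect_typography() {
    sed -e 's/‘/\&lsquo;/g' -e 's/’/\&rsquo;/g' \
        -e 's/“/\&ldquo;/g' -e 's/”/\&rdquo;/g' \
        -e 's/…/\&hellip;/g' \
        -e 's/–/\&ndash;/g'
}
```

In a pipeline this would sit in front of the inline pass, for example: protect_typography < "$page_html" | eqn -T MathML -d'$$'.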

Unescapable Delimiters

Another annoyance with eqn is that it doesn't support escaping its delimiters. This means that, if you want to use a $ in your document, you'll be stuck using the &dollar; named character reference everywhere. Except, for obscure technical reasons, I cannot do that within highlighted code blocks in my own generation system. This means that I have to change my delimiter in articles that contain bash scripts (like this one!). To do this, I add,

.EQ
delim @@
.EN

to the very start of the HTML document, and then strip it out after running eqn. You can also just turn inline math off using,

.EQ
delim off
.EN

if you don't need the feature on a given page.
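A minimal sketch of how the delim request might be prepended on the fly, rather than written into the source file, is shown below. The helper name with_delim is hypothetical, and it takes either a pair of delimiter characters or the word off.

```shell
# Hypothetical helper: emit a delim request ahead of the page so that eqn
# reads the new inline delimiters before it reaches any HTML.
with_delim() {
    printf '.EQ\ndelim %s\n.EN\n' "$1"
    cat
}
```

Usage would look like with_delim '@@' < "$page_html" | eqn -T MathML | sed '/^\./d'. Any fence lines that survive from the prologue start with a period, so the same sed filter that removes eqn's leftover groff directives should remove them too.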

Unsupported eqn Code

Not all of eqn is actually supported by its MathML target. More advanced layout features such as piles, mark and lineup, and matrices, for example, do not seem to work. This is part of why I left the SVG generator in place--for easier support of these features if and when I need them.

Discrepancies in Rendering

What's a little odder are some random idiosyncrasies in the formatting of the MathML output. For example, eqn uses quotation marks to force whitespace between words to render, and the roman keyword to set text in upright roman characters. For example, $roman test "hello there"$ should render like this,

But, it actually renders like this after MathML generation,

$t e s t hello there$

It seems like quotation marks also put the text in roman when generating MathML, whereas they don't when targeting more traditional document formats.

This is the only major one of these that I've stumbled across so far, but it's still early days. If I spot any more of these discrepancies as I continue to use the system, I'll add them here too.

Conclusion

And there you have it. eqn provides a lot of useful features for web content authoring--though there are a few rough edges that need working around. I wouldn't necessarily suggest you copy my system, but it can be made to work with a bit of effort. For me, it's allowed me to drop JavaScript entirely from my site, without adding any real extra work to the process of authoring content containing equations. If you're looking to do something similar, it might be worth giving it a look.