This is a blog post about technical considerations of markup languages — and I’m writing it on a site I built myself. If there’s an impulse more codified and vacuous than artists waxing poetic about the power of creativity, it’s programmers blogging about how their blogs work. In the years I’ve been erecting this monolith I’ve resisted that siren, but for once, I’ll indulge.
But the fact that I’ve resisted — the fact that such a post is out of character for this blog — points to a pertinent fact.
My background is in fiction writing, posting on sites like Wordpressnote[Which is, to a first approximation, the internet itself.], Royalroad[A site for ostensibly original fiction — it’s slop central.], and ArchiveOfOurOwn[A website for fanfiction.] — more or less in that order.note[Not correct at all, really — my background is in programming, plain and simple. As a teen, I had spent years in the GNU/Linux mines before I had ever picked up a pen. When I started writing my story, I started writing it in LaTeX! But writing slowly eclipsed more and more of my focus. One year, I got a laptop and did not install Linux, instead rawdogging Windows.note[All I really needed was a browser and emacs, which mostly felt the same across OSes.] At a certain point, I drifted back to rich text editors, no doubt begravitated by the sites I was posting on, but I cannot recall the specifics of the progression. It was overall piecemeal and variegated.]
These sites all have rich text editors, and working with them asks far less of you than fiddling with a conversion pipeline.
But of course, if you’ve used these sites for long enough, copy-pasting from platforms like Google Docs, you’ll know there are warts — AO3 especially is notorious for the spacing errors that crop up around italicized text when you paste.note[It’s at the point where authors pass among themselves links to scripts for converting their rich text into a cleanly pasteable format — essentially recapitulating the ordeal of markup languages rich text had ostensibly delivered us from.]
However — and forgive me for speaking a bit superstitiously — there’s a kind of invisible ooze to using a WYSIWYG text editor.
Do you know what you’re pasting when you’re pasting rich text? I have an old page[Rich Text HTML Demo] on this site that exists for the express purpose of revealing what’s inside of rich text pastes.
To save you a click, here’s what a humble header, subtitle, and paragraph from a personal document results in:
Hiding this behind a click to build anticipation.
But mostly because it’s a big fat wall of text.
<h1 dir="ltr" style="line-height:1.7999999999999998;text-align: center;margin-top:20pt;margin-bottom:6pt;" id="docs-internal-guid-fffb5069-7fff-2c13-5320-771304d40c62"><span style="font-size:26pt;font-family:'Times New Roman',serif;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">A Chimerical Hope</span></h1><p dir="ltr" style="line-height:1.7999999999999998;text-align: center;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:12pt;font-family:'Times New Roman',serif;color:#000000;background-color:transparent;font-weight:400;font-style:italic;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">No predators nor parasites.</span></p><br><p dir="ltr" style="line-height:1.7999999999999998;margin-top:0pt;margin-bottom:0pt;"><span style="font-size:12pt;font-family:'Times New Roman',serif;color:#000000;background-color:transparent;font-weight:400;font-style:normal;font-variant:normal;text-decoration:none;vertical-align:baseline;white-space:pre;white-space:pre-wrap;">The dream we share is simple: no traitors, no masters, only bugs united. Lucidity is a dreamer’s sacrifice to reality, and our sacrifice is the vesperbane. A vile transformation through blood and black nerve, the vespertine arts grant a mantis power and control beyond this world. Only vesperbanes can give our grand dreams breath — but all vesperbanes are shadowed by an incorrigible potential to instead become oppressors and defectors.</span></p>
That’s HTML. A lot of it, in its most gruesome form.
That goop is absolutely lousy with hardcoded inline styles and cryptic identifiers. And sure, that’s ugly, but admittedly aesthetics is a weak critique when the point is not looking at it — rather, what really bothers me here is that it’s messy. I don’t mean that as a synonym: those styles specify fonts and arcane rules like white-space:pre-wrap. The line-height is defined to sixteen (16!) decimal places.note[“Maybe that’s a floating point thing?”] What happens if one of those knobs gets turned the wrong way?
More pointedly, part of why this HTML is so verbose is that it’s pretty much specifying the exact appearance of every span. It’s practically tailor-made for setting up a situation where one letter is jarringly, inexplicably, accidentally in the wrong font.
The workflow of editing text sees you inevitably moving bits around, italicizing this then unitalicizing that. You’ll soon wind up in situations where, say, a period or whitespace character retains a palimpsest of the styles that once graced its adjacent. Doesn’t the idea of that drive you mad?note[“It has come to my attention that there exist people who walk this Earth unbothered when the hyperlink contains a trailing or leading space.” — suckerpinch]
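For a sense of what those paste-cleaning scripts have to do, here is a minimal Python sketch. It is not any particular script in circulation, and it assumes the paste looks like the Google Docs sample above: every run of text sits in a span, and the only inline style worth keeping is font-style:italic. The names PasteCleaner and clean_paste are made up for illustration.

```python
from html import escape
from html.parser import HTMLParser


class PasteCleaner(HTMLParser):
    """Rebuild pasted rich text, keeping structural tags and dropping attributes."""

    KEEP = {"h1", "h2", "h3", "p", "em", "strong", "br"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.span_stack = []  # True for each open span that was turned into <em>

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            style = (dict(attrs).get("style") or "").replace(" ", "")
            italic = "font-style:italic" in style
            self.span_stack.append(italic)
            if italic:
                self.out.append("<em>")
        elif tag in self.KEEP:
            # Attributes (style, dir, id) are deliberately discarded.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag == "span":
            if self.span_stack and self.span_stack.pop():
                self.out.append("</em>")
        elif tag in self.KEEP and tag != "br":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def clean_paste(html_text):
    """Return html_text with inline styles stripped and italic spans as <em>."""
    parser = PasteCleaner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Fed the subtitle paragraph from the paste above, this keeps the p element and wraps the text in an em, and nothing else. Bold, links, and lists would each need their own case, which is how these scripts grow.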
But the fact that “rich text” is just HTML wearing a fancy hat means that all of the sites I listed off at the start (and plenty more) accept raw HTML input in addition to rich text pastes.note[I’m told that some people write their stories in the site’s own text editor. But I believe that’s just a tall tale to spook children.] It’s the common tongue of the internet, after all.
But even when it isn’t riddled with the unsightly warts of rich text,note[Incidentally, before I fully embraced markdown, I got a lot of use out of this HTML editor. I still pull it up when I need to work with tumblr posts.] HTML is just cumbersome. Apparently some can stand to write it natively — but I have far too many deep-set programmer instincts to endure that much boilerplate.
And with that this prelude ends: enter markdown. Markdown is an elegant, compelling dream: readable plaintext with all the expressive power of HTML. Two decades removed from its inception, it has calcified into a formally specified format, but the pitch, the design sensibility, beckons forth.
The idea is that a Markdown-formatted document should be publishable as-is, as plain text, without looking like it’s been marked up with tags or formatting instructions.
But I’ve misled you, really. I frankly don’t care much about markdown’s raw publishability, except insofar as that design coincides with ease of editing — and I mean editing in the oldest sense, proofreading prose rather than fiddling with code — because again, I write stories.
Between 2019 and 2023,note[Roughly: after Endless Stars’s unending hiatus set in and before Hostile Takeover arose spontaneously. I consider this period my doldrums, for mostly unrelated reasons.] I did a whole lot of writing in rich text editors.note[Part of why it was so hard to recover the oldest Black Nerve writings is that I was using frickin’ Scrivener?] In large part, this is a consequence of my posting mediums being those aforementioned sites.note[Though I feel compelled to mention there was a period where the canonical place to read Eifre Quest was, of all things, reddit.com?]
The thing about posting on a site like Wordpress or Royalroad is that it’s putting yourself in a box — the platform limits what you can do. The rich text format itself is a prison, locking you in a broadly interoperable subset of HTML. (Perhaps a better, if no less cliché, metaphor is that it’s a road. And those roads see a lot of traffic — it more than suffices for the efforts of most people most of the time — but it won’t take you everywhere.)
But I’m ambitious and clever. I want to push the boundaries of my medium — to portray notions novel and alien. Even back in my wordpress days, you can find places where I reached for text shadows to portray strange cadences, vibes that neither italics nor bold quite suffice to evoke.
When I finally made the switch to neocities, after having written my own stylesheets,note[Okay, in complete fairness, you’ll find the original look of this site was far less inspired.] freedom was pulsing under my skin. It’s no surprise that the very next longform story I ended up working on wound up my most stylistically impressive.
It was posted on AO3, where the first chapter merely having code blocks was enough to dazzle fanfiction readers, but by chapter eight, I’m formatting text in two columns to represent a forked train of thought, and by chapter fourteen[14: A Sight to Behold]?
::::: reflection
Open your eyes. Find my gaze. *I want you to see this.*
:::::
::::: reflection
Let me [ [\_\_\_\_\_\_\_\_]{.invisible}[look.]{} [open.]{} [access.]{}
[control.]{}]{.overlap-container} *Let me exist.* [Or stare into death
and be still.]{.quiet}
:::::
Woah! That’s not in the markdown spec. What’s going on here?
pandoc(1) is what happened. It extends markdown with a number of useful features — the most flexible by far are fenced divs and bracketed spans. Markdown has always allowed you to embed inline HTML, but it’s ugly — and remember, the point of using markdown was that I wished for something more concise than writing quirked up XML.
pandoc is, make no mistake, an amazing achievement. While I believe Markdown is what it’s for at heartnote[if you just pipe text on stdin, it expects markdown and outputs HTML] it’s famously a polyglot. It ingests and outputs formats I’ve never even heard of,note[and admittedly, will probably never use] and there’s something luxurious about deciding I want an .epub of one of my stories and getting one as simply as writing a command line so intuitive I don’t even have to look it up.
But the real superpower of pandoc is that, much in the way switching to neocities escapes the prison-roads of locked down platforms, switching to pandoc escapes at once the restrictions of both rich text and standard markdown.
Even writing raw HTML isn’t as powerful as what pandoc lets you do, because pandoc has a convenient interface for reading and writing the AST that it translates all its inputs and outputs to.
One of this year’s creations I’m proudest of (at least on technical merit) is the glossary[A Glossary for Vermin] for my new setting. And structurally, it’s just a big definition list.note[<dl> is one of those HTML elements that doesn’t get enough love] But it’s sorted, indexed, and deeply interlinked. Every definition has an anchor, almost every definition refers to other terms, and when it does, it links to those other terms.
The code to make this happen was only about a hundred lines of Lua,note[I love Lua, adorable language.] and perhaps an evening of work.
If that isn’t a testament to the power of this software, I don’t know what would be more convincing. Unfortunately, that may be the last kind word I speak about pandoc in this post. As I continue I shall descend into the tenor of a rant.
This will become even clearer in later sections, but suffice it to say, I’ve become something of an unwilling wizard of pandoc’s guts. I wouldn’t call myself an expert — I’ve yet to even broadly familiarize myself with how readers and writers really work! — but I have written thousands of lines of Lua code to interact with its filtering interface. I know how the AST ticks, and I know far more than I’d like about some very specific edge cases and limitations.note[Did you know “quoted text” gets its own node? Meaning that if you try to operate on the children of a node, the words inside and outside of a quote are not siblings.]
And the longer I remain victim to using it, the more I begin to have ideas.
But first, an anecdote. I do not use any off-the-shelf static site generator for this blog. When, midway through 2024,note[fourteen months (!) after I first moved to neocities] I decided wrangling raw HTML and hacking together site headers with autorun scripts (yuck!) was too ugly, I finally decided to look into how I might automatically generate my HTML.
Incidentally, if you’re an aspiring indie web creator, I would strongly recommend starting with one if you have any intention of crafting a site with extensive indexing and interlinking.note[Even if it’s not quite as sprawling as my own. My hundreds of pages and counting probably makes me a minor pages georg in the world of personal websites.] It’s downright necessary if you plan to write stories or host a blog here.
I first tried out Hugo, easily one of the most popular SSGs and quick to come up in cursory searches, yet I swiftly encountered a blogpost persuasively arguing that it (and most major static site generators) kinda suck, actually!
This blogger name-checked the UNIX philosophy and overall appealed to my sensibilities, so I decided to abort my efforts to switch to Hugo, and tried out their solution instead: soupault. The design was appealingly elegant, and it was practically a bespoke fit for the exact circumstances I had found myself in: migrating over a large, existing site.
Soupault brands itself as not a static site generator, and one of its distinctions is that it operates seamlessly on precompiled pages — it’d work just fine with the HTML files I already had, or it could convert new markdown files. It’s all so admirably flexible.
I quickly ran into some warts and deficiencies in the documentation, but this is an amateur software project, you have to be forgiving of these things. Overall, it’s mostly usable if you look past occasional typos, misleading pseudocode, and a few cases where documented features are straight up wrong.note[To quote from a chatroom rant I gave back then: A memorable example being when the description of the two pass indexing option suggested one of its features was making the site index available to all page processing functions. And it’s true, you can access site_index, the table of index data for every page in the site, but the docs claim index_entry will contain the index data for the current page. And it didn’t. It was always nil.]note[I do want to emphasize the experience I’m relaying is from 2024, almost two years ago. It may have improved since.]
But I could cope and work around it; none of this was a dealbreaker for me.
Except this software intentionally refuses to acknowledge a pretty basic concept.
“Hey soupault, do you think maybe we shouldn’t waste time creating pages that already exist and whose source files haven’t changed since last we created them? Y’know, the thing we figured out with Makefiles in the 80s?”
And its answer? A take unironically, unabashedly in the vein of “640K is enough for anyone.”note[A quote often falsely attributed, never substantiated.] The FAQ suggests, to paraphrase, “rebuilding my whole site from scratch every time is plenty fast, I don’t see the issue.”
But this was… a bit of an issue for me, given that my site is 700MB.
See, here’s how soupault works: it dutifully copies and processes the whole thing from source to prod every single time you run it.
Still, most of my site’s weight is music[Music], and most of the remainder is images[Art Gallery], so alright, what if I just excluded the multimedia from the source directory?
This already defeats a major purpose of doing all of this, because programmatically generating thumbnails and song pages was something I wanted a static site generator for,note[…I still haven’t done it…] but maybe—
Except nope! Even when it’s just the text, it still took 10 seconds to build them. I simply cannot live like this.
I was so exhausted and irritated after spending all day on this dead end that I said, you know what? Fuck it. I’ll just write some Makefiles and shell scripts to put together my site.note[Sweetie, we have static site generation at home.]
And that’s how Equestria was made! As of writing this blogpost in early 2026, that’s still fundamentally my infrastructure, which I informally call the serpent’s den.note[There’s a longer blogpost to be written in recounting just how those scripts have mutated and metastasized — for a good year in the middle, my architecture relied on a burgeoning jq(1) script that grew so arcane that other chattersnote[I wave to lun] began to view my efforts as something of a performance art project. The jq script had neared 200 SLOC before wisdom took hold of me and I switched to a more sensible and much less exciting Lua script.]
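The check that soupault declines to make is small. Here is a hedged Python sketch of the rule Make applies: rebuild a target only when it is missing or older than its source. The names is_stale and stale_pages, and the .md to .html mapping, are assumptions for illustration; this is not the code behind the serpent’s den.

```python
from pathlib import Path


def is_stale(source, target):
    """Make's rule: a target needs rebuilding if it is missing or older than its source."""
    source, target = Path(source), Path(target)
    if not target.exists():
        return True
    return source.stat().st_mtime > target.stat().st_mtime


def stale_pages(src_dir, out_dir):
    """List (source, target) pairs for markdown files whose HTML is out of date.

    Assumes every src_dir/x.md is built to out_dir/x.html (a hypothetical layout).
    """
    src_dir, out_dir = Path(src_dir), Path(out_dir)
    pairs = []
    for source in sorted(src_dir.rglob("*.md")):
        target = out_dir / source.relative_to(src_dir).with_suffix(".html")
        if is_stale(source, target):
            pairs.append((source, target))
    return pairs
```

Only the pairs this returns would need to be handed to a converter. Music and images never enter the picture, because they are never stale.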
The reason I recount this experience is it’s an illustrative characterization. There are hundreds of static site generators out there, because rolling your own is a fundamentally easy programming problem.note[The 80% required to get the basics working can be taken care of in a day or two. The rest takes years.]
I recount this anecdote to suggest an analogy. When I lay it out like this,note[i.e. glossing over the utter headaches that building a site with calcified software from the 80s has caused me] doesn’t it sound tempting? If I don’t like how pandoc operates, how hard can it be to write my own markdown→html converter? This idea bounced around in the back of my head for a long time, but I had so many other projects to work on.note[It doesn’t help that I spent the last quarter of 2025 cursed by the gods.]
But it has been observed to the point of dead metaphor — a camel can only bear so many straws.
Every advanced user probably has little gripes with their markdown engine. Heck, I have gripes with markdown itself.
I don’t like the four-space rule that turns indentation into a codeblock-landmine. I don’t like setext headers.note[Putting === or --- underneath lines to make them headers, instead of a number of #. I got my start writing novels in one big document, meaning there was a heading for the work title (H1) and each part (H2), and thus every chapter was a level three header. I carried this impulse over when I started blogging — H1 was the site name, H2 was the post title. As a result, for a time almost every header I actually used was three levels deep. So this isn’t purely aesthetic distaste — these shorthands are just no use to me.]
I don’t like that pandoc collapses two space characters into one pandoc.Space node — because I learned how to touch type from gtypist(1) and spent most of my teenage and young adult years editing text in emacs(1), so I reflexively put two spaces after each sentence — I go so far as to contend this is superior, aiding in visual parsing.
In fact, if you check the original home of my first web serial, you’ll find the fruits of that determination — there are unicode no-break space characters after most sentences, because I had put a sed(1) preprocessing filter before I piped my markupnote[I know I used org-mode, but I might have switched to markdown after some time.] into a converter.
I also used that sed filter to replace indented paragraphs of my source files — [/\n {4,}/]note[“Regex for four spaces after a newline”] — with those double spaced paragraphs most processors expect. Teenage me had some adorably antiquated typographical tastes.
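Those two substitutions are easy to picture. Below is a minimal Python sketch of that kind of preprocessing pass, standing in for the sed(1) filter, whose actual expressions (apart from the quoted regex) are not shown here. The function names are hypothetical, and the choice to turn a sentence’s two spaces into a no-break space followed by a normal space is an assumption about the exact scheme.

```python
import re

NBSP = "\u00a0"  # Unicode no-break space


def sentence_spacing(text):
    """Protect the double space after sentence-ending punctuation.

    Assumption: the pair becomes NBSP + space, so a converter cannot collapse it.
    """
    return re.sub(r"(?<=[.!?]) {2}", NBSP + " ", text)


def unindent_paragraphs(text):
    """Turn a newline followed by four or more spaces into a blank-line paragraph break."""
    return re.sub(r"\n {4,}", "\n\n", text)


def preprocess(text):
    """Run both passes, as one would before piping markup into a converter."""
    return sentence_spacing(unindent_paragraphs(text))
```

The lookbehind keeps the punctuation itself out of the match, so only the spaces are rewritten. A sentence that ends in a closing quote would slip past this version.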
All told, I’d gone to some quite extreme lengths with my typography.note[One very minor optimization is when a single quote is nested at the beginning or end of a double quote — i.e. “She’s talking in ‘scare-quotes.’ ” — I insert a thin space in between the quotes, once again to aid visual parsing.] A lot of them specifically involve the em dash. The first draft of this post originally included a lengthy aside about just how I format my dashesnote[it’s here[How I Format Em Dashes] if you really want to read it] but as scatter-brained as this post is, I must admit getting into those weeds is not necessary.
The intricacies of what I do with em dashes do ultimately exemplify the strengths of pandoc’s filters. See, the way Lua filters work is simple — you hand it a file or Lua table (but I repeat myself) with fields for the AST nodes you care about.
For instance, if you have Emph = fun, then fun gets called on each instance of emphasized text and that element is replaced by whatever it returns.note[Now, reformatting em dashes like I do can’t actually work quite like this, but the nuances would, again, take us too far afield.]
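To make that concrete without requiring pandoc, here is a toy model in Python of how such a filter behaves. It is a sketch under assumptions: Node, walk, and apply_filter are illustrative names, not pandoc’s API, and real filters are Lua tables run by pandoc itself. What it borrows from pandoc is the return convention: returning nothing leaves the element alone, returning an element replaces it, and returning a list splices the list in. The depth-first traversal is a simplification and does not mirror pandoc’s own ordering.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy AST node: a tag such as 'Para' or 'Emph', children, and optional text."""
    tag: str
    children: list = field(default_factory=list)
    text: str = ""


def walk(node, handlers):
    """Rebuild node bottom-up, calling handlers[tag] on each node that has one.

    Always returns a list, so a handler may delete (empty list) or splice nodes.
    """
    new_children = []
    for child in node.children:
        new_children.extend(walk(child, handlers))
    rebuilt = Node(node.tag, new_children, node.text)

    handler = handlers.get(rebuilt.tag)
    if handler is None:
        return [rebuilt]
    result = handler(rebuilt)
    if result is None:  # nothing returned: keep the element unchanged
        return [rebuilt]
    if isinstance(result, list):  # a list is spliced in place of the element
        return result
    return [result]  # a single node replaces the element


def apply_filter(root, handlers):
    """Apply a table of handlers to a whole document and return the new root."""
    return walk(root, handlers)[0]
```

With handlers = {"Emph": fun}, every Emph node in the tree is passed to fun and swapped for whatever comes back, which is the whole contract.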
As established above, I am an experimental writer, striving to do a great many creative things with formatting. In Black Nerve, you have anime-esque ⸢Spell Names⸥ and spooky vesper communion; while in Hostile Takeover, I used guillemets for «shortwave radio broadcasts», and occasionally spiced it up with rot13note[“cipher that replaces every alphabetical letter with the one 13 places down”] and zalgo textnote[common in horror shorts, infamously difficult to read, and a nightmare for screenreaders — but mine are carefully styled for accessibility] which will… «pbzr bhg ybbxvat yvxr guvf».note[I have even written code for spoiler text but I actually don’t have much use for it in practice — maybe if I wrote more reviews, or migrated my fanfic discussions to this site.]
And those are just the ones that are easy to list off and demonstrate inline in a single sweeping sentence-gesture. I’ve written back-and-forth text message[Off the Record, Off the Clock], and hell, my footnotes are so fluently integrated into my writing process that even while sitting here trying to think of examples, it took until now to remember how much monkeying about it had taken to get those working. Another trick you’ve already seen is my use of details disclosure elements that you must click to expand — I love that element.
Markdown is older than the <details> element, so there’s no syntax for it. But by design, you’re allowed to insert inline HTML, so it’s no real loss. If you search up how to do details in markdown, this is what stackoverflow tells you to do. Except pandoc doesn’t generally parse markdown inside HTML elements,note[meaning no emphasis, and no automatically prettying up your quote marks] so it’s a lot more ergonomic if you use filters to transmute fenced divs into details blocks when assembling your pages.
Another common trick, in both fiction and blogging, is wanting to center a bit of text. Easy enough to add a .center { text-align: center} rule to your CSS and class="center" to your HTML — but in the case of pandoc, that means enclosing a passage in ::: center and ::: on their own lines.
Except if what you want to center is very brief, a single word, then these formatting directives quite possibly take up more space than the text they’re formatting.
But again, filters can automate this! Omit the verbose formatting, and just include enough for your code to know where to put the div with the .center classnote[my filter checks for paragraphs that have a single child: a span with the .div class] and that way you can keep your source files svelte.
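The check described in that footnote can be sketched in a few lines. This Python version works on plain dicts rather than pandoc’s real node types, so the shape of the data (keys t, c, and classes) is a simplification invented for the example, as is the name promote.

```python
def promote(block):
    """Turn a paragraph whose only child is a .div span into a div.

    The span's remaining classes move onto the new div; anything else
    is returned untouched.
    """
    if block["t"] != "Para" or len(block["c"]) != 1:
        return block
    only = block["c"][0]
    if only["t"] != "Span" or "div" not in only["classes"]:
        return block
    classes = [name for name in only["classes"] if name != "div"]
    # The span's contents become a paragraph inside the new div.
    return {"t": "Div", "classes": classes, "c": [{"t": "Para", "c": only["c"]}]}
```

A paragraph holding one span with the classes center and div comes back as a div with the class center, which is what the three-line fenced version would have produced.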
But… reflect on what we’re doing here. Is there anything niggling at you?
Let’s look at it more carefully. Imagine you’re me, writing spooky murder drones pings. You might start with *Prey! Hunt! Devour!* — oblique text is pretty standard for psychic transmissions, though that’s not quite what this is.
Now, I’m adept with CSS, if I wanted to throw some yellow text shadow then it would make sense to do [Prey! Hunt! Devour!]{.murder-ping}, then add a CSS rule for making class="murder-ping" look pretty.
We can also omit the asterisks and make one of its rules font-style: italic — but if we care about people using screenreaders or reader mode or really stripped down browsers, maybe we want there to be an <em> element in our semantic HTML, so we can have our filter output an Emph node.
And again, if we’re thinking about how the unstyled text looks, it makes sense to have the filter also slap lil’ «guillemets» around them — I’ve always felt italics alone are badly overloaded in prose fiction.
I’m working through this example step by step because I think it’s worth respecting how natural each step of this progression feels.
But there’s a distinction to discover here. When a chapter has a line that says [05:50]{.timebreak .div} one of those classes is an actual class to evoke real CSS rules, and one of them is a fake class that exists to get caught in the filter.
The logical endpoint of this is the .rot13 class, which has no bearing on how the text is displayed — it’s purely an instruction to the filter.
And that’s the core insight I’ve been circling around. In web design we applaud the principle of the separation of style and content. HTML ought to specify the content, and not care about how it looks; that’s CSS’s job.
I think markup could benefit from a principle distinguishing style and procedure. Marking “This is a foo block.” vs “Postprocess this with the foo() function.” — because that’s what pandoc’s Lua filters devolve into, ultimately. So many function calls with extra steps.note[And you’ll often find yourself repeating the pattern of “check this span/div/codeblock for one of a dozen different classes and run a block of code in each case.” Since pandoc offers no pattern-matching helpers more granular than calling different functions on different node types, piecemeal evolution of your filter will result in long if-else chains checking for specific element classes, leading to weird sequencing and duplicated code, until maybe you finally end up writing your own class-based table dispatch function that should have been part of the API in the first place.]
In short, a design that centralizes text-filtering functions as part of document structure could have valuable ergonomics.
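One way to picture that split is a single table that decides which classes are procedures and which are merely styles. The Python sketch below is an illustration, not the filter this site runs: PROCEDURES and render_span are invented names, and guillemets is a hypothetical procedure class alongside the real .rot13 one.

```python
import codecs
from html import escape

# Classes listed here are instructions to the processor and never reach the HTML.
PROCEDURES = {
    "rot13": lambda text: codecs.encode(text, "rot_13"),
    "guillemets": lambda text: f"«{text}»",
}


def render_span(text, classes):
    """Render a span, running procedure classes and passing style classes through."""
    style_classes = []
    for name in classes:
        procedure = PROCEDURES.get(name)
        if procedure is None:
            style_classes.append(name)  # a real class, left for CSS to handle
        else:
            text = procedure(text)  # a fake class: call the function and drop it
    body = escape(text, quote=False)
    if not style_classes:
        return body
    class_attr = " ".join(style_classes)
    return f'<span class="{class_attr}">{body}</span>'
```

Adding a new behavior means adding one entry to the table instead of one more branch to an if-else chain, and the table doubles as documentation of which classes are fake.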
If you aren’t familiar with my work, then when I said I loved the details disclosure element, or that I’m an experimental writer doing creative things, you could have brushed it off as a cute yet idle exclamation or an otherwise meaningless remark.
If you aren’t familiar, then gaze upon Weave Me Another Cocoon[Weave Me Another Cocoon] and let its depths ensnare you.
Let’s cover some background. Telescopic text is not new — it’s been here since 2010. I first encountered it when an acquaintance linked Nutshell right after seeing a demo of how my own footnotes work, though by that point I had already built up a lot of mental underbrush around the idea.note[Due in large part to how I think about outlining stories[Outlines as Temporarily Embarrassed Drafts].]
When I left for my walk that evening, that underbrush caught fire and remained ablaze for one feverish week. The result was WMAC. The nature of the story’s structure is that I had already “completed” it on that very first day, just a few hundred words in, albeit being more poem than narrative. But poems are not exempt from the same privations as prose, and the lack of grounding left that first draft feeling vague, unmoored.
I spent the rest of the week proving its narrative theorems, so to speak — but that was only half of what took up so much of my time. While I certainly have more to say about the narrative craft that makes WMAC tick, it remains a unique project in how it was as much a product of code as ink.note[Most of the time, those parts of my brain are so separate I’d liken them to different people.]
The basic idea of WMAC is rather simple — you click some text, and its content is substituted for something else. Which itself may contain more clickable text, and so on ad nauseam.note[And I do take this to nauseating extremes — one word becomes almost six thousand, with 1305 total details elements on that page.]
How on earth do you implement that in markdown?
I’m not going to cover every quirk of WMAC’s implementation here.note[I ought to — one of my friends not only expressed interest, but suggested it could do numbers on technically-inclined websites like Hacker News. And there’s cute details to cover like the way scene breaks are styled, or how a browser inconsistencynote[it all worked in Firefox, not Chrome, which is to first approximation the only browser in this bitch of a world] regarding how inline-block elements are wrapped was not caught till the final day, nearly crippling the project at the finish line.]
But some broad strokes are obvious from the word go. I mentioned there were details elements, and that tells the whole story as far as the HTML itself is concerned — I used CSS to make all the summary elements look like links, and once you figure that out, it’s not hard to open up CodePen and cook up your own equivalent demo.
But what did the markdown source file look like? Consider those first three clicks.
Enantiodromia.
That paradox.
That ensnaring paradox — metamorphosis.
Pasting the <details><summary> et al. boilerplate in front of them all is a complete nonstarter. Something like nested bracketed spans, tagged with .d and .s classes, is better, but only slightly. Remember, we have to proofread this.
You can shave some characters if you assume the first child of a .d must be the .s. And there’s a couple of places in the text where multiple words are transformed at once, but most links are one-word. But that does little to avail the plight of recursion. If we really want to make this convenient to write, we need to invent some powerful syntax.
If you’re like me, you’ll immediately start skimming the relevant section of the pandoc manual looking for any syntax you can easily repurpose in a filter. [Nesting spans suck,]{class="gratuitous-demonstration" problem="the two different types of brackets create a lot of visual noise," solution="so a promising alternative is the humble footnote."}
Not the widespread markdown footnotesnote[By that I mean the syntax described e.g. on this random website which has a high search engine ranking.] — rather, a pandoc extension that allows you to include them inside paragraphs, something^[like this].note[I did say I was going to stop glazing pandoc after the first section, but credit where it’s due, they cooked with this.]
It’s a hack, but with:
^[Enantiodromia. / ^[That / That ensnaring] ^[paradox. / paradox---metamorphosis.]]
It now feels like we might be getting somewhere. Those slashes are what I picked as a cleaner way to separate the before and after of the substitution. Start writing like this, though, and you’ll soon discover some other patterns in “idiomatic” telescopic text crop up. It’s easy enough to extend the syntax to accommodate it.
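As a rough illustration of the basic form only, here is a Python sketch that turns the slash syntax into nested details elements. It is not the implementation behind WMAC: this sketch parses the raw string itself rather than running as a filter over pandoc’s footnote nodes. The names tokenize, parse, render, and telescope are invented, and none of the syntactic sugar is handled.

```python
import re
from html import escape

# The three markers: note opener, note closer, and a slash with spaces around it.
TOKEN = re.compile(r"\^\[|\]| / ")


def tokenize(source):
    """Split source into marker tokens and the plain text between them."""
    tokens, last = [], 0
    for match in TOKEN.finditer(source):
        if match.start() > last:
            tokens.append(source[last:match.start()])
        tokens.append(match.group())
        last = match.end()
    if last < len(source):
        tokens.append(source[last:])
    return tokens


def parse(tokens, inside=False):
    """Consume tokens and return (before, after); after is None when no slash was seen.

    Each list holds strings and nested (before, after) tuples.
    """
    before, after = [], None
    current = before
    while tokens:
        token = tokens.pop(0)
        if token == "^[":
            current.append(parse(tokens, inside=True))
        elif token == "]" and inside:
            return before, after
        elif token == " / " and inside and after is None:
            after = []
            current = after
        else:
            current.append(token)  # literal text, including stray markers
    if inside:
        raise ValueError("unclosed ^[")
    return before, after


def render(nodes):
    """Render parsed nodes as HTML, one details element per substitution."""
    parts = []
    for node in nodes:
        if isinstance(node, str):
            parts.append(escape(node, quote=False))
        else:
            before, after = node
            parts.append("<details><summary>" + render(before) + "</summary>"
                         + render(after or []) + "</details>")
    return "".join(parts)


def telescope(source):
    """Convert slash-syntax telescopic text into nested details markup."""
    nodes, _ = parse(tokenize(source))
    return render(nodes)
```

Because the separator is a slash with a space on each side, a run like the/ones/formatted/like/this passes through as ordinary text. Making each summary vanish once opened is left to CSS.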
Consider:
^[Sister... / One sister | says, / says to another:]
This is nice syntactic sugar, which I used all over the place in the source document.note[The structure of cognition and narrative biases the effort to right recursion. For all that I aspire to xenofiction[Xenodeterminism], I have not rid myself of all vestiges of human perspective.] The pipe essentially declares the rest of this to be a subtree without needing the clutter of an extra opening and closing bracket pair.
Likewise, something like:
shouldn't. / + She ^[hated / always +] it.
This plus-sign syntax saves me the trouble of repeating words when I’m only appending/prepending more.
You're peerless. / I could never | compare. / lie. / lie to
save my life. & Is my flattery/distraction/dissimulation so
transparent? & I'll recuse myself, then.
The ampersand syntax takes this to a further extreme, abstracting all three symbols otherwise called for when enclosing a | word / + like this.note[It’s another common idiom in WMAC; you click to “unfurl” the rest of a section, progressing the narrative in a way that’s arguably in tension with its telescopic aspirations. Though personally, I think of it like how visual novels make you click for the next dialogue box.]
If you’re really knowledgeable about pandoc’s syntax, you’ll notice that if our filter is designed to look for / symbols specificallynote[i.e. pandoc.Str("/") nodes] then the spaces around it aren’t optional — so it’ll never match the/ones/formatted/like/this. That’s true! So this is more syntactic sugar for those cases where one word is only ever substituted for another word down to the leaf nodes, never growing spaces.
But I digress. Part of why I digressed was to distract you. There’s so many detailsnote[no pun intended] to concern yourself with — isn’t this a fun little rabbit hole to throw yourself down? I wanted to lull you then pull the rug out from under you.
See, pandoc doesn’t like what we’re doing at all. Try it yourself! Echo a (well-formed!) string like ^[^[^[^[^[ ^[^[^[^[^[g ]]]]] ]]]]] into pandoc and watch how it haaaaaangs. If you test it iteratively, the first few footnotes are instant. By the time you nest six footnotes there’s a noticeable pause that only gets nonlinearly worse. A few more and I don’t have the patience to see its output.note[What on earth is it doing???]
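If you would rather script the experiment than type brackets by hand, something like the following Python sketch would do. The helper names are hypothetical, the 60-second timeout is an arbitrary choice, and it assumes a `pandoc` binary on your PATH (it returns None when there isn't one, or when the run times out).

```python
import shutil
import subprocess
import time


def nested_notes(depth: int, text: str = "g") -> str:
    """Build a well-formed string of `depth` nested inline footnotes."""
    return "^[" * depth + text + "]" * depth


def time_pandoc(depth: int, timeout: float = 60.0):
    """Seconds pandoc spends on the nested input, or None if unavailable."""
    if shutil.which("pandoc") is None:
        return None
    start = time.perf_counter()
    try:
        subprocess.run(
            ["pandoc", "--from", "markdown", "--to", "html"],
            input=nested_notes(depth),
            text=True,
            capture_output=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return None
    return time.perf_counter() - start
```

Calling `time_pandoc(n)` for n from 1 upward should show where the pause starts on your machine; I make no promises about the exact numbers.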
Especially when that output is useless — I needed to write my own footnotes system for this site primarily because I’m doing something funky with checkboxes that’s pretty nonstandard,note[i.e. no javascript is involved] but I’d have to roll my own anyway because pandoc doesn’t let you nest footnotes! I thought this was freedom!?
What’s worse is the AST actually does support it, and the parsers even understand it. Note nodes are treated as inlines with inline children, so nested footnotes fit cleanly into the structure, but as soon as you try to render it as HTML or plain text, the output only shows the first level of footnotes. But it gets worse.
Our discussion has centered on how WMAC worked, and on paper, it’s a single tree. But I refuse to read or write scriptio continua, let alone when it’s this deeply nested in structure — so we want to separate it out into subtrees, right?
On paper, the standard markdown footnotes are perfect for this. [^paradox] should let us put the paradox subtree anywhere we want, right?
You can probably guess what goes wrong — I’ve already spoiled it — but here, it’s not a mere failure of the output: pandoc’s markdown will not even parse the nested footnote references as you’d expect, if it’s an inline footnote inside a reference footnote definition block. It just quietly eats them.
You need a workaround. I eventually settled on a macro system where, say, the verbatim code flame gets replaced with the contents of a #flame element after processing. Again, I won’t describe all the internals, but I do want to underscore just how far afield I went — I made pandoc segfault!note[Version 3.7.0.1; I keep around the files that reproduce this in case any devs are interested, but it’s been a year and I haven’t updated.]
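To give a flavor of the idea without the internals, here is a toy version in Python operating on plain strings rather than pandoc's AST. Everything in it is an assumption for illustration: the name `expand_macros`, the dictionary standing in for the id-tagged elements, and the choice to raise an error when a macro refers back to itself.

```python
import re


def expand_macros(text: str, definitions: dict[str, str], _active: tuple = ()) -> str:
    """Replace `name` code spans with the body registered under that name."""

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in definitions:
            return match.group(0)  # an ordinary code span, leave it alone
        if name in _active:
            raise ValueError(f"macro cycle through {name!r}")
        # Bodies may mention other macros, so expand them recursively.
        return expand_macros(definitions[name], definitions, _active + (name,))

    return re.sub(r"`([^`\n]+)`", substitute, text)


subtrees = {"flame": "a `spark` grows", "spark": "tiny ember"}
print(expand_macros("Watch `flame`, not `code`.", subtrees))
```

The point of the indirection is that each subtree lives in its own block, wherever is convenient, and only gets stitched into the single tree at the end.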
Are you beginning to sympathize with why I’m tempted to create my own markdown engine? I am pushing the limits of pandoc; my code needs to babysit its oversights, its algorithmic shortcomings, and its outright bugs.note[The documentation is middling as well, but better than soupault.]
In the end, telescope.lua is only around 159 lines of Lua.
izanagi.lua is 1659 lines.
I have tried to keep just what izanagi is somewhat under wraps.note[“With a normal genjutsu, a user will apply an illusion to a target’s senses, causing the target to experience things that are not real. With Izanagi, the user applies an illusion to reality itself, giving the user control over what is and is not real for as long as Izanagi is active.”]note[Pithily: ⸢Izanagi⸥ turns reality into illusion and illusion into reality.] It’s an ambitious tool for authoring ergodic hypertext far outstripping what WMAC showcased. I spent August 2025 determinedly working on it. There is some unavoidable slowness in the nature of what it is doing — the core algorithm is innately, unavoidably exponential — but it being shackled to pandoc does not help. Could I create something faster?
Izanagi never produced an actual story that made use of its technote[Very determined sleuths can find a somewhat-hidden page on this website that demonstrates some of what it is already capable of.] — what had set this Hindenburg ablaze was when the engine itself was finally, triumphantly feature complete, I returned to write more scenes, and a simple dialogue tree resulted in minutes of compilation time. By this point in the drafting process, the story’s state had only begun its branching; I had barely scratched the surface of what I had outlined.
And now…
Nothing beside remains ’round the decay of that colossal wreck.
There’s one last strand worth tracing. I am a writer — an amateur webfiction writer. With this comes an unavoidable fact: I make mistakes. Typos, errors, infelicities. I’m pretty decent at catching them — years of writing and painstaking editing passes drill that into you — but I’ll never be perfect. I have a few friends that (usually, hopefully) look at the things I write and tell me when I’ve made a mistake.note[And this is not exclusive to friends — if you spot a typo or a weakness you’d like to give feedback on, please contact me[Contact].]
Now, the way this normally goes in the worlds of webfiction and fanfiction is that people use Google Docs, and they hand their beta readers and editors a link with feedback enabled, and comments can be left inline, with suggested edits committed with a click.
I write text files on my own computer. As I’ve covered, I like neither Google Docs nor WYSIWYG word processing as a user experience.
As a result, the way typos are reported to me is rather primitive. People will quote enough text to probably disambiguate where a typo is located and DM it to me. I then have to search my source copy for the string and type the change myself. This sucks, and this really sucks if there’s formatting that gets in the way of a naïve string search finding it.
And for something like WMAC? Forget about it!
What alternative is there, though? Even if I succumbed to using GDocs, it would hardly relieve me of needing to manually port changes to the source doc. Perhaps if I sent the raw .md files to people and got back an honest-to-god diff patch, it could work — but so many of the people I work with are simply not that technically proficient.
My site doesn’t use javascript, but I do have some skill with it. And marvelously, HTML has an element attribute called contenteditable which is a shortcut for the work it’d take to make a somewhat functional rich text editor. Could I write a script that, if you load one of my posts with ?editing=true in the url, goes through and makes the article’s text editable, then programmatically diffs the HTML into a format that I can work with?
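The diffing half of that idea is the easy part. The in-browser script would have to be javascript, but to show the shape of it, here is a sketch in Python using the standard library's difflib. It assumes the paragraph text before and after editing has already been extracted as plain strings; `word_changes` is a hypothetical name.

```python
import difflib


def word_changes(original: str, edited: str) -> list[tuple[str, str]]:
    """List (before, after) pairs for every run of words the editor changed."""
    before, after = original.split(), edited.split()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An empty string on one side marks a pure insertion or deletion.
            changes.append((" ".join(before[i1:i2]), " ".join(after[j1:j2])))
    return changes


print(word_changes("I made a typpo here", "I made a typo here"))
```

A list of before/after pairs like this is roughly what people already DM me, just produced without anyone having to retype the quote.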
Now, in all fairness, the amount of work this would save me is somewhat marginal,note[I still need to review each one; some “typo” reports merit a WONTFIX, after all] but simply adding a floating textboxnote[like the famous AO3 floating comment box extension, or the way Xenforo forumsnote[Spacebattles, SufficientVelocity, and another that shall not be named] let you quote highlighted text] would streamline the reader’s ability to produce the desired feedback.
But around about here we find the rub. If we want to produce useful typo reports, how do you associate HTML paragraphs with the markdown that produced them?note[Truth be told, this is no more than the Perfect dragged into yandere-esque toxic yuri with the Good — it would be easy enough and mostly useful to simply quote the line around where a typo exists. This is how Royalroad’s built-in typo report feature works.]
Experienced programmers will quickly recognize that this is an old problem in matters of compilation — what we need is to add debugging symbols to the HTML and create a sourcemap.
Can we do this? The answer, as you can probably guess, is no. pandoc’s AST does not expose line number information. One more reason to wish for a better engine.
Except, wait a minute. This problem, unlike all the others, presents a unique opportunity. Because we can work around it, in a novel way. For other problems, if I wished to fix it, I’d have to write my own parser from the ground up, or do something really arcane with filters that ultimately makes me more dependent on pandoc, not less.
But for this? There is no amount of filter magic that can recover source information that is already lost during the parsing of the document. No, but I could preprocess the file. Go through and insert invisible spans that note which line numbers correspond to which elements. Parsing paragraphs is less work than parsing the whole elephant, but because it is parsing, it gets a foot in the door. Once you’re parsing paragraphs, it’s not that hard to spot codeblocks and fenced divs. Lists and block quotes are a bit harder, but maybe…
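A first pass at that preprocessing step might look like the Python sketch below. It is deliberately timid: it only tags lines that open a plain paragraph after a blank line, leaves anything that looks like a heading, list, quote, or table alone, and copies fenced code through untouched. The `data-line` attribute name and the empty `[]{...}` span are my own choices for illustration, and I'm assuming pandoc's bracketed span syntax accepts them as written.

```python
import re

FENCE = re.compile(r"^(`{3,}|~{3,})")
BLOCK_MARKER = re.compile(r"^(\s|[#>*+\-|:]|\d+[.)]\s)")


def annotate_paragraphs(source: str) -> str:
    """Prefix each plain paragraph with an empty span carrying its line number."""
    out = []
    fence = None        # fence character while inside a code block
    after_blank = True  # True when the next text line would open a block
    for number, line in enumerate(source.split("\n"), start=1):
        opener = FENCE.match(line)
        if fence is not None:
            # Inside a code block: copy verbatim until the matching fence.
            if opener and opener.group(1)[0] == fence:
                fence = None
        elif opener:
            fence = opener.group(1)[0]
        elif after_blank and line.strip() and not BLOCK_MARKER.match(line):
            line = f'[]{{data-line="{number}"}}{line}'
        after_blank = not line.strip()
        out.append(line)
    return "\n".join(out)
```

Fenced divs, lists, and block quotes are exactly the cases this skips, which is where the real parsing work would begin.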
And that, finally, is what this year started me down the road to writing my own markdown parser.
Let’s pick up that thread we started on — how hard could coding a markdown engine be?
Give the CommonMark Spec a look — it’s a clear and engaging read. It’s also maddening. I dare you to read to the section on block quotes without viewing this whole endeavor as a testament to the folly of man.note[I haven’t thought about how I’d handle tables. I fear my fraying sanity may not be able to take it.]
Block quotes, incidentally, are where my two-day adventure into writing a markdown engine ran aground — I was trying to parse the whole thing in one pass with the help of the LPEG library, but that seems doomed or intractable. But I might be able to hack it if I go back with a more sedate approach of parsing the syntax in steps.
All of that was preamble and contextualization. I felt it illustrates the nature of my frustration more vividly than if I had just sketched out what I wanted from the outset.
What is SquiggleMark? Nothing but vaporware, really. A wishlist, things I’d want markdown to have and might be able to implement this year, if I really felt the push.note[Don’t make the xkcd 927[14 competing standards] joke — I wouldn’t write a standard, I’d write a script for my own use. I don’t care if anyone else uses it.]
And that’s part of why I even wrote this big rant. Is it worth it? Some of the arguments and anecdotes I recounted here I had almost forgotten about until this journalling exercise brought them back to mind.
So that’s why I’m writing this blog post. I want to lay out all of my reasonings and desiderata for SquiggleMark to more cogently evaluate if it’s something that’s worth my time.
SquiggleMark Wishlist
Let’s take it from the top, one more time:
Anti-features
Indented Codeblocks (the 4-space rule)
Setext headers
Blockquotes?note[I’m not fully convinced the complexity is worth it, but backward compatibility in this regard is probably useful — even if I’m the only one using it, I still have a lot of pages with blockquotes.]
YAML frontmatternote[I didn’t even get into this one, but suffice to say, it’s given me some serious pain. I might go for TOML. JSON is a bit inflexible for my tastes, but what I really have my eye on is Lua. I have this cute idea of exclusively using tilde code blocks for the frontmatter, for branding purposes — ~~~ seems pretty squiggly!]
Core Features
Arbitrarily nested footnotesnote[And do it without taking minutes to process a dozen worst-case characters — that shouldn’t be a tall ask.]
Debugging Symbolsnote[Reporting line numbers, yeah, but take it from someone who’s written thousands of lines of them — sometimes you really need help figuring out why the heck filters are doing what they’re doing.]
Filter integrationnote[I.e., special syntax for calling functions on elements, rather than indirectly coaxing your filters into targeting them. Proposed syntaxnote[again for that ‘squiggle’ branding][]{}~foo for calling a function called foo on an empty span. (maybe even []~f for raw text?) This could enable piping — []{}~foo~bar~baz — but that makes me wonder if and how you could branch. Could you compose and split squiggle-filter output syntactically? Should you?]
Metaprogramming? note[I’m thinking like… filter procedures as first-class citizens of the AST, perhaps even filters that transform other filters? Would this be useful?]
Misc. features
Native <details> supportnote[Proposed syntax: -# summary would begin a details block, and -# on its own would end it.]
Native __underlining__note[While I’ve gotten unexpected use out of the fact that markdown has not one, but two emphasis syntices, I see no use in two strongs.]
Native ||spoilers||note[This is probably unnecessary, but could be of mild use to me. Perhaps a more useful feature would be an ergonomic way to define one-off formatting, a la what WMAC demanded. Meanwhile, ^superscript^, ~~strikeout~~, and ~subscripts~ work in pandoc out of the box, so I’d want to port over those features, but that’s less exciting. The last two are potential vexations given the proposed ‘squiggle function’ syntax, though.]
Fenced Blockquotes note[Proposed syntax: >>> to begin a quote block, <<< to end it. Much more pleasant than littering >s everywhere in a long quotation.]
[]{"Quotes in curly brackets syntax to set title attributes"}note[I had a long digression concerning this, but I cut it because it didn’t go anywhere and its broader points were better stated elsewhere.]
<abbrev> semanticsnote[Pandoc apparently acknowledges and discards an abbrev syntax from another flavor of markdown because its document model has no understanding of them. Could this be useful for my abstruse worldbuilding needs? Currently I abuse the <abbrev> element for a footnote trick I invented for this post.]
extend the syntax beyond imagesnote[Music, game iframes, but perhaps, it could fetch OpenGraph metadata for webpages?]
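Some of these are cheap enough to prototype as a preprocessing pass long before a real parser exists. As one example, the fenced blockquote item from the list above could be lowered into ordinary quote markers with something like this Python sketch. `expand_fenced_quotes` is a hypothetical name, and the sketch assumes the fences sit alone on their own lines and allows them to nest.

```python
def expand_fenced_quotes(source: str) -> str:
    """Turn >>> ... <<< regions into ordinary '> '-prefixed blockquote lines."""
    out = []
    depth = 0
    for line in source.split("\n"):
        marker = line.strip()
        if marker == ">>>":
            depth += 1
            continue
        if marker == "<<<" and depth > 0:
            depth -= 1
            continue
        if depth:
            # rstrip keeps blank quoted lines as a bare '>' with no trailing space.
            out.append(("> " * depth + line).rstrip())
        else:
            out.append(line)
    return "\n".join(out)


print(expand_fenced_quotes("before\n\n>>>\nquoted line\n\nsecond para\n<<<\n\nafter"))
```

This one makes no attempt to skip code blocks, so a stray fence marker inside a code sample would be misread; a real implementation would need to track that.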
I think there’s some ideas I’m missing, but now if I remember I’ll have a convenient place to note them down. This blogpost was mostly for me, but if you found it interesting enough to read to the end, I’m flattered — thank you for your time.