Text format feature matrix


Markdown is a versatile plain text document format. The following documents were written using (or converted to) CommonMark and pandoc’s fenced div syntax, then typeset using ConTeXt:

The purpose of this page is to help dispel the notion that Markdown is inferior to other text formats for technical documentation. Rather, each format has its own strengths and weaknesses. The matrix provides an objective way to evaluate formats for specific needs. Keep in mind that tools such as pandoc can do a terrific job of converting between formats.

Full disclosure: KeenWrite is my desktop Markdown editor and command-line application. If the matrix is missing an important feature, or you’d like help with technical documentation, contact me.

The downloadable feature matrix is an unweighted comparison of various plain text systems:

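| Feature matrix 08-SEP-2025 | KeenWrite | KeenWrite / ConTeXt | KeenWrite / ConTeXt / R Markdown | AsciiDoc | AsciiDoc / extensions | reStructuredText | Sphinx | pandoc | pandoc / LaTeX | pandoc / extensions / LaTeX | pandoc / extensions / LaTeX | pandoc / extensions / LaTeX / knitr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Turing complete | N | Y | Y | N | Y | N | N | N | Y | Y | Y | Y |
| Language, R | N | N | Y | N | N | N | N | N | N | N | N | Y |
| Data plots, inline | N | N | Y | N | N | N | N | N | N | N | N | Y |
| Tables | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Tables, nested | N | N | N | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Tables, footer | N | N | N | Y | Y | Y | Y | N | N | Y | Y | Y |
| Math, SVG | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| Math, KaTeX (HTML) | Y | Y | Y | N | Y | N | Y | N | N | N | N | N |
| Math, MathJax (HTML) | Y | Y | Y | N | Y | Y | Y | Y | Y | Y | Y | Y |
| Math, PDF | N | Y | Y | N | Y | N | Y | N | Y | Y | Y | Y |
| Variables (attributes) | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Variables, interpolated | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| Diagrams, text-based | Y | Y | Y | Y | Y | N | Y | N | N | N | N | N |
| Diagrams, variables, interpolated | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| Diagrams, Mermaid, raster, PDF | N | N | N | N | Y | N | N | N | N | N | N | N |
| Diagrams, Mermaid, vector, PDF | N | N | N | N | N | N | N | N | N | N | N | N |
| Diagrams, Mermaid, HTML | Y | Y | Y | N | Y | Y | Y | N | N | Y | Y | Y |
| Images | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Images, dimensions | N | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Captions, table | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Captions, image | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Captions, equation | Y | Y | Y | N | Y | Y | Y | N | Y | Y | Y | Y |
| Captions, code block | N | N | N | Y | Y | N | Y | N | N | N | N | N |
| Captions, consistency | Y | Y | Y | N | N | Y | Y | N | N | N | N | N |
| Cross-references, section | Y | Y | Y | Y | Y | N | Y | N | N | Y | Y | Y |
| Cross-references, table | Y | Y | Y | Y | Y | Y | Y | N | N | Y | Y | Y |
| Cross-references, figure | Y | Y | Y | Y | Y | Y | Y | N | N | Y | Y | Y |
| Cross-references, equation | Y | Y | Y | N | N | Y | Y | N | N | Y | Y | Y |
| Cross-references, custom | N | Y | Y | N | N | N | N | N | N | N | N | N |
| Bibliographic references (citations) | N | N | N | Y | Y | Y | Y | N | N | Y | Y | Y |
| Metadata | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Metadata, external | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| Metadata, editor-integrated | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| Content, external | N | N | Y | Y | Y | N | Y | N | N | Y | Y | Y |
| Content, conditional | N | N | Y | Y | Y | N | Y | N | N | N | N | Y |
| Export, plain text | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Export, XHTML | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Export, PDF | N | Y | Y | N | Y | N | Y | N | Y | Y | Y | Y |
| Export, PDF, Knuth–Plass | N | Y | Y | N | N | N | N | N | Y | Y | Y | Y |
| Custom containers (annotations, roles) | Y | Y | Y | Y | Y | Y | Y | N | N | Y | Y | Y |
| Interface, command-line | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Interface, graphical (with preview) | Y | Y | Y | Y | Y | N | N | N | N | N | N | N |
| Curls quotation marks, natural | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| Curls quotation marks, contextual | Y | Y | Y | Y | Y | N | N | Y | Y | Y | Y | Y |
| Curls quotation marks, comprehensive | Y | Y | Y | N | N | N | N | N | N | N | N | N |
| Collate chapters | Y | Y | Y | N | N | N | Y | N | N | N | N | N |
| Content / presentation separation | Y | Y | Y | N | N | N | Y | N | N | N | N | N |
| Widespread syntax knowledge | Y | Y | N | N | N | N | Y | Y | Y | Y | Y | Y |
| Community support | N | N | N | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Unicode | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Accessible HTML | N | N | N | N | N | Y | Y | N | N | N | N | N |
| Glossary generator | N | N | N | N | N | Y | Y | N | N | Y | Y | Y |
| Footnotes | N | N | N | N | N | Y | Y | Y | Y | Y | Y | Y |
| Endnotes | Y | Y | Y | Y | Y | N | N | Y | Y | Y | Y | Y |
| Text, colour | N | Y | Y | Y | Y | N | N | N | N | Y | Y | Y |
| Text, underline | N | Y | Y | Y | Y | Y | Y | N | N | N | N | N |
| Text, overline | N | Y | Y | Y | Y | N | N | N | N | N | N | N |
| Text, ligatures | N | Y | Y | N | N | N | N | N | Y | Y | Y | Y |
| Text, hyperlink | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, superscripts | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, subscripts | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, strikethrough | Y | Y | Y | Y | Y | N | N | Y | Y | Y | Y | Y |
| Text, strong | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, emphasis | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, monospace | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, inline nesting | Y | Y | Y | Y | Y | N | N | Y | Y | Y | Y | Y |
| Text, inline classes | Y | Y | Y | Y | Y | Y | Y | N | N | Y | Y | Y |
| Text, quotation blocks | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, quotation blocks, nested | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, description lists | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, code blocks | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, code blocks, language | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, dashes, en – | Y | Y | Y | N | N | N | N | Y | Y | Y | Y | Y |
| Text, dashes, em — | Y | Y | Y | Y | Y | N | N | Y | Y | Y | Y | Y |
| Text, ellipses … | Y | Y | Y | Y | Y | N | N | Y | Y | Y | Y | Y |
| Text, arrows -> => <= <- | N | N | N | Y | Y | N | N | N | N | N | N | N |
| Text, (C) (R) (TM) | N | N | N | Y | Y | N | N | N | N | N | N | N |
| Text, entities, numeric | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, entities, named | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| Text, non-breaking spaces | N | N | N | N | N | N | N | Y | Y | Y | Y | Y |
| Total | 53 | 63 | 66 | 49 | 57 | 41 | 52 | 36 | 42 | 54 | 54 | 57 |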

Breakdown

This section clarifies select features from the matrix.

Turing complete

The Turing complete row indicates whether the format may contain executable instructions for general computation. For technical documentation, the ability to retrieve information dynamically helps avoid publishing outdated content by single-sourcing data.

Language, R

The Language, R row denotes that the format supports the R programming language. We could list additional languages, such as Python and Ruby, but the list is extensive.

Variables, interpolated

The Variables, interpolated row indicates that a variable’s value may reference other variables by name. When the output document is created, each reference is replaced with its actual value, recursively, before being embedded into the output. For example:

protagonist:
  name:
    given: Ada
    surname: Lovelace
    full: {{protagonist.name.given}} {{protagonist.name.surname}}

Here, {{protagonist.name.full}} resolves to “Ada Lovelace”.
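The recursive replacement described above can be sketched in a few lines of Python. This is a minimal illustration, not KeenWrite's implementation: the function names `flatten` and `resolve`, the depth limit, and the choice to leave unknown references untouched are all assumptions made for the example.

```python
import re

# Matches a {{dotted.key}} reference; the delimiters follow the example above.
REFERENCE = re.compile(r"\{\{([\w.]+)\}\}")


def flatten(tree, prefix=""):
    """Flatten nested dictionaries into a map of dotted keys to string values."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = str(value)
    return flat


def resolve(text, variables, depth=0, max_depth=20):
    """Replace references recursively; max_depth guards against cycles."""
    if depth > max_depth:
        raise ValueError("possible circular reference")

    def substitute(match):
        key = match.group(1)
        if key not in variables:
            return match.group(0)  # assumption: leave unknown references as-is
        return resolve(variables[key], variables, depth + 1, max_depth)

    return REFERENCE.sub(substitute, text)


variables = flatten({
    "protagonist": {
        "name": {
            "given": "Ada",
            "surname": "Lovelace",
            "full": "{{protagonist.name.given}} {{protagonist.name.surname}}",
        }
    }
})

print(resolve("{{protagonist.name.full}}", variables))  # Ada Lovelace
```

Flattening the hierarchy first keeps the lookup a plain dictionary access, and the depth limit turns an accidental self-reference into an error rather than infinite recursion.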

Diagrams, text-based

The Diagrams, text-based row declares whether the format allows marking up instructions for drawing a diagram, technical or otherwise.

Diagrams, variables, interpolated

The Diagrams, variables, interpolated row indicates whether interpolated variables can be embedded in diagrams. For example, a genealogy diagram could include family names from an externally defined character sheet:

``` diagram-blockdiag
blockdiag {
  orientation = portrait

  group {
    {{protagonist.mother.name.given}}
    {{protagonist.father.name.given}}
  }
  {{protagonist.name.given}}

  {{protagonist.mother.name.given}} -> {{protagonist.name.given}}
  {{protagonist.father.name.given}} -> {{protagonist.name.given}}
}
```

Diagrams, Mermaid

The Diagrams, Mermaid rows signify various aspects of exporting diagrams coded using the text-based Mermaid syntax.

At the time of writing, Mermaid diagrams produce scalable vector graphics (SVG) that numerous vector graphics drawing programs and libraries cannot render outside of a web browser environment. Mermaid uses <foreignObject> to leverage features found in browsers, which was a poor technical choice in my opinion. Both PlantUML and Graphviz can produce equally complex, yet standards-compliant, SVG documents.

To render Mermaid diagrams in AsciiDoc, a headless browser extension must be installed. Asciidoctor launches the browser to draw the diagram. While this is a pragmatic solution, it has drawbacks.

Images, dimensions

The Images, dimensions row captures the ability to resize images in the output document. Setting image dimensions in documents mixes presentation logic with content; however, when converting documents into HTML there is often no alternative because CSS cannot target specific images (without classes or IDs).

A pandoc extension to CommonMark allows specifying image dimensions:

![kitten](kitten.jpg){width=300}
![kitten](kitten.jpg){#id .class width=300}

Similarly, fenced code blocks support attributes:

``` diagram-plantuml { width=400px }
```

In ConTeXt, figures can be resized by name using a macro:

\defineexternalfigure[filename.ext][width=..., height=...]

Captions, consistency

The Captions, consistency row notes whether the syntax for captions is consistent across elements being captioned. KeenWrite supports a double-colon syntax for captions, where the caption comes after the element being captioned and is separated by a mandatory blank line. Consider the following text:

| Header | Header |
|--------|--------|
| Value  | Value  |
    
:: Table caption
    
![alt text](logo.svg)
    
:: Image caption
    
$$E=mc^2$$
    
:: Equation caption

Sample output:

document captions

KeenWrite’s caption syntax is a CommonMark extension that diverges slightly from pandoc’s.
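One benefit of a consistent caption syntax is that a single rule can pair any element with its caption. The following Python sketch illustrates that idea under simplifying assumptions: blocks are separated by blank lines, a caption block begins with `:: `, and attributes on the caption are not parsed. The function name `pair_captions` is hypothetical and this is not how KeenWrite parses documents.

```python
import re

CAPTION_PREFIX = ":: "


def pair_captions(markdown):
    """Pair each block with the caption block that follows it, if any."""
    # Blocks are runs of non-blank lines separated by one or more blank lines.
    blocks = [block.strip() for block in re.split(r"\n\s*\n", markdown.strip())]
    pairs = []
    for block in blocks:
        if block.startswith(CAPTION_PREFIX) and pairs and pairs[-1][1] is None:
            # Attach the caption to the element immediately before it.
            element, _ = pairs.pop()
            pairs.append((element, block[len(CAPTION_PREFIX):]))
        else:
            pairs.append((block, None))
    return pairs


sample = """\
| Header | Header |
|--------|--------|
| Value  | Value  |

:: Table caption

![alt text](logo.svg)

:: Image caption

$$E=mc^2$$

:: Equation caption
"""

for element, caption in pair_captions(sample):
    print(caption, "->", element.splitlines()[0])
```

The same loop handles the table, the image, and the equation because the caption rule does not depend on the kind of element being captioned.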

Cross-references

The Cross-references rows refer to built-in directives for referencing items in the document. If the cross-reference syntax for a text format requires a specific typesetting system, it is marked as N. If the syntax is independent of a typesetting system, but requires an external typesetter to produce linked cross-references, it is marked as Y.

Many formats have a fixed set of cross-reference types. Beyond figures, tables, equations, and sections, documents may also need to reference algorithms, musical scores, lyrics, source listings, and more. If users can add and refer to their own cross-reference labels, including internationalized ones, the Cross-references, custom row is marked with a Y.

Existing Markdown implementations have fairly similar syntaxes for cross-references and citations. KeenWrite uses a cross-reference syntax that is based on pandoc’s crossref package.

The identifiers must be unique.

Chapters and sections are automatically named based on the title text by converting the title to lowercase and replacing spaces with hyphens. To wit:

# Timeless novels
    
[@tbl:protagonists] in [@sec:timeless-novels] are characters from novels.
    
| Novel                      | Character                    |
| -------------------------- | ---------------------------- |
| *Dream of the Red Chamber* | Jia Baoyu                    |
| *The Cairo Trilogy*        | Ahmad Abd al-Jawad           |
| *Pedro Páramo*             | Juan Preciado                |
    
:: Fictional characters {#tbl:protagonists}
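The naming rule for chapters and sections can be sketched as follows. This Python example applies only the two steps stated above (lowercase, then spaces to hyphens); how punctuation or duplicate titles are handled is not covered here. The function names and the `sec` prefix default are assumptions drawn from the example.

```python
def section_identifier(title, prefix="sec"):
    """Derive a cross-reference identifier from a heading's title text."""
    slug = title.strip().lower().replace(" ", "-")
    return f"{prefix}:{slug}"


def check_unique(identifiers):
    """Raise an error if any identifier appears more than once."""
    seen = set()
    for identifier in identifiers:
        if identifier in seen:
            raise ValueError(f"duplicate identifier: {identifier}")
        seen.add(identifier)


print(section_identifier("Timeless novels"))  # sec:timeless-novels
check_unique(["sec:timeless-novels", "tbl:protagonists"])
```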

Metadata, external

The Metadata, external row indicates whether document metadata (such as author, book title, and publication year) can exist in an external file, allowing it to be reused in other contexts. For arbitrarily defined variables, if the system can inject them into documents without extra tooling, it is marked as Y.

Curls quotation marks, comprehensive

The Curls quotation marks, comprehensive row describes systems that properly format quotation marks. Phrases such as “fish ’n’ chips,” “’Bout that time I says, ‘Boys! I been thinkin’ ’bout th’ Universe,’” or quotations that span multiple paragraphs must be curled.

KeenWrite integrates KeenQuotes, a natural language parsing library that can determine how to format single quotation marks, double quotes, contractions, and mark primes. Systems must pass the KeenQuotes test suite using straight double and single quotation marks (i.e., " and ') to be marked as Y.

Collate chapters

The Collate chapters row describes whether individual document files—in the same directory—can be combined into a single work using built-in tooling without requiring them to reference each other. This approach, where a program handles the collation (e.g., via a command-line utility), simplifies the process and avoids a maintenance burden when adding or removing chapters. For example:

keenwrite.bin \
  --all \
  --chapters="1-12,15,19-" \
  -i chapter-01.md \
  -o user-guide.pdf ...
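To illustrate how a chapter specification like the one above might be interpreted, here is a small Python sketch. The semantics are an assumption, not documented KeenWrite behaviour: commas separate entries, `1-12` is an inclusive range, and a trailing hyphen as in `19-` means "from 19 onward". Only the forms that appear in the example are handled, and both function names are hypothetical.

```python
def parse_chapters(spec):
    """Turn a spec such as "1-12,15,19-" into a list of (first, last) pairs.

    A last value of None means the range is open-ended (assumed semantics).
    """
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-", 1)
            ranges.append((int(first), int(last) if last else None))
        else:
            ranges.append((int(part), int(part)))
    return ranges


def includes(ranges, chapter):
    """Report whether a chapter number falls inside any range."""
    return any(
        first <= chapter and (last is None or chapter <= last)
        for first, last in ranges
    )


ranges = parse_chapters("1-12,15,19-")
print([n for n in range(1, 22) if includes(ranges, n)])
```

Because the selection lives on the command line, adding or removing a chapter file requires no edits to the other chapters.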

Text

The Text rows refer to built-in directives. If the syntax does not offer built-in functionality, it is marked as N.

Note that Text, colour mixes content (what is written) and presentation (how the writing appears), and is therefore unsupported by KeenWrite. Instead, stylize inline classes using either CSS or a ConTeXt theme.

Text, underline is also unsupported by KeenWrite. Historically, underlining was developed to compensate for shortcomings in early typewriter technology, namely a lack of bold or italics. If users ache for that newsstand tabloid feel, they’ll need to look elsewhere. (Or define an inline class style.)

Text, inline classes refers to marking inline text with classes and identifiers to control its appearance. KeenWrite borrows pandoc’s bracketed spans syntax. That is:

[radiation]{#lethal .danger .glossary data-rems="450"}

produces:

<span id="lethal" class="danger glossary" data-rems="450">radiation</span>

Many online examples of inline styling demonstrate:

[radiation]{.red}

This is neither future proof nor semantically clear, resulting in Markdown that mixes presentation into content. Changing the colour from red to yellow should not require updating the content. Had .warning been used, we would only need to update the stylesheet or presentation layer at a single location.
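The bracketed span conversion shown earlier can be approximated with a short Python sketch. It handles only the simple case of `#id`, `.class`, and `key="value"` tokens with no nested brackets; the function names are hypothetical, and neither pandoc nor KeenWrite is implemented this way.

```python
import html
import re

# [text]{attributes}; nested brackets are not handled in this sketch.
SPAN = re.compile(r"\[([^\]]+)\]\{([^}]*)\}")
# One attribute token: #id, .class, or key="value" / key=value.
TOKEN = re.compile(r'#([\w:-]+)|\.([\w-]+)|([\w-]+)=(?:"([^"]*)"|(\S+))')


def render_span(match):
    """Build a <span> element from one bracketed span match."""
    text, attributes = match.group(1), match.group(2)
    identifier, classes, pairs = None, [], []
    for token in TOKEN.finditer(attributes):
        ident, cls, key, quoted, bare = token.groups()
        if ident:
            identifier = ident
        elif cls:
            classes.append(cls)
        elif key:
            pairs.append((key, quoted if quoted is not None else bare))

    parts = []
    if identifier:
        parts.append(f'id="{html.escape(identifier)}"')
    if classes:
        parts.append(f'class="{html.escape(" ".join(classes))}"')
    for key, value in pairs:
        parts.append(f'{key}="{html.escape(value)}"')
    opening = " ".join(["span"] + parts)
    return f"<{opening}>{html.escape(text)}</span>"


def convert_spans(markdown):
    """Replace every bracketed span in the text with its HTML equivalent."""
    return SPAN.sub(render_span, markdown)


print(convert_spans('[radiation]{#lethal .danger .glossary data-rems="450"}'))
print(convert_spans("[radiation]{.warning}"))
```

Note that the class name passes through unchanged, which is why a semantic name such as `.warning` keeps the colour decision in the stylesheet.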

Diagram code blocks

KeenWrite distinguishes between diagrams and source code listings by marking diagrams with a diagram- prefix, such as:

``` diagram-plantuml
@startuml
Alice -> Bob: Hello
@enduml
```

This has a few benefits. First, it allows rendering diagram types using a service without having to codify all possible diagram types. Second, when a new type of diagram is added, it’s available immediately without needing to upgrade the text editor. Third, it clearly distinguishes between a code block to be rendered pictorially and one to be listed as verbatim source code.

Typically, source code is presented in code blocks that include the language name so that syntax highlighting can be applied:

``` c
#include <stdio.h>

int main(void) {
    printf( "hello, world\n" );
    return 0;
}
```

GitHub created a de facto standard that prevents parsers from dynamically distinguishing between a diagram to render and source code to list. The following fenced code block could be either a source code listing or a pie chart:

``` mermaid
pie
    title Turkish Empire Proportions, 1789
    "Asia" : 66
    "Africa" : 20
    "Europe" : 14
```

While mermaid-lang could help, it creates a special case that needs to be programmed into Markdown parsers. (Having both c-lang and c is redundant.) KeenWrite’s diagram- prefix side-steps the issue while keeping true to the human-readability nature of Markdown. For this reason, using mermaid alone for a fenced code block will list the code.
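The prefix rule is simple enough to express as a single check on the fence's info string. The Python sketch below is an illustration of the idea rather than KeenWrite's code; the function name `classify_fence` and the returned tuples are assumptions.

```python
DIAGRAM_PREFIX = "diagram-"


def classify_fence(info_string):
    """Return ("diagram", type) or ("listing", language) for a fence's info string."""
    # Only the first word names the type; attributes such as { width=400px } follow it.
    words = info_string.split()
    name = words[0] if words else ""
    if name.startswith(DIAGRAM_PREFIX):
        return ("diagram", name[len(DIAGRAM_PREFIX):])
    return ("listing", name)


for info in ("diagram-plantuml", "diagram-plantuml { width=400px }", "c", "mermaid"):
    print(info, "->", classify_fence(info))
```

No list of known diagram types appears anywhere in the function, so a new diagram type needs no parser change, and `mermaid` on its own is treated as a source listing.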

History

John Gruber created Markdown based on decades of syntax conventions that had evolved in email, mailing lists, Usenet newsgroups, and elsewhere. His original goal was to make a text format that could be easily converted to HTML while remaining human-readable. Instead of <em>emphasis</em>, we write *emphasis*. Gruber released a specification alongside a script that converts a Markdown document into valid HTML.

The format gained popularity among developers, leading to its adoption for project README files, which significantly boosted its widespread use.

However, ambiguities in the specification resulted in conversion software interpreting edge cases differently. This fragmentation prompted John MacFarlane to spearhead CommonMark in 2014, a Markdown standard that aims to eliminate those ambiguities.

In 2016, Vladimir Schneider pushed a commit for flexmark-java, an extensible CommonMark parsing framework for Java. On a similar timeline, Karl Tauber started developing a JavaFX Markdown editor. I forked Tauber’s work on Oct 17, 2016 to make KeenWrite because I wanted an editor that could easily reference interpolated variables throughout documents.