markus
Vibe coded single-header C++20 Markdown parser that converts Markdown to HTML. Implements the CommonMark specification with zero external dependencies.
Vibe coded with Claude Opus 4.5
Caution
DO NOT USE IN PRODUCTION. This project is entirely vibe coded and has not been reviewed for memory safety. It is also 2-3x slower than cmark.
Features
- Single Header: Just include markus.h; no linking required
- Zero Dependencies: Uses only the C++20 standard library
- CommonMark Compliant: Passes the full CommonMark spec test suite (655 tests)
- High Performance: Arena allocator, SIMD-friendly algorithms, and lookup tables
- Full Unicode Support: UTF-8 encoding/decoding, case folding, punctuation detection
- AST Access: Parse to an Abstract Syntax Tree for inspection or custom rendering
- Google Style: Clean, readable codebase following Google C++ Style Guide
Quick Start
Basic Usage
    #include <iostream>
    #include <string>

    #include "markus.h"

    int main() {
      std::string markdown = "# Hello, World!\n\nThis is **bold** and *italic*.";
      std::cout << markus::MarkdownToHtml(markdown);
      return 0;
    }

Output:

    <h1>Hello, World!</h1>
    <p>This is <strong>bold</strong> and <em>italic</em>.</p>
AST Access
    #include <iostream>
    #include <string>

    #include "markus.h"

    int main() {
      std::string markdown = "# Title\n\nParagraph with [a link](https://example.com).";

      // Parse to AST
      markus::Document doc = markus::Parse(markdown);

      // Inspect the AST
      std::cout << markus::DebugAst(doc);

      // Render to HTML
      std::cout << markus::RenderHtml(doc);
      return 0;
    }
API Reference
Core Functions
| Function | Description |
|---|---|
| markus::MarkdownToHtml(input) | Convert Markdown string to HTML |
| markus::Parse(input) | Parse Markdown to AST (returns Document) |
| markus::RenderHtml(doc) | Render AST to HTML |
| markus::DebugAst(doc) | Get a debug string representation of the AST |
AST Node Types
Block Nodes
| Type | Description |
|---|---|
| Document | Root node containing all blocks |
| Paragraph | Text paragraph |
| Heading | ATX heading (levels 1-6) |
| ThematicBreak | Horizontal rule (---, ***, ___) |
| CodeBlock | Fenced or indented code block |
| HtmlBlock | Raw HTML block |
| BlockQuote | Block quotation |
| List | Ordered or unordered list |
| ListItem | Item within a list |
Inline Nodes
| Type | Description |
|---|---|
| Text | Plain text content |
| SoftBreak | Line break within a paragraph |
| HardBreak | Explicit line break (<br />) |
| Code | Inline code span |
| Emphasis | Emphasized text (*text* or _text_) |
| Strong | Strong emphasis (**text** or __text__) |
| Link | Hyperlink |
| Image | Image |
| HtmlInline | Raw inline HTML |
Building
With Bazel
    # Build the library and CLI tool
    bazel build main

    # Run the CLI tool
    echo "# Hello" | bazel-bin/main
With Other Build Systems
Since markus is a header-only library, simply add markus.h to your include
path and ensure you're compiling with C++20 support:
    # GCC/Clang
    g++ -std=c++20 -O2 your_program.cc -o your_program

    # MSVC
    cl /std:c++20 /O2 your_program.cc
Command-Line Tool
The included main.cc provides a simple CLI for converting Markdown:
    # Convert Markdown from stdin to HTML
    echo "**bold** text" | bazel-bin/main
    # Output: <p><strong>bold</strong> text</p>

    # Print the AST instead of HTML
    echo "# Title" | bazel-bin/main --ast
    # Output:
    # Document
    # Heading (level 1)
    # Text: "Title"
Testing
The test suite uses the official CommonMark spec tests:
    # Run all 655 CommonMark spec tests
    ./run_tests.sh

Supported Markdown Features
Block Elements
- ATX headings (# H1 through ###### H6)
- Setext headings (underlined with === or ---)
- Paragraphs
- Block quotes (>)
- Ordered lists (1., 2), etc.)
- Unordered lists (-, *, +)
- Fenced code blocks (``` or ~~~)
- Indented code blocks
- Thematic breaks (---, ***, ___)
- Raw HTML blocks
Inline Elements
- Emphasis (*italic* or _italic_)
- Strong emphasis (**bold** or __bold__)
- Code spans (`code`)
- Links ([text](url) and [text][ref])
- Images (![alt](url))
- Autolinks (<https://example.com>)
- Hard line breaks (trailing spaces or \)
- HTML entities (&amp;, &#123;, &#x7B;)
- Raw inline HTML
- Backslash escapes
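As an illustration of the backslash-escape rule, here is a minimal sketch, not code taken from markus.h. CommonMark allows a backslash to escape only ASCII punctuation characters, so a 256-entry lookup table is enough to classify the character that follows. The names `MakeAsciiPunctuationTable`, `kIsAsciiPunctuation`, and `UnescapeBackslashes` are hypothetical, and the sketch ignores contexts such as code spans where escapes do not apply.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper names; not taken from markus.h.

// Builds a 256-entry table marking the ASCII punctuation characters,
// the only characters CommonMark allows to be backslash-escaped.
constexpr std::array<bool, 256> MakeAsciiPunctuationTable() {
  std::array<bool, 256> table{};
  constexpr std::string_view kPunctuation =
      "!\"#$%&'()*+,-./:;<=>?@[\\]^_`{|}~";
  for (char c : kPunctuation) {
    table[static_cast<unsigned char>(c)] = true;
  }
  return table;
}

inline constexpr std::array<bool, 256> kIsAsciiPunctuation =
    MakeAsciiPunctuationTable();

// Drops the backslash from every valid escape sequence and leaves all
// other backslashes in place. Simplified: applies to the whole input.
std::string UnescapeBackslashes(std::string_view input) {
  std::string out;
  out.reserve(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    const bool is_escape =
        input[i] == '\\' && i + 1 < input.size() &&
        kIsAsciiPunctuation[static_cast<unsigned char>(input[i + 1])];
    if (is_escape) {
      ++i;  // Skip the backslash; the escaped character is copied below.
    }
    out.push_back(input[i]);
  }
  return out;
}
```

Calling `UnescapeBackslashes` on `\*literal\*` yields `*literal*`, while a backslash before a letter, or one at the end of the input, is kept unchanged.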
Link References
    [link text][ref]

    [ref]: https://example.com "Optional Title"
Performance
Markus is optimized for speed through several techniques:
- Arena Allocator: Uses std::pmr::monotonic_buffer_resource with a 128 MiB pre-allocated buffer to minimize allocation overhead
- String Views: Avoids unnecessary string copies using std::string_view
- Lookup Tables: O(1) character classification for punctuation, whitespace, and HTML escaping
- SIMD-Friendly Algorithms: Scanning functions process 8 bytes at a time for auto-vectorization
- Compact Node Storage: AST nodes use 32-bit IDs instead of pointers
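The arena and 32-bit ID techniques can be combined roughly as follows. This is a sketch under stated assumptions rather than the layout markus.h actually uses: `NodeId`, `kNoNode`, `Node`, and `Tree` are hypothetical names, and the arena defaults to 64 KiB here instead of the 128 MiB mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory_resource>
#include <string_view>
#include <vector>

// Hypothetical types for illustration; markus.h may be organized differently.
using NodeId = std::uint32_t;
inline constexpr NodeId kNoNode = 0xFFFFFFFFu;

struct Node {
  std::string_view text;         // View into the source; nothing is copied.
  NodeId first_child = kNoNode;  // Indices into the node vector, not pointers.
  NodeId last_child = kNoNode;
  NodeId next_sibling = kNoNode;
};

class Tree {
 public:
  // Assumption: a small default buffer; the arena grows if it runs out.
  explicit Tree(std::size_t arena_bytes = 64 * 1024)
      : arena_(arena_bytes), nodes_(&arena_) {}

  // Appends a node and links it as the last child of `parent`, if given.
  NodeId Add(std::string_view text, NodeId parent = kNoNode) {
    const NodeId id = static_cast<NodeId>(nodes_.size());
    nodes_.push_back(Node{text});
    if (parent != kNoNode) {
      Node& p = nodes_[parent];
      if (p.first_child == kNoNode) {
        p.first_child = id;
      } else {
        nodes_[p.last_child].next_sibling = id;
      }
      p.last_child = id;
    }
    return id;
  }

  const Node& Get(NodeId id) const { return nodes_[id]; }

  // Walks the sibling chain to count the direct children of `id`.
  std::size_t CountChildren(NodeId id) const {
    std::size_t count = 0;
    for (NodeId c = nodes_[id].first_child; c != kNoNode;
         c = nodes_[c].next_sibling) {
      ++count;
    }
    return count;
  }

 private:
  // Declared before nodes_ so it outlives the vector that allocates from it.
  // Deallocation is a no-op; memory is released when the arena is destroyed.
  std::pmr::monotonic_buffer_resource arena_;
  std::pmr::vector<Node> nodes_;
};
```

Because nodes refer to each other by index, the links stay valid when the vector reallocates, and each link costs 4 bytes instead of 8 on a 64-bit target.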
Unicode Support
- Full UTF-8 encoding and decoding
- Unicode-aware case folding for link label matching
- Unicode punctuation detection (P and S categories)
- Unicode whitespace handling (Zs category)
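To make the UTF-8 decoding step concrete, here is a minimal sketch of a strict decoder. It is an illustration, not the routine in markus.h; `DecodeOne`, `DecodeUtf8`, and `kReplacement` are hypothetical names, and mapping malformed bytes to U+FFFD one byte at a time is an assumed policy.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Hypothetical names; the error policy (U+FFFD per bad byte) is an assumption.
inline constexpr char32_t kReplacement = 0xFFFD;

// Decodes one code point starting at `pos` and advances `pos` past it.
// Requires pos < s.size(). Malformed input yields U+FFFD and consumes one byte.
char32_t DecodeOne(std::string_view s, std::size_t& pos) {
  const auto at = [&](std::size_t i) {
    return static_cast<unsigned char>(s[i]);
  };
  const unsigned char lead = at(pos);
  if (lead < 0x80) {  // Single-byte (ASCII) sequence.
    ++pos;
    return lead;
  }

  std::size_t length = 0;
  char32_t cp = 0;
  char32_t min = 0;  // Smallest value that legitimately needs `length` bytes.
  if ((lead & 0xE0) == 0xC0) {
    length = 2; cp = lead & 0x1F; min = 0x80;
  } else if ((lead & 0xF0) == 0xE0) {
    length = 3; cp = lead & 0x0F; min = 0x800;
  } else if ((lead & 0xF8) == 0xF0) {
    length = 4; cp = lead & 0x07; min = 0x10000;
  } else {  // Stray continuation byte or invalid lead byte.
    ++pos;
    return kReplacement;
  }

  if (pos + length > s.size()) {  // Truncated sequence.
    ++pos;
    return kReplacement;
  }
  for (std::size_t i = 1; i < length; ++i) {
    const unsigned char c = at(pos + i);
    if ((c & 0xC0) != 0x80) {  // Not a continuation byte.
      ++pos;
      return kReplacement;
    }
    cp = (cp << 6) | (c & 0x3F);
  }

  // Reject overlong encodings, surrogates, and values above U+10FFFF.
  if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
    ++pos;
    return kReplacement;
  }
  pos += length;
  return cp;
}

// Decodes a whole string into code points.
std::vector<char32_t> DecodeUtf8(std::string_view s) {
  std::vector<char32_t> out;
  std::size_t pos = 0;
  while (pos < s.size()) {
    out.push_back(DecodeOne(s, pos));
  }
  return out;
}
```

The decoded code points are what later steps such as case folding or punctuation classification would operate on; those steps need Unicode data tables and are not shown here.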
Requirements
- C++20 compiler (GCC 10+, Clang 10+, MSVC 2019+)
- Standard library with <memory_resource> support
License
MIT License - see LICENSE.md
Acknowledgments
- CommonMark for the Markdown specification
- cmark for the reference implementation and test suite