GitHub - ivanbelenky/nbdantic: humble structure validator for notebooks

fully typed context-free grammar for jupyter notebooks. declare expected structure, validate against it.

install

grammar

production   rule                                          note
Notebook     Cell* Sequence Cell_terminal?
Sequence     (Element …)+ | ε                              empty=True allows ε
Element      Markdown | Code | Sequence | Choice | Maybe
Markdown     cell<markdown>                                terminal
Code         cell<code>                                    terminal
Choice       Element | Element | …                         first match wins
Maybe        Element | ε
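
to make the rules above concrete, here is a minimal sketch of a matcher for this kind of grammar, run over a plain list of cell types. the classes Md, Py, Seq, Alt and Opt and the match function are hypothetical stand-ins invented for illustration. they are not nbdantic's classes, and the greedy, no-backtracking strategy is an assumption, not a description of how nbdantic parses.

```python
from dataclasses import dataclass
from typing import Optional, Union

# hypothetical stand-ins for illustration only; these are NOT nbdantic's classes


@dataclass
class Md:
    """terminal: matches exactly one markdown cell"""


@dataclass
class Py:
    """terminal: matches exactly one code cell"""


@dataclass
class Seq:
    items: list          # the group of elements that repeats
    empty: bool = False  # when True, matching nothing (ε) is allowed


@dataclass
class Alt:
    options: list  # tried in order, first match wins


@dataclass
class Opt:
    item: "Element"  # Element | ε


Element = Union[Md, Py, Seq, Alt, Opt]


def match(el: Element, cells: list, pos: int = 0) -> Optional[int]:
    """return the position after the match, or None if `el` fails at `pos`."""
    if isinstance(el, (Md, Py)):
        want = "markdown" if isinstance(el, Md) else "code"
        return pos + 1 if pos < len(cells) and cells[pos] == want else None
    if isinstance(el, Opt):
        end = match(el.item, cells, pos)
        return pos if end is None else end  # fall back to ε
    if isinstance(el, Alt):
        for option in el.options:
            end = match(option, cells, pos)
            if end is not None:
                return end  # first match wins
        return None
    if isinstance(el, Seq):
        reps = 0
        while True:
            cur: Optional[int] = pos
            for item in el.items:
                cur = match(item, cells, cur)
                if cur is None:
                    break
            if cur is None:
                break  # this repetition failed; keep what matched so far
            reps += 1
            if cur == pos:
                break  # zero-width repetition: stop to avoid looping forever
            pos = cur
        return pos if reps > 0 or el.empty else None
    raise TypeError(f"unknown element: {el!r}")


# title, imports, optional abstract, then repeated (heading, body) pairs
paper = Seq([Md(), Py(), Opt(Md()), Seq([Md(), Py()], empty=True)])
cells = ["markdown", "code", "markdown", "markdown", "code"]
print(match(paper, cells) == len(cells))  # whether every cell was consumed
```

because the sketch is greedy and never backtracks, an optional element always takes a cell when it can, even if a later element needed that cell. a real validator may resolve such ambiguities differently.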

grammar elements

type ParseElement = Markdown | Code | Sequence | Choice | Maybe

# Markdown and Code are self-explanatory
# Sequence is none, one, or more elements (enabling nesting of structures/sequences)
# Choice lets you pick from a set of Elements (these can be arbitrary sets of nested patterns)
# Maybe allows for expressing Optional<Element>

usage

import ast # for custom validation function

from nbdantic import Code, Markdown, Maybe, Sequence, JupyterNotebook
from nbdantic.validators import valid_python, not_empty, at_least


def only_imports(c):
    # custom validator: raise if the cell contains anything other than import statements
    tree = ast.parse(c)
    if not all(
        isinstance(n, (ast.Import, ast.ImportFrom, ast.alias, ast.Module)) for n in ast.walk(tree)
    ):
        raise ValueError("code should only contain import statements")


class Paper(JupyterNotebook):
    structure = Sequence("root", [
        Markdown("title", validators=[not_empty]),
        Code("imports", validators=[valid_python, only_imports]),
        Maybe("abstract", Markdown("abstract_md")),
        Sequence("sections", [
            Markdown("heading"),
            Code("body", validators=[valid_python]),
        ], validators=[at_least(2)], empty=True),
    ])

result = Paper.from_file("paper.ipynb").validate()
result.raise_if_failed()
result.cells_map
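
custom validators can follow the same shape as only_imports above. the sketch below assumes, based on that example only, that a validator is a plain callable that receives the cell source as a string and signals failure by raising ValueError. no_star_imports is a hypothetical name, not part of nbdantic.

```python
import ast


def no_star_imports(source: str) -> None:
    """hypothetical validator: reject `from x import *` in a code cell."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        # a star import shows up as an ImportFrom whose alias name is "*"
        if isinstance(node, ast.ImportFrom) and any(a.name == "*" for a in node.names):
            raise ValueError(
                f"star import from {node.module!r} on line {node.lineno}"
            )
```

under the same assumption, it would be passed next to the others, e.g. validators=[valid_python, no_star_imports].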

extra features

built-in validators

nbdantic.validators ships some common checks. some are simple, example-like ones; others may prove more useful, such as:

  • ty_check
  • ruff_check
  • line_warning

script runner

nbdantic.script_runner.UVScriptRunner spins up isolated uv environments for running code, linting, type checking, whatever you need. validators like ruff_check and ty_check use it under the hood.

from nbdantic.script_runner import UVScriptRunner

with UVScriptRunner(packages=["numpy>=1.24"], python_version="3.13") as runner:
    result = runner.oneshot_python("import numpy; print(numpy.__version__)")
    print(result.stdout)
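
for intuition, running code in a throwaway uv environment can be approximated with the uv CLI directly. the sketch below is not how UVScriptRunner is implemented (its internals are not described here); uv_command and run_isolated are hypothetical helpers, and the flags used (uv run with --no-project, --python and --with) are an assumption about what a comparable invocation looks like.

```python
import subprocess


def uv_command(code: str, packages: list, python_version: str) -> list:
    """build a `uv run` invocation that executes `code` with extra packages."""
    cmd = ["uv", "run", "--no-project", "--python", python_version]
    for spec in packages:
        cmd += ["--with", spec]  # one --with per requirement specifier
    return cmd + ["python", "-c", code]


def run_isolated(code: str, packages: list, python_version: str):
    """run the command and capture its output; requires uv on PATH."""
    return subprocess.run(
        uv_command(code, packages, python_version),
        capture_output=True,
        text=True,
        check=False,
    )
```

calling run_isolated("import numpy; print(numpy.__version__)", ["numpy>=1.24"], "3.13") would then mirror the oneshot_python call above, with the output available on the returned object's stdout attribute.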

notes

pep-508/440 parsing backed by uv-pep508 & uv-pep440 crates via pyo3. Thanks uv for everything.