GitHub - 2020science/ai-in-films-corpus: An annotated corpus of 169 feature films (1927–2026, international) in which AI is a primary plot driver. Developed specifically to explore how AI as plot driver is associated with different imagined futures


A curated international corpus of feature films (1927–2026) in which artificial intelligence is a primary plot driver. Each entry is annotated for the kind of future the film depicts and how AI is portrayed within it, supporting comparative reading across decades, regions, and visions of what AI might mean for the human prospect.

169 films · 31 countries · 21 languages · 1927–2026

Led by Andrew Maynard working with Claude Code using Opus 4.7 Max

Associated Substack post: https://www.futureofbeinghuman.com/p/ai-movies-may-be-less-dystopian-than-we-think

May 14, 2026

What's in the repo

| File | What it is |
| --- | --- |
| corpus.json | The corpus: 169 film entries, fully annotated |
| corpus-viewer.html | A single-file browser viewer for the corpus (no server required) |
| codebook.md | The classification codebook: futures orientation (8 categories) and AI portrayal (4 categories) |
| schema.md | Comprehensive description of every JSON field |
| methodology.md | How the corpus was built: inclusion criteria, sources, stages, audit, taxonomy development |
| futures-landscape.md | The intellectual landscape behind the futures taxonomy: futures studies, SF criticism, and AI-narrative literature |
| bibliography.md | A loose bibliography of works that informed the corpus and its taxonomies |
| LICENSE | CC BY 4.0 |
| CITATION.cff | Machine-readable citation metadata for GitHub and academic citation managers |

Using the viewer

Open corpus-viewer.html in a browser. If served alongside corpus.json (e.g. via a local web server), the viewer auto-loads the corpus; otherwise it accepts the JSON via drag-and-drop or file picker.

The simplest local serve, from the repo root:

```shell
python3 -m http.server 8000
```

Then open http://localhost:8000/corpus-viewer.html.

The viewer supports full-text search, filtering by country / language / decade / franchise / futures-orientation / AI portrayal, click-to-filter on tags and badges, and hoverable definitions for the two classification dimensions.
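
The same kind of filter can be reproduced outside the viewer. As a rough sketch, the decade filter might look like the following in Python. The `year` and `title` keys are assumptions made for illustration (check schema.md for the real field names), and the sample entries are made-up placeholders, not films from the corpus.

```python
def decade_of(year):
    """Map a release year to the start of its decade, e.g. 1927 -> 1920."""
    return (year // 10) * 10


def filter_by_decade(corpus, decade, year_key="year"):
    """Keep entries whose release year falls in the given decade.

    year_key is an assumed field name; schema.md documents the real one.
    Entries without an integer year are skipped rather than guessed at.
    """
    return [
        film for film in corpus
        if isinstance(film.get(year_key), int) and decade_of(film[year_key]) == decade
    ]


# Made-up placeholder entries, not taken from corpus.json.
sample = [
    {"title": "Example A", "year": 1927},
    {"title": "Example B", "year": 2024},
    {"title": "Example C", "year": 2026},
    {"title": "Example D"},  # no year recorded
]

print([film["title"] for film in filter_by_decade(sample, 2020)])
```

To run this against the real data, load corpus.json with `json.load` and pass the resulting list in place of `sample`, adjusting `year_key` if the schema uses a different name.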

It's also available directly (with the current version of the corpus) at https://andrewmaynard.net/aimoviefutures/

Working with the corpus directly

The corpus is a single JSON array. Every entry follows the same schema (see schema.md). Quick example — filter to films classified as Protopia:

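```python
import json

# Load the corpus: a single JSON array of film entries.
with open('corpus.json', encoding='utf-8') as f:
    corpus = json.load(f)

# Keep films whose primary futures orientation is 'Protopia'.
# Chained .get() calls tolerate entries that lack any of these keys.
protopia = [film for film in corpus
            if film.get('analyses', {}).get('futures_orientation', {}).get('primary') == 'Protopia']
print(f"{len(protopia)} Protopia films")
```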
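
Building on the filter above, a minimal sketch of tallying films across every primary futures-orientation label rather than just one. It relies only on the `analyses.futures_orientation.primary` path used in the example above. The helper names and the sample entries are illustrative, and "Placeholder" is not a real codebook category.

```python
import json
from collections import Counter


def load_corpus(path="corpus.json"):
    """Load the corpus, which the README describes as a single JSON array."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def primary_orientation(film):
    """Return the primary futures-orientation label, or None if it is missing."""
    return film.get("analyses", {}).get("futures_orientation", {}).get("primary")


def tally_orientations(corpus):
    """Count films per primary futures-orientation label."""
    return Counter(primary_orientation(film) or "(unclassified)" for film in corpus)


# Tiny made-up sample so the sketch runs without the real file.
# "Placeholder" is NOT a real codebook category; only "Protopia" appears in this README.
sample = [
    {"analyses": {"futures_orientation": {"primary": "Protopia"}}},
    {"analyses": {"futures_orientation": {"primary": "Protopia"}}},
    {"analyses": {"futures_orientation": {"primary": "Placeholder"}}},
    {},  # entry with no classification at all
]

for label, count in tally_orientations(sample).most_common():
    print(f"{label}: {count}")
```

With the real file in the working directory, `tally_orientations(load_corpus())` should give the distribution over the codebook's eight futures-orientation categories.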

Citing

If you use this corpus, please cite as:

Maynard, Andrew (2026). AI-in-Films Corpus. Version 1.0. https://github.com/[username]/ai-in-films-corpus

License

Released under Creative Commons Attribution 4.0 International (CC BY 4.0) — share and adapt freely with attribution. See LICENSE for the full text.

Author and context

Compiled by Andrew Maynard. The corpus and accompanying analyses underpin a planned 2026 Substack essay on AI in films at andrewmaynard.substack.com.

Status and limitations

  • The corpus reflects what was findable in publicly available sources at the time of compilation. Coverage is broader than typical English-language lists but not exhaustive — every analyst will know of films that were missed, and additions are welcome.
  • Classifications are interpretive judgments grounded in a documented codebook (codebook.md). Reasonable analysts will disagree on individual films, particularly those near category boundaries; the per-film justification field is provided to make each call inspectable.
  • The futures-orientation taxonomy focuses on the trajectory the AI makes legible in the film, not the strict end-state. See the codebook for the rationale and the trade-offs.
  • The critical_context field aims to point at the most relevant scholarship for each film. It was iteratively audited (see methodology.md), but bibliographic precision in this field should be treated as a guide rather than a verified citation list — readers should consult the named sources directly.
  • While the corpus was developed under the direction of Andrew Maynard, errors may have crept in through the use of Claude. Please treat with caution.
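
Because the classifications are meant to be inspectable, one way to review them is to print each film's label next to its justification. The sketch below is only illustrative. The path to the justification (`analyses.futures_orientation.justification`) and the `title` key are assumptions, so confirm both against schema.md. The sample entries are placeholders, not real corpus data.

```python
def get_path(entry, dotted_path, default=None):
    """Follow a dotted path such as 'a.b.c' through nested dicts."""
    node = entry
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


def review_rows(corpus, label_path, justification_path, title_key="title"):
    """Pair each film's classification with its justification for review."""
    return [
        (
            film.get(title_key, "(untitled)"),
            get_path(film, label_path),
            get_path(film, justification_path),
        )
        for film in corpus
    ]


# Placeholder entries; the 'justification' location is an assumed layout.
sample = [
    {"title": "Example A",
     "analyses": {"futures_orientation": {
         "primary": "Protopia",
         "justification": "Placeholder rationale text."}}},
    {"title": "Example B",
     "analyses": {"futures_orientation": {"primary": "Protopia"}}},
]

rows = review_rows(
    sample,
    "analyses.futures_orientation.primary",        # path used in this README
    "analyses.futures_orientation.justification",  # assumed path
)
for title, label, why in rows:
    print(f"{title} [{label}]: {why or 'no justification found at this path'}")
```

Passing the paths as arguments keeps the sketch usable if the real schema stores the justification elsewhere.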

Contributing and corrections

While this corpus will only be updated occasionally, please feel free to replicate and expand on it.