Show HN: MegaHAL in Pure SQL

github.com

1 points by tgies 2 months ago · 1 comment · 1 min read

I ported Jason Hutchens' 1998 Loebner Prize-winning chatbot, MegaHAL, to run entirely inside PostgreSQL, in pure SQL. The entire lifecycle -- tokenization, learning, keyword extraction, Markov chain generation, and entropy scoring -- is implemented in standard SQL using complex CTEs. There is no PL/pgSQL or any other sort of procedural escape hatch.

Learning is a single ~560-line SQL statement that splits text, interns symbols, and updates two 5th-order Markov tries (forward and backward) using depth-unrolled writable CTEs. Inference is a recursive query that generates N candidate replies in parallel. It performs bidirectional weighted random walks, evaluates them for information-theoretic surprise, and formats the winner as a sentence-cased string.
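To make the learning step concrete outside of SQL, here is a minimal Python sketch of the same idea: count which symbol follows each context in a forward model, and do the same over the reversed token list for a backward model. It is not the project's implementation. The name `learn`, the dict-of-dicts layout, whitespace tokenization, and the reduced order of 2 (the post uses order 5) are all assumptions made for illustration, and the sketch omits symbol interning.

```python
from collections import defaultdict

ORDER = 2  # assumption: the post uses order 5; a smaller order keeps the example readable


def learn(tokens, forward, backward, order=ORDER):
    """Count the symbol that follows every context of length 0..order, in both directions."""
    for model, seq in ((forward, tokens), (backward, tokens[::-1])):
        for i, symbol in enumerate(seq):
            for depth in range(order + 1):
                if i - depth < 0:
                    break  # not enough history for a context this long
                context = tuple(seq[i - depth:i])
                model[context][symbol] += 1


# Hypothetical storage: context tuple -> {next symbol -> count}
forward = defaultdict(lambda: defaultdict(int))
backward = defaultdict(lambda: defaultdict(int))

learn("the cat sat on the mat".split(), forward, backward)

print(dict(forward[("the",)]))   # symbols seen after "the"
print(dict(backward[("sat",)]))  # symbols seen before "sat"
```

Each context depth here corresponds loosely to one level of the trie; the SQL version unrolls those levels into separate writable CTEs instead of looping.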

I provided a `docker-compose.yml` and a convenient Python driver script so you can try it out quickly, and there's also a web-based demo where I bundled it with PGlite (WASM PostgreSQL) at https://tgies.github.io/megahal-sql/. These are provided for convenience, but you can also just run the schema initialization SQL and `SELECT megahal_converse('hello from hn.');`
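If you would rather call it from your own code than use the bundled driver, something like the following should be enough. This is a sketch, not the driver script from the repository: `converse` is a hypothetical helper, it assumes a DB-API 2.0 cursor that uses `%s` placeholders (as psycopg does), and it assumes `megahal_converse` returns a single text value.

```python
def converse(cursor, text):
    """Send one line of input to MegaHAL and return its reply.

    `cursor` is assumed to be a DB-API 2.0 cursor on a database where the
    schema initialization SQL has already been run.
    """
    # Pass the input as a bound parameter rather than formatting it into the SQL.
    cursor.execute("SELECT megahal_converse(%s)", (text,))
    row = cursor.fetchone()
    return row[0]
```

With a psycopg connection you would pass `conn.cursor()` as the first argument; the connection settings depend on how you started PostgreSQL.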

vunderba 2 months ago

Nice job! I corresponded with Hutchens back in the day about MegaHAL. What made it stand out compared to other Loebner chatbots was that it didn't just zoom in on a couple of keywords in the user's input and then run a forward-only chain. Instead, it built both forward and backward Markov models, generated text in both directions, and calculated entropy/surprise to produce a novel response.
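For anyone unfamiliar with the surprise part, a rough Python sketch of the selection step: score each candidate reply by how improbable its keywords are under the model, then keep the highest scorer. This is a simplification and an assumption on my part, not MegaHAL's exact formula. It uses only a forward model with a one-symbol context, skips unseen transitions, and the names `surprise`, `model`, and `candidates` are made up for the example.

```python
import math


def surprise(tokens, model, keywords, order=1):
    """Sum -log2 P(token | preceding context) over the keyword tokens only."""
    total = 0.0
    for i, token in enumerate(tokens):
        if token not in keywords:
            continue
        context = tuple(tokens[max(0, i - order):i])
        counts = model.get(context, {})
        if counts.get(token, 0) == 0:
            continue  # unseen transition: ignored in this sketch
        total += -math.log2(counts[token] / sum(counts.values()))
    return total


# Hypothetical toy model: after "the", "dog" was seen 3 times and "cat" once.
model = {("the",): {"cat": 1, "dog": 3}}
keywords = {"cat", "dog"}
candidates = [["the", "dog"], ["the", "cat"]]

# The rarer continuation carries more information, so it wins.
best = max(candidates, key=lambda c: surprise(c, model, keywords))
print(" ".join(best))
```

The point is that the common continuation scores low and the rare one scores high, which is what pushes replies away from the most predictable chain.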
