Testing AI/LLMs to see how well they can build a website from scratch
I wrote an open source command shell wrapper for LLMs to use, and I'm testing it by having them write a website from scratch, comparing GPT-4 and Claude 3 Opus. The results: pretty similar. Both can write a PHP page with components, styles, and active elements like a fun quiz on one of the pages.
Video: https://www.youtube.com/watch?v=4TnaJLq0E-o
Article: https://naisys.org/articles/3-dollar-website-gpt4-vs-claude-opus
GitHub repo: https://github.com/swax/NAISYS
A real test would be to invent a framework and see if it can do anything on the new framework. Otherwise it’s just a search engine that amalgamates pieces.
Amalgamating the pieces is the test. LLMs really can't do it that well. Is it context size, training, number of iterations, number of agents? That's what's being tried to improve the results.