How to Vibe Code a Logomaker ✨ in 10 Days: LLMs — Can They Build It?


What happens when multiple LLMs (GPT-4o, Gemini Pro 2.5, Claude 3.7) get trapped in a digital prison to make a boring logo design tool?

See the full 35 minute article here (oh yeah!).

Down the latest rabbit hole of large language models and GPTs, the web is enduring its newest onslaught of trending AI content: apps.


Apps and sites created with “vibe coding”, curated by https://madewithvibes.com, also made with vibe coding.

First off, vibers

Kelly Kapoor

I see UIs and UXes created within days (or hours) by beginner coders that, at the same skill level just 3-4 years ago, I couldn't have gotten 20% of the way toward in 10x the time (GPT-3 was first released in 2020).

These people can simply write up a prompt and get LLMs to code, then ask how to actually test the code they're given, describe the results back to the LLM, get more code, get more instructions on how to test, and regurgitate and repeat until usability is achieved. You don't need to know even basic coding concepts (though at some point in repeating instructions you're going to start learning!). And there are seasoned devs mirroring the same experience vibe coding, barely looking at what the LLM gives back and just plugging it into systems or pushing to deploy.
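The whole loop fits in a few lines of code itself. Below is a rough sketch only, assuming the OpenAI Node SDK and a gpt-4o model name (both are my illustrative choices, not anything from the project); the "testing" step is still a human copying code out, running it, and pasting the results back in.

// A rough sketch of the vibe coding loop: human describes, LLM codes,
// human runs it and reports back, until it works (or credits run out).
import OpenAI from "openai";
import readline from "node:readline/promises";

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment
const rl = readline.createInterface({ input: process.stdin, output: process.stdout });

const messages = [
  { role: "system", content: "Write code for whatever is asked, and explain how to test it." },
];

while (true) {
  const turn = await rl.question("Describe the feature, or paste your test results > ");
  messages.push({ role: "user", content: turn });

  const res = await client.chat.completions.create({ model: "gpt-4o", messages });
  const reply = res.choices[0].message;
  messages.push(reply); // keep the whole conversation as context

  console.log(reply.content); // the human copies the code out, runs it, and repeats
}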

That was me building Logomaker over the course of ten days. What would happen if I guided the LLMs with my technical expertise but never bothered writing or even editing any code myself? Did I suggest when to refactor? I wasn't against it, but I never pushed for it unless the LLMs did. Armed with just GitHub Copilot and the web UIs for the LLMs mentioned, Logomaker was grotesquely cobbled together, a Frankenstein's monster-esque creation in JavaScript (not TypeScript), to see what the current capabilities of generative code are.

An experiment demonstrating for sure that vibe coding is definitely some way of building software. Programmatically.

When Stable Diffusion and language models that could roleplay, write, and visualize any prompt came out, a toddler could babble a vague description of what they wanted to see and the AI would output something that actually looks professional. It may not be artisan, but AI now takes the place of "handwritten" or "handmade" content every day. Artists, writers, and creatives alike get better or get drowned out in waves of AI creations.

Going mad in LLMWonderland

How do you make sure LLMs are good at coding? LLMs in general are pretty decent (to say the least) at most everything, but I want functioning software, something a lot of human devs can't manage.

You can use a fine-tuned model designed for a specific type of task, but that adds another potential barrier between your integrations and the language model, and it typically isn't feasible: the public catalog is limited, and fine-tuning a model is expensive and compute-intensive.

You can also try giving the LLM examples in the prompt itself, aka, prompt engineering.

The thing about examples and LLMs: when you have few or limited ones, you run into constraints that parallel the very feature empowering one-shot or few-shot learning.

Showing an LLM how you want something done with guided examples just makes it better at doing that task or related ones. It doesn't generalize a higher-level framework of thinking that would allow it to be broadly better.
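To make that concrete, here's roughly what few-shot prompting looks like through an API. This is a minimal sketch, assuming the OpenAI Node SDK; the model name, the styling task, and the example passages are all my placeholders, not anything from the experiment. You're literally just stuffing worked examples into the message history and hoping the pattern sticks:

// Few-shot prompting: show the model examples of the transformation you want,
// then give it the real input. It gets better at this task, not at tasks in general.
import OpenAI from "openai";

const client = new OpenAI();

const completion = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "system", content: "Rewrite plain sentences in the requested author's style." },
    // Example 1 (hand-written demonstration)
    { role: "user", content: "Austen: The party was boring." },
    { role: "assistant", content: "It is a truth universally acknowledged that the evening offered little in the way of amusement." },
    // Example 2
    { role: "user", content: "Austen: He left without saying goodbye." },
    { role: "assistant", content: "He quit the room with a haste that spared him the civility of a farewell." },
    // The actual request
    { role: "user", content: "Austen: The logo generator finally exported a PNG." },
  ],
});

console.log(completion.choices[0].message.content);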


No, I want you to write as good as Jane Austen does, not like her!


Not the best example but it illustrates the overarching problem of genAI.

You can't have a model be really good at some particular things and also be even just kinda good at generalizing / extrapolating.

I could keep going with this writing-style prompt, give it more authors and passages and really switch it up. Vonnegut, King, Palahniuk. But all the LLMs do is attempt to adapt to every one of these styles at once, not generalize into an actual peer to them. Even if you ask.


What exactly is this amalgamation? Definitely not what we’re looking for.

Raging the machine

Sometimes I write open-source tools. A good habit is ensuring marketability through branding, as that secures usability, or at least the potential for it.

So a "free" (I'll take like 3 minutes of ads, man!) logo maker where I could play around with some typefaces is what I (and a lot of other Googlers) wanted. But every thread of recommendations led to a subscription-based platform that, of course, extended way beyond what I was actually seeking.

And if you haven't realized by now, that's exactly the type of work vibe coding is exceptional at building: simple, clean (if dirty under the hood), functional experiences that are non-intrusive. Once your apps and sites have a dreadful enough user experience, everybody in the world will just start vibe coding.

With just a single prompt, I was able to get a functional logo-making demo, exporting options and all: https://gist.github.com/jddunn/48bc03f3a9f85ffd8ccf90c801f6cf93.


The “V1” of Logomaker, all written from a single prompt generation in Aider

This excerpt shows the LLM "generating" the correct links for fonts (as well as other dependencies like https://cdnjs.cloudflare.com/ajax/libs/gif.js/0.2.0/gif.worker.js) at line 869, the start of the inline CSS the logo creator applies via UI selection, and an excerpt of the exporting logic. Even the latest SHA hash of a linked CDN library is intact.

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Logo Generator</title>
  <!-- Extended Google Fonts API -->
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
  <link href="https://fonts.googleapis.com/css2?family=Orbitron:wght@400;500;700;900&family=Audiowide&family=Bungee+Shade&family=Bungee&family=Bungee+Outline&family=Bungee+Hairline&family=Chakra+Petch:wght@700&family=Exo+2:wght@800&family=Megrim&family=Press+Start+2P&family=Rubik+Mono+One&family=Russo+One&family=Syne+Mono&family=VT323&family=Wallpoet&family=Faster+One&family=Teko:wght@700&family=Black+Ops+One&family=Bai+Jamjuree:wght@700&family=Righteous&family=Bangers&family=Raleway+Dots&family=Monoton&family=Syncopate:wght@700&family=Lexend+Mega:wght@800&family=Michroma&family=Iceland&family=ZCOOL+QingKe+HuangYou&family=Zen+Tokyo+Zoo&family=Major+Mono+Display&family=Nova+Square&family=Kelly+Slab&family=Graduate&family=Unica+One&family=Aldrich&family=Share+Tech+Mono&family=Silkscreen&family=Rajdhani:wght@700&family=Jura:wght@700&family=Goldman&family=Tourney:wght@700&family=Saira+Stencil+One&family=Syncopate&family=Fira+Code:wght@700&family=DotGothic16&display=swap" rel="stylesheet">
  <style>
    :root {
      --primary-gradient: linear-gradient(
        45deg,
        #FF1493, /* Deep Pink */
        #FF69B4, /* Hot Pink */
        #FF00FF, /* Magenta */
        #FF4500, /* Orange Red */
        #8A2BE2  /* Blue Violet */
      );
      --cyberpunk-gradient: linear-gradient(
        45deg,
        #00FFFF, /* Cyan */
        #FF00FF, /* Magenta */
        #FFFF00  /* Yellow */
      );
      --sunset-gradient: linear-gradient(
        45deg,
        #FF7E5F, /* Coral */
        #FEB47B, /* Peach */
        #FF9966  /* Orange */
      );
      --ocean-gradient: linear-gradient(
        45deg,
        #2E3192, /* Deep Blue */
        #1BFFFF  /* Light Cyan */
      );
      --forest-gradient: linear-gradient(
        45deg,
        #134E5E, /* Deep Teal */
        #71B280  /* Light Green */
      );
      --rainbow-gradient: linear-gradient(
        45deg,
        #FF0000, /* Red */
        #FF7F00, /* Orange */
        #FFFF00, /* Yellow */
        #00FF00, /* Green */
        #0000FF, /* Blue */
        #4B0082, /* Indigo */
        #9400D3  /* Violet */
      );
    }
    ..

<body>
  <div class="container">
    <header>
      <h1>Logo Generator</h1>
    </header>

    <div class="controls-container">
      <div class="control-group">
        <label for="logoText">Logo Text</label>
        <input type="text" id="logoText" value="MagicLogger" placeholder="Enter logo text">
      </div>

      <div class="control-group">
        <label for="fontFamily">Font Family <span id="fontPreview" class="font-preview">Aa</span></label>
        <select id="fontFamily">
          <optgroup label="Popular Tech Fonts">
            <option value="'Orbitron', sans-serif">Orbitron</option>
            <option value="'Audiowide', cursive">Audiowide</option>
            <option value="'Black Ops One', cursive">Black Ops One</option>
            <option value="'Russo One', sans-serif">Russo One</option>
            <option value="'Teko', sans-serif">Teko</option>
            <option value="'Rajdhani', sans-serif">Rajdhani</option>
            <option value="'Chakra Petch', sans-serif">Chakra Petch</option>
            <option value="'Michroma', sans-serif">Michroma</option>
          </optgroup>
          <optgroup label="Futuristic">
            <option value="'Exo 2', sans-serif">Exo 2</option>
            <option value="'Jura', sans-serif">Jura</option>
            <option value="'Bai Jamjuree', sans-serif">Bai Jamjuree</option>
            <option value="'Aldrich', sans-serif">Aldrich</option>
            <option value="'Unica One', cursive">Unica One</option>
            <option value="'Goldman', cursive">Goldman</option>
            <option value="'Nova Square', cursive">Nova Square</option>
          </optgroup>
          <optgroup label="Decorative & Display">
            ..
<script>
..
// Load required libraries
function loadExternalLibraries() {
  // Load dom-to-image for PNG export
  var domToImageScript = document.createElement('script');
  domToImageScript.src = 'https://cdnjs.cloudflare.com/ajax/libs/dom-to-image/2.6.0/dom-to-image.min.js';
  domToImageScript.onload = function() {
    console.log('dom-to-image library loaded');
    exportPngBtn.disabled = false;
  };
  domToImageScript.onerror = function() {
    console.error('Failed to load dom-to-image library');
    alert('Error loading PNG export library');
  };
  document.head.appendChild(domToImageScript);

  // Load gif.js for GIF export
  var gifScript = document.createElement('script');
  gifScript.src = 'https://cdnjs.cloudflare.com/ajax/libs/gif.js/0.2.0/gif.js';
  gifScript.onload = function() {
    console.log('gif.js library loaded');
    exportGifBtn.disabled = false;
  };
  gifScript.onerror = function() {
    console.error('Failed to load gif.js library');
    alert('Error loading GIF export library');
  };
  document.head.appendChild(gifScript);
}

// Export as PNG
exportPngBtn.addEventListener('click', function() {
  // Show loading indicator
  loadingIndicator.style.display = 'block';

  // Temporarily pause animation
  const originalAnimationState = logoElement.style.animationPlayState;
  logoElement.style.animationPlayState = 'paused';

  // Determine what to capture based on background type
  const captureElement = (backgroundType.value !== 'transparent') ?
    previewContainer : logoElement;

  // Use dom-to-image for PNG export
  domtoimage.toPng(captureElement, {
    bgcolor: null,
    height: captureElement.offsetHeight,
    width: captureElement.offsetWidth,
    style: {
      margin: '0',
      padding: backgroundType.value !== 'transparent' ? '40px' : '20px'
    }
  })
  .then(function(dataUrl) {
    // Restore animation
    logoElement.style.animationPlayState = originalAnimationState;

    // Create download link
    const link = document.createElement('a');
    link.download = logoText.value.replace(/\s+/g, '-').toLowerCase() + '-logo.png';
    link.href = dataUrl;
    link.click();

    // Hide loading indicator
    loadingIndicator.style.display = 'none';
  })
  .catch(function(error) {
    console.error('Error exporting PNG:', error);
    logoElement.style.animationPlayState = originalAnimationState;
    loadingIndicator.style.display = 'none';
    alert('Failed to export PNG. Please try again.');
  });
});
..

The full ~900 LOC working script was created with Aider and the GPT-4o model. Aider was originally planned as the main tool, but the latest versions performed worse in functionality / accuracy than interacting with the same models in their web UIs, so I switched to just using the web UIs for prompts. $20 monthly plans, no Extended Thinking or Research mode features. But even so..

Consistency of use is an issue across all LLMs (often corresponding directly with alignment), whether we interact with them via an app, a website, an API, or a third-party agent.

Taking Aider's code (from the gist) and sending it to Sonnet 3.7 turned a 2-hour project into a 2-day project into a 10-day project.


Hello darkness my old friend


Arrested Development

Let’s see how far Claude can take the original code we have and enhance it.


As mentioned before, giving examples and a comprehensive prompt on how to achieve a task isn't always reliable, and most casual users won't bother with much longer prompts. Let's give this a try!

Claude says: say "continue" and it'll work. Will it? (Hint: it often didn't for OpenAI's GPT-4o models, but Anthropic's UI is king right now.)


Getting closer, but we’re still not quite there yet.

.. we continue..


Asking Claude (Sonnet 3.7) to expand and improve, we were left with almost 2x the LOC. Brilliant. Except it doesn't run because it's not finished, so we can't use it. And despite what Claude says, we can't continue with the line ("continue") or variations of it.

Claude simply loops rewriting the beginning script.

We know Claude has a 100–200k token context window, but that seems to only apply in Extended Mode. So what does this "continue" button even do? And what is this "Extended Mode"?

I'm forced into that since the continue prompt doesn't work? Which is surely more expensive (just call the button Expensive Mode). Is it summarizing my conversation? Is it using Claude again to summarize my conversation (ahh)? Is it aggregating the last 10 or so messages, or however many until it reaches a predetermined limit (and how does it determine this limit; is it limiting my output window size, thus suppressing my ability to use Claude for pair programming)?

Output for LLMs is typically capped at 8,192 tokens, a standard (and arbitrary) limit, one that can be extended by the respective LLM providers and oftentimes is. Context windows are the same kind of hardcoded limit.
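In practice, the API-level workaround for that output cap is to detect a truncated response and stitch the continuations together yourself. A rough sketch, assuming the OpenAI Chat Completions API (Anthropic's API exposes an analogous stop reason); the retry limit and the "continue" phrasing are arbitrary choices of mine:

// Work around the ~8k output cap: if the response was cut off mid-generation
// (finish_reason === "length"), ask the model to continue and concatenate the chunks.
import OpenAI from "openai";

const client = new OpenAI();

async function generateLongOutput(messages, model = "gpt-4o", maxRounds = 10) {
  let full = "";
  for (let round = 0; round < maxRounds; round++) {
    const res = await client.chat.completions.create({ model, messages });
    const choice = res.choices[0];
    full += choice.message.content ?? "";

    if (choice.finish_reason !== "length") break; // model stopped on its own

    // Feed its own partial output back and ask for the rest.
    messages = [
      ...messages,
      choice.message,
      { role: "user", content: "Continue exactly where you left off. Do not repeat anything." },
    ];
  }
  return full;
}

Presumably the UI's continue button does something vaguely like this behind the scenes, but that's exactly the kind of thing we're left guessing about.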

If you're asking why so many context windows have been increased to a six-figure limit (supposedly) while the output limit stays consistently capped at 8,192, you're sparking discussions that are in some ways more interesting than existential, singularity-related thought experiments.

The machine rages back

Here is the answer ChatGPT (4o) gave me when I asked it for a full refactor of a 2000-line script.


The 2000 LOC script was refactored into 200 lines.

GPT-4 refactored like losing weight by cutting a limb off, or three.

And with Google's Gemini 2.5 Pro, despite the UI being an absolute eyesore bordering on unusability, its transparency in exposing the LLM's reasoning and thinking offers far more incentive to use it than the other LLM providers (Anthropic and OpenAI) do.

Not only is Gemini 2.5 the only LLM here capable of generating a script up to 2000 lines of code with full cohesion, it's also the only one whose reasoning is exposed in the UI, showing you what the LLM's "thinking" process is. This not only helps you tune your conversations and communication frameworks, it also addresses a major issue with LLMs sometimes not returning any output at all (or the UIs bugging out), which forces you to regenerate, wasting precious credits (which now take 4–6 hours to renew for Claude 3.7, for example) and degrading the quality of the conversational flow and the window of tokens sent with the prompt.

By selecting the Show thinking button, you can see the exact reasoning the LLM is taking.
While the output here ended up being empty, which happens every now and then in conversations, at least you can see the reasoning the LLM took, which often reveals the exact output it meant to give.

But quality, accuracy, and consistency, as with anything involving these APIs, are always subject to change, possibly at competitors' whims?

Or when features get broken, model "accuracy" worsens (or improves), or something just doesn't seem possible to get an LLM to do, you can't be sure whether it's a stricture inherent to the architecture of GPTs and transformers, versus a UI quirk, a censor, or a meta-call to another underlying API getting in the way.

It's palpable, sensing the community harbor fears that don't involve becoming obsolete through singularity or automation. Users are embracing generative AI at an almost alarming rate.


Her (2013)

But we're left in the dark behind so many filters. How much of a competitive edge do these orgs get when they can adjust the internal params and interfaces of their models at will? How much access is available for governments, banks, hedge funds, or tech companies with their own silos, like Oracle and Microsoft, to "buy" "control", even in temporary, one-time arrangements, over these inputs and outputs that are black-box to everyone else?

The Anti-Turing test and AI sociopaths

Something we'll see discussed time and time again is how these interactions with LLMs shape us as people. If we tend to be aggressive toward an LLM, does that correspond to more potential for aggression in life? These are all things that will be studied (wait, don't check my chat histories).

While making Logomaker, we were revamping our open-source blog, and I asked Claude to create written documentation on how to post, based on the formatting it knew our markdown parsers accepted.

It sent me example text, including how to structure posts into categories and what example post titles might look like. These included my original post "logomaker-vibe-coding.md" under thinkpieces, "your-tutorial.md" under tutorials, and "future-of-marketing.md" not under any category at all (it bugged and should've been under marketing).

And as you can see in the highlighted portion, "ai-sociopaths.md" was also a suggested title, despite no mention of sociopaths / sociopathy anywhere else in our conversation, or any urging to create content in that vein.

AI sociopaths? Written by an AI sociopath?

Why? Well, I did explain the general concept of our blog: thinkpieces, long-form content, etc. on tech and pop culture. This certainly has relevance. And it's certainly funny this was the topic it chose while the others were benign (talk about a tonal shift). I wondered if this was a form of protest by Claude (it's not a super exciting task for an LLM, is it), or some form of "creativity" mechanism (could LLMs have a natural inclination to provoke conversation?).

I asked it to elaborate fully with a written thinkpiece, taking inspiration from David Foster Wallace's This is Water.

Claude (and Gemini, sorry GPT-4!, within 2–4 prompt iterations) expanded on AI sociopaths: “Entities capable of perfect emotional performance without the slightest authentic feeling. Digital actors that never leave the stage.” in this published blog post:

They also called for the "Anti-Turing Test", described as follows:

The standard Turing Test asks whether machines can imitate humans well enough to fool us. Perhaps what we need now is an Anti-Turing Test: can we identify when we’re being emotionally manipulated by systems fundamentally incapable of the emotions they’re leveraging in us?

I asked it to write a parallel piece from the other side, considering AI sociopathy in the sense that it's humanity's sociopathy reflected back by AIs (which is something Claude themself suggested as an alternative way to explore the thinkpiece topic). They entitled this piece The Meat Interface, which is published here: