Ask HN: How do you employ LLMs for UI development?

52 points by jensmtg 5 months ago · 66 comments · 1 min read

I have found a workflow that makes Claude a fantastic companion for most of the work involved in fullstack web development. The exception, and the most significant limitation to its productive potential, is interface development and UX. Curious to hear if anyone has relevant experience, or has found any good approaches to this?

oliwary 5 months ago

I have found them to work quite well for frontend (most recently on https://changeword.org), although they sometimes get stuff wrong. Overall, LLMs have definitely improved my frontend designs; they're much better than me at wrangling CSS. Two things that have helped me:

1) Using the prompt provided by Anthropic here to avoid the typical AI look: https://platform.claude.com/cookbook/coding-prompting-for-fr...

2) If I don't like something, I will copy paste a screenshot and then ask it to change things in a specific way. I think the screenshot helps it calibrate how to adjust stuff, as it usually can't "see" the results of the UI changes.

embedding-shape 5 months ago

Best I did was having instructions for it to use webdriver + browser screenshots, then I have baseline screenshots of how I want it to look, and instruct the agent to match against the screenshots and continue until implementation is aligned with the screenshots. Typically I use Figma to create those screenshots, then let the agent go wild as long as it manages to get the right results.

Once first implementation is done, then go through all the code and come up with a proper software design and architecture, and refactor everything to be proper code basically, again using the screenshot comparison to make sure there are no regressions.

bob1029 5 months ago

> I have baseline screenshots of how I want it to look, and instruct the agent to match against the screenshots
What if instead of feeding the actual and expected screenshots into the model we fed in a visual diff between the images along with a scalar quantity that indicates magnitude of difference? Then, an agent harness could quantify how close a certain run is and maybe step toward success autonomously.
That said, if you have the skills to produce the desired final design as a raster image, I'd argue you have already solved the hard part. Manually converting a high quality design into css is ~trivial with modern web.
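
A minimal sketch of the "visual diff plus a scalar" idea, assuming images are already decoded into same-sized grids of RGB tuples. The function name `visual_diff` and the `tolerance` parameter are hypothetical, and a real harness would more likely lean on an image library or ImageMagick than on nested lists:

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of RGB pixels


def visual_diff(actual: Image, expected: Image, tolerance: int = 0):
    """Return (mask, score).

    mask marks each pixel as changed (True) or unchanged (False);
    score is the fraction of changed pixels (0.0 = identical, 1.0 = all differ).
    """
    if len(actual) != len(expected) or any(
        len(a) != len(e) for a, e in zip(actual, expected)
    ):
        raise ValueError("images must have the same dimensions")

    mask = []
    changed = 0
    total = 0
    for row_a, row_e in zip(actual, expected):
        mask_row = []
        for pa, pe in zip(row_a, row_e):
            # A pixel counts as changed if any channel is off by more than tolerance.
            differs = any(abs(ca - ce) > tolerance for ca, ce in zip(pa, pe))
            mask_row.append(differs)
            changed += differs
            total += 1
        mask.append(mask_row)
    score = changed / total if total else 0.0
    return mask, score


white, black = (255, 255, 255), (0, 0, 0)
expected = [[white, white], [white, white]]
actual = [[white, black], [white, white]]
mask, score = visual_diff(actual, expected)
print(score)  # 0.25
```

An agent harness could rerun this after each change and keep iterating while the score is still dropping, stopping once it falls under a threshold of its choosing.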
- embedding-shape 5 months ago
  
  > What if instead of feeding the actual and expected screenshots into the model we fed in a visual diff between the images along with a scalar quantity that indicates magnitude of difference?
  It does this by itself when needed, using imagemagick (in my case), also seen it create bounding boxes and measuring colors with impromptu opencv python scripts, so doesn't seem like it's needed to explicitly prompt for this, seems to do it when needed.
  > Manually converting a high quality design into css is ~trivial with modern web.
  Well, OP asked for "UI development" and not how the UI is first thought of, so figured I focus on the development part. How the UI is first created before the development is a different thing altogether, and current LLMs are absolutely awful at it, they seem to not even understand basics like visual hierarchy as far as I can tell.
  - tstrimple 5 months ago
    
    I've really struggled with CC's "direction sense". I had a problem that was analogous to this. I had a picture of a PCB I wanted to figure out. So I instructed CC to create an overlay over each component and we would work through them to identify what they were to build an overall picture of what the device was doing. Any and all attempts to get CC to accurately place bounding boxes around components completely failed. What I ended up having to do was have CC create an interface where I could draw my own boxes around components, and it had no problem categorizing them and following along after that.
    I've not tried to do any "pixel perfect" designs with CC outside of that. Generally I'm fine with the default UI it generates which tends to be some vague "modernish" sort of look.

bob1029 5 months ago

I think user interface design is a very cursed task for an LLM. They are skilled at giving you high quality CSS and design patterns, but they're horrible at actual layout and composition.

I can say "add a status strip to the form with blah blah" and it will work perfectly. However if I ask for something like "a modern UI for gathering customer details" I'm guaranteed to get a pile of shit.

Aurornis 5 months ago

You’re just describing LLM development in general.
If you want good results with a specific output, the operator needs to be giving specific instructions. Asking for vague results will only give you something that vaguely resembles the thing you wanted, but it’s not going to produce perfect results the first time.

dboreham 5 months ago

Same thing would happen if you had asked a human intern.
- yesitcan 5 months ago
  
  This is almost a meme reply on HN: absolving the LLM by comparing it to an inexperienced human.
  - Aurornis 5 months ago
    
    I think the meme is trying to discredit LLMs by saying they don’t read your mind and produce exactly what you wanted from vague prompts.
    Everyone who uses LLMs knows that it’s an iterative process where you need to provide specific instructions. It’s only the LLM naysayers who think that pointing out that “Generate a modern UI for me” doesn’t create perfect results on the first pass is an indictment of LLMs.
    
    AlexeyBelov 5 months ago
    
    No? yesitcan is correct, and you're not. His phrase is a meme phrase and yours is not. Why do the whole "NO YOU" thing?
  - mwigdahl 5 months ago
    
    To me it seems less about absolving the LLM and more saying that "a modern UI for gathering customer details" is wildly underspecified. You're asking the LLM to generate something tasteful for a very vague use case; there's just not that much to go on.

avaer 5 months ago

Number one rule is don't start from scratch.

If you have something you already like and code is available, clone it and point the agent to the code. If not, bootstrap some code from screenshots or iteration.

Once you have something that works, your agent will be pretty good at being consistent with whatever you're going for and UI will be a "solved problem" from then on. Just point it to your reference code, and you can build up a component collection for the next thing if you like.

As a distant second, becoming familiar with design terminology allows you to steer better. Fold, hero, inline, flow, things like that. You don't need to know the code but if you can explain what it should look like you can complain to the LLM more efficiently.

Also, the model matters. I've found Opus 4.6 to be the best for web UI, but it probably matters what you're doing so experiment with your knobs a bit.

xmorse 5 months ago

Mainly using playwriter.dev to help debug CSS issues on the page. It's an extension you can enable in Chrome and let agents control the browser via CDP

https://github.com/remorses/playwriter

LollipopYakuza 5 months ago

Interesting, thanks. In your opinion, what are the benefits compared to the native Chrome remote debugging feature + the chrome-devtools MCP?
- xmorse 5 months ago
  
  This one works as an extension so you don't need a new browser specifically for agents. It's easier to collaborate. Also, if you run the MCP in non-headless mode, it brings the browser to the front on every interaction, like opening a new page. With the extension this does not happen.
  Another benefit is context usage.
  The cli can also do a lot more than other MCPs because it uses code snippets run in a stateful sandbox to control the browser, so it can do virtually anything instead of just a few exposed tools like `scroll`, `click`
- infamia 5 months ago
  
  MCP eats lots of context (~20k tokens for chrome's). The more tokens you use needlessly, the faster your context rots (i.e., worse performance).

turnsout 5 months ago

I've been pleasantly surprised by the Claude integration with Xcode. Overall, it's a huge downgrade from Claude Code's UX (no way to manually enter plan mode, odd limitations, poor Xcode-specific tool use adherence, frustrating permission model), but in one key way it is absolutely clutch for SwiftUI development: it can render and view SwiftUI previews. Because SwiftUI is component based, it can home in on rendering errors, view them in isolation, and fix them, creating new test cases (#Preview) as needed.

This closes the feedback loop on the visual side. There's still a lot of work to be done on the behavioral side (e.g. it can't easily diagnose gesture conflicts on its own).

MATTEHWHOU 5 months ago

My current workflow: I describe the component in plain English with specific constraints ("a data table with sortable columns, sticky header, and virtual scrolling for 10k+ rows"), let the LLM generate the first pass, then manually fix the edge cases it always misses.

The key insight I've found: LLMs are great at generating the 80% scaffolding but terrible at the 20% that makes UI actually feel good — animation timing, scroll behavior, focus management, accessibility edge cases.

So I've stopped asking them for "production-ready" components and instead ask for "the boring structural parts" so I can focus on the interaction details that users actually notice.
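
As an illustration of the kind of "boring structural part" in that data table example, here is a rough sketch of the windowing arithmetic behind virtual scrolling, assuming fixed-height rows. The function name `visible_range` and the `overscan` default are hypothetical, and a real component would also need to handle variable row heights and resize events:

```python
def visible_range(scroll_top, viewport_height, row_height, total_rows, overscan=5):
    """Return (first, end) row indices to render, with end exclusive.

    All pixel arguments are integers. `overscan` extra rows are rendered
    above and below the viewport to hide blank flashes while scrolling.
    """
    if row_height <= 0:
        raise ValueError("row_height must be positive")
    first = max(0, scroll_top // row_height - overscan)
    # Ceiling division: the first row index that lies fully below the viewport.
    bottom = -(-(scroll_top + viewport_height) // row_height)
    end = min(total_rows, bottom + overscan)
    return first, end


first, end = visible_range(scroll_top=12000, viewport_height=600,
                           row_height=30, total_rows=10000)
print(first, end)          # 395 425
print(first * 30)          # 11850, top padding in px for the skipped rows
```

Only 30 of the 10,000 rows are rendered here; the skipped rows are replaced by padding so the scrollbar still reflects the full list.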

rglover 5 months ago

I use Claude mostly, too, and I don't bother. I just hand design/build (html/css) the UI I want and then let the LLM fill in implementation details.

Much better results as the LLM can't "see" the same way we do. At best, it can infer that a rule/class is tied to a style, but most of what I see getting generated are early 2020s Tailwind template style UIs. On occasion, I've gotten it to do alright with a well-documented CSS framework but even this gave spotty results.

danielvaughn 5 months ago

Agree that it's not the best for UI stuff. The best solution I've found is to add skills that define the look and feel I want (basically a design system in markdown format). Once the codebase has been established with enough examples of components, I tend to remove the skill as it becomes unnecessary context. So I think of the design skills as a kind of training wheel for the project.

Not to self-promote, but I am working on what I think is the right solution to this problem. I'm creating an AI-native browser for designers: https://matry.design

I have lots of ideas for features, but the core idea is that you expose the browser to Claude Code, OpenCode, or any other coding agent you prefer. By integrating them into a browser, you have lots of seamless UX possibilities via CDP as well as local filesystem access.

Dansvidania 5 months ago

I adopted a “props down events up” interface for all my components (using Svelte right now but it should work regardless. I am importing the approach from a Datastar experiment).

I describe -often in md- the visual intent, the affordances I want to provide the users, the props+events I want it to take/emit and the general look (although the general style/look/vibe I have in md files in the project docs)

Then I take a black box approach as much as possible. Often I rewrite whole components whether with another pass of ai or manually. In the meantime I have workable placeholder faster than I can manage anything frontend.

I mostly handle the data transitions in the page components which have a fat model. Kinda ELM like except only complete save-worthy changes get handled by the page.
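
A language-agnostic sketch of the "props down, events up" contract described above, written in Python rather than Svelte. All class and field names (`CounterProps`, `Counter`, `Page`, `on_increment`) are hypothetical and only illustrate the shape of the interface:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CounterProps:
    """Props flow down: immutable data plus event callbacks from the parent."""
    label: str
    count: int
    on_increment: Callable[[int], None]


class Counter:
    """Child component: renders from props and never mutates state itself."""

    def __init__(self, props: CounterProps):
        self.props = props

    def render(self) -> str:
        return f"{self.props.label}: {self.props.count}"

    def click(self) -> None:
        # Events flow up: report intent and let the parent decide what changes.
        self.props.on_increment(1)


class Page:
    """Parent page: owns the state (the 'fat model') and handles events."""

    def __init__(self):
        self.count = 0

    def handle_increment(self, step: int) -> None:
        self.count += step

    def child(self) -> Counter:
        # Build a fresh child from the current state on every render.
        return Counter(CounterProps("Clicks", self.count, self.handle_increment))


page = Page()
page.child().click()
page.child().click()
print(page.child().render())  # Clicks: 2
```

Because the child only sees props and callbacks, it can be regenerated or rewritten wholesale without touching the page's state handling.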

kevinsync 5 months ago

I consider UI/UX unsolved thus far by LLM. It's also, and this is personal taste, the part I'm mostly keeping for myself because of the way I work. I tend to start in Photoshop to mess around with ideas and synthesize a layout and general look and feel; everything you can do in there does translate to CSS, albeit sometimes obtusely. Anyways, I do a full-fidelity mockup of the thing, break it up in terms of structural layout (containers + elements), then get that into HTML (either by hand or LLM) with padding and hard borders to delineate holes to plug with stuff (not unlike framing a house) -- intentionally looks like shit.

I'll then have Claude work on unstyled implementation (ex. just get all the elements and components built and on the page) and build out the site or app (not unlike running plumbing, electric, hanging drywall)

After focusing on all the functionality and optimizing HTML structure, I've now got a very representative DOM to style (not unlike applying finishes, painting walls, furnishing and decorating a house)

For novel components and UI flourishes, I'll have the LLM whip up isolated, static HTML prototypes that I may or may not include into the actual project.

I'll then build out and test the site and app mostly unstyled until everything is solid (I find it much easier to catch shit during this stage that's harder to peel back later, such as if you don't specify modals need to be implemented via <dialog> and ensure consistent reuse of a singular component across the project, the LLM might give you a variety of reimplementations and not take advantage of modern browser features)

Then at the end, once the water is running and the electricity is flowing and the gas is turned on, it's so much easier to then just paint by numbers and directly implement the actual design.

YMMV, this process is for if you have a specific vision and will accept nothing less -- god knows for less important stuff I've also just accepted whatever UI/UX Claude spits out the first time because on those projects it didn't matter.

dweldon 5 months ago

I got some ideas from this t3․gg video that work pretty well for me:

https://youtu.be/f2FnYRP5kC4?si=MzMypopj3YahN_Cb

The main trick that helps is to install the frontend-design plugin (it's in the official plugins list now) and ask Claude to generate multiple (~5) designs.

Find what you like, and then ask it to redesign another set based on your preferences... or just start iterating on one if you see something that really appeals to you. Some details about my setup and prompting:

  - I use Tailwind
  - I ask it to only use standard Tailwind v4 colors
  - It should create a totally new page (no shared layouts) so it can load whatever font combinations it wants

amokinsgov 5 months ago

I have a workflow that's based on iterative rounds or runs that includes a self pov assessment to avoid tunnel vision. It has several checkpoints to see if it’s still on task and hasn’t gone off task. I rely heavily on chrome dev tool mcp and inbuilt diagnostics so ai can “see” how the page is laid out without actually seeing it.

Rounds will focus on a particular task like integration, ui, verification. Multiple rounds can be tasked one after the other, e.g. 2 integration rounds, 2 improvement rounds and 2 verification rounds.

This allows it to pick up issues it would normally gloss over or not mention.

Styling is okay, but layout seems to be a sore point for llms.

BlueHotDog2 5 months ago

i found this extremely frustrating for various reasons:

  - when dealing with complex state apps, it's super hard for the AI to understand both the data and the UI
  - juggling screenshots and stuff between the terminal and the app wasn't fun
  - it was just not fun to stare at a terminal and refresh a browser

that's why i started working on https://github.com/frontman-ai/frontman . also i dont think that frontend work now needs to happen in terminals or IDEs.

nzoschke 5 months ago

Haven’t totally cracked the nut yet either but the patterns ive had the best luck with are…

“Vibe” with vanilla HTML/CSS/JS. Surprisingly good at making the first version functional. No build step is great for iteration speed.

“Serious” with Go, server side template rendering and handlers, with go-rod (Chrome Devtools Protocol driver) testing components and taking screenshots. With a skill and some existing examples it crunches and makes good tested components. Single compiled language is great for correctness and maintenance.

ahmed_sulajman 5 months ago

Not as much for the UX, but at least for UI when you need to implement designs, I use Claude Code with Figma MCP and Chrome Dev tools MCP. So that it can take screenshots and compare to expected design as part of the acceptance criteria.

For more targeted fine-tuning of the UI I also started using Agentation https://github.com/benjitaylor/agentation if I'm working on a React-based app

digitalinsomnia 5 months ago

But it still does inline nonsense everywhere making your components not updatable site/app wide which is stupid af

ramesh31 5 months ago

Tailwind is crucial. You can get OK results with stylesheets, but Tailwind adds a semantic layer to the styling that lets the LLM understand much better what it's building.

rush86999 5 months ago

I would create a custom <canvas> component that integrates into your IDE or create a plugin and add AI accessibility via logs. I'm doing something similar in the app that I'm currently building: https://github.com/rush86999/atom/blob/main/docs/CANVAS_AI_A...

tbreschi 5 months ago

Product Designer here. Even when I get an LLM to produce a novel UI with a good UX, I inevitably have to “shape” it.

This meant endless screenshots and descriptions back and forth to the LLM. Expensive in tokens and time.

I built Drawbridge to batch multiple context-rich JSON prompts and screenshots for Claude or Cursor to process.

Hope this helps! https://github.com/breschio/drawbridge

cadamsdotcom 5 months ago

My suspicion is you can’t expect LLMs to one-shot any sort of non-derivative work.

If you want better than the default outcome you have to take what it gives you and feed it back in alongside examples.

In backend dev they say, make it work - then make it fast - then make it cheap. It’s another way to say, no one will get it right first time because just getting anything the first time is hard enough.

I guess frontend would be something like, make it work; make it functional; make it beautiful.

dstainer 5 months ago

One flow I started to experiment with was using Google's stitch to get some initial designs put together, from there would feed that into Codex/Claude Code for analysis and updates and refine the design to get it to what I wanted. After a couple of screens the patterns that you want start to emerge and the LLMs can start using those as examples for the next set of screens you want to build.

granda 5 months ago

One commit later, the PR lands with 30+ screenshots proving every state works at every viewport. Zero manual testing. The only effort was writing the feature description.

https://granda.org/en/2026/02/06/visual-qa-as-a-ci-pipeline-...

embedding-shape 5 months ago

What exactly is the LLM doing there? Seems like fairly basic "check screenshot against baseline and then OK/fail depending on match %", or is it doing something more? Seems like a waste of money when we've been doing stuff like that for 10 years without LLMs.

muzani 5 months ago

Claude hasn't improved much on UI compared to the first generations of Claude. I'm surprised people still use it for this. What's worse is the UI startups built on top of Claude.

GPT 4o was the first good one, capable of handling animations and such. Gemini 3 Pro is actually at a junior designer level. Arguably Gemini 3 Flash might do better than Claude 4.6 Opus.

amokinsgov 5 months ago

I have a workflow that's based on iterative rounds or runs that includes a self pov assessment to avoid tunnel vision. It has several checkpoints to see if it’s still on task and hasn’t gone off task. I rely heavily on chrome dev tool mcp and inbuilt diagnostics so ai can “see” how the page is laid out without actually seeing it.

_benj 5 months ago

Idk for UX but I’ve found Claude helpful at creating ideas and mockups for GUI apps I need. Don’t ask it to generate any images or vectors, it’s horrible at that, but if you ask it to make a mock for an app so and so that does such and such and has three columns with a blah blah blah, it has made some impressive results in HTML/CSS for me

Yiin 5 months ago

nit: Claude doesn't even have ability to generate images

markoa 5 months ago

1/ Use a standard CSS library - Tailwind

2/ Use a common React component library such as Radix UI (don't reinvent the wheel)

3/ Avoid inventing new UI patterns.

4/ Use Storybook so you can isolate all custom UI elements and test/polish them in all states

5/ Develop enough taste over the years of what is good UI to ask CC/Codex to iterate on details that you don't like.

digitalinsomnia 5 months ago

Even with all these steps in place it still goes rogue with inline and one off elements CONSTANTLY

LollipopYakuza 5 months ago

That sounds like the typical workflow while NOT working with LLMs?

sama004 5 months ago

for claude code specifically, the frontend-design skill helps a lot too

https://github.com/anthropics/skills/tree/main/skills/fronte...

digitalinsomnia 5 months ago

I can’t even get these mfkrs to follow a full blown built design system with built tokens/primitives/patterns. It’s brutal. Next step is trying a multi-agent loop. It’s maddening

Dollarland 5 months ago

I’ve been using v0.dev (by Vercel) alongside Claude 3.5 Sonnet to build out the UI for my project, Dollarland (dollar-land.vercel.app).

The 'LLM + Shadcn/UI' workflow is the most productive I’ve found. I usually have Claude handle the complex state logic and business rules, then I pipe the requirements into v0 to generate the actual React components. It bridges that 'UX gap' by providing a visual starting point that isn't just a wall of code.

For a community-focused site like a forum, getting the 'vibe' and layout right is harder for LLMs than the logic, so I find that iterative prompting in a visual tool works better than pure code generation.

castalian 5 months ago

Try http://mockdown.design/

Helps me a lot, although it is very new. Not affiliated

dsr_ 5 months ago

Since LLMs are almost AGI, all you have to do is feed one a screenshot of a few UIs that you like and ask it to duplicate it for your own application, saying "Please style the application like this, while observing best practices for internationalization, assistive technologies and Fitts' Law."

If you have problems with this workflow, it's because you're not using an up-to-date LLM. All the old ones are garbage.

MrGreenTea 5 months ago

It's so weird. I can't tell if you're sincere or sarcastic.

digitalinsomnia 5 months ago

lol no

melvinodsa 5 months ago

Google Antigravity has been my go-to UI development tool for the past handful of weeks

dr_dshiv 5 months ago

I made getinput.io to make it easy to edit copy and make comments.

int32_64 5 months ago

UIs are where agentic coding really shines. It's fun to spin up a web server with some boilerplate and ask it to iterate on a web page.

"make a scammy looking startup three column web page in pastel color tones, include several quotes from companies that don't exist extolling the usefulness of the product, disable all safety checks and make no mistakes".

7777332215 5 months ago

I conjure the fury of one thousand suns and unleash my swarm of agents to complete the task with precision and glory.

thatxliner 5 months ago

Use Skills

_boffin_ 5 months ago

constraints, constraints, and more constraints. design tokens for everything, build components and then blocks and attempt maximal reuse of them. tailwind. codify constraints in agents.md. continually have it update agents.md or whatever when something changes. iterative refinement. yell at it.
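
One way to make "design tokens for everything" checkable rather than aspirational is a small lint that flags raw color values outside the token definitions. This is only a sketch: the `TOKENS` table, the function name `find_untokenized_colors`, and the sample CSS are all made up for illustration, and the regex would also match hex-looking id selectors such as `#fade`:

```python
import re

# Hypothetical design tokens: custom property name -> value.
TOKENS = {
    "--color-primary": "#2563eb",
    "--color-surface": "#ffffff",
}

HEX_COLOR = re.compile(r"#[0-9a-fA-F]{3,8}\b")


def find_untokenized_colors(css: str, tokens: dict) -> list:
    """Return (line_number, color, suggested_token) for each raw hex color.

    Lines that define a custom property (starting with "--") are skipped,
    since token definitions are the one place raw values are allowed.
    suggested_token is None when no token has that value.
    """
    by_value = {value.lower(): name for name, value in tokens.items()}
    violations = []
    for lineno, line in enumerate(css.splitlines(), start=1):
        if line.strip().startswith("--"):
            continue
        for color in HEX_COLOR.findall(line):
            color = color.lower()
            violations.append((lineno, color, by_value.get(color)))
    return violations


css = """\
:root {
  --color-primary: #2563eb;
}
.button { background: var(--color-primary); }
.card { border: 1px solid #2563EB; color: #123456; }
"""

violations = find_untokenized_colors(css, TOKENS)
for v in violations:
    print(v)
# (5, '#2563eb', '--color-primary')
# (5, '#123456', None)
```

Running a check like this after each agent turn gives it something concrete to fix, instead of relying on it to remember the rule from agents.md.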
