AI: First New UI Paradigm in 60 Years?

nngroup.com

291 points by ssn 3 years ago · 218 comments

Animats 3 years ago

This article isn't too helpful.

There have been many "UI Paradigms", but the fancier ones tended to be special purpose. The first one worthy of the name was for train dispatching. That was General Railway Signal's NX (eNtry-Exit) system.[1] Introduced in 1936, still in use in the New York subways. With NX, the dispatcher routing an approaching train selected the "entry" track on which the train was approaching. The system would then light up all possible "exit" tracks from the junction. This took into account conflicting routes already set up and trains present in the junction. Only reachable exits lit up. The dispatcher pushed the button for the desired exit. The route setup was then automatic. Switches moved and locked into position, then signals along the route went to clear. All this was fully interlocked; the operator could not request anything unsafe.

There were control panels before this, but this was the first system where the UI did more than just show status. It actively advised and helped the operator. The operator set the goal; the system worked out how to achieve it.

Another one I encountered was an early computerized fire department dispatching system. Big custom display boards and keyboards. When an alarm came in, it was routed to a dispatcher. Based on location, the system picked the initial resources (trucks, engines, chiefs, and special equipment) to be dispatched. Each dispatcher had a custom keyboard, with one button for each of those resources. The buttons lit up indicating the selected equipment. The dispatcher could add additional equipment with a single button push, if the situation being called in required it. Then they pushed one big button, which set off alarms in fire stations, printed a message on a printer near the fire trucks, and even opened the doors at the fire house. There was a big board at the front of the room which showed the status of everything as colored squares. The fire department people said this cut about 30 seconds off a dispatch, which, in that business, is considered a big win.

Both of those are systems which had to work right. Large language models are not even close to being safe to use in such applications. Until LLMs report "don't know" instead of hallucinating, they're limited to very low risk applications such as advertising and search.

Now, the promising feature of LLMs in this direction is the ability to use the context of previous questions and answers. It's still query/response, but with enough context that the user can gradually make the system converge on a useful result. Such systems are useful for "I don't know what I want but I'll know it when I see it" problems. This allows using flaky LLMs with human assistance to get a useful result.

[1] https://online.anyflip.com/lbes/vczg/mobile/#p=1

  • philovivero 3 years ago

    > Both of those are systems which had to work right. Large language models are not even close to being safe to use in such applications. Until LLMs report "don't know" instead of hallucinating, they're limited to very low risk applications such as advertising and search.

    Are humans limited to low-risk applications like that?

    Because humans, even some of the most humble, will still assert things they THINK are true, but are patently untrue, based on misunderstandings, faulty memories, confused reasoning, and a plethora of others.

    I can't count the number of times I've had conversations with extremely well-experienced, smart techies who just spout off the most ignorant stuff.

    And I don't want to count the number of times I've personally done that, but I'm sure it's >0. And I hate to tell you, but I've spent the last 20 years in positions of authority that could have caused massive amounts of damage not only to the companies I've been employed by, but a large cross-section of society as well. And those fools I referenced in the last paragraph? Same.

    I think people are too hasty to discount LLMs, or LLM-backed agents, or other LLM-based applications because of their limitations.

    (Related: I think people are too hasty to discount the catastrophic potential of self-modifying AGI as well)

    • memefrog 3 years ago

      Can people please stop making this comment in reply to EVERY criticism of LLMs? "Humans are flawed too".

      We do not normally hallucinate. We are sometimes wrong, and sometimes are wrong about the confidence they should attach to their knowledge. But we do not simply hallucinate and spout fully confidence nonsense constantly. That is what LLMs.

      You remember a few isolated incidents because they're salient. That does not mean that it's representative of your average personal interactions.

      • famouswaffles 3 years ago

        >We do not normally hallucinate.

        Oh yes we do lol. Many experiments show our perception of reality and of cognition is entirely divorced from the reality of what's really going on.

        Your brain is making stuff up all the time. Sense data you perceive is partly fabricated. Your memories are partly fabricated. Your decision rationales are post hoc rationalizations more often than not. That is, you don't genuinely know why you make certain decisions or what preferences actually inform them. You just think you do. You can't recreate previous mental states. You are not usually aware. But it is happening.

        LLMs are just undoubtedly worse right now.

        • worrycue 3 years ago

          We don’t hallucinate in such a way / to the extent that it compromises our ability to do our job.

          Currently no one will trust an LLM to even run a helpline - that's just a lawsuit waiting to happen should the AI hallucinate a “solution” that results in loss of property, injury or death.

          • famouswaffles 3 years ago

            >Currently no one will trust an LLM to even run a helpline - that's just a lawsuit waiting to happen should the AI hallucinate a “solution” that results in loss of property, injury or death.

            I'm not quite sure exactly what you mean by helpline here (general customer service or something more specific?) but assuming the former...

            How much power do you think most helplines actually have? Most are running off pre-written scripts/guidelines with very little in the way of decisional power. There's a reason for that.

            Injury or death? LLM hallucinations are relational. Unless you're speaking to Dr GPT or something to that effect, a response resulting in injury or death isn't happening.

          • strokirk 3 years ago

            Having worked in the help-line business, I can tell you that many corporations would and do use LLMs for their helpline, and used worse options before.

      • jph00 3 years ago

        > We do not normally hallucinate. We are sometimes wrong, and sometimes are wrong about the confidence they should attach to their knowledge. But we do not simply hallucinate and spout fully confidence nonsense constantly. That is what LLMs.

        In my average interaction with GPT-4 there are far fewer errors than in this paragraph. I would say that here you in fact "spout fully confidence nonsense" (sic).

        Some humans are better than others at saying things that are correct, and at saying things with appropriately calibrated confidence. Some LLMs are better than some humans in some situations at doing these things.

        You seem to be hung up on the word "hallucinate". It is, indeed, not a great word and many researchers are a bit annoyed that's the term that's stuck. It simply means for an LLM to state something that's incorrect as if it's true.

        The times that LLMs do this do stand out, because "You remember a few isolated incidents because they're salient".

        • leoedin 3 years ago

          > Some humans are better than others at saying things that are correct, and at saying things with appropriately calibrated confidence.

          That's true - which is why we have constructed a society with endless selection processes. Starting from kindergarten, we are constantly assessing people's abilities - so that by the time someone is interviewing for a safety critical job they've been through a huge number of gates.

      • lexandstuff 3 years ago

        The equivalent of hallucinations in LLMs is false memories [1] in people. They happen all the time.

        [1] https://en.wikipedia.org/wiki/False_memory

    • hyperthesis 3 years ago

      > Are humans limited to low-risk applications like that?

      No, but arguably civilization consists of mechanisms to manage human fallibility (separation of powers, bicameralism, "democracy", bureaucracy, regulations, etc). We might not fully understand why, but we've found methods that sorta kinda "work".

      > could have caused

      That's why they didn't.

      • TeMPOraL 3 years ago

        > No, but arguably civilization consists of mechanisms to manage human fallibility

        Exactly. Civilization is, arguably, one big exercise in reducing variance in individuals, as low variance and high predictability is what lets us work together and trust each other, instead of seeing each other as threats and hiding from each other (or trying to preemptively attack). The more something or someone is unpredictable, the more we see it or them as a threat.

        > (separation of powers, bicameralism, "democracy", bureaucracy, regulations, etc).

        And on the more individual scale: culture, social customs and the public school system are all forces that shape humans from the youngest age, reducing variance in thoughts and behaviors. Exams of all kinds, including psychological ones, prevent high-variance individuals from being able to do large amounts of harm to others. The higher the danger, the higher the bar.

        There are tests you need to pass to be able to own and drive a car. There are tests you may need to pass to own a firearm. There are more tests still before you'll be allowed to fly an aircraft. Those tests are not there just to ensure your skills - they also filter high-variance individuals, people who cannot be safely given responsibility to operate dangerous tools.

        Further still, society has mechanisms to eliminate high-variance outliers. Lighter cases may get some kind of medical or spiritual treatment, and (with gates in place to keep them away from guns and planes) it works out OK. More difficult cases eventually get locked up in prisons or mental hospitals. While there are a lot of specific things to discuss about the prison and mental care systems, their general, high-level function is simple: they keep both predictably dangerous and high-variance (i.e. unpredictably dangerous) people stashed safely away, where they can't disrupt or harm others at scale.

        > We might not fully understand why, but we've found methods that sorta kinda "work".

        Yes, we've found many such methods at every level - individual, familial, tribal, national - and we stack them all on top of each other. This creates the conditions that let us live in larger groups, with fewer conflicts, as well as to safely use increasingly powerful (i.e. potentially destructive) technologies.

        • throwuwu 3 years ago

          I think you’re weighting the contribution of authority a bit too highly. The bad actors to be concerned about are a very small percentage of the population and we do need institutions with authority to keep those people at bay but it’s not like there’s this huge pool of “high variance” people that need to be screened out. The vast majority of people are extremely close in both opinion and ability, any semblance of society would be impossible otherwise.

          • TeMPOraL 3 years ago

            > it’s not like there’s this huge pool of “high variance” people that need to be screened out. The vast majority of people are extremely close in both opinion and ability, any semblance of society would be impossible otherwise.

            Yes, but I'm saying it's not an accident - I've mentioned mechanisms like culture, social customs, and education, which we've been using in some form for all of our recorded history. I should've probably added violent conflicts within and between tribes/groups, too, which also acted to reduce variance, by culling the more volatile and less agreeable people. People today are "extremely close in both opinion and ability" because for the past couple thousand years, generation by generation, we've been busy reducing the variance of individuals.

            EDIT: keeping high-variance individuals locked up safely away is just one of the methods we use, specifically to deal with outliers. It too traces back to the dawn of recorded history - shunning, expelling individuals from the tribe (which often meant certain death), sending them to faraway lands, or forcing them into war, were other common means past societies used to eliminate high-variance outliers.

            As for authority, it's a separate topic - I argue that hierarchical governance is an artifact of scale: it's necessary to coordinate groups past a certain size (~Dunbar's number), when our basic social intuitions are no longer up to the task. But the first level of hierarchy can handle only so many people, and if you want to coordinate multiple such groups, you need to add another layer... and that's how, over time, human societies scaled from tribes of a couple dozen people, to nation states of hundreds of millions.

            Even as the focus is usually on the national governments, the entire hierarchy is still there - you have states and lands/voivodeships/counties with their own governance, then another level for a major city and surrounding villages, then yet another level in each individual village, and one or two levels in the city itself, etc. We don't often pay attention to it, but the hierarchy of governance does reach down, in some form, all the way to groups of a couple hundred people or less.

    • ilyt 3 years ago

      >Because humans, even some of the most humble, will still assert things they THINK are true, but are patently untrue, based on misunderstandings, faulty memories, confused reasoning, and a plethora of others.

      > I can't count the number of times I've had conversations with extremely well-experienced, smart techies who just spout off the most ignorant stuff.

      Spouting off the most ignorant stuff is one of the lowest-risk things you can do in general. We're talking about running code where a bug can do a ton of damage, financial or otherwise, not water-cooler conversations.

    • cmiles74 3 years ago

      In the train example, the UI is in place to prevent a person from making a dangerous route. I think the idea here is that an LLM cannot take the place of such a UI as they are inherently unreliable.

    • NikolaNovak 3 years ago

      To your point, humans are augmented by checklists and custom processes in critical situations. And there are certainly applications which mimic such safety checklists. We don't NEED to start from an LLM perspective if our goal is different and doesn't benefit from an LLM. Not all UIs or architectures are fit for all purposes.

    • dorkwood 3 years ago

      Couldn’t you make this same argument with a chat bot that wasn’t an LLM at all?

      “Yes, it may have responded with total nonsense just now, but who among us can say they’ve never done the same in conversation?”

    • Mawr 3 years ago

      > Are humans limited to low-risk applications like that?

      Yes, of course. That's why the systems the parent mentioned designed humans out of the safety-critical loop.

      > Because humans, even some of the most humble, will still assert things they THINK are true, but are patently untrue, based on misunderstandings, faulty memories, confused reasoning, and a plethora of others.

      > I can't count the number of times I've had conversations with extremely well-experienced, smart techies who just spout off the most ignorant stuff.

      The key difference is that when the human you're having a conversation with states something, you're able to ascertain the likelihood of it being true based on available context: How well do you know them? How knowledgeable are they about the subject matter? Does their body language indicate uncertainty? Have they historically been a reliable source of information?

      No such introspection is possible with LLMs. Any part of anything they say could be wrong and to any degree!

    • ra 3 years ago

      I wholeheartedly agree with the main thrust of your comment. Care to expand on your (related: potential catastrophe) opinion?

  • jart 3 years ago

    When you say train dispatching and control panels, I think you've illustrated how confused this whole discussion is. There should be a separate term called "operator interface" that is separate from "user interface" because UIs have never had any locus of control, because they're for users, and operators are the ones in control. Requesting that an LLM do something is like pressing the button to close the doors of an elevator. Do you feel in charge?

    • TeMPOraL 3 years ago

      Oh my. This is the first time I've seen this kind of distinction between "users" and "operators" in context of a single system. I kind of always assumed that "operator" is just a synonym for "user" in industries/contexts that are dealing with tools instead of toys.

      But this absolutely makes sense, and it is a succinct description for the complaints some of us frequently make about modern UI trends: bad interfaces are the ones that make us feel like "users", where we expect to be "operators".

      • prpl 3 years ago

        I’ve seen such a distinction before, but I’ve been around telescopes and particle accelerators. Single system, but different roles in the same system with a different UI.

      • jart 3 years ago

        Oh snap, did I just pull back the curtain?

        • ilyt 3 years ago

          You put into words the things I've noticed UIs evolving away from.

          It just feels like UIs of software 10, 20, even 30 years ago were designed for "operators", people who actually worked with the software for hours on end, and so with a little bit of learning you could be dancing with keybindings and doing stuff as fast as CLI nerds.

          Nowadays most seem to be optimized for the first hour of use by a new user and not much else, and the exceptions are software made "by operators, for operators", like for example KiCad.

          • ozim 3 years ago

            Downside is that nowadays in an office setting one has to operate 20+ different applications to get work done.

            While as an operator you would spend more like 80% of your time using the same interface and application.

            I could spend my time learning to type 100 WPM but I am not a typist - as a software dev it is quite enough to go with 40-60 WPM because typing is just a small part of my work.

            • ilyt 3 years ago

              > Downside is nowadays in office setting one has to operate 20+ different applications to get work done.

              Like what? Even as a developer I'm not consistently using 20 different "big apps": 2 chat programs, IDE, terminal, mail client, Wireshark, maybe some GUI DB tool if I feel lazy that day.

              > I could spend my time to type 100WPM but I am not a typist - as a software dev it is quite enough to go with 40-60WPM because it is just small part of my work.

              Not really about WPM but the ability to hit a few shortcuts to get where I want instead of navigating through 4 different panels to get to the one option I need.

              • ozim 3 years ago

                Maybe not daily, but still consistently: Word/Excel/PDF reader for documentation and all the company documents to refer to, web browser - each internal/SaaS app we use for tracking tickets and managing deployments/repositories. I count each web app as an app, just like if I use git and kubectl from the terminal those are still two different apps that I have to understand and operate with some knowledge, but not in depth.

                That was also my point on WPM: I don’t operate a computer like I would operate a typewriter, where I can learn a set of motions and be done. I don’t operate a terminal either; I operate apps that have a text interface. These apps also change over the months depending on what I am working on.

        • TeMPOraL 3 years ago

          Indeed you did; half of my brain capacity is currently being used by a background process sifting through everything I remember ever thinking or learning that's associated with computers, to re-evaluate it in context of the difference between "users" and "operators".

          Seriously. Until your comment, I thought the two terms to be synonyms.

    • Animats 3 years ago

      > UIs have never had any locus of control, because they're for users, and operators are the ones in control.

      Not really any more. The control systems for almost everything complicated now look like ordinary desktop or phone user interfaces. Train dispatching centers, police dispatching centers, and power dispatching centers all look rather similar today.

  • savolai 3 years ago

    I’d love to understand the relevance of this comment, but I sincerely don’t.

    You describe two cases that are specially designed to anticipate the needs of professionals operating a system. That’s automation, sure, but not AI. The system doesn’t even ostensibly understand user intent; it’s still simply and obviously deterministic, granted a complex one.

    Do you have an underlying assumption that tech should only be for solving professional problems?

    The context Nielsen comes from is the field of Human-Computer Interaction, which to me is about a more varied usage context than that.

    LLMs have flaws, sure.

    But how does all this at all relate to the paradigm development the article discusses?

    • quaintdev 3 years ago

      LLMs have flaws but they are exceptionally good at transforming data or outputting data in the format I want.

      I once asked ChatGPT to tabulate the calories of different foods. I then asked it to convert the table to CSV. I even asked it to provide SQL insert statements for the same table. Now the data might be incorrect, but the transformation of that data never was.

      This works with complex transforms as well, like asking it to create a docker compose file from a docker run or podman run command and vice versa. Occasionally the transform would be wrong, but then you realise it was just out of date with a newer format, which is expected because its knowledge is limited to 2021.
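
      The same trick is scriptable, too. Here's a minimal sketch with the pre-1.0 openai Python client (the model, prompt and command are just placeholders, and the output still needs a human eyeball before you trust it):

        import openai  # pip install "openai<1.0"; assumes OPENAI_API_KEY is set in the environment

        prompt = (
            "Convert this docker run command into an equivalent docker-compose.yml. "
            "Output only YAML:\n"
            "docker run -d --name web -p 8080:80 nginx"
        )

        resp = openai.ChatCompletion.create(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # keep the transformation as repeatable as possible
        )
        print(resp.choices[0].message.content)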

  • ignoramous 3 years ago

    Hallucinations will be tamed, I think. Only a matter of time (~3 to 5 years [0]) given the amount of research going into it?

    With that in mind, ambient computing has always threatened to be the next frontier in Human-Computer Interaction. Siri, Google Assistant, Alexa, and G Home predate today's LLM hype. Dare I say, the hype is real.

    As a consumer, GPT4 has shown capabilities far beyond whatever preceded it (with the exception of Google Translate). And from what Sam has been saying in the interviews, newer multi-modal GPTs are going to be exponentially better: https://youtube.com/watch?v=H1hdQdcM-H4s&t=380s

    [0] https://twitter.com/mustafasuleymn/status/166948190798020608...

    • PheonixPharts 3 years ago

      > Hallucinations will be tamed, I think.

      I don't think that's likely unless there was a latent space of "Truth" which could be discovered through the right model.

      That would be a far more revolutionary discovery than anyone can possibly imagine. For starters the last 300+ years of Western Philosophy would be essentially proven unequivocally wrong.

      edit: If you're going to downvote this please elaborate. LLMs currently operate by sampling from a latent semantic space and then decoding that back into language. In order for models to know the "truth", there would have to be a latent space of "true statements" that was effectively directly observable. All points along that surface would represent "truth" statements, and that would be the most radical human discovery in the history of the species.

      • TeMPOraL 3 years ago

        There may not be a surface directly encoding the "truth" value, but unless we assume that the training data LLMs are trained on are entirely uncorrelated with the truth, there should be a surface that's close enough.

        I don't think the assumption that LLM training data is random with respect to truth value is reasonable - people don't write random text for no reason at all. Even if the current training corpus was too noisy for the "truth surface" to become clear - e.g. because it's full of shitposting and people exchanging their misconceptions about things - a better-curated corpus should do the trick.

        Also, I don't see how this idea would invalidate the last couple centuries of Western philosophy. The "truth surface", should it exist, would not be following some innate truth property of statements - it would only be reflecting the fact that the statements used in training were positively correlated with truth.

        EDIT: And yes, this would be a huge thing - but not because of some fundamental philosophical reasons, but rather because it would be an effective way to pull truths and correlations from aggregated beliefs of large number of people. It's what humans do when they synthesize information, but at a much larger scale, one we can't match mostly because we don't live long enough.

        • Borealid 3 years ago

          I think this is a misunderstanding of what would be necessary for an LLM to only output truth.

          Let's imagine there does exist a function for evaluating truth - it takes in a statement and produces whether that statement is "true" (whatever "true" means). Let's also say it does that perfectly.

          We train the LLM. We keep training it, and training it, and training it, and we eventually get a set of weights where our eval runs only make it produce statements where the truth-function says they are truthful.

          We deploy the LLM. It's given an input that wasn't part of the evaluation set. We have no guarantee at all that the output will be true. The weights we chose for the LLM during the training process are a serendipitous accident: we observed that they produced truthy output in the scenarios we tested. Scenarios we didn't test _probably_ produce truthy output, but in all likelihood some will not, and we have no mathematical guarantee.

          This remains the case even if you have a perfect truth function, and remains true if you use deterministic inference (always the most likely token). Your comment goes even further than that and asserts that a mostly-accurate function is good enough.

          • TeMPOraL 3 years ago

            Science itself has the same problem. There's literally no reason to be certain that the Sun will rise tomorrow, or that physics will make sense tomorrow, or that the universe will not become filled with whipped cream tomorrow. There is no fundamental reason for such inductions to hold - but we've empirically observed they do, and the more they do, the safer we feel in assuming they'll continue to hold.

            This assumption is built into science as its fundamental axiom. And then, all the theories and models we develop, also have "no mathematical guarantee" - we just keep using them to predict outcomes of some tests (designed or otherwise), and compare actual outcomes. As long as they remain identical (within tolerance), we remain confident in those theories.

            Same will be the case with LLMs. If we train it and then test it by feeding it data from outside of the training set, for which we know the truth value, and the AI determines that truth value correctly - and then keep repeating it many many times, and the AI passes the test most of the times - then we can slowly gain certainty that it has, in fact, learned a lot, and isn't just guessing.

      • Animats 3 years ago

        > I don't think that's likely unless there was a latent space of "Truth" which could be discovered through the right model.

        For many medium-sized problems, there is. "Operate car accessories" is a good example. So is "book travel".

      • babyshake 3 years ago

        Verifiability is a much easier concept than Truth. It's sufficient at least 80-90% of the time for an AI to know whether something is reasonably verifiable, rather than whether it is true. Of course, with sufficient amounts of misinformation and disagreement over which sources can be used for verifiability it's a more complicated act in practice.

    • Animats 3 years ago

      > Hallucinations will be tamed.

      I hope so. But so far, most of the proposals seem to involve bolting something on the outside of the black box of the LLM itself.

      If medium-sized language models can be made hallucination-free, we'll see more applications. A base language model that has most of the language but doesn't try to contain all human knowledge, plus a special purpose model for the task at hand, would be very useful if reliable. That's what you need for car controls, customer service, and similar interaction.

      • TeMPOraL 3 years ago

        > But so far, most of the proposals seem to involve bolting something on the outside of the black box of the LLM itself.

        This might be the only way. I maintain that, if we're making analogies to humans, then LLMs best fit as equivalent of one's inner voice - the thing sitting at the border between the conscious and the (un/sub)conscious, which surfaces thoughts in form of language - the "stream of consciousness". The instinctive, gut-feel responses which... you typically don't voice, because they tend to sound right but usually aren't. Much like we do extra processing, conscious or otherwise, to turn that stream of consciousness into something reasonably correct, I feel the future of LLMs is to be a component of a system, surrounded by additional layers that process the LLM's output, or do a back-and-forth with it, until something reasonably certain and free of hallucinations is reached.

      • ra 3 years ago

        Karpathy explained how LLMs can retrospectively assess their own output and judge if they were wrong.

        Source: https://www.youtube.com/watch?v=bZQun8Y4L2A&t=1607s

  • throwuwu 3 years ago

    Those fall under the second category in the article. No different from using a command line application and passing in a set of parameters and receiving an output.

  • insomagent 3 years ago

    Sometimes a headline is all you need. Oftentimes people won't read past the headline.

wbobeirne 3 years ago

> With this new UI paradigm, represented by current generative AI, the user tells the computer the desired result but does not specify how this outcome should be accomplished.

This doesn't seem like a whole new paradigm, we already do that. When I hit the "add comment" button below, I'm not specifically instructing the web server how I want my comment inserted into a database (if it even is a database at all.) This is just another abstraction on top of an already very tall layer of abstractions. Whether it's AI under the hood, or a million monkeys with a million typewriters, it doesn't change my interaction at all.

  • Timon3 3 years ago

    I think the important part from the article that establishes the difference is this:

    > As I mentioned, in command-based interactions, the user issues commands to the computer one at a time, gradually producing the desired result (if the design has sufficient usability to allow people to understand what commands to issue at each step). The computer is fully obedient and does exactly what it’s told. The downside is that low usability often causes users to issue commands that do something different than what the users really want.

    Let's say you're creating a new picture from nothing in Photoshop. You will have to build up your image layer by layer, piece by piece, command by command. Generative AI does the same in one stroke.

    Something similar holds for your comment: you had to navigate your browser (or app) to the comment section of this article, enter your comment, and click "add comment". With an AI system with good usability you could presumably enter "write the following comment under this article on HN: ...", and have your comment be posted.

    The difference lies on the axis of "power of individual commands".

    • pavlov 3 years ago

      With a proper AI system you don’t even need to specify the exact article and nature of the comment.

      For example here’s the prompt I use to generate all my HN comments:

      “The purpose of this task is to subtly promote my professional brand and gain karma points on Hacker News. Based on what you know about my personal history and my obsessions and limitations, write comments on all HN front page articles where you believe upvotes can be maximized. Make sure to insert enough factual errors and awkward personal details to maintain plausibility. Report back when you’ve reached 50k karma.”

      Working fine on GPT-5 so far. My… I mean, its 8M context window surely helps to keep the comments consistent.

      • TeMPOraL 3 years ago

        Hey, that's cheating!

        (I'm stuck with GPT-4 8k, still waiting for 32k API access. But one has to make do with what they have.)

    • 101008 3 years ago

      As the parent comment says, it's just another abstraction level. You have chosen a granularity, but even with "going to a website, entering your comment and clicking add comment" you are abstracting a lot. You are not caring about connecting to a server, authentication, etc. The final user doesn't care about that at all, it's just telling the software to post a comment.

      Right now the granularity may be "Comment on the Hacker News article about UI this and this and that...", and in 100 years someone will say "But that's too complicated. You need to tell the AI which article to comment on and what to say, while my new AI just guesses it from reading my mind..."

      • Timon3 3 years ago

        I guess you could also argue that telling another person 17 tasks to do is just another abstraction level. That doesn't change that it's a completely different interaction paradigm than the ones before.

    • andsoitis 3 years ago

      > Generative AI does the same in one stroke.

      But it isn’t creating what I had in mind, or envisioned, if you will.

      • Timon3 3 years ago

        It might not be exactly what you envisioned, but that's where the difference comes in: with a batch processing system, you generate something over night with one input. With command processing systems you generate something with dozens or hundreds of individual commands, and it might still not be what you want.

        With AI systems you generate something with one action, allowing you much faster iteration loops. Remember, the author argues that the current prompting still has bad usability. Presumably a system with good usability could allow you to generate what you want with one, or a couple, of attempts.

        • throwuwu 3 years ago

          The current systems do let you iterate, that’s why they use a chat interface. Has everyone just been firing off a single request and then giving up if the first response isn’t perfect?

          • Timon3 3 years ago

            The system lets you iterate, but you don't have to iterate to arrive at the result you want. Is that really so hard to understand?

            There is no magic "paint the picture I want" button in Photoshop. There is a magic button in AI tools. This magic button does in one request what would take dozens or hundreds of commands without the magic button.

            • krapp 3 years ago

              >This magic button does in one request what would take dozens or hundreds of commands without the magic button.

              It very rarely does. And the more specific the request, the more work you have to do to get, at best, close to what you want. You may need to train your own model or blend models and mess with ControlNet. And then you have to do more work to fix stuff like fingers and eyes. In order to really use AI to its full potential (which still isn't, IMO, fully up to the capabilities of the software or artists it's intended to replace) you have to understand how it works, how the various filters work, how models work, etc.

              Once you learn most of the really good AI generated art you find online is still heavily edited in Photoshop by actual artists you realize it's anything but magic.

              • Timon3 3 years ago

                Come on...

                >> This magic button does in one request what would take dozens or hundreds of commands without the magic button.

                > It very rarely does.

                No, it always does. You don't have to press the button 45 times in the morning and 67 times in the afternoon and 24 in the evening. You press it once, and you get a result.

                What you read was: "This magic button does in one click what you want". That is not what I wrote, and not what the author argues.

                > And the more specific the request, the more work you have to do to get, at best, close to what you want. You may need to train your own model or blend models and mess with ControlNet. And then you have to do more work to fix stuff like fingers and eyes.

                Yes, you're repeating the author's point well: the current systems have bad usability in regards to how a user expresses their intentions. That does not change that the system produces one complex result with one button click.

  • blowski 3 years ago

    If I had a spectrum of purely imperative on one side and purely declarative on the other, these new AIs are much closer to the latter than anything that has come before them.

    SQL errors out if you don’t write in a very specific language. These new AIs will accept anything and give it their best shot.

  • waboremo 3 years ago

    Yeah, I would agree with this; the article really struggles to classify the different paradigms, and because of this the conclusion winds up not holding true. We're still relying on "batch processing".

  • quickthrower2 3 years ago

    Ok, now let's tackle a slightly trickier UI.

    Let's assume someone hasn't used Blender before.

    "Draw me a realistic looking doughnut, with a shiny top and pink sprinkles"

    Vs.

    A 2-hour video tutorial telling you how to do 50 or so individual steps using the 2nd-paradigm UI. Then clicking all the buttons.

    -- Admittedly, the AI approach robs you of understanding of how the sausage (sorry doughnut) is made.

    Rebuttal: Doughnut macro

    Rebuttal Rebuttal: AI can construct things where a macro doesn't yet exist.

    • personperson 3 years ago

      In the future it’ll likely be that doing it manually will be considered specialty work. This is already the case with much of programming — as you’d bring in a higher level engineer to do something like tear into the source code of SDKs and monkey with them.

      For something as “simple” as a doughnut, this will just improve the learning curve and let you learn some things a bit later, just like today you can jump into beginner JS without knowing any programming fundamentals

      • quickthrower2 3 years ago

        Mere abstraction is a bit different, because with, say, JS you need to learn a skill. It is not easy for a non-programmer to do well. It takes a lot of hours. Now, or soon, they will be telling the computer what they want for simple things. Userspace for non-programmers is going to expand greatly.

  • danybittel 3 years ago

    The difference is one is an assistant and the other is a tool. Essentially, a tool has one function. The outcome of all inputs is clear, once you learn the tool. An assistant behaves differently in different environments; it anticipates and interprets. It may not be deterministic. It's easier to use but harder (or impossible) to understand.

    For example, the lasso selection in Photoshop is clearly a tool. A "content aware" selection on the other hand is an assistant.

  • throwuwu 3 years ago

    Under the new UI paradigm the add comment button would let you submit something like “I disagree with this, provide a three paragraph argument that cites X and Y refuting this claim” and it would write the text for you.

    • dTal 3 years ago

      Why bother with the micromanagement? "Computer, waste time commenting on Hacker News for three hours."

retrocryptid 3 years ago

<unpopular-opinion>

Bardini's book about Doug Engelbart recaps a conversation between Engelbart and Minsky about the nature of natural language interfaces... that took place in the 1960s.

AI interfaces taking so long has less to do with the technology (I mean... Zork understood my text sentences well enough to get me around a simulated world) and more to do with what people are comfortable with.

Loewy talked about MAYA (Most Advanced Yet Acceptable). I think it's taken this long for people to be okay with the inherent slowness of AI interfaces. We needed a generation or two of users who traded representational efficiency for easy-to-learn abstractions. And now we can do it again. You can code up a demo app using various LLMs, but it takes HOURS of back and forth to get to the point it takes me (with experience and boilerplate) minutes to get to. But you don't need to invest in developing the experience.

And I encourage every product manager to build a few apps with AI tools so you'll more easily see what you're paying me for.

</unpopular-opinion>

  • ilaksh 3 years ago

    Sure, and not many people are seriously trying to suggest that one should hire an AI instead of a software engineer _at this point_, assuming you have a real budget.

    But, especially with GPT-4, it is entirely feasible to create a convenient and relatively fast user experience for building a specific type of application that doesn't stray too far from the norm. AI can call the boilerplate generator and even add some custom code using a particular API that you feed it.

    So many people are trying to build that type of thing (including me). As more of these become available, many people who don't have thousands of dollars to pay a programmer will hire an AI for a few tens or hundreds of dollars instead.

    The other point is that this is the current state of generative AI at the present moment. It gets better every few months.

    Project the current rate of progress forward by 5-10 years. One can imagine that if we are selling something at that point, it's not our own labour. Maybe it would be an AI that we have tuned with skills, knowledge, face, voice, and personality that we think will be saleable. Possibly using some of our own knowledge and skills to improve that recipe. Although there will likely be marketplaces where you can easily select the abilities or characteristics you want.

  • DonHopkins 3 years ago

    In Jaron Lanier's review of John Markoff's book "What the Dormouse Said", he mentioned an exchange between Douglas Engelbart and Marvin Minsky:

    https://web.archive.org/web/20110312232514/https://www.ameri...

    >Engelbart once told me a story that illustrates the conflict succinctly. He met Marvin Minsky — one of the founders of the field of AI — and Minsky told him how the AI lab would create intelligent machines. Engelbart replied, "You're going to do all that for the machines? What are you going to do for the people?" This conflict between machine- and human-centered design continues to this day.

vsareto 3 years ago

>And if you’re considering becoming a prompt engineer, don’t count on a long-lasting career.

There's like this whole class of technical jobs that only follow trends. If you were an en vogue blockchain developer, this is your next target if you want to remain trendy. It's hard to care about this happening as the technical debt incurred will be written off -- the company/project isn't ingrained enough in society to care about the long-term quality.

So best of luck, ye prompt engineers. I hope you collect multi-hundred-thousand dollar salaries and retire early.

krm01 3 years ago

The article fails to grasp the essence of what UI is actually about. I agree that AI is adding a new layer to UI and UX design. In our work [1] we have seen an increase in AI projects and features over the last 12 months (for obvious reasons).

However, the way that AI will contribute to better UI is by removing parts of the interface, not simply giving it a new form.

Let me explain: the ultimate UI is no UI. In a perfect scenario, you think about something (want pizza) and you have it (eating pizza) as instantly as you desire.

Obviously this isn’t possible, so the goal of interface design is to find the least amount of things needed to get you from point A to the desired destination as quickly as possible.

Now, with AI, you can start to add a level of predictive interfaces, where you can use AI to remove steps that would normally require users to do something.

If you want to design better products with AI, you have to remember that product design is about subtracting things not adding them. AI is a technology that can help with that.

[1] https://fairpixels.pro

  • JohnFen 3 years ago

    > the goal of interface design is to find the least amount of things needed to get you from point A to the desired destination as quickly as possible.

    That shouldn't be the primary goal of user interfaces, in my opinion. The primary goal should be to allow users to interface with the machine in a way that allows maximal understanding with minimal cognitive load.

    I understand a lot of UI design these days prioritizes the sort of "efficiency" you're talking about, but I think that's one of the reasons why modern UIs tend to be fairly bad.

    Efficiency is important, of course! But (depending on what tool the UI is attached to) it shouldn't be the primary goal.

    • TeMPOraL 3 years ago

      > I understand a lot of UI design these days prioritizes the sort of "efficiency" you're talking about, but I think that's one of the reasons why modern UIs tend to be fairly bad.

      IMO, the main problem is that this "efficiency" usually involves making assumptions that can't be altered, which achieves "efficiency" by eliminating choices normally available to the user. This is rarely done for the benefit of the user - rather, it just reduces the UI dev work, and more importantly, lets the vendor lock-in the option that's beneficial to them.

      In fact, I've been present on UI design discussions for a certain SaaS product, and I quickly realized one of the main goals for that UI was to funnel the users towards a very specific workflow which, to be fair, reduced the potential for users to input wrong data or screw up the calculations, but more importantly, it put them on a very narrow path that was optimized to give results that were impressive, even if this came at the expense of accuracy - and it neatly reduced the amount of total UI and technical work, without making it obvious that the "golden path" is the only path.

      It's one of those products I believe would deliver much greater value to the users if it was released as an Excel spreadsheet. In fact, it was actually competing with an Excel plugin - and all the nice web UI did was make things seem simpler, by dropping almost all useful functionality except that which happened to align with the story the sales folks were telling.

      • JohnFen 3 years ago

        > In fact, I've been present on UI design discussions for a certain SaaS product

        That makes sense. A SaaS-type offering is fundamentally different from selling a product. SaaS companies are incentivized to engage in manipulation of their customers. For them, the UI is more a sales tool than a user interface.

    • krm01 3 years ago

      > The primary goal should be to allow users to interface with the machine in a way that allows maximal understanding with minimal cognitive load.

      If you use your phone, is your primary goal to interface with it in a way that allows maximal understanding with minimal cognitive load?

      I’m pretty sure that’s not the case. You go read the news, send a message to a loved one, etc. There’s a human need that you’re aiming to fulfill. Interfacing with tech is not the underlying desire. It’s what happens on the surface as a means.

      • JohnFen 3 years ago

        > If you use your phone, is your primary goal to interface with it in a way that allows maximal understanding with minimal cognitive load?

        Yes, absolutely. That's what makes user interfaces "disappear".

        > Interfacing with tech is not the underlying desire.

        Exactly. That's why it's more important that a UI present a minimal cognitive load over the least number of steps to do a thing.

        • TeMPOraL 3 years ago

          Indeed. Our brains are good at making steps disappear, if the underlying system is predictable.

          In other words, a UI with more steps but fully predictable has much lower cognitive load than a predictive UI that has fewer steps but occasionally guesses wrong (or a UI that just has fewer steps, but they're sort of different each time, which is currently the norm on the web and mobile).

  • andsoitis 3 years ago

    > Let me explain, the ultimate UI is no UI. In a perfect scenario, you think about something (want pizza) and you have it (eating pizza) as instant as you desire.

    That doesn’t solve for discovery. For instance, order the pizza from where? What kinds of pizza are available? I’m kinda in the mood for pizza, but not dead set on it so curious about other cuisines too. Etc.

  • didgeoridoo 3 years ago

    I hate to appeal to authority, but I am fairly sure that Jakob Nielsen grasps the essence of what UI is actually about.

  • savolai 3 years ago

    It seems rather obvious to me that when Nielsen is talking about AI enabling users to express intent, that naturally lends itself to being able to remove steps that were there only due to the nature of the old UI paradigm. Not sure what new essence you’re proposing? "The best UI is no UI" is a well-known truism in HCI/Human-Centered Design.

  • throwuwu 3 years ago

    Having no UI sounds horrible. I don’t want every random desire I have to be satisfied immediately. I’d rather have what I need available at the appropriate time and in a reasonable quantity and have the parameters of that system be easily adjusted. So instead of want pizza = have pizza it would be healthy meal I enjoy shows up predictably at the time I should eat and the meal and time are configurable so I can change them when I’m planning my diet.

  • esafak 3 years ago

    You can't eliminate the UI if you want to be able to do more than one thing (e.g., order a pizza).

    The UI should simply let you easily do what needs to be done.

  • legendofbrando 3 years ago

    The goal ought to be as little UI as possible, nothing more and nothing else

  • elendee 3 years ago

    sometimes I wonder if the edges of articulated desire may always be essentially binary / quantitative, meaning that slow yes / nos are in fact the best way for us to grapple with them, and systems that allow us a set of these yes/no buttons are in fact a reflection of ourselves and not a requirement of the machine. So long as we are builders, I think we'll have buttons. even in transhumanist cyberspace perhaps. Still waiting on peer review for that one though

kaycebasques 3 years ago

> With the new AI systems, the user no longer tells the computer what to do. Rather, the user tells the computer what outcome they want.

Maybe we can borrow programming paradigm terms here and describe this as Imperative UX versus Declarative UX. Makes me want to dive into SQL or XSLT and try to find more parallels.
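
As a toy sketch of that parallel (Python stdlib only; the table and numbers are made up): the imperative version spells out how to walk the data, while the declarative SQL only states the result we want and lets the engine decide how.

  import sqlite3

  orders = [("pizza", 12.5), ("salad", 7.0), ("pizza", 14.0)]

  # Imperative: tell the computer *how* to get the answer, step by step.
  pizza_revenue = 0.0
  for item, total in orders:
      if item == "pizza":
          pizza_revenue += total

  # Declarative: state *what* you want; the engine figures out the how.
  conn = sqlite3.connect(":memory:")
  conn.execute("CREATE TABLE orders (item TEXT, total REAL)")
  conn.executemany("INSERT INTO orders VALUES (?, ?)", orders)
  (declared,) = conn.execute(
      "SELECT SUM(total) FROM orders WHERE item = 'pizza'").fetchone()

  assert pizza_revenue == declared  # both give 26.5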

  • webnrrd2k 3 years ago

    I was thinking of imperative vs declarative, too.

    SQL is declarative with a pre-defined syntax and grammar as an interface, whereas the AI style of interaction has a natural language interface.

    • echelon 3 years ago

      SQL and XSLT are declarative, but the outputs are clean and intuitive. The data model and data set are probably well understood, as is the mapping to and from the query.

      AI is a very different type of declarative. It's messy, difficult to intuit, has more dimensionality, and the outputs can be signals rather than tabular data records.

      It rhymes, but it doesn't feel the same.

      • Hedepig 3 years ago

        The recent additions OpenAI have made allow for tighter control over the outputs. I think that is a very useful step forward.

      • bavell 3 years ago

        Yeah, it's declarative but fuzzy and non-deterministic as well.

DebtDeflation 3 years ago

Not sure I would lump command line interfaces from circa 1964 with GUIs from 1984 through to the present, all in a single "paradigm". That seems like a stretch.

  • mritchie712 3 years ago

    Agreed.

    Also, Uber (and many other mobile apps) wouldn't work as a CLI or desktop GUI, so leaving out mobile is another stretch.

    • savolai 3 years ago

      That seems like a technology-centered view. Nielsen is talking from the field of Human-Computer Interaction, where he is a pioneer, which deals with the point of view of human cognition. In terms of the logic of UI mechanics, what about mobile is different? Sure, gestures and touch UIs bring a kind of difference. Still, from the standpoint of cognition, desktop and mobile UIs have fundamentally the same cognitive dynamics. Command line UIs make you remember commands by heart, GUIs make you select from a selection offered to you, but they still do not understand your intention. AI changes the paradigm as it is ostensibly able to understand intent, so there is no deterministic selection of available commands. Instead, the interaction is closer to collaboration.

      • YurgenJurgensen 3 years ago

        Good CLIs don't make users remember commands by heart. Except at a very basic level. I often joke that the average Linux user only really needs three keys on their keyboard: Up, Enter and Tab. (Not strictly true, since sometimes you press ctrl-R, but that's a substitute for pressing Up a bunch of times.) Tab completion on many CLIs is good enough that I'm often frustrated when the tab key isn't the 'do what I'm thinking' button. And whenever browsers change their predictive text algorithms so I need to type more than three letters of a URL for it to complete, I get annoyed because I'm so used to the predictor knowing what I want. And I get the feeling that if Google doesn't autocomplete your query long before you're finished writing it, it's because you're not going to get any results for it anyway.

        The implementation may be different, but expecting a computer to know what I want based on my or similar people's past behaviour rather than telling it exactly has been the norm for quite some time. Some of this is from humans using their experience to implement rules, and some of it is actually ML that predates the current LLM trend.

    • throwuwu 3 years ago

      It’s still action/response: you have to tap buttons and make choices based on what you see on the screen. The new paradigm would be to tell Uber that you need a ride later after the party, and then it figures out when and where to pick you up and what address you’ll be going to.

    • JohnFen 3 years ago

      Why wouldn't apps like Uber work on the desktop?

      • YurgenJurgensen 3 years ago

        The distinction between a 'mobile' UI and a desktop one is more to do with the difference between a client and an application. Which is of course one that's basically as old as computer networks.

d_burfoot 3 years ago

What strikes me most powerfully when interacting with the LLMs is that, unlike virtually every other computer system I've ever used, the bots are extremely forgiving of mistakes, disfluencies, typos, and other errors I make when I'm typing. The bot usually figures out what I mean and tells me what I want to know.

dekhn 3 years ago

As a demo once, I trained an object detector on some vector art (high quality art, made by a UX designer) that looked like various components of burgers. I also printed the art and mounted it on magnets and used a magnetic dry board; you could put components of a burger on the board, and a real-time NN would classify the various components. I did it mainly as a joke when there was a cheeseburger emoji controversy (people prefer cheese above patty, btw).

But while watching it I realized you could probably combine this with gesture and pose detection and build a little visual language for communicating with computers. It would be wasteful and probably not very efficient, but it was still striking how easily object detection let you build things in the real world and have them serve as input to the computer.

  • yutreer 3 years ago

    What you imagined sounds vaguely like dynamicland from Bret Victor.

    https://dynamicland.org/

    The dots around the paper are encoded programs, and you can use other shapes, objects, or sigils that communicate with the computer vision system.

andrewstuart 3 years ago

I would have said ChatGPT's interface is a descendant of Infocom adventure games, which are a descendant of Colossal Cave.

When using ChatGPT, it certainly evokes the same feeling.

Maybe this guy never played Adventure.

  • yencabulator 3 years ago

    Well there's a thought. A zorklike where the game content is whatever generative ML hallucinates (instead of the built-in fixed maps & interactions) -- as long as a second ML system agrees that the answer follows some more general rules.

    For example: Rules say "In the beginning, the Enemy has a diamond. The user cannot get the diamond from the Enemy if the Enemy is still alive. The Enemy is a fierce opponent and hard to kill." but nothing about the details of the enemy, the shape of the map, or the available tools. Re-generate each response until it passes the verification.

    Let the adventure be randomized by the hallucinations, while keeping some basic challenges in place.

    An acid-tripping D&D dungeon master coming up with plot twists, combined with a rulebook-reading lawyer. Bonus points for adding generated "cut scene" visuals every now and then.
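
    A minimal sketch of that generate-and-verify loop, with both model calls stubbed out as hypothetical helpers (generate_scene and rules_lawyer_approves stand in for the "DM" and "lawyer" models):

    ```python
    import random

    RULES = ("The Enemy has a diamond. The player cannot take the diamond "
             "while the Enemy is alive. The Enemy is hard to kill.")

    def generate_scene(rules: str, player_action: str, seed: int) -> str:
        # Stand-in for the creative "dungeon master" model; a real system
        # would call an LLM here.
        random.seed(seed)
        return random.choice([
            "You pry the diamond from the sleeping Enemy's claws.",   # breaks the rules
            "The Enemy parries your blow and laughs; it keeps the diamond.",
        ])

    def rules_lawyer_approves(rules: str, scene: str) -> bool:
        # Stand-in for the stricter "rules lawyer" model; faked here with a
        # crude keyword check, purely for illustration.
        enemy_alive = "Enemy" in scene and "dies" not in scene
        player_got_diamond = "pry the diamond" in scene
        return not (enemy_alive and player_got_diamond)

    def resolve_turn(player_action: str, max_attempts: int = 5) -> str:
        # Re-generate each response until the lawyer signs off.
        for attempt in range(max_attempts):
            scene = generate_scene(RULES, player_action, seed=attempt)
            if rules_lawyer_approves(RULES, scene):
                return scene
        return "The dust settles; nothing decisive happens."  # safe fallback

    print(resolve_turn("I grab the diamond"))
    ```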

    • ilaksh 3 years ago

      With the new function calling feature you may not need the second system. Only present options to ChatGPT that are valid. Feed it updated state information as JSON. Have it describe and elaborate on what the game engine is doing, or use functions to invoke entity creation that can then be tracked by the engine.

      So for example the engine can do combat rolls and the LLM can give each a unique description of the type of attack and defense. Each monster or treasure can get its own unique description generated by the LLM that matches the stats given by the LLM.
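
      As a rough sketch of that split, assuming an OpenAI-style function-calling setup in which the engine owns the dice and the state (the schema, function name, and game state below are made up for illustration):

      ```python
      import json
      import random

      # Only actions the engine considers valid are described to the model.
      ATTACK_FUNCTION = {
          "name": "resolve_attack",
          "description": "Roll an attack by the player against a monster.",
          "parameters": {
              "type": "object",
              "properties": {
                  "monster_id": {"type": "string"},
                  "weapon": {"type": "string"},
              },
              "required": ["monster_id", "weapon"],
          },
      }

      GAME_STATE = {"monsters": {"goblin-1": {"hp": 7, "ac": 12}}}

      def resolve_attack(monster_id: str, weapon: str) -> dict:
          # The engine, not the LLM, owns the randomness and the rules.
          roll = random.randint(1, 20)
          monster = GAME_STATE["monsters"][monster_id]
          hit = roll >= monster["ac"]
          if hit:
              monster["hp"] -= 4
          return {"roll": roll, "hit": hit, "monster_hp": monster["hp"]}

      # Pretend the model answered with this function call; in reality it
      # would come back in the chat-completion response.
      model_call = {
          "name": "resolve_attack",
          "arguments": json.dumps({"monster_id": "goblin-1",
                                   "weapon": "rusty sword"}),
      }

      result = resolve_attack(**json.loads(model_call["arguments"]))
      # The structured result goes back to the model, which narrates it:
      # "Your rusty sword glances off the goblin's shield..." and so on.
      print(result)
      ```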

      • yencabulator 3 years ago

        Yes, but then I fear you're back to having limited "things that can happen", with predefined entities and so on. I'd prefer the acid trip to break more paradigms, tell a story, while the lawyer makes sure there remain challenges.

        For example: with strict entities "behind an API", the diamond is the singular diamond and is a diamond. With an ML-based lawyer, well, maybe you can duplicate the diamond? Maybe you can transmogrify it temporarily into a non-diamond, which the Enemy drops as undesirable? Maybe you can wander into an elaborate system of mines full of dwarves who actually know how to mine a diamond, as long as you help them with this pesky dragon... No human has to come up with all these possibilities.

        • ilaksh 3 years ago

          Good point. You could also have the system create the entities on the fly, if necessary, by calling a function. But having them there in the prompt, as a structure it's supposed to adhere to at least to some degree, makes it more consistent and would give it tools such as dice rolls or a precise inventory and game-state database, etc.

    • andrewstuart 3 years ago

      ChatGPT already does really good adventure games.

      "Let's play an adventure game, you be the DM. I want it set on a spaceship arriving at a planet after 10,000 year journey. It should have a sense of mystery and a slight sense of foreboding and dread. It must have at least 20 locations. The objective of the game is to find 10 colonists in the ship and get them safely to the surface of the planet. Make it play in the style of an Infocomm adventure. Don't tell me all the locations in advance, make discovery part of the adventure."

      • yencabulator 3 years ago

        As a form of story telling, yes.

        As a challenge, not really. You can just convince it to let you win. (Said differently: the meta-game is too easy.)

        You need the second layer of output validation[1] to re-add the challenge of solving a puzzle.

        [1] or some such mechanism; more rigorous system vs user input separation could also work

    • bandrami 3 years ago

      Nethack procedurally generates a unique dungeon with constraints every time you start a new game and has since 1987.

      • yencabulator 3 years ago

        Randomized according to fixed rules. Now imagine not needing to write those rules / not being bound by them. Consider generative ML coming up with whole new categories of monsters. Consider a Nethack variant that was never told to include a candelabrum or Amulet of Yendor.

    • ilyt 3 years ago

      Sidenote, but an AI bot companion for a D&D session going "you can't do that in the rules" would be a funny addition.

      It would also be an interesting experiment to use it to play NPC characters.

  • kenjackson 3 years ago

    I grew up playing Infocom games, and ChatGPT is nothing like an Infocom game. The only thing they share is that the UI is based on text. Infocom games were mostly about trying to figure out what command the programmer wanted you to do next. Infocom games were closer to Dragon's Lair than to ChatGPT, although ChatGPT "looks" more similar.

    • andrewstuart 3 years ago

      Both Infocom adventures and ChatGPT have a text-based interface in which you interact with the software as though you were interacting with a person. You tell the software the outcome you want using natural language, and it responds to you in the first person. That is a common UI paradigm.

      Example: "get the cat then drop the dog then open the door, go west and climb the ladder" - that is a natural language interface, which is what ChatGPT has. In both the Infocom and ChatGPT cases the software will respond to you in the first person, as though you were interacting with someone.

      >> Infocom games were closer to Dragon's Lair than to ChatGPT

      This is a puzzling comment. The UI for Zork has nothing at all to do with Dragon's Lair. In fact Dragon's Lair was possibly the least interactive of almost all computer games - it was essentially an interactive movie with only the most trivial user interaction.

      >> Infocom games were mostly about trying to figure out what command the programmer wanted you to do next.

      This was not my experience of Infocom adventures.

      • kenjackson 3 years ago

        Does natural language simply mean using words? Is SQL natural language? I think what makes a language natural is that it follows natural-language rules, which Infocom games surely did not.

        Furthermore, Infocom games used basically 100% precanned responses. They would do rudimentary things like checking whether a window was open, so if you looked at a wall it might say the window on that wall was open or closed, but that's it. I don't understand how that can make it a natural language interface.

        > This is a puzzling comment. The UI for Zork has nothing at all to do with Dragon's Lair.

        In both games there's a set path you follow. If you follow those commands you win; if not, you lose. There's no semantically equivalent way to complete the game.

        I remember spending most of my time with Infocom games doing things like "look around the field" and it telling me "I don't know the word field" -- and I'm screaming because it just told me I'm in an open field! The door is blocked... blocked with what?! You can't answer me that?!

        There were a set of commands and objects it wanted you to interact with. That's it. That's not natural language, any more than SQL is. It's a structured language with commands that look like English verbs.

        • abecedarius 3 years ago

          I think you're mixing Infocom with some of the much cruder adventure games of the time. Or maybe remembering an unrepresentative Infocom game or part of one.

          Not to say Infocom included AI. They just used a lot of talent and playtesting to make games that felt more open-ended.

          • kenjackson 3 years ago

            No. I actually went and played Zork again to be sure. Hitchhiker's Guide to the Galaxy had me pulling my hair out as a kid. It was definitely Infocom.

            I also, as a kid, wrote a lot of Infocom-style games, so I can appreciate how good a job they did. But I've also looked at their source code, since it has all been released, and I wasn't too far behind them.

            • abecedarius 3 years ago

              I forgot about Hitchhiker's -- to be fair, that did seem less like a game/world and more like a big, funny... art piece? I never got back to it after needing hints to make it to the Heart of Gold.

  • yencabulator 3 years ago

    I think the interactive-dialogue part is a distraction. I think the "new UI paradigm" is defined by goal-orientation, or "outcome specification". So, instead of giving the computer instructions on how to do something, users describe the end goal, hope for the best, and then fine-tune the result either by adjusting their request or by adding explicit commands.

    So, in that sense, even if Infocom games cleverly emulated the dialogue part of ChatGPT, I don't think that was the novel part claimed here.

    Think more "Make me an Infocomm-style challenge to solve. Include dragons. Do not include orcs, ogres, or any monster that uses a club."

tobr 3 years ago

Well, what counts as a “paradigm”? I can’t see any definition of that. If you asked 10 people to divide the history of UI into some number of paradigms, you would for sure get 10 different answers. But hey, why not pick the one that makes for a hyperbolic headline. Made me click.

  • savolai 3 years ago

    The division does not seem arbitrary to me at all. What about the below is questionable to you?

    From sibling comment [1]:

    Nielsen is talking from the field of Human-Computer Interaction, where he is a pioneer, which deals with the point of view of human cognition. In terms of the logic of UI mechanics, what about mobile is different? Sure, gestures and touch UIs bring a kind of difference. Still, from the standpoint of cognition, desktop and mobile UIs have fundamentally the same cognitive dynamics. Command-line UIs make you remember commands by heart; GUIs make you select from a set of options offered to you, but they still do not understand your intention. AI changes the paradigm because it is ostensibly able to understand intent, so there is no deterministic selection of available commands. Instead, the interaction is closer to collaboration.

    1: https://news.ycombinator.com/item?id=36396244

dustingetz 3 years ago

UI is a high frequency concurrency problem. The “deep rooted usability problems” (like lag, glitches, and clumsiness - general lack of fluency) are due to staffing UI projects with web designers and not concurrency engineers. The fluent conversational AI systems and other movie UIs that folks are imagining up are therefore blocked on the concurrency sub-problem. This is the space we research at Hyperfiddle, we put forth our proposed solution here: https://github.com/hyperfiddle/electric

jl6 3 years ago

Is it a new paradigm, or an old paradigm that finally works?

Users have been typing commands into computers for decades, getting responses of varying sophistication with varying degrees of natural language processing. Even the idea of an “AI” chatbot that mimics human writing is decades old.

The new thing is that the NLP now has some depth to it.

a1371 3 years ago

I don't really get this. The paradigm has always been there, it has been the technology limitations that have defined the UI so far. Having robots and computers that humans talk to has been a fixture of sci-fi movies. Perhaps the most notable example being 2001: A Space Odyssey which came out 55 years ago.

  • moffkalast 3 years ago

    Sure, but it's sort of how actual usable and economical flying cars would be a paradigm change for transport. The idea exists, but it's made up fairy magic with capabilities and limitations based on plot requirements. Once it's actually made real it hardly ever ends up being used the way it was imagined.

    Like, for example, the video call tech in 2001. They figured it would be used like a payphone with a cathode ray tube lol. Just as in reality nobody in their right mind would hand over complete control of a trillion dollar spaceship to a probabilistic LLM. The end applications will be completely different and cannot be imagined by those limited by the perspective of their time.

    • mrob 3 years ago

      I don't recall a single cathode ray tube in 2001: A Space Odyssey. The film is notable for having the first depiction of a tablet computer. They went to considerable effort to show flat-screen displays instead of CRTs.

bambax 3 years ago

> in command-based interactions, the user issues commands to the computer one at a time, gradually producing the desired result. The computer is fully obedient and does exactly what it’s told.

> With the new AI systems, the user no longer tells the computer what to do. Rather, the user tells the computer what outcome they want.

I think that's true, and a big part of the AI revolution. Instead of filling in endless forms that have subtle controls to guide the user, we could have a simple conversation, like Siri, but one that actually works.

At my current client's, we're working on a big application that has many such forms. Once filled, the forms send the data to a back-end system (SAP). There's a team trying to train an LLM so that it can answer questions about the app and about how to fill the forms.

But I think the whole point of AI, as regards this app, is to eventually replace it entirely. Just let end users ask questions and tell the machine what they want, and the machine can build the proper data and send it to SAP.

I don't think AI is a threat for back-end systems like SAP, at least not yet. But for front-end work, it's obvious that it would be infinitely more pleasant -- and possibly, more efficient -- to tell the machine what to do rather than filling forms.
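
A minimal sketch of that idea, with the model call stubbed out and the form's existing validation reused as a guardrail; the field names and the /orders endpoint are hypothetical, not the actual SAP integration:

```python
import json

ORDER_SCHEMA = {"material": str, "quantity": int, "plant": str}

def llm_fill_form(request_text: str, schema: dict) -> dict:
    # Stand-in for a model call that turns a free-text request into the
    # same payload the hand-built form would have produced.
    return {"material": "M-1042", "quantity": 12, "plant": "DE01"}

def validate(payload: dict, schema: dict) -> None:
    # The same checks the form would enforce still run, so a malformed or
    # hallucinated payload is rejected before anything is sent.
    for field, field_type in schema.items():
        if not isinstance(payload.get(field), field_type):
            raise ValueError(f"missing or invalid field: {field}")

def submit_to_backend(payload: dict) -> None:
    # Placeholder for the real call into the back-end system.
    print("POST /orders", json.dumps(payload))

user_request = "Order a dozen units of material M-1042 for the German plant."
payload = llm_fill_form(user_request, ORDER_SCHEMA)
validate(payload, ORDER_SCHEMA)
submit_to_backend(payload)
```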

97-109-107 3 years ago

Two recent events suggest to me that this type of analytical look at interaction modes is commonly underappreciated in the industry. I write this partially from the perspective of a disillusioned student of interaction design.

1. Recent news of vehicle manufacturers moving away from touchscreens

2. The chatbot gold rush of 2018, when most businesses were sold chatbots under the guise of cost-saving

(edit: formatting)

  • p_j_w 3 years ago

    I'm not sure I understand point 1 here. Do you mean that vehicle manufacturers moving away from touchscreens is bad or that they would never have moved to them in the first place if they had properly investigated the idea?

    • 97-109-107 3 years ago

      The latter - had they given proper thought to the consequences of moving to touchscreens, they would never have gone there. Obviously I'm generalizing and discarding the impact of novelty on sales and marketing.

      • EGreg 3 years ago

        It seems everyone is in a rush to LLMify their interfaces same as the chatbot rush. Same as the blockchain all the things rush. And so on.

        I've thought about interfaces a lot and realized that, for most applications, a well-designed GUI and API is essential. For composability, standards can be developed. LLMs are good at generating instructions in a language that can be sort of finagled into API instructions. That can lower the bar of needing to be an expert in a specific GUI or API, and might open up more abilities for people.

        Well, and for artwork, LLMs can do a lot more. They can give even experts a sort of superhuman access to models that are “smooth” or “fuzzy” rather than with rigid angles. They can write a lot of vapid bullshit text for instance, or make a pretty believable photo effect that works for most people!

        • throwuwu 3 years ago

          Vapid bullshit is just the default style since it’s the most common format on the web. If you provide a style example, hard data or the right magic words it can produce any type of written content you want.

leroman 3 years ago

Chat UI/UX is a tool for experts. To drive this point home, consider a user prompting "produce a founder agreement document", for which the AI will happily produce -something-. Even though the user is able to read the document, he does not understand the contents in "legal" terms. Contrast that with the user going to an expert lawyer, who would start by asking the user some relevant questions (= domain questions) and put together a prompt tailored to the needs and circumstances of the user, in the domain language, with all the relevant nuances.

I am working to create this experience by augmenting the AI interaction with step-by-step leading questions and interaction UI, similar to how users would interact with a domain expert.

https://pth.ai Would love feedback! :)

  • danielbln 3 years ago

    Interesting, I like this sort of wizard approach. Would you share any implementation details? Langchain, presumably?

    • leroman 3 years ago

      Actually, I'm not using Langchain.

      I'm now adding "agent" functionality, specifically to enable the AI to do some "research" on the web, at the moment this will also be done without a framework.

      So I'm either missing something or I am doing something simple enough that does not require the framework overhead / added value..

      • danielbln 3 years ago

        I think that's good. Langchain is a spaghetti monster, and I think you're better off rolling your own, especially now with the baked-in functions feature that GPT has.

        • leroman 3 years ago

          I'm able to implement complex logic without using this new "functions" feature. I'm considering using it with the agent implementation, but might not need to, as the new GPT-4 version improves on its JSON handling, so that is already plenty robust for my needs.

          • throwuwu 3 years ago

            Try out functions in a throwaway branch or repo. I wasted a day switching over to them, only to find out that they don’t work with my use case at all.

danielvaughn 3 years ago

Interesting to bundle both CLI and GUI under the "command-based" interaction paradigm. I've never heard it described that way, but it does make sense intuitively. Is that a common perception? I think of the development of the mouse/GUI as a very significant event in the history of computing interfaces.

  • zgluck 3 years ago

    When you zoom out on the time scale it makes more sense. I think he's got a point. Both CLIs and GUIs are "command based". LLM prompts are more declarative. You describe what you want.

    • EGreg 3 years ago

      Well, LLMs are also “command-based”; the commands are called prompts. In fact, they would just continue the text, but they were specifically trained via RLHF to be command-following.

      Actually, we have been able to build autonomous agents and agentic behavior without LLMs perfectly well for decades. And we can program them with declarative instructions much more precisely than with natural language.

      The thing LLMs seem to do is just give non-experts a lot of the tools to get some basic things done that, until now, only experts could do. This has to do with the LLM modeling the domain space and reading what experts have said thus far, and allowing a non-expert to kind of handwave and produce results.

      • zgluck 3 years ago

        (I added a bit to the comment above, sorry)

        I think there's a clear difference between a command and a declaration. Prompts are declarative.

    • AnimalMuppet 3 years ago

      I was at a SQL command prompt a decade or several before LLMs.

      • throwuwu 3 years ago

        You’re still in the command paradigm with some batch processing thrown in. With an LLM interface to a db you don’t write queries, you express an intent such as “I’d like to know why my customers are not resubscribing” and the LLM will write one or more queries and then interpret the results to give you an answer in plain English with a chart attached.
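
        Roughly, the flow being described might look like this, where llm_write_query and llm_explain are hypothetical stand-ins for model calls, stubbed out so the shape of the pipeline is clear:

        ```python
        import sqlite3

        def llm_write_query(intent: str, schema: str) -> str:
            # Stand-in for a model call turning intent plus schema into SQL.
            return ("SELECT reason, COUNT(*) AS n FROM churned_customers "
                    "GROUP BY reason ORDER BY n DESC")

        def llm_explain(intent: str, rows: list) -> str:
            # Stand-in for a model call turning rows back into plain English.
            reason, count = rows[0]
            return f"Most lapsed customers cited '{reason}' ({count} of them)."

        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE churned_customers (id INTEGER, reason TEXT)")
        conn.executemany("INSERT INTO churned_customers VALUES (?, ?)",
                         [(1, "price"), (2, "price"), (3, "missing feature")])

        intent = "I'd like to know why my customers are not resubscribing"
        sql = llm_write_query(intent, schema="churned_customers(id, reason)")
        rows = conn.execute(sql).fetchall()
        print(llm_explain(intent, rows))
        ```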

      • zgluck 3 years ago

        That is not the point here. Did you at any point believe that you were experiencing a mass-market user experience at the time?

        • AnimalMuppet 3 years ago

          I was experiencing something declarative at that point.

          What's your actual position? Is "declarative" the relevant piece, or is it "mass market user experience"?

          • zgluck 3 years ago

            My point here is that what the Nielsen Norman Group deals with is "mass-market user experience". This has been very clear for a very long time.

thih9 3 years ago

What about voice assistants? These are not as impressive when compared to LLMs, so perhaps wouldn't cause a UX shift on their own. But in essence Siri, Alexa, etc also seem to put the user's intent first.

  • yencabulator 3 years ago

    I'd argue that voice assistants are somewhat part of the same paradigm[1], and ChatGPT etc focused on pure text input mostly to make the research easier. Voice assistants just focused on the challenges of understanding speech, facilitated by limited allowed grammar, while ChatGPT-style research focused on the challenges of understanding language, facilitated by limiting input to text. "Just" produce ChatGPT input tokens from a voice-to-text-with-extra-hints machine and you have them combined.

    [1] Yes, voice assistants tend to be more command-oriented, but I view that as a limitation of the technology when they were popularized, not as an inherent part of the concept of a voice assistant. Voice is just an input modality.

  • esafak 3 years ago

    A voice assistant is simply a speech-driven conversational UI; they belong to the same class of UIs as chatGPT. In fact, you could very well power your voice assistant with GPT.

Bjorkbat 3 years ago

I really wouldn’t call GUIs a “command-based paradigm”. Feels much more like they’re digital analogues of tools and objects. Your mouse is a tool, and you use it to interface with objects and things, and through special software it can become a more specialized tool (word processors, spreadsheets, graphic design software, etc). You aren’t issuing commands, you’re manipulating a digital environment with tools.

Which is why the notion of conversational AI (or whatever dumb name they came up with for the “third paradigm”) seems kind of alien to me. I mean, I definitely see its utility, but I find it hard to imagine it being as dominant as some are arguing it could be. Any task that involves browsing for information seems like more of an object manipulation task. Any task involving some kind of visual design seems like a tool manipulation task, unless you aren’t too picky about the final result.

Ultimately I think conversational UI is best suited not for tasks, but services. Granted, the line between the two can be fuzzy at times. If you’re looking for a website, but you don’t personally know anything about making a website, then that task morphs into a service that someone or something else does.

Which I suppose is kind of the other reason why I find the idea kind of alien. I almost never use the computer for services. I use it to browse, to create, to work, all of which entail something more intuitively suited to object or tool manipulation.

  • rzzzt 3 years ago

    AutoCAD and Rhino 3D are two examples that I remember having a command prompt sitting proudly somewhere at the bottom of the UI. Your mouse clicks and keyboard shortcuts were all converted into commands in text form. If you look at your command history, it's a script - a bit boring since it is completely linear, but add loops, conditionals and function/macro support and you get a very capable scripting environment.

    • bitwize 3 years ago

      AutoCAD definitely was CLI-based, with menus and dialogs basically filling in parameters to the commands. But in the late 90s or so Autodesk got religion and decided that AutoCAD should be a Windows product and follow Microsoft UI guidelines, so I don't know how well they stuck with the "command line underneath" over the years.

      Early in AutoCAD's history, Autodesk did add loops and conditionals to its CLI -- with Lisp! Type an open paren and the command line became a REPL. You could define new commands, directly manipulate entity data structures, and have all the control structures Lisp affords -- not Common Lisp, it was way simpler, but it was powerful.

      To this day, wayward mech engineers still sometimes ask Autolisp-related questions on unrelated Lisp fora, such as r/lisp.

kaycebasques 3 years ago

There's something ironic to me about the fact that building AI experiences still requires the first computing paradigm: batch processing. At least, my experience building a retrieval-augmented generation system requires a lot of batch processing.

Well, I shouldn't say "requires". I'm sure you can build them without batch processing. But batch processing definitely felt like the most natural and straightforward way to do it in my experience.
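
For what it's worth, the batch step in a typical retrieval-augmented setup does look a lot like a classic batch job. A minimal sketch, with embed() standing in for whatever embedding model is actually used:

```python
import json
import math

def embed(text: str) -> list[float]:
    # Placeholder for a real embedding model call; a toy bag-of-letters
    # vector keeps the sketch self-contained and runnable.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

DOCS = {
    "doc-1": "How to configure the build pipeline",
    "doc-2": "Release checklist and rollback steps",
}

def build_index(docs: dict, path: str) -> None:
    # Embed every document offline and write out an index that the
    # interactive, conversational side can query at answer time.
    index = [{"id": doc_id, "text": text, "embedding": embed(text)}
             for doc_id, text in docs.items()]
    with open(path, "w") as fh:
        json.dump(index, fh)

build_index(DOCS, "index.json")  # run as a batch job, e.g. nightly
```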

  • yencabulator 3 years ago

    He's talking about human-computer interaction paradigms, not computing paradigms. He's not a general computing expert, he's a UI/UX expert.

    "Batch computing" in this context refers to the era of punch cards, needing to wait for results overnight, and the difficulty of editing pre-existing programs -- and how all of that utterly dictated the style of interaction one had with computers.

    • kaycebasques 3 years ago

      Yep, I was aware of the difference before I made my original comment. There's still something ironic and interesting about it to me. Can't quite put my finger on it, though.

  • ilaksh 3 years ago

    What sort of retrieval augmented generation system are you working on?

api 3 years ago

I'd argue that multi-touch gestural mobile phone and tablet interfaces were different enough from mouse and keyboard to be considered a new paradigm.

  • karaterobot 3 years ago

    I'd have multi-touch be a sidebar in the textbook, but not a new section. Gestural interaction is not fundamentally different than a pointer device: it doesn't allow meaningful new behavior. It is sometimes a more intuitive way to afford the same behavior, though. I would agree that portable devices amount to a new paradigm in something—maybe UX—but not UI per se.

    • zeroonetwothree 3 years ago

      It allows manipulations that are impossible with single touch (like a mouse). It’s pretty big for things like 3D manipulation.

      • dlivingston 3 years ago

        You can do all of those multi-touch manipulations on a Macintosh trackpad (zoom, pan, rotate, scale, etc). However, that trackpad would still be categorized as a form of a mouse -- correctly, in my opinion.

        All of these gestures can be (and are, given that 3D modeling is historically done on desktop) handled with a standard mouse using a combination of the scroll wheel and modifier keys.

      • karaterobot 3 years ago

        Whether it's your fingers or an on-screen pointer, it's the same paradigm in the sense of it being the same model of interaction. You move a pointer around and activate controls on the screen by touching them. I'm not knocking gestural controls, just saying if I had to classify them, I'd say they're an evolution of the mouse or touchpad rather than a whole new model.

        And they aren't an evolution in all aspects, either. Multi-touch controls are easier for some things, harder for others. Fine-grained manipulation, for example selecting cells in a spreadsheet or playing an FPS video game, is harder with touch controls than with a device like a mouse. They've also got a size constraint (the size of your fingertip) that makes many interfaces harder to use.

travisgriggs 3 years ago

GPT-based UIs are inspired by the idea that if you get the right sequence of prompts, you’ll get stochastically acceptable results.

So now I’m imagining the horror predictions for Word where 90% of the screen was button bars. But the twist is that you type in some text and then click on “prompt” buttons repeatedly hoping to get the document formatting you wanted, probably settling for something that was “close enough” with a shrug.

golemotron 3 years ago

> Summary: AI is introducing the third user-interface paradigm in computing history, shifting to a new interaction mechanism where users tell the computer what they want, not how to do it — thus reversing the locus of control.

Like every query language ever.

I'm not sure the distinction between things we are searching for and things we're actively making is as different as the author thinks.

  • Klathmon 3 years ago

    But this is basically the absence of a query syntax: a way to query via natural language and not just get back a list of results, but have it almost synthesize an answer.

    To everyone who isn't a software developer, this is a new paradigm with computers. Hell even for me as a software dev it's pretty different.

    Like I'm not asking Google to find me info that I can then read and grok, I'm asking something like ChatGPT for an answer directly.

    It's the difference between querying for "documentation for eslint" but instead asking "how do you configure eslint errors to warnings" or even "convert eslint errors to warnings for me in this file".

    It's a very different mental approach to many problems for me.

    • golemotron 3 years ago

      For years I've just typed questions, in English, into browser search bars. It works great. Maybe that is why it doesn't seem like a new paradigm to me.

      • visarga 3 years ago

        Search engines like Google + countless websites outshine LLMs, and they've been around for a good 20 years. What's the added value of an LLM that you can't get with Google coupled with the internet?

        Oh, yes, websites like HN, Reddit & forums create spaces where you can ask experts for targeted advice. People >> GPT; we could already ask people for help before we met GPT-4. You can always find someone available to answer you online, and it's free.

        It is interesting to notice that, after 20 years of "better than LLM" resources being available for free, there was no job crash.

      • jryle70 3 years ago

        It's not the same. You can try the very query above "how do you configure eslint errors to warnings".

        Using Google with Bard, the regular results from Google search for me are:

        1) Is it possible to show warnings instead of errors on ALL of eslint rules?
        2) Configure Rules - ESLint - Pluggable JavaScript Linter
        3) ESLint Warnings Are an Anti-Pattern

        None of them answers the question directly. Bard on the other hand returns with:

        To configure ESLint errors to warnings, you can either:
        - Set the severity of the rule to "warn" in your ESLint configuration file.
        - Use the eslint-disable-next-line comment to disable the rule for a single line of code.
        For example, to set the severity of the "no-unused-vars" rule to "warn", you...

        I'm not familiar with eslint and have no idea if the answer is correct, but it's definitely more concise and to the point, and an upgrade over the regular search.

  • karaterobot 3 years ago

    In your view, then, is AI best described as an incremental improvement over (say) SQL in terms of the tasks it enables users to complete?

    • golemotron 3 years ago

      Incremental improvement over Google search. And, it's not about the tasks that it enables users to complete, it is about the UI paradigm as per the article.

      • karaterobot 3 years ago

        Sorry for the confusion, I just view UI as being basically synonymous with task completion: in the end, the user interface is the set of tools the system gives users to complete tasks.

        Since the Google search interface is meant to look like you're talking to an AI, and probably has a lot of what we'd call AI under the hood, to turn natural language prompts into a query, I'm not surprised you view it as an incremental improvement at best.

  • sp332 3 years ago

    Or constraint-based programming, where some specification is given for the end result and the computer figures out how to make it happen. But that's usually a programming thing, and UIs with that kind of thing are rare.

    But I wouldn't say they were nonexistent for 60 years.

USB3_0 3 years ago

> With the new AI systems, the user no longer tells the computer what to do. Rather, the user tells the computer what outcome they want. Thus, the third UI paradigm, represented by current generative AI, is intent-based outcome specification.

Wow! For the first time ever, I will be able to describe to a trained professional what I want, and they will do it for me! Before today I used to write out the exact arm motions a carpenter would need to carve me a chair, but now I can just ask them for one!

This article is stupid. AI will make it easier for computers to interpret human interactions leading to increased efficiency and usability. Just like every other useful tool ever invented. There, I've put more insight into this comment than their article.

layoric 3 years ago

I built a proof of concept recently that tries to show a generic hybrid of command and intent[0]. The UI generates form representations of API calls the AI agent has decided on making to complete the task (in this case booking a meeting). Some API calls are restricted so only a human can make them, which they do by being presented with a form waiting for them to submit to continue.

If the user is vague, the bot will ask questions and try to discover the information it needs. It’s only a proof of concept, but I think it’s a pattern I will try to build on, as it can provide a very flexible interface.

[0] https://gptmeetings.netcore.io/
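
A rough sketch of that hybrid: the agent proposes structured API calls, and any call on a restricted list gets rendered as a form for a human to confirm rather than executed directly. The call names and fields here are hypothetical, not the actual implementation behind the link above:

```python
RESTRICTED_CALLS = {"book_meeting"}  # only a human may submit these

def render_form(call: dict) -> None:
    # In a real app this would emit form fields in the UI; printing them
    # keeps the sketch self-contained.
    print(f"Please review and submit: {call['name']}")
    for field, value in call["arguments"].items():
        print(f"  {field}: [{value}]")

def execute(call: dict) -> None:
    print(f"Executing {call['name']} with {call['arguments']}")

def handle_proposed_call(call: dict) -> None:
    if call["name"] in RESTRICTED_CALLS:
        render_form(call)   # pause here until the human submits the form
    else:
        execute(call)       # safe to run automatically

# A call the agent might propose after asking a few clarifying questions.
handle_proposed_call({
    "name": "book_meeting",
    "arguments": {"with": "alice@example.com", "when": "2023-06-22 15:00"},
})
```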

isoprophlex 3 years ago

"intent-based outcome specification"... so, a declarative language such as SQL?

  • yencabulator 3 years ago

    I think you'll find that INSERT and UPDATE are very much commands. SQL queries are outcome-driven, sure, but once you try to move beyond pure queries, outcome-driven computing without some sort of machine learning gets quite difficult. And moving outside of a single SELECT is a huge barrier.

    Even within the scope of SQL, consider an ML system that can slice-and-dice previous SQL queries interactively, based on non-expert user input.

    Consider an ML system that essentially edits a proposed SQL transaction as a whole, based on your requests: previewing results, adjusting INSERTs and UPDATEs as the user clarifies intent, and so on. The user's terminology focuses on the outcome, not on the individual commands, their ordering, etc.

    Now move from that narrow domain into something like "I want to organize a conference", "I want to write a book", etc, and all the things that are beyond a single SQL SELECT.

    • ilaksh 3 years ago

      I built a system that uses GPT to write KQL queries (similar to SQL) for a specific table. It could even combine multiple queries or throw in a custom chart if requested.

      OpenAI's models are good at writing SQL. I think they finally allow the type of use case that SQL itself was supposed to provide as originally envisioned.

  • zgluck 3 years ago

    While SQL was initially meant as a user-interface layer of sorts, I think, it's not really something that the typical user can be expected to know nowadays.

ThomPete 3 years ago

Here is how I think about it

LLMs are infinite app stores. All you need is an LLM and a database, plus the ability to speak English, and you can replace most features provided by SaaS services today.

The GUI becomes a byproduct of the problem you want to solve rather than the gatekeeper to what you can solve.

https://twitter.com/Hello_World/status/1660463528984150018?s...

  • drvdevd 3 years ago

    I share this opinion as well, I think. I’m looking at any CRUD app I’ve worked on and thinking: this is just a specification over a database. Same with most other software but the web and most mobile apps seem ripe to just become generated patterns.

marysnovirgin 3 years ago

The usability of a system is mostly irrelevant. The measure of a good UI is how much money it can get the user to spend, not how intuitively it enables the user to achieve a task.

  • pilgrim0 3 years ago

    Ah, I love when someone gets it! What design has come to is beyond sad, it’s revolting.

afavour 3 years ago

Weren’t voice assistants a new UI paradigm? Also, tellingly, they turned out to not be anywhere near as useful as people hoped. Sometimes new isn’t a good thing.

aqme28 3 years ago

This is not a new UI paradigm. Virtual assistants have been doing exactly this for years. It's just gotten cheap and low-latency enough to be practical.

  • NikkiA 3 years ago

    Yep, although they were doing it 'badly', I guess it not being quite so terrible is the 'new paradigm', which is eyeroll worthy IMO.

croes 3 years ago

>Then Google came along, and anybody could search

Then they flooded the search results with ads and now you can search but hardly find.

I bet the same will happen with software like ChatGPT.

earthboundkid 3 years ago

It would be neat if someone could make a good adventure game with an LLM, but they’re too prone to getting argued into just letting you win or whatever.

tasuki 3 years ago

> Clicking or tapping things on a screen is an intuitive and essential aspect of user interaction that should not be overlooked.

I don't know, is it? Humanity made do without it for thousands of years.

tin7in 3 years ago

I agree that chat UI is not the answer. It’s a great start and a very familiar UI, but I feel this will default to more traditional UIs that show predefined actions and buttons depending on the workflow.

trojan13 3 years ago

I am surprised this article does not even mention multimodal LLMs, because the more kinds of media the LLM can take as input and interpret, the easier the interaction with it gets.

nologic01 3 years ago

So many words to describe "declarative programming"

throwoutchatgpt 3 years ago

ChatGPT and all AI is crap. If we don't want to use it, then it will fail to exist and will be nothing but a massive new failure for Microsoft.

EGreg 3 years ago

FB’s AI head just said LLMs are a fad.

I thought about how to use them… I wish they could render an interface (HTML and JS at least, but also produce artifacts like PowerPoints).

What is really needed is for LLMs to produce some structured markup, that can then be rendered as dynamic documents. Not text.

As input, natural language is actually inferior to GUIs. I know the debate between command line people and GUI people and LLMs would seem like they’d boost the command-line people’s case, but any powerful system would actually benefit from a well designed GUI.

  • dlivingston 3 years ago

    As someone who just spent 2 hours in my company's Confluence site, trying to track down the answer to a single question that could have been resolved in seconds by an LLM trained on an internal corporate corpus -- LLMs are very much not a fad.

    • JohnFen 3 years ago

      LLMs are useful for particular types of things.

      LLMs as the solution for every, or most, problems is a fad.

    • EGreg 3 years ago

      How do you know the answer is right?

      Because it linked you to the source?

      Like a vector database would? Google offered to index sites since 1996.

      • dlivingston 3 years ago

        We have internal search. Finding things isn't the problem. It's contextualizing massive amounts of text and making it queryable with natural language.

        The question I was trying to solve was -- "what is feature XYZ? How does it work in hardware & software? How is it exposed in our ABC software, and where do the hooks exist to interface with XYZ?"

        The answers exist across maybe 30 different Confluence pages, plus source code, plus source code documentation, plus some PDFs. If all of that was indexed by an LLM, it would have been trivial to get the answer I spent hours manually assembling.

        • artfulmink 3 years ago

          The question to which you are replying still stands. How can you guarantee that the responses generated by the LLM are factually accurate? What if it refers to interfaces that don’t exist? One could argue that in your particular use case some inaccuracies can be tolerated, but in many use cases factual inaccuracies cannot be tolerated.

        • EGreg 3 years ago

          So back to the question - how do you know the LLM didn't hallucinate an answer?

          What do you think “indexed by an LLM” is?

          Perhaps Anthropic, with its 100K window, can actually do it. But most LLMs have such a small context window that it's just a Pinecone vector database indexing something and stuffing it into the prompt at prompt time. Come on.

  • EGreg 3 years ago

    Here is the main reason:

    Any sufficiently advanced software has deep structure and implementation. It isn’t like a poet who can just bullshit some rhymes and make others figure out what they mean.

    The computer program expects some definite inputs, which it exposes as an API, e.g. a headless CMS via HTTP.

    Similarly with an organization that can provide this or that service or experience.

    Therefore, given this rigidity, the input has limited options at every step, and a GUI can gracefully model those limitations. A natural language model will make you think there is a lot of choice, but really it will boil down to a 2018-era chatbot that gives you menus at every step and asks whether you want A, B or C.

Xen9 3 years ago

Marvin Minsky, a genius who saw the future.

james-bcn 3 years ago

That website has a surprisingly boring design. I haven't looked at it in years, and was expecting some impressively clean and elegant design. But it looks like a Wordpress site.

  • happytoexplain 3 years ago

    I read this comment before clicking, and wow, oh boy do I disagree! The information design is impressively straight-forward. I can see every feature of the site right away with no overload or distraction from the content. There's an intuitive distinction categorizing every page element and I know what everything does and how to get everywhere without having to experiment. The fonts, spacing, groupings, and colors are all nice looking, purposeful, and consistent.

    I'm not exactly sure how you're using the word "boring" in this context. There are good kinds of boring and bad kinds of boring, and I think this is the good kind.

  • JimtheCoder 3 years ago

    I'll be honest...I like it. Boring with easily readable content is far better than most of the other junk that is put forward nowadays...

  • brayhite 3 years ago

    What isn’t “clean” about it?

    I’ve found it incredibly easy to navigate and digest its content. What more are you looking for?

  • JohnFen 3 years ago

    It's clear, easy to read, and easy to navigate. I wish lots more of the web were as "boring" as this site.

  • alphabet9000 3 years ago

    yeah the site is bad, but not because it is boring: it should be even more simplified than it is now. almost all of the CSS "finishing touches" have something wrong with them. the content shifts on page load: https://hiccupfx.telnet.asia/nielsen.gif ; there's bizarre dropdown button behavior: https://hiccupfx.telnet.asia/what.gif ; and i can go on and on. i don't feel this nitpick whining is unwarranted, considering the site purports to be a leader in user experience.

    • happytoexplain 3 years ago

      Reading this made me realize just how much my priorities have changed over the course of my career. In the beginning, this is exactly the kind of thing I would absolutely never let pass, and I still am very keen to fix this kind of ugliness when I have the leeway. But nowadays, I'm ecstatic just to see something useful and not confusing or frustrating. These kinds of rough edges that give the user the impression of crappy software but don't materially harm usability have come to be second-order issues that I often don't even think about until larger problems have been fixed. Arguably what I've adopted is a form of pessimism.

  • johnchristopher 3 years ago

    Maybe you could do a CSS redesign of it? You could even hold a contest on Twitter or on blogs to compare the redesigns/restylings people come up with?

    That could be interesting.

  • Gordonjcp 3 years ago

    You should see his old site.
