Survey Study on AI Agent Architectures (2024)

arxiv.org

77 points by jslampe 2 years ago · 120 comments

rck 2 years ago

Not mentioned in the paper, but I have been experimenting with behavior trees for LLM agents, and have had a lot of success: https://richardkelley.io/dendron/tutorial_intro/
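
For readers new to the idea, here is a minimal, illustrative sketch of a behavior tree driving an LLM-style agent. It is not Dendron's API: every class and function name below (`Status`, `Sequence`, `Fallback`, `fake_llm`, and so on) is hypothetical, and `fake_llm` is a stand-in for a real model call. The tree tries to generate and validate an answer, and falls back to a canned reply if validation fails.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    RUNNING = "running"


class Node:
    def tick(self, blackboard: Dict[str, str]) -> Status:
        raise NotImplementedError


class Action(Node):
    """Leaf node wrapping a function that reads/writes the blackboard."""

    def __init__(self, name: str, fn: Callable[[Dict[str, str]], Status]):
        self.name = name
        self.fn = fn

    def tick(self, blackboard: Dict[str, str]) -> Status:
        return self.fn(blackboard)


class Sequence(Node):
    """Ticks children in order; stops at the first child that is not SUCCESS."""

    def __init__(self, children: List[Node]):
        self.children = children

    def tick(self, blackboard: Dict[str, str]) -> Status:
        for child in self.children:
            status = child.tick(blackboard)
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback(Node):
    """Ticks children in order; stops at the first child that is not FAILURE."""

    def __init__(self, children: List[Node]):
        self.children = children

    def tick(self, blackboard: Dict[str, str]) -> Status:
        for child in self.children:
            status = child.tick(blackboard)
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


def fake_llm(prompt: str) -> str:
    # Hypothetical stand-in for a model call: returns an empty draft
    # whenever the prompt contains the word "hard".
    return "" if "hard" in prompt else f"draft answer to: {prompt}"


def generate(bb: Dict[str, str]) -> Status:
    bb["answer"] = fake_llm(bb["prompt"])
    return Status.SUCCESS


def answer_is_usable(bb: Dict[str, str]) -> Status:
    return Status.SUCCESS if bb.get("answer") else Status.FAILURE


def give_up(bb: Dict[str, str]) -> Status:
    bb["answer"] = "Sorry, I could not answer that."
    return Status.SUCCESS


def build_agent() -> Node:
    # Try "generate then check"; if that branch fails, use the canned reply.
    return Fallback([
        Sequence([Action("generate", generate), Action("check", answer_is_usable)]),
        Action("give_up", give_up),
    ])


def run(prompt: str) -> Tuple[Status, str]:
    blackboard = {"prompt": prompt}
    status = build_agent().tick(blackboard)
    return status, blackboard["answer"]
```

The modularity comes from the fact that the "generate then check" `Sequence` is an ordinary node: it could be dropped into a larger tree without changing its internals.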

  • temporarely 2 years ago

    How is a behavior tree different from a decision tree? Is there a subtle difference I am missing?

    https://en.wikipedia.org/wiki/Decision_tree

    • sdesol 2 years ago
    • rck 2 years ago

      The other reply linked to a good explanation. I would only add that I also wrote a paper on Dendron, and Figure 17 shows how to transform a decision tree node into a behavior tree, so that you can implement any decision tree as a behavior tree:

      https://arxiv.org/abs/2404.07439

      • temporarely 2 years ago

        Thanks! In case others are interested, I also noted that your paper cites this one (below):

        Behavior Trees in Robotics and AI: An Introduction, Michele Colledanchise, Petter Ögren, 2017

        http://arxiv.org/abs/1709.00084

        1.1 A Short History and Motivation of BTs

        BTs were developed in the computer game industry, as a tool to increase modularity in the control structures of Non-Player Characters (NPCs). In this billion-dollar industry, modularity is a key property that enables reuse of code, incremental design of functionality, and efficient testing.

        In games, the control structures of NPCs were often formulated in terms of Finite State Machines (FSMs). However, just as Petri Nets provide an alternative to FSMs that supports design of concurrent systems, BTs provide an alternative view of FSMs that supports design of modular systems. Following the development in the industry, BTs have now also started to receive attention in academia.

        At Carnegie Mellon University, BTs have been used extensively to do robotic manipulation. The fact that modularity is the key reason for using BTs is clear from the following quote from [2]: “The main advantage is that individual behaviors can easily be reused in the context of another higher-level behavior, without needing to specify how they relate to subsequent behaviors”.

        BTs have also been used to enable non-experts to do robot programming of pick and place operations, due to their “modular, adaptable representation of a robotic task” and allowed “end-users to visually create programs with the same amount of complexity and power as traditionally-written programs” [56]. Furthermore, BTs have been proposed as a key component in brain surgery robotics due to their “flexibility, reusability, and simple syntax”.

  • muratsu 2 years ago

    This looks very interesting, thanks for sharing.

  • gnat 2 years ago

    That’s a very well-written tutorial! I wish more software came with something so friendly and informative. Thanks.

  • billmalarky 2 years ago

    Hi Richard, I just reached out to you on LinkedIn. I'd love to get a chance to chat with you about your experience with this. Thank you for sharing; this is fascinating.

sgt101 2 years ago

I'll just draw folks' attention to the long-running AAMAS conference series.

Of course, it's quite academic in nature, but some useful ideas for LLM-driven approaches could be picked up from this resource.

https://aamas2023.soton.ac.uk/

irthomasthomas 2 years ago

"In the ever-evolving landscape of Natural Language Generation (NLG) evaluation, a noteworthy paradigm shift is underway as researchers increasingly turn their attention towards fine-tuning open-source language models (e.g., LLaMA), in lieu of traditional closed-based LLMs like ChatGPT and GPT-4. This transformative shift is propelled by a thorough exploration of key perspectives, including the expenses associated with API calls, the robustness of prompting, and the pivotal consideration of domain adaptability."

This paper was written by an LLM. Probably Claude-3.

abrichr 2 years ago

Not mentioned: learning from demonstration. This is the approach we are taking at https://github.com/OpenAdaptAI/OpenAdapt.

Havoc 2 years ago

Finding it quite difficult to decide which platform to bet on. AutoGen, LangChain, and LangGraph seem to be the main contenders. And then people seem to custom-roll their own too.

RamblingCTO 2 years ago

Perfect timing! I'm just building myself an assistant via Telegram, and for now went with multi-agent collaboration via the supervisor pattern.
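
As a rough illustration of the supervisor pattern mentioned here (not this commenter's code, and not tied to any framework), the sketch below has one supervisor that routes each incoming message to a single worker agent and records the decision. All names (`Supervisor`, `route`, `research_worker`, `calendar_worker`) are hypothetical, the workers are plain functions standing in for LLM-backed agents, and the keyword router is an assumption; a real supervisor would more likely ask a model to pick the worker.

```python
from typing import Callable, Dict, List, Tuple

Worker = Callable[[str], str]


def research_worker(task: str) -> str:
    # Placeholder for an agent that would call an LLM and tools.
    return f"[research] notes on: {task}"


def calendar_worker(task: str) -> str:
    # Placeholder for an agent that would talk to a calendar API.
    return f"[calendar] scheduled: {task}"


def route(message: str) -> str:
    # Hypothetical keyword router, used only to keep the sketch runnable.
    lowered = message.lower()
    if "schedule" in lowered or "meeting" in lowered:
        return "calendar"
    return "research"


class Supervisor:
    """Owns the workers, picks one per message, and keeps a routing log."""

    def __init__(self, workers: Dict[str, Worker], router: Callable[[str], str]):
        self.workers = workers
        self.router = router
        self.log: List[Tuple[str, str]] = []

    def handle(self, message: str) -> str:
        name = self.router(message)
        if name not in self.workers:
            raise KeyError(f"router chose unknown worker: {name}")
        reply = self.workers[name](message)
        self.log.append((name, message))
        return reply


def build_supervisor() -> Supervisor:
    workers = {"research": research_worker, "calendar": calendar_worker}
    return Supervisor(workers, route)
```

In a chat-bot setting, `handle` would be called once per incoming message, with the returned string sent back to the user.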
