Survey Study on AI Agent Architectures (2024)

77 points by jslampe 2 years ago · 120 comments

rck 2 years ago

Not mentioned in the paper, but I have been experimenting with behavior trees for LLM agents, and have had a lot of success: https://richardkelley.io/dendron/tutorial_intro/

temporarely 2 years ago

How is a behavior tree different from a decision tree? Is there a subtle difference I am missing?
https://en.wikipedia.org/wiki/Decision_tree
- sdesol 2 years ago
  
  This talks about the distinction
  https://gamedev.stackexchange.com/questions/51693/difference...
  - temporarely 2 years ago
    
    thanks.
- rck 2 years ago
  
  The other reply linked to a good explanation. I would only add that I also wrote a paper on Dendron, and Figure 17 shows how to transform a decision tree node into a behavior tree, so that you can implement any decision tree as a behavior tree:
  https://arxiv.org/abs/2404.07439
  - temporarely 2 years ago
    
    Thanks! I also noted this citation in your paper; posting it below in case others are interested:
    Behavior Trees in Robotics and AI: An Introduction, Michele Colledanchise, Petter Ögren, 2017
    http://arxiv.org/abs/1709.00084
    1.1 A Short History and Motivation of BTs
    BTs were developed in the computer game industry, as a tool to increase modularity in the control structures of Non-Player Characters (NPCs). In this billion-dollar industry, modularity is a key property that enables reuse of code, incremental design of functionality, and efficient testing.
    In games, the control structures of NPCs were often formulated in terms of Finite State Machines (FSMs). However, just as Petri Nets provide an alternative to FSMs that supports design of concurrent systems, BTs provide an alternative view of FSMs that supports design of modular systems. Following the development in the industry, BTs have now also started to receive attention in academia.
    At Carnegie Mellon University, BTs have been used extensively to do robotic manipulation. The fact that modularity is the key reason for using BTs is clear from the following quote from [2]: “The main advantage is that individual behaviors can easily be reused in the context of another higher-level behavior, without needing to specify how they relate to subsequent behaviors”.
    BTs have also been used to enable non-experts to do robot programming of pick and place operations, due to their “modular, adaptable representation of a robotic task” and allowed “end-users to visually create programs with the same amount of complexity and power as traditionally-written programs” [56]. Furthermore, BTs have been proposed as a key component in brain surgery robotics due to their “flexibility, reusability, and simple syntax”.
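
For readers following the decision tree vs. behavior tree question above, here is a minimal Python sketch of one common way to express a decision node ("if predicate, do A, else do B") as a behavior tree: a Fallback whose first child is a Sequence of the condition and the "true" branch. This is not taken from Dendron or from Figure 17 of the linked paper; every class and function name here (`Status`, `Condition`, `Action`, `Sequence`, `Fallback`, `decision`, `run`) is hypothetical, and the sketch assumes leaf actions never fail (if the "true" action failed, the Fallback would go on to run the "false" branch as well).

```python
from enum import Enum
from typing import Callable


class Status(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


class Condition:
    """Leaf that succeeds when the predicate holds for the current state."""

    def __init__(self, predicate: Callable[[dict], bool]):
        self.predicate = predicate

    def tick(self, state: dict) -> Status:
        return Status.SUCCESS if self.predicate(state) else Status.FAILURE


class Action:
    """Leaf that records its name in state["log"] and always succeeds."""

    def __init__(self, name: str):
        self.name = name

    def tick(self, state: dict) -> Status:
        state["log"].append(self.name)
        return Status.SUCCESS


class Sequence:
    """Ticks children in order; stops and fails at the first failure."""

    def __init__(self, *children):
        self.children = children

    def tick(self, state: dict) -> Status:
        for child in self.children:
            if child.tick(state) is Status.FAILURE:
                return Status.FAILURE
        return Status.SUCCESS


class Fallback:
    """Ticks children in order; stops and succeeds at the first success."""

    def __init__(self, *children):
        self.children = children

    def tick(self, state: dict) -> Status:
        for child in self.children:
            if child.tick(state) is Status.SUCCESS:
                return Status.SUCCESS
        return Status.FAILURE


def decision(predicate, if_true, if_false):
    """Encode 'if predicate then if_true else if_false' as a behavior tree."""
    return Fallback(Sequence(Condition(predicate), if_true), if_false)


def run(tree, **facts) -> list:
    """Tick the tree once and return the names of the actions that ran."""
    state = dict(facts, log=[])
    tree.tick(state)
    return state["log"]


# Hypothetical agent decision: call a tool if needed, otherwise answer directly.
tree = decision(
    lambda s: s["needs_tool"],
    Action("call_tool"),
    Action("answer_directly"),
)
```

Because `decision` returns an ordinary node, either branch can itself be another `decision(...)` call, which is how a whole decision tree could be nested into one behavior tree under the same assumption.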
muratsu 2 years ago

This looks very interesting, thanks for sharing.
gnat 2 years ago

That’s a very well-written tutorial! I wish more software came with something so friendly and informative. Thanks.
- rck 2 years ago
  
  Thank you!
billmalarky 2 years ago

Hi Richard, I just reached out to you on LI. I'd love to get a chance to chat with you about your experience with this. Thank you for sharing; this is fascinating.

sgt101 2 years ago

I'll just draw folks' attention to the long-running AAMAS conference series.

Of course, it's quite academic in nature, but it may be that some useful approaches could be picked up from this resource for LLM-driven approaches.

https://aamas2023.soton.ac.uk/

jrussino 2 years ago

In addition, here are a few other conferences that may be of interest to people in this thread:
- AAAI: https://aaai.org/conference/aaai/
- ICAPS: https://icaps24.icaps-conference.org/
- IJCAI: https://www.ijcai.org/

irthomasthomas 2 years ago

"In the ever-evolving landscape of Natural Language Generation (NLG) evaluation, a noteworthy paradigm shift is underway as researchers increasingly turn their attention towards fine-tuning open-source language models (e.g., LLaMA), in lieu of traditional closed-based LLMs like ChatGPT and GPT-4. This transformative shift is propelled by a thorough exploration of key perspectives, including the expenses associated with API calls, the robustness of prompting, and the pivotal consideration of domain adaptability."

This paper was written by an LLM. Probably Claude-3.

abrichr 2 years ago

Not mentioned: learning from demonstration. This is the approach we are taking at https://github.com/OpenAdaptAI/OpenAdapt.

Havoc 2 years ago

Finding it quite difficult to decide which platform to bet on. AutoGen, LangChain, and LangGraph seem to be the main contenders. And then people seem to roll their own too.

RamblingCTO 2 years ago

Perfect timing! I'm just building myself an assistant via Telegram and for now went with multi-agent collaboration via the supervisor pattern.
