Show HN: Octopus – a directed acyclic graph for app development

84 points by bbsimonbb 2 years ago · 48 comments · 1 min read

Reader

Directed acyclic graphs are muched discussed in comp-sci, but octopus appears to be the first reusable, turnkey, ready-to-wear, off-the-shelf implementation of a DAG for application development, in any language, that I'm aware of.

This is remarkable because DAGs hit a sweet spot in the middle of the three common programming paradigms (OO, event-driven, functional). Let's have a DAG as the top-level structure of our applications. Data-fetching and onChange handlers live in DAG nodes, next to the data they act on. The UI flows out from the DAG with fine-grained reactivity. Our app state is effortlessly consistent, because any outside change (user action, api result) unleashes a graph traversal. Our UI components become much simpler, because they just need to dumbly reflect values in the graph.

I'm putting this up for a second time. Absolutely no-one bit the first time, which can't be right :-)

kokanee 2 years ago

How is React state not a DAG out of the box? What problems does this solve? The pitch appears to be:

> any outside change (user action, api result) unleashes a graph traversal. Our UI components become much simpler, because they just need to dumbly reflect values in the graph.

But that sounds the same as the reasoning behind React's one-way data flow.

bbsimonbbOP 2 years ago

It's very different. Your react components are organized in a tree that mirrors your DOM. If you're using React without an additional state management package, you need to hoist your shared state to a common ancestor, then prop drill it down to where it's needed. Nightmare. I'm advocating a graph that mirrors your subject domain, unconstricted by your DOM. Also, my graph is constructed from below, which feels very intuitive. Nodes specify who they want to read. You make no change to the node being read.

efnx 2 years ago

> Directed acyclic graphs are muched discussed in comp-sci, but octopus appears to be the first reusable, turnkey, ready-to-wear, off-the-shelf implementation of a DAG for application development, in any language, that I'm aware of.

I wrote one for Rust called moongraph: https://crates.io/crates/moongraph https://github.com/schell/moongraph

It powers my configurable renderer and my ECS. I'm sure lots of other folks have written their own and they might not even know it's a DAG.

catapart 2 years ago

Like others are saying: I think a lot of modern frameworks already fit this mold, or are so close that people can't quite tell the difference. But for a near 1-1 example, maybe look at XState? Especially given that XState has a visual, graph view which interactively indicates state in real time.

bbsimonbbOP 2 years ago

That looks interesting, and totally new to me. Thanks.
- pcthrowaway 2 years ago
  
  Yeah, I was thinking of xstate too. I think you do a better job than xstate does in describing how a DAG model is a better fit for organizing the frontend state than the current more common spaghetti paradigm. And xstate isn't solely for representing the frontend as a DAG, but my understanding is that it can do that too.
  So while I think the description of your solution is better than I've seen in xstate's docs, I'd be inclined to go with the xstate's more mature software that gives me the ability to represent front-end state in a DAG, since it has a thriving ecosystem and (I assume) better tooling.
  But I might give Octopus a spin anyway to see what the DX is like
  - bbsimonbbOP 2 years ago
    
    So xstate implements state machines. This is quite a different concept from DAGs, though there may be a DAG underneath. This isn't mentionned anywhere though. The concepts you work with are state machine concepts.
    
    pcthrowaway 2 years ago
    
    Yes but a state machine is reducible to a directed graph, and a DAG is a specialization of a directed graph, which xstate can be used for
    edit: to be clear, I think the additional constraints on the directed graph model introduced by Octopus may be useful by requiring developers to derive state transitions from updates which can only occur upstream, because that state can't flow back in the other direction. Whereas the dependencies can have cycles with xstate.
    So I'm positing you can do everything with xstate that you can with octopus (and more). I'm not necessarily arguing that the ability to do the "and more" part with xstate is not a footgun, just that xstate lets you work with state in a similar way while also having a mature ecosystem of tooling around it. Perhaps the constraints introduced by octopus would result in more robust software in practice; I'm definitely interested to try it for myself and find out

bbor 2 years ago

A) I agree with many of the comments here arguing that this is not necessarily a new technique, and

B) you’re gonna go far in this life, I’d bet! I know nothing about you but this whole README just reeks of a bright young thinker who’s not afraid to question existing paradigms and can follow through on their conceptual vision. Plus it helps that you’re a very dramatic and effective writer. I encourage you not to let any of this feedback get you down!

Will sit down with the repo myself later, just for me it’s not an angle of HCI I’ve examined explicitly, at least for very long.

grayrest 2 years ago

> it’s not an angle of HCI I’ve examined explicitly, at least for very long
If you're looking for additional reading then it's arguable that Reacts matches this viewpoint but any web framework from the past few years using signals (re-popularized by Solid) is explicitly this approach. Most frameworks don't opt for graphical node editors and I've never liked the approach but I've seen a number of those as well.
More generally this falls into dataflow programming which has had a number of published papers starting in the late 80s I believe.
If you're looking for a similar computer sciency approach to UI structure you can look up statecharts which is an older idea that formed the conceptual basis for Ember Router and, in turn, most js framework routers in the past decade.
- bbsimonbbOP 2 years ago
  
  No. Signals work with a weak map of references, not a graph. So in my Pizza example, when pizza changes, totalPrice would recalculate twice: once on a signal from pizza, once on a signal from tip. Explicitly constructing the graph dramatically reduces needless recalculation in bigger systems.
AndrewKemendo 2 years ago

This is exactly my take. I love anything that makes it easier to track state and transitions. Especially so if it’s a framework for generalizable applications.

w10-1 2 years ago

The DAG framing might miss the key points and attract contention.

The "law of Demeter" is a design rule that says to only work with data you directly know about - i.e., friends, but not friends of friends. It make local reasoning tractable and forces any (friendly) use to be surfaced as such, which gives a better way to assess global complexity. (By contrast, the DOM is just, well, global state, whatever the data structure.)

But who or what is "you"? In this case, the node, which is responsible for (local) data liveness; a UI as simply derived nodes from that, with cross-node validation dependencies represented as graph dependencies.

Modeling overall system activity as graph updates does provide nice separation to permit concurrent programming, with graph structure as the arbiter of conflicts, but you'll need a more specific notion of "consistency" that can distinguish the use cases this would work for.

The comparison and question would be with various UI and data component systems, which handle data mapping to the back-end, cross-component aspects like themes or device constraints, user id and security, etc.: how are these higher-level concerns managed with an iterative graph structure?

Serialization is a nice sample concern, but your (attractive) approach exemplifies the issue: to make it work entails tracking some conventions/responsibilities, which are not captured by the DAG framing.

For me, frameworks work when the types basically constrain developers into doing the right thing. Local reasoning and responsibility translate directly to local code, and proof relates directly to compliance, preferably at the language level at build time.

These kind of local-reasoning models shine when the code is federated, e.g., when hosting multiple services and contributions. The whole app should just work if the participants can be validated on load/launch, and any failures should be restricted to the local nodes instead of cascading. So if you target an app requiring some specific collaborative domain (with a promising business model), this proposal might draw more attention, and you can work through issues concretely.

So perhaps target a killer app, build understandable constraints into the code, and demo key features.

bbsimonbbOP 2 years ago

The graph framing is essential. I use typescript graph libraries for the topological sort and for the visualization. I did much less work than either of those two libraries !
- j45 2 years ago
  
  This seems a useful way to imagine framing qualities about an entity to see what you know about them, be it a profile, survey, etc, which may come from many different perspectives.

hobofan 2 years ago

Almost every modern framework is implicitly modeled as a DAG? It just rarely used that nomenclature. Heck with Elm a whole language exists for that principle.

I'm really not sure what's supposed to be different here. As is evident from the repository you obviously know about React and Vue, so maybe try to contrast it to them, or how it fits in in relation to them? To me it looks like you are building a system-in-a-system, where you are rebuilding the primitives that already come with React (or any of its peers).

bbsimonbbOP 2 years ago

As I've just written above, React components form a tree that mirrors the DOM. A graph is a much better fit. Nodes in my DAG ressemble "computed" in Vue, where they watch a value and recalculate when it changes, but... in Vue the watched value needs to exist, ie you're responsible for constructing the graph and ensuring no cycles, and the computed value is only available locally, to be used in the template. In my DAG, you just name the thing you want to watch, octopus checks for cycles. Plus you get the visualization.
The closest thing I'm aware of is Jotai. Atoms in jotai can take dependencies on other atoms, so forming a graph.
Reporting nodes in Octopus are, I believe, completely new. You can create nodes that select their predecessors with a filter function. So the "totalPrice" node takes a dependency on any node with a price property, and recalculates when anything with a price changes.
- vidarh 2 years ago
  
  > React components form a tree that mirrors the DOM. A graph is a much better fit.
  Not all acyclic graphs are trees, but all trees are acyclic graphs. As such this part of your description is confusing.
  Your last paragraph does a much better job at explaining how it differs.
  - bbsimonbbOP 2 years ago
    
    I put a concrete example in the sample app. "pizza" depends on "size" and "base". So a DAG, as you point out, is less constraining than a tree. When you shoehorn your state and orchestration into react components, not only do you have to fit them into a tree, it's fundamentally the wrong tree.
    
    vidarh 2 years ago
    
    I'd suggest you emphasize that near the top of your README, maybe with a cleaned-up version of the generated graph and some annotation of the data flow. To put it a bit pointedly, "nobody" will care about the data structure until they see that it means they don't have to manually thread data down through the right branches of a tree. That's your big selling point.
    
    bbsimonbbOP 2 years ago
    
    Good suggestion.

wavemode 2 years ago

I remember working at a company where we had a backend web framework that was essentially a DAG. You could use it to create API's where the caller simply specifies what they want, and the server figures out, through graph theory, what series of processing steps and API calls to other services should be performed (some of which can have dependencies on the results of other API calls and/or processing steps), and the optimal way to parallelize them.

It struck me as being both a very intriguing framework, and also exceptionally over-engineered for the kind of problem it was solving (since an async-await architecture gets you all the same benefits).

sa46 2 years ago

I worked with a framework like that inside Google in Java. Was okay. You’d write a bunch of methods and the framework would wire up annotated method parameters and return types. The main benefit was improved tail latency since the framework would traverse the graph and run the methods with optimal concurrency.
Same downsides as runtime dependency injection frameworks like Guice—the framework would explode if it couldn’t connect the graph.
Some other upsides: you could decorate methods with all sorts of distributed goodness like retries and hedging. The approach also enable reusable methods that you snap into your server.
- aleksiy123 2 years ago
  
  Producer graphs?
  Maybe just me but I find them completely unreadable and painful.
  I don't really think there anything inherently better about having the framework connect things through inputs and outputs or developer explicitly specifying the connections through function calls.
  In both cases the executor can decide how to execute the graph.
  - bbsimonbbOP 2 years ago
    
    Are you talking about this? https://patents.google.com/patent/US7865872B2/en
    
    aleksiy123 2 years ago
    
    No, unfortunately the framework itself is not public.
- bbsimonbbOP 2 years ago
  
  That sounds very cool indeed.
j45 2 years ago

My guess as to why it was over engineered was to be reusable, or at the start maybe an unknown complexity was foreseen requiring unreasonableness flexibility.
Like many, I early in my career once rewrote a partial implementation of windows workflow foundation to work out of MySQL with db functions. It was .. beautiful. :) And the problem learned to behave and not need it again. Still running years later and reasonably refactorable.

archibaldJ 2 years ago

I don’t really understand what values such design pattern can bring worth (that will outweigh its own complexities)

my experience working with DAGs programmatically (ie not as an abstraction (like in React) but actually handling the edges&nodes of a graph-based abstraction in code) is that it looks nice theoretically but in practice top-down (conceptual) approach like this often tends to over-complicate things

would love to see some real-world examples where an actual dev team, etc, find values in graph-theortical-based abstractions like this

bbsimonbbOP 2 years ago

Have a look at the node code. It's biblically simple. I would say it's much simpler than any comparable solution. You can forget about reducers, thunks, computed, even useState, useContext, useEffect and props are rare. I've stuck with graph terminology: nodes, predecessors etc, but you don't need to grok that stuff to use this. Once you've understood that a node is a value, some methods and a reup() function, that's it.
zem 2 years ago

state machines are a common example of an explicit graph-based abstraction that people build their code around.
- bbsimonbbOP 2 years ago
  
  I looked at and rejected state machines, because I didn't like/understand the need for the formalism of defining every possible state.
  - davidkpiano 2 years ago
    
    You don't need to define every possible state. A state machine can serve as a higher-level structure of some of the states (statuses, modes) of parts of your app, for ease of structuring logic. But you can also create a single-state state machine and represent non-finite data as "external state".
- archibaldJ 2 years ago
  
  yeah I heard about lib like XState; just not sure what values it actually provides for a team to decide to use it in a real-world app
bbsimonbbOP 2 years ago

I'm not doing this for fun. Especially since I'm fundamentally lazy (and more of a practitioner than a comp-sci person). I'm looking at custom configurators for complex made-to-measure products with dozens of options, with many interdependencies between options.

mdaniel 2 years ago

well, partially because there's no license in your repo so unless you're selling it to someone ... what outcome are you expecting to happen?

bbor 2 years ago

Great message (“you forgot a license!”), not a very necessary tone IMO! Not to be the hall monitor…

kitd 2 years ago

I may be misunderstanding what this is, but back in the J2EE days ISTR Struts having something very similar to this, ie a flow from page to page declared apart from the actual page logic itself. Edit: and all done in XML, so obviously doesn't count!

Cool project though!

082349872349872 2 years ago

it's tempting to write a sh(1) based webapp that uses make(1) under the hood to provide similar functionality...

speps 2 years ago

Isn't this what Dash[1] does? You can see a very similar graph in the built-in debugging tools.

[1] https://github.com/plotly/dash

reactordev 2 years ago

The browser is already a graph. Or am I missing something?

kokanee 2 years ago

The special property of a DAG is that there are no cyclic paths between nodes. In the DOM, every node has a path to every other node, so you can create infinite loops (e.g. element.parentNode.nextElementSibling.previousElementSibling.firstChild).
That's the document tree - when it comes to application state, which is just data in memory, anything is possible. These state libraries tend to treat in-memory application state as the "real" state, and try to treat the document tree as a side effect. This is one reason that front end unit tests tend to miss bugs - application state being what you expected doesn't mean the user sees what they expected.

20after4 2 years ago

LinuxCNC is an old and proven application based on a DAG as the core architectural paradigm.

machiaweliczny 2 years ago

How is that better than MobX?

bbsimonbbOP 2 years ago

I like and use MobX. I "come from" Vue, and I like mutability (though I concede its dangers). For my needs, the limitation was reactions. In mobx, reactions shouldn't update other observables. In Octopus, chaining is the whole point. Nodes can be stacked to n depth. There is no distinction between state and computed. Computed just generates more observable state.
Then, in Octopus you also get reporting nodes and visualisation, which, once tasted, no return.

atlassubbed 2 years ago

Ah interesting, a trip down memory lane! I made something very similar 5 years ago: https://github.com/atlassubbed/atlas-relax. It was an attempt to build a "stronger" version of React (using DAGs) in under 2.5kb, and completely decouple the diffing/rendering engine from the DOM. So think "nodes that can return JSX templates in their compute functions", but the JSX templates don't need to represent the DOM. They could represent anything, like Terraform config since JSX is just a syntax for javascript objects, similar to the hyperscript function `h`. To create React with my framework, I used a plugin (one such plugin https://github.com/atlassubbed/atlas-mini-dom) which registers listeners on nodes' create/update/destroy lifecycle methods. The simple idea is when a DAG node is added and corresponds to a DOM node, add the corresponding DOM element to the DOM. When removed, remove the element, etc. The cool thing was that the "build your own react" plugin is like 20 lines of code since the base diffing/rendering is abstracted away by the underlying DAG engine, so the interface ends up being similar to an AST crawler interface, where you're just implementing listeners as you encounter new/changed tags.

  const { diff } = require("atlas-relax");
  const DOMRenderer = require("atlas-mini-dom");
  const App = () => (
    <div>
      Bonsly evolves into Sudowoodo after learning mimic.
    </div>
  )
  // create a DOMRenderer plugin
  const rootEl = document.getElementById("root");
  const renderingPlugin = new DOMRenderer(rootEl);
  // mount <App/> against the "null" DAG and render it to the DOM.
  diff(<App/>, null, renderingPlugin);

The cool thing is you could diff two different DAGs against each other and listen to the delta, like `diff(<App1/>, <App2/>, consoleLogPlugin)`. The base library could be used to generate application frameworks as long as your application framework can be thought of as a DAG operating on data. React is an example of such a framework, but so is something like Airflow, so you could write a plugin that lets you build your own kind of Airflow. That was the motivation behind my DAG abstraction -- to make it easy to create DAG frameworks for frontend and backend. Let the base library do all the hard reconciliation work, and you can build application frameworks on top.

Anyway that was all mostly an exercise. I didn't end up using my framework for anything more than a state management solution for React. It handles global data perfectly, although these days React context or hook management is usually enough.

bbsimonbbOP 2 years ago

That looks really remarkable.

Settings

Show HN: Octopus – a directed acyclic graph for app development

Keyboard Shortcuts