Google proposes Open Knowledge Format based on Markdown

cloud.google.com

86 points by itherseed 10 days ago · 18 comments

james_ross 3 hours ago

I'm a massive fan of capturing domain knowledge in a plain text format that both humans and AI can use, one of my favourites being concept maps stored in human readable form. I did a talk on this exact topic long before we all got AI pilled, this version from 2015 for example: https://www.infoq.com/presentations/concept-map/

The web app mentioned in that talk that I built to help with that no longer exists, but I recently built a new desktop app (Apple Silicon Mac only so far, sorry) for this exact purpose: https://thinkingtools.software/concepticon

sadschnitzel 10 days ago

I love the simplicity of this OKF spec, but I'm not sure everything can be represented well in "just Markdown".

I've recently become intrigued by representing concepts so that AI can co-contribute effectively and token-efficiently (typically: find a good way to represent something as semi-structured sequential text), but also without compromising the human lens on the representation. We shouldn't accept a downgrade of the human knowledge representation experience just to make it AI-accessible. That's especially true if traditionally non-dev personas need to contribute, and they almost certainly find "weird text format + git" much worse than their current authoring/viz tools.

I'm excited to see how standards for semantically representing different kinds of knowledge emerge in the next few years!

Successful examples I can think of to mix in are open standards like DBML for schemas/E-R, LikeC4 for architecture, and diagrams-as-code ideas like Mermaid, all of which LLMs seem to "get" well (or can be told about from a short EBNF prompt). Crucially, they also have pretty human viz forms, and you can just ```code block``` inline them in Markdown next to natural language. And you can get LLMs to help you author the syntax.
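
As a rough sketch of what "inline them in Markdown" buys a tool or agent: tagged fenced blocks can be pulled out of a document by language, so each one can go to its own renderer or parser. Everything here is illustrative. The sample document, the function name `extract_blocks`, and the regex are assumptions, not part of OKF or of any of the tools named above, and a real Markdown parser would handle edge cases (untagged fences, nested fences, tildes) that this regex does not.

```python
import re

# Built indirectly so this example can itself sit inside a fenced block.
TICKS = "`" * 3

# Opening fence with a language tag, lazy body, closing fence on its own line.
FENCE = re.compile(
    rf"^{TICKS}(\w+)[ \t]*\n(.*?)^{TICKS}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_blocks(markdown: str) -> list[tuple[str, str]]:
    """Return (language, body) pairs for every tagged fenced block."""
    return [(m.group(1), m.group(2)) for m in FENCE.finditer(markdown)]


# Hypothetical document mixing prose, a Mermaid diagram, and a DBML table.
SAMPLE = "\n".join([
    "# Orders service",
    "",
    "Orders are written by the API and read by billing.",
    "",
    TICKS + "mermaid",
    "graph LR",
    "  API --> Orders",
    "  Orders --> Billing",
    TICKS,
    "",
    TICKS + "dbml",
    "Table orders {",
    "  id int [pk]",
    "}",
    TICKS,
])

if __name__ == "__main__":
    for lang, body in extract_blocks(SAMPLE):
        print(lang, len(body.splitlines()), "lines")
```

On the sample this finds two blocks, tagged `mermaid` and `dbml`, while the surrounding prose stays untouched for human readers.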

Harder to crack is stuff where there's implicit human meaning in spatial layout and colour, like in complex spreadsheets or Miro. I haven't found good alternatives for those yet.

My own attempt in my (data engineering) domain is https://equalexperts.github.io/satsuma-lang/ for AI-and-human source-to-target mappings and transforms. A succinct structured text representation that allows natural language, but also nice viz and LSP/grammar tooling that helps agents not to have to slice and dice big docs token-inefficiently to reason about things like lineage or completeness or undefined sources.

  • xamde 10 days ago

    OKF seems OK, but bound to Markdown. A Markdown document can be turned into an OKF document by adding a 'type' to the frontmatter YAML.

    What about a knowledge graph language, which can be stated in Markdown prose, in Markdown code blocks, but also everywhere a text field is waiting for you? In the minimalistic language https://ddot.it you can link outside the Markdown world, to files, URLs or even just labels. Like OKF it's just a format.

    Disclaimer: I wrote that (short) spec.

    • sadschnitzel 10 days ago

      I love how unobtrusive that is, great compromise between readability and expressiveness!

      • xamde 9 days ago

        Thanks, it is based on an unreleased, complex 50-page syntax spec with over 40 different kinds of arrows. Luckily, I simplified BEFORE release :-)

  • UltraSane 10 days ago

    You can't represent knowledge well without a graph format showing labeled relationships between entities.

  • jarym 10 days ago

    Markdown is the de facto format for LLMs and humans to interoperate. And I agree not everything can be represented well, but that’s missing the point - it seems to win because Markdown is the lowest common denominator for both humans and AI models.

mrkiouak 10 days ago

I love revisiting RDF/OWL Semantic Web formats every 10 years.

One of these years will be the one!

https://en.wikipedia.org/wiki/Semantic_Web

verdverm 10 days ago

A bunch of broken (slopped?) links in the original post, here is the repo

https://github.com/GoogleCloudPlatform/knowledge-catalog

Spec: https://github.com/GoogleCloudPlatform/knowledge-catalog/blo...

bsimpson 10 days ago

Is the flavor of Markdown (e.g. CommonMark) specified? Didn't see anything about it by perusing the first few pages, but that feels important for a spec.

port11 9 days ago

Google has announced… Markdown with YAML front-matter, ladies and gentlemen. Please applaud. 15kb of spec for this!

(I’d be less sardonic if we could all stop using oops-you-missed-an-indent-YAML.)
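
For concreteness, here is a minimal sketch of the "Markdown with YAML front-matter" shape, based only on the remark elsewhere in this thread that adding a 'type' to the front matter turns a Markdown document into an OKF document. The function name `add_type` and the value `concept` are made-up placeholders, the actual spec may require more than this, and the code does plain string handling rather than real YAML parsing.

```python
def add_type(markdown: str, doc_type: str) -> str:
    """Insert a `type` key into YAML front matter, creating the block if absent."""
    lines = markdown.split("\n")
    if lines and lines[0].strip() == "---":
        # Existing front matter: locate the closing delimiter.
        end = next(
            (i for i in range(1, len(lines)) if lines[i].strip() == "---"),
            None,
        )
        if end is None:
            raise ValueError("front matter is opened but never closed")
        header = lines[1:end]
        if any(line.startswith("type:") for line in header):
            return markdown  # already has a top-level type; leave untouched
        return "\n".join(["---", f"type: {doc_type}", *header, *lines[end:]])
    # No front matter yet: prepend a new block.
    return "\n".join(["---", f"type: {doc_type}", "---", "", markdown])


if __name__ == "__main__":
    # "concept" is a placeholder value, not taken from the spec.
    print(add_type("# Orders\n\nOrders are billed monthly.", "concept"))
```

The key is written at column zero on purpose, which sidesteps the missed-indent problem for this one field at least.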

yladiz 10 days ago

Having looked at many PDFs that needed to be “translated” to Markdown, it feels like a strange choice - I know it’s primarily to make things easily accessible to AI, but if we’re going to train models anyway, why not train them on something better? Markdown is quite limited, and can’t render something like a nested table for example, and if the point of having “open knowledge” is for AI, why do we need to use a format that won’t really be read by humans?

bzmrgonz 6 days ago

The way I understand it, it is essentially scaffolding for us humans, since we can't see past 3D. Hopefully we can do a decent job so that the agents can take our Markdown and build the graph infra in memory or store it in Neo4j.
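
A minimal sketch of the "build the graph in memory" step, assuming the relationships live in ordinary Markdown inline links. Treating the link text as the edge label is my assumption for illustration, not something the OKF spec is known to define; the file names and `build_graph` are hypothetical, and nothing here talks to Neo4j.

```python
import re

# Inline Markdown link: [label](target). Image links would match too.
LINK = re.compile(r"\[([^\]]+)\]\(([^)\s]+)\)")


def build_graph(docs: dict[str, str]) -> dict[str, list[tuple[str, str]]]:
    """Map each document name to the (label, target) edges found in its body."""
    graph: dict[str, list[tuple[str, str]]] = {name: [] for name in docs}
    for name, body in docs.items():
        for label, target in LINK.findall(body):
            graph[name].append((label, target))
    return graph


# Hypothetical bundle of three Markdown files.
DOCS = {
    "orders.md": "Orders are [billed by](billing.md) the billing service.",
    "billing.md": "Billing [reads from](orders.md) and [notifies](email.md).",
    "email.md": "No outgoing links.",
}

if __name__ == "__main__":
    for source, edges in build_graph(DOCS).items():
        for label, target in edges:
            print(f"{source} -[{label}]-> {target}")
```

The resulting adjacency list keeps every document as a node, including ones with no outgoing links, so it could be handed to a graph store later without losing isolated entries.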

sermakarevich 10 days ago

Love the approach. I am a big fan of hierarchical knowledge organization. I think that almost all current Claude abstractions for knowledge management are broken. It becomes visible when you start running many coders concurrently or need to create 1K+ skills, e.g.: https://news.ycombinator.com/item?id=48407998

matthewbarras 10 days ago

Check out barrasindustries.com/okfind/

Just an idea for an OKF bundle registry
