We auto-convert HTML to Markdown for AI agents

blog.cloudflare.com

12 points by emot 2 months ago · 3 comments

TIPSIO 2 months ago

This is cool, and I'm glad Cloudflare is offering AI options to everyone (tools to block agents, tools to better enable them).

This is probably fast, but FWIW I would bet that a simple string replace of HTML tags with '' would yield mostly the same result. Structured content (like Markdown) isn't really even needed for an LLM. Make it messy and super fast and don't accidentally lose anything; it's an LLM.

If compression were really the goal, you could take it further and probably remove all words like "the" and "and", punctuation, maybe even spaces.
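
For illustration, the string-replace idea in the comment above could look roughly like this Python sketch. The names strip_tags, drop_stopwords, and STOPWORDS are hypothetical, and this is not Cloudflare's conversion; it only assumes the input is a reasonably well-formed HTML string. Note that replacing tags with '' glues adjacent blocks together, which is one way a naive approach can lose structure.

```python
import re
from html import unescape

# Remove <script>/<style> blocks entirely, since their bodies are not page text.
SCRIPT_STYLE_RE = re.compile(r"<(script|style)\b.*?</\1>", re.IGNORECASE | re.DOTALL)
# Naive tag matcher: anything between '<' and the next '>'.
TAG_RE = re.compile(r"<[^>]+>")

# Tiny illustrative stopword list (an assumption, not a standard set).
STOPWORDS = {"the", "and", "a", "an", "of"}


def strip_tags(html: str) -> str:
    """Replace every tag with '' and collapse whitespace."""
    without_blocks = SCRIPT_STYLE_RE.sub("", html)
    text = TAG_RE.sub("", without_blocks)
    return " ".join(unescape(text).split())


def drop_stopwords(text: str) -> str:
    """Drop common filler words to shrink the text further."""
    return " ".join(w for w in text.split() if w.lower() not in STOPWORDS)


sample = "<h1>Title</h1><p>The cat &amp; the dog.</p><script>var x = 1;</script>"
print(strip_tags(sample))                    # heading and paragraph run together
print(drop_stopwords("The cat & the dog."))
```

Substituting a space instead of '' would keep word boundaries, at the cost of a few extra characters.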

hedora 2 months ago

Why would agents use this?

HTML -> Markdown software is readily available, and some percentage of the internet is hostile towards agents.

Also, isn't the conversion lossy? I imagine an agent would rather have access to the HTML and iteratively try strategies until it gets good extraction quality. If the conversion happens automatically inside the network, you're stuck with second-class content extraction some percentage of the time.

Has anyone built libraries to do that?
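
As a rough sketch of the "iteratively try strategies" idea in the comment above, the Python below tries progressively broader extraction targets and stops at the first one that yields enough text. Everything here is an assumption for illustration: the class TextInTag, the functions extract and best_extraction, the strategy order (article, then main, then the whole page), and the word-count threshold standing in for "good extraction quality". It uses only the standard library's html.parser.

```python
from html.parser import HTMLParser


class TextInTag(HTMLParser):
    """Collect text inside `target` elements, or all text if target is None."""

    def __init__(self, target=None):
        super().__init__()
        self.target = target
        self.depth = 0   # nesting depth inside the target element
        self.skip = 0    # nesting depth inside <script>/<style>
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1
        if tag == self.target:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1
        if tag == self.target and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        inside = self.target is None or self.depth > 0
        if inside and not self.skip and data.strip():
            self.chunks.append(data.strip())


def extract(html, target):
    parser = TextInTag(target)
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def best_extraction(html, min_words=5):
    """Try narrow strategies first; fall back to the whole page."""
    for target in ("article", "main", None):
        text = extract(html, target)
        if len(text.split()) >= min_words:
            return target, text
    return None, extract(html, None)


page = ("<html><body><nav>Home About</nav>"
        "<main><p>One two three four five six.</p></main>"
        "<script>x=1</script></body></html>")
print(best_extraction(page))
```

This fallback is only possible when the agent still has the raw HTML, which is the commenter's concern about conversion happening inside the network.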

  • Suppafly 2 months ago

    I only skimmed the linked article, but it seems like it reduces the number of tokens the agents would have to parse quite a bit.
