IETF setting standards for AI preferences

37 points by Mithriil 7 months ago

I appreciate the effort, but without any legal backing these signals are just going to be ignored like robots.txt. Hell, even if they were legally binding, they'd probably still be ignored if scrapers thought they could obfuscate the paper trail enough to get away with it.

Tomte - 7 months ago

This is a way to express your reservation, pursuant to Article 4(3) of the EU's DSM Directive.
The legal machinery is already in place, we now need precisely that: a standard for machine-readable reservations.

felixfbecker - 7 months ago

OpenAI and Anthropic respect robots.txt afaik
- Ukv - 7 months ago
  
  To add anecdotally based on logging on my portfolio site, all major US players (OpenAI, Google, Anthropic, Meta, CommonCrawl) appeared to respect robots.txt as they claim to do (can't say the same of Alibaba).
  Sometimes I do still get requests with their user agents, but generally from implausible IPs (residential IPs, or "Google-Extended" from an AWS range, or the same IP claiming to be multiple different bots, ...) and never from the bots' actual published IP addresses (which I did see before adding robots.txt). That makes me believe it's some third party either intentionally trolling or using the larger players as cover for their own bots.
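A minimal sketch of the kind of log check described above, assuming you have each crawler's published IP ranges on hand. The bot name `ExampleBot`, the ranges (reserved documentation networks), and the `classify` helper are all hypothetical placeholders, not any operator's real values:

```python
import ipaddress

# Hypothetical data: documentation-only networks standing in for the
# ranges a crawler operator publishes. Replace with the real lists.
PUBLISHED_RANGES = {
    "ExampleBot": ["192.0.2.0/24", "2001:db8::/32"],
}


def claimed_bot(user_agent):
    """Return the known bot name that appears in the user agent, or None."""
    lowered = user_agent.lower()
    for name in PUBLISHED_RANGES:
        if name.lower() in lowered:
            return name
    return None


def classify(user_agent, remote_ip):
    """Label a request 'unclaimed', 'verified', or 'spoofed'."""
    bot = claimed_bot(user_agent)
    if bot is None:
        return "unclaimed"
    addr = ipaddress.ip_address(remote_ip)
    for cidr in PUBLISHED_RANGES[bot]:
        net = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if addr.version == net.version and addr in net:
            return "verified"
    # Claims to be the bot but comes from outside its published ranges.
    return "spoofed"


if __name__ == "__main__":
    print(classify("Mozilla/5.0 (compatible; ExampleBot/1.0)", "192.0.2.10"))
    print(classify("ExampleBot/1.0", "198.51.100.7"))
```

A "spoofed" result only says the request did not come from the listed ranges; it does not say who actually sent it.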
  - dharmab - 7 months ago
    
    Using residential IPs is standard operating procedure for companies that rely on collecting information via web scraping. You can rent residential egress IPs. Sometimes this is done in a (kind of) legit way by companies that actually subscribe to residential ISPs. Mostly it's done by malware hijacking consumer devices.
- VladVladikoff - 7 months ago
  
  Noooooope! They completely ignore crawl frequency in my experience. Bing too. Only Google seems to obey it.
- mog_dev - 7 months ago
  
  They don't.
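For reference, here is a rough sketch of what a compliant crawler would read from a robots.txt file, using Python's standard `urllib.robotparser`. The file contents are an illustrative assumption, and `Crawl-delay` is a non-standard directive, so this shows what the file asks for, not what any crawler actually does:

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: block one AI crawler entirely and ask
# everyone else to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Crawl-delay: 10
Allow: /
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

if __name__ == "__main__":
    url = "https://example.com/post"
    print(rules.can_fetch("GPTBot", url))        # False: disallowed everywhere
    print(rules.can_fetch("SomeOtherBot", url))  # True: falls under the * group
    print(rules.crawl_delay("SomeOtherBot"))     # 10
```

Nothing here enforces the rules; honoring them is entirely up to the client, which is the point of contention in this subthread.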

ddtaylor - 7 months ago

Asking people to read your content with a specific purpose or intent has traditionally not been very successful or useful. I understand people are frustrated with the knowledge transfer, but if the goal was to increase the reach of your ideas, it's being accomplished.

AI being involved changes the scale and scope, but it doesn't change the fundamentals. China and India were already imitating and cloning everything for their markets and for ours.

We have had virtually zero success enforcing patents and copyright, and barely any enforcing trademarks, which is the lowest bar. There may not be any enforcement framework that I would want to see that would also be effective, but I am open to ideas that don't involve government overreach etc.

TZubiri - 7 months ago

The IETF should be concerned with user concerns. If they make standards about AI preferences, they should be about memory and language and stuff like that, not meddling with legal matters that are outside of their scope and expertise.

adrian_mrd - 7 months ago

Does anyone know whether there are any licences or licence derivatives, like the various flavors of Creative Commons, that currently restrict usage by AI LLMs?

elitepleb - 7 months ago

DNT: AI