How is GPTBot allowed or disallowed?

medium.com

3 points by flowinghorse 2 years ago · 5 comments

flowinghorseOP 2 years ago

GPTBot, the official crawler of OpenAI, was announced nearly 2 months ago. It crawls the web to gather information for improving OpenAI's models, e.g. GPT-4. We are wondering how the Internet has reacted. Is the bot being accepted or rejected?
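
For context, sites usually express "accepted or rejected" through a robots.txt rule that names the crawler's user agent. The following is a minimal sketch of how one might check that, assuming Python's standard `urllib.robotparser`; the robots.txt body and the `allows_agent` helper are hypothetical and only for illustration, not taken from the linked article.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body, for illustration only:
# it blocks GPTBot everywhere and allows every other crawler.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allows_agent(robots_txt: str, user_agent: str, url: str = "/") -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(allows_agent(SAMPLE_ROBOTS_TXT, "GPTBot"))        # False
    print(allows_agent(SAMPLE_ROBOTS_TXT, "SomeOtherBot"))  # True
```

Running the same check over the robots.txt files of many sites would give a rough count of how many disallow GPTBot. The result only reflects what a site has declared, not whether any crawler actually honors it.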

  • LinuxBender 2 years ago

    I suspect not many website operators/developers are aware this exists. Compliance with robots.txt is unenforceable, so a rule there would only show intent to OpenAI. This would not be useful for other LLMs, as Google, Bing and other search engines already have decades of ingested data to feed their LLMs.

    In my poor armchair quarterback opinion, if people wish for something not to be crawled, then they must make a best effort to ensure only humans are accessing it, with strong authentication, best-effort bot detection, and binding legal contracts that impose punitive actions for using data in ways that were not approved. They must then actually follow through with legal action for breach of contract.

    • flowinghorseOP 2 years ago

      The number of Disallow rules we found in the robots.txt files actually surprised us.

      Companies like OpenAI also have to do a lot of things to ensure compliance with regulations.

      • LinuxBender 2 years ago

        "to ensure compliance with regulations"

        Did legislation pass requiring people and their bots to obey robots.txt? If so I totally missed it. That would be big news if so.

  • socrateslee 2 years ago

    ChatGPT only swallows data or content but brings no traffic back to content creators. Maybe this is how Google differs from ChatGPT. But what will become of Google if experimental generative AI answers replace all the SERPs? A closed web that no search engine and no AI can actually enter?
