We think you should block ChatGPT and other AI bots from crawling your website
OpenAI has launched its web crawler, GPTBot, along with documentation on how to block it. You can do this by adding these two lines to your robots.txt:
User-agent: GPTBot
Disallow: /
Note that you can do the same for Common Crawl's bot (CCBot) and Google's AI crawler (Google-Extended) by adding this:
User-agent: CCBot
Disallow: /
User-agent: Google-Extended
Disallow: /
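These groups can all sit in the same robots.txt file, and each one applies only to the crawler it names, so any rules you already serve for other bots are unaffected. Here's a minimal sketch of a combined file (the User-agent: * group and the /admin/ path are just placeholders for whatever you already have):

# Existing rules for all other crawlers (placeholder example)
User-agent: *
Disallow: /admin/

# Block OpenAI's GPTBot
User-agent: GPTBot
Disallow: /

# Block Common Crawl's CCBot
User-agent: CCBot
Disallow: /

# Block Google's AI training crawler
User-agent: Google-Extended
Disallow: /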
We’ve already blocked these crawlers for the Tactic Lab website, and we think this is best for most of our clients. If GPTBot or CCBot crawls your website and your content is used to further train ChatGPT, your content could appear in answers with no guarantee of attribution, accuracy of framing, traffic to your website, or any other benefit. There’s been a lot of discussion about this in the last few days; if you want a very jargon-free take on the reasons to block, see this Insider article.
There are some special cases where it might make sense to allow access. A website aiming to provide important public information (e.g. a charitable advocacy cause) might want that information to appear in ChatGPT’s (and similar tools’) answers even if it isn’t attributed, since the benefit of the information being available in answers will outweigh any benefit to the org itself. However, there is still a loss of control: your content appearing in a remixed answer, stripped of its original context, has the potential to mislead the reader.
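If you’re in that camp, robots.txt also lets you open up only part of a site rather than all of it. As a rough sketch for crawlers that honour the Allow directive (GPTBot’s documentation describes using Allow alongside Disallow; the /public-info/ path here is purely hypothetical):

User-agent: GPTBot
# hypothetical section you want available in AI answers
Allow: /public-info/
# block everything else
Disallow: /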
Either way, we recommend making a decision as soon as possible.