Generative AI & Web Scraping
The advent of generative AI services has turned web scraping into a very real privacy issue, but few have grasped the underlying SEO threat to web domain owners.
ChatGPT and its derivatives have changed the world of SEO forever. We can argue all day long about the quality of ChatGPT-derived content, but what isn’t up for debate is that it can be optimised for SEO. The search engine algorithms don’t care whether the content was written by a human or not. They just care whether it’s ‘good’.
So how can this affect you? What's happening now is a new twist on the older SEO skyscraper strategy: find people with proven traffic and footfall, and then build a bigger tower. You can think of this as the McDonald's / Burger King strategy. If people are already going to a hamburger destination, it pays dividends to have your competing brand right across the street. Your competitor has done all the hard work to establish footfall, and you can then hijack some of the traffic.
Find a competitor with strong SEO rankings for keywords that you covet. Analyse their backlinks and copy, and then build a better content model. The skyscraper model is well known, but it needed skilled copywriters and a knowledgeable SEO team or agency to apply it. It also takes time. A lot of time. Of course, content alone is not enough, and you have to have backlinks and authority, but great content will generate those.
Today, it’s possible to do some of the exact same things with ChatGPT prompts and a small army of dedicated bots, and to build an entire library of optimised content sucked from competitive sites within minutes. The IP isn’t stolen as such. It has effectively been harvested, rewritten, and optimised for SEO, all automatically.
Most sites don't bother to stop web crawlers. Even when they do, many crawlers disguise themselves as legitimate services to bypass the protection, then distribute their requests across normal traffic so that they are hard to spot.
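One common way to catch a crawler that is merely pretending to be a legitimate service is a forward-confirmed reverse DNS check: look up the hostname for the visiting IP address, check that it belongs to the service's domain, and then confirm that the hostname resolves back to the same IP. The Python sketch below is only an illustration of that idea. The function name `verify_crawler_ip`, its parameters, and the injectable lookup functions are hypothetical, and the allowed domain suffixes have to come from whatever each crawler operator publishes.

```python
import socket
from typing import Callable, Iterable, List, Optional


def verify_crawler_ip(
    ip: str,
    allowed_suffixes: Iterable[str],
    reverse_lookup: Optional[Callable[[str], str]] = None,
    forward_lookup: Optional[Callable[[str], List[str]]] = None,
) -> bool:
    """Forward-confirmed reverse DNS check (hypothetical helper).

    allowed_suffixes should start with a dot, e.g. ".googlebot.com",
    so that "evilgooglebot.com" does not match.
    """
    # Default to real DNS lookups; tests can inject fakes instead.
    if reverse_lookup is None:
        reverse_lookup = lambda addr: socket.gethostbyaddr(addr)[0]
    if forward_lookup is None:
        # gethostbyname_ex only returns IPv4 addresses.
        forward_lookup = lambda host: socket.gethostbyname_ex(host)[2]

    try:
        host = reverse_lookup(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    # Step 1: the hostname must belong to the claimed service's domain.
    if not any(host.endswith(suffix) for suffix in allowed_suffixes):
        return False

    # Step 2: the hostname must resolve back to the original IP.
    try:
        return ip in forward_lookup(host)
    except OSError:
        return False
```

A request whose User-Agent claims to be a well-known search crawler, but whose IP fails this check, is a reasonable candidate for closer inspection. The check says nothing about bots that make no such claim and simply blend into ordinary traffic.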
Many language models draw on datasets such as Common Crawl (https://commoncrawl.org/), a truly vast open repository of data crawled from years of web content.
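Common Crawl's crawler identifies itself with the user agent CCBot, so a site can at least ask it to stay away through robots.txt. The Python sketch below uses the standard library's `urllib.robotparser` to check how a given robots.txt would be read. The sample rules, the `is_allowed` helper, and the example.com URL are illustrative assumptions. Bear in mind that robots.txt is only a request: well-behaved crawlers honour it, while disguised bots simply ignore it.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt: ask CCBot to stay out, allow everyone else.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/blog/post"
    print("CCBot allowed:", is_allowed(ROBOTS_TXT, "CCBot", page))
    print("Other bot allowed:", is_allowed(ROBOTS_TXT, "SomeOtherBot", page))
```

Running the same check against your own live robots.txt is a quick way to confirm that the rules say what you think they say before relying on them.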
How can you protect your domains? VerifiedVisitors allows you to see exactly which bots are crawling your web sites, and gives you the power to take control over them. You decide what you want accessing your web content, and we do the rest. Automatically.