Fall semester for scrapers

As the days grow shorter, let's make your scraping and automation projects better. Check out our new features, tools, updates, and content to keep your projects in top condition.


Top Google pages for your LLMs

Google Search Scraper + Website Content Crawler = RAG Web Browser. It's a tool to provide your RAG pipelines with up-to-date information from the web.

OpenAI Assistants couldn’t browse the web like ChatGPT. Now they can. Our new RAG Web Browser lets your Assistants fetch live data from the web, expanding their knowledge in real time. This tool integrates smoothly with RAG pipelines, helping you build more capable AI systems.
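If you want to see what a request might look like, here is a minimal sketch that only builds the URL for a search. The endpoint and the `query` and `maxResults` parameter names are assumptions on our part, as is the helper name `build_search_url`; please check the Actor's README for the current interface before relying on them.

```python
from urllib.parse import urlencode

# Assumption: Standby endpoint of the RAG Web Browser Actor.
# Confirm the real URL in the Actor's README.
ASSUMED_ENDPOINT = "https://rag-web-browser.apify.actor/search"


def build_search_url(query: str, token: str, max_results: int = 3) -> str:
    """Build a GET URL for one search (hypothetical helper, sends nothing)."""
    params = {
        "token": token,             # your Apify API token
        "query": query,             # assumed parameter name
        "maxResults": max_results,  # assumed parameter name
    }
    # urlencode escapes spaces and special characters in the query
    return f"{ASSUMED_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_search_url("web scraping legality", "YOUR_APIFY_TOKEN"))
```

Fetching the resulting URL with any HTTP client should return the page content your pipeline can pass to the model as context.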

Get real-time data for LLMs with RAG Web Browser →


Your personal continuous browser pool

Browser Pool is another Actor that uses Standby mode. It keeps your connections smooth by auto-scaling a pool of browsers that you control with Puppeteer or Playwright over the Chrome DevTools Protocol (CDP).

Currently in beta, Browser Pool is compatible with Browserless.io and Browserbase.com. It uses fingerprinting and proxies to help you avoid blocking, and it even picks the right Playwright version for you. As we're in the early stages of development, we're actively seeking user feedback.

Get your on-demand browsers with Browser Pool

Website spell checker

GPT-powered Actors can do so much. How about detecting and reporting spelling and grammar errors across an entire website?

Check out how we built an AI Website Spell Checker in just 10 minutes using GPT Scraper. See how you can also build a GPT-powered Actor yourself.

Learn how we built it

Try Website Spell Checker out →


Global search: now for everyone!

Now you can quickly find resources by ID, zip through Console routes, and access key actions like theme changes and Actor creation.

Just hit Cmd + K or click the search icon near the logo to get started. Give it a spin and share your thoughts — we're excited to see how it improves your workflow.

Boost security with scoped API tokens

We've rolled out scoped API tokens, letting you create access keys with exact permissions. Use them to run Actors via third-party services, build from CI/CD pipelines, or manage datasets securely.
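As a rough illustration, this is how a third-party service or CI job might prepare a request to start an Actor run with a scoped token. It is a sketch, not official client code: the helper name `build_run_request`, the Actor ID, and the input fields are placeholders, and the request is only built, not sent.

```python
import json
import urllib.request

API_BASE = "https://api.apify.com/v2"


def build_run_request(actor_id: str, token: str, run_input: dict) -> urllib.request.Request:
    """Prepare a POST request that starts an Actor run (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{API_BASE}/acts/{actor_id}/runs",
        data=json.dumps(run_input).encode("utf-8"),  # Actor input as JSON
        headers={
            # The scoped token goes in the header, not in the URL
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; use your own Actor ID and scoped token.
    req = build_run_request("someuser~some-actor", "YOUR_SCOPED_TOKEN", {"example": True})
    print(req.get_method(), req.full_url)
```

If the token's scope does not include running that Actor, the API should reject the request, which is the point of limiting permissions.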

Set up your tokens in the API & Integrations page →

For more details, see our blog post →


Web scraping legal landscape

We've published a fresh look at the Facebook v. Power Ventures case, exploring its impact on web scraping legality. Learn how courts view "authorized access" and what it means for your projects.

Read our Facebook v. Power Ventures case analysis →

Brush up on Van Buren v. United States and its impact on CFAA →


Cross-platform social data, simplified

One hashtag, multiple platforms: YouTube, TikTok, Instagram, Facebook. What if you could scrape them all in one go and get captions, URLs, views, likes, comments, shares, timestamps, and metadata for all relevant content?

Our new Actor bundle will extract posts with the same hashtag across all these platforms in one neat dataset.

Try Social Media Hashtag Research tool →


New Python SDK 2.0 and Crawlee-Python templates

Our latest Python SDK is now built on Crawlee and adds features specific to our platform. Plus, everything is now typed: no more generic dictionaries!

You can now use Crawlee-Python templates in Console for faster setup. Our updated templates now support autoinstall, keep track of preinstalled dependency versions, and make it easier to understand performance issues.

Learn about breaking changes from the Docs →

Try new web scraping templates for Python →


More Python resources

We've been busy creating new Python content to help you build better web scrapers and automation tools. Check out these articles for practical tips and code examples to improve your Python scraping projects.

Crawlee longread: current mistakes of web scraping in Python and tricks to solve them →

Crawlee read: how to scrape infinite scrolling webpages with Python →

5 new articles on Python: on scraping Google Finance and Yahoo Finance, browser automation in Selenium, parsing XML in Python, and more →


Customer stories

Check out our new true stories from the world of web scraping and web automation.

"At one point, we were onboarding 10 airlines a week. That couldn't have happened without Apify."

How a travel agency uses AI and web scraping tools to improve their operations →

“The ready-to-use Actors are really great stuff. We really enjoy having at least the opportunity to use them as a starting point.”

How a service company collects venue data to help monetize musicians' creativity →

"I am actually using Apify to, first of all, build trust. So basically, I can show it to my customers that I'm actually [creating scrapers] for some time, and you can see the proof that I basically know how to do it."

How Jan Danecki creates and monetizes Actors in Store →


Webinar

Actor Standby mode keeps Apify Actors running in the background, ready to respond instantly to new requests — ideal for real-time processing and high-frequency tasks. Join our webinar on October 7 at 9 am EST to learn how to build the most advanced and efficient Actors using Standby mode.

Register for the webinar with live Q&A here →
