Web scraping code templates
Actor templates help you quickly set up your web scraping projects, saving you development time and giving you immediate access to all the features the Apify platform has to offer.
Scrape single page with provided URL with Axios and extract data from page's HTML with Cheerio.
StarterA scraper example that uses Cheerio to parse HTML. It's fast, but it can't run the website's JavaScript or pass JS anti-scraping challenges.
Example of a Puppeteer and headless Chrome web scraper. Headless browsers render JavaScript and are harder to block, but they're slower than plain HTTP.
Web scraper example with Crawlee, Playwright and headless Chrome. Playwright is more modern, user-friendly and harder to block than Puppeteer.
Example of using the Playwright Test project to run automated website tests in the cloud and display their results. Usable as an API.
Empty template with basic structure for the Actor with Apify SDK that allows you to easily add your own functionality.
Template with basic structure for an Actor using Standby mode that allows you to easily add your own functionality.
StarterScrape single page with provided URL with Axios and extract data from page's HTML with Cheerio.
StarterScrape single page with provided URL with Axios and extract data from page's HTML with Cheerio.
StarterA scraper example that uses Cheerio to parse HTML. It's fast, but it can't run the website's JavaScript or pass JS anti-scraping challenges.
Example of a Puppeteer and headless Chrome web scraper. Headless browsers render JavaScript and are harder to block, but they're slower than plain HTTP.
Web scraper example with Crawlee, Playwright and headless Chrome. Playwright is more modern, user-friendly and harder to block than Puppeteer.
Skeleton project that helps you quickly bootstrap `CheerioCrawler` in JavaScript. It's best for developers who already know Apify SDK and Crawlee.
Example of running Cypress tests and saving their results on the Apify platform. JSON results are saved to Dataset, videos to Key-value store.
Empty template with basic structure for the Actor with Apify SDK that allows you to easily add your own functionality.
Template with basic structure for an Actor using Standby mode that allows you to easily add your own functionality.
StarterExample of how to use LangChain.js with Apify to crawl the web data, vectorize them, and prompt the OpenAI model.
This example Scrapy spider scrapes page titles from URLs defined in input parameter. It shows how to use Apify SDK for Python and Scrapy pipelines to save results.
Scrape single page with provided URL with HTTPX and extract data from page's HTML with Beautiful Soup.
StarterExample of a web scraper that uses Python HTTPX to scrape HTML from URLs provided on input, parses it using BeautifulSoup and saves results to storage.
Crawler example that uses headless Chrome driven by Playwright to scrape a website. Headless browsers render JavaScript and can help when getting blocked.
Scraper example built with Selenium and headless Chrome browser to scrape a website and save the results to storage. A popular alternative to Playwright.
Empty template with basic structure for the Actor with Apify SDK that allows you to easily add your own functionality.
Template with basic structure for an Actor using Standby mode that allows you to easily add your own functionality.
StarterCrawl and scrape websites using Crawlee and BeautifulSoup. Start from a given start URLs, and store results to Apify dataset.
StarterCrawl and scrape websites using Crawlee and Playwright. Start from a given start URLs, and store results to Apify dataset.
Starter