Getting started

Synchronous scrape

The scrape endpoint runs a browser session and returns rendered HTML in a single request — ideal for low-latency integrations.

Async jobs

The jobs endpoints enqueue long-running scrapes and let you poll for status and results — ideal for high-volume or slow pages.
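
The enqueue-then-poll flow described above can be sketched as a small polling loop. This is a minimal sketch, not the official client: the `get_status` callable stands in for whatever HTTP call fetches a job, and the status values `"done"` and `"failed"` are assumptions, since this section does not list the actual job states.

```python
import time
from typing import Any, Callable, Dict


def poll_job(
    get_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a job until it reports a terminal state or attempts run out."""
    for _ in range(max_attempts):
        job = get_status(job_id)
        # "done" and "failed" are assumed status names, not documented ones.
        if job.get("status") in ("done", "failed"):
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

Passing the fetch function and the sleep function as parameters keeps the loop independent of any particular HTTP library and makes it easy to exercise without network access.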

REST Architecture

The Web Scraping API follows REST principles: it uses predictable, resource-oriented URLs and standard HTTP status codes to report success and failure.

HTTPS Security

All API traffic is encrypted with TLS 1.2 or higher to protect data integrity and privacy.

API Versioning

The Web Scraping API is versioned to preserve backward compatibility. The current version is Version 1.

Authentication

Your API key is the only credential needed to access the Web Scraping API. Each Cleariflow service requires its own key. Include your key in the JSON request body as api_key.

Base URL

https://scrape.cleariflow.com
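
To illustrate how the API key and base URL fit together, here is a minimal sketch that builds (but does not send) a scrape request using only the Python standard library. The base URL and the `api_key` body field come from this page. The path `/v1/scrape` and the `url` field name are assumptions made for illustration, so check the endpoint reference for the real values.

```python
import json
import urllib.request

BASE_URL = "https://scrape.cleariflow.com"  # from this page
SCRAPE_PATH = "/v1/scrape"  # assumption: the exact path is not given here


def build_scrape_request(api_key: str, target_url: str) -> urllib.request.Request:
    """Build a POST request that carries the API key in the JSON body."""
    body = json.dumps({
        "api_key": api_key,   # documented: the key goes in the JSON body
        "url": target_url,    # assumption: hypothetical field name
    }).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + SCRAPE_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` to perform the call. Keeping construction separate from sending makes the payload easy to inspect before any credits are spent.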

Page rendering

Pages are rendered in a real headless browser. JavaScript is fully executed before content is returned. Built-in SSRF protection blocks requests to localhost and private IP ranges.
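
Because targets on localhost or private IP ranges are rejected, a client may want to catch obvious cases before spending a request. The following is a rough client-side approximation, assuming the server's rules resemble the standard loopback, private, and link-local ranges. It is not the service's actual filter, and it does not resolve hostnames, so a domain that points at a private address would still pass this check.

```python
import ipaddress
from urllib.parse import urlparse


def is_probably_blocked(url: str) -> bool:
    """Return True if the URL targets localhost or a literal private IP."""
    host = urlparse(url).hostname
    if host is None:
        return True  # no host at all: not a usable target
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return False  # a regular hostname; DNS is not checked here
    return ip.is_loopback or ip.is_private or ip.is_link_local
```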

Response and error codes

When requests fail, the API returns structured JSON error responses with specific codes and descriptions for effective troubleshooting.

| Code | Type | Details |
| --- | --- | --- |
| 200 | OK | Everything worked as expected. |
| 202 | Accepted | Async job was enqueued successfully. |
| 400 | Bad request | Invalid URL, blocked target, or malformed payload. |
| 401 | Unauthorized | The request was rejected, typically because the API key is missing or incorrect. |
| 422 | Quota reached | The request was aborted due to insufficient API credits (free plans). |
| 429 | Too many requests | The request was aborted because the allowed number of requests per second was exceeded. Free plans are limited to 1 request per second. |
| 500 | Internal server error | The request could not be completed due to an error on the server side. |
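
One way a client might act on these codes is sketched below. The status codes and their meanings are taken from the table. Everything else is an assumption for illustration: the `ScrapeError` class, the `send` callable (which stands in for the real HTTP call and returns a status code plus a parsed body), and the choice to retry only on 429 after waiting one second, which matches the free-plan limit of 1 request per second.

```python
import time
from typing import Callable, Dict, Tuple

# Descriptions condensed from the response code table.
DESCRIPTIONS: Dict[int, str] = {
    400: "Bad request",
    401: "Unauthorized",
    422: "Quota reached",
    429: "Too many requests",
    500: "Internal server error",
}


class ScrapeError(Exception):
    """Hypothetical error type carrying the HTTP status code."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status


def call_with_retry(
    send: Callable[[], Tuple[int, dict]],
    max_attempts: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call send(), retrying only on 429 after waiting `delay` seconds."""
    for attempt in range(1, max_attempts + 1):
        status, payload = send()
        if status in (200, 202):
            return payload
        if status == 429 and attempt < max_attempts:
            sleep(delay)  # back off to stay under the per-second limit
            continue
        raise ScrapeError(status, DESCRIPTIONS.get(status, "Unexpected status"))
    raise ScrapeError(0, "max_attempts must be at least 1")
```

Errors such as 400, 401, and 422 are raised immediately, since repeating the same request would not change the outcome.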