Crawl

Breadth-first crawl of a website, following same-host links. Each page is extracted in the formats you specify.
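
The traversal described above can be sketched as a plain breadth-first search. This is an illustrative Python version only (the crawler's internals are not documented here); `get_links` stands in for fetching a page and extracting its links:

```python
from collections import deque
from urllib.parse import urlparse

def bfs_crawl(start_url, get_links, limit=10, max_depth=2):
    """Breadth-first traversal from start_url, following same-host links.

    `get_links(url)` is a stand-in for fetching a page and extracting
    the URLs it links to.
    """
    host = urlparse(start_url).netloc
    seen = {start_url}
    queue = deque([(start_url, 0)])
    pages = []
    while queue and len(pages) < limit:
        url, depth = queue.popleft()
        pages.append(url)
        if depth >= max_depth:
            continue  # don't follow links past max_depth
        for link in get_links(url):
            # Same-host links only; skip anything already queued or visited.
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return pages

# Toy link graph standing in for a real site.
site = {
    "https://docs.example.com": ["https://docs.example.com/a", "https://other.com/x"],
    "https://docs.example.com/a": ["https://docs.example.com/b"],
    "https://docs.example.com/b": [],
}
order = bfs_crawl("https://docs.example.com", lambda u: site.get(u, []))
```

Note that the off-host link (`https://other.com/x`) is never visited, and pages are reached shallowest-first, which is why `limit` tends to keep the crawl close to the starting URL.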

Endpoint

POST /v1/crawl
curl -X POST https://api.browsr.dev/v1/crawl \
  -H "Authorization: Bearer $BROWSR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "limit": 20,
    "max_depth": 3,
    "formats": ["markdown"]
  }'
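
The same request can be sketched in Python. This assumes the widely used `requests` library; the `post` call is commented out so the snippet stays dependency-free, and the payload mirrors the curl example above:

```python
import json
import os

payload = {
    "url": "https://docs.example.com",
    "limit": 20,
    "max_depth": 3,
    "formats": ["markdown"],
}
headers = {
    "Authorization": f"Bearer {os.environ.get('BROWSR_API_KEY', '')}",
    "Content-Type": "application/json",
}
body = json.dumps(payload)

# With the `requests` library installed:
# resp = requests.post("https://api.browsr.dev/v1/crawl",
#                      headers=headers, data=body)
# result = resp.json()
```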

Request

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `url` | string | required | Starting URL |
| `limit` | number | `10` | Max pages (max: 100) |
| `max_depth` | number | `2` | Max link depth |
| `formats` | string[] | `["markdown"]` | Output formats per page |
| `wait_for` | number | `0` | Milliseconds to wait per page |
| `include_paths` | string[] | `[]` | Glob patterns to include |
| `exclude_paths` | string[] | `[]` | Glob patterns to exclude |
| `only_main_content` | boolean | `true` | Strip nav/header/footer |
| `json_options` | object | | JSON extraction options (applied to every page) |

Path filtering

curl -X POST https://api.browsr.dev/v1/crawl \
  -H "Authorization: Bearer $BROWSR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://docs.example.com",
    "limit": 50,
    "include_paths": ["/docs/**", "/api/**"],
    "exclude_paths": ["/blog/**"],
    "formats": ["markdown"]
  }'
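
The filtering logic can be approximated in Python with `fnmatch`, whose `*` matches across `/` and therefore behaves like the `**` patterns above. This is illustrative only; the API's exact glob dialect is not specified here:

```python
from fnmatch import fnmatch
from urllib.parse import urlparse

def path_allowed(url, include_paths=(), exclude_paths=()):
    """Check a URL's path against include/exclude glob patterns.

    A URL passes if it matches at least one include pattern (or the
    include list is empty) and matches no exclude pattern.
    """
    path = urlparse(url).path
    if include_paths and not any(fnmatch(path, p) for p in include_paths):
        return False
    return not any(fnmatch(path, p) for p in exclude_paths)

include = ["/docs/**", "/api/**"]
exclude = ["/blog/**"]
path_allowed("https://docs.example.com/docs/intro", include, exclude)   # allowed
path_allowed("https://docs.example.com/blog/post-1", include, exclude)  # excluded
```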

Response

{
  "success": true,
  "total": 15,
  "completed": 15,
  "data": [
    {
      "markdown": "# Getting Started\n\n...",
      "metadata": {
        "title": "Getting Started",
        "sourceURL": "https://docs.example.com/getting-started",
        "status_code": 200
      }
    }
  ]
}
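
`data` contains one entry per crawled page, keyed by the formats you requested. A short Python sketch of walking the response (the sample above is inlined as a literal for illustration):

```python
# The sample response from above, inlined as a Python dict.
response = {
    "success": True,
    "total": 15,
    "completed": 15,
    "data": [
        {
            "markdown": "# Getting Started\n\n...",
            "metadata": {
                "title": "Getting Started",
                "sourceURL": "https://docs.example.com/getting-started",
                "status_code": 200,
            },
        }
    ],
}

# Map each successfully fetched page's source URL to its markdown.
pages = {
    page["metadata"]["sourceURL"]: page["markdown"]
    for page in response["data"]
    if page["metadata"]["status_code"] == 200
}
```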