Crawl
Breadth-first crawl of a website, following same-host links. Each page is extracted in the formats you specify.
Endpoint
POST /v1/crawl
curl -X POST https://api.browsr.dev/v1/crawl \
-H "Authorization: Bearer $BROWSR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://docs.example.com",
"limit": 20,
"max_depth": 3,
"formats": ["markdown"]
}'
Request
| Field | Type | Default | Description |
|---|---|---|---|
url | string | required | Starting URL |
limit | number | 10 | Max pages (max: 100) |
max_depth | number | 2 | Max link depth |
formats | string[] | ["markdown"] | Output formats per page |
wait_for | number | 0 | ms to wait per page |
include_paths | string[] | [] | Glob patterns to include |
exclude_paths | string[] | [] | Glob patterns to exclude |
only_main_content | boolean | true | Strip nav/header/footer |
json_options | object | — | JSON extraction (applied to every page) |
Path filtering
curl -X POST https://api.browsr.dev/v1/crawl \
-H "Authorization: Bearer $BROWSR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://docs.example.com",
"limit": 50,
"include_paths": ["/docs/**", "/api/**"],
"exclude_paths": ["/blog/**"],
"formats": ["markdown"]
}'
Response
{
"success": true,
"total": 15,
"completed": 15,
"data": [
{
"markdown": "# Getting Started\n\n...",
"metadata": {
"title": "Getting Started",
"sourceURL": "https://docs.example.com/getting-started",
"status_code": 200
}
}
]
}