Turn websites into LLM-ready Markdown with get_content

Use the get_content command to convert the content matched by any CSS selector to Markdown or HTML inside the same /browser/commands call. This saves a separate API request and returns clean text for your model.

Inline extraction in your sequence

SESSION=session-123

curl -X POST "http://localhost:8082/browser/commands" \
  -H "Content-Type: application/json" \
  -d '{
    "session_id": "'"$SESSION"'",
    "headless": true,
    "commands": [
      { "command": "navigate_to", "data": { "url": "https://example.com/blog" } },
      { "command": "wait_for_element", "data": { "selector": "article", "timeout_ms": 4000 } },
      { "command": "get_content", "data": { "selector": "article", "kind": "markdown" } }
    ]
  }'
  • selector: CSS selector to target (e.g., article, main, .content).
  • kind: markdown (default) or html for raw markup.
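
To switch between pages, selectors, or output kinds without editing the JSON by hand, the request body can be generated by a small shell function. The following is a minimal sketch, not part of the API: build_payload is a hypothetical helper name, and it reuses the navigate_to, wait_for_element, and get_content commands with the same fields and the 4000 ms timeout from the example above.

```shell
# Hypothetical helper (not part of the API): prints a /browser/commands
# payload that navigates, waits for the selector, then extracts it.
# Arguments: $1 session id, $2 page URL, $3 CSS selector,
#            $4 kind (optional, falls back to markdown).
# Assumption: values are inserted verbatim, so they must not contain
# double quotes or backslashes.
build_payload() {
  printf '{"session_id":"%s","headless":true,"commands":[' "$1"
  printf '{"command":"navigate_to","data":{"url":"%s"}},' "$2"
  printf '{"command":"wait_for_element","data":{"selector":"%s","timeout_ms":4000}},' "$3"
  printf '{"command":"get_content","data":{"selector":"%s","kind":"%s"}}' "$3" "${4:-markdown}"
  printf ']}\n'
}

# Example: build a payload that asks for raw HTML instead of Markdown.
build_payload "session-123" "https://example.com/blog" "article" "html"
```

The output can be piped straight into the request, for example by ending the curl command above with -d @- so that curl reads the body from standard input.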

Because get_content runs in the same request as your navigation and waits, it reuses the live DOM and session state without another HTTP round trip.

When to use /browser/content instead

If you already know the selector and just need a quick fetch, /browser/content is the fastest path:

curl "http://localhost:8082/browser/content?session_id=$SESSION&selector=main&mode=markdown"
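
A bare selector such as main is safe to place directly in the query string, but selectors containing spaces, > or # must be percent-encoded first. A minimal sketch of one way to handle that with curl's own --get and --data-urlencode options follows. The fetch_content name and the BASE_URL variable are illustrative assumptions; the endpoint and the session_id, selector, and mode parameters are the ones from the example above.

```shell
# Assumption: the service listens on the localhost address used above.
BASE_URL="${BASE_URL:-http://localhost:8082}"

# Hypothetical wrapper around GET /browser/content.
# Arguments: $1 session id, $2 CSS selector, $3 mode (optional; markdown
# is sent explicitly when omitted, matching the example above).
fetch_content() {
  # --get appends each --data-urlencode pair to the URL as a
  # percent-encoded query parameter instead of sending a POST body.
  curl --silent --show-error --fail --get \
    --data-urlencode "session_id=$1" \
    --data-urlencode "selector=$2" \
    --data-urlencode "mode=${3:-markdown}" \
    "$BASE_URL/browser/content"
}

# Usage (requires a running service and an existing session):
#   fetch_content "$SESSION" ".content > article"
```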

Use get_content when you want the extraction embedded in a replayable sequence, and /browser/content for lightweight one-off pulls. Both can return LLM-friendly Markdown, which typically uses fewer tokens than the equivalent raw HTML.