POST
/api/parse
2 credits/page (fast), 5 credits/page (hires)Document parsing (PDF, images, OCR)
Parse documents into structured markdown. Supports PDF, images, and scanned
documents. Fast mode is synchronous (~15 pages/sec), HiRes mode is asynchronous
with OCR (~16 pages/min).
curl -s -X POST "$BASE/api/parse" \
-H "Authorization: Bearer $API_KEY" \
-H "Content-Type: application/json" \
-d '{
"url": "https://arxiv.org/pdf/2301.00234v1",
"mode": "fast"
}'Request Body
| Field | Type | Req | Default | Description |
|---|---|---|---|---|
| url | string | N | — | Document URL |
| base64 | string | N | — | Base64-encoded file content |
| filename | string | N | — | Filename hint (required with base64) |
| mode | string (fast | hires | auto) | N | fast | fast: ~15 pages/sec, synchronous. hires: ~16 pages/min, asynchronous with OCR. auto: server chooses. |
| output | string (markdown | json) | N | markdown | |
| wait | boolean | N | false | Wait for async result (block until complete) |
| imageMode | string (embedded | s3) | N | — | How to handle images in output |
| promptType | string | N | — | Custom prompt type for OCR |
| callbackUrl | string | N | — | URL for async completion callback |
| includeDetection | boolean | N | false | Include detection data in response (bboxes, element types) |
Request Example
{
"url": "https://arxiv.org/pdf/2401.00001.pdf",
"mode": "fast",
"output": "markdown"
}Response Example
{
"success": true,
"mode": "fast",
"document": {
"markdown": "# Document Title\n\nContent...",
"pageCount": 10,
"metadata": {
"title": "Document Title",
"author": "Author"
}
},
"cost": {
"pages": 10,
"totalCredits": 20
},
"processingTime": 680
}