POST /api/parse

2 credits/page (fast), 5 credits/page (hires)

Document parsing (PDF, images, OCR)

Parse documents into structured markdown. Supports PDFs, images, and scanned documents. Fast mode is synchronous (~15 pages/sec); HiRes mode is asynchronous and uses OCR (~16 pages/min).

curl -s -X POST "$BASE/api/parse" \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://arxiv.org/pdf/2301.00234v1",
    "mode": "fast"
  }'
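
The same call can be sketched in Python using only the standard library. This is a minimal sketch, not an official client: `build_parse_request` and `send` are hypothetical helper names, and `https://api.example.com` and `YOUR_API_KEY` are placeholders for the `$BASE` and `$API_KEY` values used in the curl example.

```python
import json
import urllib.request


def build_parse_request(base_url: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/parse."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/parse",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Placeholder base URL and key (assumptions); replace them before calling send().
    req = build_parse_request(
        "https://api.example.com",
        "YOUR_API_KEY",
        {"url": "https://arxiv.org/pdf/2301.00234v1", "mode": "fast"},
    )
    print(req.get_method(), req.full_url)
```

Building the request separately from sending it makes it easy to inspect the headers and body before any network call. The same helper accepts any of the documented body fields, for example `{"url": "...", "mode": "hires", "wait": True}`.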

Request Body

Field	Type	Req	Default	Description
url	string	N	—	Document URL
base64	string	N	—	Base64-encoded file content
filename	string	N	—	Filename hint (required with base64)
mode	string (fast | hires | auto)	N	fast	fast: ~15 pages/sec, synchronous. hires: ~16 pages/min, asynchronous with OCR. auto: server chooses.
output	string (markdown | json)	N	markdown	Output format
wait	boolean	N	false	Wait for async result (block until complete)
imageMode	string (embedded | s3)	N	—	How to handle images in output
promptType	string	N	—	Custom prompt type for OCR
callbackUrl	string	N	—	URL for async completion callback
includeDetection	boolean	N	false	Include detection data in response (bboxes, element types)
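
When a document is not reachable by URL, the table above allows sending it as `base64` together with a `filename`. The sketch below shows one way to assemble such a body. `build_base64_payload` is a hypothetical helper, and the use of standard (non-URL-safe) Base64 without a data-URI prefix is an assumption, since the encoding variant is not specified here.

```python
import base64


VALID_MODES = {"fast", "hires", "auto"}


def build_base64_payload(content: bytes, filename: str, mode: str = "fast") -> dict:
    """Assemble a request body that uploads file content inline."""
    # The field table marks filename as required whenever base64 is used.
    if not filename:
        raise ValueError("filename is required when sending base64 content")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    return {
        # Assumption: plain standard Base64, no "data:" prefix.
        "base64": base64.b64encode(content).decode("ascii"),
        "filename": filename,
        "mode": mode,
    }
```

For a local file, something like `build_base64_payload(open("report.pdf", "rb").read(), "report.pdf")` would produce the body, where `report.pdf` is only an illustrative name.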

Request Example

{
  "url": "https://arxiv.org/pdf/2401.00001.pdf",
  "mode": "fast",
  "output": "markdown"
}

Response Example

{
  "success": true,
  "mode": "fast",
  "document": {
    "markdown": "# Document Title\n\nContent...",
    "pageCount": 10,
    "metadata": {
      "title": "Document Title",
      "author": "Author"
    }
  },
  "cost": {
    "pages": 10,
    "totalCredits": 20
  },
  "processingTime": 680
}
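
The `cost` block in the response above is consistent with the listed pricing: 10 pages in fast mode at 2 credits/page gives 20 credits. A small sketch of that arithmetic follows. `estimate_credits` and `cost_matches_pricing` are hypothetical helpers, and the rates are copied from the pricing line at the top of this section. No rate is listed for `auto` mode, so the sketch refuses to guess one.

```python
# Credits per page, taken from the pricing line for this endpoint.
CREDITS_PER_PAGE = {"fast": 2, "hires": 5}


def estimate_credits(pages: int, mode: str) -> int:
    """Return the expected credit cost for a page count and mode."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if mode not in CREDITS_PER_PAGE:
        raise ValueError(f"no per-page rate known for mode {mode!r}")
    return pages * CREDITS_PER_PAGE[mode]


def cost_matches_pricing(response: dict) -> bool:
    """Check a response's totalCredits against the per-page rate."""
    expected = estimate_credits(response["cost"]["pages"], response["mode"])
    return response["cost"]["totalCredits"] == expected
```

A check like this can be useful for budgeting before a large batch, or for flagging responses whose billed credits differ from the estimate.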
