Developer API

PDF OCR API

Production OCR pipelines split along a predictable line. Some teams ship with Google Vision or AWS Textract because accuracy on messy scans is marginally better. Others avoid them because of per-1000-page pricing that starts at $1.50 and the privacy implications of sending receipts or medical forms to a cloud OCR service.

PennyPDF's /v1/ocr runs Tesseract, the open-source OCR engine originally developed at HP and later maintained for years under Google's sponsorship. We support 12 languages including Japanese, Arabic, and Hindi. Output is the original PDF with a searchable text layer added, so the downstream /v1/pdf-to-word and /v1/extract endpoints can find the text without re-processing.

3 coins per document (~$0.12 at the Saver pack, $0.09 at the Pro pack). Compare: Google Vision $1.50/1000 pages (so $0.0015/page, or about $0.03 for a 20-page doc), AWS Textract $1/1000 pages basic + $50/1000 for tables/forms, Adobe Extract API $0.03/doc with a $2500 annual floor.


Copy, paste, ship

Same bearer-token auth across every endpoint. Set PENNYPDF_API_KEY in your environment first.

curl: POST /v1/ocr

curl -X POST https://api.pennypdf.com/v1/ocr \
  -H "Authorization: Bearer $PENNYPDF_API_KEY" \
  -F "file=@scanned-invoice.pdf" \
  -F "languages=eng,spa" \
  -o searchable.pdf

Python: OCR + extract pipeline

import os

import requests

auth = {"Authorization": f"Bearer {os.environ['PENNYPDF_API_KEY']}"}

# 1. OCR the scan (3 coins)
with open("scan.pdf", "rb") as scan:  # close the file handle when done
    r = requests.post(
        "https://api.pennypdf.com/v1/ocr",
        headers=auth,
        files={"file": scan},
        data={"languages": "eng"},
    )
r.raise_for_status()  # stop here if the OCR request failed
ocr_pdf = r.content  # bytes of the searchable PDF

# 2. Extract structured text (0 coins: the text layer is already there)
r = requests.post(
    "https://api.pennypdf.com/v1/extract",
    headers=auth,
    files={"file": ("ocr.pdf", ocr_pdf)},
    data={"format": "json"},
)
r.raise_for_status()  # stop here if the extract request failed
print(r.json()["text"][:500])  # first 500 chars

PennyPDF vs Google Cloud Vision

	PennyPDF	Google Cloud Vision
Price per 1k pages	~$6 (50 docs × $0.12, at 20 pages avg)	$1.50 (text detection)
Price per doc (avg 20pp)	$0.12	$0.03
Monthly minimum	$0	$0 (then quota)
Data sent to third party	No — self-hosted Tesseract	Yes — Google
Languages	12 built-in	50+
Output format	Searchable PDF	JSON (DIY PDF assembly)
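
The per-document and per-1k-page figures in the table are two views of the same prices. A minimal sketch of the conversion, using only the numbers quoted on this page (the Saver-pack price and the 20-page average are the page's own assumptions; the function names are illustrative):

```python
# Figures quoted on this page; the 20-page average is an assumption of the comparison.
PENNYPDF_PER_DOC = 0.12      # 3 coins at the Saver pack, in USD
VISION_PER_1K_PAGES = 1.50   # Google Cloud Vision text detection, in USD
AVG_PAGES_PER_DOC = 20


def pennypdf_per_1k_pages(per_doc: float, pages_per_doc: int) -> float:
    """Convert per-document pricing to a per-1000-pages figure."""
    return per_doc * (1000 / pages_per_doc)


def vision_per_doc(per_1k_pages: float, pages_per_doc: int) -> float:
    """Convert per-1000-pages pricing to a per-document figure."""
    return per_1k_pages / 1000 * pages_per_doc


print(f"PennyPDF per 1k pages: ${pennypdf_per_1k_pages(PENNYPDF_PER_DOC, AVG_PAGES_PER_DOC):.2f}")
print(f"Vision per {AVG_PAGES_PER_DOC}-page doc: ${vision_per_doc(VISION_PER_1K_PAGES, AVG_PAGES_PER_DOC):.2f}")
```

Because PennyPDF charges per document rather than per page, its per-page cost falls as documents get longer and rises for short ones.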

How it works

1. POST the scanned PDF as multipart form-data to /v1/ocr with optional language hints.
2. Receive the same PDF with an invisible text layer: copy-paste and search work, visuals unchanged.
3. Optionally chain /v1/extract afterwards at 0 coins to pull structured text out.

Frequently asked

How accurate is Tesseract compared to Google Vision?

On clean office scans (300dpi+, black text on white), both hit 99%+. On receipts, handwriting, or low-contrast scans, Google Vision and AWS Textract have a real edge — 2-5 percentage points better character accuracy. If accuracy is life-critical (medical records, legal discovery), use the cloud providers. For invoice/form digitization, Tesseract is good enough.

What does 'text layer added in place' mean?

The output PDF looks visually identical to the input (raster pages unchanged) but has an invisible text overlay positioned over each recognized character. Screen readers, text search, copy-paste, and downstream text extraction all work. No re-layout, no visual regression.

Latency?

Tesseract is CPU-bound. p50 = 8 s for a 5-page scan, p90 = 22 s for 20 pages. Use the async /v1/jobs/ocr endpoint for anything over 10 pages to avoid tying up the connection.
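
The 10-page guideline can be applied before a request is sent. A minimal sketch, assuming the async endpoint lives on the same host as /v1/ocr (only its path is given on this page) and that the caller already knows the page count; `pick_ocr_endpoint` is an illustrative helper, not part of the API:

```python
SYNC_URL = "https://api.pennypdf.com/v1/ocr"
# Assumption: same host as the sync endpoint; the page only names the path.
ASYNC_URL = "https://api.pennypdf.com/v1/jobs/ocr"
ASYNC_PAGE_THRESHOLD = 10  # "anything over 10 pages" goes async


def pick_ocr_endpoint(page_count: int) -> str:
    """Return the OCR endpoint URL suited to a document of this length."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    return ASYNC_URL if page_count > ASYNC_PAGE_THRESHOLD else SYNC_URL


for pages in (5, 10, 20):
    print(pages, "pages ->", pick_ocr_endpoint(pages))
```

The request and response format of the async endpoint is not described here, so the sketch stops at choosing the URL.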

What's the max resolution supported?

600dpi scans are the sweet spot. Anything higher gets downsampled to 400dpi internally (Tesseract's accuracy actually drops at very high resolutions because of per-pixel noise). If your scan is below 200dpi, accuracy will be poor; bump the scanner's DPI before calling us.
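
A scan's effective resolution can be estimated from its pixel width and the physical page width, which makes it possible to catch low-DPI input before spending coins. A rough sketch using the thresholds from the answer above; the helper names and the status labels are illustrative:

```python
MIN_USEFUL_DPI = 200      # below this, accuracy is described as poor
DOWNSAMPLE_ABOVE_DPI = 600  # above this, the service downsamples internally


def effective_dpi(pixel_width: int, page_width_inches: float) -> float:
    """Estimate scan resolution from image width and physical page width."""
    if pixel_width <= 0 or page_width_inches <= 0:
        raise ValueError("dimensions must be positive")
    return pixel_width / page_width_inches


def classify_scan(dpi: float) -> str:
    """Label a scan as 'too-low', 'ok', or 'downsampled' by its DPI."""
    if dpi < MIN_USEFUL_DPI:
        return "too-low"
    if dpi > DOWNSAMPLE_ABOVE_DPI:
        return "downsampled"
    return "ok"


# A US Letter page (8.5 in wide) scanned at 2550 px across is 300 dpi.
dpi = effective_dpi(2550, 8.5)
print(dpi, classify_scan(dpi))
```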

Does it handle rotated pages?

Yes — we auto-detect rotation per page and straighten before OCR'ing. If the source PDF has pages rotated 90°/180°/270°, the output will have those pages in their correctly-oriented form with the text layer matching.

Rate limits?

30 synchronous OCRs per minute per API key. Async: 100 job creations per minute. OCR is our most CPU-expensive operation; bulk workloads (1000+/day) should go through the async endpoint.
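
A client can stay under the synchronous limit by pacing its own calls. The following is one possible sketch of a sliding-window limiter sized for 30 calls per 60 seconds; the class is illustrative and not part of any PennyPDF SDK, and how the server responds when the limit is exceeded is not covered here:

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls=30, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock    # injectable so the limiter can be tested
        self._sleep = sleep
        self._stamps = deque()  # start times of calls still inside the window

    def acquire(self) -> float:
        """Block until a call is allowed; return the seconds spent waiting."""
        waited = 0.0
        while True:
            now = self._clock()
            # Drop timestamps that have left the window.
            while self._stamps and now - self._stamps[0] >= self.period:
                self._stamps.popleft()
            if len(self._stamps) < self.max_calls:
                self._stamps.append(now)
                return waited
            # Window is full: wait until the oldest call expires.
            delay = self.period - (now - self._stamps[0])
            self._sleep(delay)
            waited += delay
```

Calling `limiter.acquire()` immediately before each synchronous `/v1/ocr` request keeps a single process within the limit; several processes sharing one API key would need a shared limiter instead.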

Related

PDF extraction API (Developer API)
PDF to Word API (Developer API)
Scanned PDF to Word (Format conversion)

Why PennyPDF

No subscription. Ever.
Coins never expire — use them in 5 years.
Client-side processing for 14 of 22 tools.
No watermarks at any tier.
Per-operation pricing, shown before you click.
Same coins for web + public API.
