
Converting documents

Escriba converts almost anything into Markdown. Detection is automatic — you rarely need to tell it what kind of file you dropped.

  • Documents — PDF, Word, Excel, PowerPoint, HTML, CSV, EPUB, ZIP and more.
  • Images — automatic OCR (Tesseract); optional AI description.
  • Audio & video — local, offline transcription with Whisper (mp3, wav, mp4, mov, mkv…).
  • URLs & YouTube — convert a web page, or fetch a YouTube transcript.
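To make "automatic detection" concrete, here is a minimal sketch of how an input could be routed to one of the four groups above. It is not Escriba's actual detection logic: the `classify` function, the category names, the extension sets (including the image extensions and the specific Office extensions) and the YouTube host list are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical extension tables; the real list of supported formats is longer.
DOCUMENT_EXTS = {".pdf", ".docx", ".xlsx", ".pptx", ".html", ".csv", ".epub", ".zip"}
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tiff"}
MEDIA_EXTS = {".mp3", ".wav", ".mp4", ".mov", ".mkv"}
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "youtu.be"}


def classify(source: str) -> str:
    """Return a coarse category for a file name or URL (illustrative only)."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        return "youtube" if host in YOUTUBE_HOSTS else "url"

    # Not a web address: fall back to the file extension, case-insensitively.
    ext = PurePosixPath(source).suffix.lower()
    if ext in DOCUMENT_EXTS:
        return "document"
    if ext in IMAGE_EXTS:
        return "image"
    if ext in MEDIA_EXTS:
        return "media"
    return "unknown"
```

A real detector would likely inspect file contents as well, since extensions can be missing or wrong; this sketch only shows the routing idea.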

Text inside images is recognized automatically. Scanned and rotated PDFs are detected, straightened and OCR’d on the fly: if a PDF looks scanned and your access level allows OCR, Escriba applies it without being asked.

You can also force OCR from the advanced options — useful for PDFs with broken accents (e.g. exported from LaTeX). Forcing OCR uses the document language you choose, so set it for best results.

For long PDFs, convert only the pages you need. Next to each queued PDF there’s a page picker that shows the document’s page count and lets you choose:

  • The whole document (default).
  • A range — e.g. pages 5 to 67.
  • Individual pages or ranges — e.g. 1, 6, 9, or a mix like 1, 2, 5-67.

There’s no syntax to memorize: the picker handles single pages, ranges and mixed selections for you. The selection is made per file, so different PDFs in the same batch can use different pages.
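If you ever want to reproduce a selection such as 1, 2, 5-67 in a script of your own, the expansion into page numbers could be sketched as follows. The `parse_pages` function and its validation rules are hypothetical; they are not part of Escriba, whose picker does this work in the interface.

```python
def parse_pages(selection: str, page_count: int) -> list[int]:
    """Expand a selection like "1, 2, 5-67" into sorted, unique 1-based pages.

    Hypothetical helper: rejects pages outside 1..page_count and
    ranges whose start is greater than their end.
    """
    pages: set[int] = set()
    for part in selection.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if start < 1 or end > page_count or start > end:
            raise ValueError(f"invalid page selection: {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

Overlapping entries are merged, so "1-3, 2-4" yields pages 1 through 4 once each.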

Open the advanced panel to fine-tune a conversion:

  • Document language — improves audio transcription and forced OCR.
  • Force OCR — for scanned PDFs or broken accents.
  • Advanced PDF extraction — an opt-in OpenDataLoader engine for complex layouts: better reading order and heading hierarchy, with automatic fallback to the default extractor. Slower, but sharper on tricky documents.
  • Anonymization — strip or replace personal data; see Anonymization.
  • AI provider — optional. The default is No AI (local text / OCR only).

The result isn’t read-only. Hit Edit to open it in a full-screen Markdown editor with a live preview, tidy it up — drop boilerplate, fix a heading, trim noise — and Save. Your edits become the result: everything downstream (export, audio, copy and download) uses the cleaned text. Nothing is sent anywhere; it’s all in your browser until you act.

Add several files at once (your access level sets how many). Convert them all, then download everything as a .zip. Uploaded files are deleted right after conversion — nothing is stored.
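For readers curious what bundling a batch involves, here is a minimal sketch that packs several Markdown results into one in-memory .zip using Python's standard `zipfile` module. The `bundle_as_zip` function and the file names are assumptions for illustration; Escriba's own packaging may work differently.

```python
import io
import zipfile


def bundle_as_zip(results: dict[str, str]) -> bytes:
    """Pack {file name: Markdown text} into a single zip archive (in memory)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, markdown in results.items():
            # writestr encodes str content as UTF-8.
            archive.writestr(name, markdown)
    return buffer.getvalue()


# Example with two hypothetical conversion results.
archive_bytes = bundle_as_zip({"report.md": "# Report\n", "notes.md": "# Notes\n"})
```

Building the archive in memory rather than on disk matches the idea that nothing needs to be stored once the download is handed over.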