Security

Escriba is built to handle confidential documents. The guiding principle is that control stays with the human: your files are processed on your own server, and you decide what, if anything, reaches an LLM.

  • Nothing is stored. Uploaded files are deleted right after conversion.
  • No third-party cloud. Conversion, OCR, transcription and anonymization run locally on your host.
  • The restore map stays local. The token→original map used for pseudonymization never leaves your browser.
  • Fail-closed anonymization — if anonymization is requested and the Anonimal service is unreachable, the request errors out; raw text is never emitted as a fallback.
  • Anti-SSRF — URL fetching blocks internal IPs and redirects; local-file and file:// access is restricted to DIOS only.
  • XSS sanitization — the preview is sanitized with DOMPurify; a strict Content-Security-Policy and security headers are set.
  • Rate limiting & lockout — per-role request limits, shared across workers via the embedded Redis, plus login lockout on repeated failures.
  • Non-root container — runs as an unprivileged user with no-new-privileges.
  • Safe regex — user-supplied anonymization rules run on RE2 (linear time), which is immune to ReDoS.
  • DoS guards — uploads are size-capped via streaming; the page selector is capped to prevent range-bomb expansion.
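
To illustrate the page-selector cap mentioned in the last bullet, here is a minimal sketch of one way such a guard could work. It is not Escriba's actual code: the function name parse_page_selector, the selector syntax ("1-3,5"), and the MAX_PAGES value are all assumptions made for the example. The key idea is to check the size of each range arithmetically before expanding it, so a selector like "1-1000000000" is rejected without allocating anything.

```python
MAX_PAGES = 500  # hypothetical cap, not Escriba's real limit


def parse_page_selector(selector: str, max_pages: int = MAX_PAGES) -> list[int]:
    """Expand a selector such as "1-3,5" into page numbers, refusing oversized input."""
    pages: list[int] = []
    for part in selector.split(","):
        part = part.strip()
        if not part:
            continue
        # "7" is a single page; "2-4" is an inclusive range.
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)  # raises ValueError on non-numeric input
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"invalid page range: {part!r}")
        # Compare sizes BEFORE expanding, so a huge range costs nothing.
        if len(pages) + (end - start + 1) > max_pages:
            raise ValueError(f"page selector exceeds the cap of {max_pages} pages")
        pages.extend(range(start, end + 1))
    return pages
```

With these assumptions, parse_page_selector("1-3, 5") yields pages 1, 2, 3 and 5, while parse_page_selector("1-1000000000") raises ValueError immediately. The sketch keeps duplicates and input order; a real implementation might deduplicate or sort.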

The codebase went through a strict multi-perspective audit and a red-team pen-test, with every finding fixed and verified. Hardening highlights include a random per-install hashing key, sanitized X-Forwarded-For handling (trusted proxies only), session revocation, scrubbed PDF metadata on redaction, and no-cache headers on static assets.
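
The "trusted proxies only" handling of X-Forwarded-For can be sketched as follows. This is an illustrative example under stated assumptions, not Escriba's implementation: the function client_ip, the helper _is_trusted, and the TRUSTED_PROXIES network are hypothetical. The header is honored only when the direct peer is a trusted proxy, and the hop list is walked from right to left so that values a client prepends to the header cannot spoof its address.

```python
import ipaddress
from typing import Optional

# Hypothetical trusted proxy range; a real deployment would configure its own.
TRUSTED_PROXIES = [ipaddress.ip_network("10.0.0.0/8")]


def _is_trusted(addr: str, trusted=TRUSTED_PROXIES) -> bool:
    """Return True if addr parses as an IP inside one of the trusted networks."""
    try:
        ip = ipaddress.ip_address(addr)
    except ValueError:
        return False
    return any(ip in net for net in trusted)


def client_ip(peer_addr: str, xff_header: Optional[str], trusted=TRUSTED_PROXIES) -> str:
    """Pick the client address, honoring X-Forwarded-For only from trusted proxies."""
    # An untrusted peer can put anything in the header, so ignore it entirely.
    if not xff_header or not _is_trusted(peer_addr, trusted):
        return peer_addr
    hops = [hop.strip() for hop in xff_header.split(",")]
    # Walk from the nearest hop outward; the first untrusted hop is the client.
    for hop in reversed(hops):
        try:
            ipaddress.ip_address(hop)
        except ValueError:
            return peer_addr  # malformed entry: fall back to the socket peer
        if not _is_trusted(hop, trusted):
            return hop
    return peer_addr  # every hop was a trusted proxy
```

For example, with a peer of 10.0.0.5 and a header of "6.6.6.6, 198.51.100.7, 10.0.0.2", the sketch returns 198.51.100.7: the rightmost hop is a trusted proxy and is skipped, and the leftmost value, which the client could have forged, is never reached. Keying rate limits and lockouts on an address chosen this way prevents a client from dodging them by varying the header.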