Security
Escriba is built to handle confidential documents. The guiding principle is that control stays with the human: your files are processed on your server, and you decide what, if anything, reaches an LLM.
Private by design
- Nothing is stored. Uploaded files are deleted right after conversion.
- No third-party cloud. Conversion, OCR, transcription and anonymization run locally on your host.
- The restore map stays local. Pseudonymization’s token→original map never leaves your browser.
Hardened by default
- Fail-closed anonymization: if anonymization is requested and the Anonimal service is unreachable, the request errors out; raw text is never emitted as a fallback.
- Anti-SSRF: URL fetching blocks internal IPs and redirects; local-file and file:// access is restricted to DIOS only.
- XSS sanitization: the preview is sanitized with DOMPurify; a strict Content-Security-Policy and security headers are set.
- Rate limiting & lockout: per-role request limits, shared across workers via the embedded Redis, plus login lockout on repeated failures.
- Non-root container: runs as an unprivileged user with no-new-privileges.
- Safe regex: user-supplied anonymization rules run on RE2 (linear time), which is immune to ReDoS.
- DoS guards: uploads are size-capped via streaming; the page selector is capped to prevent range-bomb expansion.
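To make the anti-SSRF item concrete, here is a minimal Python sketch of an internal-IP check of the kind described above. It is not Escriba's actual code: the names `is_public_ip`, `check_fetch_target`, and `ALLOWED_SCHEMES` are hypothetical, and the sketch assumes the check runs before any request is made and that the HTTP client is separately configured not to follow redirects.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Assumption: only plain web schemes are fetchable; file:// is handled elsewhere.
ALLOWED_SCHEMES = {"http", "https"}


def is_public_ip(ip_text: str) -> bool:
    """Return True only for globally routable addresses."""
    ip = ipaddress.ip_address(ip_text)
    # Unwrap IPv4-mapped IPv6 (e.g. ::ffff:127.0.0.1) before checking.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped
    # is_global is False for loopback, private, link-local and reserved ranges.
    return ip.is_global


def check_fetch_target(url: str, resolve=socket.getaddrinfo) -> None:
    """Raise ValueError unless every address the host resolves to is public.

    `resolve` is injectable so the check can be exercised without DNS.
    """
    parts = urlsplit(url)
    if parts.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"scheme not allowed: {parts.scheme!r}")
    if not parts.hostname:
        raise ValueError("URL has no host")
    infos = resolve(parts.hostname, parts.port or 443, proto=socket.IPPROTO_TCP)
    # Each getaddrinfo entry ends with a sockaddr tuple whose first item is the IP.
    addresses = {info[4][0] for info in infos}
    if not addresses or not all(is_public_ip(a) for a in addresses):
        raise ValueError("host resolves to a non-public address")
```

A caller would run `check_fetch_target(url)` and fetch only if it returns without raising. A check like this is only sound if the client then connects to the address that was validated; otherwise a second DNS lookup could return a different, internal address.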
Audited
The codebase went through a strict multi-perspective audit and a red-team pen-test,
with every finding fixed and verified. Hardening highlights include a random
per-install hashing key, sanitized X-Forwarded-For handling (trusted proxies only),
session revocation, scrubbed PDF metadata on redaction, and no-cache headers on
static assets.
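The "trusted proxies only" handling of X-Forwarded-For can be illustrated with a short Python sketch. This is an assumption about the general technique, not Escriba's implementation: `TRUSTED_PROXIES` and `client_ip` are hypothetical names, and the proxy ranges are placeholders. The header is honored only when the direct peer is a trusted proxy, and it is read right to left so that values injected by the client are ignored.

```python
import ipaddress
from typing import Optional

# Hypothetical trusted proxy ranges; a real deployment would configure its own.
TRUSTED_PROXIES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("127.0.0.1/32"),
]


def _is_trusted(ip, trusted) -> bool:
    return any(ip in net for net in trusted)


def client_ip(peer_addr: str, forwarded_for: Optional[str],
              trusted=TRUSTED_PROXIES) -> str:
    """Pick the client address used for rate limiting and lockout."""
    peer = ipaddress.ip_address(peer_addr)
    # Ignore the header entirely unless the direct peer is a trusted proxy.
    if not forwarded_for or not _is_trusted(peer, trusted):
        return str(peer)
    hops = [h.strip() for h in forwarded_for.split(",")]
    # Walk from the nearest hop outwards; the first untrusted hop is the client.
    for hop in reversed(hops):
        try:
            ip = ipaddress.ip_address(hop)
        except ValueError:
            break  # malformed entry: stop trusting the header
        if not _is_trusted(ip, trusted):
            return str(ip)
    # Header was malformed or listed only trusted proxies: fall back to the peer.
    return str(peer)
```

For example, `client_ip("10.0.0.2", "6.6.6.6, 198.51.100.7, 10.0.0.3")` returns `"198.51.100.7"`: the leftmost entry is client-controlled and is never reached, because the walk stops at the first address that is not a trusted proxy.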