Drop a PDF, get a structured JSON object in the Polotno design schema: typed elements with explicit position, font, color, and embedded image data, ready to feed to a canvas editor or to an LLM with spatial context. This is not a plain-text extractor or an invoice parser; if that is what you need, this tool is probably not the best fit.
The output shape
The JSON has a top-level width / height / dpi / unit describing the document, an array of fonts used, and a pages array. Each page contains a children array of typed elements:
- type: "text", with text content, fontFamily, fontSize, fontWeight, fill, x/y/width/height, rotation, alignment.
- type: "image", with base64 src, crop region, position, opacity.
- type: "svg", with vector shapes re-emitted as inline SVG, plus position and size.
- type: "line", with stroke segments, color, dash, position.
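The shape described above can be sketched as TypeScript types, which makes it easier to walk the tree safely. This is a rough sketch, not the official schema: the top-level fields and the four element types come from the description above, while exact property names such as `text` for the text content, and the helper `listTextElements`, are assumptions for illustration.

```typescript
// Sketch of the output shape. Property names not listed in the text
// (for example `text` for the text content) are assumptions.
interface Box { x: number; y: number; width: number; height: number; }

interface TextElement extends Box {
  type: "text";
  text: string; // assumed name for the text content
  fontFamily: string;
  fontSize: number;
  fill: string;
  rotation?: number;
}
interface ImageElement extends Box { type: "image"; src: string; opacity?: number; }
interface SvgElement extends Box { type: "svg"; src: string; }
interface LineElement extends Box { type: "line"; color?: string; }

type DesignElement = TextElement | ImageElement | SvgElement | LineElement;

interface Page { children: DesignElement[]; }

interface Design {
  width: number;
  height: number;
  dpi: number;
  unit: string;
  fonts: unknown[];
  pages: Page[];
}

// Hypothetical helper: collect every text element with its page index,
// in document order.
function listTextElements(design: Design): Array<{ page: number; element: TextElement }> {
  const result: Array<{ page: number; element: TextElement }> = [];
  design.pages.forEach((page, pageIndex) => {
    for (const element of page.children) {
      // The `type` field discriminates the union, so this narrows to TextElement.
      if (element.type === "text") {
        result.push({ page: pageIndex, element });
      }
    }
  });
  return result;
}

// Hand-written sample data, not real converter output.
const sample: Design = {
  width: 612, height: 792, dpi: 72, unit: "pt", fonts: [],
  pages: [
    {
      children: [
        { type: "text", text: "Quarterly report", fontFamily: "Helvetica", fontSize: 24,
          fill: "#000000", x: 40, y: 40, width: 300, height: 30 },
        { type: "image", src: "data:image/png;base64,", x: 40, y: 90, width: 200, height: 100 },
      ],
    },
  ],
};

const textElements = listTextElements(sample);
```

Because every element carries a `type` discriminator, a single `if` or `switch` on that field is enough to handle each kind of element separately.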
Use cases
- LLM ingestion: give the model not just text but also visual context (positions, fonts) for layout-aware tasks such as contract review, form understanding, and quote extraction.
- Automated edits: redact a phrase, swap a template variable, change a color, then re-export to PDF.
- Editor handoff: load the JSON into a Polotno editor in your app and let the user customize the design. That's the live demo on this page.
- Schema-driven storage: store designs as JSON rows in your database instead of opaque PDF blobs.
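Two of these use cases, automated edits and LLM ingestion, come down to plain tree transforms on the JSON. The sketch below assumes text elements expose their content as `text`; the helpers `redactPhrase` and `describeForLLM` and the minimal types are hypothetical and not part of any Polotno package.

```typescript
// Minimal types for this sketch only; the property name `text` is an assumption.
interface TextItem {
  type: "text"; text: string; x: number; y: number; fontSize: number; fontFamily: string;
}
interface OtherItem { type: "image" | "svg" | "line"; }
type Item = TextItem | OtherItem;
interface PageLike { children: Item[]; }
interface DesignLike { pages: PageLike[]; }

// Return a copy of the design with every occurrence of `phrase` in text
// elements replaced by `mask`. The input design is not mutated.
function redactPhrase(design: DesignLike, phrase: string, mask = "[REDACTED]"): DesignLike {
  if (phrase === "") return design; // nothing to redact
  return {
    ...design,
    pages: design.pages.map((page) => ({
      ...page,
      children: page.children.map((item) =>
        item.type === "text"
          ? { ...item, text: item.text.split(phrase).join(mask) }
          : item
      ),
    })),
  };
}

// Flatten text elements into one line each, keeping page, position, and font,
// so a language model sees layout hints alongside the words.
function describeForLLM(design: DesignLike): string {
  const lines: string[] = [];
  design.pages.forEach((page, index) => {
    for (const item of page.children) {
      if (item.type === "text") {
        lines.push(
          `p${index + 1} (${Math.round(item.x)}, ${Math.round(item.y)}) ` +
          `size=${item.fontSize} ${item.fontFamily}: ${item.text}`
        );
      }
    }
  });
  return lines.join("\n");
}

// Hand-written sample data, not real converter output.
const original: DesignLike = {
  pages: [
    {
      children: [
        { type: "text", text: "Invoice for ACME Corp", x: 40.2, y: 99.6,
          fontSize: 18, fontFamily: "Helvetica" },
        { type: "image" },
      ],
    },
  ],
};

const redacted = redactPhrase(original, "ACME Corp");
const prompt = describeForLLM(redacted);
```

Note that this only rewrites the text content. It does not resize the element to fit the new string, and a phrase that is split across several text elements or baked into an embedded image would need separate handling.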
The same conversion in code
import { pdfToJson } from "@polotno/pdf-import";
import { createStore } from "polotno/model/store";

// `file` is a File or Blob, for example from an <input type="file"> element
const buffer = await file.arrayBuffer();
const json = await pdfToJson({ pdf: buffer });

// json is a Polotno design: load it into a store, edit, re-export
const store = createStore({ key: "YOUR_KEY" });
store.loadJSON(json);

Full API reference: PDF Import docs.
