Extractor

This page describes the DeckFlow Extractor API and its task parameters. The Extractor supports optical character recognition (OCR) on images, as well as font detection and text extraction for PPTX presentations.

API Endpoints

File extraction tasks are processed asynchronously. Use the two endpoints below to create and retrieve tasks.

POST /tools/tasks

Creates an asynchronous extraction task. Pass the corresponding extractor tool identifier in type.

GET /tools/tasks/:id

Queries the execution status of an extraction task and retrieves the final output download link via result.url.
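
Because tasks are asynchronous, a client typically polls GET /tools/tasks/:id until result.url is present. The sketch below shows one way to do that in Python. It assumes a hypothetical fetch_task callable that performs the GET request and returns the decoded JSON body as a dict. The only response field taken from this page is result.url; status values and error reporting are not documented here, so the sketch simply gives up after a fixed number of attempts.

```python
import time
from typing import Any, Callable, Dict, Optional


def wait_for_result_url(
    fetch_task: Callable[[str], Dict[str, Any]],  # hypothetical: wraps GET /tools/tasks/:id
    task_id: str,
    interval_s: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll a task until result.url appears; return None if it never does."""
    for attempt in range(max_attempts):
        task = fetch_task(task_id)
        # "result" may be missing or null while the task is still running (assumption).
        url = (task.get("result") or {}).get("url")
        if url:
            return url
        if attempt < max_attempts - 1:
            sleep(interval_s)
    return None
```

Passing fetch_task and sleep as arguments keeps the polling logic independent of any particular HTTP client and makes it easy to exercise without network access. If a notifyURL is supplied when the task is created, polling may not be needed at all.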

Request Headers

| Header | Type | Required | Description |
| --- | --- | --- | --- |
| Authorization | String | Yes | Bearer <YOUR_API_KEY>, where <YOUR_API_KEY> is your API key. |
| Content-Type | String | Yes (POST) | Must be multipart/form-data. |

POST Request Body Parameters (Form Data)

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| files | File | Yes | The source file to process. Multiple files may be uploaded. |
| type | String | Yes | Task type identifier, for example image.ocr. |
| params | String (JSON) | No | JSON string of tool-specific parameters. Defaults to "{}" (an empty JSON object). |
| notifyURL | String | No | Webhook callback URL that receives task status update notifications. |
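
The non-file form fields can be assembled as plain strings before handing them to an HTTP client. The following Python sketch is one possible way to do that; the helper names build_task_form and auth_headers are illustrative and not part of the API. Note that params is sent as a JSON string, not as a nested form structure.

```python
import json
from typing import Any, Dict, Optional


def build_task_form(
    task_type: str,
    params: Optional[Dict[str, Any]] = None,
    notify_url: Optional[str] = None,
) -> Dict[str, str]:
    """Build the text fields of the multipart body for POST /tools/tasks."""
    if not task_type:
        raise ValueError("type is required")
    # params is serialized to a JSON string; an omitted value becomes "{}".
    fields = {"type": task_type, "params": json.dumps(params or {})}
    if notify_url:
        fields["notifyURL"] = notify_url
    return fields


def auth_headers(api_key: str) -> Dict[str, str]:
    """Build the Authorization header.

    Content-Type is deliberately not set here: a multipart/form-data header
    must carry a boundary parameter, which HTTP clients normally generate
    themselves when they encode the body.
    """
    return {"Authorization": f"Bearer {api_key}"}
```

With a client such as requests, these values would typically be passed as data= and headers=, with the uploads supplied separately under the files field name. The base URL of the API is not stated on this page, so it is left out of the sketch.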

Task Parameter Specifications

Image OCR

Extracts text from images in any of the supported languages listed below. Accepts image/* files, with a file size limit of 50 MB. The structure of params is as follows:

{
  "language": "zh-hans" // string. Recognition language. Options: "zh-hans" (Simplified Chinese) | "zh-hant" (Traditional Chinese) | "en" (English) | "ja" (Japanese) | "ko" (Korean) | "fr" (French) | "de" (German) | "es" (Spanish)
}
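
Checking the language code on the client side avoids submitting a task that cannot succeed. The Python sketch below validates against the options listed above and returns the JSON string expected in the params form field. The function name ocr_params is illustrative, and how the server treats an unsupported or missing language is not documented here.

```python
import json

# Language codes listed in the Image OCR parameter description.
OCR_LANGUAGES = {"zh-hans", "zh-hant", "en", "ja", "ko", "fr", "de", "es"}


def ocr_params(language: str) -> str:
    """Return the params JSON string for an image.ocr task."""
    if language not in OCR_LANGUAGES:
        raise ValueError(
            f"unsupported language {language!r}; expected one of {sorted(OCR_LANGUAGES)}"
        )
    return json.dumps({"language": language})
```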

PPTX Font Finder

Scans a presentation and lists all font families referenced in its slides. Accepts .pptx files, with a file size limit of 300 MB. params is an empty object, {}.

PPTX Text Extractor

Extracts text shapes from all slide layers, with optional structural filtering. Accepts .pptx files, with a file size limit of 300 MB. The structure of params is as follows:

{
  "hasChildren": false,       // boolean. Whether to include child layers (recursively extract text from child shapes)
  "pickPages": [1, 2, 5],     // array. Optional. List of slide page numbers to extract (1-based index)
  "imageFilter": 0.2,         // number. Optional. Filter elements based on area ratio relative to slide (0~1)
  "includeLayout": false,     // boolean. Whether to extract text from layout slides
  "includeMaster": false,     // boolean. Whether to extract text from master slides
  "includeNotes": false,      // boolean. Whether to extract text from notes slides
  "ignoreEmptyText": true,    // boolean. Whether to filter out empty text boxes
  "onlyBrief": true           // boolean. Whether to return a simplified summary structure (reduces payload size)
}
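
A small builder can catch out-of-range values before the task is submitted. The Python sketch below is one way to assemble the params string for the PPTX Text Extractor. The function name text_extractor_params is illustrative, the field names come from the structure above, and treating the 0 to 1 range of imageFilter as inclusive is an assumption. Fields that are not passed are omitted, leaving the server to apply its own defaults.

```python
import json
from typing import Any, Dict, Optional, Sequence

BOOLEAN_FLAGS = (
    "hasChildren", "includeLayout", "includeMaster",
    "includeNotes", "ignoreEmptyText", "onlyBrief",
)


def text_extractor_params(
    pick_pages: Optional[Sequence[int]] = None,
    image_filter: Optional[float] = None,
    **flags: bool,
) -> str:
    """Return the params JSON string for a PPTX text extraction task."""
    params: Dict[str, Any] = {}
    for name, value in flags.items():
        if name not in BOOLEAN_FLAGS:
            raise ValueError(f"unknown flag: {name}")
        if not isinstance(value, bool):
            raise TypeError(f"{name} must be a boolean")
        params[name] = value
    if pick_pages is not None:
        # Page numbers are 1-based; bool is rejected because it subclasses int.
        for page in pick_pages:
            if isinstance(page, bool) or not isinstance(page, int) or page < 1:
                raise ValueError("pickPages entries must be integers >= 1")
        params["pickPages"] = list(pick_pages)
    if image_filter is not None:
        if not 0 <= image_filter <= 1:  # inclusive bounds are an assumption
            raise ValueError("imageFilter must be between 0 and 1")
        params["imageFilter"] = image_filter
    return json.dumps(params)
```

For example, text_extractor_params(pick_pages=[1, 2, 5], includeNotes=True) yields a string that can be sent directly as the params form field.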