Slide Conversion

Slide conversion transforms PDFs or images into editable PPTX files using OCR.

Use Cases

This tool is perfect for converting slides from:

  • NotebookLM - Slides generated by Google's AI notebook tool
  • Gamma - AI presentation generator
  • Canva - Exported PDF presentations
  • Screenshots - Any slide screenshots or photos

How to Access

There are two ways to access the "PPTX Converter Tool":

Option 1: From Slides Mode

In "Slides" mode, click the "PPTX Converter Tool" card to enter.

[Image: PPTX Converter Tool Entry]

Option 2: From History

Click any image in the history, and you'll see a "Convert to PPTX" button. Clicking it will bring that record's images into the conversion tool.

Interface Overview

[Image: PPTX Converter Tool Interface]

The interface is divided into two main sections:

Image Preview

The left section displays your uploaded slide images. You can navigate through each page. The "Merged" and "Raw" buttons at the bottom toggle between different OCR result display modes.

Settings

The right section provides various conversion settings:

Setting             | Description
--------------------|------------------------------------------------------------------
Model               | Choose Server (higher accuracy) or Mobile (lightweight)
OCR Engine          | WebGPU uses GPU acceleration; Worker runs in background thread
Setting Mode        | Apply to All or Per Page settings
Text Removal Method | OpenCV.js (free) or Gemini API (consumes API quota)
Algorithm           | Text removal algorithm: NS or TELEA
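
One way to picture the "Apply to All" and "Per Page" modes is a global settings object plus optional per-page overrides. The sketch below is only an illustration: the type names, field names, and `resolveSettings` are hypothetical and not taken from the tool's source.

```typescript
// Hypothetical settings shape; names are illustrative only.
type TextRemovalMethod = "opencv" | "gemini";
type InpaintAlgorithm = "NS" | "TELEA";

interface PageSettings {
  textRemoval: TextRemovalMethod;
  algorithm: InpaintAlgorithm;
}

interface ConverterSettings {
  mode: "all" | "perPage";
  global: PageSettings;
  // Partial overrides keyed by zero-based page index (used only in "perPage" mode).
  perPage: Map<number, Partial<PageSettings>>;
}

// Resolve the effective settings for one page.
function resolveSettings(settings: ConverterSettings, pageIndex: number): PageSettings {
  if (settings.mode === "all") {
    return settings.global;
  }
  const override = settings.perPage.get(pageIndex) ?? {};
  // Fields missing from the override fall back to the global value.
  return { ...settings.global, ...override };
}

const exampleSettings: ConverterSettings = {
  mode: "perPage",
  global: { textRemoval: "opencv", algorithm: "TELEA" },
  perPage: new Map<number, Partial<PageSettings>>([[2, { textRemoval: "gemini" }]]),
};

// Page 2 uses Gemini but keeps the global algorithm; other pages use the global settings.
const pageTwo = resolveSettings(exampleSettings, 2);
const pageZero = resolveSettings(exampleSettings, 0);
```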

Advanced Settings

Click the "Advanced" button next to OCR Engine to adjust detailed OCR parameters such as detection threshold, dilation factor, and layout analysis settings.

Once configured, click "Start PPTX Conversion" to begin processing.

Basic Usage

  1. Upload a PDF file or images
  2. Wait for OCR processing
  3. Preview and edit results
  4. Download PPTX file

Processing Flow

1. PDF to Images

Each page of the PDF is first converted to a high-resolution image.

2. OCR Text Recognition

Uses PaddleOCR models for text detection and recognition:

  • Text Detection: Locates text positions in the image
  • Text Recognition: Converts image text to editable text
  • Layout Analysis: Merges nearby text blocks into paragraphs

3. Text Removal

Removes the original text from each image by inpainting, using either OpenCV.js or the Gemini API.
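
Inpainting functions such as OpenCV's `inpaint` take the image plus a single-channel 8-bit mask whose non-zero pixels mark the area to fill. The sketch below shows how such a mask could be built from text regions. `Region`, `buildMask`, and the `padding` parameter are hypothetical names for illustration, not the tool's actual code.

```typescript
// Hypothetical axis-aligned text region in image pixel coordinates.
interface Region {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Build a row-major 8-bit mask: 255 inside any (padded) region, 0 elsewhere.
function buildMask(
  imageWidth: number,
  imageHeight: number,
  regions: Region[],
  padding = 0,
): Uint8Array {
  const mask = new Uint8Array(imageWidth * imageHeight);
  for (const r of regions) {
    // Clamp the padded rectangle to the image bounds.
    const x0 = Math.max(0, Math.floor(r.x - padding));
    const y0 = Math.max(0, Math.floor(r.y - padding));
    const x1 = Math.min(imageWidth, Math.ceil(r.x + r.width + padding));
    const y1 = Math.min(imageHeight, Math.ceil(r.y + r.height + padding));
    for (let y = y0; y < y1; y++) {
      // fill() is a no-op when the start index is not below the end index.
      mask.fill(255, y * imageWidth + x0, y * imageWidth + x1);
    }
  }
  return mask;
}

// A 4x3 image with one region covering two pixels on the middle row.
const exampleMask = buildMask(4, 3, [{ x: 1, y: 1, width: 2, height: 1 }]);
```

Under this scheme, a region that is left out of the `regions` array is simply not marked in the mask, so its pixels are never repainted.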

Automatic Fallback

When Gemini API is used for text removal and an API error occurs (such as RECITATION or quota exceeded), the system automatically falls back to OpenCV.js and continues processing, so the conversion is not interrupted.
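
The fallback described above follows a common try-then-fallback pattern. A minimal sketch, assuming hypothetical `TextRemover` functions that stand in for the Gemini and OpenCV.js code paths; it is written synchronously for brevity, whereas a real API call would be asynchronous:

```typescript
// Hypothetical signature: takes an image identifier, returns the cleaned image identifier.
type TextRemover = (imageId: string) => string;

interface RemovalResult {
  output: string;
  usedFallback: boolean;
  reason?: string; // error message that triggered the fallback, if any
}

// Try the primary remover; on any error, record the reason and use the fallback.
function removeTextWithFallback(
  imageId: string,
  primary: TextRemover,
  fallback: TextRemover,
): RemovalResult {
  try {
    return { output: primary(imageId), usedFallback: false };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { output: fallback(imageId), usedFallback: true, reason };
  }
}

// Stand-ins for illustration only: a primary remover that always fails and a local one that works.
const failingPrimary: TextRemover = () => {
  throw new Error("RECITATION");
};
const localRemover: TextRemover = (id) => `${id}:opencv`;

const removal = removeTextWithFallback("page-1", failingPrimary, localRemover);
```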

4. Generate PPTX

Combines processed background images with recognized text into a PPTX file.

OCR Settings

Execution Mode

Mode   | Description     | Use Case
-------|-----------------|-------------------------
WebGPU | GPU accelerated | Modern browsers, faster
WASM   | CPU only        | Better compatibility

Model Selection

Model  | Size   | Accuracy | Speed
-------|--------|----------|--------
Server | ~172MB | Higher   | Slower
Mobile | ~21MB  | Normal   | Faster

Automatic Fallback

The system has multiple automatic fallback mechanisms:

  • GPU Memory Insufficient: When WebGPU runs out of memory, it automatically switches to WASM Worker (CPU mode) to continue processing
  • Model Size Fallback: If the device cannot load the Server model, it automatically switches to the Mobile model

Notifications will be displayed during fallback so you know the current execution status.
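
The two fallbacks above can be modeled as a small pure function that maps the current configuration and the failure kind to the next configuration to try. This is a sketch under assumptions: `OcrConfig`, `OcrFailure`, and `nextOcrConfig` are hypothetical names, and the tool's real failure detection is not documented here.

```typescript
type Engine = "webgpu" | "wasm";
type Model = "server" | "mobile";

interface OcrConfig {
  engine: Engine;
  model: Model;
}

// Hypothetical failure kinds matching the two fallbacks in the list above.
type OcrFailure = "gpu-out-of-memory" | "model-load-failed";

// Return the next configuration to try, or null when no fallback remains.
function nextOcrConfig(current: OcrConfig, failure: OcrFailure): OcrConfig | null {
  if (failure === "gpu-out-of-memory" && current.engine === "webgpu") {
    return { ...current, engine: "wasm" }; // GPU -> CPU
  }
  if (failure === "model-load-failed" && current.model === "server") {
    return { ...current, model: "mobile" }; // large model -> small model
  }
  return null;
}

const afterGpuFailure = nextOcrConfig({ engine: "webgpu", model: "server" }, "gpu-out-of-memory");
```

Returning `null` when nothing is left to try gives the caller a clear point at which to report the error instead of retrying.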

Advanced Parameters

  • Detection Threshold: Adjust text detection sensitivity
  • Dilation Factor: Adjust text box size
  • Line Spacing Threshold: Affects paragraph merging
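
To illustrate how a line spacing threshold can affect paragraph merging, the sketch below groups lines into one paragraph when the vertical gap to the previous line is at most `lineSpacingThreshold` times that line's height. This interpretation of the threshold is an assumption, and `TextLine` and `groupIntoParagraphs` are hypothetical names; horizontal position is ignored for brevity.

```typescript
// Hypothetical recognized text line with its vertical position in pixels.
interface TextLine {
  text: string;
  top: number;
  height: number;
}

// Group lines top to bottom; start a new paragraph when the gap is too large.
function groupIntoParagraphs(lines: TextLine[], lineSpacingThreshold: number): string[][] {
  const sorted = [...lines].sort((a, b) => a.top - b.top);
  const paragraphs: string[][] = [];
  let previous: TextLine | undefined;
  for (const line of sorted) {
    const current = paragraphs[paragraphs.length - 1];
    const gap = previous ? line.top - (previous.top + previous.height) : Infinity;
    if (previous && current && gap <= lineSpacingThreshold * previous.height) {
      current.push(line.text);
    } else {
      paragraphs.push([line.text]);
    }
    previous = line;
  }
  return paragraphs;
}

const sampleLines: TextLine[] = [
  { text: "A", top: 0, height: 20 },
  { text: "B", top: 25, height: 20 }, // 5 px below A
  { text: "C", top: 100, height: 20 }, // 55 px below B
];

// With a threshold of 0.5 the allowed gap is 10 px, so A and B merge and C stands alone.
const looseGrouping = groupIntoParagraphs(sampleLines, 0.5);
// With a threshold of 0.1 the allowed gap is 2 px, so every line is its own paragraph.
const strictGrouping = groupIntoParagraphs(sampleLines, 0.1);
```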

Conversion Complete

[Image: Conversion Result]

After conversion completes, the interface shows:

  • Original: The original image
  • Processed: The processed image (text removed)
  • Thumbnails: All pages are shown at the bottom; a green checkmark indicates success

Bottom button descriptions:

Button       | Description
-------------|-----------------------------------------------
Merged       | Show merged text blocks after layout analysis
Raw          | Show original OCR detection boxes
Failed       | Show regions that failed recognition (if any)
Edit Regions | Enter edit mode to adjust text regions

Edit Mode

Click "Edit Regions" to enter edit mode and fine-tune the OCR results.

[Image: Edit Mode]

Toolbar

A toolbar appears at the top in edit mode with these tools:

Tool           | Shortcut              | Description
---------------|-----------------------|------------------------------------------------------------
Draw Rectangle | D                     | Drag to draw a new text region
Trapezoid      | T                     | Convert selected region to a quadrilateral with adjustable vertices (appears when a region is selected)
Separator      | S                     | Click two points to draw a separator line that prevents adjacent regions from merging
Select         | V                     | Drag to select regions for batch deletion
Undo / Redo    | Ctrl+Z / Ctrl+Shift+Z | Undo or redo edit operations
Reset          | R                     | Reset all edits to original OCR results
Done           | -                     | Save edits and exit edit mode

Keyboard Shortcuts

Edit mode supports the following keyboard shortcuts for efficient operation:

Shortcut                   | Function
---------------------------|----------------------------------------------------
D                          | Toggle draw rectangle mode
S                          | Toggle separator line mode
V                          | Toggle selection mode
T                          | Toggle trapezoid mode (requires a selected region)
R                          | Reset all edits
Ctrl+Z / Cmd+Z             | Undo
Ctrl+Shift+Z / Cmd+Shift+Z | Redo
Escape                     | Cancel current operation or exit mode
Delete / Backspace         | Delete selected separator line

Quick Switching

Use keyboard shortcuts to quickly switch between different tools without moving the mouse to click toolbar buttons.

Selecting and Adjusting Regions

Click any text region to select it. Once selected, you can:

  • Drag corners: Resize the region
  • Click center ✕: Delete the region

Deleting Failed Recognition Regions

If a region failed recognition (e.g., an icon was mistakenly detected as text), delete it. A deleted region is skipped during inpainting, so the original background image is preserved there.

Trapezoid Mode (Slanted Text)

When your slides contain slanted or angled text, rectangular regions cannot accurately frame them. Use "Trapezoid Mode" to convert a region into a quadrilateral with freely adjustable vertices.

[Image: Trapezoid Mode]

When to use:

  • Text arranged along diagonal lines
  • Text tilted due to perspective angles
  • Artistic or decorative text layouts

How to use:

  1. Click any text region to select it (a blue border will appear)
  2. Once selected, the "Trapezoid" button appears in the toolbar (trapezoid-shaped icon)
  3. Click the "Trapezoid" button to convert the region from a rectangle to a quadrilateral
  4. The blue circular corner handles become purple diamond handles
  5. Drag the purple handles to adjust each vertex position, fitting the region to the slanted text
  6. Click the "Revert to Rectangle" button to restore the region to a rectangle

Shape Restriction

The system checks that the quadrilateral stays valid (no self-intersection). If dragging a vertex causes edges to cross, the vertex automatically reverts to its position before the drag.
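
A four-sided shape self-intersects exactly when one pair of opposite edges crosses, which can be tested with orientation (cross product) checks. The sketch below shows one standard way to do this and to reject an invalid drag. All names (`Point`, `Quad`, `isSimpleQuad`, `dragVertex`) are hypothetical and not taken from the tool's code.

```typescript
interface Point {
  x: number;
  y: number;
}

// Four vertices in drawing order around the shape.
type Quad = [Point, Point, Point, Point];

// Twice the signed area of triangle (a, b, c); the sign tells which side of line ab point c lies on.
function cross(a: Point, b: Point, c: Point): number {
  return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// True when segments ab and cd properly cross (touching endpoints and collinear overlap are not counted).
function segmentsCross(a: Point, b: Point, c: Point, d: Point): boolean {
  return cross(a, b, c) * cross(a, b, d) < 0 && cross(c, d, a) * cross(c, d, b) < 0;
}

// A quadrilateral is simple when neither pair of opposite edges crosses.
function isSimpleQuad([p0, p1, p2, p3]: Quad): boolean {
  return !segmentsCross(p0, p1, p2, p3) && !segmentsCross(p1, p2, p3, p0);
}

// Apply a vertex drag, or keep the previous shape if the result would self-intersect.
function dragVertex(quad: Quad, index: 0 | 1 | 2 | 3, to: Point): Quad {
  const candidate: Quad = [quad[0], quad[1], quad[2], quad[3]];
  candidate[index] = to;
  return isSimpleQuad(candidate) ? candidate : quad;
}

const square: Quad = [
  { x: 0, y: 0 },
  { x: 10, y: 0 },
  { x: 10, y: 10 },
  { x: 0, y: 10 },
];

// Dragging the third vertex across the left edge would create a bowtie, so the drag is rejected.
const rejected = dragVertex(square, 2, { x: -5, y: 5 });
// A small adjustment keeps the shape simple, so it is accepted.
const accepted = dragVertex(square, 2, { x: 12, y: 9 });
```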

PPTX Export Behavior:

  • The rotation angle of each trapezoid region is calculated automatically from its slant
  • Text boxes are rotated by that angle, so they stay visually consistent with the original slide
  • Font size is calculated from the actual trapezoid height (not the bounding-box height), which keeps the text proportions correct
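
The difference between the slanted height and the bounding-box height is easy to see with a little geometry. The sketch below derives a rotation angle and a height from a quadrilateral's edge midpoints. The formulas are one plausible approach chosen for illustration, not necessarily the ones the tool uses, and all names are hypothetical.

```typescript
interface Point {
  x: number;
  y: number;
}

// Vertices in order: top-left, top-right, bottom-right, bottom-left (image coordinates, y grows downward).
type Quad = [Point, Point, Point, Point];

function midpoint(a: Point, b: Point): Point {
  return { x: (a.x + b.x) / 2, y: (a.y + b.y) / 2 };
}

function distance(a: Point, b: Point): number {
  return Math.hypot(b.x - a.x, b.y - a.y);
}

// Angle in degrees of the line joining the midpoints of the left and right edges.
// Because y grows downward, a positive angle appears clockwise on screen.
function rotationDegrees([tl, tr, br, bl]: Quad): number {
  const left = midpoint(tl, bl);
  const right = midpoint(tr, br);
  return (Math.atan2(right.y - left.y, right.x - left.x) * 180) / Math.PI;
}

// Height measured across the text: distance between the midpoints of the top and bottom edges.
function slantedHeight([tl, tr, br, bl]: Quad): number {
  return distance(midpoint(tl, tr), midpoint(bl, br));
}

// Height of the axis-aligned bounding box, for comparison.
function boundingBoxHeight(quad: Quad): number {
  const ys = quad.map((p) => p.y);
  return Math.max(...ys) - Math.min(...ys);
}

// A 50 x 10 text box tilted along the direction (3, 4).
const tilted: Quad = [
  { x: 8, y: 0 },
  { x: 38, y: 40 },
  { x: 30, y: 46 },
  { x: 0, y: 6 },
];

const angle = rotationDegrees(tilted); // about 53.13 degrees
const textHeight = slantedHeight(tilted); // 10
const boxHeight = boundingBoxHeight(tilted); // 46
```

In this example a font size based on the bounding box (46) would be several times too large, while the slanted height (10) matches the height of the text itself.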

Real Example:

The image below shows a complete editing case. In this pyramid diagram, slanted text like "(Thumbnails)" couldn't be detected correctly by OCR, so we:

  1. Manually drew a rectangular region to frame the text
  2. Used trapezoid mode to adjust the region shape to fit the slanted text
  3. Used separator lines to prevent adjacent regions from being incorrectly merged

[Image: Adjusted Result]

Iteration is Normal

Complex slides may require multiple adjustments to achieve ideal results. As shown below, multiple thumbnails appear beneath the processed version, representing different inpainting attempts. You can click different versions to compare results.

[Image: Multiple Inpaint Attempts]

Separator Lines

During layout analysis, nearby text regions are automatically merged into a single text block (which becomes one text box in the PPTX output). Separator lines prevent this automatic merging: regions on opposite sides of a separator become separate text boxes.

When to use:

  • When two unrelated text blocks are incorrectly merged into one
  • When a title and body text should be separate text boxes

How to use:

  1. Select the "Separator" tool
  2. Click the first point
  3. Click the second point to complete the line

Regions on either side of the separator line will be output as separate text boxes in the PPTX file.
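
One simple way to decide whether a separator lies between two regions is to check whether the segment joining their centers crosses the separator line. The sketch below uses that rule. It is an assumption made for illustration (the tool's actual criterion is not documented here), and `Box`, `Separator`, and `isSeparated` are hypothetical names.

```typescript
interface Point {
  x: number;
  y: number;
}

interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface Separator {
  start: Point;
  end: Point;
}

// Twice the signed area of triangle (a, b, c); the sign tells which side of line ab point c lies on.
function cross(a: Point, b: Point, c: Point): number {
  return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

// True when segments ab and cd properly cross.
function segmentsCross(a: Point, b: Point, c: Point, d: Point): boolean {
  return cross(a, b, c) * cross(a, b, d) < 0 && cross(c, d, a) * cross(c, d, b) < 0;
}

function center(box: Box): Point {
  return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

// Two regions must not merge if any separator crosses the line between their centers.
function isSeparated(a: Box, b: Box, separators: Separator[]): boolean {
  const ca = center(a);
  const cb = center(b);
  return separators.some((s) => segmentsCross(ca, cb, s.start, s.end));
}

const title: Box = { x: 0, y: 0, width: 100, height: 20 };
const body: Box = { x: 0, y: 30, width: 100, height: 40 };

// A horizontal separator drawn between the title and the body keeps them apart.
const between: Separator = { start: { x: 0, y: 25 }, end: { x: 100, y: 25 } };
// A separator drawn elsewhere on the slide has no effect on this pair.
const elsewhere: Separator = { start: { x: 200, y: 25 }, end: { x: 300, y: 25 } };

const keptApart = isSeparated(title, body, [between]);
const stillMergeable = !isSeparated(title, body, [elsewhere]);
```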

Click items in the right sidebar's region list to quickly locate and select the corresponding text region. This is particularly useful when dealing with many text blocks.

Batch Deletion

Use the "Select" tool to select multiple regions at once:

  1. Select the "Select" tool
  2. Drag to select regions to delete
  3. Click ✓ to confirm deletion

Export Options

PPTX Settings

Option            | Description
------------------|--------------------------
Line Height Ratio | Adjust text line spacing
Min Font Size     | Limit minimum font size
Max Font Size     | Limit maximum font size
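
As a rough illustration of how these three options can interact, the sketch below estimates a font size from a text box's height and line count, then clamps it to the configured range. The estimation formula is an assumption, and `PptxTextOptions` and `fitFontSize` are hypothetical names rather than the tool's actual API.

```typescript
// Hypothetical export options mirroring the table above (sizes in points).
interface PptxTextOptions {
  lineHeightRatio: number;
  minFontSize: number;
  maxFontSize: number;
}

// Estimate the font size that fits `lineCount` lines into the box, then clamp it.
function fitFontSize(boxHeightPt: number, lineCount: number, options: PptxTextOptions): number {
  const lines = Math.max(1, lineCount);
  // Each line occupies roughly fontSize * lineHeightRatio of vertical space.
  const estimated = boxHeightPt / (lines * options.lineHeightRatio);
  return Math.min(options.maxFontSize, Math.max(options.minFontSize, estimated));
}

const exportOptions: PptxTextOptions = { lineHeightRatio: 1.2, minFontSize: 8, maxFontSize: 72 };

const normal = fitFontSize(48, 2, exportOptions); // about 20: within range, left unchanged
const tiny = fitFontSize(6, 2, exportOptions); // raised to the minimum
const huge = fitFontSize(400, 1, exportOptions); // capped at the maximum
```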

Export Result Preview

The image below shows how the exported PPTX file appears in PowerPoint:

  • Trapezoid region text boxes are correctly rotated, aligned with the original slanted text
  • All text boxes are editable and can be modified directly
  • The background image has the original text removed, showing a clean background

[Image: PowerPoint Final Result]

Inpaint Before/After Comparison:

The images below show the inpaint processing effect for trapezoid regions. Left is before processing (original image), right is after processing (text removed):

[Images: Before Inpaint (left), After Inpaint (right)]

Supported Formats

  • Input: PDF, PNG, JPG, WebP
  • Output: PPTX (compatible with PowerPoint, Google Slides, Keynote)

Notes

  • Complex layouts (multi-column, overlapping text) may need manual adjustment
  • Handwritten text has lower recognition accuracy
  • Special fonts may be replaced with system fonts
