paddleocr-doc-parsing

⚠ Review · Scanned 2/17/2026

This skill parses images and PDFs with PaddleOCR and provides a shell script, scripts/paddleocr_parse.sh, for submitting files. The script runs shell commands, base64-encodes local files, and POSTs them to the endpoint set in PADDLEOCR_API_URL using the access token in PADDLEOCR_ACCESS_TOKEN.

from clawhub.ai · vf074b44 · 7.8 KB · 0 installs

Scanned from 1.0.2 at f074b44

$ vett add clawhub.ai/bobholamovic/paddleocr-doc-parsing

PaddleOCR Document Parsing

Parse images and PDF files using PaddleOCR's API. Supports multiple document parsing algorithms with structured output.

Resource Links

Resource	Link
Official Website	https://www.paddleocr.com
API Documentation	https://ai.baidu.com/ai-doc/AISTUDIO/Cmkz2m0ma
GitHub	https://github.com/PaddlePaddle/PaddleOCR

Key Features

Multi-format support: PDF and image files (JPG, PNG, BMP, TIFF)
Layout analysis: Automatic detection of text blocks, tables, formulas
Multi-language: Support for 110+ languages
Structured output: Markdown format with preserved document structure

Setup

Obtain credentials from the PaddleOCR official website. Click the “API” button, choose the desired algorithm (e.g., PP-StructureV3, PaddleOCR-VL-1.5), and copy the API URL and the access token.
Set environment variables:

export PADDLEOCR_API_URL="https://your-endpoint-here"
export PADDLEOCR_ACCESS_TOKEN="your_access_token"
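
Before invoking the script, it can help to confirm that both variables are actually set. The following is a minimal sketch in Python; `REQUIRED_VARS` and `missing_vars` are hypothetical helper names used for illustration and are not part of the skill.

```python
import os

# The two variables named in the Setup section.
REQUIRED_VARS = ("PADDLEOCR_API_URL", "PADDLEOCR_ACCESS_TOKEN")


def missing_vars(env=None):
    """Return the names of required variables that are unset or empty.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to exercise in isolation.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

An empty list means both values are present; otherwise the returned names can be shown to the user before any request is attempted.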

Usage Examples

Run Script

# Parse local image
{baseDir}/paddleocr_parse.sh document.jpg

# Parse local PDF file
{baseDir}/paddleocr_parse.sh -t pdf document.pdf

# Parse document from URL
{baseDir}/paddleocr_parse.sh -t pdf https://example.com/document.pdf

# Output to stdout (default)
{baseDir}/paddleocr_parse.sh document.jpg

# Save output to file
{baseDir}/paddleocr_parse.sh -o result.json document.jpg
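
The examples above accept both local paths and URLs. As a rough illustration of that distinction, the sketch below passes an http(s) URL through unchanged and base64-encodes the bytes of a local file. The helper names (`is_url`, `encode_source`, `guess_type`) are hypothetical, and both the URL pass-through and the extension-based type guess are assumptions made for illustration, not a description of what paddleocr_parse.sh actually does.

```python
import base64
from pathlib import Path
from urllib.parse import urlparse


def is_url(source):
    """True if `source` looks like an http(s) URL rather than a local path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def encode_source(source):
    """Return a URL unchanged, or the base64 text of a local file's bytes.

    Assumption: remote documents are referenced by URL, local ones are
    inlined as base64.
    """
    if is_url(source):
        return source
    return base64.b64encode(Path(source).read_bytes()).decode("ascii")


def guess_type(source):
    """Guess "pdf" or "image" from the file extension (illustrative only)."""
    path = urlparse(source).path if is_url(source) else source
    return "pdf" if path.lower().endswith(".pdf") else "image"
```

Guessing the type from the extension is only a convenience; the examples above pass `-t pdf` explicitly for PDF input.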

Response Structure

{
  "logId": "unique_request_id",
  "errorCode": 0,
  "errorMsg": "Success",
  "result": {
    "layoutParsingResults": [
      {
        "prunedResult": [...],
        "markdown": {
          "text": "# Document Title\n\nParagraph content...",
          "images": {}
        },
        "outputImages": [...],
        "inputImage": "http://input-image"
      }
    ],
    "dataInfo": {...}
  }
}

Important Fields:

prunedResult - Contains detailed layout element information including positions, categories, etc.
markdown - Stores the document content converted to Markdown format with preserved structure and formatting.
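
To show how these fields might be consumed, here is a minimal Python sketch that reads a saved response (for example, one written with `-o result.json`) and collects the Markdown text of each entry in `layoutParsingResults`. The function names are hypothetical, and treating any non-zero `errorCode` as a failure is an assumption based only on the sample response above.

```python
import json
from typing import List


def extract_markdown(response: dict) -> List[str]:
    """Return the markdown text of every layout parsing result.

    Assumption: errorCode 0 means success, as in the sample response;
    any other value is raised as an error with errorMsg and logId.
    """
    if response.get("errorCode") != 0:
        raise RuntimeError(
            "parse failed: {} (logId={})".format(
                response.get("errorMsg"), response.get("logId")
            )
        )
    results = response.get("result", {}).get("layoutParsingResults", [])
    return [item.get("markdown", {}).get("text", "") for item in results]


def load_result(path: str) -> List[str]:
    """Load a response saved as JSON and extract its markdown texts."""
    with open(path, encoding="utf-8") as handle:
        return extract_markdown(json.load(handle))
```

Joining the returned strings with blank lines gives a single Markdown document; `prunedResult` can be read from the same entries when element positions are needed.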

Quota Information

See official documentation: https://ai.baidu.com/ai-doc/AISTUDIO/Xmjclapam