
OCR Recognition

Recognizes text in a specified screen area using on-device GPU acceleration. Multiple languages are supported.

💡 Tip

The system automatically detects the phone's language, so there is no need to specify a language type manually.

Why not integrate third-party OCR? See the OCR Guide for details.

⚠️ Important Limitation

Because iOS enforces strict memory limits on background apps, the built-in OCR is not suitable for large-area document scanning.

Recognizing an overly large area, or recognizing large areas frequently, may cause:

  • Memory usage exceeding the limit
  • The screen mirroring service crashing and disconnecting

Recommendations:

  • ✅ Suitable: Buttons, titles, short text, and other small-area text
  • ❌ Not suitable: Full-page document scanning, large paragraph recognition, PDF text extraction, etc.

For document scanning or large-area text recognition, use the Screen Push interface together with a computer-side OCR service.

Interface Description

Interface Type

ocr

Parameters

Parameter | Type | Required | Description
deviceId | string | Yes | Device ID
rect | array | Yes | Recognition area [x1, y1, x2, y2]

Return Value

Returns an array of text blocks. Each text block contains:

javascript
[
    {
        text: "Recognized text",         // Recognized text content
        boundingBox: {                   // Text bounding box
            x: 100,                      // Top-left X coordinate
            y: 100,                      // Top-left Y coordinate
            width: 200,                  // Width
            height: 50                   // Height
        },
        confidence: 0.95,                // Recognition confidence (0-1)
        x: 200,                          // Text center point X coordinate
        y: 125                           // Text center point Y coordinate
    },
    // ... more text blocks
]
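
As a minimal sketch of working with this return shape, the helper below keeps only blocks above a confidence threshold and sorts them in reading order (top to bottom, then left to right). The name `filterBlocks` and the 0.8 threshold are illustrative assumptions, not part of the API, and the sample data simply mirrors the structure shown above.

```javascript
// Hypothetical helper (not part of the API): keep confident blocks and
// order them top-to-bottom, then left-to-right, using boundingBox.
function filterBlocks(blocks, minConfidence = 0.8) {
    return blocks
        .filter(block => block.confidence >= minConfidence)
        .sort((a, b) =>
            a.boundingBox.y - b.boundingBox.y ||
            a.boundingBox.x - b.boundingBox.x
        );
}

// Sample data in the documented return format (values are made up)
const sampleBlocks = [
    { text: 'Cancel', boundingBox: { x: 300, y: 100, width: 120, height: 50 }, confidence: 0.95, x: 360, y: 125 },
    { text: 'OK',     boundingBox: { x: 100, y: 100, width: 100, height: 50 }, confidence: 0.91, x: 150, y: 125 },
    { text: '???',    boundingBox: { x: 100, y: 10,  width: 60,  height: 30 }, confidence: 0.35, x: 130, y: 25 }
];

const reliable = filterBlocks(sampleBlocks);
console.log(reliable.map(block => block.text));   // [ 'OK', 'Cancel' ]
```

Because `filter` returns a new array, the original result array is left unchanged. A suitable threshold depends on the content being recognized.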

Basic Usage

javascript
// Recognize text in specified screen area
const results = await apiInvoke('ocr', {
    deviceId: 'P72578581E07',
    rect: [100, 100, 500, 200]    // Recognition area: top-left(100,100) to bottom-right(500,200)
});

// Iterate through all recognized text blocks
results.forEach(item => {
    console.log(`Text: ${item.text}`);
    console.log(`Position: (${item.x}, ${item.y})`);
    console.log(`Confidence: ${item.confidence}`);
});

// Get all text content
const allText = results.map(item => item.text).join(' ');
console.log('Full text:', allText);
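
A common follow-up is locating a specific label in the results, for example to find where a button sits. The sketch below searches a result array for the first block containing a given string and returns its center point. `findTextCenter` is a hypothetical helper, and the sample array stands in for a real `ocr` response. Whether the coordinates are relative to the screen or to `rect` is not stated here, so check that before acting on them.

```javascript
// Hypothetical helper: return the center point of the first block whose
// text contains `needle` (case-insensitive), or null when nothing matches.
function findTextCenter(blocks, needle) {
    const target = needle.toLowerCase();
    const match = blocks.find(block => block.text.toLowerCase().includes(target));
    return match ? { x: match.x, y: match.y } : null;
}

// Stand-in for the array returned by apiInvoke('ocr', ...)
const sampleResults = [
    { text: 'Settings', confidence: 0.97, x: 200, y: 125 },
    { text: 'Log in',   confidence: 0.93, x: 320, y: 480 }
];

console.log(findTextCenter(sampleResults, 'log in'));    // { x: 320, y: 480 }
console.log(findTextCenter(sampleResults, 'Sign up'));   // null
```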

Feature Description

Recognition Area

The rect parameter defines the recognition area in the format [x1, y1, x2, y2]:

  • x1, y1 - Top-left corner of the area
  • x2, y2 - Bottom-right corner of the area

Area Setting Recommendations:

  • The area should contain the complete text
  • Avoid including excess background or distracting elements
  • A smaller area means faster recognition and lower memory usage
  • Ensure the area lies within the screen bounds
  • Avoid overly large areas (a single recognition area should not exceed 1/2 of the screen)
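
The last two recommendations can be checked before calling the interface. The following is a minimal sketch, assuming the screen dimensions are known in the same coordinate space as rect. `prepareRect` and the 750 x 1334 screen size are hypothetical and not provided by the API.

```javascript
// Hypothetical helper: clamp a rect to the screen and compare its area with
// the recommended limit of half the screen.
function prepareRect(rect, screenWidth, screenHeight) {
    const [x1, y1, x2, y2] = rect;
    // Force a coordinate into the range [0, max]
    const clamp = (value, max) => Math.min(Math.max(value, 0), max);
    // Normalize corner order, then clamp into the screen bounds
    const clamped = [
        clamp(Math.min(x1, x2), screenWidth),
        clamp(Math.min(y1, y2), screenHeight),
        clamp(Math.max(x1, x2), screenWidth),
        clamp(Math.max(y1, y2), screenHeight)
    ];
    const area = (clamped[2] - clamped[0]) * (clamped[3] - clamped[1]);
    const ratio = area / (screenWidth * screenHeight);
    // Empty rects and rects above half the screen are flagged
    return { rect: clamped, ratio, withinLimit: area > 0 && ratio <= 0.5 };
}

// Screen size is an assumed example value
console.log(prepareRect([100, 100, 500, 200], 750, 1334).withinLimit);   // true
console.log(prepareRect([-20, 0, 900, 1334], 750, 1334).withinLimit);    // false
```

When `withinLimit` is false, the caller can shrink the area or fall back to a computer-side OCR service instead of invoking the built-in recognition.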

Language Support

The system automatically detects the language of the text and supports more than 18 major languages.

Use Cases

💡 Suitable Scenarios

Built-in OCR is suitable for recognizing:

  • ✅ UI button text, titles, prompts
  • ✅ Short message content, notification text
  • ✅ In-game text prompts, numeric information
  • ✅ Form fields, input box content

Not suitable:

  • ❌ Full-page document scanning
  • ❌ Long article recognition
  • ❌ PDF text extraction
  • ❌ Large-area continuous text recognition

For document scanning scenarios, use the Screen Push interface to send images to a computer-side OCR service for processing.

Cooperation: try.catch@foxmail.com