OCR Recognition

Uses on-device GPU acceleration to recognize text in a specified screen area. Multiple languages are supported.

💡 Tip

The system automatically detects the phone's language, so there is no need to specify a language type manually.

Why not integrate third-party OCR? See OCR Guide for details.

⚠️ Important Limitation

Because iOS enforces strict memory limits on background apps, the built-in OCR is not suitable for large-area document scanning.

Recognizing an overly large area, or recognizing large areas frequently, may cause:

Memory usage to exceed the limit
The screen mirroring service to crash and disconnect

Recommendations:

✅ Suitable: buttons, titles, short text, and other small-area text
❌ Not suitable: full-page document scanning, large paragraph recognition, PDF text extraction, etc.

For document scanning or large-area text recognition, use the Screen Push interface with a computer-side OCR service.

Interface Description

Interface Type

ocr

Parameters

Parameter	Type	Required	Description
deviceId	string	Yes	Device ID
rect	array	Yes	Recognition area `[x1, y1, x2, y2]`

Return Value

Returns an array of text blocks. Each text block contains:

javascript

[
    {
        text: "Recognized text",         // Recognized text content
        boundingBox: {                   // Text bounding box
            x: 100,                      // Top-left X coordinate
            y: 100,                      // Top-left Y coordinate
            width: 200,                  // Width
            height: 50                   // Height
        },
        confidence: 0.95,                // Recognition confidence (0-1)
        x: 200,                          // Text center point X coordinate
        y: 125                           // Text center point Y coordinate
    },
    // ... more text blocks
]
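The relationship between the fields above can be sketched with two small helpers. This is an illustrative sketch rather than part of the interface: `boundingBoxToRect` and `boundingBoxCenter` are hypothetical names, and it assumes that `boundingBox` and the center point `x`, `y` share one coordinate space, as the sample values suggest.

```javascript
// Hypothetical helper: convert a boundingBox {x, y, width, height}
// into the [x1, y1, x2, y2] form used by the rect parameter.
function boundingBoxToRect(box) {
    return [box.x, box.y, box.x + box.width, box.y + box.height];
}

// Hypothetical helper: geometric center of a boundingBox.
// For the sample block this matches the returned x and y fields.
function boundingBoxCenter(box) {
    return { x: box.x + box.width / 2, y: box.y + box.height / 2 };
}

// Sample text block copied from the return value shown above
const sampleBlock = {
    text: 'Recognized text',
    boundingBox: { x: 100, y: 100, width: 200, height: 50 },
    confidence: 0.95,
    x: 200,
    y: 125
};

console.log(boundingBoxToRect(sampleBlock.boundingBox)); // [ 100, 100, 300, 150 ]
console.log(boundingBoxCenter(sampleBlock.boundingBox)); // { x: 200, y: 125 }
```

In most cases the returned `x` and `y` can be used directly; the helpers are mainly useful when only the box is stored.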

Basic Usage

javascript

// Recognize text in specified screen area
const results = await apiInvoke('ocr', {
    deviceId: 'P72578581E07',
    rect: [100, 100, 500, 200]    // Recognition area: top-left(100,100) to bottom-right(500,200)
});

// Iterate through all recognized text blocks
results.forEach(item => {
    console.log(`Text: ${item.text}`);
    console.log(`Position: (${item.x}, ${item.y})`);
    console.log(`Confidence: ${item.confidence}`);
});

// Get all text content
const allText = results.map(item => item.text).join(' ');
console.log('Full text:', allText);
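A common follow-up is to look for one specific label, such as a button caption, in the returned array. The sketch below works on a plain results array so it runs without a device. `findTextBlock` is a hypothetical helper, and the default confidence threshold of 0.8 is an assumption to tune per use case, not a documented value.

```javascript
// Hypothetical helper: return the most confident block whose text
// contains `target` (case-insensitive), or null if nothing qualifies.
function findTextBlock(results, target, minConfidence = 0.8) {
    const needle = target.toLowerCase();
    const matches = results.filter(item =>
        item.confidence >= minConfidence &&
        item.text.toLowerCase().includes(needle)
    );
    if (matches.length === 0) {
        return null;
    }
    // Keep the candidate with the highest confidence
    return matches.reduce((best, item) =>
        item.confidence > best.confidence ? item : best
    );
}

// Made-up sample data in the shape of the ocr return value
const sampleResults = [
    { text: 'Cancel', confidence: 0.97, x: 120, y: 640 },
    { text: 'Confirm', confidence: 0.93, x: 300, y: 640 },
    { text: 'confirm order', confidence: 0.55, x: 200, y: 300 }
];

const button = findTextBlock(sampleResults, 'confirm');
if (button) {
    console.log(`Found "${button.text}" at (${button.x}, ${button.y})`);
} else {
    console.log('Text not found');
}
```

With real data, pass the array returned by `apiInvoke('ocr', ...)` in place of `sampleResults`.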

Feature Description

Recognition Area

The rect parameter defines the recognition area in the format [x1, y1, x2, y2]:

x1, y1 - Coordinates of the top-left corner of the area
x2, y2 - Coordinates of the bottom-right corner of the area

Area Setting Recommendations:

The area should contain the complete text content
Avoid including too much background or other interfering elements
A smaller area means faster recognition and lower memory usage
Ensure the area is within the screen bounds
Avoid recognizing an overly large area (a single recognition area should not exceed 1/2 of the screen)
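The recommendations above can be checked before calling the interface. The following is a minimal sketch under stated assumptions: `normalizeRect` and `checkRect` are hypothetical helpers, the screen size is passed in by the caller (how to obtain it is outside this page), and the 750 x 1334 values are illustrative only.

```javascript
// Hypothetical helper: order the corners and clamp them to the screen.
function normalizeRect(rect, screenWidth, screenHeight) {
    const [x1, y1, x2, y2] = rect;
    const clamp = (value, max) => Math.min(Math.max(value, 0), max);
    return [
        clamp(Math.min(x1, x2), screenWidth),
        clamp(Math.min(y1, y2), screenHeight),
        clamp(Math.max(x1, x2), screenWidth),
        clamp(Math.max(y1, y2), screenHeight)
    ];
}

// Hypothetical helper: report whether a rect follows the guidance,
// i.e. it is non-empty and covers at most half of the screen.
function checkRect(rect, screenWidth, screenHeight) {
    const [left, top, right, bottom] = normalizeRect(rect, screenWidth, screenHeight);
    const area = (right - left) * (bottom - top);
    const fraction = area / (screenWidth * screenHeight);
    return {
        rect: [left, top, right, bottom],
        fraction,
        ok: area > 0 && fraction <= 0.5
    };
}

// Illustrative screen size; substitute the real device dimensions
const small = checkRect([100, 100, 500, 200], 750, 1334);
console.log(small.ok, small.rect);

const fullScreen = checkRect([-50, 0, 800, 1334], 750, 1334);
console.log(fullScreen.ok, fullScreen.rect);
```

When `ok` is false, shrinking the area to the text of interest is usually preferable to sending the oversized rect.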

Language Support

The system automatically detects the text language and supports more than 18 major languages.

Use Cases

💡 Suitable Scenarios

The built-in OCR is suitable for recognizing:

✅ UI button text, titles, prompts
✅ Short message content, notification text
✅ In-game text prompts, numeric information
✅ Form fields, input box content

Not suitable:

❌ Full-page document scanning
❌ Long article recognition
❌ PDF text extraction
❌ Large-area continuous text recognition

For document scanning scenarios, use the Screen Push interface to send images to a computer-side OCR service for processing.
