Metadata-Version: 2.4
Name: ocrmypdf-appleocr
Version: 0.3.0
Summary: Plugin to run OCRmyPDF with Apple Vision Framework OCR engine
Author-email: Masahiro Kiyota <hiro@juzbox.com>
License: MIT License
        
        Copyright (c) 2025 Masahiro Kiyota
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
        
Project-URL: Homepage, https://github.com/mkyt/OCRmyPDF-AppleOCR
Project-URL: Repository, https://github.com/mkyt/OCRmyPDF-AppleOCR.git
Keywords: pdf,ocr,optical character recognition,apple,vision,ocrmypdf,plugin
Classifier: Development Status :: 4 - Beta
Classifier: Environment :: Plugins
Classifier: Intended Audience :: End Users/Desktop
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Topic :: Scientific/Engineering :: Image Recognition
Classifier: Topic :: Software Development :: Libraries
Classifier: Topic :: Text Processing :: Linguistic
Classifier: Topic :: Text Processing :: Markup
Classifier: Topic :: Utilities
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pyobjc-core
Requires-Dist: pyobjc-framework-Vision
Requires-Dist: pyobjc-framework-Cocoa
Requires-Dist: ocrmypdf>=14.2.1
Requires-Dist: Pillow>=10.0.1
Requires-Dist: pikepdf>=8.10.1
Dynamic: license-file

# OCRmyPDF AppleOCR

A plugin for [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF/) that enables optical character recognition (OCR) using the text detection capabilities of Apple’s [Vision Framework](https://developer.apple.com/documentation/vision) on macOS.

Apple’s proprietary OCR implementation provides excellent accuracy and speed compared to other on-device OCR engines such as Tesseract.

## Installation

The package is available on [PyPI](https://pypi.org/project/ocrmypdf-appleocr/).

```bash
pip install ocrmypdf-appleocr
```

## Usage

To use the plugin, pass the `--plugin` option when invoking `ocrmypdf`. You can also specify the OCR language with the `-l` or `--language` option. To enable automatic language detection, use `und` (undetermined) as the language code.

```bash
ocrmypdf -l jpn --plugin ocrmypdf_appleocr input.pdf output.pdf
```
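
When OCR is driven from a script rather than a shell, the same invocation can be assembled as an argument list. The sketch below is only an illustration: `build_ocr_command` is a hypothetical helper that is not part of the plugin, and it uses only the options shown in this README.

```python
def build_ocr_command(input_pdf: str, output_pdf: str, language: str = "und") -> list[str]:
    """Return the argv list for an ocrmypdf run that loads the AppleOCR plugin.

    Hypothetical helper for illustration; it mirrors the shell command above.
    """
    return [
        "ocrmypdf",
        "-l", language,                     # a single language code, or "und"
        "--plugin", "ocrmypdf_appleocr",    # load this plugin
        input_pdf,
        output_pdf,
    ]


cmd = build_ocr_command("input.pdf", "output.pdf", language="jpn")
print(" ".join(cmd))
```

On a Mac where `ocrmypdf` and this plugin are installed, the list can be passed to `subprocess.run(cmd, check=True)`.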

## Options

- `--appleocr-recognition-mode`: Recognition mode for Apple Vision OCR. Choices: `fast`, `accurate`, or `livetext`. Default: `livetext` on macOS 13 and later, `accurate` on macOS 12 and earlier.
- `--appleocr-disable-correction`: Disable language correction in Apple Vision OCR. Default: `False`.
- `--pdf-renderer`: Renderer used to embed OCR results as invisible (“phantom”) text. Choices: `hocr`, `sandwich`. Default: `sandwich`.
- `-l` or `--language`: Specify the OCR language as an ISO 639-2 three-letter code (see Supported Languages for the exact codes). Use `und` for an undetermined language. Joining multiple languages with `+` (e.g. `eng+fra`) for multilingual documents is **not supported**.

Automatic language detection (`und`) is **not supported** in `livetext` mode.
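
The restrictions above can be checked before launching a long OCR job. The following is a minimal sketch, assuming a hypothetical `check_language_option` helper. It is not the plugin's own validation code and only encodes the rules stated in this section.

```python
RECOGNITION_MODES = ("fast", "accurate", "livetext")


def check_language_option(language: str, mode: str) -> None:
    """Raise ValueError for combinations this README lists as unsupported.

    Hypothetical pre-flight check; the plugin may report these cases differently.
    """
    if mode not in RECOGNITION_MODES:
        raise ValueError(f"unknown recognition mode: {mode!r}")
    if "+" in language:
        # e.g. "eng+fra": multiple languages are not supported
        raise ValueError("multiple languages joined with '+' are not supported")
    if language == "und" and mode == "livetext":
        raise ValueError("automatic language detection is not supported in livetext mode")


check_language_option("jpn", "livetext")   # accepted
check_language_option("und", "accurate")   # accepted
```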

### Recognition Modes

The `fast` and `accurate` modes use [VNRecognizeTextRequest](https://developer.apple.com/documentation/vision/vnrecognizetextrequest?language=objc) from Apple's Vision framework.

The `livetext` mode uses the newer [ImageAnalyzer](https://developer.apple.com/documentation/visionkit/imageanalyzer) API from the VisionKit framework.
Although this API is officially Swift-only, it can be accessed through the private API (`VKCImageAnalyzer`) via `pyobjc`.

The key difference is that LiveText supports **vertical text layout in East Asian languages**, which is not handled properly by the older API.
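
The default mode described under Options depends on the macOS version. The sketch below shows one way to express that rule. `default_recognition_mode` is a hypothetical function written for illustration, and the plugin's actual detection logic may differ.

```python
import platform


def default_recognition_mode(macos_version: str) -> str:
    """Map a macOS version string such as "13.4.1" to the documented default mode."""
    if not macos_version:
        raise ValueError("empty version string; not running on macOS?")
    major = int(macos_version.split(".")[0])
    # livetext on macOS 13 and later, accurate on macOS 12 and earlier
    return "livetext" if major >= 13 else "accurate"


# platform.mac_ver()[0] is the macOS release string, or "" on other systems.
current = platform.mac_ver()[0]
if current:
    print(default_recognition_mode(current))
```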

### PDF Renderers

This plugin supports two [OCRmyPDF renderers](https://ocrmypdf.readthedocs.io/en/latest/advanced.html#changing-the-pdf-renderer): `hocr` and `sandwich`.
The default is `sandwich`.

- **sandwich:**
  The plugin renders OCR output as a PDF layer with invisible text, which OCRmyPDF then merges with the original page image.
- **hocr:**
  The plugin outputs OCR results as [hOCR markup](https://kba.github.io/hocr-spec/1.2/), and OCRmyPDF converts the markup to PDF.

Because the hOCR format cannot represent vertical text in East Asian (CJK) scripts, the `hocr` renderer cannot accurately reproduce vertical text layouts.
However, OCRmyPDF’s built-in hOCR-to-PDF conversion is more mature and may perform better in other scenarios.
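
The trade-off between the two renderers can be written down as a small decision rule. This is one possible policy rather than a recommendation from the plugin. `pick_pdf_renderer` and its parameters are hypothetical names used only for this sketch.

```python
def pick_pdf_renderer(has_vertical_cjk_text: bool, prefer_hocr: bool = False) -> str:
    """Choose a value for --pdf-renderer under the constraints described above."""
    if has_vertical_cjk_text:
        # hOCR cannot represent vertical CJK text, so stay with sandwich
        return "sandwich"
    # Otherwise either renderer works; sandwich is the plugin default
    return "hocr" if prefer_hocr else "sandwich"


print(pick_pdf_renderer(has_vertical_cjk_text=True, prefer_hocr=True))
```

The returned value can be passed on the command line as `--pdf-renderer <value>`.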

### Supported Languages

As of macOS Tahoe 26, the following languages are supported by Apple Vision OCR:

|   Language code  |   Language name            |   Fast mode  |   Accurate mode  |   LiveText  |
|------------------|----------------------------|--------------|------------------|-------------|
|   eng            |   English                  |   ✓          |   ✓              |   ✓         |
|   fra            |   French                   |   ✓          |   ✓              |   ✓         |
|   ita            |   Italian                  |   ✓          |   ✓              |   ✓         |
|   deu            |   German                   |   ✓          |   ✓              |   ✓         |
|   spa            |   Spanish                  |   ✓          |   ✓              |   ✓         |
|   por            |   Portuguese               |   ✓          |   ✓              |   ✓         |
|   chi_sim        |   Chinese (Simplified)     |              |   ✓              |   ✓         |
|   chi_tra        |   Chinese (Traditional)    |              |   ✓              |   ✓         |
|   yue_sim        |   Cantonese (Simplified)   |              |   ✓              |   ✓         |
|   yue_tra        |   Cantonese (Traditional)  |              |   ✓              |   ✓         |
|   kor            |   Korean                   |              |   ✓              |   ✓         |
|   jpn            |   Japanese                 |              |   ✓              |   ✓         |
|   rus            |   Russian                  |              |   ✓              |   ✓         |
|   ukr            |   Ukrainian                |              |   ✓              |   ✓         |
|   tha            |   Thai                     |              |   ✓              |   ✓         |
|   vie            |   Vietnamese               |              |   ✓              |   ✓         |
|   ara            |   Arabic                   |              |   ✓              |   ✓         |
|   ars            |   Arabic (Najdi)           |              |   ✓              |   ✓         |
|   tur            |   Turkish                  |              |   ✓              |   ✓         |
|   ind            |   Indonesian               |              |   ✓              |   ✓         |
|   ces            |   Czech                    |              |   ✓              |   ✓         |
|   dan            |   Danish                   |              |   ✓              |   ✓         |
|   nld            |   Dutch                    |              |   ✓              |   ✓         |
|   nor            |   Norwegian                |              |   ✓              |   ✓         |
|   nno            |   Norwegian (Nynorsk)      |              |   ✓              |   ✓         |
|   nob            |   Norwegian (Bokmål)       |              |   ✓              |   ✓         |
|   msa            |   Malay                    |              |   ✓              |   ✓         |
|   pol            |   Polish                   |              |   ✓              |   ✓         |
|   ron            |   Romanian                 |              |   ✓              |   ✓         |
|   swe            |   Swedish                  |              |   ✓              |   ✓         |
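
For scripts that need to pick a mode per language, the table above can be encoded as a lookup. This is a sketch based only on the table as of macOS Tahoe 26. `supported_modes` is a hypothetical helper, and the sets would need updating if Apple changes language support.

```python
# Languages with a check mark in the "Fast mode" column
FAST_MODE_LANGUAGES = frozenset({"eng", "fra", "ita", "deu", "spa", "por"})

# Every language code listed in the table
ALL_LANGUAGES = FAST_MODE_LANGUAGES | frozenset({
    "chi_sim", "chi_tra", "yue_sim", "yue_tra", "kor", "jpn", "rus", "ukr",
    "tha", "vie", "ara", "ars", "tur", "ind", "ces", "dan", "nld", "nor",
    "nno", "nob", "msa", "pol", "ron", "swe",
})


def supported_modes(language: str) -> list[str]:
    """Return the recognition modes the table lists for a language code."""
    if language not in ALL_LANGUAGES:
        return []
    modes = ["accurate", "livetext"]
    if language in FAST_MODE_LANGUAGES:
        modes.insert(0, "fast")
    return modes


print(supported_modes("jpn"))
```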


## Acknowledgements

This project incorporates and references code from the following projects:

- [straussmaximilian/ocrmac](https://github.com/straussmaximilian/ocrmac) - for invoking `VKCImageAnalyzer` (LiveText API) via `pyobjc`
- [ocrmypdf/OCRmyPDF-EasyOCR](https://github.com/ocrmypdf/OCRmyPDF-EasyOCR) - for PDF rendering of recognized text
