dingfeng.wong e0e5edbfdc a
2025-07-22 01:51:19 +08:00
2025-07-22 01:17:49 +08:00
a
2025-07-22 01:51:19 +08:00
2025-07-22 01:17:49 +08:00
2025-07-22 01:17:49 +08:00
2025-07-22 01:17:49 +08:00
a
2025-07-22 01:51:19 +08:00
a
2025-07-22 01:51:19 +08:00
a
2025-07-22 01:26:24 +08:00
a
2025-07-22 01:51:19 +08:00

Tooling

A collection of useful command-line tools.

OCR Screenshot Tool

A cross-platform CLI tool that takes screenshots, performs OCR using DocTR (state-of-the-art deep learning OCR), and copies the result to clipboard. Features intelligent text formatting preservation and optional image annotation.

Features

  • 🌍 Cross-platform - Works on Windows, macOS, and Linux
  • Multiple screenshot methods - Choose the fastest for your system
  • 🔍 Advanced OCR - Uses DocTR with PARSeq recognition model
  • 📝 Smart formatting - Preserves text layout and indentation
  • 🎨 Image annotation - Visualize detected text regions
  • 📋 Clipboard integration - Automatic text copying

Installation

Basic installation:

pip install .

With cross-platform screenshot support:

# For fastest screenshots (recommended)
pip install ".[screenshot-fast]"

# For full automation features (region selection)
pip install ".[screenshot-full]"

# For maximum compatibility (all backends)
pip install ".[screenshot-all]"

Install specific screenshot libraries:

pip install mss          # Fastest (~30x faster than others)
pip install pyautogui    # Interactive region selection
pip install pyscreenshot # Multiple backends

Usage

Basic Commands

Take a screenshot and perform OCR:

ocr-screenshot

With verbose output and annotation:

ocr-screenshot --verbose --annotate --save-image

Screenshot Methods

Choose your preferred screenshot method:

# Auto-detect best method (default)
ocr-screenshot --screenshot-method auto

# Use MSS (fastest)
ocr-screenshot --screenshot-method mss

# Use PyAutoGUI (supports region selection)
ocr-screenshot --screenshot-method pyautogui

# Use Pillow ImageGrab (built-in)
ocr-screenshot --screenshot-method pillow

# Interactive region selection
ocr-screenshot --screenshot-method interactive

# macOS native (region selection with drag)
ocr-screenshot --screenshot-method macos

Advanced Features

Save screenshot with annotation showing detected text:

ocr-screenshot --save-image --annotate --show-words --show-text

Capture specific monitor (MSS method):

ocr-screenshot --screenshot-method mss --monitor-number 2

Full annotation with all detection levels:

ocr-screenshot --annotate --show-words --show-lines --show-blocks --show-text --save-image

Screenshot Method Comparison

Method Speed Region Selection Cross-Platform Notes
mss Fastest (crop after) ~30x faster, recommended
pyautogui Slow Interactive Best for region selection
pillow Slow Coordinates Built into Pillow
pyscreenshot Variable Coordinates Multiple backends
macos Fast Native UI 🍎 macOS only Native drag selection

How it works

  1. Screenshot: Multiple cross-platform methods available

    • Auto: Tries best method for your platform
    • MSS: Fastest full-screen capture
    • Interactive: Guided region selection
    • macOS: Native drag-to-select interface
  2. OCR: Advanced DocTR processing

    • Uses state-of-the-art PARSeq recognition model
    • Preserves text layout and indentation
    • Handles multiple languages
  3. Annotation (optional): Visual feedback

    • Word-level bounding boxes (red)
    • Line-level groupings (green)
    • Block-level sections (blue)
    • Text overlay showing detected content
  4. Output: Formatted text copied to clipboard

Command Line Options

ocr-screenshot [OPTIONS]

Options:
  --lang TEXT                     Language code for OCR (default: eng)
  --save-image                    Save the screenshot image
  --output-dir PATH               Directory to save images (default: ~/Desktop)
  --verbose                       Show detailed output
  --annotate                      Create annotated image with detection boxes
  --show-words                    Show word-level boxes (default: True)
  --show-lines                    Show line-level boxes
  --show-blocks                   Show block-level boxes  
  --show-text                     Overlay detected text on image
  --screenshot-method TEXT        Method: auto, mss, pyautogui, pillow, pyscreenshot, macos, interactive
  --monitor-number INTEGER        Monitor to capture (MSS method only, 0=all)
  --help                          Show this message and exit

Examples

Quick OCR with fastest method:

ocr-screenshot --screenshot-method mss

Debug OCR accuracy with annotations:

ocr-screenshot --annotate --show-words --show-text --save-image --verbose

Interactive region selection:

ocr-screenshot --screenshot-method interactive --save-image

Multi-monitor setup (capture monitor 2):

ocr-screenshot --screenshot-method mss --monitor-number 2

Development Guide

How to Add New Packages

To add a new production dependency (e.g., 'requests'):

uv add requests

To add a new development dependency (e.g., 'ipdb'):

uv add --dev ipdb

After adding dependencies, always re-generate requirements.txt:

uv pip compile pyproject.toml -o requirements.txt

How to Build Packages

To build your project's distributable packages (.whl, .tar.gz):

python -m build

Or using the virtual environment directly:

./venv/bin/python -m build

Offline Build

To build offline packages for deployment:

./dev_scripts/build_offline.sh

This will create offline_packages/ with all dependencies and install.sh

S
Description
No description provided
Readme 3.2 MiB
Languages
Python 99.4%
Shell 0.6%