Metadata-Version: 2.4
Name: gui-agent-screenshot-tools
Version: 1.0.0
Summary: Screenshot, bounding-box, and coordinate transformation utilities for GUI agents and other GUI-related tasks.
Project-URL: Homepage, https://github.com/PoorRican/gui-agent-screenshot-tools
Project-URL: Repository, https://github.com/PoorRican/gui-agent-screenshot-tools
Project-URL: Issues, https://github.com/PoorRican/gui-agent-screenshot-tools/issues
Author-email: Josue Figueroa <poor.rican@pm.me>
License-Expression: MIT
License-File: LICENSE
Keywords: automation,computer use agent,coordinates,gui,gui agent,resize,screenshot
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: pillow>=10.0
Requires-Dist: pydantic>=2.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0; extra == 'dev'
Description-Content-Type: text/markdown

# gui-agent-screenshot-tools

Coordinate, bounding-box, and screenshot resize utilities for GUI agents.

This library helps you map detections between:
- original screenshot space
- resized screenshot space (letterbox or stretch)
- physical screen space (with optional window offsets)

## Installation

```bash
pip install gui-agent-screenshot-tools
```

## Quick Start

```python
from gui_agent_screenshot_tools import Coordinate, Space

source = Space(width=1920, height=1080)
target = Space(width=1280, height=720)

coord = Coordinate(x=960, y=540, space=source)
mapped = coord.to_space(target)
print(mapped.x, mapped.y)  # 640 360
```
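
For intuition, the result above is consistent with plain proportional scaling along each axis. The following is a minimal standalone sketch, assuming that is how a conversion without resize metadata behaves. `scale_point` is a hypothetical helper for illustration, not part of this package.

```python
def scale_point(x, y, src_size, dst_size):
    """Scale (x, y) from a source (width, height) space into a destination space.

    Assumption: each axis is scaled independently and rounded to an integer.
    """
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return round(x * dst_w / src_w), round(y * dst_h / src_h)


# Same numbers as the Quick Start example
print(scale_point(960, 540, (1920, 1080), (1280, 720)))  # (640, 360)
```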

## Directional APIs

Use directional methods when you have resize metadata:
- `to_source_space(...)`: resized -> original
- `to_target_space(...)`: original -> resized

Compatibility aliases are still available:
- `Coordinate.to_space(..., resize_metadata=...)`
- `ResizeMetadata.transform_coordinate(...)`
- `ResizeMetadata.forward_transform_coordinate(...)`
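
The two directions can be pictured as a transform and its inverse. The sketch below works through the letterbox case in standalone Python, assuming the image is scaled uniformly and the padding is centered. The function names are hypothetical and do not reflect the library's internals.

```python
def letterbox_params(src_size, dst_size):
    """Return (scale, pad_x, pad_y) for fitting src inside dst with centered padding."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    scale = min(dst_w / src_w, dst_h / src_h)  # uniform scale keeps the aspect ratio
    pad_x = (dst_w - src_w * scale) / 2
    pad_y = (dst_h - src_h * scale) / 2
    return scale, pad_x, pad_y


def to_target(x, y, params):
    """original -> resized: scale first, then shift by the padding."""
    scale, pad_x, pad_y = params
    return x * scale + pad_x, y * scale + pad_y


def to_source(x, y, params):
    """resized -> original: remove the padding, then undo the scale."""
    scale, pad_x, pad_y = params
    return (x - pad_x) / scale, (y - pad_y) / scale


params = letterbox_params((1920, 1080), (1024, 1024))
tx, ty = to_target(960, 540, params)
sx, sy = to_source(tx, ty, params)
print(round(tx), round(ty))  # 512 512
print(round(sx), round(sy))  # 960 540
```

For a 1920x1080 source in a 1024x1024 target, this gives a scale of about 0.533 and 224 pixels of padding above and below the image content.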

## Workflow 1: Detection In Resized Image -> Original Coordinates

```python
from PIL import Image
from gui_agent_screenshot_tools import Coordinate, ResizeMode, Screenshot, Space

img = Image.new("RGB", (1920, 1080), color=(255, 0, 0))
original = Screenshot.from_image(img)

resized = original.resize(Space(width=1024, height=1024), ResizeMode.LETTERBOX)
model_point = Coordinate(x=512, y=512, space=resized.space)

# Helper: map a point from the resized screenshot back to the original
point_in_original = resized.map_coord_to_original(model_point)
print(point_in_original.x, point_in_original.y)  # 960 540
```

## Workflow 2: Window-Local BBox -> Screen Coordinates

```python
from PIL import Image
from gui_agent_screenshot_tools import BBox, ResizeMode, Screenshot, Space

screen = Space(width=2560, height=1440)
window = BBox(x=50, y=100, width=1920, height=1080, space=screen)

# Screenshot is of the window content only
img = Image.new("RGB", (1920, 1080), color=(80, 120, 180))
original = Screenshot.from_image(img)
resized = original.resize(Space(width=1024, height=1024), ResizeMode.LETTERBOX)

# Model output in resized screenshot space
model_bbox = BBox(x=100, y=200, width=300, height=150, space=resized.space)

# One call: resized -> original -> screen (via window offset)
screen_bbox = resized.map_bbox_to_original(model_bbox, offset=window)
print(screen_bbox.x, screen_bbox.y, screen_bbox.width, screen_bbox.height)
# 238 100 561 236
```
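
The last step of this workflow, from window-local to screen coordinates, can be thought of as a translation by the window's top-left corner. A minimal standalone sketch, assuming the offset is a pure translation with no scaling. The helper names are hypothetical and not part of this package.

```python
def local_to_screen(x, y, window_origin):
    """Translate a window-local point into screen coordinates."""
    origin_x, origin_y = window_origin
    return x + origin_x, y + origin_y


def local_bbox_to_screen(x, y, width, height, window_origin):
    """Translate a window-local bbox; a translation leaves width and height unchanged."""
    screen_x, screen_y = local_to_screen(x, y, window_origin)
    return screen_x, screen_y, width, height


window_origin = (50, 100)  # top-left corner of the window used in Workflow 2
print(local_to_screen(0, 0, window_origin))                      # (50, 100)
print(local_to_screen(1920, 1080, window_origin))                # (1970, 1180)
print(local_bbox_to_screen(10, 20, 300, 150, window_origin))     # (60, 120, 300, 150)
```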

## Common Mistakes

- Using a coordinate whose `space` does not match the direction of the transform.
  Example: `to_source_space` expects coordinates in `metadata.target_space`.
- Calling the screenshot remap helpers before `resize(...)`.
  `map_coord_*` and `map_bbox_*` require `resize_metadata`, so call them on a resized screenshot.
- Passing an `offset` bbox whose `offset.as_space` does not match the local space of the transformed result.
- Passing a raw string as the mode to `resize`.
  Use `ResizeMode.LETTERBOX` or `ResizeMode.STRETCH`.
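
One way to catch the first mistake early in your own code is to check a coordinate's space before transforming it. The sketch below is standalone Python. `Size` and `require_space` are hypothetical stand-ins for illustration, not the library's `Space` class or its validation logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    """Hypothetical stand-in for a coordinate space (not the library's Space)."""
    width: int
    height: int


def require_space(actual: Size, expected: Size, label: str) -> None:
    """Raise early if a coordinate's space is not the one a transform expects."""
    if actual != expected:
        raise ValueError(
            f"{label}: expected {expected.width}x{expected.height}, "
            f"got {actual.width}x{actual.height}"
        )


resized = Size(1024, 1024)
original = Size(1920, 1080)

require_space(resized, resized, "to_source input")  # matches, so nothing is raised

try:
    require_space(original, resized, "to_source input")
except ValueError as err:
    print(err)  # to_source input: expected 1024x1024, got 1920x1080
```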

## Development

```bash
uv run --extra dev python -m pytest -q
```

## License

MIT
