Metadata-Version: 2.4
Name: nova-act-mcp-server
Version: 3.0.0
Summary: An MCP server providing tools to control web browsers using the Amazon Nova Act SDK
Author-email: Jacob Taunton <jandrewt82@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/madtank/nova-act-mcp
Project-URL: Bug Tracker, https://github.com/madtank/nova-act-mcp/issues
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.28.1
Requires-Dist: mcp>=1.6.0
Requires-Dist: pydantic>=2.11.4
Requires-Dist: nova-act>=1.0.2408.0
Requires-Dist: playwright==1.48.0
Requires-Dist: fastmcp==2.2.5
Requires-Dist: asyncio-extras>=1.3.2
Requires-Dist: rich>=14.0.0
Requires-Dist: structlog>=25.3.0
Requires-Dist: tenacity>=9.1.2
Requires-Dist: uvloop>=0.21.0; sys_platform != "win32"
Provides-Extra: dev
Requires-Dist: pytest>=8.3.5; extra == "dev"
Requires-Dist: pytest-asyncio>=0.26.0; extra == "dev"
Requires-Dist: black>=25.1.0; extra == "dev"
Requires-Dist: isort>=6.0.1; extra == "dev"
Requires-Dist: mypy>=1.15.0; extra == "dev"
Requires-Dist: ruff>=0.11.7; extra == "dev"
Requires-Dist: pytest-dotenv>=0.5.2; extra == "dev"
Requires-Dist: python-dotenv>=1.0.0; extra == "dev"
Provides-Extra: sse
Dynamic: license-file

# nova-act-mcp
[![PyPI](https://img.shields.io/pypi/v/nova-act-mcp-server)](https://pypi.org/project/nova-act-mcp-server/)

**nova‑act‑mcp‑server** is a zero‑install [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) server that exposes [Amazon Nova Act](https://nova.amazon.com/act) browser‑automation tools.

## What's New in v3.0.0
- **On-Demand Screenshots**: New `inspect_browser` tool to explicitly request screenshots only when needed
- **Reduced Token Usage**: Browser actions no longer automatically include screenshots, saving context space
- **More Efficient Workflows**: Agents can now control when to get visual feedback
- **Better Performance**: Response payloads are smaller because screenshots are sent only on request

### New `inspect_browser` Tool Example

```python
# Start a browser session
start_result = await control_browser(action="start", url="https://example.com")
session_id = start_result["session_id"]

# Execute an action without getting a screenshot
execute_result = await control_browser(
    action="execute",
    session_id=session_id,
    instruction="Click on the 'More information...' link"
)

# Now explicitly request a screenshot to see the result
inspect_result = await inspect_browser(session_id=session_id)
```

Example output from `inspect_browser` (image data truncated):

```json
{
  "session_id": "f8a53291-b3a7-4e1e-8c9d-9a12b3c45d67",
  "current_url": "https://www.iana.org/domains/reserved",
  "page_title": "IANA — IANA-managed Reserved Domains",
  "content": [
    {
      "type": "image_base64",
      "data": "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQEASABIAAD/2wBDAAMCA...",
      "caption": "Current viewport"
    },
    {
      "type": "text",
      "text": "Current URL: https://www.iana.org/domains/reserved\nPage Title: IANA — IANA-managed Reserved Domains"
    }
  ],
  "agent_thinking": [],
  "success": true
}
```
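
The `content` array in the example output can be unpacked on the client side. The following is a minimal sketch, assuming the response has already been parsed into a Python dict with the shape shown above; `extract_images` is a hypothetical helper name, not part of nova-act-mcp.

```python
import base64


def extract_images(result: dict) -> list:
    """Return the decoded bytes of every image_base64 item in a result dict."""
    images = []
    for item in result.get("content", []):
        if item.get("type") != "image_base64":
            continue
        data = item.get("data", "")
        # Strip a data-URI prefix such as "data:image/jpeg;base64," if present.
        if data.startswith("data:") and "," in data:
            data = data.split(",", 1)[1]
        images.append(base64.b64decode(data))
    return images


# Hypothetical response with a tiny stand-in payload instead of a real JPEG.
sample = {
    "content": [
        {
            "type": "image_base64",
            "data": "data:image/jpeg;base64," + base64.b64encode(b"fake-jpeg").decode("ascii"),
        },
        {"type": "text", "text": "Current URL: https://example.com"},
    ],
    "success": True,
}
images = extract_images(sample)
```

Text items are skipped, so the helper returns one `bytes` object per screenshot, which can then be written to disk or passed to an image library.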

## What's New in v0.2.9
- **Improved Screenshot Reliability**: More dependable screenshot delivery in responses
- **Enhanced Log Path Discovery**: Smart, efficient path tracking for logs and screenshots
- **Better Agent Communication**: Clear messaging when screenshots can't be embedded
- **Improved Performance**: Eliminated inefficient directory scanning for faster responses

## What's New in v0.2.8
- **Enhanced Inline Screenshots**: Screenshots now appear directly in the response `content` array
- Improved compatibility with vision-capable models like Claude
- Screenshots include descriptive captions based on the executed instruction
- Each screenshot is delivered as `{ type: "image_base64", data: "..." }` in the content array

## What's New in v0.2.7
- **Automatic Inline Screenshots**: Every browser action now includes an optimized screenshot
- Improved screenshot quality and reliability for AI agents
- Added environment variables to customize screenshot quality and size limits
- Comprehensive test coverage ensuring screenshots work in all scenarios

### New Feature: Inline Screenshots

As of v0.2.7, every successful `execute` response contains `inline_screenshot`, a base64-encoded JPEG of the current viewport (v3.0.0 later replaced this automatic behavior with the on-demand `inspect_browser` tool):
- Quality ≈ 45, hard-capped at 250 KB (configurable via the `NOVA_MCP_MAX_INLINE_IMG` environment variable)
- If the raw JPEG is larger than the cap, the field is `null`
- No extra API calls are needed; screenshots are included automatically
- For full-resolution images and HAR/HTML logs, use the `compress_logs` tool
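
The size cap described above can be sketched as follows. This illustrates the rule only and is not the server's actual code: `encode_inline_screenshot` is a hypothetical name, and reading `NOVA_MCP_MAX_INLINE_IMG` as a byte count with a default of 250 * 1024 is an assumption.

```python
import base64
import os
from typing import Optional

# Assumption: the variable holds a size in bytes; 250 KB is the documented cap.
DEFAULT_CAP_BYTES = 250 * 1024


def encode_inline_screenshot(jpeg_bytes: bytes, cap_bytes: Optional[int] = None) -> Optional[str]:
    """Return base64 text for the JPEG, or None when the raw bytes exceed the cap."""
    if cap_bytes is None:
        cap_bytes = int(os.environ.get("NOVA_MCP_MAX_INLINE_IMG", DEFAULT_CAP_BYTES))
    if len(jpeg_bytes) > cap_bytes:
        return None  # becomes null once the response is serialized to JSON
    return base64.b64encode(jpeg_bytes).decode("ascii")
```

The check runs against the raw JPEG size, before base64 encoding, which matches the wording of the rule above.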

## What's New in v0.2.6
- Added compatibility with NovaAct SDK 0.9+ by normalizing log directory handling
- Improved test organization with clear markers for unit, mock, smoke and e2e tests
- Moved mock HTML creation logic from production code to test helpers
- Fixed several syntax errors and incomplete code blocks
- Added a `SCREENSHOT_QUALITY` constant for consistent compression settings

## Quick start (uvx)

Add it to your MCP client configuration:

```jsonc
{
  "mcpServers": {
    "nova-act-mcp-server": {
      "command": "uvx",
      "args": ["nova-act-mcp-server@latest"],
      "env": { "NOVA_ACT_API_KEY": "<your_api_key>" }
    }
  }
}
```
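
If the client configuration already lists other servers, the entry can be merged in rather than pasted by hand. A minimal sketch, assuming the configuration is plain JSON (no comments) already loaded into a dict; `add_nova_act_server` is a hypothetical helper, and where the config file lives depends on the client.

```python
import copy
import json


def add_nova_act_server(config: dict, api_key: str) -> dict:
    """Return a copy of config with the nova-act-mcp-server entry added."""
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers["nova-act-mcp-server"] = {
        "command": "uvx",
        "args": ["nova-act-mcp-server@latest"],
        "env": {"NOVA_ACT_API_KEY": api_key},
    }
    return merged


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "other"}}}
updated = add_nova_act_server(existing, "<your_api_key>")
print(json.dumps(updated, indent=2))
```

Working on a deep copy leaves the original dict untouched, so the result can be inspected before anything is written back to disk.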

That's all you need to start controlling browsers from any MCP‑compatible client such as Claude Desktop or VS Code.

## Local development (optional)

```bash
git clone https://github.com/madtank/nova-act-mcp.git
cd nova-act-mcp
uv sync
uv run nova_mcp.py
```

## License
[MIT](LICENSE)
