Metadata-Version: 2.4
Name: lybic-guiagents
Version: 0.7.4
Summary: An open-source agentic framework that enables AI to use computers like humans and can provide a multi-agent runtime environment as an infrastructure capability
Author: Lybic Development Team
Author-email: Lybic Development Team <lybic@tingyutech.com>
License-Expression: Apache-2.0
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.12, <3.15
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: dotenv
Requires-Dist: pyperclip
Requires-Dist: Pillow>=11.3.0
Requires-Dist: backoff
Requires-Dist: numpy
Requires-Dist: anthropic
Requires-Dist: openai
Requires-Dist: google-genai
Requires-Dist: zhipuai>=2.1.5.20250106
Requires-Dist: groq
Requires-Dist: boto3
Requires-Dist: scikit-learn
Requires-Dist: tiktoken
Requires-Dist: lybic<1.0.0,>=0.11.0
Requires-Dist: pyobjc; platform_system == "Darwin"
Requires-Dist: oa_atomacos; platform_system == "Darwin"
Requires-Dist: pywinauto; platform_system == "Windows"
Requires-Dist: pywin32; platform_system == "Windows"
Requires-Dist: pytest>=8.4.2
Requires-Dist: protobuf>=5.29.5
Requires-Dist: googleapis-common-protos>=1.70.0
Requires-Dist: filelock
Provides-Extra: postgres
Requires-Dist: asyncpg>=0.29.0; extra == "postgres"
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: anyio; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: pytest-tornasync; extra == "dev"
Requires-Dist: pytest-trio; extra == "dev"
Requires-Dist: pytest-twisted; extra == "dev"
Requires-Dist: twisted; extra == "dev"
Provides-Extra: grpc
Requires-Dist: grpcio; extra == "grpc"
Requires-Dist: grpcio-tools>=1.71.2; extra == "grpc"
Provides-Extra: prometheus
Requires-Dist: prometheus-client>=0.20.0; extra == "prometheus"
Requires-Dist: psutil>=5.9.0; extra == "prometheus"
Provides-Extra: mcp
Requires-Dist: mcp>=1.0.0; extra == "mcp"
Requires-Dist: fastapi>=0.100.0; extra == "mcp"
Requires-Dist: sse-starlette>=2.0.0; extra == "mcp"
Requires-Dist: uvicorn>=0.30.0; extra == "mcp"
Provides-Extra: vmware-and-evaluation
Requires-Dist: python-pptx; extra == "vmware-and-evaluation"
Requires-Dist: odfpy; extra == "vmware-and-evaluation"
Requires-Dist: openpyxl; extra == "vmware-and-evaluation"
Requires-Dist: pandas; extra == "vmware-and-evaluation"
Requires-Dist: lxml; extra == "vmware-and-evaluation"
Requires-Dist: xmltodict; extra == "vmware-and-evaluation"
Requires-Dist: opencv-python-headless; extra == "vmware-and-evaluation"
Requires-Dist: rapidfuzz; extra == "vmware-and-evaluation"
Requires-Dist: imagehash; extra == "vmware-and-evaluation"
Requires-Dist: odfpy; extra == "vmware-and-evaluation"
Requires-Dist: pdfplumber; extra == "vmware-and-evaluation"
Requires-Dist: PyYAML; extra == "vmware-and-evaluation"
Requires-Dist: mutagen; extra == "vmware-and-evaluation"
Requires-Dist: pypdf; extra == "vmware-and-evaluation"
Requires-Dist: PyMuPDF; extra == "vmware-and-evaluation"
Requires-Dist: formulas; extra == "vmware-and-evaluation"
Requires-Dist: pyacoustid; extra == "vmware-and-evaluation"
Requires-Dist: opencv-python~=4.8.1.78; extra == "vmware-and-evaluation"
Requires-Dist: PyPDF2; extra == "vmware-and-evaluation"
Requires-Dist: librosa; extra == "vmware-and-evaluation"
Requires-Dist: fastdtw; extra == "vmware-and-evaluation"
Requires-Dist: scikit-learn; extra == "vmware-and-evaluation"
Requires-Dist: scikit-image; extra == "vmware-and-evaluation"
Requires-Dist: bs4; extra == "vmware-and-evaluation"
Requires-Dist: azure-identity; extra == "vmware-and-evaluation"
Requires-Dist: azure-mgmt-compute; extra == "vmware-and-evaluation"
Requires-Dist: azure-mgmt-network; extra == "vmware-and-evaluation"
Requires-Dist: toml; extra == "vmware-and-evaluation"
Requires-Dist: easyocr; extra == "vmware-and-evaluation"
Provides-Extra: pyautogui
Requires-Dist: pyautogui; extra == "pyautogui"
Provides-Extra: all
Requires-Dist: toml; extra == "all"
Requires-Dist: numpy; extra == "all"
Requires-Dist: backoff; extra == "all"
Requires-Dist: pandas; extra == "all"
Requires-Dist: openai; extra == "all"
Requires-Dist: anthropic; extra == "all"
Requires-Dist: zhipuai; extra == "all"
Requires-Dist: groq; extra == "all"
Requires-Dist: boto3; extra == "all"
Requires-Dist: paddlepaddle; extra == "all"
Requires-Dist: tiktoken; extra == "all"
Requires-Dist: toml; extra == "all"
Requires-Dist: exa_py; extra == "all"
Requires-Dist: google-genai; extra == "all"
Requires-Dist: python-dotenv; extra == "all"
Requires-Dist: Pillow>=11.3.0; extra == "all"
Requires-Dist: gymnasium~=0.28.1; extra == "all"
Requires-Dist: requests~=2.31.0; extra == "all"
Requires-Dist: pytz~=2024.1; extra == "all"
Requires-Dist: psutil~=5.9.6; extra == "all"
Requires-Dist: tqdm~=4.65.0; extra == "all"
Requires-Dist: flask~=3.0.0; extra == "all"
Requires-Dist: requests-toolbelt~=1.0.0; extra == "all"
Requires-Dist: lxml; extra == "all"
Requires-Dist: xmltodict; extra == "all"
Requires-Dist: PyGetWindow; extra == "all"
Requires-Dist: ImageHash; extra == "all"
Requires-Dist: scikit-image; extra == "all"
Requires-Dist: formulas; extra == "all"
Requires-Dist: pydrive; extra == "all"
Requires-Dist: fastdtw; extra == "all"
Requires-Dist: odfpy; extra == "all"
Requires-Dist: func-timeout; extra == "all"
Requires-Dist: beautifulsoup4; extra == "all"
Requires-Dist: dashscope; extra == "all"
Requires-Dist: google-generativeai; extra == "all"
Requires-Dist: PyYaml; extra == "all"
Requires-Dist: mutagen; extra == "all"
Requires-Dist: easyocr; extra == "all"
Requires-Dist: borb; extra == "all"
Requires-Dist: pypdf2; extra == "all"
Requires-Dist: lybic>=0.7.3; extra == "all"
Requires-Dist: pdfplumber; extra == "all"
Requires-Dist: azure-identity; extra == "all"
Requires-Dist: azure-mgmt-compute; extra == "all"
Requires-Dist: azure-mgmt-network; extra == "all"
Requires-Dist: docker; extra == "all"
Requires-Dist: dotenv; extra == "all"
Requires-Dist: python-pptx; extra == "all"
Requires-Dist: odfpy; extra == "all"
Requires-Dist: openpyxl; extra == "all"
Requires-Dist: pandas; extra == "all"
Requires-Dist: lxml; extra == "all"
Requires-Dist: xmltodict; extra == "all"
Requires-Dist: opencv-python-headless; extra == "all"
Requires-Dist: Pillow; extra == "all"
Requires-Dist: imagehash; extra == "all"
Requires-Dist: rapidfuzz; extra == "all"
Requires-Dist: fitz; extra == "all"
Requires-Dist: imagehash; extra == "all"
Requires-Dist: odfpy; extra == "all"
Requires-Dist: pdfplumber; extra == "all"
Requires-Dist: PyYaml; extra == "all"
Requires-Dist: mutagen; extra == "all"
Requires-Dist: pypdf; extra == "all"
Requires-Dist: PyMuPDF; extra == "all"
Requires-Dist: formulas; extra == "all"
Requires-Dist: pyacoustid; extra == "all"
Requires-Dist: opencv-python~=4.8.1.78; extra == "all"
Requires-Dist: PyPDF2; extra == "all"
Requires-Dist: librosa; extra == "all"
Requires-Dist: fastdtw; extra == "all"
Requires-Dist: scikit-learn; extra == "all"
Requires-Dist: scikit-image; extra == "all"
Requires-Dist: bs4; extra == "all"
Requires-Dist: azure-identity; extra == "all"
Requires-Dist: azure-mgmt-compute; extra == "all"
Requires-Dist: azure-mgmt-network; extra == "all"
Requires-Dist: easyocr; extra == "all"
Requires-Dist: pyobjc; platform_system == "Darwin" and extra == "all"
Requires-Dist: oa_atomacos; platform_system == "Darwin" and extra == "all"
Requires-Dist: pywinauto; platform_system == "Windows" and extra == "all"
Requires-Dist: pywin32; platform_system == "Windows" and extra == "all"
Requires-Dist: grpcio; extra == "all"
Requires-Dist: grpcio-tools>=1.71.2; extra == "all"
Dynamic: author
Dynamic: license-file
Dynamic: requires-python

<div align="center">

**[English](./README.md) | [中文](./README-zh.md) | [日本語](./README-jp.md)**

<br/>
</div>

<p align="center">
  <img src="assets/logo.png" alt="Lybic Logo" width="400"/>
</p>
<h1 align="center">
  Lybic GUI Agent: <small>An open-source agentic framework for Computer Use Agents</small> 
</h1>

<p align="center">
    <small>Supported OS:</small>
    <img src="https://img.shields.io/badge/OS-Windows-blue?logo=windows&logoColor=white" alt="Windows">
    <img src="https://img.shields.io/badge/OS-macOS-black?logo=apple&logoColor=white" alt="macOS">
    <img src="https://img.shields.io/badge/OS-Linux-yellow?logo=linux&logoColor=black" alt="Linux">
    <br/>
    <small>Latest Version:</small><a href="https://pypi.org/project/lybic-guiagents/"><img alt="PyPI" src="https://img.shields.io/pypi/v/lybic-guiagents"></a>
    &nbsp;
    <a href="https://github.com/lybic/agent/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/pypi/l/lybic-guiagents"></a>
    &nbsp;
    <a href="https://github.com/lybic/agent"><img alt="Stars" src="https://img.shields.io/github/stars/lybic/agent?style=social"></a>
</p>

## What is Lybic GUI Agent?

Lybic platform placeholder - comprehensive AI platform for building and deploying intelligent agents

Lybic GUI Agent is an open-source framework that enables developers and businesses to create intelligent computer-use agents,mobile-use agents, and intelligent agents that can understand and interact with graphical user interfaces across Windows, macOS, Linux and Android(via lybic Android Sandbox) platforms.

<!-- <p align="center"><small>Lybic GUI Agent is based upon the <a href="https://github.com/simular-ai/Agent-S">Agent-S</a> codebase, allowing us to focus on making the best interaction experience with Lybic while maintaining a familiar execution logic.</small></p> -->

<div align="center">

<p>Skip the setup? Try Lybic GUI Agent in our <a href="https://playground.lybic.cn/">Playground</a> with a few clicks.(Only in the Chinese mainland supported)

</div>

## 🥳 Updates
- [x] **2025/10/17**: We have completed the verification test of the 100-step task at [OS-world](https://os-world.github.io/).
- [x] **2025/09/14**: The paper has been accepted by [arxiv](https://arxiv.org/abs/2509.11067)
- [x] **2025/09/09**: We achieved the world's first place in the 50-step length of [OS-world](https://os-world.github.io/)!
- [x] **2025/08/08**: Released v0.1.0 of [Lybic GUI Agent](https://github.com/lybic/agent) library, with support for Windows, Mac, Ubuntu and Lybic API!

## Table of Contents

1. [💡 Introduction](#-introduction)
2. [🛠️ Installation & Setup](#%EF%B8%8F-installation--setup) 
3. [🚀 Usage](#-usage)
4. [🔧 Troubleshooting](#-troubleshooting)
5. [💬 Citations](#-citations)

## 💡 Introduction

## ✨ Feature Agentic Lybic Support

- **Multiple LLMs providers**:  OpenAI, Anthropic, Google, xAI , AzureOpenAI, DeepSeek, Qwen, Doubao, ZhipuGLM
  - **Aggregation Model Provider**:  Bedrock, Groq, Monica, OpenRouter, SiliconFlow
- **RAG**: We support RAG, and this capability is provided as an extension
- **Cross-Platform GUI Control**: Windows, Linux, macOS, Android Supported
- **Observability**: Supported
- **Local Deployment**: Supported
- **Cloud Sandbox Environment**: Supported

<p align="center">🎉 Agents Online Demo</p>

[![Our demo](https://img.youtube.com/vi/GaOoYoRKWhE/maxresdefault.jpg)](https://www.youtube.com/watch?v=GaOoYoRKWhE)

<p align="center">🎯 Current Results</p>

<div align="center">
  <table border="0" cellspacing="0" cellpadding="5">
    <tr>
      <th>Benchmark</th>
      <th>Lybic GUI Agent</th>
      <th>Previous SOTA</th>
    </tr>
    <tr>
      <td>OSWorld Verified (50 step)</td>
      <td><b>57.1%</b></td>
      <td>54.2%</td>
    </tr>
  </table>
</div>

<p align="center">
  <img src="assets/structure.png" alt="Lybic GUI Agent system structure" width="700"/>
</p>
<p align="center"><b>Fig. Lybic GUI Agent system structure</b></p>

## 🛠️ Installation & Setup

> [!WARNING]
> To leverage the full potential of Lybic GUI Agent, we support multiple model providers including OpenAI, Anthropic, Gemini, and Doubao. For the best visual grounding performance, we recommend using UI-TARS models.

### Installation(from pip)

You can install Lybic GUI Agent by using pip:

```bash
pip install lybic-guiagents
```

> **Note for Windows Users**: If you encounter Unicode decoding errors during installation (especially on systems with non-UTF-8 default encoding like Chinese Windows), set the following environment variable before running pip:
> ```cmd
> set PYTHONUTF8=1
> ```
> Then run the installation command again. This enables Python's UTF-8 mode, which resolves encoding issues on Windows.

### Installation(from source code)

You can use [UV](https://docs.astral.sh/uv/getting-started/installation/) (a modern Python package manager) version 0.8.5 for installation:

```bash
# 1. Install UV if not already installed
# macOS and Linux
curl -LsSf https://astral.sh/uv/0.8.5/install.sh | sh
# Windows
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/0.8.5/install.ps1 | iex"

# testing uv installation, version should be 0.8.5
uv --version

# 2. Install the python 3.14
uv python install 3.14

# 3. Create a virtual environment
uv venv -p 3.14

# 4. Activate the virtual environment
# macOS and Linux
source .venv/bin/activate
# Windows
.venv\Scripts\activate

# 5. Install dependencies (using locked versions)
uv sync

# 6. Install the package locally in development mode
uv pip install -e .
```

> **Note for Windows Users**: If you encounter Unicode decoding errors during dependency installation (e.g., `UnicodeDecodeError: 'gbk' codec can't decode byte`), this is typically caused by packages with non-UTF-8 encoded files. Set this environment variable before running `uv sync`:
> ```cmd
> set PYTHONUTF8=1
> ```
> This enables Python's UTF-8 mode, which resolves encoding issues on Windows systems with non-UTF-8 default locales.

### API Key Configuration

The simplest way to configure API keys is to:

1. Copy `gui_agents/.env.example` to `gui_agents/.env`
2. Edit the `.env` file and add your API keys

### Tool Configuration

We provide two pre-configured tool settings:

- `tools_config_en.json`: Configured for English language models (Gemini, Exa)
- `tools_config_cn.json`: Configured for Chinese language models (Doubao, bocha)

The agent uses `tools_config.json` by default. You can:

- Copy either `tools_config_en.json` or `tools_config_cn.json` to `tools_config.json`
- Or create your own custom configuration

If you are using `tools_config_cn.json` and use `pyautogui` backend, the environment variable only `ARK_API_KEY` should be set.

If you are using `tools_config_en.json` and use `pyautogui` backend, you should set those 3 environment variables:

```bash
GEMINI_ENDPOINT_URL=https://generativelanguage.googleapis.com/v1beta/openai/
GEMINI_API_KEY=your_gemini_api_key
ARK_API_KEY=your_ark_api_key
```

```bash
# For English models
cp gui_agents/tools/tools_config_en.json gui_agents/tools/tools_config.json

# For Chinese models
cp gui_agents/tools/tools_config_cn.json gui_agents/tools/tools_config.json
```

> **Note**: Our recommended configuration uses `doubao-1-5-ui-tars-250428` for `"tool_name": "grounding" or "fast_action_generator"` and `claude-sonnet-4-20250514` or `doubao-seed-1-6-250615` for other tools such as `"tool_name": "action_generator"`. You can customize the model configuration in the tools configuration files. Do not modify the `"tool_name"` in `tools_config.json` file. To change the `"provider"` and `"model_name"` in `tools_config.json` file, see [model.md](gui_agents/tools/model.md)

## 🚀 Usage

### Command Line Interface

Run Lybic GUI Agent with python in the command-line interface:

```sh
python gui_agents/cli_app.py [OPTIONS]
```

This will show a user query prompt where you can enter your instructions and interact with the agent.

### Options

- `--backend [lybic|lybic_mobile|pyautogui|pyautogui_vmware]`: Specifies the backend to use for controlling the GUI. Defaults to `lybic`.

- `--query "YOUR_QUERY"`: Optional, can be input during the runtime; if provided, the agent will execute the query and then exit. 
- `--max-steps NUMBER`: Sets the maximum number of steps the agent can take. Defaults to `50`.
- `--mode [normal|fast]`: (Optional) Selects the agent mode. `normal` runs the full agent with detailed reasoning and memory, while `fast` mode executes actions more quickly with less reasoning overhead. Defaults to `normal`.
- `--enable-takeover`: (Optional) Enables user takeover functionality, allowing the agent to pause and request user intervention when needed. By default, user takeover is disabled.
- `--disable-search`: (Optional) Disables web search functionality. By default, web search is enabled.

### Examples

Run in interactive mode with the `lybic` backend:
```sh
python gui_agents/cli_app.py --backend lybic
```


Run a single query with the `pyautogui` backend and a maximum of 20 steps:
```sh
python gui_agents/cli_app.py --backend pyautogui --query "Find the result of 8 × 7 on a calculator" --max-steps 20
```

Run in fast mode with the `pyautogui` backend:
```sh
python gui_agents/cli_app.py --backend pyautogui --mode fast
```

> [!WARNING]
> The agent will directly control your computer with `--backend pyautogui`. Please use with care.

### Docker
You can also run Lybic GUI Agent using Docker. This is an example of how to run it with the `lybic` backend:
```sh
docker run --rm -it --env-file gui_agents/.env agenticlybic/guiagent --backend lybic
```
> **Note**: This command starts the agent in interactive mode. The `--env-file` flag points to the environment file. Please ensure the path is correct.

### Lybic Sandbox Configuration

The simplest way to configure Lybic Sandbox is still to edit the `.env` file and add your API keys, as mentioned in the [API Key Configuration](#api-key-configuration) section.


```bash
LYBIC_API_KEY=your_lybic_api_key
LYBIC_ORG_ID=your_lybic_org_id
LYBIC_MAX_LIFE_SECONDS=3600
```

> **Note**: If you want to use a precreated Lybic Sandbox in [Lybic Dashboard](https://dashboard.lybic.cn/), you need to set the `LYBIC_PRECREATE_SID` to the precreated Sandbox ID.

> 
> ```bash
> LYBIC_PRECREATE_SID=SBX-XXXXXXXXXXXXXXX
> ```

### Use as a service

You can interact with the agent programmatically either by importing it as a Python library or by running it as a gRPC service.

#### As a Python Library

After installing `lybic-guiagents`, you can import and use its components directly in your Python code.

Main Components:
- `AgentService`: High-level service interface (recommended for most users).
- `AgentS2`, `AgentSFast`: Core agent implementations.
- `HardwareInterface`: Hardware abstraction layer for controlling the GUI.
- `ServiceConfig`: Configuration management.

**Quick Start:**

```python
from gui_agents import AgentService
service = AgentService()
result = service.execute_task("Take a screenshot")
print(f"Task completed: {result.status}")
```

#### As a gRPC Service

You can also run the agent as a standalone gRPC service, which is ideal for distributed architectures or integrating with applications written in other languages.

**1. Running the gRPC Server**

First, run the gRPC server using Docker. This command overrides the default CLI entrypoint and starts the gRPC service on port 50051.

```sh
docker run --rm -it -p 50051:50051 --env-file gui_agents/.env agenticlybic/guiagent /app/.venv/bin/lybic-guiagent-grpc
```
> **Note**: The `-p 50051:50051` flag maps the container's gRPC port to your host machine.

**Optional: Enable Prometheus Monitoring**

The gRPC service supports optional Prometheus metrics for monitoring task execution, resource usage, and performance. To enable:

```sh
# Install Prometheus dependencies
pip install -e ".[prometheus]"

# Run with Prometheus enabled
docker run --rm -it \
  -p 50051:50051 \
  -p 8000:8000 \
  -e ENABLE_PROMETHEUS=true \
  -e PROMETHEUS_PORT=8000 \
  --env-file gui_agents/.env \
  agenticlybic/guiagent /app/.venv/bin/lybic-guiagent-grpc
```

Access metrics at `http://localhost:8000/metrics`. See [Prometheus Metrics Documentation](gui_agents/metrics/README.md) for details.

**2. Python Client Example**

Once the service is running, you can interact with it using a gRPC client. Here is a Python example of how to send an instruction to the agent and stream its progress.

First, ensure you have the necessary gRPC libraries and generated protobuf stubs:
```sh
# Install gRPC tools
pip install grpcio grpcio-tools

# Generate stubs from the .proto file
python -m grpc_tools.protoc -Igui_agents/proto --python_out=gui_agents/proto/pb --grpc_python_out=gui_agents/proto/pb --pyi_out=gui_agents/proto/pb gui_agents/proto/agent.proto
```

Then, you can use the following script to communicate with the agent:

```python
import asyncio
import grpc
from gui_agents.proto.pb import agent_pb2, agent_pb2_grpc

async def run_agent_instruction():
    # Connect to the gRPC server
    async with grpc.aio.insecure_channel('localhost:50051') as channel:
        # Create a stub for the Agent service
        stub = agent_pb2_grpc.AgentStub(channel)

        # Create a request to run an instruction
        request = agent_pb2.RunAgentInstructionRequest(
            instruction="Open a calculator and compute 1 + 1"
        )

        print(f"Sending instruction: '{request.instruction}'")

        # Call the RunAgentInstruction RPC and iterate over the stream of responses
        try:
            async for response in stub.RunAgentInstruction(request):
                print(f"[{response.stage}] {response.message}")
        except grpc.aio.AioRpcError as e:
            print(f"An error occurred: {e.details()}")

if __name__ == '__main__':
    asyncio.run(run_agent_instruction())
```

#### Task Persistence (Optional)

The gRPC service supports persistent storage for task status and history using PostgreSQL. This allows task data to survive service restarts.

-   **Default Behavior**: By default, tasks are stored in memory and will be lost when the service stops.
-   **Enable Persistence**: To enable PostgreSQL persistence, you need to:
    1.  Install the required dependency:
        -   If installing from PyPI: `pip install lybic-guiagents[postgres]`
        -   If installing from source: `uv pip install .[postgres]`
    2.  Set the following environment variables:
        ```bash
        # Use the postgres backend instead of the default 'memory'
        TASK_STORAGE_BACKEND=postgres
        # Set your PostgreSQL connection string
        POSTGRES_CONNECTION_STRING=postgresql://user:password@host:port/database
        ```
-   **Docker Usage**: The `agenticlybic/guiagent` Docker image comes with PostgreSQL support pre-installed. You only need to set the environment variables above to enable persistence when running the gRPC service container.

### VMware Configuration

To use PyAutoGUI with VMware, you need to install [VMware Workstation Pro](https://www.vmware.com/products/desktop-hypervisor/workstation-and-fusion) (on Windows) and create a virtual machine. 

Next, you need to download the [`Windows-x86.zip`](https://huggingface.co/datasets/xlangai/ubuntu_osworld/resolve/main/Ubuntu-x86.zip) and [`Ubuntu-x86.zip`](https://huggingface.co/datasets/xlangai/ubuntu_osworld/resolve/main/Ubuntu-x86.zip) from Hugging Face. Then unzip them into `./vmware_vm_data/Windows-x86` and `./vmware_vm_data/Ubuntu-x86` directory.

Finally, you need to edit the `.env` file and set the `USE_PRECREATE_VM` environment variable to the name of the virtual machine. `USE_PRECREATE_VM` support `Windows` and `Ubuntu` on x86 arch computer.

```bash
USE_PRECREATE_VM=Ubuntu
```

## 🔧 Troubleshooting

### Common Runtime Issues

#### 1. API Key Configuration Issues

**Problem**: `KeyError` or authentication errors when running the agent.

**Solution**:
- Ensure your `.env` file is properly configured with valid API keys
- Check that environment variables are set correctly:
  ```bash
  # For English models
  export GEMINI_API_KEY=your_gemini_api_key
  export ARK_API_KEY=your_ark_api_key
  
  # For Chinese models  
  export ARK_API_KEY=your_ark_api_key
  ```
- Verify API key permissions and quotas

#### 2. Python Environment Issues

**Problem**: `ModuleNotFoundError` or package import errors.

**Solution**:
- Ensure you're using Python >= 3.12
- Activate the virtual environment:
  ```bash
  # macOS/Linux
  source .venv/bin/activate
  # Windows
  .venv\Scripts\activate
  ```
- Reinstall dependencies:
  ```bash
  uv sync
  uv pip install -e .
  # uv pip install -e . -i  https://pypi.tuna.tsinghua.edu.cn/simple
  ```

#### 3. Windows Installation Encoding Issues

**Problem**: `UnicodeDecodeError` during package installation on Windows, especially with error messages like:
```
UnicodeDecodeError: 'gbk' codec can't decode byte 0xa2 in position 905: illegal multibyte sequence
```

**Solution**:

This issue occurs when Python tries to read package metadata files using the system's default encoding (e.g., GBK on Chinese Windows) instead of UTF-8. To resolve this:

**Option 1: Set environment variable temporarily (Command Prompt)**
```cmd
set PYTHONUTF8=1
pip install lybic-guiagents
```

**Option 2: Set environment variable temporarily (PowerShell)**
```powershell
$env:PYTHONUTF8=1
pip install lybic-guiagents
```

**Option 3: Set environment variable permanently (System-wide)**
1. Open System Properties → Advanced → Environment Variables
2. Add new system variable:
   - Variable name: `PYTHONUTF8`, Value: `1`
3. Restart your terminal and try installation again

**Option 4: Use Python 3.15+ (Future)**
Python 3.15+ will enable UTF-8 mode by default on Windows ([PEP 686](https://peps.python.org/pep-0686/)), which will eliminate this issue.

> **Note**: Some users have reported success with additionally setting `PYTHONIOENCODING=utf-8`, though `PYTHONUTF8=1` should be sufficient in most cases. If you still encounter issues after setting `PYTHONUTF8=1`, try adding `set PYTHONIOENCODING=utf-8` in Command Prompt (or `$env:PYTHONIOENCODING="utf-8"` in PowerShell).

#### 4. Lybic Sandbox Connection Issues

**Problem**: `Connection timeout` or `Sandbox creation failed`.

**Solution**:
- Check network connectivity to Lybic servers
- Verify `LYBIC_ORG_ID` and `LYBIC_API_KEY` are correct
- Ensure sufficient quota in your Lybic account
- Try increasing `LYBIC_MAX_LIFE_SECONDS` if sandbox times out

#### 5. VMware Backend Issues

**Problem**: Virtual machine fails to start or control.

**Solution**:
- Ensure VMware Workstation Pro is properly installed
- Check that VM files are extracted to correct directories:
  - `./vmware_vm_data/Windows-x86/`
  - `./vmware_vm_data/Ubuntu-x86/`
- Verify VMware service is running
- Set correct `USE_PRECREATE_VM` environment variable

#### 6. Model Performance Issues

**Problem**: Slow response times or poor grounding accuracy.

**Solution**:

- Use recommended models for better performance:
  - Visual grounding: `doubao-1-5-ui-tars-250428`
  - Action generation: `claude-sonnet-4-20250514`
- Switch to `--mode fast` for quicker execution
- Reduce `--max-steps` for shorter tasks

### Getting Help

If you encounter issues not covered here:

1. Check the [GitHub Issues](https://github.com/lybic/agent/issues) for similar problems
2. Review the [Lybic Documentation](https://lybic.ai/docs)
3. Create a new issue with:
   - Your operating system and version
   - Python version and environment details
   - Complete error messages
   - Steps to reproduce the issue

## 💬 Citations

If you find this codebase useful, please cite:

```bibtex
@misc{guo2025agenticlybicmultiagentexecution,
      title={Agentic Lybic: Multi-Agent Execution System with Tiered Reasoning and Orchestration}, 
      author={Liangxuan Guo and Bin Zhu and Qingqian Tao and Kangning Liu and Xun Zhao and Xianzhe Qin and Jin Gao and Guangfu Hao},
      year={2025},
      eprint={2509.11067},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2509.11067}, 
}
```

---

## ❤️ Touch us:

<div align="center" style="display: flex; justify-content: center; gap: 20px; flex-wrap: wrap;">
  <img src="assets/feishu.png" alt="Lark Group" style="width: 200px; height: auto;"/>
  <img src="assets/wechat.jpg" alt="WeChat Group" style="width: 200px; height: auto;"/>
  <img src="assets/qq.png" alt="QQ Group" style="width: 200px; height: auto;"/>
</div>

## Stargazers over time

[![Stargazers over time](https://starchart.cc/lybic/agent.svg)](https://starchart.cc/lybic/agent)

## License

This project is distributed under Apache 2.0 License.
Therefore, you can modify the source code and release it commercially.
