Metadata-Version: 2.4
Name: vixio
Version: 0.1.4
Summary: Voice-Powered Agent Framework
Author: Weyne Chen
License-Expression: MIT
Project-URL: Homepage, https://github.com/weynechen/vixio
Project-URL: Repository, https://github.com/weynechen/vixio
Project-URL: Issues, https://github.com/weynechen/vixio/issues
Keywords: voice,agent,ai,speech,framework
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pydantic>=2.5.0
Requires-Dist: pydantic-settings>=2.1.0
Requires-Dist: ruamel.yaml>=0.18.16
Requires-Dist: loguru>=0.7.3
Provides-Extra: xiaozhi
Requires-Dist: fastapi>=0.110.0; extra == "xiaozhi"
Requires-Dist: uvicorn[standard]>=0.27.0; extra == "xiaozhi"
Requires-Dist: websockets<15.0,>=14.0; extra == "xiaozhi"
Requires-Dist: opuslib_next>=1.1.5; extra == "xiaozhi"
Requires-Dist: PyJWT>=2.10.0; extra == "xiaozhi"
Requires-Dist: aiohttp>=3.13.0; extra == "xiaozhi"
Requires-Dist: aiohttp-cors>=0.8.0; extra == "xiaozhi"
Requires-Dist: numpy<2.0.0,>=1.26.0; extra == "xiaozhi"
Provides-Extra: silero-vad-grpc
Requires-Dist: grpcio>=1.76.0; extra == "silero-vad-grpc"
Requires-Dist: grpcio-tools>=1.76.0; extra == "silero-vad-grpc"
Provides-Extra: sherpa-onnx-asr-grpc
Requires-Dist: grpcio>=1.76.0; extra == "sherpa-onnx-asr-grpc"
Requires-Dist: grpcio-tools>=1.76.0; extra == "sherpa-onnx-asr-grpc"
Requires-Dist: numpy<2.0.0,>=1.26.0; extra == "sherpa-onnx-asr-grpc"
Provides-Extra: kokoro-cn-tts-grpc
Requires-Dist: grpcio>=1.76.0; extra == "kokoro-cn-tts-grpc"
Requires-Dist: grpcio-tools>=1.76.0; extra == "kokoro-cn-tts-grpc"
Requires-Dist: numpy<2.0.0,>=1.26.0; extra == "kokoro-cn-tts-grpc"
Provides-Extra: silero-vad-local
Requires-Dist: onnxruntime-gpu>=1.16.0; extra == "silero-vad-local"
Requires-Dist: numpy>=1.24.0; extra == "silero-vad-local"
Requires-Dist: silero-vad>=5.0; extra == "silero-vad-local"
Provides-Extra: sherpa-onnx-asr-local
Requires-Dist: onnxruntime-gpu>=1.16.0; extra == "sherpa-onnx-asr-local"
Requires-Dist: numpy>=1.24.0; extra == "sherpa-onnx-asr-local"
Requires-Dist: sherpa-onnx>=1.12.15; extra == "sherpa-onnx-asr-local"
Requires-Dist: huggingface_hub>=0.20.0; extra == "sherpa-onnx-asr-local"
Provides-Extra: kokoro-cn-tts-local
Requires-Dist: torch>=2.0.0; extra == "kokoro-cn-tts-local"
Requires-Dist: numpy>=1.24.0; extra == "kokoro-cn-tts-local"
Requires-Dist: kokoro>=0.8.1; extra == "kokoro-cn-tts-local"
Requires-Dist: misaki[zh]>=0.8.1; extra == "kokoro-cn-tts-local"
Provides-Extra: openai-agent
Requires-Dist: openai-agents[litellm]>=0.4.2; extra == "openai-agent"
Requires-Dist: openai>=2.7.0; extra == "openai-agent"
Requires-Dist: httpx>=0.28.0; extra == "openai-agent"
Provides-Extra: edge-tts
Requires-Dist: edge-tts>=7.2.3; extra == "edge-tts"
Requires-Dist: pydub>=0.25.0; extra == "edge-tts"
Provides-Extra: qwen
Requires-Dist: dashscope>=1.25.3; extra == "qwen"
Provides-Extra: doubao
Requires-Dist: websockets>=14.0; extra == "doubao"
Provides-Extra: dev-local-cn
Requires-Dist: vixio[kokoro-cn-tts-local,openai-agent,sherpa-onnx-asr-local,silero-vad-local,xiaozhi]; extra == "dev-local-cn"
Provides-Extra: dev-grpc
Requires-Dist: vixio[kokoro-cn-tts-grpc,openai-agent,sherpa-onnx-asr-grpc,silero-vad-grpc,xiaozhi]; extra == "dev-grpc"
Provides-Extra: dev-qwen
Requires-Dist: vixio[openai-agent,qwen,silero-vad-local,xiaozhi]; extra == "dev-qwen"
Provides-Extra: dev-qwen-streaming
Requires-Dist: vixio[openai-agent,qwen,xiaozhi]; extra == "dev-qwen-streaming"
Provides-Extra: dev-doubao
Requires-Dist: vixio[doubao,openai-agent,xiaozhi]; extra == "dev-doubao"
Provides-Extra: quickstart
Requires-Dist: vixio[openai-agent,qwen,silero-vad-local,xiaozhi]; extra == "quickstart"
Provides-Extra: test
Requires-Dist: pytest>=7.4.0; extra == "test"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "test"
Requires-Dist: pytest-cov>=4.1.0; extra == "test"
Requires-Dist: black>=23.7.0; extra == "test"
Requires-Dist: isort>=5.12.0; extra == "test"
Requires-Dist: mypy>=1.5.0; extra == "test"
Requires-Dist: ruff>=0.0.287; extra == "test"
Requires-Dist: pydub; extra == "test"
Requires-Dist: psutil>=7.0.0; extra == "test"
Dynamic: license-file

# Vixio

**A framework for quickly adding voice interaction to AI Agents**

[![Python 3.12+](https://img.shields.io/badge/python-3.12+-blue.svg)](https://www.python.org/downloads/)
[![Status: Alpha](https://img.shields.io/badge/status-alpha-orange.svg)]()

**[中文文档](docs/README_zh.md)**

## Why Vixio?

Add voice capabilities to any Agent with a single command, without handling the audio-processing details yourself.

## Features

### 🎯 Core Advantages

- **One-line startup**: `uvx --from "vixio[dev-qwen-streaming]" vixio run xiaozhi-server --preset qwen-realtime` gives you a complete voice Agent
- **Flexible DAG architecture**: Data flows through a directed acyclic graph whose nodes can be freely combined
- **Three operating modes**:
  - **Pipeline** - Traditional cascade (VAD→ASR→Agent→TTS), maximum control
  - **Streaming** - Bidirectional streaming, low latency
  - **Realtime** - End-to-end model, simplest
- **Ready to use**: Built-in Xiaozhi hardware protocol support
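
To make the cascade in Pipeline mode (VAD→ASR→Agent→TTS) concrete, here is a minimal sketch of chained streaming stages built from plain `asyncio` generators. It does not use Vixio's API: every name here (`source`, `vad`, `asr`, `agent`, `tts`, `chain`, `run_pipeline`) is hypothetical, the stages only tag strings instead of processing audio, and a linear chain is just the simplest possible DAG.

```python
import asyncio
from typing import AsyncIterator, Callable

# Hypothetical stage type: takes a stream of items, returns a new stream.
Stage = Callable[[AsyncIterator[str]], AsyncIterator[str]]


async def source(chunks: list[str]) -> AsyncIterator[str]:
    """Stand-in for incoming audio frames."""
    for chunk in chunks:
        yield chunk


async def vad(frames: AsyncIterator[str]) -> AsyncIterator[str]:
    """Drop 'silent' frames (represented here by empty strings)."""
    async for frame in frames:
        if frame:
            yield frame


async def asr(frames: AsyncIterator[str]) -> AsyncIterator[str]:
    """Pretend to transcribe each voiced frame."""
    async for frame in frames:
        yield f"text({frame})"


async def agent(texts: AsyncIterator[str]) -> AsyncIterator[str]:
    """Pretend to generate a reply for each transcript."""
    async for text in texts:
        yield f"reply({text})"


async def tts(replies: AsyncIterator[str]) -> AsyncIterator[str]:
    """Pretend to synthesize audio for each reply."""
    async for reply in replies:
        yield f"audio({reply})"


def chain(stream: AsyncIterator[str], *stages: Stage) -> AsyncIterator[str]:
    """Wire stages together so each consumes the previous stage's output."""
    for stage in stages:
        stream = stage(stream)
    return stream


async def run_pipeline(chunks: list[str]) -> list[str]:
    stream = chain(source(chunks), vad, asr, agent, tts)
    return [item async for item in stream]


if __name__ == "__main__":
    print(asyncio.run(run_pipeline(["hello", "", "world"])))
```

Because each stage only sees an input stream and produces an output stream, stages can be swapped or reordered without touching their neighbours, which is the property the DAG design relies on.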

### 🔧 Technical Features

- **Modular design**: Install VAD / ASR / Agent / TTS on demand
- **Multiple providers**: Local inference (Silero, Sherpa-ONNX, Kokoro) or cloud services (Qwen, Doubao, Edge-TTS)
- **Multi-purpose**: Voice conversation, transcription, real-time translation, etc.
- **Session isolation**: Each connection gets independent provider instances, so concurrent sessions do not share state
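
The session-isolation idea can be sketched as a registry that builds a fresh provider for every connection. This is an illustration under assumptions, not Vixio's implementation: `FakeAsrProvider` and `SessionRegistry` are hypothetical names, and the provider merely buffers bytes.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class FakeAsrProvider:
    """Hypothetical stateful provider: buffers audio for one connection."""

    buffer: list[bytes] = field(default_factory=list)

    def feed(self, chunk: bytes) -> int:
        """Store a chunk and return how many chunks are buffered."""
        self.buffer.append(chunk)
        return len(self.buffer)


class SessionRegistry:
    """Creates one provider instance per session so state is never shared."""

    def __init__(self, factory: Callable[[], FakeAsrProvider]) -> None:
        self._factory = factory
        self._sessions: dict[str, FakeAsrProvider] = {}

    def open(self, session_id: str) -> FakeAsrProvider:
        # A new instance per connection, never a shared singleton.
        provider = self._factory()
        self._sessions[session_id] = provider
        return provider

    def close(self, session_id: str) -> None:
        # Dropping the reference releases that session's state.
        self._sessions.pop(session_id, None)


if __name__ == "__main__":
    registry = SessionRegistry(FakeAsrProvider)
    a = registry.open("conn-a")
    b = registry.open("conn-b")
    a.feed(b"\x00\x01")
    print(len(a.buffer), len(b.buffer))
```

Audio fed to one session never shows up in another session's buffer, so a slow or misbehaving client cannot corrupt a neighbour's recognition state.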

## Requirements

- Python 3.12 or higher
- [uv](https://docs.astral.sh/uv/) (recommended package manager)

## 🚀 Quick Start

Get started with Vixio in just one command! Experience real-time voice conversation powered by Qwen Omni:

```bash
# Install and run in one step (requires DashScope API key)
uvx --from "vixio[dev-qwen-streaming]" vixio run xiaozhi-server \
  --preset qwen-realtime \
  --dashscope-key sk-your-key-here
```

**What you get:**
- 🎙️ WebSocket server running at `http://localhost:8000`
- 🤖 End-to-end voice AI with Qwen Omni Realtime
- ⚡ Low latency, integrated VAD + ASR + LLM + TTS
- 📱 Ready for xiaozhi devices or custom clients

**Get your API key:** [DashScope Console](https://dashscope.console.aliyun.com/)
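
If you would rather not hard-code the key in a script, one option is a small Python launcher that reads it from the environment and assembles the same command shown above. This is a sketch under assumptions: the environment variable name `DASHSCOPE_API_KEY` and the helper `build_command` are hypothetical, and only the flags already shown in this README are used.

```python
import os
import subprocess


def build_command(api_key: str, preset: str = "qwen-realtime") -> list[str]:
    """Assemble the quick-start command as an argument list (no shell quoting needed)."""
    if not api_key:
        raise ValueError("DashScope API key is empty")
    return [
        "uvx", "--from", "vixio[dev-qwen-streaming]",
        "vixio", "run", "xiaozhi-server",
        "--preset", preset,
        "--dashscope-key", api_key,
    ]


if __name__ == "__main__":
    # DASHSCOPE_API_KEY is an assumed variable name chosen for this example.
    key = os.environ.get("DASHSCOPE_API_KEY", "")
    if key:
        subprocess.run(build_command(key), check=True)
    else:
        print("Set DASHSCOPE_API_KEY before running this script.")
```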

### Customize Your Bot

```bash
# Use custom prompt
uvx --from "vixio[dev-qwen-streaming]" vixio run xiaozhi-server \
  --preset qwen-realtime \
  --dashscope-key sk-xxx \
  --prompt "You are a professional programming assistant"

# Use pipeline mode (more control)
uvx --from "vixio[dev-qwen-streaming]" vixio run xiaozhi-server \
  --dashscope-key sk-xxx

# Export template for full customization
uvx --from "vixio[xiaozhi]" vixio init xiaozhi-server
cd xiaozhi-server
# Edit .env, config.yaml, prompt.txt
python run.py
```

---

## Installation

### Install from source (Recommended)

```bash
git clone https://github.com/weynechen/vixio.git
cd vixio
uv sync --extra dev-qwen  # or dev-local-cn, dev-grpc, etc.
```

### Using uv

1. Install with core dependencies only:

```bash
uv pip install vixio
```

2. Install with specific providers:

```bash
# For Chinese local development (VAD + ASR + TTS + Agent)
uv pip install "vixio[dev-local-cn]"

# For Qwen platform integration
uv pip install "vixio[dev-qwen]"

# Or install individual components
uv pip install "vixio[xiaozhi,openai-agent,silero-vad-grpc]"
```

### Using pip

```bash
pip install vixio

# With optional dependencies
pip install "vixio[dev-local-cn]"
```
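
To see which optional extras an installed copy of Vixio declares, you can read its package metadata with the standard library. The sketch below groups requirement strings by their `extra == "..."` marker; `group_by_extra` is a hypothetical helper written for this example, and its regex handles only the simple marker form used in this package's metadata, not arbitrary environment markers.

```python
import re
from collections import defaultdict
from importlib import metadata

# Matches markers such as: extra == "qwen"
_EXTRA_RE = re.compile(r"""extra\s*==\s*['"]([^'"]+)['"]""")


def group_by_extra(requirements: list[str]) -> dict[str, list[str]]:
    """Group requirement strings by extra name; core dependencies go under ''."""
    groups: dict[str, list[str]] = defaultdict(list)
    for req in requirements:
        spec, _, marker = req.partition(";")
        match = _EXTRA_RE.search(marker)
        groups[match.group(1) if match else ""].append(spec.strip())
    return dict(groups)


if __name__ == "__main__":
    try:
        reqs = metadata.requires("vixio") or []
    except metadata.PackageNotFoundError:
        reqs = []  # vixio is not installed in this environment
    for extra, specs in sorted(group_by_extra(reqs).items()):
        print(extra or "(core)", "->", ", ".join(specs))
```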


## Available Components

### Transports
- `xiaozhi` - Xiaozhi protocol transport (WebSocket + HTTP)

### VAD (Voice Activity Detection)
- `silero-vad-grpc` - Silero VAD via gRPC service
- `silero-vad-local` - Silero VAD local inference

### ASR (Automatic Speech Recognition)
- `sherpa-onnx-asr-grpc` - Sherpa-ONNX ASR via gRPC service
- `sherpa-onnx-asr-local` - Sherpa-ONNX ASR local inference
- `qwen` - Qwen platform ASR
...

### TTS (Text-to-Speech)
- `kokoro-cn-tts-grpc` - Kokoro TTS via gRPC service
- `kokoro-cn-tts-local` - Kokoro TTS local inference
- `edge-tts` - Microsoft Edge TTS (cloud)
- `qwen` - Qwen platform TTS
...

### Agent
- `openai-agent` - OpenAI-compatible LLM via LiteLLM


## Getting Started

1. Check out the `examples/` directory for usage examples
2. Configure your providers in a YAML config file
3. Run your voice agent application


## Project Status

**Current Version: v0.1.x (Alpha)**

> **Note**: This project is under active development. APIs may change.

## License

MIT License - see [LICENSE](LICENSE) for details.
