Metadata-Version: 2.4
Name: moxing
Version: 0.1.6
Summary: Python wrapper for llama.cpp - OpenAI API compatible LLM backend with auto GPU detection
Author: MoXing Contributors
License-Expression: MIT
Project-URL: Homepage, https://github.com/cycleuser/MoXing
Project-URL: Repository, https://github.com/cycleuser/MoXing
Project-URL: Issues, https://github.com/cycleuser/MoXing/issues
Keywords: llama,llama.cpp,gguf,openai,api,gpu,vulkan,cuda,ai,llm,metal,rocm
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: httpx>=0.24.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: typer>=0.9.0
Requires-Dist: psutil>=5.9.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: build>=1.0.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Provides-Extra: openai
Requires-Dist: openai>=1.0.0; extra == "openai"
Provides-Extra: hf
Requires-Dist: huggingface_hub>=0.20.0; extra == "hf"
Provides-Extra: modelscope
Requires-Dist: modelscope>=1.10.0; extra == "modelscope"
Provides-Extra: cuda
Provides-Extra: vulkan
Provides-Extra: metal
Provides-Extra: rocm
Provides-Extra: cpu
Provides-Extra: auto
Provides-Extra: all
Requires-Dist: openai>=1.0.0; extra == "all"
Requires-Dist: huggingface_hub>=0.20.0; extra == "all"
Requires-Dist: modelscope>=1.10.0; extra == "all"
Dynamic: license-file

# MoXing (模型)

直接运行 Ollama 模型，有时更快。

## 安装

```bash
pip install moxing
```

## 使用

```bash
# 查看 Ollama 模型
moxing ollama list

# 运行模型
moxing ollama serve carstenuhlig/omnicoder-9b

# 交互式选择
moxing ollama list --select
```

## 性能对比

Apple M4 上测试 `carstenuhlig/omnicoder-9b`：

| 框架 | 速度 |
|------|------|
| Ollama | ~10 tokens/s |
| MoXing | ~15 tokens/s |

## 工作原理

MoXing 读取 Ollama 的 GGUF 文件，用 llama.cpp 运行。

```
Ollama manifest -> GGUF blob -> llama.cpp -> OpenAI API
```

## 兼容性

### 已验证成功

- carstenuhlig/omnicoder-9b
- Qwen2.5 系列
- Llama 3.x 系列
- Mistral 系列

### 已知不成功

- lfm2.5-thinking

**直接尝试你的模型，能跑就跑，不能跑就用 Ollama。**

## CLI 命令

| 命令 | 说明 |
|------|------|
| `moxing ollama list` | 列出 Ollama 模型 |
| `moxing ollama serve <model>` | 运行模型 |
| `moxing ollama info <model>` | 查看模型详情 |
| `moxing serve <model.gguf>` | 运行 GGUF 文件 |
| `moxing bench <model>` | 性能测试 |
| `moxing check <model>` | 检查兼容性 |

## License

MIT
