Metadata-Version: 2.4
Name: easy-edge
Version: 0.1.3
Summary: A simple Ollama-like tool for running LLMs locally
Home-page: https://github.com/criminact/easy-edge
Author: Easy Edge Team
Project-URL: Source, https://github.com/criminact/easy-edge
Project-URL: Tracker, https://github.com/criminact/easy-edge/issues
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: End Users/Desktop
Classifier: Environment :: Console
Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
Classifier: Topic :: Utilities
Requires-Python: >=3.9.6
Description-Content-Type: text/markdown
Requires-Dist: accelerate==1.8.1
Requires-Dist: aiohappyeyeballs==2.6.1
Requires-Dist: aiohttp==3.12.14
Requires-Dist: aiosignal==1.4.0
Requires-Dist: async-timeout==5.0.1
Requires-Dist: attrs==25.3.0
Requires-Dist: bitsandbytes
Requires-Dist: certifi==2025.7.14
Requires-Dist: charset-normalizer==3.4.2
Requires-Dist: click==8.1.7
Requires-Dist: datasets==4.0.0
Requires-Dist: dill==0.3.8
Requires-Dist: diskcache==5.6.3
Requires-Dist: filelock==3.18.0
Requires-Dist: frozenlist==1.7.0
Requires-Dist: fsspec==2025.3.0
Requires-Dist: hf-xet==1.1.5
Requires-Dist: huggingface-hub==0.33.4
Requires-Dist: idna==3.10
Requires-Dist: Jinja2==3.1.6
Requires-Dist: llama_cpp_python==0.3.12
Requires-Dist: markdown-it-py==3.0.0
Requires-Dist: MarkupSafe==3.0.2
Requires-Dist: mdurl==0.1.2
Requires-Dist: mpmath==1.3.0
Requires-Dist: multidict==6.6.3
Requires-Dist: multiprocess==0.70.16
Requires-Dist: networkx==3.2.1
Requires-Dist: numpy==2.0.2
Requires-Dist: packaging==25.0
Requires-Dist: pandas==2.3.1
Requires-Dist: peft==0.16.0
Requires-Dist: pillow==11.3.0
Requires-Dist: propcache==0.3.2
Requires-Dist: psutil==7.0.0
Requires-Dist: pyarrow==20.0.0
Requires-Dist: Pygments==2.19.2
Requires-Dist: python-dateutil==2.9.0.post0
Requires-Dist: pytz==2025.2
Requires-Dist: PyYAML==6.0.2
Requires-Dist: regex==2024.11.6
Requires-Dist: requests==2.32.4
Requires-Dist: rich==13.7.0
Requires-Dist: safetensors==0.5.3
Requires-Dist: scipy==1.13.1
Requires-Dist: six==1.17.0
Requires-Dist: sympy==1.14.0
Requires-Dist: tokenizers==0.21.2
Requires-Dist: torch
Requires-Dist: torchaudio==2.7.1
Requires-Dist: torchvision==0.22.1
Requires-Dist: tqdm==4.67.1
Requires-Dist: transformers==4.53.2
Requires-Dist: typing_extensions==4.14.1
Requires-Dist: tzdata==2025.2
Requires-Dist: urllib3==2.5.0
Requires-Dist: xxhash==3.5.0
Requires-Dist: yarl==1.20.1
Requires-Dist: sentencepiece~=0.2.0
Requires-Dist: gguf>=0.1.0
Requires-Dist: protobuf<5.0.0,>=4.21.0
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: project-url
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Easy Edge

[![PyPI](https://img.shields.io/pypi/v/easy-edge.svg)](https://pypi.org/project/easy-edge/)
[![Python](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE)

A simple Ollama-like tool for running Large Language Models (LLMs) locally using llama.cpp under the hood.

## Features

- 🚀 **Local LLM Inference**: Run models locally using llama.cpp
- 📥 **Automatic Downloads**: Download models from URLs or Hugging Face
- 💬 **Interactive Chat**: Chat with models in an interactive terminal
- 📋 **Model Management**: List, download, and remove models
- ⚙️ **Configurable**: Customize model parameters and settings

## Installation

Install Easy Edge from PyPI:

```bash
pip install easy-edge
```

Or, to install the latest version from source:

```bash
git clone https://github.com/criminact/easy-edge.git
cd easy-edge
pip install .
```

## Usage

After installation, use the `easy-edge` command from your terminal:

### Download a Model

```bash
easy-edge pull --repo-id TheBloke/Llama-2-7B-Chat-GGUF --filename llama-2-7b-chat.Q4_K_M.gguf
```

Or download from a Hugging Face URL:

```bash
easy-edge pull --url https://huggingface.co/google/gemma-3-1b-it-qat-q4_0-gguf/resolve/main/gemma-3-1b-it-q4_0.gguf
```

### Run the Model

**Single prompt:**
```bash
easy-edge run gemma-3-1b-it-qat-q4_0-gguf --prompt "Hello, how are you?"
```

**Interactive chat:**
```bash
easy-edge run gemma-3-1b-it-qat-q4_0-gguf --interactive
```

### List Installed Models
```bash
easy-edge list
```

### Remove a Model
```bash
easy-edge remove gemma-3-1b-it-qat-q4_0-gguf
```

## Configuration

The tool stores configuration in `models/config.json`. You can modify settings like:

- `max_tokens`: Maximum tokens to generate (default: 2048)
- `temperature`: Sampling temperature (default: 0.7)
- `top_p`: Top-p sampling parameter (default: 0.9)

## Requirements

- Python 3.11+
- 8GB+ RAM (for 7B models)
- 16GB+ RAM (for 13B models)
- 4GB+ free disk space per model

## Troubleshooting

### Common Issues

1. **"llama-cpp-python not installed"**
   ```bash
   pip install llama-cpp-python
   ```

2. **Out of memory errors**
   - Try smaller models (7B instead of 13B)
   - Use more quantized models (Q4_K_M instead of Q8_0)
   - Close other applications to free up RAM

3. **Slow inference**
   - The tool uses all CPU cores by default
   - For better performance, consider using GPU acceleration (requires CUDA)

### GPU Acceleration (Optional)

For faster inference with NVIDIA GPUs:

```bash
pip uninstall llama-cpp-python
pip install llama-cpp-python --force-reinstall --index-url=https://jllllll.github.io/llama-cpp-python-cuBLAS-wheels/AVX2/cu118
```

## Finetuning Your Own Model

Easy Edge supports finetuning LLMs using a Modelfile (Ollama-style) and Hugging Face Trainer. This allows you to create custom models for your own data and use them locally.

### 1. Create a Modelfile

A Modelfile describes the base model, training parameters, and example messages for finetuning. Example:

```
HF_TOKEN <your_huggingface_token>
FROM meta-llama/Llama-3.2-1B-Instruct

PARAMETER device cpu
PARAMETER max_length 64
PARAMETER learning_rate 3e-5
PARAMETER epochs 4
PARAMETER batch_size 1
PARAMETER lora true
PARAMETER lora_r 8
PARAMETER lora_alpha 32
PARAMETER lora_dropout 0.05
PARAMETER lora_target_modules q_proj,v_proj

SYSTEM You are a helpful assistant.
MESSAGE user How can I reset my password?
MESSAGE assistant To reset your password, click on 'Forgot Password' at the login screen and follow the instructions.
```

- `HF_TOKEN` is your Hugging Face access token (required for private models).
- `FROM` specifies the base model to finetune.
- `PARAMETER` lines set training options (see above for examples).
- `SYSTEM` and `MESSAGE` blocks provide training data.

### 2. Run Finetuning

Use the `finetune` command to start training:

```bash
easy-edge finetune --modelfile Modelfile --output my-finetuned-model --epochs 4 --batch-size 1 --learning-rate 3e-5
```

- `--modelfile` is the path to your Modelfile.
- `--output` is where the trained model will be saved.
- You can override epochs, batch size, and learning rate on the command line.

### 3. Convert to GGUF (for llama.cpp)

After training, you will see instructions to convert your model to GGUF format for use with llama.cpp:

```bash
python3 convert_hf_to_gguf.py --in my-finetuned-model --out my-finetuned-model.gguf
```

Upload your GGUF file to Hugging Face or use it locally with Easy Edge.

### Notes
- Finetuning is resource-intensive. For best results, use a machine with a GPU.
- LoRA/PEFT is supported for efficient finetuning.
- See the example Modelfile in the repository for more options.

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request

## License

MIT License - see LICENSE file for details.

## Acknowledgments

- [llama.cpp](https://github.com/ggerganov/llama.cpp) - The underlying inference engine
- [Ollama](https://ollama.ai/) - Inspiration for the tool design
- [Hugging Face](https://huggingface.co/) - Model hosting and distribution 
