Metadata-Version: 2.4
Name: denes-router-classifier
Version: 0.1.0
Summary: BERT-based domain classifier for Denes chatbot routing (CPU-optimized)
Author: Denes Team
License: MIT
Keywords: bert,chatbot,classification,nlp,routing
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: pydantic<3.0,>=2.0
Requires-Dist: torch<3.0,>=2.0
Requires-Dist: transformers<5.0,>=4.40
Provides-Extra: dev
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.3.0; extra == 'dev'
Description-Content-Type: text/markdown

# Denes Router Classifier

**BERT-based domain classifier for intelligent chatbot routing (CPU-optimized)**

A lightweight Python package that uses a fine-tuned BERT multilingual model to classify user queries into three domains for the Denes chatbot system:

- **Web_Search**: Explicit search requests (requires Tavily API)
- **TI**: Technical support queries (internal routing)
- **Generic**: General knowledge questions (LLM-answerable, no web search needed)

## Features

- **Fast CPU Inference**: 10-50ms latency on modern CPUs (MacBook Air M3: ~20ms)
- **Singleton Pattern**: Model loaded once and cached for all subsequent calls
- **Multi-language Support**: Based on `google-bert/bert-base-multilingual-uncased`
- **High Accuracy**: 100% accuracy on a synthetic held-out test set (230 examples; model trained on 1,500+ examples)
- **Zero Network Overhead**: Direct library integration (not a microservice)
- **Pydantic Types**: Fully typed API with Pydantic schemas

## Installation

### From Local Directory (Development)

```bash
cd ../denes-backend-python  # Navigate to your backend project
uv add ../denes-router-trainning/denes-router-classifier
```

### From Git (Future)

```bash
uv add git+https://github.com/yourusername/denes-router-classifier.git
```

## Quick Start

### CLI Testing (Without Installation)

Test the package directly from the command line:

```bash
# Interactive mode (recommended for testing)
python classify_cli.py interactive

# Classify a single query
python classify_cli.py classify "Busca el clima en Asunción"

# With custom threshold
python classify_cli.py classify "No tengo internet" --threshold 0.5

# Multi-label classification
python classify_cli.py classify "Busca ayuda para mi PC" --multi-label

# Batch classification from file
printf '%s\n' "Busca el clima" "No tengo internet" "¿Qué es Python?" > queries.txt
python classify_cli.py batch queries.txt

# Show version and model info
python classify_cli.py version
```

**Interactive Mode Example:**

```
$ python classify_cli.py interactive

Denes Router Classifier v0.1.0
Threshold: 0.7 | Type 'exit' or 'quit' to stop

Loading model... ✓ (662ms)

Query: Busca el clima en Asunción
  → Web_Search (97.8%, 36.2ms)

Query: No tengo internet
  → TI (99.8%, 35.1ms)

Query: exit

Goodbye!
```

### Single Classification

```python
from denes_router_classifier import classify_domain

# Explicit search query
result = classify_domain("Busca el clima en Asunción")
print(result.primary)       # "Web_Search"
print(result.confidence)    # 0.9984
print(result.includes_generic)  # False

# Technical support query
result = classify_domain("No tengo internet")
print(result.primary)       # "TI"
print(result.confidence)    # 0.9964

# General knowledge question
result = classify_domain("¿Cuál es la capital de Francia?")
print(result.primary)       # "Generic"
print(result.confidence)    # 0.92
```

### Multi-label Classification

```python
from denes_router_classifier import classify_domain

# Return all domains above threshold
result = classify_domain(
    "Busca ayuda para mi PC",
    threshold=0.5,
    multi_label=True
)

print(result.primary)  # "Web_Search"
print(result.all_predictions)
# [
#   {"domain": "Web_Search", "confidence": 0.78},
#   {"domain": "TI", "confidence": 0.65}
# ]
```

### Batch Classification

```python
from denes_router_classifier import classify_batch

texts = [
    "Busca el clima",
    "No tengo internet",
    "¿Cuál es la capital de Francia?"
]

results = classify_batch(texts, batch_size=32)

for text, result in zip(texts, results):
    print(f"{text} → {result.primary} ({result.confidence:.2%})")

# Output:
# Busca el clima → Web_Search (99.84%)
# No tengo internet → TI (99.64%)
# ¿Cuál es la capital de Francia? → Generic (92.00%)
```

## Integration with Backend

### Replace LLM-based Classification

In your `denes-backend-python/src/services/orchestrator.py`:

```python
from denes_router_classifier import classify_domain

# BEFORE (expensive LLM call):
# classified_domain = await domain_classifier.classify(
#     message=request.message,
#     current_domain=current_domain,
#     history=history,
#     model_name=resolved_model.name,
# )

# AFTER (fast BERT classification):
result = classify_domain(
    text=request.message,
    threshold=0.7,
    multi_label=False
)
classified_domain = result.primary

logger.info(
    "🎯 Domain classified (BERT)",
    domain=classified_domain,
    confidence=result.confidence,
    includes_generic=result.includes_generic
)

# Optional: includes_generic is True when confidence < threshold,
# so this is the place to fall back to the Generic (LLM) route
if result.includes_generic:
    logger.warning("Low confidence, consider Generic fallback")
```
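
Because the previous classifier was awaited, the orchestrator is presumably asynchronous, while `classify_domain()` is a synchronous, CPU-bound call. One option, sketched below under that assumption, is to run it in a worker thread with `asyncio.to_thread` so the event loop stays responsive. `fake_classify_domain` and `classify_in_thread` are hypothetical names used only to keep the example self-contained.

```python
import asyncio
import time


def fake_classify_domain(text: str, threshold: float = 0.7) -> str:
    """Hypothetical stand-in for classify_domain(): blocks briefly, like inference."""
    time.sleep(0.02)
    return "Web_Search" if text.lower().startswith("busca") else "Generic"


async def classify_in_thread(text: str) -> str:
    # Off-load the blocking call so other coroutines keep running meanwhile.
    return await asyncio.to_thread(fake_classify_domain, text, 0.7)


async def main() -> list[str]:
    # gather() preserves the order of its arguments in the result list.
    return await asyncio.gather(
        classify_in_thread("Busca el clima en Asunción"),
        classify_in_thread("¿Cuál es la capital de Francia?"),
    )


domains = asyncio.run(main())
print(domains)  # ['Web_Search', 'Generic']
```

Whether the extra thread hop is worthwhile depends on how many concurrent requests the backend serves; for a single request at ~20ms the direct call may be simpler.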

## API Reference

### `classify_domain()`

```python
def classify_domain(
    text: str,
    threshold: float = 0.7,
    multi_label: bool = False,
    log_latency: bool = False
) -> ClassificationResult:
    """Classify text into domain(s) for chatbot routing.

    Args:
        text: Input text to classify (user query)
        threshold: Confidence threshold (default: 0.7)
        multi_label: Return all domains above threshold (default: False)
        log_latency: Log inference time (default: False)

    Returns:
        ClassificationResult with primary domain, confidence, and metadata
    """
```

### `classify_batch()`

```python
def classify_batch(
    texts: list[str],
    threshold: float = 0.7,
    multi_label: bool = False,
    batch_size: int = 32
) -> list[ClassificationResult]:
    """Classify multiple texts in batch for better throughput.

    Args:
        texts: List of input texts
        threshold: Confidence threshold (default: 0.7)
        multi_label: Return all domains above threshold (default: False)
        batch_size: Batch size for inference (default: 32)

    Returns:
        List of ClassificationResult objects
    """
```
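
`batch_size` presumably controls how many texts go through the model per forward pass. A minimal sketch of that chunking, using the hypothetical helper `iter_batches` (not part of the package API):

```python
from collections.abc import Iterator


def iter_batches(texts: list[str], batch_size: int = 32) -> Iterator[list[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


batches = list(iter_batches([f"query {i}" for i in range(70)], batch_size=32))
print([len(b) for b in batches])  # [32, 32, 6]
```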

### `ClassificationResult`

```python
from typing import Optional

from pydantic import BaseModel


class ClassificationResult(BaseModel):
    primary: str               # Primary domain (highest confidence)
    confidence: float          # Confidence score (0-1)
    includes_generic: bool     # True if confidence < threshold
    all_predictions: Optional[list[PredictionDetail]]  # Multi-label mode
```
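
How `threshold` and `multi_label` might translate into these fields can be sketched in plain Python. This is an assumption inferred from the field comments above, not the package's implementation: `decide` and the dataclass `Result` are hypothetical, and the per-domain scores are taken as given rather than computed by a model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Result:
    """Hypothetical dataclass mirroring the ClassificationResult fields."""

    primary: str
    confidence: float
    includes_generic: bool
    all_predictions: Optional[list[dict]] = None


def decide(
    scores: dict[str, float], threshold: float = 0.7, multi_label: bool = False
) -> Result:
    """Turn per-domain confidence scores (0-1) into a result object."""
    ranked = sorted(scores.items(), key=lambda pair: pair[1], reverse=True)
    primary, confidence = ranked[0]
    predictions = None
    if multi_label:
        # Keep every domain whose score clears the threshold.
        predictions = [
            {"domain": d, "confidence": c} for d, c in ranked if c >= threshold
        ]
    return Result(primary, confidence, confidence < threshold, predictions)


scores = {"Web_Search": 0.78, "TI": 0.65, "Generic": 0.10}
single = decide(scores)
multi = decide(scores, threshold=0.5, multi_label=True)
print(single.primary, single.includes_generic)       # Web_Search False
print([p["domain"] for p in multi.all_predictions])  # ['Web_Search', 'TI']
```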

## Domain Definitions

### Web_Search

Explicit search requests with verbs like "busca", "encuentra", "investiga", "search", "find".

**Examples:**

- "Busca el clima en Asunción"
- "Encuentra información sobre Python"
- "Search for the latest news"

**Action:** Route to Tavily API for web search (paid)

### TI (Technical Support)

Technical support queries about hardware, software, network, or system issues.

**Examples:**

- "No tengo internet"
- "Mi computadora no enciende"
- "Error al instalar Python"

**Action:** Route to internal TI support system (free)

### Generic

General knowledge questions answerable by the LLM without web search.

**Examples:**

- "¿Cuál es la capital de Francia?"
- "Explica qué es Python"
- "¿Cómo se dice 'hello' en español?"

**Action:** Route to LLM (OSS 120B) for direct answer (free)
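
The three actions above amount to a dispatch on the predicted domain. A minimal sketch, assuming hypothetical handler functions (`search_web`, `route_to_ti`, `ask_llm`) that stand in for the Tavily, TI support, and LLM integrations. Unknown labels fall back to the LLM here, which is a design choice of this sketch rather than documented behaviour.

```python
from typing import Callable


def search_web(query: str) -> str:
    return f"[tavily] {query}"  # placeholder for the paid web search call


def route_to_ti(query: str) -> str:
    return f"[ti] {query}"  # placeholder for the internal TI support system


def ask_llm(query: str) -> str:
    return f"[llm] {query}"  # placeholder for the direct LLM answer


HANDLERS: dict[str, Callable[[str], str]] = {
    "Web_Search": search_web,
    "TI": route_to_ti,
    "Generic": ask_llm,
}


def route(domain: str, query: str) -> str:
    """Call the handler for the predicted domain, defaulting to the LLM."""
    handler = HANDLERS.get(domain, ask_llm)
    return handler(query)


print(route("TI", "No tengo internet"))  # [ti] No tengo internet
```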

## Performance

### Latency

- **MacBook Air M3 (CPU)**: ~20ms per query
- **Target**: < 100ms on modern CPUs
- **Throughput**: 10+ req/s single-threaded
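
Latency figures like these can be reproduced with a small timing harness. The sketch below uses `time.perf_counter` and a hypothetical `measure_latency_ms` helper; `dummy_classifier` is a stand-in, so swap in `classify_domain` to measure the real model. Warm-up calls are excluded so that one-time model loading does not skew the median.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(
    fn: Callable[[str], object], query: str, runs: int = 20, warmup: int = 3
) -> dict[str, float]:
    """Time repeated calls to fn(query) and summarize them in milliseconds."""
    for _ in range(warmup):
        fn(query)  # warm-up calls (e.g. model load) are not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(query)
        samples.append((time.perf_counter() - start) * 1000.0)
    return {"median_ms": statistics.median(samples), "max_ms": max(samples)}


def dummy_classifier(text: str) -> str:
    """Hypothetical stand-in; replace with classify_domain for real numbers."""
    time.sleep(0.001)
    return "Generic"


stats = measure_latency_ms(dummy_classifier, "No tengo internet")
print(f"median {stats['median_ms']:.1f} ms, max {stats['max_ms']:.1f} ms")
```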

### Accuracy (Test Set - 230 examples)

- **Accuracy**: 100%
- **Macro F1**: 1.00
- **Precision**: 1.00
- **Recall**: 1.00

**Note:** Test set is synthetic. Real-world performance expected: 90-98% accuracy.
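
For reference, accuracy and macro F1 can be computed from label lists as shown here. The labels are made-up illustrations, not the package's test data, and `accuracy` and `macro_f1` are hypothetical helper names.

```python
def accuracy(y_true: list[str], y_pred: list[str]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: list[str], y_pred: list[str]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f1_scores.append(2 * precision * recall / total if total else 0.0)
    return sum(f1_scores) / len(f1_scores)


y_true = ["Web_Search", "TI", "Generic", "Generic"]
y_pred = ["Web_Search", "TI", "Generic", "TI"]
print(accuracy(y_true, y_pred))            # 0.75
print(round(macro_f1(y_true, y_pred), 4))  # 0.7778
```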

### Model Details

- **Base Model**: `google-bert/bert-base-multilingual-uncased`
- **Parameters**: 167M
- **Training**: 1577 examples (500+ per domain)
- **Validation**: 15% held-out test set
- **Training Time**: ~20s on MacBook Air M3 (CPU)

## Deployment

### HuggingFace Hub (Recommended)

The model is too large for GitHub (638MB). Use HuggingFace Hub for automatic download:

```bash
# 1. Upload model (one time)
python upload_to_hf.py  # Edit USERNAME first!

# 2. Production deployment
pip install huggingface_hub
huggingface-cli login  # One time only

# Model downloads automatically from HF Hub on first use
python -c 'from denes_router_classifier import classify_domain; print(classify_domain("test"))'
```

**See:** [HUGGINGFACE_SETUP.md](HUGGINGFACE_SETUP.md) for complete guide

### Alternative: Git LFS

See [DEPLOYMENT.md](DEPLOYMENT.md) for Git LFS and other deployment options.

## Requirements

- Python >= 3.10
- torch >= 2.0
- transformers >= 4.40
- pydantic >= 2.0

## Development

### Testing

```bash
cd denes-router-classifier
uv sync --dev
uv run pytest tests/
```

### Linting

```bash
uv run ruff check src/
uv run ruff format src/
```

## Cost Savings

Using BERT classification instead of an LLM call for routing gives:

- **80%+ of queries** classified as Generic (no Tavily API cost)
- **Fast inference**: No waiting for LLM response for routing
- **Single service**: No microservice deployment complexity

## License

MIT License

## Contributing

This package is part of the Denes chatbot training repository:
[denes-router-trainning](https://github.com/yourusername/denes-router-trainning)

For questions or issues, please open an issue in the main repository.

## Changelog

### 0.1.0 (2025-11-13)

- Initial release
- BERT multilingual classifier with 3 domains
- CPU-optimized inference with singleton pattern
- Batch classification support
- Full type hints with Pydantic
