Metadata-Version: 2.3
Name: mcq
Version: 1.2.0
Summary: A library that employs multiple-choice questions to guide LLM outputs by constraining unstable generated answers to one of the predefined options.
License: MIT
Author: Tomo Kanazawa
Requires-Python: >=3.10
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Dist: numpy (>=2.2.4,<3.0.0)
Requires-Dist: openai (>=1.70.0,<2.0.0)
Description-Content-Type: text/markdown

# MCQ (Multiple-Choice Question)

A Python library that constrains LLM outputs to one of the predefined options through semantic matching. MCQ ensures LLM responses are classified into expected categories even when exact string matches aren't found.

## Why MCQ?

Working with LLMs often requires classifying their responses into predefined categories, but doing so by hand is harder than it looks:

- LLMs don't always respond with exact matches to expected options
- String matching alone is fragile and error-prone
- Error handling becomes complex when outputs don't match expectations

Semantic similarity gives a more robust basis for classification. MCQ tries an exact match first and falls back to semantic similarity, so every response maps to one of the expected options.

## Use Cases

- **AI Agent Memory Management**: Classify new information as "new", "duplicate", or "update"
- **Decision Trees**: Route LLM responses through conditional branches in pipelines
- **Intent Classification**: Map varied user expressions to predefined intents
- **Response Normalization**: Standardize diverse LLM outputs to consistent formats
- **Multiple-Choice Testing**: Evaluate LLM performance on multiple-choice questions

## Installation

```bash
pip install mcq
```

## Usage

```python
from mcq import Client

# Initialize with your answer options
mcq = Client(choices=["new information", "duplicate information", "updated information"])

llm_answer = "This appears to be a new piece of data I haven't seen before"

# Classify an answer
result = mcq.classify(llm_answer)
print(result)  # "new information"

# Inexact answers are mapped to the semantically closest option
result = mcq.classify("I already have this in my memory")
print(result)  # "duplicate information"
```

## How It Works

1. **Initialization**: Creates embeddings for all predefined answer options using OpenAI's embedding model
2. **Classification**:
   - First attempts exact string matching (case-insensitive)
   - If no match, computes the embedding for the input answer
   - Calculates cosine similarity between the answer embedding and all option embeddings
   - Returns the option with the highest similarity score

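Because the option embeddings are computed once during initialization, they are never recomputed at classification time. An answer that matches an option exactly needs no embedding request at all; any other answer needs a single embedding request for the input, followed by a similarity comparison against the stored option embeddings.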
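
The steps above can be sketched in a few lines. This is an illustration only, not the library's actual implementation: `SketchClassifier`, `toy_embed`, and `cosine_similarity` are hypothetical names, and `toy_embed` is a bag-of-words stand-in for OpenAI's embedding model so that the example runs offline with only the standard library.

```python
import math
from collections import Counter
from typing import Callable

Vector = dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for an embedding model: a word-count vector."""
    return {word: float(n) for word, n in Counter(text.lower().split()).items()}


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either one is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class SketchClassifier:
    """Illustrative only: exact match first, then nearest option by cosine."""

    def __init__(self, choices: list[str],
                 embed: Callable[[str], Vector] = toy_embed) -> None:
        if not choices:
            raise ValueError("choices must not be empty")
        self.choices = list(choices)
        self.embed = embed
        # Initialization: embed every option once and keep the vectors.
        self.choice_vectors = [embed(choice) for choice in self.choices]
        # Lookup table for the case-insensitive exact match.
        self.exact = {choice.lower(): choice for choice in self.choices}

    def classify(self, answer: str) -> str:
        # Exact (case-insensitive) match: no embedding needed.
        exact_hit = self.exact.get(answer.lower())
        if exact_hit is not None:
            return exact_hit
        # Otherwise embed the answer and pick the most similar option.
        vector = self.embed(answer)
        scores = [cosine_similarity(vector, cv) for cv in self.choice_vectors]
        best = max(range(len(scores)), key=scores.__getitem__)
        return self.choices[best]


classifier = SketchClassifier(
    ["new information", "duplicate information", "updated information"]
)
print(classifier.classify("NEW INFORMATION"))
print(classifier.classify("this looks like a duplicate"))
```

With a real embedding model in place of `toy_embed`, the answer would not need to share any words with an option. In this sketch, ties (including the case where nothing overlaps) resolve to the first option listed, which is a design choice of the sketch rather than documented library behavior.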

## Requirements

- Python 3.10+
- OpenAI API key set as an environment variable (`OPENAI_API_KEY`)

## License

MIT

