Metadata-Version: 2.4
Name: colab-filtering
Version: 0.1.1
Summary: Simple package for collaborative filtering
Author: Henning Schmies
Author-email: henning.schmies@stud.th-deg.de
License: MIT License
        
        Copyright (c) 2023 Henning Schmies
        
        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:
        
        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.
        
        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
License-File: LICENSE.txt
Keywords: big data,collaborative filtering,filtering,recommendation
Classifier: Development Status :: 4 - Beta
Classifier: Programming Language :: Python
Requires-Python: >=3.12
Requires-Dist: pandas<2.3,>=2.2.0
Description-Content-Type: text/markdown

# Colab-Filtering

A simple Python package for collaborative filtering in recommendation systems.

## Description

Colab-Filtering provides tools for implementing collaborative filtering techniques in recommendation systems. It includes functions for data normalization and similarity calculations, which are essential components in building recommendation engines.

## Features

- **Normalization**: Mean-centering of utility matrices
- **Similarity Metrics**: 
  - Cosine similarity for numeric ratings
  - Jaccard similarity for binary data

## Installation

```bash
pip install colab_filtering
```

## Requirements

- Python >= 3.12
- pandas >= 2.2.0, < 2.3
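
The package metadata declares `Requires-Python: >=3.12` and `Requires-Dist: pandas<2.3,>=2.2.0`. As a rough illustration, an environment could be checked against those bounds as sketched below. The helper names `python_ok` and `pandas_ok` are hypothetical and are not part of this package.

```python
import sys


def python_ok(version_info=sys.version_info) -> bool:
    """Return True if the interpreter is Python 3.12 or newer."""
    return tuple(version_info[:2]) >= (3, 12)


def pandas_ok(version: str) -> bool:
    """Return True if a pandas version string falls in >=2.2.0,<2.3.

    Only the major and minor components are compared, which is
    sufficient for this particular range.
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (2, 2) <= (major, minor) < (2, 3)
```

In practice the second helper would be called with `pandas.__version__`.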

## Usage

### Basic Example

```python
import pandas as pd
from norm.mean import mean_norm
from similarity.cosine import cosine_similarity

# Create a utility matrix (items x users: movies as rows, users as columns)
ratings = [
    {'user_id': 1, 'movie': 'Matrix', 'rating': 5},
    {'user_id': 1, 'movie': 'Titanic', 'rating': 3},
    # ... more ratings
]
df = pd.DataFrame(ratings)
utility = df.pivot_table(index='movie', columns='user_id', values='rating')

# Apply mean normalization
utility_norm = mean_norm(utility)

# Calculate cosine similarity between items
cosine_sim = cosine_similarity(utility_norm)
```
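
For readers who want to see what these two steps compute, here is a minimal sketch in plain pandas and NumPy. It is not the package's implementation: the function names `mean_center_rows` and `cosine_between_rows` are hypothetical, centering along rows is an assumption (the package may center along the other axis), and treating missing ratings as 0 when computing the similarity is likewise an assumption.

```python
import numpy as np
import pandas as pd


def mean_center_rows(utility: pd.DataFrame) -> pd.DataFrame:
    """Subtract each row's mean (NaN values ignored) from that row."""
    return utility.sub(utility.mean(axis=1), axis=0)


def cosine_between_rows(matrix: pd.DataFrame) -> pd.DataFrame:
    """Pairwise cosine similarity between rows; missing values count as 0."""
    filled = matrix.fillna(0.0).to_numpy(dtype=float)
    norms = np.linalg.norm(filled, axis=1)
    dots = filled @ filled.T
    denom = np.outer(norms, norms)
    # A row with zero norm has no defined direction, so report NaN for it.
    with np.errstate(divide="ignore", invalid="ignore"):
        sim = np.where(denom > 0, dots / denom, np.nan)
    return pd.DataFrame(sim, index=matrix.index, columns=matrix.index)


# Tiny illustrative matrix: two movies (rows) rated by two users (columns).
demo = pd.DataFrame({1: [5.0, 3.0], 2: [3.0, 5.0]}, index=["Matrix", "Titanic"])
centered = mean_center_rows(demo)
sim = cosine_between_rows(centered)
```

In this toy matrix the two movies receive exactly opposite centered ratings, so their similarity is -1, while each movie's similarity with itself is 1.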

### Using Jaccard Similarity for Binary Data

```python
import pandas as pd
from similarity.jaccard import jaccard_similarity

# Create a binary utility matrix (1 for rated, 0 for not rated).
# `utility` is the pivoted DataFrame built in the Basic Example above.
binary_utility = utility.notna().astype(int)

# Calculate Jaccard similarity
jaccard_sim = jaccard_similarity(binary_utility)
```
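
The Jaccard similarity of two sets is the size of their intersection divided by the size of their union. The sketch below shows that definition applied row by row to a binary matrix. It is an illustration only: `jaccard_between_rows` is a hypothetical name, and the package's own `jaccard_similarity` may differ in orientation and in how it handles empty rows.

```python
import numpy as np
import pandas as pd


def jaccard_between_rows(binary: pd.DataFrame) -> pd.DataFrame:
    """Pairwise Jaccard similarity between the rows of a 0/1 matrix."""
    b = binary.to_numpy(dtype=bool).astype(int)
    intersection = b @ b.T                      # shared 1s per row pair
    counts = b.sum(axis=1)                      # number of 1s per row
    union = counts[:, None] + counts[None, :] - intersection
    # Two empty rows have an empty union; report NaN rather than divide by 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        sim = np.where(union > 0, intersection / union, np.nan)
    return pd.DataFrame(sim, index=binary.index, columns=binary.index)


# Rows are movies, columns are users; 1 means the user rated the movie.
demo_binary = pd.DataFrame(
    {1: [1, 1, 0], 2: [1, 0, 0], 3: [0, 1, 1]},
    index=["Matrix", "Titanic", "Alien"],
)
demo_jaccard = jaccard_between_rows(demo_binary)
```

Here "Matrix" and "Titanic" share one rater out of three distinct raters, giving 1/3, while "Titanic" and "Alien" share one of two, giving 0.5.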

## Module Descriptions

### norm

- **mean.py**: Provides functions for mean normalization of utility matrices.

### similarity

- **cosine.py**: Implements cosine similarity calculations between users or items.
- **jaccard.py**: Implements Jaccard similarity calculations, useful for binary data.

## Author

- Henning Schmies (henning.schmies@stud.th-deg.de)

## License

This project is licensed under the MIT License. See the LICENSE.txt file for the full text.

## Keywords

big data, collaborative filtering, recommendation, filtering