Metadata-Version: 2.1
Name: clippie
Version: 0.0.1
Summary: A small application to see product data
Home-page: https://github.com/mloyanich/online-retail-app
Author: Masha Loianych
Author-email: m.loianych@gmail.com
Project-URL: Bug Tracker, https://github.com/mloyanich/online-retail-app/issues
Keywords: Retail Data application
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: uvicorn
Requires-Dist: fastapi
Requires-Dist: pandas
Requires-Dist: pyspark
Requires-Dist: openpyxl

# online-retail-app

This is a FastAPI application that accepts a product description and returns the 10 most similar products found in the transaction data.
The application uses the [Online Retail dataset](https://archive.ics.uci.edu/ml/datasets/online+retail) from the UCI Machine Learning Repository.
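
This README does not state how similarity is computed, so the following is only a minimal sketch of the general idea: rank product descriptions against a query and keep the 10 best matches with a score above zero. It assumes Jaccard word overlap as a stand-in similarity measure; the function names (`tokenize`, `jaccard`, `top_similar`) and the sample descriptions are hypothetical and not taken from the application code.

```python
import heapq
import re


def tokenize(text):
    """Lowercase a description and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two word sets: |a & b| / |a | b|."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def top_similar(query, descriptions, k=10):
    """Return up to k (score, description) pairs with score > 0, best first.

    Assumption: word overlap stands in for whatever measure the app uses.
    """
    query_words = tokenize(query)
    scored = ((jaccard(query_words, tokenize(d)), d) for d in descriptions)
    positive = [(score, d) for score, d in scored if score > 0.0]
    return heapq.nlargest(k, positive)


# Hypothetical sample data, not taken from the real dataset.
sample = ["white metal lantern", "red heart lantern", "blue mug"]
for score, description in top_similar("white lantern", sample):
    print(f"{score:.2f}  {description}")
```

Descriptions that share no words with the query score zero and are dropped, so the result can contain fewer than 10 items.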

## Install

```bash
pip install clippie
```

Alternatively, install the application in editable mode from a local checkout of the repository:

```bash
pip install -e .
```

## Run

Run the application with the following command:

```bash
clippie
```

or

```bash
python3 src/clippie/main.py
```

On startup, the application loads the sample dataset located in the `data` folder.

## API endpoints

Available endpoints:

- `/docs` - GET - API documentation
- `/product` - GET - returns the list of products
- `/product?search=coala` - GET - finds products relevant to the provided description
- `/pipeline` - POST
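
As an illustration of calling the search endpoint from a script, here is a minimal client sketch using only the standard library. The base URL `http://127.0.0.1:8000` and the JSON response format are assumptions (this README does not state the host, port, or response shape), and `build_search_url` and `search_products` are hypothetical helper names.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: adjust to the address where the application is listening.
BASE_URL = "http://127.0.0.1:8000"


def build_search_url(description, base_url=BASE_URL):
    """Build the /product URL with the description as the `search` parameter."""
    query = urlencode({"search": description})
    return f"{base_url}/product?{query}"


def search_products(description, base_url=BASE_URL):
    """Send the GET request and decode the body, assuming it is JSON."""
    with urlopen(build_search_url(description, base_url)) as response:
        return json.load(response)
```

With the application running, `search_products("coala")` would send the same request as the `/product?search=coala` example above. `urlencode` escapes spaces and special characters, so multi-word descriptions can be passed as-is.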

## TODO

- [ ] publish to PyPI
- [ ] package the Java JAR file needed to open Excel files with PySpark
- [ ] enable tempfile
- [ ] GET pipeline to see the number of pipelines that have been executed
- [ ] add history of pipeline execution to GET pipeline
- [ ] what will happen if pipelines run in parallel?
- [ ] GitHub Action
- [ ] Stop Spark on program termination
- [ ] BUG - Spark is loaded twice!
- [ ] add debug mode
- [x] run the application
- [x] precalculate number of words in entire dataset
- [x] what if I pass a new word to product GET?
- [x] product should be more than 0.0
- [x] BUG - product search stopped working
