Metadata-Version: 2.1
Name: tassadar
Version: 0.1.0
Summary: 
Author: nelly
Author-email: nelly@buzzni.com
Requires-Python: >=3.10,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Description-Content-Type: text/markdown

# GPU Inference Batch Processing with Asyncio

## Overview

The goal is to handle multiple incoming requests by grouping them into batches and processing each batch efficiently using GPU resources. By using `asyncio`, we can handle asynchronous requests and batch them without blocking the main event loop, optimizing GPU utilization and reducing overhead.

## Benefits

- **Efficiency**: Batch processing reduces the overhead per request, making the use of GPU resources more efficient.
- **Scalability**: The system can handle a large number of concurrent requests, which is critical in high-demand environments.
- **Flexibility**: The batch size and other parameters can be adjusted based on the workload and GPU capacity, allowing for dynamic scaling.

## How to Use


