Metadata-Version: 2.4
Name: langchain-modal-gpu-ez
Version: 0.1.0
Summary: LangChain integration for modal-gpu-ez: serverless GPU inference with HuggingFace models
Project-URL: Repository, https://github.com/your-username/langchain-modal-gpu-ez
License: MIT
License-File: LICENSE
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.12
Requires-Dist: langchain-core<1.0.0,>=0.3.0
Requires-Dist: modal-gpu-ez<2.0.0,>=1.0.0
Description-Content-Type: text/markdown

# langchain-modal-gpu-ez

An integration package that makes [modal-gpu-ez](https://github.com/your-username/modal-gpu-ez) usable from LangChain.

It lets you use Modal serverless GPUs and HuggingFace models in LangChain chains, agents, and RAG pipelines with a single line of code.

## Installation

```bash
pip install langchain-modal-gpu-ez
```

## Usage

### LLM (text generation)

```python
from langchain_modal_gpu_ez import ModalGpuEzLLM

llm = ModalGpuEzLLM(model_id="distilgpt2", gpu="T4")
result = llm.invoke("Once upon a time")
print(result)
```

### Using it in a LangChain chain

```python
from langchain_core.prompts import PromptTemplate
from langchain_modal_gpu_ez import ModalGpuEzLLM

llm = ModalGpuEzLLM(model_id="distilgpt2", gpu="T4", max_new_tokens=100)
prompt = PromptTemplate.from_template("Tell me about {topic}")
chain = prompt | llm

result = chain.invoke({"topic": "open source"})
print(result)
```

### Embeddings

```python
from langchain_modal_gpu_ez import ModalGpuEzEmbeddings

embeddings = ModalGpuEzEmbeddings(
    model_id="BAAI/bge-small-en-v1.5",
    gpu="Local",  # embeddings are fast enough to run locally
)

vectors = embeddings.embed_documents(["hello world", "test sentence"])
query_vector = embeddings.embed_query("search query")
```

### Using it with a vector store

```python
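# FAISS is not a dependency of this package; it needs
# `pip install langchain-community faiss-cpu`.
from langchain_community.vectorstores import FAISS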
from langchain_modal_gpu_ez import ModalGpuEzEmbeddings

embeddings = ModalGpuEzEmbeddings()
vectorstore = FAISS.from_texts(
    ["Python is great", "LangChain is powerful"],
    embeddings,
)

docs = vectorstore.similarity_search("programming language")
```

### GPU selection

| GPU | VRAM | Price/hour |
|-----|------|----------|
| T4 | 16GB | $0.27 |
| L4 | 24GB | $0.59 |
| A10G | 24GB | $0.54 |
| A100 | 40GB | $1.64 |
| H100 | 80GB | $3.89 |
| `"Local"` | - | Free |
| `"auto"` | - | Selected automatically based on model size |
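
To make the `"auto"` row concrete, here is a minimal sketch of what size-based selection could look like. It is not the actual modal-gpu-ez logic, which this README does not describe. The function name `pick_gpu`, the 2-bytes-per-parameter (fp16) estimate, and the 1.2x overhead factor are all assumptions made for illustration; only the VRAM and price figures come from the table above.

```python
# Hypothetical sketch: NOT the real modal-gpu-ez "auto" implementation.
# (name, VRAM in GB, price in USD per hour), copied from the table above.
GPU_TABLE = [
    ("T4", 16, 0.27),
    ("L4", 24, 0.59),
    ("A10G", 24, 0.54),
    ("A100", 40, 1.64),
    ("H100", 80, 3.89),
]


def pick_gpu(param_count: float, bytes_per_param: int = 2, overhead: float = 1.2) -> str:
    """Return the cheapest GPU whose VRAM fits the estimated model footprint.

    The footprint estimate (parameters x bytes per parameter x overhead)
    is an assumed heuristic, not something the library documents.
    """
    needed_gb = param_count * bytes_per_param * overhead / 1e9
    candidates = [(price, name) for name, vram, price in GPU_TABLE if vram >= needed_gb]
    if not candidates:
        raise ValueError(f"no single GPU in the table fits ~{needed_gb:.1f} GB")
    # Tuples sort by price first, so min() yields the cheapest fitting GPU.
    return min(candidates)[1]


print(pick_gpu(82e6))  # distilgpt2-sized model (~82M parameters): T4
print(pick_gpu(7e9))   # ~16.8 GB estimated: A10G (cheaper than L4 at equal VRAM)
```

Under these assumptions a 7B-parameter model lands on the A10G rather than the L4, because both offer 24GB and the A10G is listed at a lower hourly price.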

## Environment variables

```bash
MODAL_TOKEN_ID=your_modal_token_id
MODAL_TOKEN_SECRET=your_modal_token_secret
HF_TOKEN=your_hf_token  # only needed for gated models
```
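
A missing token tends to surface only once a remote call is attempted, so it can help to check the variables up front. The sketch below is one way to do that with the standard library; `missing_env_vars` is a hypothetical helper written for this README, not part of the package.

```python
import os

# Variable names are taken from the list above.
REQUIRED_VARS = ("MODAL_TOKEN_ID", "MODAL_TOKEN_SECRET")


def missing_env_vars(env=None, gated_model=False):
    """Return the names of required variables that are unset or empty.

    Hypothetical helper, not provided by langchain-modal-gpu-ez.
    """
    env = os.environ if env is None else env
    required = list(REQUIRED_VARS)
    if gated_model:
        # HF_TOKEN is only needed when the model is gated.
        required.append("HF_TOKEN")
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
```

Pass `gated_model=True` to include `HF_TOKEN` in the check.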

## License

MIT
