Metadata-Version: 2.1
Name: emb3d
Version: 0.1.105
Summary: emb3d.co command line interface to work with embeddings.
Author: Akhil Ravidas
Author-email: ar@mod0.ai
Requires-Python: >=3.9,<4.0
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Requires-Dist: aiolimiter (>=1.1.0,<2.0.0)
Requires-Dist: cohere (>=4.27,<5.0)
Requires-Dist: httpx (>=0.25.0,<0.26.0)
Requires-Dist: numpy
Requires-Dist: openai (>=0.28.1,<0.29.0)
Requires-Dist: pyyaml (>=6.0.1,<7.0.0)
Requires-Dist: sentence-transformers (>=2.2.2,<3.0.0)
Requires-Dist: tiktoken (>=0.5.1,<0.6.0)
Requires-Dist: tokenizers (>=0.14.0,<0.15.0)
Requires-Dist: typer[all] (>=0.9.0,<0.10.0)
Description-Content-Type: text/markdown

# emb3d

`emb3d` is a command-line utility that lets you generate embeddings using models from OpenAI, Cohere and Hugging Face.

## Installation

```sh
pip install --upgrade emb3d
```

## Quick Start ⚡️

### Install the library

```sh
pip install -U emb3d
```

### Prepare your input file

emb3d expects a JSONL file as input. Each line of the file should be a JSON object with a `text` key. Example input file:

```json
{"text": "I love my dog"}
{"text": "I love my cat"}
{"text": "I love my rabbit"}
```

Input files can optionally include other fields, such as IDs or categorical labels. These fields are saved as-is in the final output file.
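
If your data lives in Python, a minimal sketch like the one below can produce a file in this format. The helper name `write_jsonl` and the extra `id` and `label` fields are illustrative assumptions, not part of `emb3d`; the only requirement taken from the description above is that each line is a JSON object with a `text` key.

```python
import json
from pathlib import Path


def write_jsonl(records, path):
    """Write one JSON object per line, checking that each has a string `text` field."""
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for i, record in enumerate(records, start=1):
            if not isinstance(record.get("text"), str):
                raise ValueError(f"record {i} has no string 'text' field")
            # ensure_ascii=False keeps non-ASCII text readable in the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


if __name__ == "__main__":
    # Hypothetical records: `id` and `label` are optional extra fields.
    records = [
        {"id": 1, "label": "dog", "text": "I love my dog"},
        {"id": 2, "label": "cat", "text": "I love my cat"},
        {"id": 3, "label": "rabbit", "text": "I love my rabbit"},
    ]
    write_jsonl(records, "inputs.jsonl")
```

Validating the `text` field while writing surfaces malformed records before any embedding requests are made.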

### Compute embeddings

The default model is OpenAI's `text-embedding-ada-002`. You can change the model by passing the `--model-id` flag.

```sh
emb3d compute inputs.jsonl
```

You will need to have `OPENAI_API_KEY` set in your environment. You can also pass the key as a flag (`--api-key`) or set it in a config file:

```sh
emb3d config set openai_token YOUR-OPENAI-API-KEY
emb3d compute inputs.jsonl
```

```sh
emb3d compute inputs.jsonl --model-id embed-english-v2.0 --output-file cohere-embeddings.jsonl
```

For Cohere models, such as `embed-english-v2.0` in the command above, you will need to have `COHERE_API_KEY` set in your environment. You can also pass the key as a flag (`--api-key`) or set it in a config file with: `emb3d config set cohere_token YOUR-COHERE-API-KEY`.
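
When `emb3d` is called from a script, it can help to check the environment before starting a run. The sketch below is one possible pre-flight check. The helper `missing_api_keys` and the `PROVIDER_ENV_VARS` mapping are hypothetical names, not part of `emb3d`; only the two environment variable names come from the text above. Keys stored with `emb3d config set` are not visible to this check.

```python
import os

# Environment variable names as documented above; the provider labels are illustrative.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "cohere": "COHERE_API_KEY",
}


def missing_api_keys(providers, env=None):
    """Return the environment variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [
        PROVIDER_ENV_VARS[provider]
        for provider in providers
        if not env.get(PROVIDER_ENV_VARS[provider])
    ]


if __name__ == "__main__":
    missing = missing_api_keys(["openai", "cohere"])
    if missing:
        print("Not set in the environment:", ", ".join(missing))
```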


### Visualize your embeddings 💥

The last step is to visualize your embeddings. The following command opens a browser window showing your most recently computed embeddings.

```sh
emb3d visualize
```

You can alternatively pass the path to the computed embeddings file:

```sh
emb3d visualize run-2020-embeddings.jsonl
```
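
To inspect a computed file before visualizing it, a rough sketch like the following may help. It assumes the output is JSONL, as the file name suggests. The name of the field that holds the embedding vector is not stated in this README, so the hypothetical helper `vector_fields` detects numeric list fields instead of assuming a key.

```python
import json


def vector_fields(path):
    """Map each field of the first record that holds a non-empty list of numbers to its length."""
    with open(path, encoding="utf-8") as f:
        first = json.loads(f.readline())
    return {
        key: len(value)
        for key, value in first.items()
        if isinstance(value, list)
        and value
        # bool is a subclass of int in Python, so exclude it explicitly.
        and all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in value)
    }
```

Calling `vector_fields("run-2020-embeddings.jsonl")` returns a dictionary of candidate embedding fields and their dimensions, which makes it easy to confirm that the run produced vectors of the expected size.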

### Profit 💰

## Usage

```
 Usage: emb3d [OPTIONS] INPUT_FILE COMMAND [ARGS]...

 Generate embeddings for fun and profit.

╭─ Arguments ───────────────────────────────────────────────────────────────────────────────╮
│ *    input_file      PATH  Path to the input file. [default: None] [required]             │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Options ─────────────────────────────────────────────────────────────────────────────────╮
│ --model-id                                     TEXT     ID of the embedding model.        │
│                                                         Default is                        │
│                                                         `text-embedding-ada-002`.         │
│                                                         [default: None]                   │
│ --output-file              -out,-o             PATH     Path to the output file. If not   │
│                                                         provided, a default path will be  │
│                                                         suggested.                        │
│                                                         [default: None]                   │
│ --api-key                                      TEXT     API key for the backend. If not   │
│                                                         provided, it will be prompted or  │
│                                                         fetched from environment          │
│                                                         variables.                        │
│                                                         [default: None]                   │
│ --remote                            --local             Choose whether to do inference    │
│                                                         locally or with an API token.     │
│                                                         This choice is available for      │
│                                                         sentence transformer and hugging  │
│                                                         face models. If a model cannot be │
│                                                         run locally (ex: OpenAI models),  │
│                                                         this flag is ignored.             │
│                                                         [default: remote]                 │
│ --max-concurrent-requests                      INTEGER  (Remote Execution) Maximum number │
│                                                         of concurrent requests for the    │
│                                                         embedding task. Default is 1000.  │
│                                                         [default: 1000]                   │
│ --help                                                  Show this message and exit.       │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
╭─ Commands ────────────────────────────────────────────────────────────────────────────────╮
│ config           Get or set a configuration value.                                        │
╰───────────────────────────────────────────────────────────────────────────────────────────╯
```

