Metadata-Version: 2.4
Name: ollama-cli
Version: 0.2.0
Summary: Simple command line tool that reads a text from stdin and pipes it to Ollama. One can set all Ollama options on commandline as well as define termination criteria in terms of maximum number of lines, paragraphs, or repeated lines.
Project-URL: Homepage, https://github.com/DrMicrobit/ollama-cli
Project-URL: GitHub, https://github.com/DrMicrobit/ollama-cli
Project-URL: Issues, https://github.com/DrMicrobit/ollama-cli/issues
Author-email: Bastien Chevreux <bach@chevreux.org>
License-Expression: MIT
License-File: LICENSE
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: End Users/Desktop
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.13
Requires-Dist: dm-ollamalib>=0.2.0
Requires-Dist: dm-streamvalve>=0.1.1
Requires-Dist: httpx>=0.28.1
Requires-Dist: ollama>=0.4.8
Description-Content-Type: text/markdown

# ollama-cli

Simple command line tool that reads a text from stdin and pipes it to Ollama. One can set all Ollama options on command line as well as define termination criteria in terms of maximum number of lines, paragraphs, or repeated lines.

Nothing stellar, but quite useful. Uses [dm-streamvalve](https://github.com/DrMicrobit/dm-streamvalve) and [dm-ollamalib](https://github.com/DrMicrobit/dm-ollamalib) to provide main functionality.

# Installation
If you haven't done so already, please install [uv](https://docs.astral.sh/uv/) as this Python package and project manager basically makes all headaches of Python package management go away in an instant.

Simply type `uv tool install ollama-cli` and your are good to go!

When a new version of Ollama or ollama-cli is published, do `uv tool upgrade ollama-cli` to pick up new Ollama options to be set on the command line.

# Usage / command line options

```
options:
  -h, --help            show this help message and exit
  --opthelp             show a list of Ollama options that can be set via
                        --opts and exit.
  --optdesc             show a list of Ollama options and descriptions (if
                        available) that can be set via --opts and exit.

Ollama setup options:
  --sysmsg TXT          In case no --sysin (see below) given, the Ollama model
                        will get this text as SYSTEM message. Default: "You
                        are a helpful assistant. Answer the request of the
                        user succinctly and diligently. Do not repeat the task
                        given to you or announce your result."
  --sysin FILENAME      Name of a text file with an Ollama SYSTEM msg to prime
                        the model. Overrides --sysmsg (see above)
  --model NAME          Use Ollama model <NAME>. Default:
                        llama3.1:8b-instruct-q8_0

  --opts OPTS           Semicolon separated list of options for Ollama. E.g.:
                        --options="num_ctx=16384;temperature=0.0" Default: ""

Early termination options:
  --max-linerepeats INT
                        Used to prevent models eventually getting stuck in
                        endless loops of repeated lines. If >0, stop after
                        this number of non-blank lines that are exact repeats
                        of previous lines. Lines do not need to be following
                        each other to be spotted as repeats. Default: 3
  --max-lines INT       To prevent endless output. If >0, stop after this
                        number of lines. Default: 200
  --max-linetokens INT  To prevent endless output in a single line. If >0, stop
                        after this number of tokens if no newline was encountered.
                        Default: 200
  --max-paragraphs INT  To prevent endless diverse output. If >0, stop after
                        this number of paragraphs. Default: 0

Output options:
  --tostderr            Redirect the streaming monitoring output to stderr.
                        The final result will be output to stdout once
                        completed. This is useful in combination with
                        termination options --max_* where, in case the
                        termination criterion triggered, stdout will contain
                        the output without the line which led to the
                        termination.

Connection options:
  --host HOST           The default empty string will connect to
                        'localhost:11434' where Ollama is usually installed.
                        Set this to connect to any other Ollama server you
                        have access to. Default: ""
```

# Usage examples

## Default usage examples
```sh
echo "Why is the sky blue? Write an article without headlines" | ollama-cli
```

Note: ollama-cli uses *llama3.1:8b-instruct-q8_0* as default model, which I found to be a good compromise between speed, memory usage, accuracy, and text generation time. In case you want to use other models, set them like so in the command line:

```sh
echo "Why is the sky blue? Write an article without headlines" | ollama-cli --model="llama3.2"
```

## Setting Ollama options examples
Easy. Put the options in a string, separated by semicolon `;`. Like this:
```sh
echo "Why is the sky blue? Write an article without headlines" | ollama-cli --opts="temperature=0.5;num_ctx=4096"
```

In case you do not remember which options are available and what their type is, ollama-cli can help you. You can get either a quick overview

```sh
ollama-cli --opthelp
```

which produces output like this:
```
                numa : bool
             num_ctx : int
           num_batch : int
...
```

or get more details like this:

```sh
ollama-cli --optdesc
```

which produces output like this:
```
numa : bool 
This parameter seems to be new, or not described in docs as of January 2025.
dm_ollamalib does not know it, sorry.

num_ctx : int 
Sets the size of the context window used to generate the next token. (Default: 2048)

...
```
> [!IMPORTANT]
> The Ollama option names and types will always be as up-to-date as the Ollama Python module used. But as the description texts are not provided by anywhere by Ollama Python, they were scraped from official Ollama and Ollama Python documentation. Alas, not all the parameters are explained there.

## Early termination examples
Sometimes models produce way more output than you wanted. Or get stuck in endless loops.

You can terminate the output of Ollama prematurely by either number of lines, number of tokens in a single line,
number of paragraphs or number of exact line repeats.

> [!NOTE]
> While the normal output of Ollama appears on stdout, reasons for terminations will be shown by `ollama-cli` on stderr. That allows you to redirect the normal output to a file or pipe it to other commands without having to think about removing the termination info.

### Maximum number of lines
Contrived example, terminating the output after just two lines:
```sh
echo "List the name of 10 animals. Output as dashed list." | ollama-cli --max-lines=2
```

The output (both stdout and stderr) of the above could look like this:
```
- Lion
- Elephant

Reading from Ollama model stopped early.
Criterion: StopCriterion.MAX_LINES
Message: Maximum number of lines reached.
Stopped at token/line: '-'
```

### Maximum number of tokens in a line
Qwen3:8b was the first model I encountered that gave me endless output without newlines, which made 
this termination criterion necessary. The default value of 3000 whould be enough for ~1000 - 3000
words (depending on model), which is way longer than any single line should be. However, as Qwen3
likes to write whole paragraphs in a single line (especially in the \<thinking\> output), this feels
like a reasonable bound.

Contrived example, terminating the output when the length of a line exceeds 10 tokens:
```sh
echo "Enumerate 5 animals in a list, then describe what is a house." | ollama-cli --max-linetokens=10
```

### Maximum number of paragraphs
Terminating the output after two paragraphs:

```sh
echo "Why is the sky blue? Write an article without headlines" | ollama-cli --max-paragraphs=2
```

### Maximum number of repeated lines
Some models sometimes get stuck and produce never-ending output repeating itself. I've seen this with requests like *"extract all acronyms from the text in a dashed list"*. For this, `--max-linerepeats` can alleviate the problem.

Contrived example:
```sh
echo "List the name of 20 animals. Mention the zebra at least 4 times across the list. Output as dashed list" | ollama-cli --max-linerepeats=2
```

The output of the above might look like this:
```
- Zebra
- Giraffe
- Zebra
- Dolphin
- Kangaroo
- Zebra

Reading from Ollama model stopped early.
Criterion: StopCriterion.MAX_LINEREPEATS
Message: Maximum number of exact repeated lines reached.
Stopped at token/line: '- Zebra\n'
```
> [!IMPORTANT]
> On screen, but also in file in case you redirected the stdout output, you will see 3 'Zebra' although you just asked for maximum of 2 via `--max_linerepeats`. Why? The reason is that ollama-cli streams each token as it receives it, but checking for duplicate lines can be done only once an end of line is received.
> In case you really want only the 'clean' output, redirect the monitoring output to stderr via `--tostderr`. In this case, the output on stdout will be written at the end and not contain the line which led to termination. E.g.:
> ```sh
> echo "List the name of 20 animals. Mention the zebra at least 4 times across the list. Output as dashed list" | ollama-cli --max-linerepeats=2 --tostderr >animals.txt
> ```
> The file 'animals.txt' will contain the 'clean' output.

# Notes
The GitHub repository comes with all files I currently use for Python development across multiple platforms. Notably:

- configuration of the Python environment via `uv`: pyproject.toml and uv.lock
- configuration for linter and code formatter `ruff`: ruff.toml
- configuration for `pylint`: .pylintrc
- configuration for `mypy`: .mypy.ini
- configuration for `pytest` (though no tests are currently defined as this is a straightforward CLI tool): pytest.ini
- git ignore files: .gitignore
- configuration for `pre-commit`: .pre-commit-config.yaml. The script used to check `git commit` summary message is in devsupport/check_commitsummary.py
- configuration for VSCode editor: .vscode directory
