Metadata-Version: 2.1
Name: octopolars
Version: 0.1.3
Summary: Pull, filter, and walk a GitHub user's repositories with Polars.
Keywords: fsspec,github,polars,repositories
Author-Email: Louis Maddox <louismmx@gmail.com>
License: MIT
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python
Project-URL: Documentation, https://octopolars.vercel.app/
Project-URL: Homepage, https://github.com/lmmx/octopols
Project-URL: Repository, https://github.com/lmmx/octopols.git
Requires-Python: >=3.11
Requires-Dist: click>=8.1.8
Requires-Dist: fsspec[github]>=2025.2.0
Requires-Dist: platformdirs>=4.3.6
Requires-Dist: pygithub>=2.5.0
Requires-Dist: requests>=2.32.3
Requires-Dist: universal-pathlib>=0.2.6
Provides-Extra: dev
Requires-Dist: pdm-bump>=0.9.10; extra == "dev"
Requires-Dist: pdm>=2.22.3; extra == "dev"
Requires-Dist: pre-commit>=4.1.0; extra == "dev"
Requires-Dist: pytest>=8.3.4; extra == "dev"
Provides-Extra: docs
Requires-Dist: livereload>=2.7.1; extra == "docs"
Requires-Dist: mkdocs-extra-sass-mathshim>=0.1.0; extra == "docs"
Requires-Dist: mkdocs-material[imaging,recommended]>=9.5.2; extra == "docs"
Requires-Dist: mkdocs-section-index>=0.3.8; extra == "docs"
Requires-Dist: mkdocs>=1.5.3; extra == "docs"
Requires-Dist: mkdocstrings[python]>=0.24.0; extra == "docs"
Requires-Dist: ruff>=0.9.5; extra == "docs"
Requires-Dist: urllib3<2; extra == "docs"
Provides-Extra: polars
Requires-Dist: polars>=1.21.0; extra == "polars"
Provides-Extra: polars-lts-cpu
Requires-Dist: polars-lts-cpu>=1.21.0; extra == "polars-lts-cpu"
Description-Content-Type: text/markdown

# octopolars

[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv)
[![pdm-managed](https://img.shields.io/badge/pdm-managed-blueviolet)](https://pdm.fming.dev)
[![PyPI](https://img.shields.io/pypi/v/octopolars.svg)](https://pypi.org/project/octopolars)
[![Supported Python versions](https://img.shields.io/pypi/pyversions/octopolars.svg)](https://pypi.org/project/octopolars)
<!-- [![downloads](https://static.pepy.tech/badge/octopolars/month)](https://pepy.tech/project/octopolars) -->
[![License](https://img.shields.io/pypi/l/octopolars.svg)](https://pypi.python.org/pypi/octopolars)
[![pre-commit.ci status](https://results.pre-commit.ci/badge/github/lmmx/octopolars/master.svg)](https://results.pre-commit.ci/latest/github/lmmx/octopolars/master)

Pull, filter, and walk a GitHub user's repositories with Polars.

## Installation

```bash
pip install octopolars
```

> The `polars` dependency is required but not included in the package by default.
> It is shipped as an optional extra which can be activated by passing it in square brackets:
>
> ```bash
> pip install octopolars[polars]
> ```
>
> If you need compatibility with older CPUs, install:
> ```bash
> pip install octopolars[polars-lts-cpu]
> ```

### Requirements

- Python 3.9+
- [gh](https://cli.github.com/) GitHub CLI tool, for a [PAT](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens)
  to avoid rate limits and enable file listings.
  - **alternatively** set the `GITHUB_TOKEN` environment variable as your GitHub token.

`octopolars` is supported by these tools:

- [Polars](https://www.pola.rs/) for efficient data filtering and output formatting.
- [PyGithub](https://github.com/PyGithub/PyGithub) (for GitHub API access to enumerate the repos)
- [fsspec](https://github.com/fsspec/filesystem_spec) within [Universal Pathlib](https://github.com/fsspec/universal_pathlib)
  which provides a `github://` protocol that enables enumerating files in GitHub repos as if they were local file paths.

## Features

- **GitHub repo enumeration**: Retrieve user’s public repos (caching results to speed up repeated calls).
- **Apply filters**: Use either raw Polars expressions or a shorthand DSL (e.g. `{name}.str.starts_with("foo")`) to filter repos.
- **File tree walking**: Enumerate all files in each repository using `fsspec[github]`, supporting recursion and optional size filters.
- **Output formats**: Display data in a Polars repr table (which can be [read back in](https://docs.pola.rs/api/python/stable/reference/api/polars.from_repr.html))
  or export to CSV/JSON/NDJSON.
- **Control table size**: Limit the number of rows or columns displayed, or use `--short` mode to quickly preview data.
- **Caching**: By default, results are cached in the user’s cache directory to avoid repeated API calls (unless you force refresh).

## Usage

### Command-Line Interface

```bash
Usage: octopols [OPTIONS] USERNAME

  octopols - A CLI for listing GitHub repos or files by username, with
  optional recursion, table formatting, and Polars-based filtering.

    By default, rows and cols are unlimited (-1). Use --short/-s to switch to
    a minimal view.

    The --filter/-f flag (if provided) applies a Polars expression or DSL
    expression   (e.g., '{name}.str.startswith("a")') to the DataFrame of
    items.

    Examples:

      octopols lmmx

      octopols lmmx -f '{name}.str.startswith("a")'

      octopols lmmx -w --filter='pl.col("filename").str.contains("test")'

Options:
  -F, --files               List files (default lists repos).
  -R, --recursive           Recursively list items (repos or files).
  -o, --output-format TEXT  Output format: table, csv, json, or ndjson.
  -r, --rows INTEGER        Number of table rows to show. Default -1 means
                            show all.
  -c, --cols INTEGER        Number of table columns to show. Default -1 means
                            show all.
  -s, --short               Short mode: overrides --rows and --cols by setting
                            both to None.
  -f, --filter TEXT         A Polars expression or a shorthand DSL expression.
                            In the DSL, use {column} to refer to
                            pl.col('column'), e.g.
                            '{name}.str.startswith("a")'
  --help                    Show this message and exit.
```

#### Example 1: List All Repos for a User

```bash
octopols lmmx --short
```

Displays a table of all repositories belonging to "lmmx" in short format.

```
shape: (226, 9)
┌────────────────────────┬────────────────┬─────────────────────────────────┬──────────┬───┬────────┬───────┬───────┬───────┐
│ name                   ┆ default_branch ┆ description                     ┆ archived ┆ … ┆ issues ┆ stars ┆ forks ┆ size  │
│ ---                    ┆ ---            ┆ ---                             ┆ ---      ┆   ┆ ---    ┆ ---   ┆ ---   ┆ ---   │
│ str                    ┆ str            ┆ str                             ┆ bool     ┆   ┆ i64    ┆ i64   ┆ i64   ┆ i64   │
╞════════════════════════╪════════════════╪═════════════════════════════════╪══════════╪═══╪════════╪═══════╪═══════╪═══════╡
│ 2020-viz               ┆ master         ┆                                 ┆ false    ┆ … ┆ 10     ┆ 0     ┆ 0     ┆ 1459  │
│ 3dv                    ┆ master         ┆ Some 3D file handling with num… ┆ false    ┆ … ┆ 0      ┆ 0     ┆ 0     ┆ 21096 │
│ AbbrevJ                ┆ master         ┆ JS journal abbreviation genera… ┆ true     ┆ … ┆ 0      ┆ 0     ┆ 0     ┆ 724   │
│ AdaBins                ┆ main           ┆ Official implementation of Ada… ┆ false    ┆ … ┆ 0      ┆ 1     ┆ 0     ┆ 560   │
│ advent-of-code-2017    ┆ master         ┆ Advent of Code 2017             ┆ true     ┆ … ┆ 0      ┆ 0     ┆ 0     ┆ 32    │
│ …                      ┆ …              ┆ …                               ┆ …        ┆ … ┆ …      ┆ …     ┆ …     ┆ …     │
│ whisper                ┆ main           ┆                                 ┆ false    ┆ … ┆ 0      ┆ 1     ┆ 0     ┆ 3118  │
│ wikitransp             ┆ master         ┆ Dataset of transparent images … ┆ false    ┆ … ┆ 3      ┆ 0     ┆ 0     ┆ 58    │
│ wotd                   ┆ master         ┆ Analysis of WOTD data from Twe… ┆ false    ┆ … ┆ 1      ┆ 1     ┆ 0     ┆ 92    │
│ wwdc-21-3d-obj-capture ┆ master         ┆ "TakingPicturesFor3DObjectCapt… ┆ false    ┆ … ┆ 1      ┆ 1     ┆ 0     ┆ 2861  │
│ YouCompleteMe          ┆ master         ┆ A code-completion engine for V… ┆ false    ┆ … ┆ 0      ┆ 1     ┆ 0     ┆ 32967 │
└────────────────────────┴────────────────┴─────────────────────────────────┴──────────┴───┴────────┴───────┴───────┴───────┘
```

#### Example 2: Filter Repos by Name

```bash
octopols lmmx -f '{name}.str.contains("demo")'
```

Uses the DSL expression to select only repositories with “demo” in the repo name.

```
shape: (9, 9)
┌──────────────────────────────┬────────────────┬─────────────────────────────────┬──────────┬─────────┬────────┬───────┬───────┬───────┐
│ name                         ┆ default_branch ┆ description                     ┆ archived ┆ is_fork ┆ issues ┆ stars ┆ forks ┆ size  │
│ ---                          ┆ ---            ┆ ---                             ┆ ---      ┆ ---     ┆ ---    ┆ ---   ┆ ---   ┆ ---   │
│ str                          ┆ str            ┆ str                             ┆ bool     ┆ bool    ┆ i64    ┆ i64   ┆ i64   ┆ i64   │
╞══════════════════════════════╪════════════════╪═════════════════════════════════╪══════════╪═════════╪════════╪═══════╪═══════╪═══════╡
│ aiohttp-demos                ┆ master         ┆ Demos for aiohttp project       ┆ false    ┆ true    ┆ 0      ┆ 0     ┆ 0     ┆ 45445 │
│ demopyrs                     ┆ master         ┆ Demo Python/Rust extension lib… ┆ false    ┆ false   ┆ 0      ┆ 0     ┆ 0     ┆ 18    │
│ importstring_demo            ┆ master         ┆ Demo of deptry inability to de… ┆ false    ┆ false   ┆ 0      ┆ 1     ┆ 0     ┆ 7     │
│ pyd2ts-demo                  ┆ master         ┆ Demo of a Pydantic model conve… ┆ false    ┆ false   ┆ 0      ┆ 0     ┆ 0     ┆ 761   │
│ react-htmx-demo              ┆ master         ┆ Demo app combining HTMX and Re… ┆ false    ┆ false   ┆ 0      ┆ 0     ┆ 0     ┆ 2     │
│ self-serve-demo              ┆ master         ┆ Python package auto-generated … ┆ false    ┆ false   ┆ 1      ┆ 1     ┆ 0     ┆ 14    │
│ sphinx-type-annotations-demo ┆ master         ┆ [Resolved] A demo of how to bu… ┆ false    ┆ false   ┆ 1      ┆ 0     ┆ 0     ┆ 39    │
│ uv-doc-url-demo              ┆ master         ┆ Proof-of-concept for extractin… ┆ false    ┆ false   ┆ 0      ┆ 0     ┆ 0     ┆ 27    │
│ uv-ws-demo                   ┆ master         ┆ A simple demo of the new works… ┆ false    ┆ false   ┆ 0      ┆ 1     ┆ 0     ┆ 15    │
└──────────────────────────────┴────────────────┴─────────────────────────────────┴──────────┴─────────┴────────┴───────┴───────┴───────┘
```

#### Example 3: Walk an Entire Repo

```bash
octopols lmmx -f '{name} == "mvdef"' --walk --short
```

Lists all files in the repository named "mvdef", abbreviating the output table in 'short' format.

```
shape: (121, 4)
┌─────────────────┬─────────────────────────────────┬──────────────┬─────────────────┐
│ repository_name ┆ file_path                       ┆ is_directory ┆ file_size_bytes │
│ ---             ┆ ---                             ┆ ---          ┆ ---             │
│ str             ┆ str                             ┆ bool         ┆ i64             │
╞═════════════════╪═════════════════════════════════╪══════════════╪═════════════════╡
│ mvdef           ┆ .github                         ┆ true         ┆ 0               │
│ mvdef           ┆ .github/CONTRIBUTING.md         ┆ false        ┆ 3094            │
│ mvdef           ┆ .github/workflows               ┆ true         ┆ 0               │
│ mvdef           ┆ .github/workflows/master.yml    ┆ false        ┆ 3398            │
│ mvdef           ┆ .gitignore                      ┆ false        ┆ 204             │
│ …               ┆ …                               ┆ …            ┆ …               │
│ mvdef           ┆ tools                           ┆ true         ┆ 0               │
│ mvdef           ┆ tools/github                    ┆ true         ┆ 0               │
│ mvdef           ┆ tools/github/install_miniconda… ┆ false        ┆ 353             │
│ mvdef           ┆ tox.ini                         ┆ false        ┆ 1251            │
│ mvdef           ┆ vercel.json                     ┆ false        ┆ 133             │
└─────────────────┴─────────────────────────────────┴──────────────┴─────────────────┘
```

#### Example 4: Filter Repos by Name, List all Files

```bash
octopols lmmx -f '{name}.str.starts_with("d3") --walk'
```

Lists *all* files in every repository whose (repo) name starts with "d3" owned by "lmmx", as a table of file paths.

```
shape: (12, 4)
┌────────────────────┬───────────────────────────────┬──────────────┬─────────────────┐
│ repository_name    ┆ file_path                     ┆ is_directory ┆ file_size_bytes │
│ ---                ┆ ---                           ┆ ---          ┆ ---             │
│ str                ┆ str                           ┆ bool         ┆ i64             │
╞════════════════════╪═══════════════════════════════╪══════════════╪═════════════════╡
│ d3-step-functions  ┆ README.md                     ┆ false        ┆ 62              │
│ d3-step-functions  ┆ d3-step-function-diagram.html ┆ false        ┆ 498             │
│ d3-step-functions  ┆ d3.v7.min.js                  ┆ false        ┆ 278580          │
│ d3-step-functions  ┆ wd-style.css                  ┆ false        ┆ 35              │
│ d3-step-functions  ┆ wd_sample_data.json           ┆ false        ┆ 1053            │
│ d3-step-functions  ┆ wiring-diagram.js             ┆ false        ┆ 9264            │
│ d3-wiring-diagrams ┆ README.md                     ┆ false        ┆ 66              │
│ d3-wiring-diagrams ┆ d3-wiring-diagram.html        ┆ false        ┆ 484             │
│ d3-wiring-diagrams ┆ d3.v7.min.js                  ┆ false        ┆ 278580          │
│ d3-wiring-diagrams ┆ wd-style.css                  ┆ false        ┆ 35              │
│ d3-wiring-diagrams ┆ wd_sample_data.json           ┆ false        ┆ 1053            │
│ d3-wiring-diagrams ┆ wiring-diagram.js             ┆ false        ┆ 7482            │
└────────────────────┴───────────────────────────────┴──────────────┴─────────────────┘
```

### Library Usage

You can also import `octopols.Inventory` directly:

```python
from octopols import Inventory

inv = Inventory(username="lmmx")
repos_df = inv.list_repos()
```

If you want to apply a filter expression programmatically and walk the file trees:

```python
inv = Inventory(username="lmmx", repo_filter=pl.col("name").str.contains("demo"))
files_df = inv.walk_file_trees()
```

## Project Structure

- `cli.py`: Defines the CLI (`octopols`) with all available options and flags.
- `inventory.py`: Core logic for retrieving repos, walking file trees, caching, and applying filters.

## Contributing

1. **Issues & Discussions**: Please open a GitHub issue or discussion for bugs, feature requests, or questions.
2. **Pull Requests**: PRs are welcome!
   - Ensure you have [pdm](https://pdm.fming.dev/latest/) installed for local development.
   - Run tests (when available) and include updates to docs or examples if relevant.

## License

This project is licensed under the [MIT License](https://opensource.org/licenses/MIT).

---

Maintained by [lmmx](https://github.com/lmmx). Contributions welcome!
