Metadata-Version: 2.4
Name: folder_indexer
Version: 0.2.3
Summary: A Python script to index a large folder structure into a parquet file, along with metadata
Project-URL: Homepage, https://github.com/RecRanger/folder-indexer-py
Project-URL: Issues, https://github.com/RecRanger/folder-indexer-py/issues
Author-email: RecRanger <RecRanger+package@proton.me>
License-Expression: BSD-3-Clause
License-File: LICENSE
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Requires-Python: >=3.10
Requires-Dist: loguru>0.7.0
Requires-Dist: polars>1.30
Requires-Dist: python-magic>=0.4.27
Requires-Dist: tqdm>=4.65.0
Description-Content-Type: text/markdown

# folder-indexer-py
A Python script that indexes a large folder structure into a parquet file, recording metadata for each entry.

## Description

This script is useful for finding files on a reasonably slow disk, such as one holding backups,
especially when you aren't sure exactly which files you are searching for.

Use tools like DBeaver and DuckDB to query and explore the generated index.

## Usage

```bash
uv tool install folder_indexer

folder_indexer -i /path/to/input/folder -o /path/to/output/folder
```

## Metadata Indexed and Output

The output parquet file (`file_index.parquet`) has the following columns:

* file_path
* folder_path
* file_name
* file_size_bytes
* entry_kind
* md5_hex
* sha256_base64
* date_created
* date_modified
* magic_file_type_1
* first_100_bytes
* last_100_bytes
* timestamp_crawled
* indexing_start_timestamp
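
To make the hash and byte-sample columns concrete, here is a rough sketch of how values like these could be computed with the Python standard library. It is an assumption based on the column names alone (a hex-encoded MD5 digest, a base64-encoded SHA-256 digest, and the first and last 100 bytes of the file) and is not necessarily how this package computes them. The helper names `hash_columns` and `head_and_tail` are hypothetical.

```python
import base64
import hashlib
from pathlib import Path


def hash_columns(path: Path, chunk_size: int = 1024 * 1024) -> dict[str, str]:
    """Hash a file in chunks so large files are never loaded fully into memory.

    Hypothetical helper; the encodings are inferred from the column names.
    """
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            md5.update(chunk)
            sha256.update(chunk)
    return {
        "md5_hex": md5.hexdigest(),
        "sha256_base64": base64.b64encode(sha256.digest()).decode("ascii"),
    }


def head_and_tail(path: Path, n: int = 100) -> tuple[bytes, bytes]:
    """Return the first and last n bytes of a file (they overlap for small files)."""
    size = path.stat().st_size
    with path.open("rb") as f:
        first = f.read(n)
        f.seek(max(size - n, 0))  # clamp so files shorter than n bytes still work
        last = f.read(n)
    return first, last
```

Values produced this way can be compared against the `md5_hex` and `sha256_base64` columns to check whether a file on hand matches an indexed entry, provided the package uses the same encodings.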
