Metadata-Version: 2.4
Name: transtractor
Version: 0.9.2
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Financial and Insurance Industry
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Classifier: Topic :: Office/Business :: Financial
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Dist: pdfplumber>=0.11.8,<0.12.0
License-File: LICENSE
Summary: Universal PDF bank statement parsing library
Keywords: pdf,bank,statement,parser,transaction,extraction,csv
Author-email: Daniel Weber <develop@transtractor.net>
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://github.com/transtractor/transtractor-lib
Project-URL: Documentation, https://transtractor-lib.readthedocs.io
Project-URL: Repository, https://github.com/transtractor/transtractor-lib
Project-URL: Issues, https://github.com/transtractor/transtractor-lib/issues

# The Transtractor

![PyPI version](https://img.shields.io/pypi/v/transtractor)
![Development Status](https://img.shields.io/pypi/status/transtractor)
![Tests](https://github.com/transtractor/transtractor-lib/actions/workflows/tests.yml/badge.svg)
![codecov](https://codecov.io/gh/transtractor/transtractor-lib/branch/main/graph/badge.svg)
![License](https://img.shields.io/github/license/transtractor/transtractor-lib)

## Universal PDF bank statement parsing
The Transaction Extractor, or 'Transtractor', aspires to be a universal 
library for extracting transaction data from PDF bank statements. Key features:

* Written in Rust (fast)
* Python API (user friendly)
* AI-free (lightweight)
* Rules-based extraction (deterministic: the same statement always yields the same output)


## Installation

### Install from PyPI

Transtractor is available on PyPI and can be installed with pip:

```bash
pip install transtractor
```

**Requirements**: Python 3.9 or higher

### Compile from source

1. **Install Rust**: Download and install Rust from [rustup.rs](https://rustup.rs/)

2. **Install Maturin**: Install the Python build tool for Rust extensions
   ```bash
   pip install maturin
   ```

3. **Build and install Transtractor**: Clone the repository and build
   ```bash
   git clone https://github.com/transtractor/transtractor-lib.git
   cd transtractor-lib
   maturin develop --release
   ```

## Basic usage

1. **Import and initialise the parser**
   ```python
   from transtractor import Parser

   parser = Parser()
   ```

2. **Convert PDF to CSV**: All CSV files are written in a standard format
   ```python
   parser.parse('statement.pdf').to_csv('statement.csv')
   ```

3. **Convert PDF to DataFrame**: Load into a DataFrame for analysis
   ```python
   import pandas as pd

   data = parser.parse('statement.pdf').to_pandas_dict()
   df = pd.DataFrame(data)
   ```
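
When several statements need to be analysed together, their dictionaries can be merged before building the DataFrame. The sketch below assumes that `to_pandas_dict()` returns a mapping of column names to equal-length lists, which is the shape `pd.DataFrame` accepts in step 3. `combine_column_dicts` is a hypothetical helper written for illustration, not part of the Transtractor API.

```python
def combine_column_dicts(column_dicts):
    """Concatenate column-oriented dicts that all share the same columns.

    Assumption: each item maps column names to lists of equal length,
    as produced by parser.parse(...).to_pandas_dict().
    """
    combined = {}
    for index, columns in enumerate(column_dicts):
        if index == 0:
            # Copy the lists so the caller's first dict is not modified.
            combined = {name: list(values) for name, values in columns.items()}
            continue
        if set(columns) != set(combined):
            raise ValueError("statement %d has different columns" % index)
        for name, values in columns.items():
            combined[name].extend(values)
    return combined
```

For example, `pd.DataFrame(combine_column_dicts(parser.parse(p).to_pandas_dict() for p in paths))` would then give a single DataFrame covering every file in `paths`.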

## Advanced usage
See the [documentation](https://transtractor-lib.readthedocs.io/en/latest/) maintained on Read the Docs.

## Supported statements
See the documentation for a current list of [supported statements](https://transtractor-lib.readthedocs.io/en/latest/supported_statements.html). You may also
create your own parsing configuration file by following these [instructions](https://transtractor-lib.readthedocs.io/en/latest/configuration.html)
and loading it as follows:

```python
from transtractor import Parser

parser = Parser()
parser.load('my_config.json')
parser.parse('statement.pdf').to_csv('statement.csv')
```
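
Once a parser is set up, the same `parse(...).to_csv(...)` call can be applied to a whole folder of statements. The following is a minimal sketch, not a library feature: `convert_directory` is a hypothetical helper, and because the exception types raised for unsupported statements are not assumed here, it catches `Exception` broadly and reports failures instead of stopping.

```python
from pathlib import Path


def convert_directory(parser, pdf_dir, csv_dir):
    """Convert every *.pdf in pdf_dir to a CSV of the same name in csv_dir.

    `parser` is any object offering parse(path).to_csv(path), such as a
    transtractor Parser. Returns (converted_csv_paths, failures).
    """
    out_dir = Path(csv_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    converted, failed = [], []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        csv_path = out_dir / (pdf_path.stem + ".csv")
        try:
            parser.parse(str(pdf_path)).to_csv(str(csv_path))
        except Exception as exc:  # error types are not assumed; record and continue
            failed.append((pdf_path.name, str(exc)))
        else:
            converted.append(csv_path)
    return converted, failed
```

Calling `convert_directory(parser, 'statements', 'csv')` with the parser created above would write one CSV per PDF and return the names of any files that could not be parsed.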

## Contributions
New and well-tested configuration files are especially welcome. Please
submit a pull request that adds them to the *python/transtractor/configs* directory, or
email them to develop@transtractor.net.

