Metadata-Version: 2.1
Name: pdf-aggregator
Version: 0.0.1
Summary: Aggregate account PDF statements into JSON and visualize aggregated financial data as a timeline
Home-page: https://github.com/finetjul/pdf-aggregator
Author: Julien Finet
Author-email: julien.finet@kitware.com
License: UNKNOWN
Keywords: pdf aggregate extract banking financial statement
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: ISC License (ISCL)
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown

# PDF aggregator

Aggregate account PDF statements into JSON and visualize the aggregated financial data as a timeline.

![PDF aggregator](https://raw.githubusercontent.com/finetjul/pdf-aggregator/master/docs/pdf-aggregator.svg)

Works offline. It relies on [tika](https://tika.apache.org/) for PDF parsing and [matplotlib](https://matplotlib.org/) for plotting.
Regular expressions stored in simple configuration files are used to extract each bank statement's balance, date, account number, and so on.

## Installation

```
pip install -r requirements.txt
```

## Usage

### Aggregate
Scan PDF files and aggregate financial data into an accounts.json summary file:

```
python aggregate.py path/to/folder/with/PDF
```

or

```
python aggregate.py path/to/file.pdf
```

Run with ```--help``` for more options.

### Add a new config

First, run the script on a sample statement with ```--test```:

```
python aggregate.py path/to/PDF/file --test
```

This should print the text content of the PDF. Then test a regular expression against it:

```
python aggregate.py path/to/PDF/file --test 'Ending balance on (\d+)/(\d+)/(\d+)'
```
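
To iterate on a pattern before passing it to ```--test```, it can help to try it in plain Python first. The snippet below is a minimal sketch using only the standard `re` module; the sample text is made up for illustration and stands in for what ```--test``` prints, and the meaning of each captured group (month, day, year or another order) depends on the bank's date format.

```python
import re

# Same pattern as in the command above: three groups of digits
# separated by slashes, following the literal "Ending balance on ".
PATTERN = re.compile(r"Ending balance on (\d+)/(\d+)/(\d+)")

# Hypothetical statement text (an assumption, not real tool output).
sample_text = """Account summary
Ending balance on 03/31/2021   $1,234.56
"""

match = PATTERN.search(sample_text)
if match:
    # Each parenthesized group is returned as a string.
    print(match.groups())  # ('03', '31', '2021')
else:
    print("no match: adjust the pattern")
```

If the pattern does not match, compare it character by character with the printed PDF text: extracted text often contains extra spaces or line breaks that the original layout hides.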

You can then create a conf file and test detection with ```-vvv```:

```
python aggregate.py path/to/PDF/file -vvv
```


### Plot
Plot aggregated data:

```
python plot.py path/to/folder/with/multiple/accounts.json
```

or

```
python plot.py path/to/accounts.json
```

Run with ```--help``` for more options.

Example:

```
python.exe .\plot.py .\accounts\ --subtotals --no_real_estate_appreciation
```


