Metadata-Version: 2.1
Name: figur
Version: 0.0.3
Summary: Figurenerkennung for German literary texts.
Home-page: https://github.com/severinsimmler/figur
Author: Severin Simmler
License: MIT
Platform: UNKNOWN
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Python: >=3.6.0
Description-Content-Type: text/markdown
Requires-Dist: flair (>=0.4.1)
Requires-Dist: gdown (>=3.7.3)
Requires-Dist: spacy (>=2.0.18)
Requires-Dist: pandas (>=0.24.1)


# Figurenerkennung for German literary texts

[![Build Status](https://travis-ci.com/severinsimmler/figur.svg?branch=master)](https://travis-ci.com/severinsimmler/figur)

An important step in the quantitative analysis of narrative texts is the automatic recognition of references to figures, i.e. literary characters (German: Figurenerkennung). This is a special case of the generic NLP problem of Named Entity Recognition (NER).

NER models are usually not designed for literary texts, which results in poor recall on them. This easy-to-use package continues the work of [Jannidis et al.](https://opus.bibliothek.uni-wuerzburg.de/opus4-wuerzburg/frontdoor/deliver/index/docId/14333/file/Jannidis_Figurenerkennung_Roman.pdf) using deep learning techniques.


## Installation

```
$ pip install figur
```


## Example

```python
>>> import figur
>>> text = "Der Gärtner entfernte sich eilig, und Eduard folgte bald."
>>> figur.tag(text)
   SentenceId      Token      Tag
0           0        Der        _
1           0    Gärtner  AppTdfW
2           0  entfernte        _
3           0       sich     Pron
4           0     eilig,        _
5           0        und        _
6           0     Eduard     Core
7           0     folgte        _
8           0      bald.        _
```
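
The tagged output can then be reduced to the tokens that refer to figures. The following is a minimal sketch, not part of the package: `figure_mentions` is a hypothetical helper, the rows are copied by hand from the output above, and it assumes that `_` marks tokens that do not refer to a figure. If `figur.tag` returns a pandas `DataFrame`, as the printed output suggests, `df.itertuples(index=False, name=None)` should yield rows in this shape.

```python
from collections import defaultdict

# Rows copied from the example output above: (SentenceId, Token, Tag).
rows = [
    (0, "Der", "_"),
    (0, "Gärtner", "AppTdfW"),
    (0, "entfernte", "_"),
    (0, "sich", "Pron"),
    (0, "eilig,", "_"),
    (0, "und", "_"),
    (0, "Eduard", "Core"),
    (0, "folgte", "_"),
    (0, "bald.", "_"),
]


def figure_mentions(rows, outside_tag="_"):
    """Group tagged tokens by tag, skipping tokens that carry the outside tag.

    Hypothetical helper; `outside_tag="_"` is an assumption read off the
    example output, not documented behaviour of the package.
    """
    mentions = defaultdict(list)
    for sentence_id, token, tag in rows:
        if tag != outside_tag:
            mentions[tag].append((sentence_id, token))
    return dict(mentions)


mentions = figure_mentions(rows)
print(mentions)
```

Grouping by tag keeps the different kinds of reference apart, so that, for example, only the `Core` mentions can be counted per text.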


## Figurenerkennung statistics

![Confusion Matrix](doc/confusion-matrix.svg)


