Metadata-Version: 2.0
Name: ncgocr
Version: 1.0.2
Summary: Named Concept Gene Ontology Concept Recognition
Home-page: https://github.com/jeroyang/ncgocr
Author: Chia-Jung, Yang
Author-email: jeroyang@gmail.com
License: MIT
Description-Content-Type: UNKNOWN
Keywords: ncgocr
Platform: UNKNOWN
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Requires-Dist: progressbar2 (==3.12.0)
Requires-Dist: acora (==2.0)
Requires-Dist: lxml (==3.7.3)
Requires-Dist: intervaltree (==2.1.0)
Requires-Dist: mock (==2.0.0)
Requires-Dist: scikit-learn (==0.18.1)
Requires-Dist: numpy (==1.12.0)
Requires-Dist: scipy (==0.18.1)
Requires-Dist: txttk (==0.10.0)

# NCGOCR

[![](https://img.shields.io/travis/jeroyang/ncgocr.svg)](https://travis-ci.org/jeroyang/ncgocr)
[![](https://img.shields.io/pypi/v/ncgocr.svg)](https://pypi.python.org/pypi/ncgocr)


- Named Concept Gene Ontology Concept Recognition
- Automatically recognizes Gene Ontology (GO) concepts in text.

## Installation

Install the Python package with `pip`:
```bash
$ pip install -U ncgocr
```
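
To confirm that the install landed in the interpreter you intend to use, a quick check along these lines may help. This is a minimal sketch using only the standard library; the helper name `is_installed` is hypothetical and not part of ncgocr. Because the package pins exact dependency versions (for example `numpy==1.12.0` and `scikit-learn==0.18.1`), installing into a dedicated virtual environment is probably the safer choice.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it.

    Hypothetical helper for illustration; not part of the ncgocr API.
    """
    return importlib.util.find_spec(module_name) is not None


if __name__ == '__main__':
    if is_installed('ncgocr'):
        print('ncgocr is importable')
    else:
        print('ncgocr was not found; check that pip used this interpreter')
```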

## Usage
```python
from ncgocr import Craft, GoData, NCGOCR, Corpus, evaluate

craft = Craft('data')
corpus = craft.get_corpus()
goldstandard = craft.get_goldstandard()

print('Loading GO...')
godata = GoData('data/craft-1.0/ontologies/GO.obo')

print('Initiating NCGOCR...')
ncgocr = NCGOCR(godata)

print('Training the model...')
ncgocr.train(corpus, goldstandard)

print('Loading the testing corpus...')
corpus_name = 'testing corpus'
testing_corpus = Corpus.from_dir('data/craft-1.0/articles/txt/', corpus_name)

print('Predicting the results...')
result = ncgocr.process(testing_corpus)

print('Show the first 10 results...')
print(result.to_list()[:10])

print('Evaluate the results...')
report = evaluate(result, goldstandard, 'Using the training corpus as the testing corpus')
print(report)
```


## License
* Free software: MIT license


