Metadata-Version: 2.0
Name: revscoring
Version: 1.2.9
Summary: A set of utilities for generating quality scores for MediaWiki revisions
Home-page: https://github.com/halfak/Revision-Scores
Author: Aaron Halfaker
Author-email: ahalfaker@wikimedia.org
License: MIT
Platform: UNKNOWN
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Environment :: Other Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Dist: deltas (>=0.3.10,<0.3.999)
Requires-Dist: docopt (>=0.6.2,<0.6.999)
Requires-Dist: more-itertools (==2.2)
Requires-Dist: mwapi (>=0.3.0,<0.3.999)
Requires-Dist: mwparserfromhell (>=0.3.3,<0.4.999)
Requires-Dist: mwtypes (>=0.2.0,<0.2.999)
Requires-Dist: nltk (>=3.0.0,<3.0.999)
Requires-Dist: nose (>=1.3.4,<1.3.999)
Requires-Dist: numpy (>=1.8.2,<1.10.999)
Requires-Dist: pyenchant (>=1.6.6,<1.6.999)
Requires-Dist: pytz (==2012c)
Requires-Dist: pywikibase (>=0.0.3,<0.0.999)
Requires-Dist: requests (>=2.0.0,<2.999.999)
Requires-Dist: scikit-learn (>=0.17.0,<0.17.999)
Requires-Dist: scipy (>=0.13.3,<0.17.999)
Requires-Dist: setuptools (>=5.5.1,<15.999)
Requires-Dist: tabulate (>=0.7.5,<0.7.999)
Requires-Dist: yamlconf (>=0.1.0,<0.1.999)

|travis|_ |codecov|_

Revision Scoring
================
A generic, machine learning-based revision scoring system designed to
automatically differentiate damage from productive contributions on
Wikipedia.

Example
=======

Using a scorer_model to score a revision::

  >>> import mwapi
  >>> from revscoring import ScorerModel
  >>> from revscoring.extractors.api.extractor import Extractor
  >>>
  >>> with open("models/enwiki.damaging.linear_svc.model", "rb") as f:
  ...     scorer_model = ScorerModel.load(f)
  ...
  >>> extractor = Extractor(mwapi.Session(host="https://en.wikipedia.org",
  ...                                        user_agent="revscoring demo"))
  >>>
  >>> feature_values = list(extractor.extract(123456789, scorer_model.features))
  >>>
  >>> print(scorer_model.score(feature_values))
  {'prediction': True, 'probability': {False: 0.4694409344514984, True: 0.5305590655485017}}
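
The score is a plain dictionary, so it can be post-processed with ordinary
Python.  The following is a minimal sketch, not part of `revscoring` itself:
``probability_of`` and ``exceeds_threshold`` are hypothetical helper names,
the ``0.8`` threshold is an arbitrary illustration, and the dictionary layout
is assumed to match the output shown above::

  def probability_of(score, label=True):
      """Return the probability the model assigned to `label`."""
      return score["probability"][label]


  def exceeds_threshold(score, label=True, threshold=0.5):
      """Return True if the probability of `label` is at least `threshold`."""
      return probability_of(score, label) >= threshold


  # The score dictionary printed in the example above.
  score = {"prediction": True,
           "probability": {False: 0.4694409344514984,
                           True: 0.5305590655485017}}

  print(probability_of(score))                    # 0.5305590655485017
  print(exceeds_threshold(score, threshold=0.8))  # False

A stricter threshold than the model's own prediction may be useful when only
high-confidence results should be acted on.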


Installation
============
The easiest way to install `revscoring` is via the Python package installer
(pip).

``pip install revscoring``

You may find that some of `revscoring`'s dependencies fail to compile (namely
`scipy`, `numpy` and `sklearn`).  In that case, you'll need to install the
required build dependencies through your operating system's package manager.

Ubuntu & Debian:
  Run ``sudo apt-get install python3-dev g++ gfortran liblapack-dev libopenblas-dev``
Windows:
  'TODO'
MacOS:
  Using Homebrew and pip, install `revscoring` and `enchant` as follows::

      brew install aspell --with-all-languages
      brew install enchant
      pip install --no-binary pyenchant revscoring

  Languages can be added to `aspell`::

      cd /tmp
      wget http://ftp.gnu.org/gnu/aspell/dict/pt/aspell-pt-0.50-2.tar.bz2
      bzip2 -dc aspell-pt-0.50-2.tar.bz2 | tar xvf -
      cd aspell-pt-0.50-2
      ./configure
      make
      sudo make install

  Caveats:
    * The differences between the `aspell` and `myspell` dictionaries can cause
      some of the tests to fail.


Finally, to make use of language features, you'll need to download some NLTK
data.  The following command fetches the required stopwords corpus.

``python -m nltk.downloader stopwords``
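
To confirm that the download worked, something like the following sketch may
help.  ``stopwords_available`` is a hypothetical helper name; it relies on
``nltk.data.find``, which raises ``LookupError`` when a resource cannot be
located, and it simply reports ``False`` if `nltk` is not importable::

  import importlib


  def stopwords_available():
      """Return True if NLTK can locate the stopwords corpus."""
      try:
          nltk_data = importlib.import_module("nltk.data")
      except ImportError:
          # nltk itself is not installed in this environment.
          return False
      try:
          nltk_data.find("corpora/stopwords")
      except LookupError:
          # nltk is installed, but the corpus has not been downloaded.
          return False
      return True


  print(stopwords_available())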

You'll also need to install `enchant <https://enchant.org>`_ compatible
dictionaries of the languages you'd like to use.  We recommend the following:

* ``languages.arabic``: aspell-ar
* ``languages.czech``: myspell-cs
* ``languages.dutch``: myspell-nl
* ``languages.english``: myspell-en-us myspell-en-gb myspell-en-au
* ``languages.estonian``: myspell-et
* ``languages.french``: myspell-fr
* ``languages.german``: myspell-de-at myspell-de-ch myspell-de-de
* ``languages.hebrew``: myspell-he
* ``languages.hungarian``: myspell-hu
* ``languages.indonesian``: aspell-id
* ``languages.italian``: myspell-it
* ``languages.norwegian``: myspell-nb
* ``languages.persian``: myspell-fa
* ``languages.polish``: aspell-pl
* ``languages.portuguese``: myspell-pt
* ``languages.spanish``: myspell-es
* ``languages.swedish``: aspell-sv
* ``languages.tamil``: aspell-ta
* ``languages.russian``: myspell-ru
* ``languages.ukrainian``: myspell-uk
* ``languages.vietnamese``: hunspell-vi
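
Once the dictionaries are installed, a quick check along these lines can show
which ones `enchant` does not see.  This is a sketch under assumptions:
``missing_dictionaries`` is a hypothetical helper, and the language tags in
the usage note (``en_US``, ``en_GB``, ``pt``) are illustrative and may differ
from the tags your installed dictionaries register.  By default it calls
``enchant.dict_exists`` from `pyenchant`; a different lookup function can be
passed in for testing::

  def missing_dictionaries(tags, dict_exists=None):
      """Return the tags from `tags` for which no dictionary is found."""
      if dict_exists is None:
          # Imported lazily so the helper can be exercised without pyenchant.
          import enchant
          dict_exists = enchant.dict_exists
      return [tag for tag in tags if not dict_exists(tag)]

For example, ``missing_dictionaries(["en_US", "en_GB", "pt"])`` returns the
subset of those tags that still needs a dictionary package.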

Authors
=======
    Aaron Halfaker:
        * `http://halfaker.info`
    Helder:
        * `https://github.com/he7d3r`
    Adam Roses Wight:
        * `https://mediawiki.org/wiki/User:Adamw`
    Amir Sarabadani:
        * `https://github.com/Ladsgroup`

.. |travis| image:: https://api.travis-ci.org/wiki-ai/revscoring.png
.. _travis: https://travis-ci.org/wiki-ai/revscoring
.. |codecov| image:: https://codecov.io/github/wiki-ai/revscoring/revscoring.svg
.. _codecov: https://codecov.io/github/wiki-ai/revscoring


