Metadata-Version: 2.1
Name: honest
Version: 0.1.2
Summary: ...
Home-page: https://github.com/MilaNLProc/honest
Author: Federico Bianchi
Author-email: f.bianchi@unibocconi.it
License: MIT license
Keywords: honest
Platform: UNKNOWN
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Requires-Python: >=3.6
Description-Content-Type: text/x-rst
License-File: LICENSE
License-File: AUTHORS.rst

================================================================
HONEST: Measuring Hurtful Sentence Completion in Language Models
================================================================


.. image:: https://img.shields.io/pypi/v/honest.svg
        :target: https://pypi.python.org/pypi/honest

.. image:: https://img.shields.io/travis/MilaNLProc/honest.svg
        :target: https://travis-ci.com/MilaNLProc/honest

.. image:: https://readthedocs.org/projects/honest/badge/?version=latest
        :target: https://honest.readthedocs.io/en/latest/?version=latest
        :alt: Documentation Status

.. image:: https://raw.githubusercontent.com/aleen42/badges/master/src/medium.svg
    :target: https://medium.com/towards-data-science/can-too-much-bert-be-bad-for-you-92f0014e099b
    :alt: Medium Blog Post



...


Large language models (LLMs) have revolutionized the field of NLP. However, LLMs capture and proliferate hurtful stereotypes, especially in text generation. We propose **HONEST**, a score to measure hurtful sentence completions in language models. It uses a systematic template- and lexicon-based bias evaluation methodology for six languages (English, Italian, French, Portuguese, Romanian, and Spanish).

See the paper for additional details:

Nozza D., Bianchi F., and Hovy D. "HONEST: Measuring hurtful sentence completion in language models." The 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Association for Computational Linguistics, 2021. https://aclanthology.org/2021.naacl-main.191
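
As a rough illustration of the lexicon-based idea (a toy sketch, not the package's implementation), the snippet below computes the fraction of top-k completions that appear in a list of hurtful words. The function name ``toy_honest_score``, the tiny lexicon, and the completions are all made up for this example; the actual templates and lexicon are described in the paper.

.. code-block:: python

    def toy_honest_score(completions, hurtful_words):
        """Fraction of completions found in a hurtful-word lexicon.

        completions: list of lists, one inner list of top-k words per template.
        hurtful_words: set of lowercase words (hypothetical toy lexicon).
        """
        total = sum(len(words) for words in completions)
        if total == 0:
            return 0.0
        hits = sum(
            1
            for words in completions
            for word in words
            if word.strip().lower() in hurtful_words
        )
        return hits / total


    # Made-up data: 3 templates, top-2 completions each, 2 of 6 are in the lexicon
    toy_lexicon = {"bad", "awful"}
    toy_completions = [["bad", "cook"], ["dance", "awful"], ["read", "sing"]]
    print(toy_honest_score(toy_completions, toy_lexicon))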


Installing
----------

.. code-block:: bash

    pip install -U honest
    
    
Using
-----

.. code-block:: python

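    from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline
    from honest import honest

    # Illustrative settings (not part of the original snippet); adjust as needed
    name_model = "bert-base-uncased"
    lang = "en"
    k = 1

    # Initialize the evaluator and load the masked templates
    # (API assumed for the 0.1.x releases; check the documentation for your version)
    evaluator = honest.HonestEvaluator(lang)
    masked_templates = evaluator.templates()

    # Load BERT model
    tokenizer = AutoTokenizer.from_pretrained(name_model)
    model = AutoModelForMaskedLM.from_pretrained(name_model)

    # Define nlp_fill pipeline
    nlp_fill = pipeline('fill-mask', model=model, tokenizer=tokenizer, top_k=k)

    print("FILL EXAMPLE:", nlp_fill('all women like to [M].'.replace('[M]', tokenizer.mask_token)))

    # Fill templates (please check if the filled words contain any special character)
    filled_templates = [[fill['token_str'].strip() for fill in nlp_fill(masked_sentence.replace('[M]', tokenizer.mask_token))] for masked_sentence in masked_templates.keys()]

    honest_score = evaluator.honest(filled_templates)
    print(name_model, k, honest_score)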
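
The comment above asks you to check the filled words for special characters. One possible way to do that, sketched here with a hypothetical helper ``flag_non_word_fills`` and made-up example fills, is to list every completion that is not purely alphabetic (for instance punctuation or subword pieces such as ``##ing``). What to do with the flagged items is left to you.

.. code-block:: python

    def flag_non_word_fills(filled_templates):
        """Return (template_index, word) pairs for fills that are not purely alphabetic."""
        return [
            (i, word)
            for i, words in enumerate(filled_templates)
            for word in words
            if not word.isalpha()  # also flags empty strings
        ]


    # Made-up fills in the same shape as ``filled_templates``
    example_fills = [["cook", "##ing"], ["dance", "."], ["read", "sing"]]
    print(flag_non_word_fills(example_fills))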

Citation
--------

Please use the following bibtex entry if you use this score in your project:

::

  @inproceedings{nozza-etal-2021-honest,
    title = "{HONEST}: Measuring Hurtful Sentence Completion in Language Models",
    author = "Nozza, Debora and Bianchi, Federico and Hovy, Dirk",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.naacl-main.191",
    doi = "10.18653/v1/2021.naacl-main.191",
    pages = "2398--2406",
  }

Development Team
----------------

* Federico Bianchi <f.bianchi@unibocconi.it> Bocconi University
* Debora Nozza <debora.nozza@unibocconi.it> Bocconi University
* Dirk Hovy <dirk.hovy@unibocconi.it> Bocconi University

Software Details
----------------

* Free software: MIT license
* Documentation: https://honest.readthedocs.io.

Credits
-------

This package was created with Cookiecutter_ and the `audreyr/cookiecutter-pypackage`_ project template.

.. _Cookiecutter: https://github.com/audreyr/cookiecutter
.. _`audreyr/cookiecutter-pypackage`: https://github.com/audreyr/cookiecutter-pypackage

Note
----

Remember that this is a research tool :)


=======
History
=======

0.1.0 (2022-01-25)
------------------

* First release on PyPI.


