Metadata-Version: 2.1
Name: ner-dataset
Version: 0.0.1
Summary: Various NER datasets for multiple domains and languages
Home-page: UNKNOWN
Author: Mohamed Ben Haddou
Author-email: mbenhaddou@mentis.io
License: UNKNOWN
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3.7
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 4 - Beta
Requires-Python: >=3.0
Description-Content-Type: text/markdown
License-File: LICENSE

===============================
Datasets for Entity Recognition
===============================

This repository contains datasets from several domains and languages
annotated with a variety of entity types, useful for entity recognition and
named entity recognition (NER) tasks.


**NOTE:** I am actively adding datasets to this list.

Datasets for NER in English
===========================

.. |check| unicode:: 0x2714

The following table lists datasets for English-language entity recognition. The `data` directory
contains information on where to obtain those datasets that could not be shared
due to licensing restrictions, as well as code to convert them (if necessary)
to the CoNLL 2003 format. NER corpora in other languages
are listed below.

============== =============== ======================= =============================== ==================================
Dataset         Domain            License                             Language                   Reference
============== =============== ======================= =============================== ==================================
CoNLL 2003      News                                                 en
CoNLL 2002                                                          en-nl-es
============== =============== ======================= =============================== ==================================
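
Since the datasets are converted to the CoNLL 2003 format, a short reader may help illustrate what that layout looks like. The sketch below is not part of this repository: the function name ``read_conll`` and the sample lines are illustrative assumptions. It assumes whitespace-separated columns with the token first and the NER tag last, blank lines between sentences, and ``-DOCSTART-`` lines marking document boundaries.

```python
from typing import Iterable, List, Tuple

# A sentence is a list of (token, NER tag) pairs.
Sentence = List[Tuple[str, str]]


def read_conll(lines: Iterable[str]) -> List[Sentence]:
    """Parse CoNLL 2003-style lines into sentences (hypothetical helper)."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for raw in lines:
        line = raw.strip()
        # A blank line ends a sentence; -DOCSTART- lines mark a new document.
        if not line or line.startswith("-DOCSTART-"):
            if current:
                sentences.append(current)
                current = []
            continue
        columns = line.split()
        # Assumption: first column is the token, last column is the NER tag.
        current.append((columns[0], columns[-1]))
    if current:
        sentences.append(current)
    return sentences


# Made-up sample in the style of the format, not taken from any dataset file.
sample = """-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O
German JJ B-NP B-MISC
call NN I-NP O

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER
""".splitlines()

sentences = read_conll(sample)
```

Keeping only the first and last columns means the same reader works for files that omit the part-of-speech and chunk columns, as long as the NER tag stays last.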

Licenses
========

Notes on licenses:

The datasets are distributed under various types of licenses.
I have not yet had time to review the license of each dataset.

Datasets for NER in other languages
===================================

Lexical Named Entity resources
------------------------------

- HeiNER: http://heiner.cl.uni-heidelberg.de/index.shtml
- NECKAr: https://event.ifi.uni-heidelberg.de/?page_id=532#Wikidata_NE_dataset




