Metadata-Version: 2.1
Name: sermetric
Version: 0.1.4
Summary: Metrics for evaluating how easy a text is to read.
Home-page: 
Author: Mirari San Martín
Author-email: miren.san-martin@unirioja.es
Keywords: easy-to-read
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Topic :: Software Development :: Build Tools
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Description-Content-Type: text/x-rst
License-File: LICENSE.txt

# SERM

SERM is an open-source library for evaluating how easy a text is to read. It supports a wide variety of indexes and lets the user combine them easily.


## Features

Several indexes are provided:

* indicePuntos: the number of periods (full stops) in the text divided by the number of words.
* indicePuntoyAparte: the number of new paragraphs in the text divided by the number of words.
* indiceComas: the number of commas in the text divided by the number of words.
* indiceExtension: ratio of the number of syllables in lexical words to the number of lexical words, where lexical words are nouns, verbs, adjectives, and adverbs.
* indiceTriPoli: ratio of the number of trisyllabic and polysyllabic words to the number of lexical words.
* indiceTriPoliLexica: ratio of the number of trisyllabic and polysyllabic lexical words to the number of lexical words.
* indiceDiversidad: ratio of the number of different words in the text to the total number of words.
* indiceFrecLexica: ratio of the number of low-frequency lexical words to the number of lexical words. The "Corpus de la Real Academia Española" (CREA) and the "Gran diccionario del uso del español actual" are used as references.
* indicePalFrase: the number of words in the text divided by the number of sentences.
* indiceComplejidadOracional: the number of sentences divided by the number of propositions.
* indiceComplejidad: ratio of the number of low-frequency syllables to the total number of syllables (reference: "Diccionario de frecuencias de las unidades lingüísticas del castellano").
* fernandezHuerta: the result of 206.84 - 0.6P - 1.02F, where P is the number of syllables per 100 words and F is the number of sentences per 100 words.
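
To make the definitions above concrete, here is a minimal sketch of how a few of the simpler ratios could be computed with the Python standard library alone. It is not SERM's API: every function name below is hypothetical, and the regex-based tokenizer and sentence splitter are naive assumptions. Indexes that need part-of-speech tagging, syllable counting, or frequency dictionaries are left out, and the Fernández Huerta function takes P and F as precomputed inputs.

```python
import re


def tokenize_words(text):
    # Assumption: a word is any run of Unicode letters (covers á, ñ, ü, ...).
    return re.findall(r"[^\W\d_]+", text)


def split_sentences(text):
    # Assumption: sentences end at one or more of '.', '!' or '?'.
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def period_index(text):
    # Periods divided by words (the definition given for indicePuntos).
    return text.count(".") / len(tokenize_words(text))


def comma_index(text):
    # Commas divided by words (the definition given for indiceComas).
    return text.count(",") / len(tokenize_words(text))


def diversity_index(text):
    # Distinct words divided by total words (indiceDiversidad).
    # Assumption: words are compared case-insensitively.
    words = [w.lower() for w in tokenize_words(text)]
    return len(set(words)) / len(words)


def words_per_sentence(text):
    # Words divided by sentences (indicePalFrase).
    return len(tokenize_words(text)) / len(split_sentences(text))


def fernandez_huerta(syllables_per_100_words, sentences_per_100_words):
    # 206.84 - 0.6P - 1.02F, with P and F supplied by the caller.
    return 206.84 - 0.6 * syllables_per_100_words - 1.02 * sentences_per_100_words


if __name__ == "__main__":
    sample = "El gato come. El perro duerme, y el gato mira."
    print("periods/words:", period_index(sample))
    print("commas/words:", comma_index(sample))
    print("diversity:", diversity_index(sample))
    print("words/sentence:", words_per_sentence(sample))
    print("Fernández Huerta (P=200, F=10):", fernandez_huerta(200, 10))
```

The functions assume non-empty input; a text with no words or no sentences would raise `ZeroDivisionError`, so real code would need to decide how to handle that case.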
