A text complexity library for text analysis built on spaCy

TRUNAJOD builds on the core NLP capabilities provided by spaCy (dependency parsing, POS tagging, tokenization) and focuses on extracting text measurements that may be useful across different applications and use cases.


import spacy
from TRUNAJOD.entity_grid import EntityGrid

# Load the small Spanish model with NER and text classification disabled.
nlp = spacy.load('es_core_news_sm', disable=['ner', 'textcat'])

# Adjacent string literals are concatenated, so each fragment ends with a space.
example_text = (
    'El espectáculo del cielo nocturno cautiva la mirada y suscita preguntas '
    'sobre el universo, su origen y su funcionamiento. No es sorprendente que '
    'todas las civilizaciones y culturas hayan formado sus propias '
    'cosmologías. Unas relatan, por ejemplo, que el universo ha '
    'sido siempre tal como es, con ciclos que inmutablemente se repiten; '
    'otras explican que este universo ha tenido un principio, '
    'que ha aparecido por obra creadora de una divinidad.'
)

# Parse the text and build the entity grid from the resulting Doc.
doc = nlp(example_text)
egrid = EntityGrid(doc)
print(egrid.get_egrid())
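For readers unfamiliar with the term, an entity grid is commonly described as a table that records, for each entity, the grammatical role it plays in each sentence. The following is a minimal plain-Python sketch of that idea, not TRUNAJOD's implementation: the hand-annotated `sentences` input, the `build_entity_grid` helper, and the role labels (S = subject, O = object, X = other, - = absent) are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical, hand-annotated input: one list per sentence of
# (entity, role) pairs. A real pipeline would derive these from a parse.
sentences = [
    [("espectáculo", "S"), ("mirada", "O"), ("universo", "X")],
    [("civilizaciones", "S"), ("cosmologías", "O")],
    [("universo", "S"), ("ciclos", "X")],
]


def build_entity_grid(annotated_sentences):
    """Map each entity to its role in every sentence ('-' when absent)."""
    n_sentences = len(annotated_sentences)
    grid = defaultdict(lambda: ["-"] * n_sentences)
    for index, sentence in enumerate(annotated_sentences):
        for entity, role in sentence:
            # If an entity occurs twice in a sentence, the last role wins.
            grid[entity][index] = role
    return dict(grid)


grid = build_entity_grid(sentences)
for entity, roles in grid.items():
    print(f"{entity:15s} {' '.join(roles)}")
```

In this toy grid, "universo" appears in the first and third sentences with different roles, which is the kind of transition pattern that grid-based coherence measures typically look at. The structure returned by `get_egrid()` may differ from this sketch.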

Author info

Diego Palma


Categories: research, standalone, scientific

