An efficient pipeline for Danish NLP

DaCy is a Danish preprocessing pipeline trained in spaCy. It has achieved state-of-the-art performance on named entity recognition, part-of-speech tagging and dependency parsing for Danish. This repository contains material for using DaCy and reproducing the results, along with guides on usage of the package. It also contains a series of behavioural tests for biases and robustness of Danish NLP pipelines.


import dacy
# Import path assumed here; it may differ between DaCy versions
from dacy.sentiment import add_berttone_polarity

print(dacy.models())  # get a list of dacy models
nlp = dacy.load('medium')  # load your spacy pipeline

# DaCy also includes functionality for adding other Danish models to the pipeline
# For instance you can add the BertTone model for classification of sentiment polarity to the pipeline:
nlp = add_berttone_polarity(nlp)

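Once a pipeline is loaded, the named entities, part-of-speech tags and dependency relations mentioned above can be read from the resulting document through the standard spaCy attributes. The following is a minimal sketch, not part of DaCy itself: the helper names `annotations` and `demo` and the example sentence are illustrative assumptions, and `demo` needs the `dacy` package plus a model download to run.

```python
def annotations(doc):
    """Collect token-level and entity-level annotations from a spaCy Doc."""
    # For each token: surface form, coarse POS tag, dependency label, head word
    tokens = [(t.text, t.pos_, t.dep_, t.head.text) for t in doc]
    # For each named entity span: its text and entity label
    entities = [(e.text, e.label_) for e in doc.ents]
    return tokens, entities


def demo():
    # Hypothetical usage; 'medium' is the model name used in the example above.
    import dacy

    nlp = dacy.load("medium")
    tokens, entities = annotations(nlp("Jeg bor i Aarhus."))
    for row in tokens:
        print(row)
    print(entities)
```

Calling `demo()` prints one tuple per token followed by the list of entities. Keeping the extraction in a separate function means it should work unchanged with any spaCy pipeline, not only DaCy.
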
Author info

Centre for Humanities Computing Aarhus


Categories: pipeline

Submit your project

If you have a project that you want the spaCy community to make use of, you can suggest it by submitting a pull request to the spaCy website repository. The Universe database is open-source and collected in a simple JSON file. For more details on the formats and available fields, see the documentation. Looking for inspiration for your own spaCy plugin or extension? Check out the project idea label on the issue tracker.
