Language models

spaCy currently supports the following languages and capabilities:

Language    Token    SBD    Lemma    POS    NER    Dep    Vector    Sentiment
English en
German de

Alpha support

Work has started on the following languages. You can help by improving the existing language data and extending the tokenization patterns.

Language         Source
Chinese zh       spacy/zh
Spanish es       spacy/es
Italian it       spacy/it
French fr        spacy/fr
Portuguese pt    spacy/pt
Dutch nl         spacy/nl
Swedish sv       spacy/sv
Finnish fi       spacy/fi
Hungarian hu     spacy/hu

Chinese tokenization requires the Jieba library. Statistical models are coming soon.
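
To illustrate why Jieba is needed: Chinese text has no spaces between words, so a word segmenter must find the token boundaries. The following is a minimal sketch of calling Jieba directly, not spaCy's own integration code. The helper name segment_chinese and the per-character fallback used when Jieba is not installed are assumptions made for this example only.

```python
def segment_chinese(text):
    """Split Chinese text into a list of word-level tokens."""
    try:
        import jieba  # third-party package: pip install jieba
    except ImportError:
        # Hypothetical fallback for illustration only: one token per
        # non-whitespace character. This is not a real word segmentation.
        return [char for char in text if not char.isspace()]
    # jieba.cut yields word segments lazily; drop whitespace-only pieces.
    return [word for word in jieba.cut(text) if word.strip()]


tokens = segment_chinese("我喜欢自然语言处理")
print(tokens)
```

The exact segmentation depends on the Jieba version and dictionary in use, so no expected output is shown here.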
