Language models

spaCy currently supports the following languages and capabilities:

Language   Code   Token   SBD   Lemma   POS   NER   Dep   Vector   Sentiment
English    en
German     de
Chinese    zh
Spanish    es

Chinese tokenization requires the Jieba library. Statistical models are coming soon.
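
As a rough illustration of what Jieba contributes, the sketch below segments a Chinese sentence by calling Jieba directly rather than through spaCy. The `segment` helper is a hypothetical name, and the per-character fallback used when Jieba is not installed is an assumption made only so the example runs anywhere; it is not a description of how spaCy behaves.

```python
def segment(text):
    """Split Chinese text into tokens, using Jieba when it is installed."""
    try:
        import jieba  # third-party package: pip install jieba
    except ImportError:
        # Assumption for illustration only: one token per non-space character.
        return [ch for ch in text if not ch.isspace()]
    # jieba.lcut returns a list of word strings; drop whitespace-only tokens.
    return [tok for tok in jieba.lcut(text) if tok.strip()]


tokens = segment("我爱自然语言处理")
print(tokens)
```

Chinese is written without spaces between words, so a whitespace-based tokenizer has nothing to split on; a dictionary-driven segmenter such as Jieba supplies the word boundaries.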

Alpha support

Work has started on the following languages. You can help by improving the existing language data and extending the tokenization patterns.

Language     Code   Source
Italian      it     spacy/it
French       fr     spacy/fr
Portuguese   pt     spacy/pt
Dutch        nl     spacy/nl
Swedish      sv     spacy/sv
Hungarian    hu     spacy/hu
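
To give a sense of what "extending the tokenization patterns" can involve, here is a minimal, self-contained sketch of exception-driven tokenization. It is not spaCy's implementation or API: the `EXCEPTIONS` table, the `SUFFIX_RE` pattern, the `tokenize` function, and the Italian entries are all hypothetical, chosen only to show the idea of a suffix rule plus a table of special cases.

```python
import re

# Hypothetical special cases: surface form -> tokens it should become.
EXCEPTIONS = {
    "l'aereo": ["l'", "aereo"],  # elided article split from its noun
    "p.es.": ["p.es."],          # abbreviation kept as a single token
}

# Hypothetical suffix rule: trailing punctuation is split off as its own token.
SUFFIX_RE = re.compile(r"[.,;:!?]+$")


def tokenize(text):
    """Whitespace-split, then apply the suffix rule and the exception table."""
    tokens = []
    for chunk in text.split():
        suffix = ""
        # A chunk that is itself an exception keeps its punctuation.
        if chunk not in EXCEPTIONS:
            match = SUFFIX_RE.search(chunk)
            if match and match.start() > 0:
                chunk, suffix = chunk[:match.start()], match.group()
        tokens.extend(EXCEPTIONS.get(chunk, [chunk]))
        if suffix:
            tokens.append(suffix)
    return tokens


result = tokenize("Prendo l'aereo, p.es. domani.")
print(result)
```

In a scheme like this, supporting a new language is mostly a matter of contributing data (abbreviations, contractions, punctuation rules) rather than new code, which is why improvements to language data are a practical way to help.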