spaCy .NET Wrapper

SpacyDotNet is a .NET Core compatible wrapper for spaCy, based on Python.NET.

This project relies on Python.NET to interoperate with spaCy. It is not meant to be a complete and exhaustive implementation of all spaCy features and APIs. Although it should be enough for basic tasks, it is best considered a starting point if you need to build a complex project using spaCy in .NET.

Most of the basic features in spaCy 101 are available. All Container classes are present (Doc, Token, Span and Lexeme) with their basic properties and methods working, along with Vocab and StringStore in a limited form. Missing properties or classes should be straightforward for a developer to add.


    var spacy = new Spacy();

    var nlp = spacy.Load("en_core_web_sm");
    var doc = nlp.GetDocument("Apple is looking at buying U.K. startup for $1 billion");

    foreach (Token token in doc.Tokens)
        Console.WriteLine($"{token.Text} {token.Lemma} {token.PoS} {token.Tag} {token.Dep} {token.Shape} {token.IsAlpha} {token.IsStop}");

    Console.WriteLine("");
    foreach (Span ent in doc.Ents)
        Console.WriteLine($"{ent.Text} {ent.StartChar} {ent.EndChar} {ent.Label}");

    nlp = spacy.Load("en_core_web_md");
    var tokens = nlp.GetDocument("dog cat banana afskfsd");

    Console.WriteLine("");
    foreach (Token token in tokens.Tokens)
        Console.WriteLine($"{token.Text} {token.HasVector} {token.VectorNorm}, {token.IsOov}");

    tokens = nlp.GetDocument("dog cat banana");
    Console.WriteLine("");
    foreach (Token token1 in tokens.Tokens)
    {
        foreach (Token token2 in tokens.Tokens)
            Console.WriteLine($"{token1.Text} {token2.Text} {token1.Similarity(token2)}");
    }

    doc = nlp.GetDocument("I love coffee");
    Console.WriteLine("");
    Console.WriteLine(doc.Vocab.Strings["coffee"]);
    Console.WriteLine(doc.Vocab.Strings[3197928453018144401]);

    Console.WriteLine("");
    foreach (Token word in doc.Tokens)
    {
        var lexeme = doc.Vocab[word.Text];
        Console.WriteLine($@"{lexeme.Text} {lexeme.Orth} {lexeme.Shape} {lexeme.Prefix} {lexeme.Suffix} {lexeme.IsAlpha} {lexeme.IsDigit} {lexeme.IsTitle} {lexeme.Lang}");
    }

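
The description above says that missing properties can be added by the developer. The sketch below illustrates the general idea using Python.NET directly rather than SpacyDotNet's own classes, whose internals are not shown here. The `TokenExtras` class and its `EntType` property are hypothetical names made up for this illustration; `ent_type_` is a real attribute of spaCy's Python `Token`. It assumes the Python.NET runtime has already been pointed at a Python installation (for example through `Runtime.PythonDLL` or the `PYTHONNET_PYDLL` environment variable) that has spaCy and `en_core_web_sm` installed.

```csharp
using System;
using Python.Runtime;

// Hypothetical helper, NOT part of SpacyDotNet: wraps a spaCy Token
// (held as a Python.NET PyObject) and exposes one extra attribute.
public sealed class TokenExtras
{
    private readonly PyObject _pyToken;

    public TokenExtras(PyObject pyToken) => _pyToken = pyToken;

    // Maps spaCy's Token.ent_type_ (entity type as a string) to a .NET property.
    public string EntType
    {
        get
        {
            // Python objects may only be accessed while holding the GIL.
            using (Py.GIL())
            using (PyObject value = _pyToken.GetAttr("ent_type_"))
            {
                return value.As<string>();
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Assumption: the Python DLL location has been configured beforehand.
        PythonEngine.Initialize();

        using (Py.GIL())
        {
            dynamic spacy = Py.Import("spacy");
            dynamic nlp = spacy.load("en_core_web_sm");
            PyObject doc = nlp("Apple is looking at buying U.K. startup for $1 billion");

            // Iterating a Python Doc yields its Token objects.
            foreach (PyObject pyToken in doc)
            {
                var extras = new TokenExtras(pyToken);
                string text = pyToken.GetAttr("text").As<string>();
                Console.WriteLine($"{text} {extras.EntType}");
            }
        }
    }
}
```

The same pattern (read a Python attribute by name, then convert it with `As<T>()`) would apply to other spaCy attributes; how SpacyDotNet itself structures its wrapper classes may differ.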
Author info

Antonio Miras


Categories nonpython

Submit your project

If you have a project that you want the spaCy community to make use of, you can suggest it by submitting a pull request to the spaCy website repository. The Universe database is open-source and collected in a simple JSON file. For more details on the formats and available fields, see the documentation. Looking for inspiration for your own spaCy plugin or extension? Check out the project idea label on the issue tracker.
