Lexeme

An entry in the vocabulary.

Attributes

| Name | Type | Description |
| --- | --- | --- |
| vocab | Vocab | The parent vocabulary. |
| lower | int | Lower-case form of the word (integer ID). |
| lower_ | unicode | Lower-case form of the word (string). |
| shape | int | Transform of the word's string, to show orthographic features (integer ID). |
| shape_ | unicode | Transform of the word's string, to show orthographic features (string). |
| prefix | int | Length-N substring from the start of the word (integer ID). Defaults to N=1. |
| prefix_ | unicode | Length-N substring from the start of the word (string). Defaults to N=1. |
| suffix | int | Length-N substring from the end of the word (integer ID). Defaults to N=3. |
| suffix_ | unicode | Length-N substring from the end of the word (string). Defaults to N=3. |
| is_alpha | bool | Equivalent to word.orth_.isalpha(). |
| is_ascii | bool | Equivalent to not any(ord(c) >= 128 for c in word.orth_). |
| is_digit | bool | Equivalent to word.orth_.isdigit(). |
| is_lower | bool | Equivalent to word.orth_.islower(). |
| is_title | bool | Equivalent to word.orth_.istitle(). |
| is_punct | bool | Is the word punctuation? |
| is_space | bool | Equivalent to word.orth_.isspace(). |
| like_url | bool | Does the word resemble a URL? |
| like_num | bool | Does the word represent a number, e.g. “10.9”, “10”, “ten”? |
| like_email | bool | Does the word resemble an email address? |
| is_oov | bool | Is the word out-of-vocabulary? |
| is_stop | bool | Is the word part of a "stop list"? |
| lang | int | Language of the parent vocabulary (integer ID). |
| lang_ | unicode | Language of the parent vocabulary (string). |
| prob | float | Smoothed log probability estimate of the token's type. |
| sentiment | float | A scalar value indicating the positivity or negativity of the token. |
| lex_id | int | ID of the token's lexical type. |
| text | unicode | Verbatim text content. |
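
The string-valued attributes above can be approximated in plain Python. The following is a minimal sketch, not the library's implementation: the helper names (`prefix`, `suffix`, `is_ascii`, `simple_shape`) are hypothetical, and the shape transform shown here is a simplified guess at what "orthographic features" means (letters become `x` or `X`, digits become `d`). The real transform may differ in detail.

```python
def prefix(word, n=1):
    """Length-n substring from the start of the word (default n=1)."""
    return word[:n]


def suffix(word, n=3):
    """Length-n substring from the end of the word (default n=3)."""
    return word[-n:] if n > 0 else ""


def is_ascii(word):
    """True if no character has a code point of 128 or above."""
    return not any(ord(c) >= 128 for c in word)


def simple_shape(word):
    """Simplified shape: lower-case letter -> x, upper-case -> X, digit -> d.

    Assumption: this only illustrates the idea of an orthographic
    transform; it is not claimed to match the library's output.
    """
    out = []
    for c in word:
        if c.isalpha():
            out.append("X" if c.isupper() else "x")
        elif c.isdigit():
            out.append("d")
        else:
            out.append(c)  # punctuation and other characters pass through
    return "".join(out)


print(prefix("Apple"), suffix("Apple"), is_ascii("Apple"), simple_shape("Apple-2"))
```

The boolean `is_*` attributes documented as equivalents of `str` methods can be checked the same way, for example `"Apple".istitle()`.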

Lexeme.__init__

Create a Lexeme object.

| Name | Type | Description |
| --- | --- | --- |
| vocab | Vocab | The parent vocabulary. |
| orth | int | The orth id of the lexeme. |
| return | Lexeme | The newly constructed object. |
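
To illustrate how a lexeme can be built from a parent vocabulary and an integer orth id, here is a toy model. `ToyVocab` and `ToyLexeme` are hypothetical classes written for this sketch, and the sequential ids are an assumption made for simplicity; the real `Vocab` and `Lexeme` are not implemented this way.

```python
class ToyVocab:
    """Hypothetical stand-in for a vocabulary: maps strings to integer ids."""

    def __init__(self):
        self._ids = {}
        self._strings = []

    def add(self, string):
        # Intern the string and return its id (assumed: sequential ints).
        if string not in self._ids:
            self._ids[string] = len(self._strings)
            self._strings.append(string)
        return self._ids[string]

    def string_for(self, orth):
        return self._strings[orth]


class ToyLexeme:
    """Hypothetical lexeme: holds only its vocab and its orth id."""

    def __init__(self, vocab, orth):
        self.vocab = vocab
        self.orth = orth

    @property
    def text(self):
        # Verbatim text is looked up through the parent vocabulary.
        return self.vocab.string_for(self.orth)

    @property
    def lower_(self):
        # String form of the lower-cased word.
        return self.text.lower()

    @property
    def lower(self):
        # Integer form: the id of the lower-cased string.
        return self.vocab.add(self.lower_)


vocab = ToyVocab()
lex = ToyLexeme(vocab, vocab.add("Apple"))
print(lex.text, lex.lower_, lex.lower)
```

The sketch also shows why paired attributes such as `lower` and `lower_` can coexist: one is an integer id, the other is the string that id resolves to.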

Lexeme.set_flag

Change the value of a boolean flag.

| Name | Type | Description |
| --- | --- | --- |
| flag_id | int | The attribute ID of the flag to set. |
| value | bool | The new value of the flag. |
| return | None | - |

Lexeme.check_flag

Check the value of a boolean flag.

| Name | Type | Description |
| --- | --- | --- |
| flag_id | int | The attribute ID of the flag to query. |
| return | bool | The value of the flag. |
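
One common way to back a `set_flag` / `check_flag` pair is a bit field, where each flag ID selects one bit of an integer. The sketch below assumes that design purely for illustration; `FlagStore` is a hypothetical class, and the flag IDs used are made-up values, not the library's real attribute IDs.

```python
class FlagStore:
    """Hypothetical container storing boolean flags as bits of one int."""

    def __init__(self):
        self.flags = 0

    def set_flag(self, flag_id, value):
        # Turn the bit for flag_id on or off, leaving other bits untouched.
        if value:
            self.flags |= 1 << flag_id
        else:
            self.flags &= ~(1 << flag_id)

    def check_flag(self, flag_id):
        # True if the bit for flag_id is currently set.
        return bool(self.flags & (1 << flag_id))


# Illustrative flag IDs (assumed values, not real attribute IDs).
EXAMPLE_IS_ALPHA = 0
EXAMPLE_IS_STOP = 1

store = FlagStore()
store.set_flag(EXAMPLE_IS_STOP, True)
print(store.check_flag(EXAMPLE_IS_STOP), store.check_flag(EXAMPLE_IS_ALPHA))
```

With this layout, setting one flag never disturbs another, and a query is a single mask operation.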

Lexeme.similarity

Compute a semantic similarity estimate. Defaults to cosine over vectors.

| Name | Type | Description |
| --- | --- | --- |
| other | - | The object to compare with. By default, accepts Doc, Span, Token and Lexeme objects. |
| return | float | A scalar similarity score. Higher is more similar. |
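
Cosine similarity, the default mentioned above, is the dot product of two vectors divided by the product of their lengths. The following is a minimal pure-Python sketch of that formula; `cosine_similarity` is a hypothetical helper, and returning 0.0 for a zero-length vector is an assumption of this sketch, not documented behaviour.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat a missing (all-zero) vector as "no similarity".
        return 0.0
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Because the score depends only on direction, two vectors that are scalar multiples of each other score the maximum, regardless of magnitude.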

Lexeme.vector

A real-valued meaning representation.

| Name | Type | Description |
| --- | --- | --- |
| return | numpy.ndarray[ndim=1, dtype='float32'] | A real-valued meaning representation. |

Lexeme.has_vector

A boolean value indicating whether a word vector is associated with the object.

| Name | Type | Description |
| --- | --- | --- |
| return | bool | Whether a word vector is associated with the object. |
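
The relationship between a vector lookup and a `has_vector` check can be sketched as follows. This is one plausible reading, stated as an assumption rather than the library's documented behaviour: a word with no entry gets an all-zero vector, and the check reports whether any component is non-zero. `lookup_vector`, `has_vector` and the tiny `table` are hypothetical, and plain lists stand in for the `numpy.ndarray` the real attribute returns.

```python
def lookup_vector(table, word, width):
    """Return the stored vector, or an all-zero vector if the word is absent."""
    return table.get(word, [0.0] * width)


def has_vector(vector):
    """Assumed rule: a vector counts as present if any component is non-zero."""
    return any(x != 0.0 for x in vector)


# Made-up two-dimensional vectors, for illustration only.
table = {"apple": [0.25, -0.5]}

print(has_vector(lookup_vector(table, "apple", 2)))
print(has_vector(lookup_vector(table, "zzzz", 2)))
```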