Legacy functions and architectures

Archived implementations available through spacy-legacy

The spacy-legacy package includes outdated registered functions and architectures. It is installed automatically as a dependency of spaCy, and provides backwards compatibility for archived functions that may still be used in projects.

You can find the detailed documentation of each such legacy function on this page.
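To see whether a project still relies on any of these archived functions, its training config can be scanned for the registered names listed on this page. The following is a minimal sketch that uses only Python's standard-library configparser rather than spaCy's own config loader. The config excerpt, its section names and the helper names registered_functions and legacy_references are hypothetical and chosen for illustration.

```python
import configparser

# Names documented on this page.
LEGACY_NAMES = {
    "spacy.Tok2Vec.v1", "spacy.MaxoutWindowEncoder.v1",
    "spacy.MishWindowEncoder.v1", "spacy.HashEmbedCNN.v1",
    "spacy.MultiHashEmbed.v1", "spacy.CharacterEmbed.v1",
    "spacy.TextCatEnsemble.v1", "spacy.TextCatCNN.v1", "spacy.TextCatCNN.v2",
    "spacy.TextCatBOW.v1", "spacy.TextCatBOW.v2",
    "spacy.TransitionBasedParser.v1", "spacy.StaticVectors.v1",
    "spacy.ConsoleLogger.v1",
}

# Hypothetical config excerpt; most parameters are omitted for brevity.
CONFIG_TEXT = """
[components.tok2vec.model]
@architectures = "spacy.Tok2Vec.v1"

[components.tok2vec.model.embed]
@architectures = "spacy.MultiHashEmbed.v2"

[components.tok2vec.model.encode]
@architectures = "spacy.MaxoutWindowEncoder.v1"
width = 96

[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false
"""


def registered_functions(config_text):
    """Return (section, registry, function name) for every @registry key."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of keys such as nO
    parser.read_string(config_text)
    found = []
    for section in parser.sections():
        for key, value in parser[section].items():
            if key.startswith("@"):
                found.append((section, key[1:], value.strip('"')))
    return found


def legacy_references(config_text):
    """Keep only the references whose name is listed in LEGACY_NAMES."""
    return [
        (section, name)
        for section, _registry, name in registered_functions(config_text)
        if name in LEGACY_NAMES
    ]


for section, name in legacy_references(CONFIG_TEXT):
    print(f"{section}: {name}")
```

The embed block in this excerpt uses a v2 architecture, so it is not reported.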

Architectures

These functions are available from @spacy.registry.architectures.

spacy.Tok2Vec.v1

The spacy.Tok2Vec.v1 architecture expected an encode model of type Model[Floats2d, Floats2d], such as spacy.MaxoutWindowEncoder.v1 or spacy.MishWindowEncoder.v1.

Construct a tok2vec model out of two subnetworks: one for embedding and one for encoding. See the “Embed, Encode, Attend, Predict” blog post for background.

| Name | Description | Type |
| --- | --- | --- |
| embed | Embed tokens into context-independent word vector representations. For example, CharacterEmbed or MultiHashEmbed. | Model[List[Doc], List[Floats2d]] |
| encode | Encode context into the embeddings, using an architecture such as a CNN, BiLSTM or transformer. For example, MaxoutWindowEncoder.v1. | Model[Floats2d, Floats2d] |
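The two subnetworks have different types: embed produces one array per Doc, while a v1 encode model works on a single 2D array. Something therefore has to concatenate the per-document arrays, run the encoder, and split the result back by document length. The sketch below illustrates that idea with plain Python lists. It is not the code used by spacy.Tok2Vec.v1, and the names Array2d, flatten, unflatten and apply_array_encoder are hypothetical. Thinc's with_array wrapper serves a comparable purpose for real models.

```python
from typing import Callable, List, Tuple

Array2d = List[List[float]]  # stand-in for Thinc's Floats2d


def flatten(docs: List[Array2d]) -> Tuple[Array2d, List[int]]:
    """Concatenate per-document rows and remember each document's length."""
    lengths = [len(doc) for doc in docs]
    rows = [row for doc in docs for row in doc]
    return rows, lengths


def unflatten(rows: Array2d, lengths: List[int]) -> List[Array2d]:
    """Split a single 2D array back into one array per document."""
    docs, start = [], 0
    for n in lengths:
        docs.append(rows[start:start + n])
        start += n
    return docs


def apply_array_encoder(
    encode: Callable[[Array2d], Array2d], docs: List[Array2d]
) -> List[Array2d]:
    """Run an encoder that maps Array2d to Array2d over a batch of documents."""
    rows, lengths = flatten(docs)
    return unflatten(encode(rows), lengths)


def double(rows: Array2d) -> Array2d:
    """Toy encoder used only to make the example runnable."""
    return [[2 * x for x in row] for row in rows]


batch = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
print(apply_array_encoder(double, batch))
# [[[2.0, 4.0], [6.0, 8.0]], [[10.0, 12.0]]]
```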

spacy.MaxoutWindowEncoder.v1

The spacy.MaxoutWindowEncoder.v1 architecture produced a model of type Model[Floats2d, Floats2d]. Since spacy.MaxoutWindowEncoder.v2, the model type has been Model[List[Floats2d], List[Floats2d]].

Encode context using convolutions with maxout activation, layer normalization and residual connections.

| Name | Description | Type |
| --- | --- | --- |
| width | The input and output width. These are required to be the same, to allow residual connections. This value will be determined by the width of the inputs. Recommended values are between 64 and 300. | int |
| window_size | The number of words to concatenate around each token to construct the convolution. Recommended value is 1. | int |
| maxout_pieces | The number of maxout pieces to use. Recommended values are 2 or 3. | int |
| depth | The number of convolutional layers. Recommended value is 4. | int |
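The window_size and maxout_pieces settings can be illustrated with a small pure-Python sketch. It shows how a window of neighbouring token vectors is concatenated and how a maxout unit keeps the largest value across its pieces. The function names expand_window and maxout are used for illustration, the zero padding at document edges and the left-to-right ordering are assumptions, and the learned linear transformation that produces the pieces is left out.

```python
from typing import List


def expand_window(rows: List[List[float]], window_size: int = 1) -> List[List[float]]:
    """Concatenate each row with its neighbours, zero-padding at the edges."""
    width = len(rows[0]) if rows else 0
    zeros = [0.0] * width
    expanded = []
    for i in range(len(rows)):
        features: List[float] = []
        for offset in range(-window_size, window_size + 1):
            j = i + offset
            features.extend(rows[j] if 0 <= j < len(rows) else zeros)
        expanded.append(features)
    return expanded


def maxout(piece_outputs: List[List[float]]) -> List[float]:
    """Take the elementwise maximum over the output vectors of all pieces."""
    return [max(values) for values in zip(*piece_outputs)]


tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
windows = expand_window(tokens, window_size=1)
print(windows[0])       # [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
print(len(windows[1]))  # 6, i.e. width * (2 * window_size + 1)
print(maxout([[0.5, -1.0], [0.2, 3.0]]))  # [0.5, 3.0]
```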

spacy.MishWindowEncoder.v1

The spacy.MishWindowEncoder.v1 architecture produced a model of type Model[Floats2d, Floats2d]. Since spacy.MishWindowEncoder.v2, the model type has been Model[List[Floats2d], List[Floats2d]].

Encode context using convolutions with Mish activation, layer normalization and residual connections.

| Name | Description | Type |
| --- | --- | --- |
| width | The input and output width. These are required to be the same, to allow residual connections. This value will be determined by the width of the inputs. Recommended values are between 64 and 300. | int |
| window_size | The number of words to concatenate around each token to construct the convolution. Recommended value is 1. | int |
| depth | The number of convolutional layers. Recommended value is 4. | int |
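For reference, the Mish activation is defined as x * tanh(softplus(x)), where softplus(x) = ln(1 + e^x). The sketch below computes it for scalars and also shows why a residual connection needs matching input and output widths. The helper names are illustrative and the code is not taken from spaCy or Thinc.

```python
import math
from typing import List


def softplus(x: float) -> float:
    """Numerically stable ln(1 + e^x)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


def residual(inputs: List[float], outputs: List[float]) -> List[float]:
    """Add a layer's output to its input; both must have the same width."""
    if len(inputs) != len(outputs):
        raise ValueError("residual connections need equal input and output width")
    return [a + b for a, b in zip(inputs, outputs)]


for value in (-2.0, 0.0, 1.0, 4.0):
    print(f"mish({value:+.1f}) = {mish(value):+.4f}")

vector = [0.5, -1.0, 2.0]
print(residual(vector, [mish(x) for x in vector]))
```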

spacy.HashEmbedCNN.v1

Identical to spacy.HashEmbedCNN.v2, except that it uses spacy.StaticVectors.v1 if vectors are included.

spacy.MultiHashEmbed.v1

Identical to spacy.MultiHashEmbed.v2, except that it uses spacy.StaticVectors.v1 if vectors are included.

spacy.CharacterEmbed.v1

Identical to spacy.CharacterEmbed.v2, except that it uses spacy.StaticVectors.v1 if vectors are included.

spacy.TextCatEnsemble.v1

The spacy.TextCatEnsemble.v1 architecture built an internal tok2vec and linear_model. Since spacy.TextCatEnsemble.v2, this has been refactored so that the TextCatEnsemble takes these two sublayers as input.

Stacked ensemble of a bag-of-words model and a neural network model. The neural network has an internal CNN Tok2Vec layer and uses attention.

| Name | Description | Type |
| --- | --- | --- |
| exclusive_classes | Whether or not categories are mutually exclusive. | bool |
| pretrained_vectors | Whether or not pretrained vectors will be used in addition to the feature vectors. | bool |
| width | Output dimension of the feature encoding step. | int |
| embed_size | Input dimension of the feature encoding step. | int |
| conv_depth | Depth of the tok2vec layer. | int |
| window_size | The number of contextual vectors to concatenate from the left and from the right. | int |
| ngram_size | Determines the maximum length of the n-grams in the BOW model. For instance, ngram_size=3 would give unigram, bigram and trigram features. | int |
| dropout | The dropout rate. | float |
| nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. | Optional[int] |
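Because v1 builds its sublayers internally, all of these settings sit in a single config block. The sketch below holds a hypothetical block as a Python string and uses the standard-library configparser to check that it names exactly the documented parameters. The values are illustrative rather than recommended, and check_block is a made-up helper, not part of spaCy.

```python
import configparser

# Parameter names taken from the table above.
DOCUMENTED_PARAMS = {
    "exclusive_classes", "pretrained_vectors", "width", "embed_size",
    "conv_depth", "window_size", "ngram_size", "dropout", "nO",
}

# Hypothetical config block; every value here is illustrative only.
CONFIG_TEXT = """
[model]
@architectures = "spacy.TextCatEnsemble.v1"
exclusive_classes = false
pretrained_vectors = false
width = 64
embed_size = 2000
conv_depth = 2
window_size = 1
ngram_size = 1
dropout = 0.1
nO = null
"""


def check_block(config_text, section="model"):
    """Return (missing, unexpected) parameter names for one config section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of keys such as nO
    parser.read_string(config_text)
    keys = {key for key in parser[section] if not key.startswith("@")}
    missing = sorted(DOCUMENTED_PARAMS - keys)
    unexpected = sorted(keys - DOCUMENTED_PARAMS)
    return missing, unexpected


print(check_block(CONFIG_TEXT))  # ([], [])
```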

spacy.TextCatCNN.v1

Since spacy.TextCatCNN.v2, this architecture has been resizable, which means that you can add labels to a previously trained textcat. TextCatCNN.v1 did not yet support that. TextCatCNN has been replaced by the more general TextCatReduce layer. TextCatCNN is identical to TextCatReduce with use_reduce_mean=true, use_reduce_first=false, use_reduce_last=false and use_reduce_max=false.

A neural network model where token vectors are calculated using a CNN. The vectors are mean pooled and used as features in a feed-forward network. This architecture is usually less accurate than the ensemble, but runs faster.

| Name | Description | Type |
| --- | --- | --- |
| exclusive_classes | Whether or not categories are mutually exclusive. | bool |
| tok2vec | The tok2vec layer of the model. | Model |
| nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. | Optional[int] |
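The pooling step can be sketched in plain Python. The function below takes the token vectors of one document and combines them according to four flags named after the TextCatReduce settings. With only use_reduce_mean enabled, it reduces to the mean pooling described for TextCatCNN. Concatenating the enabled reductions in this particular order is an assumption made for the sketch, and reduce_doc is a hypothetical name.

```python
from typing import List


def reduce_doc(
    token_vectors: List[List[float]],
    *,
    use_reduce_first: bool = False,
    use_reduce_last: bool = False,
    use_reduce_max: bool = False,
    use_reduce_mean: bool = True,
) -> List[float]:
    """Pool the token vectors of one document into a single feature vector."""
    if not token_vectors:
        raise ValueError("cannot pool an empty document")
    columns = list(zip(*token_vectors))
    features: List[float] = []
    if use_reduce_first:
        features.extend(token_vectors[0])
    if use_reduce_last:
        features.extend(token_vectors[-1])
    if use_reduce_max:
        features.extend(max(column) for column in columns)
    if use_reduce_mean:
        features.extend(sum(column) / len(column) for column in columns)
    return features


doc_vectors = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
print(reduce_doc(doc_vectors))  # mean pooling only: [3.0, 6.0]
print(reduce_doc(doc_vectors, use_reduce_first=True, use_reduce_max=True))
# [1.0, 2.0, 5.0, 10.0, 3.0, 6.0]
```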

spacy.TextCatCNN.v2

A neural network model where token vectors are calculated using a CNN. The vectors are mean pooled and used as features in a feed-forward network. This architecture is usually less accurate than the ensemble, but runs faster.

TextCatCNN has been replaced by the more general TextCatReduce layer. TextCatCNN is identical to TextCatReduce with use_reduce_mean=true, use_reduce_first=false, use_reduce_last=false and use_reduce_max=false.

| Name | Description | Type |
| --- | --- | --- |
| exclusive_classes | Whether or not categories are mutually exclusive. | bool |
| tok2vec | The tok2vec layer of the model. | Model |
| nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. | Optional[int] |

TextCatCNN.v1 had the exact same signature, but was not yet resizable. Since v2, new labels can be added to this component, even after training.
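Conceptually, a resizable output layer can grow by one row of weights per new label while leaving the trained rows untouched. The following sketch illustrates that idea with plain lists. It is not how spaCy implements resizing, the zero initialisation of the new rows is an assumption, and resize_output is a hypothetical helper.

```python
from typing import List, Tuple


def resize_output(
    weights: List[List[float]], bias: List[float], new_n_labels: int
) -> Tuple[List[List[float]], List[float]]:
    """Return copies of the output weights and bias with rows for new labels.

    Rows that belong to already trained labels are copied unchanged.
    """
    n_old = len(weights)
    if new_n_labels < n_old:
        raise ValueError("this sketch only supports adding labels")
    n_inputs = len(weights[0]) if weights else 0
    extra = new_n_labels - n_old
    new_weights = [row[:] for row in weights] + [[0.0] * n_inputs for _ in range(extra)]
    new_bias = bias[:] + [0.0] * extra
    return new_weights, new_bias


trained_weights = [[0.2, -0.1], [0.7, 0.4]]  # two labels, two input features
trained_bias = [0.05, -0.3]
weights, bias = resize_output(trained_weights, trained_bias, new_n_labels=3)
print(weights)  # [[0.2, -0.1], [0.7, 0.4], [0.0, 0.0]]
print(bias)     # [0.05, -0.3, 0.0]
```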

spacy.TextCatBOW.v1

Since spacy.TextCatBOW.v2, this architecture has become resizable, which means that you can add labels to a previously trained textcat. TextCatBOW v1 did not yet support that. Versions of this model before spacy.TextCatBOW.v3 used an erroneous sparse linear layer that only used a small number of the allocated parameters.

An n-gram “bag-of-words” model. This architecture should run much faster than the others, but may not be as accurate, especially if texts are short.

| Name | Description | Type |
| --- | --- | --- |
| exclusive_classes | Whether or not categories are mutually exclusive. | bool |
| ngram_size | Determines the maximum length of the n-grams in the BOW model. For instance, ngram_size=3 would give unigram, bigram and trigram features. | int |
| no_output_layer | Whether or not to add an output layer to the model (Softmax activation if exclusive_classes is True, else Logistic). | bool |
| nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. | Optional[int] |
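To make the effect of ngram_size concrete, the sketch below lists every n-gram up to a maximum length for a tokenized text. It only illustrates which features ngram_size=3 covers. The actual model hashes its features into a sparse linear layer, which is not shown here, and the function name ngrams is chosen for illustration.

```python
from typing import List, Tuple


def ngrams(tokens: List[str], ngram_size: int) -> List[Tuple[str, ...]]:
    """Return all n-grams of length 1 up to ngram_size, shortest first."""
    features = []
    for n in range(1, ngram_size + 1):
        for start in range(len(tokens) - n + 1):
            features.append(tuple(tokens[start:start + n]))
    return features


for feature in ngrams(["new", "york", "city"], ngram_size=3):
    print(feature)
# ('new',)
# ('york',)
# ('city',)
# ('new', 'york')
# ('york', 'city')
# ('new', 'york', 'city')
```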

spacy.TextCatBOW.v2

Versions of this model before spacy.TextCatBOW.v3 used an erroneous sparse linear layer that only used a small number of the allocated parameters.

An n-gram “bag-of-words” model. This architecture should run much faster than the others, but may not be as accurate, especially if texts are short.

| Name | Description | Type |
| --- | --- | --- |
| exclusive_classes | Whether or not categories are mutually exclusive. | bool |
| ngram_size | Determines the maximum length of the n-grams in the BOW model. For instance, ngram_size=3 would give unigram, bigram and trigram features. | int |
| no_output_layer | Whether or not to add an output layer to the model (Softmax activation if exclusive_classes is True, else Logistic). | bool |
| nO | Output dimension, determined by the number of different labels. If not set, the TextCategorizer component will set it when initialize is called. | Optional[int] |
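The choice of output layer follows from exclusive_classes: softmax makes the scores compete and sum to 1, while a logistic function scores each label independently. A minimal sketch of that difference follows. It assumes that no_output_layer=True leaves the raw scores untouched, and output_scores is a hypothetical helper rather than spaCy code.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Normalise scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def logistic(scores: List[float]) -> List[float]:
    """Squash each score independently into the range (0, 1)."""
    return [1.0 / (1.0 + math.exp(-score)) for score in scores]


def output_scores(
    scores: List[float], exclusive_classes: bool, no_output_layer: bool = False
) -> List[float]:
    """Apply the output activation unless no_output_layer is set (assumption)."""
    if no_output_layer:
        return list(scores)
    return softmax(scores) if exclusive_classes else logistic(scores)


raw = [2.0, 0.0, -1.0]
print(output_scores(raw, exclusive_classes=True))   # sums to 1
print(output_scores(raw, exclusive_classes=False))  # independent per label
```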

spacy.TransitionBasedParser.v1

Identical to spacy.TransitionBasedParser.v2, except that use_upper was set to True by default.

Layers

These functions are available from @spacy.registry.layers.

spacy.StaticVectors.v1

Identical to spacy.StaticVectors.v2 except for the handling of tokens without vectors.

Loggers

These functions are available from @spacy.registry.loggers.

spacy.ConsoleLogger.v1

Writes the results of a training step to the console in a tabular format.

Note that the cumulative loss keeps increasing within one epoch, but should start decreasing across epochs.

| Name | Description | Type |
| --- | --- | --- |
| progress_bar | Whether the logger should print the progress bar. | bool |
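The note about cumulative loss can be illustrated with a toy logger. The sketch below prints one row per training step and keeps a running loss total that restarts with each epoch, so values rise within an epoch while the per-epoch totals fall as training improves. The column labels, the made-up loss values and the restart of the running total at each epoch are assumptions for illustration. This is not the ConsoleLogger implementation.

```python
from typing import Dict, List, Sequence, Tuple

# Made-up per-step losses grouped by epoch, for illustration only.
STEP_LOSSES: Dict[int, List[float]] = {0: [4.0, 3.5, 3.0], 1: [2.5, 2.0, 1.5]}


def format_row(values: Sequence[object], widths: Sequence[int]) -> str:
    """Right-align each value in its column."""
    return "  ".join(str(value).rjust(width) for value, width in zip(values, widths))


def log_training(step_losses: Dict[int, List[float]]) -> Tuple[List[str], Dict[int, float]]:
    """Build table rows and return them with the final cumulative loss per epoch."""
    widths = (3, 5, 8)
    lines = [
        format_row(("E", "#", "LOSS"), widths),
        format_row(["-" * width for width in widths], widths),
    ]
    step = 0
    epoch_totals: Dict[int, float] = {}
    for epoch, losses in step_losses.items():
        cumulative = 0.0  # running total restarts with every epoch
        for loss in losses:
            cumulative += loss
            step += 1
            lines.append(format_row((epoch, step, f"{cumulative:.2f}"), widths))
        epoch_totals[epoch] = cumulative
    return lines, epoch_totals


rows, totals = log_training(STEP_LOSSES)
print("\n".join(rows))
print(totals)  # {0: 10.5, 1: 6.0}
```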

Logging utilities for spaCy are implemented in the spacy-loggers repo, and the functions are typically available from @spacy.registry.loggers.

More documentation can be found in that repo’s readme file.