DependencyParser class

This class is a subclass of Pipe and follows the same API. The pipeline component is available in the processing pipeline via the ID "parser".

DependencyParser.Model classmethod

Initialize a model for the pipe. The model should implement the thinc.neural.Model API. Wrappers are under development for most major machine learning libraries.

Name | Type | Description
**kwargs | - | Parameters for initializing the model.

DependencyParser.__init__ method

Create a new pipeline instance. In your application, you would normally use a shortcut for this and instantiate the component using its string name and nlp.create_pipe.

Name | Type | Description
vocab | Vocab | The shared vocabulary.
model | thinc.neural.Model / True | The model powering the pipeline component. If no model is supplied, the model is created when you call begin_training, from_disk or from_bytes.
**cfg | - | Configuration parameters.

DependencyParser.__call__ method

Apply the pipe to one document. The document is modified in place, and returned. This usually happens under the hood when the nlp object is called on a text and all pipeline components are applied to the Doc in order. Both __call__ and pipe delegate to the predict and set_annotations methods.

Name | Type | Description
doc | Doc | The document to process.

DependencyParser.pipe method

Apply the pipe to a stream of documents. This usually happens under the hood when the nlp object processes a stream of texts (for example via nlp.pipe) and all pipeline components are applied to each Doc in order. Both __call__ and pipe delegate to the predict and set_annotations methods.

Name | Type | Description
stream | iterable | A stream of documents.
batch_size | int | The number of texts to buffer. Defaults to 128.
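
The delegation described for __call__ and pipe can be sketched with a toy component. This is a minimal illustration only, not spaCy's implementation: ToyDoc, ToyPipe and the "attach every token to token 0" scoring rule are hypothetical stand-ins, and the batching with itertools.islice is an assumption about how buffering might be done.

```python
from itertools import islice


class ToyDoc:
    """Hypothetical stand-in for a Doc: words plus a slot for annotations."""

    def __init__(self, words):
        self.words = list(words)
        self.heads = None  # filled in by set_annotations


class ToyPipe:
    """Hypothetical component following the predict / set_annotations split."""

    def predict(self, docs):
        # Compute scores without modifying the docs.
        # Toy "model": every token gets head index 0.
        return [[0 for _ in doc.words] for doc in docs]

    def set_annotations(self, docs, scores):
        # Write pre-computed scores onto the docs, in place.
        for doc, doc_scores in zip(docs, scores):
            doc.heads = doc_scores

    def __call__(self, doc):
        # One document: predict on a batch of one, annotate, return the same object.
        self.set_annotations([doc], self.predict([doc]))
        return doc

    def pipe(self, stream, batch_size=128):
        # Stream of documents: buffer batch_size docs at a time, then yield them.
        stream = iter(stream)
        while True:
            batch = list(islice(stream, batch_size))
            if not batch:
                break
            self.set_annotations(batch, self.predict(batch))
            for doc in batch:
                yield doc


pipe = ToyPipe()
doc = ToyDoc(["I", "like", "parsing"])
same_doc = pipe(doc)
stream_docs = list(pipe.pipe([ToyDoc(["a", "b"]), ToyDoc(["c"])], batch_size=1))
```

Keeping predict free of side effects means the same scoring code can serve both the single-document path and the batched stream path.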

DependencyParser.predict method

Apply the pipeline’s model to a batch of docs, without modifying them.

Name | Type | Description
docs | iterable | The documents to predict.

DependencyParser.set_annotations method

Modify a batch of documents, using pre-computed scores.

Name | Type | Description
docs | iterable | The documents to modify.
scores | - | The scores to set, produced by DependencyParser.predict.

DependencyParser.update method

Learn from a batch of documents and gold-standard information, updating the pipe’s model. Delegates to predict and get_loss.

Name | Type | Description
docs | iterable | A batch of documents to learn from.
golds | iterable | The gold-standard data. Must have the same length as docs.
drop | float | The dropout rate.
sgd | callable | The optimizer. Should take two arguments weights and gradient, and an optional ID.
losses | dict | Optional record of the loss during training. The value keyed by the model’s name is updated.
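
To make the roles of sgd and losses concrete, here is a minimal sketch of an update step on a toy one-weight model. It is not the parser's training code: ToyModelPipe, plain_sgd, the squared-error loss and the learn_rate value are all hypothetical, the "docs" are plain numbers, and dropout is accepted but ignored.

```python
class ToyModelPipe:
    """Hypothetical component with a single weight, trained by squared error."""

    name = "toy_parser"

    def __init__(self):
        self.weights = [0.0]

    def predict(self, docs):
        # Score each "doc" (here a plain number) as weight * input.
        return [self.weights[0] * x for x in docs]

    def get_loss(self, docs, golds, scores):
        # Squared-error loss and its gradient with respect to the weight.
        loss = sum((s - g) ** 2 for s, g in zip(scores, golds))
        gradient = [sum(2 * (s - g) * x for s, g, x in zip(scores, golds, docs))]
        return loss, gradient

    def update(self, docs, golds, drop=0.0, sgd=None, losses=None):
        # drop is part of the signature but unused in this toy.
        if len(docs) != len(golds):
            raise ValueError("docs and golds must have the same length")
        scores = self.predict(docs)
        loss, gradient = self.get_loss(docs, golds, scores)
        if sgd is not None:
            sgd(self.weights, gradient, key=self.name)
        if losses is not None:
            # Accumulate the loss under the component's name.
            losses[self.name] = losses.get(self.name, 0.0) + loss


def plain_sgd(weights, gradient, key=None, learn_rate=0.1):
    # Optimizer callable: takes weights and gradient, plus an optional ID.
    for i, grad in enumerate(gradient):
        weights[i] -= learn_rate * grad


toy = ToyModelPipe()
losses = {}
toy.update([1.0, 2.0], [2.0, 4.0], sgd=plain_sgd, losses=losses)
```

Passing the same losses dict to several components lets a training loop collect one running total per component name.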

DependencyParser.get_loss method

Find the loss and the gradient of the loss for a batch of documents and their predicted scores.

Name | Type | Description
docs | iterable | The batch of documents.
golds | iterable | The gold-standard data. Must have the same length as docs.
scores | - | Scores representing the model’s predictions.

DependencyParser.begin_training method

Initialize the pipe for training, using data examples if available. If no model has been initialized yet, the model is added.

Name | Type | Description
gold_tuples | iterable | Optional gold-standard annotations from which to construct GoldParse objects.
pipeline | list | Optional list of pipeline components that this component is part of.
sgd | callable | An optional optimizer. Should take two arguments weights and gradient, and an optional ID. Will be created via DependencyParser.create_optimizer if not set.
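
The "create the model late, create the optimizer if none is given" behavior can be sketched as follows. This is an illustration under stated assumptions: LazyPipe is hypothetical, the toy model is just a dict of settings, and returning the optimizer from begin_training is an assumption not stated in the text above.

```python
class LazyPipe:
    """Hypothetical component whose model may be created late."""

    def __init__(self, model=True, **cfg):
        # True is a placeholder meaning "no model yet".
        self.model = model
        self.cfg = dict(cfg)

    @classmethod
    def Model(cls, **kwargs):
        # Toy model: just a dict holding its settings.
        return dict(kwargs)

    def create_optimizer(self):
        # Toy optimizer taking weights, gradient and an optional ID.
        def sgd(weights, gradient, key=None):
            for i, grad in enumerate(gradient):
                weights[i] -= 0.001 * grad

        return sgd

    def begin_training(self, gold_tuples=(), pipeline=None, sgd=None):
        # Add the model only if none has been initialized yet.
        if self.model is True:
            self.model = self.Model(**self.cfg)
        # Fall back to the component's own optimizer when none is passed in.
        if sgd is None:
            sgd = self.create_optimizer()
        return sgd  # assumption: the optimizer is handed back to the caller


lazy = LazyPipe(width=64)
had_placeholder = lazy.model is True
optimizer = lazy.begin_training()

preset = LazyPipe(model={"width": 128})
preset.begin_training()
```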

DependencyParser.create_optimizer method

Create an optimizer for the pipeline component.

DependencyParser.use_params method, contextmanager

Modify the pipe’s model, to use the given parameter values.

Name | Type | Description
params | - | The parameter values to use in the model. At the end of the context, the original parameters are restored.
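
The swap-and-restore pattern behind a method like this can be sketched with contextlib. ParamPipe and its params dict are hypothetical; the real component stores its parameters inside the model, not in a plain attribute.

```python
from contextlib import contextmanager


class ParamPipe:
    """Hypothetical component holding its parameters in a plain dict."""

    def __init__(self):
        self.params = {"W": [1.0, 2.0]}

    @contextmanager
    def use_params(self, params):
        original = self.params
        self.params = params  # use the given values inside the context
        try:
            yield
        finally:
            self.params = original  # restore, even if the body raised


pipe = ParamPipe()
averaged = {"W": [0.5, 1.5]}  # e.g. averaged weights used for evaluation
with pipe.use_params(averaged):
    inside = pipe.params["W"]
after = pipe.params["W"]
```

The try/finally ensures the original parameters come back even when the code inside the with block fails.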

DependencyParser.add_label method

Add a new label to the pipe.

Name | Type | Description
label | unicode | The label to add.

DependencyParser.to_disk method

Serialize the pipe to disk.

Name | Type | Description
path | unicode / Path | A path to a directory, which will be created if it doesn’t exist. Paths may be either strings or Path-like objects.
exclude | list | String names of serialization fields to exclude.

DependencyParser.from_disk method

Load the pipe from disk. Modifies the object in place and returns it.

Name | Type | Description
path | unicode / Path | A path to a directory. Paths may be either strings or Path-like objects.
exclude | list | String names of serialization fields to exclude.

DependencyParser.to_bytes method

Serialize the pipe to a bytestring.

Name | Type | Description
exclude | list | String names of serialization fields to exclude.

DependencyParser.from_bytes method

Load the pipe from a bytestring. Modifies the object in place and returns it.

Name | Type | Description
bytes_data | bytes | The data to load from.
exclude | list | String names of serialization fields to exclude.

DependencyParser.labels property

The labels currently added to the component.

Serialization fields

During serialization, spaCy will export several data fields used to restore different aspects of the object. If needed, you can exclude them from serialization by passing in the string names via the exclude argument.

Name | Description
vocab | The shared Vocab.
cfg | The config file. You usually don’t want to exclude this.
model | The binary model data. You usually don’t want to exclude this.
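
How an exclude argument can filter named fields during a round trip is sketched below. This is a toy, not spaCy's serialization format: SerializablePipe is hypothetical, JSON is used only for simplicity, and the field values are made up. Only the three field names come from the table above.

```python
import json


class SerializablePipe:
    """Hypothetical component with three named serialization fields."""

    FIELDS = ("vocab", "cfg", "model")

    def __init__(self, vocab=None, cfg=None, model=None):
        self.vocab = vocab if vocab is not None else []
        self.cfg = cfg if cfg is not None else {}
        self.model = model if model is not None else []

    def to_bytes(self, exclude=()):
        # Export every field whose name is not excluded.
        data = {name: getattr(self, name) for name in self.FIELDS if name not in exclude}
        return json.dumps(data).encode("utf8")

    def from_bytes(self, bytes_data, exclude=()):
        # Restore fields in place, skipping excluded names, and return self.
        data = json.loads(bytes_data.decode("utf8"))
        for name, value in data.items():
            if name in self.FIELDS and name not in exclude:
                setattr(self, name, value)
        return self


source = SerializablePipe(vocab=["the", "cat"], cfg={"labels": ["nsubj"]}, model=[0.1, 0.2])
payload = source.to_bytes(exclude=["vocab"])  # the shared vocab is left out

target = SerializablePipe(vocab=["shared"])
loaded = target.from_bytes(payload)
```

Excluding vocab fits the case where the vocabulary is shared and stored once elsewhere, while cfg and model are the fields needed to restore the component itself.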