Example

class v3

A training instance

An Example holds the information for one training instance. It stores two Doc objects: one for holding the gold-standard reference data, and one for holding the predictions of the pipeline. An Alignment object stores the alignment between these two documents, as they can differ in tokenization.

Example.__init__ method

Construct an Example object from the predicted document and the reference document. If alignment is None, it will be initialized from the words in both documents.

Name	Description
`predicted`	The document containing (partial) predictions. Cannot be `None`. Doc
`reference`	The document containing gold-standard annotations. Cannot be `None`. Doc
keyword-only
`alignment`	An object holding the alignment between the tokens of the `predicted` and `reference` documents. Optional[Alignment]
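
As a minimal sketch of the constructor, the snippet below builds both documents by hand. The empty `Vocab` and the toy sentence are assumptions made for illustration; in a real pipeline the predicted document would usually come from `nlp` and the reference from annotated data.

```python
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # an empty vocabulary is enough for this illustration

# The pipeline kept "sunscreen" as a single token ...
predicted = Doc(
    vocab,
    words=["Apply", "some", "sunscreen"],
    spaces=[True, True, False],
)
# ... while the gold-standard data splits it into two tokens.
reference = Doc(
    vocab,
    words=["Apply", "some", "sun", "screen"],
    spaces=[True, True, False, False],
    tags=["VERB", "DET", "NOUN", "NOUN"],
)

# No alignment is passed, so it is computed from the words of both documents.
example = Example(predicted, reference)
print(example.text)
print(len(example.predicted), len(example.reference))
```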

Example.from_dict classmethod

Construct an Example object from the predicted document and the reference annotations provided as a dictionary. For more details on the required format, see the training format documentation.

Name	Description
`predicted`	The document containing (partial) predictions. Cannot be `None`. Doc
`example_dict`	The gold-standard annotations as a dictionary. Cannot be `None`. Dict[str, Any]
RETURNS	The newly constructed object. Example
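
A short sketch of `from_dict`, assuming an empty `Vocab` and a hand-written annotation dictionary with the `words` and `tags` keys; the sentence itself is only a toy example.

```python
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab

predicted = Doc(vocab, words=["Apply", "some", "sunscreen"])
token_ref = ["Apply", "some", "sun", "screen"]
tags_ref = ["VERB", "DET", "NOUN", "NOUN"]

# The dictionary describes the reference document: its tokens and their tags.
example = Example.from_dict(predicted, {"words": token_ref, "tags": tags_ref})

print([token.text for token in example.reference])
print([token.tag_ for token in example.reference])
```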

Example.text property

The text of the predicted document in this Example.

Name	Description
RETURNS	The text of the `predicted` document. str

Example.predicted property

The Doc holding the predictions. Occasionally also referred to as example.x.

Name	Description
RETURNS	The document containing (partial) predictions. Doc

Example.reference property

The Doc holding the gold-standard annotations. Occasionally also referred to as example.y.

Name	Description
RETURNS	The document containing gold-standard annotations. Doc

Example.alignment property

The Alignment object mapping the tokens of the predicted document to those of the reference document.

Name	Description
RETURNS	The `Alignment` object mapping the tokens of the `predicted` document to those of the `reference` document. Alignment
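
The four properties above can be inspected together. This is a small sketch on a toy example with an empty `Vocab` (both assumptions for illustration); it also reads the shorthand attributes `x` and `y` mentioned in the descriptions.

```python
from spacy.tokens import Doc
from spacy.training import Alignment, Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab
predicted = Doc(vocab, words=["Apply", "some", "sunscreen"])
example = Example.from_dict(
    predicted, {"words": ["Apply", "some", "sun", "screen"]}
)

# The text always comes from the predicted document.
print(repr(example.text))

# predicted / reference, plus their shorthand names x / y.
print([token.text for token in example.predicted])
print([token.text for token in example.reference])
print(example.x.text == example.predicted.text)
print(example.y.text == example.reference.text)

# The alignment between the two tokenizations.
print(isinstance(example.alignment, Alignment))
```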

Example.get_aligned method

Get the aligned view of a certain token attribute, denoted by its int ID or string name.

Name	Description
`field`	Attribute ID or string name. Union[int, str]
`as_string`	Whether or not to return the list of values as strings. Defaults to `False`. bool
RETURNS	List of integer values, or string values if `as_string` is `True`. Union[List[int], List[str]]
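
To illustrate, the sketch below asks for the gold-standard tags projected onto the predicted tokenization. The empty `Vocab` and the toy sentence are assumptions; the single predicted token "sunscreen" corresponds to two reference tokens that share the same tag, so that tag can be carried over.

```python
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab
predicted = Doc(vocab, words=["Apply", "some", "sunscreen"])
token_ref = ["Apply", "some", "sun", "screen"]
tags_ref = ["VERB", "DET", "NOUN", "NOUN"]
example = Example.from_dict(predicted, {"words": token_ref, "tags": tags_ref})

# One value per token of the predicted document.
aligned_tags = example.get_aligned("TAG", as_string=True)
print(aligned_tags)
```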

Example.get_aligned_parse method

Get the aligned view of the dependency parse. If projectivize is set to True, non-projective dependency trees are made projective through the Pseudo-Projective Dependency Parsing algorithm by Nivre and Nilsson (2005).

Name	Description
`projectivize`	Whether or not to projectivize the dependency trees. Defaults to `True`. bool
RETURNS	The aligned head indices and the aligned dependency labels, as two lists with one entry per predicted token. Tuple[List[int], List[str]]
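
A small sketch of the effect of `projectivize`, assuming an empty `Vocab`, a toy sentence, and placeholder `"dep"` labels that carry no linguistic meaning. The call is unpacked into two lists, heads and labels. In the gold heads, "away" attaches to "quickly" across the root "walks", which makes the tree non-projective.

```python
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab
words = ["He", "pretty", "quickly", "walks", "away"]
heads = [3, 2, 3, 3, 2]      # absolute head indices; "walks" is the root
deps = ["dep"] * len(words)  # placeholder labels, purely illustrative

example = Example.from_dict(
    Doc(vocab, words=words), {"heads": heads, "deps": deps}
)

# With projectivization, the crossing arc is lifted to a higher head.
proj_heads, proj_deps = example.get_aligned_parse(projectivize=True)
# Without it, the gold heads are returned as annotated.
nonproj_heads, nonproj_deps = example.get_aligned_parse(projectivize=False)

print(proj_heads, proj_deps)
print(nonproj_heads, nonproj_deps)
```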

Example.get_aligned_ner method

Get the aligned view of the NER BILUO tags.

Name	Description
RETURNS	List of BILUO values, denoting whether tokens are part of an NER annotation or not. List[str]
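
As an illustration, the sketch below aligns character-offset entities from a reference tokenization onto a different predicted tokenization. The empty `Vocab`, the sentence, and the entity offsets are assumptions chosen for the example.

```python
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab

# Predicted tokenization: "Mrs" and "Smith" are separate, "New York" is one token.
doc = Doc(vocab, words=["Mrs", "Smith", "flew", "to", "New York"])

# Reference tokenization and entities given as (start_char, end_char, label).
gold_words = ["Mrs Smith", "flew", "to", "New", "York"]
entities = [(0, 9, "PERSON"), (18, 26, "LOC")]
example = Example.from_dict(doc, {"words": gold_words, "entities": entities})

# One BILUO tag per token of the predicted document.
ner_tags = example.get_aligned_ner()
print(ner_tags)
```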

Example.get_aligned_spans_y2x method

Get the aligned view of any set of Span objects defined over Example.reference. The resulting span indices will align to the tokenization in Example.predicted.

Name	Description
`y_spans`	`Span` objects aligned to the tokenization of `reference`. Iterable[Span]
`allow_overlap`	Whether the resulting `Span` objects may overlap or not. Set to `False` by default. bool
RETURNS	`Span` objects aligned to the tokenization of `predicted`. List[Span]
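
A minimal sketch of mapping reference spans onto the predicted tokenization, assuming an empty `Vocab` and a toy sentence in which the predicted document merges "Mr and Mrs Smith" into one token.

```python
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab

doc = Doc(vocab, words=["Mr and Mrs Smith", "flew", "to", "New York"])
tokens_ref = ["Mr", "and", "Mrs", "Smith", "flew", "to", "New", "York"]
entities = [(0, 16, "PERSON")]  # character offsets in the reference text
example = Example.from_dict(doc, {"words": tokens_ref, "entities": entities})

# The entity covers four tokens in the reference document ...
ents_ref = example.reference.ents
print([(ent.start, ent.end) for ent in ents_ref])

# ... and a single token once aligned to the predicted document.
ents_y2x = example.get_aligned_spans_y2x(ents_ref)
print([(ent.start, ent.end) for ent in ents_y2x])
```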

Example.get_aligned_spans_x2y method

Get the aligned view of any set of Span objects defined over Example.predicted. The resulting span indices will align to the tokenization in Example.reference. This method is particularly useful to assess the accuracy of predicted entities against the original gold-standard annotation.

Name	Description
`x_spans`	`Span` objects aligned to the tokenization of `predicted`. Iterable[Span]
`allow_overlap`	Whether the resulting `Span` objects may overlap or not. Set to `False` by default. bool
RETURNS	`Span` objects aligned to the tokenization of `reference`. List[Span]
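
The reverse direction can be sketched the same way. Here the entity on the predicted document is set by hand to stand in for the output of an NER component; that, the empty `Vocab`, and the toy sentence are assumptions for illustration.

```python
from spacy.tokens import Doc, Span
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab

predicted = Doc(
    vocab, words=["Mr", "and", "Mrs", "Smith", "flew", "to", "New", "York"]
)
# Pretend an NER component found "Mr and Mrs Smith" (tokens 0 to 3).
predicted.ents = [Span(predicted, 0, 4, label="PERSON")]

tokens_ref = ["Mr and Mrs", "Smith", "flew", "to", "New York"]
example = Example.from_dict(predicted, {"words": tokens_ref})

ents_pred = example.predicted.ents
ents_x2y = example.get_aligned_spans_x2y(ents_pred)

# The same entity covers only two tokens in the reference tokenization.
print([(ent.start, ent.end, ent.label_) for ent in ents_x2y])
```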

Example.to_dict method

Return a dictionary representation of the reference annotation contained in this Example.

Name	Description
RETURNS	Dictionary representation of the reference annotation. Dict[str, Any]
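
A brief sketch of inspecting the dictionary returned by `to_dict`. The empty `Vocab` and toy annotations are assumptions, and the exact keys of the result are not listed in this section, so the snippet only prints whatever top-level sections it finds rather than relying on specific names.

```python
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab
predicted = Doc(vocab, words=["Apply", "some", "sunscreen"])
example = Example.from_dict(
    predicted,
    {
        "words": ["Apply", "some", "sun", "screen"],
        "tags": ["VERB", "DET", "NOUN", "NOUN"],
    },
)

# The dictionary describes the reference document, not the predictions.
annotations = example.to_dict()
for section, values in annotations.items():
    print(section, values)
```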

Example.split_sents method

Split one Example into multiple Example objects, one for each sentence.

Name	Description
RETURNS	List of `Example` objects, one for each original sentence. List[Example]
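
As a sketch, the snippet below splits a two-sentence example using sentence boundaries supplied in the reference annotations through the `sent_starts` key. The empty `Vocab` and the toy sentence are assumptions for illustration.

```python
from spacy.tokens import Doc
from spacy.training import Example
from spacy.vocab import Vocab

vocab = Vocab()  # illustrative; normally you would use nlp.vocab
tokens_ref = ["I", "went", "yesterday", "had", "lots", "of", "fun"]
# True marks the first token of each sentence in the reference document.
sents_ref = [True, False, False, True, False, False, False]

doc = Doc(vocab, words=tokens_ref)
example = Example.from_dict(
    doc, {"words": tokens_ref, "sent_starts": sents_ref}
)

split_examples = example.split_sents()
for sub_example in split_examples:
    print([token.text for token in sub_example.predicted])
```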

Alignment v3.0

Calculate alignment tables between two tokenizations.

Alignment attributes

Alignment attributes are managed using AlignmentArray, a simplified version of Thinc’s Ragged type that only supports the data and lengths attributes.

Name	Description
`x2y`	The `AlignmentArray` object holding the alignment from `x` to `y`. AlignmentArray
`y2x`	The `AlignmentArray` object holding the alignment from `y` to `x`. AlignmentArray

Alignment.from_strings function

Name	Description
`A`	String values of candidate tokens to align. List[str]
`B`	String values of reference tokens to align. List[str]
RETURNS	An `Alignment` object describing the alignment. Alignment
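
A short sketch of building an alignment directly from two token lists and reading the `x2y` and `y2x` tables. The token lists are a toy example, and the snippet assumes a spaCy version in which the tables are `AlignmentArray` objects exposing a flat `data` array.

```python
from spacy.training import Alignment

# Tokenization A splits the possessive into two tokens, B keeps it as one.
other_tokens = ["obama", "'", "s", "podcast"]
spacy_tokens = ["obama", "'s", "podcast"]
alignment = Alignment.from_strings(other_tokens, spacy_tokens)

# For each token in A, the index of the token in B it aligns to.
a2b = alignment.x2y
print(list(a2b.data))

# For each token in B, the indices of the tokens in A it covers, flattened.
b2a = alignment.y2x
print(list(b2a.data))
```

Here `a2b.data[1] == a2b.data[2] == 1` means that both `"'"` and `"s"` in the first list align to `"'s"` in the second.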
