BaseVectors
BaseVectors is an abstract class to support the development of custom vectors
implementations.
For use in training with StaticVectors,
get_batch must be implemented. For improved performance, use efficient
batching in get_batch and implement to_ops to copy the vector data to the
current device. See an example custom implementation for
BPEmb subword embeddings.
BaseVectors.__init__ method
Create a new vector store.
| Name | Description |
|---|---|
| keyword-only | |
strings | The string store. A new string store is created if one is not provided. Defaults to None. Optional[StringStore] |
BaseVectors.__getitem__ method
Get a vector by key. If the key is not found in the table, a KeyError should
be raised.
| Name | Description |
|---|---|
key | The key to get the vector for. Union[int, str] |
| RETURNS | The vector for the key. numpy.ndarray[ndim=1, dtype=float32] |
BaseVectors.__len__ method
Return the number of vectors in the table.
| Name | Description |
|---|---|
| RETURNS | The number of vectors in the table. int |
BaseVectors.__contains__ method
Check whether there is a vector entry for the given key.
| Name | Description |
|---|---|
key | The key to check. int |
| RETURNS | Whether the key has a vector entry. bool |
BaseVectors.add method
Add a key to the table, if possible. If no keys can be added, return -1.
| Name | Description |
|---|---|
key | The key to add. Union[str, int] |
| RETURNS | The row the vector was added to, or -1 if the operation is not supported. int |
BaseVectors.shape property
Get (rows, dims) tuples of number of rows and number of dimensions in the
vector table.
| Name | Description |
|---|---|
| RETURNS | A (rows, dims) pair. Tuple[int, int] |
BaseVectors.size property
The vector size, i.e. rows * dims.
| Name | Description |
|---|---|
| RETURNS | The vector size. int |
BaseVectors.is_full property
Whether the vectors table is full and no slots are available for new keys.
| Name | Description |
|---|---|
| RETURNS | Whether the vectors table is full. bool |
BaseVectors.get_batch methodv3.2
Get the vectors for the provided keys efficiently as a batch. Required to use
the vectors with StaticVectors for
training.
| Name | Description |
|---|---|
keys | The keys. Iterable[Union[int, str]] |
BaseVectors.to_ops method
Dummy method. Implement this to change the embedding matrix to use different Thinc ops.
| Name | Description |
|---|---|
ops | The Thinc ops to switch the embedding matrix to. Ops |
BaseVectors.to_disk method
Dummy method to allow serialization. Implement to save vector data with the pipeline.
| Name | Description |
|---|---|
path | A path to a directory, which will be created if it doesn’t exist. Paths may be either strings or Path-like objects. Union[str,Path] |
BaseVectors.from_disk method
Dummy method to allow serialization. Implement to load vector data from a saved pipeline.
| Name | Description |
|---|---|
path | A path to a directory. Paths may be either strings or Path-like objects. Union[str,Path] |
| RETURNS | The modified vectors object. BaseVectors |
BaseVectors.to_bytes method
Dummy method to allow serialization. Implement to serialize vector data to a binary string.
| Name | Description |
|---|---|
| RETURNS | The serialized form of the vectors object. bytes |
BaseVectors.from_bytes method
Dummy method to allow serialization. Implement to load vector data from a binary string.
| Name | Description |
|---|---|
data | The data to load from. bytes |
| RETURNS | The vectors object. BaseVectors |