StringStore
Look up strings by 64-bit hashes. As of v2.0, spaCy uses hash values instead of
integer IDs. This ensures that strings always map to the same ID, even from
different StringStores.
StringStore.__init__ method
Create the StringStore.
| Name | Description |
|---|---|
| strings | A sequence of strings to add to the store. Optional[Iterable[str]] |
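A minimal sketch of constructing a store, assuming spaCy is installed. The variable names `stringstore` and `empty_store` are illustrative only.

```python
from spacy.strings import StringStore

# Create a store pre-populated with two strings.
stringstore = StringStore(["apple", "orange"])

# The strings argument is optional, so an empty store can be filled later.
empty_store = StringStore()
```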
StringStore.__len__ method
Get the number of strings in the store.
| Name | Description |
|---|---|
| RETURNS | The number of strings in the store. int |
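As a short illustration, assuming spaCy is installed, the length of a store built from two strings can be read with `len`:

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# Count the strings that were added to the store.
num_strings = len(stringstore)
```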
StringStore.__getitem__ method
Retrieve a string from a given hash, or vice versa.
| Name | Description |
|---|---|
| string_or_id | The value to encode. Union[bytes, str, int] |
| RETURNS | The value to be retrieved. Union[str, int] |
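The two lookup directions can be sketched as follows, assuming spaCy is installed. The names `apple_hash` and `apple_text` are illustrative.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# String -> hash: indexing with a string gives its hash value.
apple_hash = stringstore["apple"]

# Hash -> string: indexing with the hash gives the original string back.
apple_text = stringstore[apple_hash]
```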
StringStore.__contains__ method
Check whether a string is in the store.
| Name | Description |
|---|---|
| string | The string to check. str |
| RETURNS | Whether the store contains the string. bool |
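A brief membership check, assuming spaCy is installed. The string `"cherry"` is an arbitrary example of a value that was never added.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# Membership test for a string that was added.
has_apple = "apple" in stringstore

# Membership test for a string that was never added.
has_cherry = "cherry" in stringstore
```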
StringStore.__iter__ method
Iterate over the strings in the store, in order. Note that a newly initialized
store will always include an empty string "" at position 0.
| Name | Description |
|---|---|
| YIELDS | A string in the store. str |
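Iteration can be sketched like this, assuming spaCy is installed. The example only collects the yielded strings into a list and makes no claim about what else the store may yield.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# Collect every string the store yields into a plain list.
all_strings = [s for s in stringstore]
```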
StringStore.add method
Add a string to the StringStore.
| Name | Description |
|---|---|
| string | The string to add. str |
| RETURNS | The string’s hash value. int |
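A small sketch of adding a string, assuming spaCy is installed. It also shows that adding the same string a second time returns the same hash and does not grow the store.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# Add a new string and keep the returned hash.
banana_hash = stringstore.add("banana")

# Adding the same string again returns the same hash.
banana_hash_again = stringstore.add("banana")
```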
StringStore.to_disk method
Save the current state to a directory.
| Name | Description |
|---|---|
| path | A path to a directory, which will be created if it doesn’t exist. Paths may be either strings or Path-like objects. Union[str, Path] |
StringStore.from_disk method
Load state from a directory. Modifies the object in place and returns it.
| Name | Description |
|---|---|
| path | A path to a directory. Paths may be either strings or Path-like objects. Union[str, Path] |
| RETURNS | The modified StringStore object. StringStore |
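A round trip through `to_disk` and `from_disk` might look like the sketch below, assuming spaCy is installed. The temporary location and the name `strings` are hypothetical choices for illustration, not paths that spaCy requires.

```python
import tempfile
from pathlib import Path

from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical target path inside a temporary directory.
    path = Path(tmp_dir) / "strings"
    stringstore.to_disk(path)

    # from_disk modifies the new store in place and returns it.
    restored = StringStore().from_disk(path)
```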
StringStore.to_bytes method
Serialize the current state to a binary string.
| Name | Description |
|---|---|
| RETURNS | The serialized form of the StringStore object. bytes |
StringStore.from_bytes method
Load state from a binary string.
| Name | Description |
|---|---|
| bytes_data | The data to load from. bytes |
| RETURNS | The StringStore object. StringStore |
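Serializing to a binary string and loading it into a fresh store can be sketched as follows, assuming spaCy is installed. The names `store_bytes` and `new_store` are illustrative.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# Serialize the current state.
store_bytes = stringstore.to_bytes()

# Load the serialized state into a new, empty store.
new_store = StringStore().from_bytes(store_bytes)
```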
Utilities
strings.hash_string function
Get a 64-bit hash for a given string.
| Name | Description |
|---|---|
| string | The string to hash. str |
| RETURNS | The hash. int |
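A short sketch of `hash_string`, assuming spaCy is installed. It compares the standalone hash with the value reported by two separate stores, to illustrate that a string maps to the same hash regardless of which `StringStore` it is in.

```python
from spacy.strings import StringStore, hash_string

# Hash a string without creating a store.
apple_hash = hash_string("apple")

# Two independent stores that both contain the same string.
store_a = StringStore(["apple"])
store_b = StringStore(["apple", "orange"])

hash_in_a = store_a["apple"]
hash_in_b = store_b["apple"]
```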