StringStore
Look up strings by 64-bit hashes. As of v2.0, spaCy uses hash values instead of
integer IDs. This ensures that strings always map to the same ID, even from
different StringStores.
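As a minimal sketch of that property, assuming spaCy is installed, two stores built independently return the same hash for the same string. The store names and sample strings are illustrative only.

```python
from spacy.strings import StringStore

# Two independent stores, filled in a different order.
store_a = StringStore(["apple", "orange"])
store_b = StringStore(["orange", "apple"])

# The hash depends only on the string, not on the store or insertion order.
apple_hash = store_a["apple"]
same_hash = store_b["apple"] == apple_hash

# A hash produced by one store can be resolved by the other,
# because both stores contain the string.
resolved = store_b[apple_hash]
```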
StringStore.__init__ method
Create the StringStore.
Name | Description |
---|---|
strings | A sequence of strings to add to the store. Optional[Iterable[str]] |
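A short sketch of both ways to construct a store, assuming spaCy is installed; the variable names and sample strings are placeholders.

```python
from spacy.strings import StringStore

# Pre-populate the store from any iterable of strings.
stringstore = StringStore(["apple", "orange"])

# The argument is optional; an empty store can be filled later with add().
empty_store = StringStore()
```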
StringStore.__len__ method
Get the number of strings in the store.
Name | Description |
---|---|
RETURNS | The number of strings in the store. int |
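For illustration, and assuming spaCy is installed, the length of a store created from two distinct strings can be checked like this:

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# len() counts the strings that were added to the store.
n_strings = len(stringstore)
```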
StringStore.__getitem__ method
Retrieve a string from a given hash, or vice versa.
Name | Description |
---|---|
string_or_id | The string to encode, or the hash to decode. Union[bytes, str, int] |
RETURNS | The hash if a string was passed, or the string if a hash was passed. Union[str, int] |
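The lookup works in both directions. The following sketch, assuming spaCy is installed and using illustrative variable names, encodes a string to its hash and then decodes the hash again:

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# str -> int: look up the hash of a string.
apple_hash = stringstore["apple"]

# int -> str: look up the string behind a hash held by this store.
apple_text = stringstore[apple_hash]
```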
StringStore.__contains__ method
Check whether a string is in the store.
Name | Description |
---|---|
string | The string to check. str |
RETURNS | Whether the store contains the string. bool |
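A membership test might look as follows, assuming spaCy is installed; "cherry" is simply a string that was never added.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# The `in` operator calls StringStore.__contains__.
has_apple = "apple" in stringstore
has_cherry = "cherry" in stringstore
```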
StringStore.__iter__ method
Iterate over the strings in the store, in order. Note that a newly initialized
store will always include an empty string "" at position 0.
Name | Description |
---|---|
YIELDS | A string in the store. str |
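As a sketch, assuming spaCy is installed, the stored strings can be collected into a list by iterating over the store. The example makes no assumption about the exact position of each entry.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# Iterating yields the stored strings themselves, not their hashes.
all_strings = list(stringstore)
```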
StringStore.add method
Add a string to the StringStore.
Name | Description |
---|---|
string | The string to add. str |
RETURNS | The string’s hash value. int |
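The sketch below, assuming spaCy is installed, adds a new string and uses the returned hash to look it up again. Adding the same string a second time is shown returning the same hash.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# add() returns the hash of the string that was added.
banana_hash = stringstore.add("banana")

# The returned hash resolves back to the string.
banana_text = stringstore[banana_hash]

# Adding a string that is already present yields the same hash.
banana_hash_again = stringstore.add("banana")
```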
StringStore.to_disk method
Save the current state to a directory.
Name | Description |
---|---|
path | A path to a directory, which will be created if it doesn’t exist. Paths may be either strings or Path-like objects. Union[str, Path] |
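A minimal sketch of saving a store, assuming spaCy is installed. The temporary directory and the target name `strings` are arbitrary choices made for this example so that it does not touch real data.

```python
import tempfile
from pathlib import Path

from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# Hypothetical target location inside a throwaway temporary directory.
tmp_dir = tempfile.mkdtemp()
path = Path(tmp_dir) / "strings"

stringstore.to_disk(path)

# Something now exists at the target path.
saved = path.exists()
```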
StringStore.from_disk method
Load state from a directory. Modifies the object in place and returns it.
Name | Description |
---|---|
path | A path to a directory. Paths may be either strings or Path-like objects. Union[str, Path] |
RETURNS | The modified StringStore object. StringStore |
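To illustrate the round trip, the following sketch (assuming spaCy is installed) first saves a store and then loads it into a fresh one. The temporary path is an assumption made for the example.

```python
import tempfile
from pathlib import Path

from spacy.strings import StringStore

# Save a store first so there is something to load.
source = StringStore(["apple", "orange"])
path = Path(tempfile.mkdtemp()) / "strings"
source.to_disk(path)

# from_disk() fills the store in place and returns that same object.
target = StringStore()
loaded = target.from_disk(path)
```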
StringStore.to_bytes method
Serialize the current state to a binary string.
Name | Description |
---|---|
RETURNS | The serialized form of the StringStore object. bytes |
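Serializing a store could be sketched like this, assuming spaCy is installed; `store_bytes` is an illustrative variable name.

```python
from spacy.strings import StringStore

stringstore = StringStore(["apple", "orange"])

# Serialized form of the store, suitable for passing to from_bytes().
store_bytes = stringstore.to_bytes()
```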
StringStore.from_bytes method
Load state from a binary string.
Name | Description |
---|---|
bytes_data | The data to load from. bytes |
RETURNS | The StringStore object. StringStore |
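A round trip through the serialized form might look as follows, assuming spaCy is installed. The names `source` and `restored` are used only for this sketch.

```python
from spacy.strings import StringStore

source = StringStore(["apple", "orange"])
store_bytes = source.to_bytes()

# Load the serialized data into a new, initially empty store.
restored = StringStore().from_bytes(store_bytes)
```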
Utilities
strings.hash_string function
Get a 64-bit hash for a given string.
Name | Description |
---|---|
string | The string to hash. str |
RETURNS | The hash. int |
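As a sketch, assuming spaCy is installed, the standalone function can be compared with the hash a store reports for the same string:

```python
from spacy.strings import StringStore, hash_string

# Hash a string without creating a store.
apple_hash = hash_string("apple")

# A store reports the same hash for the same string.
stringstore = StringStore(["apple"])
matches_store = stringstore["apple"] == apple_hash
```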