StringStore

class

Look up strings by 64-bit hashes. As of v2.0, spaCy uses hash values instead of integer IDs. This ensures that strings always map to the same ID, even from different StringStores.
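
As a minimal sketch of what this means in practice, assuming spaCy is installed and `StringStore` can be imported from `spacy.strings`, two unrelated stores give the same hash for the same string. The sample strings are arbitrary.

```python
from spacy.strings import StringStore

# Two independent stores, filled with different strings in a different order.
store_a = StringStore(["apple", "orange"])
store_b = StringStore(["banana", "apple"])

# The hash depends only on the string, not on the store that holds it.
hash_a = store_a["apple"]
hash_b = store_b["apple"]
print(hash_a == hash_b)
```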

StringStore.__init__ method

Create the StringStore.

Name | Description
strings | A sequence of strings to add to the store. Type: Optional[Iterable[str]]
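
A short usage sketch, assuming spaCy is installed. The variable names and sample strings are illustrative only.

```python
from spacy.strings import StringStore

# No initial strings: the argument is optional.
empty_store = StringStore()

# Any iterable of str should work, for example a list or a generator.
fruit_store = StringStore(["apple", "orange"])
also_fruit = StringStore(s for s in ("apple", "orange"))
```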

StringStore.__len__ method

Get the number of strings in the store.
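
The following sketch, assuming spaCy is installed, shows how the length changes as strings are added. It compares sizes before and after rather than relying on a specific count.

```python
from spacy.strings import StringStore

store = StringStore(["apple", "orange"])
size_before = len(store)

# Adding a new string grows the store by one entry.
store.add("banana")
size_after_new = len(store)

# Adding the same string again leaves the size unchanged.
store.add("banana")
size_after_dup = len(store)
```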

StringStore.__getitem__ method

Retrieve a string from a given hash, or vice versa.

Name | Description
string_or_id | The value to encode. Type: Union[bytes, str, int]
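
A minimal sketch of the lookup in both directions, assuming spaCy is installed. The sample strings are arbitrary.

```python
from spacy.strings import StringStore

store = StringStore(["apple", "orange"])

# Passing a string returns its integer hash.
apple_hash = store["apple"]

# Passing that hash back returns the original string.
apple_text = store[apple_hash]
```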

StringStore.__contains__ method

Check whether a string is in the store.

Name | Description
string | The string to check. Type: str
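
A short sketch of the membership check, assuming spaCy is installed. The sample strings are arbitrary.

```python
from spacy.strings import StringStore

store = StringStore(["apple", "orange"])

# The `in` operator calls StringStore.__contains__.
has_apple = "apple" in store
has_cherry = "cherry" in store
```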

StringStore.__iter__ method

Iterate over the strings in the store, in order. Note that a newly initialized store will always include an empty string "" at position 0.
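
A minimal iteration sketch, assuming spaCy is installed. It only checks the relative order of the two strings that were added and makes no assumption about other entries.

```python
from spacy.strings import StringStore

store = StringStore(["apple", "orange"])

# Collect every string the store yields.
all_strings = list(store)

# Strings added earlier appear before strings added later.
apple_first = all_strings.index("apple") < all_strings.index("orange")
```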

StringStore.add method

Add a string to the StringStore.

Name | Description
string | The string to add. Type: str
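
A usage sketch, assuming spaCy is installed. It relies on `add` returning the hash of the added string. The sample strings are arbitrary.

```python
from spacy.strings import StringStore

store = StringStore(["apple", "orange"])

# add() stores the string and returns its hash.
banana_hash = store.add("banana")

# The returned hash can be used to look the string up again.
banana_text = store[banana_hash]
```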

StringStore.to_disk method

Save the current state to a directory.

Name | Description
path | A path to a directory, which will be created if it doesn’t exist. Paths may be either strings or Path-like objects. Type: Union[str, Path]

StringStore.from_disk method

Load state from a directory. Modifies the object in place and returns it.

Name | Description
path | A path to a directory. Paths may be either strings or Path-like objects. Type: Union[str, Path]
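
A round trip through `to_disk` and `from_disk` might look like the sketch below, assuming spaCy is installed. The path name `strings` is hypothetical, and a temporary directory is used so that nothing is left behind.

```python
import tempfile
from pathlib import Path

from spacy.strings import StringStore

store = StringStore(["apple", "orange"])

with tempfile.TemporaryDirectory() as tmp_dir:
    # "strings" is a hypothetical name; any fresh path under tmp_dir would do.
    path = Path(tmp_dir) / "strings"
    store.to_disk(path)

    # from_disk() fills `restored` in place and also returns it.
    restored = StringStore()
    returned = restored.from_disk(path)
```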

StringStore.to_bytes method

Serialize the current state to a binary string.

StringStore.from_bytes method

Load state from a binary string.

Name | Description
bytes_data | The data to load from. Type: bytes
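
An in-memory round trip through `to_bytes` and `from_bytes` can be sketched as follows, assuming spaCy is installed. The sample strings are arbitrary.

```python
from spacy.strings import StringStore

store = StringStore(["apple", "orange"])

# Serialize the store, then load the data into a fresh store.
store_bytes = store.to_bytes()
restored = StringStore().from_bytes(store_bytes)
```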

Utilities

strings.hash_string function

Get a 64-bit hash for a given string.

Name | Description
string | The string to hash. Type: str
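
A minimal sketch, assuming spaCy is installed and `hash_string` can be imported from `spacy.strings`. It checks that the standalone function agrees with the hash a `StringStore` reports for the same string. No specific hash value is assumed.

```python
from spacy.strings import StringStore, hash_string

# Hash a string without creating a store.
apple_hash = hash_string("apple")

# A store reports the same hash for the same string.
store = StringStore(["apple"])
same_as_store = store["apple"] == apple_hash

# The hash is an integer that fits in 64 bits.
fits_64_bits = 0 <= apple_hash < 2**64
```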