What's New in v3.7
spaCy v3.7 adds support for Python 3.12, introduces the new standalone library Weasel for project workflows, and updates the transformer-based trained pipelines to use our new Curated Transformers library.
This release drops support for Python 3.6.
Weasel
The spaCy projects functionality has been moved into a new standalone library Weasel. This brings minor changes to spaCy-specific settings in spaCy projects (see upgrading below), but also makes it possible to use the same workflow functionality outside of spaCy.
All spacy project commands should run as before; they now use Weasel
under the hood.
Registered vectors
You can specify a custom registered vectors class under [nlp.vectors] in order
to use static vectors in formats other than the ones supported by
Vectors. To implement your custom vectors, extend the abstract
class BaseVectors. See an example using
BPEmb subword embeddings.
Additional features and improvements
- Add support for Python 3.12.
- Extend to Thinc v8.2.
- Extend transformers extra to spacy-transformers v1.3.
- Add --spans-key option for CLI evaluation with spacy benchmark accuracy.
- Load the CLI module lazily for spacy.info.
- Add type stubs for spacy.training.example.
- Warn for unsupported pattern keys in dependency matcher.
- Language.replace_listeners: Pass the replaced listener and the tok2vec pipe to the callback in order to support spacy-curated-transformers.
- Always use tqdm with disable=None in order to disable output in non-interactive environments.
- Language updates:
  - Add left and right pointing angle brackets as punctuation to ancient Greek.
  - Update example sentences for Turkish.
- Package setup updates:
  - Update NumPy build constraints for NumPy 1.25+. For Python 3.9+, it is no longer necessary to set build constraints while building binary wheels.
  - Refactor Cython profiling in order to disable profiling for Python 3.12 in the package setup, since Cython does not currently support profiling for Python 3.12.
Trained pipelines
Pipeline updates
The transformer-based trf pipelines have been updated to use our new
Curated Transformers
library, using the Thinc model wrappers and pipeline component from
spaCy Curated Transformers.
Notes about upgrading from v3.6
This release drops support for Python 3.6, drops mypy checks for Python 3.7 and
removes the ray extra. In addition, there are several minor changes for spaCy
projects described in the following section.
Backwards incompatibilities for spaCy Projects
spacy project has a few backwards incompatibilities due to the transition to
the standalone library Weasel, which is
not as tightly coupled to spaCy. Weasel produces warnings when it detects older
spaCy-specific settings in your environment or project config.
- Support for the spacy_version configuration key has been dropped.
- Support for the check_requirements configuration key has been dropped due to the deprecation of pkg_resources.
- The SPACY_CONFIG_OVERRIDES environment variable is no longer checked. You can set configuration overrides using WEASEL_CONFIG_OVERRIDES.
- Support for the SPACY_PROJECT_USE_GIT_VERSION environment variable has been dropped.
- Error codes are now Weasel-specific and do not follow spaCy error codes.
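If a script or CI job still sets the old override variable, one way to bridge the change is to copy its value to the new name before running project commands. The following is a minimal sketch and not part of spaCy or Weasel: the helper name migrate_overrides is hypothetical, and it assumes the override string can be reused unchanged under the new variable.

```python
from typing import MutableMapping

OLD_VAR = "SPACY_CONFIG_OVERRIDES"   # no longer checked after the move to Weasel
NEW_VAR = "WEASEL_CONFIG_OVERRIDES"  # the variable to set instead


def migrate_overrides(env: MutableMapping[str, str]) -> bool:
    """Copy OLD_VAR to NEW_VAR when only the old variable is set.

    Hypothetical helper for illustration. It assumes the override string
    is valid as-is under the new name. Returns True if a value was copied.
    """
    if OLD_VAR in env and NEW_VAR not in env:
        env[NEW_VAR] = env[OLD_VAR]
        return True
    return False
```

Calling migrate_overrides(os.environ) at the start of a wrapper script would apply this to the current process environment. An existing WEASEL_CONFIG_OVERRIDES value is left untouched, so an explicit new-style setting always wins.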
Pipeline package version compatibility
When you’re loading a pipeline package trained with an earlier version of spaCy v3, you will see a warning telling you that the pipeline may be incompatible. This doesn’t necessarily have to be true, but we recommend running your pipelines against your test suite or evaluation data to make sure there are no unexpected results.
If you’re using one of the trained pipelines we provide, you should
run spacy download to update to the latest version. To
see an overview of all installed packages and their compatibility, you can run
spacy validate.
If you’ve trained your own custom pipeline and you’ve confirmed that it’s still
working as expected, you can update the spaCy version requirements in the
meta.json so that they include v3.7.
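The version requirement is stored under the spacy_version key of the pipeline's meta.json. As a rough sketch of the edit, assuming you prefer to script it rather than change the file by hand, the hypothetical helper below rewrites that key and leaves the rest of the file's content unchanged. The function name and the version range in the usage note are illustrative, not values prescribed by spaCy.

```python
import json
from pathlib import Path


def set_spacy_version(meta_path: Path, version_spec: str) -> dict:
    """Rewrite the "spacy_version" entry of a pipeline's meta.json.

    Hypothetical helper for illustration. version_spec should be the
    range you have actually verified the pipeline against.
    """
    meta = json.loads(meta_path.read_text(encoding="utf8"))
    meta["spacy_version"] = version_spec
    # Write the file back with all other keys preserved.
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf8")
    return meta
```

For example, set_spacy_version(Path("my_pipeline/meta.json"), ">=3.6.0,<3.8.0") would widen a v3.6-only requirement to also accept v3.7. Both the path and the range here are assumptions chosen for the example.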
Updating v3.6 configs
To update a config from spaCy v3.6 with the new v3.7 settings, run
init fill-config.
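The command takes the existing config as its first argument and the path for the filled-in config as its second. A minimal sketch of invoking it from Python follows. The two helper functions are hypothetical wrappers, and the file names in the usage note are placeholders rather than names spaCy requires.

```python
import subprocess
import sys
from typing import List


def fill_config_command(base_cfg: str, output_cfg: str) -> List[str]:
    """Build the argument list for `python -m spacy init fill-config`."""
    return [sys.executable, "-m", "spacy", "init", "fill-config", base_cfg, output_cfg]


def run_fill_config(base_cfg: str, output_cfg: str) -> None:
    """Run the command with the current interpreter.

    Raises subprocess.CalledProcessError if spaCy reports a failure.
    """
    subprocess.run(fill_config_command(base_cfg, output_cfg), check=True)
```

Calling run_fill_config("config-v3.6.cfg", "config-v3.7.cfg") is equivalent to running python -m spacy init fill-config config-v3.6.cfg config-v3.7.cfg in a shell. Writing to a new output file keeps the original v3.6 config available for comparison.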
In many cases (spacy train,
spacy.load), the new defaults will be filled in
automatically, but you’ll need to fill in the new settings to run
debug config and debug data.