gobbli is a Python library that wraps several modern deep learning models in a uniform interface, making it easy to evaluate feasibility and conduct analyses. It uses Docker to hide nearly all dependency management and functional differences between models from the user. It also includes an interactive app for exploring text data and evaluating classification models. spaCy's base text classification models, as well as models integrated from spacy-transformers, are available in its collection of classification models. In addition, spaCy is used for data augmentation and document embeddings.
from gobbli.io import PredictInput, TrainInput
from gobbli.model.bert import BERT

train_input = TrainInput(
    X_train=['This is a training document.',
             'This is another training document.'],
    y_train=['0', '1'],
    X_valid=['This is a validation sentence.',
             'This is another validation sentence.'],
    y_valid=['1', '0'],
)

clf = BERT()

# Set up classifier resources -- Docker image, etc.
clf.build()

# Train model
train_output = clf.train(train_input)

predict_input = PredictInput(
    X=['Which class is this document?'],
    labels=train_output.labels,
    checkpoint=train_output.checkpoint,
)

predict_output = clf.predict(predict_input)
Submit your project
If you have a project that you want the spaCy community to make use of, you can suggest it by submitting a pull request to the spaCy website repository. The Universe database is open-source and collected in a simple JSON file. For more details on the formats and available fields, see the documentation. Looking for inspiration for your own spaCy plugin or extension? Check out the project idea label on the issue tracker.
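As a rough illustration of what an entry in that JSON file might look like, here is a minimal Python sketch that builds one and checks it before submission. The field names (`id`, `title`, `description`, `code_example`, `category`), the category value, and the `missing_fields` helper are assumptions made for this example, not a confirmed schema, so check them against the documentation before opening a pull request.

```python
import json

# Hypothetical Universe entry. The field names and the category value are
# assumptions for illustration; the authoritative list is in the documentation.
entry = {
    "id": "gobbli",
    "title": "gobbli",
    "description": (
        "A Python library that wraps several modern deep learning "
        "models in a uniform interface."
    ),
    # Code samples are stored here as a list of lines (assumed layout).
    "code_example": [
        "from gobbli.model.bert import BERT",
        "",
        "clf = BERT()",
        "clf.build()",
    ],
    "category": ["standalone"],
}


def missing_fields(candidate, required=("id", "title", "description")):
    """Return the required keys that are absent or empty in the entry."""
    return [key for key in required if not candidate.get(key)]


# Serialize the entry and parse it back to confirm it is valid JSON.
serialized = json.dumps(entry, indent=4)
roundtrip = json.loads(serialized)

problems = missing_fields(entry)
if problems:
    print("Missing fields:", ", ".join(problems))
else:
    print(serialized)
```

Running a check like this locally catches empty fields and malformed JSON before the pull request is reviewed.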