SpikeX is a collection of pipes ready to be plugged into a spaCy pipeline. It aims to help build knowledge extraction tools with almost zero effort.
from spacy import load as spacy_load
from spikex.wikigraph import load as wg_load
from spikex.pipes import WikiPageX

# load a spaCy model and get a doc
nlp = spacy_load('en_core_web_sm')
doc = nlp('An apple a day keeps the doctor away')

# load a WikiGraph
wg = wg_load('simplewiki_core')

# get a WikiPageX and extract all pages
wikipagex = WikiPageX(wg)
doc = wikipagex(doc)

# see all pages extracted from the doc
for span in doc._.wiki_spans:
    print(span._.wiki_pages)