This library links spaCy with DBpedia Spotlight. You can extract DBpedia entities from your documents using either the public web service or your own instance of DBpedia Spotlight. The pipeline populates doc.ents with the entities and all their details (URI, type, ...).
import spacy
import spacy_dbpedia_spotlight

# load your model as usual
nlp = spacy.load('en_core_web_lg')

# add the pipeline stage
nlp.add_pipe('dbpedia_spotlight')

# get the document
doc = nlp('The president of USA is calling Boris Johnson to decide what to do about coronavirus')

# see the entities
print('Entities', [(ent.text, ent.label_, ent.kb_id_) for ent in doc.ents])

# inspect the raw data from DBpedia Spotlight (stored on each entity span, not on the doc.ents tuple)
print([ent._.dbpedia_raw_result for ent in doc.ents])
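To make the idea of "populating doc.ents" more concrete, here is a minimal sketch of how a Spotlight-style annotation result could be turned into character spans with their URIs. It is not the library's actual implementation: the helper name `resources_to_spans` is hypothetical, and `SAMPLE_RESPONSE` is a hand-written stand-in that assumes the JSON layout of a DBpedia Spotlight annotation (a `Resources` list with `@URI`, `@surfaceForm` and `@offset` fields), not real service output.

```python
# Hand-written sample shaped like a DBpedia Spotlight annotation response.
# ASSUMPTION: field names follow Spotlight's JSON layout; values are illustrative only.
SAMPLE_RESPONSE = {
    "@text": "Boris Johnson talks about coronavirus",
    "Resources": [
        {
            "@URI": "http://dbpedia.org/resource/Boris_Johnson",
            "@surfaceForm": "Boris Johnson",
            "@offset": "0",
        },
        {
            "@URI": "http://dbpedia.org/resource/Coronavirus",
            "@surfaceForm": "coronavirus",
            "@offset": "26",
        },
    ],
}


def resources_to_spans(response):
    """Return (start, end, surface_form, uri) tuples, one per linked mention.

    Hypothetical helper: offsets are character positions in the input text,
    and the end offset is derived from the length of the surface form.
    """
    spans = []
    for res in response.get("Resources", []):
        start = int(res["@offset"])  # offsets arrive as strings in this sample
        end = start + len(res["@surfaceForm"])
        spans.append((start, end, res["@surfaceForm"], res["@URI"]))
    return spans


spans = resources_to_spans(SAMPLE_RESPONSE)
for start, end, text, uri in spans:
    print(start, end, text, uri)
```

In spaCy, character offsets like these can be converted into entity spans with `doc.char_span(start, end, label=..., kb_id=...)`, which is one plausible way for a pipeline stage to fill `ent.kb_id_` with the DBpedia URI. Whether spacy_dbpedia_spotlight does exactly this is an assumption here.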