Phrase matcher using RapidFuzz

Combines the RapidFuzz library with spaCy's PhraseMatcher. The goal of this component is to find matches between a spaCy doc and a list of phrases when there are no exact matches because of typos or abbreviations.


    import spacy
    from spacy.language import Language
    from phruzz_matcher.phrase_matcher import PhruzzMatcher

    famous_people = [
        "Brad Pitt",
        "Demi Moore",
        "Bruce Willis",
        "Jim Carrey",
    ]

    @Language.factory("phrase_matcher")
    def phrase_matcher(nlp: Language, name: str):
        return PhruzzMatcher(nlp, famous_people, "FAMOUS_PEOPLE", 85)

    nlp = spacy.blank('es')
    nlp.add_pipe("phrase_matcher")

    doc = nlp("El otro día fui a un bar donde vi a brad pit y a Demi Moore, estaban tomando unas cervezas mientras charlaban de sus asuntos.")
    print(f"doc.ents: {doc.ents}")

    # Output:
    # doc.ents: (brad pit, Demi Moore)

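To give an intuition for why "brad pit" can still be matched to "Brad Pitt" in the example above, here is a minimal sketch of threshold-based fuzzy matching. It is not the component's actual implementation: it uses Python's standard-library `difflib.SequenceMatcher` as a stand-in for a RapidFuzz scorer, and the helper names `similarity` and `best_match`, the lowercasing step, and the reading of `85` as a 0-100 cutoff are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from typing import List, Optional

famous_people = ["Brad Pitt", "Demi Moore", "Bruce Willis", "Jim Carrey"]


def similarity(a: str, b: str) -> float:
    """Return a 0-100 similarity score, ignoring case (stand-in scorer)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def best_match(candidate: str, phrases: List[str], threshold: float = 85) -> Optional[str]:
    """Return the most similar phrase, or None if nothing reaches the threshold."""
    scored = [(similarity(candidate, phrase), phrase) for phrase in phrases]
    score, phrase = max(scored, key=lambda pair: pair[0])
    return phrase if score >= threshold else None


# A typo still clears the threshold; unrelated text does not.
print(best_match("brad pit", famous_people))
print(best_match("un bar", famous_people))
```

With these assumptions, the misspelled "brad pit" resolves to "Brad Pitt" while an unrelated span such as "un bar" is rejected. Raising the threshold makes matching stricter, and lowering it tolerates more typos at the cost of more false positives.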
Author info

Martin Vallone


Categories: pipeline, research, standalone

