I am doing clinical message normalization (spell checking), in which I check each given word against a 900,000-word medical dictionary. I am mainly concerned about time complexity and performance.
I want to do fuzzy string comparison, but I'm not sure which library to use.
Option 1:
import Levenshtein
Levenshtein.ratio('hello world', 'hello')
Result: 0.625
Option 2:
import difflib
difflib.SequenceMatcher(None, 'hello world', 'hello').ratio()
Result: 0.625
In this example both give the same answer. Do the two also perform alike in terms of speed when each word has to be compared against the whole dictionary?
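To make the comparison concrete, here is a rough sketch of how the two options could be timed on the same lookup. The word list `SAMPLE_WORDS` and the helper names are placeholders I made up for illustration, not the real dictionary, and the `Levenshtein` import is guarded because it is a third-party package that may not be installed. The difflib version relies on the documented behaviour that `SequenceMatcher` caches information about its second sequence, so the query word is set once with `set_seq2()`.

```python
import difflib
import timeit

try:
    import Levenshtein  # third-party package, optional here
except ImportError:
    Levenshtein = None

# Hypothetical stand-in for the 900,000-word medical dictionary.
SAMPLE_WORDS = ["hello", "help", "world", "word", "held"] * 200


def best_match_difflib(query, words):
    """Return (word, score) for the highest SequenceMatcher ratio."""
    matcher = difflib.SequenceMatcher(None)
    matcher.set_seq2(query)  # seq2 is analysed once and cached
    best_word, best_score = None, -1.0
    for word in words:
        matcher.set_seq1(word)
        score = matcher.ratio()
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


def best_match_levenshtein(query, words):
    """Same lookup using Levenshtein.ratio (requires the package)."""
    return max(((w, Levenshtein.ratio(w, query)) for w in words),
               key=lambda pair: pair[1])


def time_lookups(query="helo", repeats=5):
    """Print the wall-clock time of each approach on the sample list."""
    t = timeit.timeit(lambda: best_match_difflib(query, SAMPLE_WORDS),
                      number=repeats)
    print(f"difflib:     {t:.4f} s for {repeats} lookups")
    if Levenshtein is not None:
        t = timeit.timeit(lambda: best_match_levenshtein(query, SAMPLE_WORDS),
                          number=repeats)
        print(f"Levenshtein: {t:.4f} s for {repeats} lookups")


if __name__ == "__main__":
    time_lookups()
```

Either way this is a linear scan over every dictionary entry per query word, so the timings would only show the per-comparison cost of each library, not whether a linear scan is fast enough for 900,000 words.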