I just noticed that the `fuzz.ratio` function from `fuzzywuzzy`, used for the edit sim calculation here, returns different scores depending on whether `python-Levenshtein` is installed. Also, in neither case does it actually calculate the Levenshtein distance.
If `python-Levenshtein` is not installed, it falls back to `difflib.SequenceMatcher.ratio()`, which according to the docs calculates something similar to the Ratcliff/Obershelp algorithm. That is different from edit similarity: python/cpython#69578 (comment)
If `python-Levenshtein` is installed, the ratio function still does not return the Levenshtein distance ratio but the InDel ratio, which allows only insertions and deletions, so a substitution counts as two edits instead of one! seatgeek/thefuzz#53
I am confused: how do I correctly evaluate the code tasks? Without this dependency, something completely different is calculated, even though the paper explicitly states that Levenshtein distance is used for edit sim. However, this repo does not list `python-Levenshtein` as a dependency. And even with this dependency installed, the result is still not based on the actual Levenshtein distance!
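To make the difference concrete, here is a minimal standard-library sketch of the three quantities involved. It is not the code used in this repo or in `fuzzywuzzy`; the helper names (`levenshtein_ratio`, `indel_ratio`, `difflib_ratio`) are made up for illustration, and normalizing the Levenshtein distance by the longer string's length is only one common convention, assumed here because the paper's exact normalization is not clear to me.

```python
from difflib import SequenceMatcher


def levenshtein_distance(a: str, b: str) -> int:
    """Edit distance where insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def levenshtein_ratio(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - levenshtein_distance(a, b) / longest


def indel_ratio(a: str, b: str) -> float:
    """InDel similarity: only insertions and deletions are allowed,
    which works out to 2 * LCS / (len(a) + len(b))."""
    total = len(a) + len(b)
    return 1.0 if total == 0 else 2 * lcs_length(a, b) / total


def difflib_ratio(a: str, b: str) -> float:
    """Ratcliff/Obershelp-like ratio from the standard library."""
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    pairs = [("kitten", "sitting"), ("PQRSabcde", "a-b-c-d-ePQRS")]
    for a, b in pairs:
        print(a, b, levenshtein_ratio(a, b), indel_ratio(a, b), difflib_ratio(a, b))
```

For `"kitten"` vs. `"sitting"`, the Levenshtein-based ratio is 4/7 while the InDel ratio is 8/13, so the two already disagree on a textbook example. The second pair is constructed so that the `difflib` ratio also differs from the InDel ratio, because the greedy longest-block matching picks `PQRS` and misses the longer scattered subsequence.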
This seems to be a widespread inconsistency: for example, the RepoBench-P repository lists `python-Levenshtein` as a requirement, but the LCC repo does not mention it.
Just as an example, here are some results without python-Levenshtein:
@bys0318