As per #378, we removed the part of name collision detection that used a Levenshtein distance. We found this was not a helpful metric, and it only gave false positives.
@punchagan did some research and found focused work to specifically detect typosquatting problems. We think this would be a helpful replacement for the removed Levenshtein distance check.
There's some prior work done on other package archives (like PyPI, npm and Rust's crates) in this [paper](https://arxiv.org/pdf/2003.03471), and the packages based on / related to it: typogard and typomania.
The paper (and the packages) focus primarily on malicious typosquatting, and the package repositories they cover are much larger than opam's. But we could adapt the Typosquatting Signals (Sec 3.3) explored in the paper for our use case. They use a concept of popular (and unpopular) packages for detecting malicious typosquatting, but we probably don't need that, given that we aren't checking strictly for malicious typosquatting, our repository is small, and package additions/updates go through a manual approval process.
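To make the idea concrete, here is a rough sketch of what signal-based checks could look like, as opposed to a raw edit-distance threshold. The signals chosen (delimiter or case changes, repeated characters, swapped adjacent characters, one omitted character) are in the spirit of the ones discussed in the paper, but the exact set, the function names, and the labels are assumptions for illustration only. This is not the API of typogard or typomania, and it is written in Python purely for brevity.

```python
import re


def normalize_delimiters(name: str) -> str:
    """Lowercase the name and drop '-', '_' and '.' delimiters."""
    return re.sub(r"[-_.]", "", name.lower())


def collapse_repeats(name: str) -> str:
    """Collapse runs of the same character, e.g. 'lwtt' -> 'lwt'."""
    return re.sub(r"(.)\1+", r"\1", name)


def is_adjacent_swap(a: str, b: str) -> bool:
    """True if b is a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b):
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


def is_omission(short: str, long: str) -> bool:
    """True if short is long with exactly one character removed."""
    if len(long) != len(short) + 1:
        return False
    return any(long[:i] + long[i + 1:] == short for i in range(len(long)))


def typo_signals(candidate: str, existing: str) -> list[str]:
    """Return the (hypothetical) signal labels that fire for a name pair."""
    if candidate == existing:
        return []
    found = []
    c, e = candidate.lower(), existing.lower()
    if normalize_delimiters(candidate) == normalize_delimiters(existing):
        found.append("delimiter-or-case")
    if c != e and collapse_repeats(c) == collapse_repeats(e):
        found.append("repeated-char")
    if is_adjacent_swap(c, e):
        found.append("swapped-chars")
    if is_omission(c, e) or is_omission(e, c):
        found.append("omitted-char")
    return found
```

A new package name would be compared against each existing name in the repository, and any non-empty result flagged for the maintainer to look at. Unlike a distance threshold, unrelated short names such as `dune` and `lwt` produce no signal at all.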
From #378 (comment)