Vladimir Sedach

Have Emacs - Will Hack

November 22, 2008

Some new Common Lisp algorithm and data structure implementations

Here are some Common Lisp implementations of algorithms and data structures that I have written recently:

I implemented the Jaro-Winkler and Levenshtein string distance algorithms: https://github.com/vsedach/vas-string-metrics.

Levenshtein is a general algorithm based on insertions, deletions, and substitutions, while Jaro-Winkler is better suited to short strings such as names. One area where the latter comes in handy is normalizing manually entered records where, for example, people's names may not be entered consistently. I found that Jaro-Winkler works best if you compute the distances for the last name and the first name separately and add them, giving the last name greater weight.
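To make the two ideas above concrete, here is a minimal sketch. It is not the code from vas-string-metrics: the function names `levenshtein`, `similarity`, and `name-similarity` are made up for illustration, and the 2/3 last-name weight is an arbitrary assumption rather than a value I tuned. The `metric` argument can be any function that scores two strings between 0 and 1, so a Jaro-Winkler implementation could be passed in place of the normalized Levenshtein score used as the default here.

```lisp
;; Illustrative sketch only; names and the default weight are assumptions.

(defun levenshtein (a b)
  "Return the minimum number of single-character insertions, deletions,
and substitutions needed to turn string A into string B."
  (let* ((n (length b))
         ;; PREV holds the distances from the prefix of A processed so far
         ;; to every prefix of B. Initially that prefix of A is empty.
         (prev (make-array (1+ n)))
         (curr (make-array (1+ n))))
    (dotimes (j (1+ n))
      (setf (aref prev j) j))
    (dotimes (i (length a))
      (setf (aref curr 0) (1+ i))
      (dotimes (j n)
        (setf (aref curr (1+ j))
              (min (1+ (aref prev (1+ j)))   ; delete a[i]
                   (1+ (aref curr j))        ; insert b[j]
                   (+ (aref prev j)          ; substitute, or match for free
                      (if (char= (char a i) (char b j)) 0 1)))))
      ;; The row just computed becomes PREV for the next character of A.
      (rotatef prev curr))
    (aref prev n)))

(defun similarity (a b)
  "Scale the edit distance to a score from 0 (nothing in common) to 1 (equal)."
  (let ((longest (max (length a) (length b))))
    (if (zerop longest)
        1
        (- 1 (/ (levenshtein a b) longest)))))

(defun name-similarity (first1 last1 first2 last2
                        &key (metric #'similarity) (last-name-weight 2/3))
  "Score the last names and first names separately with METRIC, then
combine the two scores so that the last name counts for more."
  (+ (* last-name-weight (funcall metric last1 last2))
     (* (- 1 last-name-weight) (funcall metric first1 first2))))
```

With these definitions, `(levenshtein "kitten" "sitting")` returns 3, and a mismatch in the last name costs more than a comparable one in the first name: `(name-similarity "Jon" "Smith" "John" "Smith")` scores higher than `(name-similarity "Jon" "Smith" "Jon" "Smyth")`.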

I also implemented sparse vectors: sparse-vectors.lisp