Saturday, October 18, 2008

Levenshtein distance

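The Levenshtein distance between two strings is the minimum number of single-character insertions, deletions, and substitutions needed to turn one string into the other. The smaller the distance, the more similar the two strings are. Our PHP programmer friends have a built-in levenshtein() function to calculate this distance; we deserve one too.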
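
For reference, here is a sketch of the standard recurrence behind this distance. The notation is introduced here and is not taken from the listing: a and b are the two strings, a_i is the i-th character of a, and d(i, j) is the distance between the first i characters of a and the first j characters of b. The bracket [a_i ≠ b_j] is 1 when the characters differ and 0 when they match.

```latex
\begin{aligned}
d(i, 0) &= i, \qquad d(0, j) = j, \\
d(i, j) &= \min \begin{cases}
  d(i-1, j) + 1 & \text{(delete } a_i\text{)} \\
  d(i, j-1) + 1 & \text{(insert } b_j\text{)} \\
  d(i-1, j-1) + [a_i \neq b_j] & \text{(substitute, free on a match)}
\end{cases}
\end{aligned}
```

The distance between the full strings is d(len(a), len(b)). For example, "kitten" and "sitting" are at distance 3: substitute k with s, substitute e with i, and append g.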


ASP

    function levenshtein(byVal first, byVal second)
        dim lenFirst, lenSecond, i, j, cost, best
        dim previousRow(), currentRow()
        lenFirst = len(first)
        lenSecond = len(second)
        redim previousRow(lenSecond)
        redim currentRow(lenSecond)
        ' Building the first j characters of second from an empty string takes j insertions.
        for j = 0 to lenSecond
            previousRow(j) = j
        next
        for i = 1 to lenFirst
            ' Reducing the first i characters of first to an empty string takes i deletions.
            currentRow(0) = i
            for j = 1 to lenSecond
                ' A substitution is free when the characters already match.
                if Mid(first, i, 1) = Mid(second, j, 1) then cost = 0 else cost = 1
                ' Keep the cheapest of substitution, deletion, and insertion.
                best = previousRow(j - 1) + cost
                if previousRow(j) + 1 < best then best = previousRow(j) + 1
                if currentRow(j - 1) + 1 < best then best = currentRow(j - 1) + 1
                currentRow(j) = best
            next
            ' The finished row becomes the previous row for the next character of first.
            for j = 0 to lenSecond
                previousRow(j) = currentRow(j)
            next
        next
        levenshtein = previousRow(lenSecond)
    end function

View ASP implementation on Snipplr
