Damerau–Levenshtein distance

Now you'd thing what the hell am I rambling about now, but here's the thing. Reading through my daily dose of programming news in Google Reader I stumbled upon this clever algorithm - the Damerau-Levenshtein distance algorithm. It is used to find the number of steps it would take to transform one string into another using the following operations: insertion, deletion, substitution and trasposition. Or put simply - Damerau-Levenshtein distance is a metric to defining the difference between two strings. The main areas of use are spell-checking (typos etc), DNA difference calculation etc. So I read up on the algorith at the Wiki page and decided to implement it just for the implementation's sake. Having fiddled around with the code for some time it dawned on me - I have this OCD when I perform the exact same operation in my head for two words that have a revelance to whichever situation I happen to be in at a given moment. I also provide a corresponding String extension method. Oh and if anyone hasn't picked it up yet, I mainly write code in C# 3.0 as is the case with this post. So without further ado, I present to you the implementation:

public static class DamerauLevenshtein
{
 public static int DamerauLevenshteinDistanceTo(this string @string, string targetString)
 {
  return DamerauLevenshteinDistance(@string, targetString);
 }

 public static int DamerauLevenshteinDistance(string string1, string string2)
 {
  if (String.IsNullOrEmpty(string1))
  {
   if (!String.IsNullOrEmpty(string2))
    return string2.Length;

   return 0;
  }

  if (String.IsNullOrEmpty(string2))
  {
   if (!String.IsNullOrEmpty(string1))
    return string1.Length;

   return 0;
  }

  int length1 = string1.Length;
  int length2 = string2.Length;

  int[,] d = new int[length1 + 1, length2 + 1];

  int cost, del, ins, sub;

  for (int i = 0; i <= d.GetUpperBound(0); i++)
   d[i, 0] = i;

  for (int i = 0; i <= d.GetUpperBound(1); i++)
   d[0, i] = i;

  for (int i = 1; i <= d.GetUpperBound(0); i++)
  {
   for (int j = 1; j <= d.GetUpperBound(1); j++)
   {
    if (string1[i - 1] == string2[j - 1])
     cost = 0;
    else
     cost = 1;

    del = d[i - 1, j] + 1;
    ins = d[i, j - 1] + 1;
    sub = d[i - 1, j - 1] + cost;

    d[i, j] = Math.Min(del, Math.Min(ins, sub));

    if (i > 1 && j > 1 && string1[i - 1] == string2[j - 2] && string1[i - 2] == string2[j - 1])
     d[i, j] = Math.Min(d[i, j], d[i - 2, j - 2] + cost);
   }
  }

  return d[d.GetUpperBound(0), d.GetUpperBound(1)];
 }
}

What do you guys think, should the presentation of interesting algorithm finds in my blog become a recurring subject? I think it wouldn't hurt.

Peace out,
Mihkel

The weirdest clouds I've ever seen

Today we had the weirdest type of clouds in the skies above Tallinn. I've never seen anything even remotely resembling something like this. The whole circus lasted for about an hour and was definitely a spectacular display of atmospheric art. If anyone out there knows the type of these clouds please leave a note in the comments section. Here are some quick snaps i took ...


And a panorama
Keeping eyes open towards the sky,
Mihkel

Cloudy Saab photos

On the same awesomely-cloudy day i decided to take some photos of my beloved Saab. Here are the results ...

Saab photographs darn well, doesn't it? :)
Mihkel

Debugging made easy

I know that debugging and how it's done is a very personal thing. Still - I'd like to share some of my experiences and tips I've gathered during my years of programming. Here goes...

The situation: I know a specific line that I'd like to debug to in a file. The solution before: navigate to the line, add a breakpoint, press F5. It's pretty simple in essence but still a bit cumbersome. And what if this breakpoint is a one-timer - you'd have to remove it afterwards. Furthermore, if you forget to remove it, the next time the debugger breaks at the line it ever so rudly interupts your already mentally challenging debugging session. It just so happens to be there's a solution for all this - a command called Run to cursor. I've set it under the F8 key so it's close to all the rest of the debugging keys. What it does is runs the code and breaks the debugger at the line where the caret is positioned - no need for temporary breakpoints etc. Cool, huh? :)

Next on the list - the Debugger.Break() method in the System.Diagnostics namespace. This method signals a breakpoint to an attached debugger. One might wonder - what's the point - I could have just as easily defined a breakpoint myself, why the fuss about defining it in code. A very valid point. Now imagine a case when there is no debugger yet attached or there is no simple way to attach one. If this is the case then calling this method asks the user to select a debugger. This method comes in handy when debugging e.g. windows services and processes than cannot be started out of Visual Studio. Awesome - I know.

Last but not least - DebuggerStepThroughAttribute in the System.Diagnostics namespace. This attribute can be used to decorate a class, struct, method and constructor and tells the debugger to ignore the decorated piece of code and not step into it, although a breakpoint can still be set an be breaked to. The step through attribute can also be defined on a property but has to be declared on the setter and getter individually. Common uses would be e.g. .net base class library extension methods, simple field accessor properties on domain objects etc. Here's a nuisance this attribute helps to overcome: some object properties are passed into a method as parameters and you want the debugger to step into the method withouth having to initially step through all the properties. Once the getters on the properties have the step through attribute on them the debugger can step right into the desired method ignoring the properties.

I sincerely hope any of this is helpful to some of the fellow programmers out there and once again - it wouldn't hurt to hear you, the reade's, handy tips about code debugging in Visual Studio in the comments section.


Debug away...
Mihkel