Re: String similarity

To: REALbasic NUG <realbasic-nug at lists dot realsoftware dot com>
Subject: Re: String similarity
From: Norman Palardy <palardyn at shaw dot ca>
Date: Sat, 30 Oct 2004 11:09:19 -0600
Delivered-to: realbasic-nug at lists dot realsoftware dot com
References: <69B44A0F-29ED-11D9-95A2-000A27B1C8AE at elfdata dot com> <20041030152642 dot 6083F76C6A at isis dot visi dot com>

On Oct 30, 2004, at 9:26 AM, Craig A. Finseth wrote:

   I have a distaste for academia, because:

1) If you look at the pioneers of modern computing, few came from
   academia.

You mean Turing, Shannon, and von Neumann: the ones that invented
computing?

Or guys like Dijkstra? Knuth?

[snipped]

3) They impose artificial barriers to learning. If you haven't paid up
   for a degree course and aren't willing to spend years of your life
   "learning" stuff that you already know, or will never need to know,
   they won't teach you the things you want to know, and they won't spread
   the word of what you've discovered.

First, if you already know this stuff, you wouldn't find things like
O(N) notations difficult and you wouldn't have listed point (2).

Second, if you don't need to know it, why are you finding it
relevant to your problem?

Third, because you didn't discover it: at best, you re-discovered it.
It was probably discovered 50 years ago.  If you ask them politely,
they can probably tell you who did discover it (the first time).

Agreed.
Big O is important to understand.
If you don't, you'll have to determine experimentally what the fastest way to solve a problem is.
That is trial and error.
By understanding Big O and how to analyse an algorithm, you can rule out a whole class of slow solutions right away.
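
As a rough illustration of that point, here is a minimal Python sketch (not from the original thread; the function names are made up for the example). It counts comparisons for a linear scan, which is O(N), and for a binary search on sorted data, which is O(log N). The analysis predicts the gap before any timing is done.

```python
def linear_search_steps(items, target):
    """Scan left to right and return the number of comparisons made: O(N)."""
    steps = 0
    for item in items:
        steps += 1
        if item == target:
            break
    return steps


def binary_search_steps(items, target):
    """Halve the sorted search range on every step: O(log N)."""
    steps = 0
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if items[mid] == target:
            break
        if items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


data = list(range(1_000_000))  # already sorted
worst = data[-1]               # last element: worst case for the linear scan

print(linear_search_steps(data, worst))  # 1000000
print(binary_search_steps(data, worst))  # at most 20 for a million items
```

Doubling the input doubles the work for the linear scan but adds only one step to the binary search, which is the kind of conclusion Big O analysis gives you without running anything.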

Anyhow, most people seem to do generalised (not application specific) string similarity with Levenshtein. Things like Soundex are application specific. The idea of 3 basic operations (insert/delete/replace) is a logically wholesome one, so I have no issue there. (In fact it relates to the 3 archetypal forces of creation, destruction and movement.)

The "Levenshtein" algorithm is probably named after someone -- most
likely one of those academicians that you have a distaste for -- who
invented it.  It probably took many years of study for him to be able
to do that invention.

It is indeed.
Levenshtein distance is named after the Russian scientist Vladimir Levenshtein, who introduced the measure in 1965.
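
For readers who have not seen it, the distance is the minimum number of single-character inserts, deletes and replaces needed to turn one string into the other. The sketch below is a minimal Python version of the usual dynamic-programming formulation, keeping only two rows of the table. It is an illustration written for this note, not code from the thread, and the function name `levenshtein` is just a label chosen here. It runs in O(m*n) time for strings of length m and n.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the edit distance between a and b (insert/delete/replace, cost 1 each)."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a

    # previous[j] = distance between the first i-1 chars of a and the first j chars of b.
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning i chars of a into an empty string takes i deletes
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # replace ca with cb (free if they match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

The three branches inside `min` correspond directly to the three basic operations mentioned earlier in the thread.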
