On Oct 30, 2004, at 2:05 PM, Theodore H.Smith wrote:
Does anyone know much about the field of string similarity?
I have a distaste for the academia, because:
1) If you look at the pioneers of modern computing, few came from the
academia.
2) Most of what they write is over-complexified, yet impractical.
"O(N)" notations and really bad source code writing skills.
The "O(N)" notation is a simple, expressive way to describe the
asymptotic growth rate of a function. I'm happy to explain it, if you
like.
No need to explain it, not to me as its simple enough to grasp. To me
its like bad C. C is a simple language, correct? But C can be used to
write awful things that no human should understand (obfuscated C). To
me, O(N) is just like that. Simple, but not nice looking. A step
towards obfuscation.
I don't think that the concept O(f(N)) represents is entirely simple,
but I've probably spent more time thinking about it.
O(N) notation does sense and is useful, its just unnecessarily
mathematical... Why not just express it like "linear cost", or
"quadratic cost". I suppose when you have functions whose cost really
IS an equation, like n*log(n) then it makes sense, but I'd really only
use that as a last resort, if there wasn't a better way of expressing
the costs, without resorting to obfuscation.
Perhaps you could suggest a less obfuscatory way to say O(N^(5/2)), or
O(N*log(log(N))^2).
Also, it doesn't mention the length of time that O takes. O(N) for
example, may be slower than O(N^2) for low values of N, if O in the
first case is high, but O in the second case is low.
That's right. This is part of the point of the concept, to allow one
to ignore behavior for small N. For situations in which one needs more
precise descriptions of function growth, big-O notation (and, more
importantly, the concept it represents) does not suffice.
Of course when N gets high the speed of the operation becomes
irrelevant, but for certain ranges the speed of the operation does
make a difference. For example, what if in the first case, O is 1000x
slower? And what if the values of N, only can range from 10-30? In
this case, the second function is better.
See above.
Things like that, make me wonder why use such notations in the first
place. The real world can't get summarised into mathematics so neatly,
much of the time. I'd only use such notation as a last resort, not in
preference to more natural notation.
Well, the notation has been around for a long time; it predates
computer science. This suggests that a lot of other people have found
it useful.
Charles Yeomans
_______________________________________________
Unsubscribe or switch delivery mode:
<http://www.realsoftware.com/support/listmanager/>
Search the archives of this list here:
<http://www.realsoftware.com/listarchives/lists.html>
|