From my last posting on StrComp you can see that I need to compare two strings
that can potentially contain non-ASCII characters.
The simplest example would be names. Take the contrived example of the
names Éastwôôd and Wu. Lexicographically Éastwôôd should come before Wu.
Given the following variable definitions:
dim string1 as string = "Éastwôôd"
dim string2 as string = "Wu"
dim string3 as string = "Eastwood"
dim string1Sys as string = ConvertEncoding( string1, Encodings.SystemDefault )
dim string2Sys as string = ConvertEncoding( string2, Encodings.SystemDefault )
dim result as integer
// should return -1 but returns 1
result = StrComp( string1, string2, 1 )
// returns -1 as expected
result = StrComp( string3, string2, 1 )
// returns -1
result = StrComp( string1Sys, string2Sys, 1 )
Based on this I believe that StrComp is broken if the strings are UTF-8 and
contain any non-ASCII characters.
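
To illustrate the kind of ordering I am seeing, here is a minimal sketch in Python (not REALbasic). I have not confirmed how StrComp works internally, so this is only an assumption about the symptom: it shows what a plain byte-wise comparison of UTF-8 data would return. The helper name byte_compare is made up for this example.

```python
def byte_compare(a: str, b: str, encoding: str = "utf-8") -> int:
    """Compare two strings by their encoded bytes; return -1, 0, or 1."""
    ea, eb = a.encode(encoding), b.encode(encoding)
    return (ea > eb) - (ea < eb)


# "Éastwôôd" written with escapes so the precomposed characters are unambiguous.
accented = "\u00c9astw\u00f4\u00f4d"

# In UTF-8, "É" starts with byte 0xC3, which is greater than "W" (0x57),
# so a byte-wise comparison puts the accented name after "Wu".
print(byte_compare(accented, "Wu"))    # 1
print(byte_compare("Eastwood", "Wu"))  # -1
```

This matches the results above for string1 and string3, but it does not prove that StrComp actually compares bytes.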
Has no one else seen this? It happens in RB 5.5.5 and RB 2005. How do
people do string comparisons? Is this just a bug that I should be logging?
Converting them to the system default encoding (MacRoman on Mac) is possible, but
it would be a large performance hit and a potentially lossy conversion.
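
The lossiness is easy to show outside REALbasic. The following Python sketch (the function name round_trips and the sample name "Dvořák" are my own, chosen for illustration) checks whether a string survives a trip through MacRoman:

```python
def round_trips(text: str, encoding: str = "mac_roman") -> bool:
    """Return True if text survives conversion to the encoding and back."""
    # Characters missing from the target encoding are replaced with "?".
    converted = text.encode(encoding, errors="replace")
    return converted.decode(encoding) == text


# "Éastwôôd" is representable in MacRoman, so nothing is lost.
print(round_trips("\u00c9astw\u00f4\u00f4d"))  # True

# "Dvořák": the "ř" (U+0159) has no MacRoman code point, so it is lost.
print(round_trips("Dvo\u0159\u00e1k"))         # False
```

Any name outside the system encoding's repertoire would be damaged before the comparison even runs.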
At this point I'm wondering about writing a plug-in that would use
CFStrings on Mac so I could call CFStringCompare with kCFCompareLocalized.
It would be expensive to create the CFStrings but it wouldn't be lossy. I
would have to research the functions to use on Windows.
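
As a rough picture of the behaviour I am after, here is a Python sketch that compares strings on an accent-folded, case-folded key. This is only a crude stand-in and not what CFStringCompare does: real localized collation treats accents as secondary differences rather than ignoring them. The names fold_key and compare_folded are hypothetical.

```python
import unicodedata


def fold_key(text: str) -> str:
    """Build a comparison key with accents and case differences removed."""
    # NFD splits "É" into "E" plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def compare_folded(a: str, b: str) -> int:
    """Compare two strings by their folded keys; return -1, 0, or 1."""
    ka, kb = fold_key(a), fold_key(b)
    return (ka > kb) - (ka < kb)


accented = "\u00c9astw\u00f4\u00f4d"          # "Éastwôôd"
print(compare_folded(accented, "Wu"))        # -1
print(compare_folded(accented, "Eastwood"))  # 0
```

With this key the accented name sorts before Wu, which is the order I expect, and no conversion to a smaller character set is involved.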
Chris