realbasic-nug

How to compare non-ascii strings for sorting?

To: REALbasic Network Users Group <realbasic-nug at lists dot realsoftware dot com>
Subject: How to compare non-ascii strings for sorting?
From: Chris Little <cslittle at mac dot com>
Date: Thu, 30 Jun 2005 10:02:58 -0400
Delivered-to: realbasic-nug at lists dot realsoftware dot com
From my last posting on StrComp, you can see that I need to compare two
strings that can potentially contain non-ASCII characters.

The simplest example would be names.  Take the contrived example of the
names Éastwôôd and Wu.  Lexicographically, Éastwôôd should come before Wu.

Given the following variable definitions:

  dim string1 as string = "Éastwôôd"
  dim string2 as string = "Wu"
  dim string3 as string = "Eastwood"
  dim string1Sys as string = ConvertEncoding( string1, Encodings.SystemDefault )
  dim string2Sys as string = ConvertEncoding( string2, Encodings.SystemDefault )
  dim result as integer

  // should return -1 but returns 1
  result = StrComp( string1, string2, 1 )

  // returns -1 as expected
  result = StrComp( string3, string2, 1 )

  // returns -1
  result = StrComp( string1Sys, string2Sys, 1 )

Based on this, I believe that StrComp is broken if the strings are UTF-8 and
contain any non-ASCII characters.
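
One possible explanation, and it is only an assumption rather than confirmed
StrComp behaviour, is that the comparison falls back to raw UTF-8 bytes, where
É (0xC3 0x89) sorts after W (0x57). The Python sketch below reproduces that
ordering and contrasts it with a hypothetical fold_key helper that strips
accents before comparing. Python is used purely for illustration; compare and
fold_key are made-up names, not REALbasic functions.

```python
import unicodedata


def compare(a, b):
    """Return -1, 0, or 1, mirroring the StrComp result convention."""
    return (a > b) - (a < b)


def fold_key(s):
    """Hypothetical helper: decompose, drop combining marks, case-fold."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Escapes keep the literals precomposed regardless of how this file is saved.
string1 = "\u00c9astw\u00f4\u00f4d"  # Éastwôôd
string2 = "Wu"

# Byte-wise comparison of the UTF-8 forms: the lead byte 0xC3 outranks "W".
raw_result = compare(string1.encode("utf-8"), string2.encode("utf-8"))

# Comparison after accent folding: "eastwood" against "wu".
folded_result = compare(fold_key(string1), fold_key(string2))

print(raw_result, folded_result)
```

Accent folding is a crude stand-in for real locale-aware collation: it treats
É and E as identical, which is enough for this example but not a full answer.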

Has no one else seen this?  It happens in RB 5.5.5 and RB 2005.  How do
people do string comparisons?  Is this just a bug that I should be logging?

Converting them to the system default encoding (MacRoman on Mac) is possible,
but it would be a large performance hit and a potentially lossy conversion.
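
To make the lossy part concrete, here is a small Python sketch (again only an
illustration, with round_trips as a hypothetical helper name). It checks
whether a string survives a trip through MacRoman, using Python's mac_roman
codec as a stand-in for the system default encoding.

```python
def round_trips(s, encoding="mac_roman"):
    """Return True if s survives encode/decode in the given legacy encoding."""
    # errors="replace" turns every unmappable character into "?".
    encoded = s.encode(encoding, errors="replace")
    return encoded.decode(encoding) == s


# Accented Latin letters such as É and ô have MacRoman code points.
print(round_trips("\u00c9astw\u00f4\u00f4d"))

# A CJK character (U+5434) has none, so it is replaced and the text is lost.
print(round_trips("\u5434"))
```

So the names in my example would survive, but anything outside the MacRoman
repertoire would compare as "?".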

At this point I'm wondering about writing a plug-in that would use
CFStrings on Mac so I could call CFStringCompare with kCFCompareLocalized.
It would be expensive to create the CFStrings, but it wouldn't be lossy.  I
would have to research the equivalent functions to use on Windows.

Chris


