realbasic-nug

Re: How to compare non-ascii strings for sorting?

To: REALbasic NUG <realbasic-nug at lists dot realsoftware dot com>
Subject: Re: How to compare non-ascii strings for sorting?
From: Charles Yeomans <yeomans at desuetude dot com>
Date: Thu, 30 Jun 2005 11:21:42 -0400
Delivered-to: realbasic-nug at lists dot realsoftware dot com
References: <BEE973D2 dot 3068%cslittle at mac dot com> <E3F1EB1E-2B84-4400-AE28-FE73B3E72CE9 at ljug dot com>

On Jun 30, 2005, at 11:12 AM, Brady Duga wrote:


On Jun 30, 2005, at 7:02 AM, Chris Little wrote:

The simplest example would be names. Take the contrived example of the names Éastwôôd and Wu. Lexicographically Éastwôôd should come before Wu.

This is an oversimplification. Although that may be the sort order for American English, it may not be for other languages (don't actually know). The classic (Unicode) example is 'ø' - in most languages it is considered a variant of 'o', but it is not the case in Norwegian and Danish, where it is sorted after 'z'. Even in that case, sorts should not be done using code-point values, a mistake people often make.
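
To make the code-point pitfall concrete, here is a minimal Python sketch (not REALbasic, and not what the system comparison does). The name list and the helper accent_insensitive_key are hypothetical, introduced only for illustration. The helper strips combining marks, which is a crude stand-in for collation rather than a locale-aware sort.

```python
import unicodedata

# Escapes keep the precomposed characters explicit: "Éastwôôd".
names = ["Wu", "\u00c9astw\u00f4\u00f4d", "Eastman"]

# Plain comparison orders strings by code point: 'É' is U+00C9, which is
# greater than 'W' (U+0057), so "Éastwôôd" lands after "Wu".
by_code_point = sorted(names)


def accent_insensitive_key(s):
    """Hypothetical helper: decompose to NFD, drop combining marks, casefold.

    This only approximates "treat accented letters as their base letter";
    it knows nothing about per-language rules.
    """
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


by_base_letters = sorted(names, key=accent_insensitive_key)

print(by_code_point)    # ['Eastman', 'Wu', 'Éastwôôd']
print(by_base_letters)  # ['Eastman', 'Éastwôôd', 'Wu']
```

Note that 'ø' has no canonical decomposition, so this helper leaves it untouched and cannot express either the "variant of o" or the "after z" ordering. That kind of per-language tailoring is what a localized system comparison is meant to provide.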

That said, I am a little surprised by the behavior of Rb. I wonder if they have some special case code for UTF8 vs other encodings. I would expect they are just asking the system to perform the comparison - perhaps there is a problem with the way they are passing the data to the system call. It is also entirely possible the system calls are broken.


At this point I'm wondering about writing a plug-in that would use CFStrings on Mac so I could call CFStringCompare with kCFCompareLocalized. It would be expensive to create the CFStrings, but it wouldn't be lossy. I would have to research the functions to use on Windows.

A plug-in seems like overkill. I think this one declare will do it for you in REALbasic 2005:

declare function CFStringCompare Lib "CarbonLib" (str1 as CFString, str2 as CFString, flags as integer) as integer

then:

dim res as integer
res = CFStringCompare("Éastwôôd", "Wu", kCFCompareLocalized) //where kCFCompareLocalized = 32

I haven't tested it, so there may be a typo, but it should work.

There is one small typo: the type of the first two parameters should be CFStringRef, not CFString.

--------------
Charles Yeomans

