On Mar 29, 2008, at 2:36 PM, Joe Strout wrote:
> On Mar 29, 2008, at 11:10 AM, James Sentman wrote:
>
>> I'm sure Joe is correct when he suggests you're hitting a text
>> encoding issue. I understand and even acknowledge the necessity of
>> text encoding, but despise is not too strong a word for how I feel
>> towards the implementation we're using.
>
> Why's that? In most cases you don't have to worry about it in RB;
> the defaults do sensible things and everything just works, except of
> course when interacting with some external data source that's not
> Unicode-savvy. On the whole, I think it's pretty elegant.
>
I guess it's mostly because of my recent bad experiences with it ;) Under
Leopard, Apple forced all strings in AppleScripts to UTF16, while I use
RB's default of UTF8. RB doesn't know that the strings now coming
from AppleEvents are no longer binary equivalent, so, for example, I
can't look up objects in a dictionary by name to resolve an AppleScript
object request anymore. Just out of the blue. And AppleScript
seems to completely ignore the property in the dictionary where you
tell it what type of variable to expect or allow... And I've found
cases where some encodings seem to cause random crashes when putting
things into CFObjects... lots of little things like that where I find
I have to deal with it when I shouldn't. If you're just using RB
inside RB then it rarely becomes an issue, but when you talk to other
things, all of a sudden things don't always work, and sorting it out
is not always obvious.
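
To illustrate the lookup failure described above, here is a minimal sketch in Python rather than RB. It assumes the dictionary keys are compared byte for byte; the names (objects_by_name, lookup, the sample key) are made up for illustration and are not from any real project. The same text encoded as UTF-8 and as UTF-16 produces different bytes, so the lookup only succeeds once the incoming string is converted to the encoding the keys were stored in.

```python
# Illustration only: hypothetical names, not RB's Dictionary or AppleEvent API.
name = "Kitchen Light"

utf8_key = name.encode("utf-8")       # how the keys were stored
utf16_key = name.encode("utf-16-be")  # how the incoming string now arrives

# Registry keyed by the raw UTF-8 bytes of each object's name.
objects_by_name = {utf8_key: "object #1"}


def lookup(raw: bytes, source_encoding: str):
    """Convert incoming bytes to UTF-8 before the lookup.

    source_encoding must be known (or assumed) by the caller; the bytes
    themselves do not carry it.
    """
    normalized = raw.decode(source_encoding).encode("utf-8")
    return objects_by_name.get(normalized)


# Same text, different bytes: a direct lookup misses.
direct_hit = objects_by_name.get(utf16_key)

# After normalizing to the stored encoding, the lookup succeeds.
normalized_hit = lookup(utf16_key, "utf-16-be")
```

The point of the sketch is that the fix has to happen at the boundary where the external strings come in, before they are used as keys.
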
>> However, once you read it in from ReadPString it doesn't
>> know that anymore as the encodings aren't saved along with the binary
>> data.
>
> As they shouldn't be, since the point of WritePString and ReadPString
> is to interact with old MacOS9 systems, which didn't store encoding
> data nor use Unicode, but generally assumed that all strings were in
> the default WorldScript encoding.
>
>
>> So, regardless of how you write the string, you may have to redefine
>> the encoding when you read it back in.
>
> No, if you use a TextInputStream, the encoding will be (correctly)
> defined as UTF-8 by default.
>
But what if I don't want to use a text stream? I want to write a bunch
of stuff to a binary stream. Whether I use WritePString or just Write,
the encoding still isn't saved into the binary file. If you're only
dealing with text then OK, use a text stream and bypass the problem.
But if you're using a binary stream then, I assume, you're storing more
than just text, in which case you need to be able to read in the data
and reset the text encoding information that was lost. If all the data
comes from within RB as you say, then it will always be UTF8, so just
reset that.
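
A small sketch of what "resetting the encoding" amounts to, again in Python rather than RB and with made-up sample data. It assumes the writer always used UTF-8, as in the paragraph above. The bytes that come back from a binary stream carry no encoding label, so the reader has to supply one; supplying the wrong one (here an old Mac encoding, chosen only as an example) yields different text from the same bytes.

```python
import io

original = "café"

# Writing to a binary stream stores only the bytes, not the encoding.
stream = io.BytesIO()
stream.write(original.encode("utf-8"))

# Reading back gives raw bytes with no encoding attached.
stream.seek(0)
raw = stream.read()

# The reader must declare the encoding. Since the writer is known to
# have used UTF-8, declaring UTF-8 restores the original text.
restored = raw.decode("utf-8")

# Declaring a different encoding interprets the same bytes differently.
misread = raw.decode("mac_roman")
```

In RB terms this corresponds to declaring the encoding on the string after reading it, not to converting the bytes.
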
>> To solve it across the board I believe that you can set the encoding
>> that is the default for the binary stream? And even if you do that
>> I'm not sure that ReadPString will honor that because as Joe says it's an
>> old call that was never text encoding savvy.
>
> Right, but the proper way to avoid that problem is to quit using
> Read/WritePString.
But... if I just use Write/Read it still doesn't save that, does it?
It's a binary stream, and if you use that you have to deal with
resetting the encoding. Is there anything wrong with PStrings except
that they don't preserve encoding? I thought it was just a length byte
at the head of a variable-length run of data of up to 255 bytes,
255 being the highest number you can represent with a single byte. I
use it sometimes, though mostly I've replaced it with my own calls
that use a 2-byte unsigned integer so I can write longer bits of data ;)
So the solution is less to avoid PStrings and more to use a text
stream, if you're using just text...
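
For reference, the two layouts described above can be sketched as follows. This is Python, not RB, and the function names are hypothetical; it is not the actual ReadPString/WritePString implementation. It assumes UTF-8 on both sides and a big-endian length for the 2-byte variant (the byte order is my choice for the example). Note that the length counts bytes, not characters.

```python
import io
import struct


def write_pstring(stream, text, encoding="utf-8"):
    """One length byte followed by the data: at most 255 bytes."""
    data = text.encode(encoding)
    if len(data) > 255:
        raise ValueError("data too long for a 1-byte length prefix")
    stream.write(bytes([len(data)]))
    stream.write(data)


def read_pstring(stream, encoding="utf-8"):
    """Read the length byte, then that many bytes, then declare the encoding."""
    length = stream.read(1)[0]
    return stream.read(length).decode(encoding)


def write_string16(stream, text, encoding="utf-8"):
    """Two-byte unsigned length (big-endian here): at most 65535 bytes."""
    data = text.encode(encoding)
    if len(data) > 0xFFFF:
        raise ValueError("data too long for a 2-byte length prefix")
    stream.write(struct.pack(">H", len(data)))
    stream.write(data)


def read_string16(stream, encoding="utf-8"):
    """Read the 2-byte length, then that many bytes, then declare the encoding."""
    (length,) = struct.unpack(">H", stream.read(2))
    return stream.read(length).decode(encoding)
```

Either way the encoding is not stored in the file; the reader supplies it, which is the same "reset the encoding" step as before.
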
Thanks,
James
James Sentman http://sentman.com
http://MacHomeAutomation.com