Problem with CEGUI::String

For help with general CEGUI usage:
- Questions about the usage of CEGUI and its features, if not explained in the documentation.
- Problems with the CMake configuration or problems occurring during the build process/compilation.
- Errors or unexpected behaviour.

Moderators: CEGUI MVP, CEGUI Team

Impz0r
Not too shy to talk
Posts: 26
Joined: Fri Sep 04, 2009 12:54

Problem with CEGUI::String

Postby Impz0r » Mon Sep 21, 2009 19:13

Hey, I've just run into a problem concerning CEGUI::String.

I'm trying to use German umlauts like "äöü" within an edit box. This works just fine until I actually try to access them via getText().
The thing is that I'm using std::string throughout my entire application, except for the GUI of course. To get a CEGUI::String's content into a std::string I do something like:

Code:

std::string blah = control->getText().c_str();

This works most of the time, but not with these "äöü". Internally the string is converted to UTF-8, and thereby the "äöü" get messed up somehow.

So my question is: how do I get this right, if it is even possible?

PS: And besides, why is CEGUI not using the std::string class? Doesn't it also support Unicode?


Thanks in advance!

Mfg Imp

CrazyEddie
CEGUI Project Lead
Posts: 6760
Joined: Wed Jan 12, 2005 12:06
Location: England
Contact:

Re: Problem with CEGUI::String

Postby CrazyEddie » Tue Sep 22, 2009 08:40

Impz0r wrote: To get a CEGUI::String's content into a std::string I do something like:

Code:

std::string blah = control->getText().c_str();

This works most of the time, but not with these "äöü". Internally the string is converted to UTF-8, and thereby the "äöü" get messed up somehow.

Can you clarify this bit: "Internally the string is converted to UTF-8" - do you mean by std::string, or the fact that CEGUI::String does this? I'm not aware of std::string having such a function, and the reason CEGUI does it is that it's impossible to represent the entire set of Unicode code points in 8-bit chars (so we use UTF-8 in that case).

Impz0r wrote: PS: And besides, why is CEGUI not using the std::string class? Doesn't it also support Unicode?

I don't believe std::string does support Unicode; even the wide character type that's in the standard is not too helpful for us, because the actual representation is not specified and so varies by implementation - it's for this reason we wrote a string class that we can rely on to do what we expect in all cases ;)

So, to clear up a couple of points: what representation are you yourself using for characters? Some form of actual Unicode, or ISO/IEC 8859-1, or something else? :) Knowing this will aid in coming up with a suitable conversion, though largely it will involve accessing the UTF-32 codes in the CEGUI::String and stuffing them into your std::string (after applying any required conversion).

CE.

Impz0r
Not too shy to talk
Posts: 26
Joined: Fri Sep 04, 2009 12:54

Re: Problem with CEGUI::String

Postby Impz0r » Tue Sep 22, 2009 10:25

Hey CE, thanks for your answer.

I'm sorry, I did not express my concern very well.

What I need is ISO/IEC 8859-1, because it supports "ÄäÖöÜü" - as long as the Wikipedia page does not lie ;)

CrazyEddie wrote: Can you clarify this bit: "Internally the string is converted to UTF-8" - do you mean by std::string, or the fact that CEGUI::String does this?


Sorry, I was also quite unclear here. What I meant was that CEGUI::String internally converts the string to UTF-8, which is correct, but the outcome doesn't seem to fit. Meaning, I put "ÄäÖöÜü" into it and get chars like "Á" back. I don't know if I'm doing something wrong here?


Thanks in advance!

Mfg Imp

CrazyEddie
CEGUI Project Lead
Posts: 6760
Joined: Wed Jan 12, 2005 12:06
Location: England
Contact:

Re: Problem with CEGUI::String

Postby CrazyEddie » Tue Sep 22, 2009 13:16

Thanks for the clarification; it should be a simple 'stuffing' exercise. Try something like this:

Code:

// Copy a CEGUI::String into a std::string by taking the low byte of each
// UTF-32 code point - sufficient as long as the text is ISO/IEC 8859-1.
std::string& CEGUIStringToStdString(const CEGUI::String& in_str, std::string& out_str)
{
    out_str.resize(in_str.length());

    for (size_t i = 0; i < in_str.length(); ++i)
        out_str[i] = (char)in_str[i];

    return out_str;
}


Btw, the reason your Ä turns into something else is that UTF-8 is a multibyte representation where each glyph is represented by a variable number of chars. For plain ASCII it's all fine, because code points 0 to 127 translate directly; for values above this, code points are represented by two or more bytes: Ä, which is 0xC4, is encoded in UTF-8 as the sequence 0xC3 0x84 - all fun stuff :)

CE.

Impz0r
Not too shy to talk
Posts: 26
Joined: Fri Sep 04, 2009 12:54

Re: Problem with CEGUI::String

Postby Impz0r » Wed Sep 23, 2009 08:23

Hey CE, thanks for the quick snippet you've posted. As far as I understand, you just typecast each multi-byte Unicode character into a single byte, so the upper part of it will just be cut off, right?


Thanks again for your great support!

Mfg Imp

CrazyEddie
CEGUI Project Lead
Posts: 6760
Joined: Wed Jan 12, 2005 12:06
Location: England
Contact:

Re: Problem with CEGUI::String

Postby CrazyEddie » Wed Sep 23, 2009 08:55

Well, we use UTF-32, which is four bytes per code point, but other than that, yes: that function just uses the low byte as the final char. This does not work in all cases, but should be fine for ISO/IEC 8859-1.

CE.

