CEGUI multilanguage ??

Posted: Sat Sep 24, 2005 09:27
by reload
How do I write Russian text in an edit box, or anywhere else in a window, in CEGUI ?? :shock:

Re: CEGUI multilanguage ??

Posted: Sat Sep 24, 2005 13:00
by zap
1. The String class used in CEGUI supports so-called "utf32" characters, which are 32 bits wide. By the way, utf32 is an incorrect name; it's really ucs4 and NOT utf32 (utf32 is a prefix encoding system just like utf8 but with 32-bit words) (ref: www.unicode.org).

2. You have to define the glyphs you will use with the Font::createFontGlyphSet() method so that CEGUI creates the respective glyph images from the TTF font. (Hint, hint, CEGUI authors: why not create glyphs on the fly in getGlyphData() rather than making the developer deal with this? It's often impossible to know in advance which glyphs will be printed, especially in a heavily multi-language environment.)
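To make point 2 a bit more concrete, here is a minimal sketch of one way to describe a whole block of glyphs up front: build a UTF-8 string that contains every code point in a range (for example the Cyrillic block, U+0400 to U+04FF). The helper names `appendUtf8` and `makeGlyphSet` are made up for this illustration and are not part of CEGUI. The actual call into the Font class is deliberately left out, since its exact name and signature depend on the CEGUI version you have.

```cpp
#include <cstdint>
#include <string>

// Append one Unicode code point to `out` as UTF-8.
// Assumption: cp is a valid scalar value (<= 0x10FFFF, not a surrogate).
void appendUtf8(std::string& out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Build a UTF-8 string containing every code point in [first, last].
// (Hypothetical helper; the result would be handed to the font's
// glyph-definition call.)
std::string makeGlyphSet(std::uint32_t first, std::uint32_t last)
{
    std::string glyphs;
    for (std::uint32_t cp = first; cp <= last; ++cp)
        appendUtf8(glyphs, cp);
    return glyphs;
}
```

For Russian text you would typically combine something like `makeGlyphSet(0x20, 0x7E)` (printable ASCII) with `makeGlyphSet(0x0400, 0x04FF)` (Cyrillic) and pass the result to the font.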

3. To actually type Cyrillic (or any other non-Latin) input you'll need a proper input layer. CEGUI itself doesn't read keyboard input; it relies on an OS-specific layer to feed it correct utf32 (or, rather, ucs4) character codes via the CEGUI::System::injectChar() method.
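As a rough sketch of what such an input layer might do: if the OS or windowing library hands you typed text as UTF-8, it has to be decoded into individual code points before each one is injected. The function name `feedUtf8` and the callback parameter are assumptions made for this example; in a real application the callback would forward each value to CEGUI::System::injectChar(). The decoder is intentionally simple: it skips malformed bytes and does not reject overlong forms or surrogates.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>

// Decode UTF-8 text and pass each code point to `inject`.
// Returns the number of code points delivered.
std::size_t feedUtf8(const std::string& text,
                     const std::function<void(std::uint32_t)>& inject)
{
    std::size_t count = 0;
    std::size_t i = 0;
    while (i < text.size()) {
        const unsigned char lead = static_cast<unsigned char>(text[i]);
        std::uint32_t cp = 0;
        std::size_t extra = 0;  // continuation bytes expected after the lead

        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else { ++i; continue; }         // stray continuation or invalid lead

        if (i + extra >= text.size()) break;  // truncated sequence at the end

        bool ok = true;
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(text[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { ++i; continue; }     // skip the bad lead byte and resync

        inject(cp);                     // one code point per call
        ++count;
        i += extra + 1;
    }
    return count;
}
```

In use, the callback could be a small lambda that calls `injectChar()` on the CEGUI system object with each decoded value.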

Re: CEGUI multilanguage ??

Posted: Sun Sep 25, 2005 09:04
by CrazyEddie
zap wrote:
1. The String class used in CEGUI supports so-called "utf32" characters, which are 32 bits wide. By the way, utf32 is an incorrect name; it's really ucs4 and NOT utf32 (utf32 is a prefix encoding system just like utf8 but with 32-bit words) (ref: www.unicode.org).


UCS4 is also an encoding system. UCS4 comes from ISO/IEC 10646 and UTF-32 comes from the Unicode Consortium's Unicode specification; these are two discrete standards documents. Although Unicode 4.0 now effectively implements ISO/IEC 10646, and uses much of the same terminology, please do not introduce confusion by mixing terminology from the two.

With some of the latest revisions made to ISO/IEC 10646 (relating to the usage of reserved ranges), UCS4 and UTF-32 are now effectively the same thing. The Unicode standard actually states this in Appendix C (section C.2).

The character injector takes a Unicode codepoint encoded as a single UTF-32 code unit. Unicode is not currently 'fully' implemented, though it will be at some stage in the future. As this implementation progresses we will use the Unicode specification as a guide. We will not, however, be referring directly to the ISO/IEC 10646 standard.
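For anyone writing the input side, "a codepoint encoded as a single UTF-32 code unit" can be checked mechanically: a 32-bit value is a valid UTF-32 code unit only if it is no greater than U+10FFFF and does not fall in the surrogate range U+D800 to U+DFFF. A minimal sketch follows; the function name `isUnicodeScalarValue` is my own and not something CEGUI provides.

```cpp
#include <cstdint>

// True if cp can stand alone as a UTF-32 code unit: within the Unicode
// code space and not a UTF-16 surrogate.
bool isUnicodeScalarValue(std::uint32_t cp)
{
    const bool inRange = cp <= 0x10FFFF;
    const bool isSurrogate = cp >= 0xD800 && cp <= 0xDFFF;
    return inRange && !isSurrogate;
}
```

A check like this could run just before a value is handed to the injector, so that garbage from a broken input layer is dropped early.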