[0.8.2] Clipboard::getText() and UTF-8
Posted: Mon Sep 02, 2013 17:42
Hi,
when implementing a NativeClipboardProvider for Win32 that supports Unicode, I came across a bug in said function.
I'm retrieving the text from the Win32 clipboard using CF_UNICODETEXT as UTF-16 data, which I then convert to UTF-8 and hand back to the CEGUI::Clipboard instance. However, it is misinterpreted as plain ASCII data, which of course produces the wrong output. The reason appears to be the following (Clipboard.cpp, lines 155-159):
Code:
// d_buffer an utf8 or ASCII C string (ASCII if std::string is used)
// !!! However it is not null terminated !!! So we have to tell String
// how many code units (not code points!) there are.
return String(static_cast<const char*>(d_buffer), d_bufferSize);
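For readers following along, the UTF-16 to UTF-8 step mentioned above could look roughly like the sketch below. It is written in portable C++ so it can be tried outside Windows; utf16ToUtf8 is a hypothetical helper name, not part of CEGUI or the Win32 API. In a real Win32 provider, WideCharToMultiByte with CP_UTF8 would be the usual way to do this.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not a CEGUI or Win32 function): converts UTF-16 code
// units to a UTF-8 byte string. Unpaired surrogates are replaced by U+FFFD.
std::string utf16ToUtf8(const std::u16string& in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i)
    {
        std::uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF)        // high surrogate
        {
            if (i + 1 < in.size() && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF)
            {
                // Combine the surrogate pair into one code point.
                cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
                ++i;
            }
            else
                cp = 0xFFFD;
        }
        else if (cp >= 0xDC00 && cp <= 0xDFFF)   // stray low surrogate
            cp = 0xFFFD;

        // Emit 1 to 4 UTF-8 code units depending on the code point range.
        if (cp < 0x80)
            out += static_cast<char>(cp);
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

The size of the resulting std::string is the number of code units (bytes), which is the kind of length the comment in Clipboard.cpp refers to.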
The String constructor is always called with d_buffer cast to const char*, which is interpreted as plain ASCII. To have it interpreted as UTF-8, it would need to be passed as const utf8* instead. I propose the following fix, which keeps compatibility with all usable string classes:
Code:
// d_buffer is a UTF-8 or ASCII C string (ASCII if std::string is used)
// !!! However it is not null terminated !!! So we have to tell String
// how many code units (not code points!) there are.
#if CEGUI_STRING_CLASS == CEGUI_STRING_CLASS_UNICODE
// reinterpret_cast, assuming d_buffer is char-based: utf8 is an unsigned
// type, and static_cast cannot convert between char* and unsigned char*.
return String(reinterpret_cast<const utf8*>(d_buffer), d_bufferSize);
#else
return String(static_cast<const char*>(d_buffer), d_bufferSize);
#endif
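To illustrate why the pointer type matters, here is a small standalone sketch of the two interpretations. It does not use CEGUI; decodeBytewise and decodeUtf8 are hypothetical stand-ins for what the const char* and const utf8* constructors are described as doing above, and the decoder assumes well-formed input.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the const char* path: one code point per byte.
std::vector<std::uint32_t> decodeBytewise(const char* buf, std::size_t len)
{
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < len; ++i)
        out.push_back(static_cast<unsigned char>(buf[i]));
    return out;
}

// Hypothetical stand-in for the const utf8* path: decode UTF-8 sequences.
// Minimal sketch: assumes well-formed input and stops at a truncated sequence.
std::vector<std::uint32_t> decodeUtf8(const unsigned char* buf, std::size_t len)
{
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < len)
    {
        const unsigned char lead = buf[i];
        std::size_t extra = 0;          // number of continuation bytes
        std::uint32_t cp = lead;
        if (lead >= 0xF0)      { extra = 3; cp = lead & 0x07; }
        else if (lead >= 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if (lead >= 0xC0) { extra = 1; cp = lead & 0x1F; }

        if (i + extra >= len)
            break;                      // sequence runs past the buffer

        for (std::size_t k = 1; k <= extra; ++k)
            cp = (cp << 6) | (buf[i + k] & 0x3F);

        out.push_back(cp);
        i += extra + 1;
    }
    return out;
}
```

For the two UTF-8 bytes of "ä" (0xC3 0xA4), the bytewise path yields two code points, U+00C3 and U+00A4, while the UTF-8 path yields the single code point U+00E4. That matches the garbled output I was seeing.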
Alternatively, a different, non-standard MIME type "text/utf8" could be implemented...
Thanks in advance,
- BigG