[0.8.2] Clipboard::getText() and UTF-8

Posted: Mon Sep 02, 2013 17:42
by BigG
Hi,
when implementing a NativeClipboardProvider for Win32 that supports Unicode, I came across a bug in said function.

I retrieve the text from the Win32 clipboard as UTF-16 data using CF_UNICODETEXT, convert it to UTF-8, and hand it back to the CEGUI::Clipboard instance. However, it is then misinterpreted as plain ASCII data, which of course produces the wrong output. The reason appears to be the following (Clipboard.cpp lines 155 - 159):

Code: Select all

        // d_buffer an utf8 or ASCII C string (ASCII if std::string is used)
       
        // !!! However it is not null terminated !!! So we have to tell String
        // how many code units (not code points!) there are.
        return String(static_cast<const char*>(d_buffer), d_bufferSize);


The constructor of the String instance is always called with d_buffer given as const char*, which is interpreted as plain ASCII. To have it interpreted as UTF-8, it would need to be passed as const utf8*, of course. I propose the following fix, which keeps compatibility for all usable string classes:

Code: Select all

        // d_buffer an utf8 or ASCII C string (ASCII if std::string is used)
       
        // !!! However it is not null terminated !!! So we have to tell String
        // how many code units (not code points!) there are.
#if CEGUI_STRING_CLASS == CEGUI_STRING_CLASS_UNICODE
        return String(static_cast<const utf8*>(d_buffer), d_bufferSize);
#else
        return String(static_cast<const char*>(d_buffer), d_bufferSize);
#endif
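
To illustrate the difference outside of CEGUI, here is a minimal, library-independent sketch. The two helper functions (decodeBytewise and decodeUtf8) are hypothetical names made up for this example and are not part of CEGUI. The first widens every byte to its own code point, which is roughly the misinterpretation described above; the second decodes the same bytes as UTF-8. The decoder assumes well-formed input and does no validation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Treats every byte as one code point. This mimics what happens when
// UTF-8 data is handed over as plain `const char*` and read as ASCII.
std::vector<std::uint32_t> decodeBytewise(const char* data, std::size_t size)
{
    std::vector<std::uint32_t> out;
    for (std::size_t i = 0; i < size; ++i)
        out.push_back(static_cast<unsigned char>(data[i]));
    return out;
}

// Decodes UTF-8 into code points.
// Assumption: the input is well-formed; no validation is performed.
std::vector<std::uint32_t> decodeUtf8(const char* data, std::size_t size)
{
    std::vector<std::uint32_t> out;
    std::size_t i = 0;
    while (i < size)
    {
        const unsigned char lead = static_cast<unsigned char>(data[i]);
        std::uint32_t cp;
        std::size_t extra; // number of continuation bytes that follow
        if (lead < 0x80)                { cp = lead;        extra = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else                            { cp = lead & 0x07; extra = 3; }
        ++i;
        // Each continuation byte contributes its lower six bits
        for (; extra > 0 && i < size; --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(data[i]) & 0x3F);
        out.push_back(cp);
    }
    return out;
}
```

For the two bytes 0xC3 0xA4 (the UTF-8 encoding of U+00E4), the bytewise version yields two code points, while the UTF-8 version yields the single intended one.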


Alternatively, a different, non-standard MIME type "text/utf8" could be implemented...

Thanks in advance,
- BigG

Re: [0.8.2] Clipboard::getText() and UTF-8

Posted: Mon Sep 02, 2013 18:18
by Kulik
Hi, thanks for noticing this. I have fixed it in v0-8.

https://bitbucket.org/cegui/cegui/commi ... e05b307210

btw: We would be very very interested in a Windows NativeClipboardProvider. Would be awesome to ship it with CEGUI if it works well. Are you planning to release the code?

Re: [0.8.2] Clipboard::getText() and UTF-8

Posted: Mon Sep 02, 2013 18:59
by BigG
Thank you for the fix! My implementation is actually really basic and only supports text, no fancy stuff such as images or anything as I don't need those for my project. It should, however, be easy to add those in. Also, it's not following any coding guidelines CEGUI is using, so it needs some polishing ;)

Note that the UTF-8 <-> UTF-16 conversion uses several temporary buffers which is not necessarily the most efficient way to do it, but this way I am able to use CEGUI::StringTranscoder. Also, due to using the temporary internal buffer, it is not thread safe. This shouldn't be an issue as CEGUI itself isn't either as far as I know.
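
For comparison, a single-pass UTF-16 to UTF-8 conversion that avoids the intermediate buffers could look roughly like the sketch below. This is not what the provider does and it does not use CEGUI::StringTranscoder; the function name utf16ToUtf8 is hypothetical, and std::uint16_t stands in for CEGUI::uint16. Unpaired surrogates are replaced with U+FFFD, which is a design choice of this sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Converts a null-terminated UTF-16 sequence to UTF-8 in one pass.
// Unpaired surrogates are replaced with U+FFFD (assumption of this sketch).
std::string utf16ToUtf8(const std::uint16_t* src)
{
    std::string out;
    while (*src)
    {
        std::uint32_t cp = *src++;
        if (cp >= 0xD800 && cp <= 0xDBFF && *src >= 0xDC00 && *src <= 0xDFFF)
            cp = 0x10000 + ((cp - 0xD800) << 10) + (*src++ - 0xDC00); // surrogate pair
        else if (cp >= 0xD800 && cp <= 0xDFFF)
            cp = 0xFFFD; // lone surrogate

        if (cp < 0x80)
        {
            out += static_cast<char>(cp);
        }
        else if (cp < 0x800)
        {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else if (cp < 0x10000)
        {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
        else
        {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

Because the result is returned by value, this variant needs no shared member buffer. The provider interface still expects a buffer that outlives the call, so the result would have to be stored somewhere in any case.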

Win32ClipboardProvider.h

Code: Select all

#ifndef WIN32CLIPBOARDPROVIDER_H
#define WIN32CLIPBOARDPROVIDER_H

class Win32ClipboardProvider : public CEGUI::NativeClipboardProvider
{
private:
   char* m_Buffer;
   size_t m_BufferSize;

private:
   void _allocateBuffer(size_t Size);
   void _deallocateBuffer();

public:
   Win32ClipboardProvider();
   ~Win32ClipboardProvider();

public:
   // NativeClipboardProvider overrides
   void sendToClipboard(const CEGUI::String& mimeType, void* buffer, size_t size);
   void retrieveFromClipboard(CEGUI::String& mimeType, void*& buffer, size_t& size);
};

#endif


Win32ClipboardProvider.cpp

Code: Select all

#include <cstring> // memcpy, strlen
#include <CEGUI.h>
#include <Windows.h>
#include "Win32ClipboardProvider.h"

Win32ClipboardProvider::Win32ClipboardProvider()
   : m_Buffer(NULL), m_BufferSize(0)
{

}

Win32ClipboardProvider::~Win32ClipboardProvider()
{
   _deallocateBuffer();
}

void Win32ClipboardProvider::_allocateBuffer(size_t Size)
{
   if(m_Buffer)
      delete [] m_Buffer;

   m_Buffer = new char[Size];
   m_BufferSize = Size;
}

void Win32ClipboardProvider::_deallocateBuffer()
{
   delete [] m_Buffer;
   m_Buffer = NULL;
   m_BufferSize = 0;
}

void Win32ClipboardProvider::sendToClipboard(const CEGUI::String& mimeType, void* buffer, size_t size)
{
   if(mimeType == "text/plain")
   {
      if(OpenClipboard(NULL))
      {
         // Transcode buffer to UTF-16
#if CEGUI_STRING_CLASS == CEGUI_STRING_CLASS_UNICODE
         CEGUI::String str(static_cast<const CEGUI::utf8*>(buffer), size);
#else
         CEGUI::String str(static_cast<const char*>(buffer), size);
#endif
         CEGUI::uint16* str_utf16 = CEGUI::System::getSingleton().getStringTranscoder().stringToUTF16(str);
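         // Count the UTF-16 code units; str.size() can differ for characters outside the BMP
         size_t Utf16Length = 0;
         while(str_utf16[Utf16Length])
            ++Utf16Length;
         size_t SizeInBytes = (Utf16Length + 1) * sizeof(CEGUI::uint16);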

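         // Copy to clipboard (SetClipboardData requires memory allocated with GMEM_MOVEABLE)
         EmptyClipboard();
         HGLOBAL hClipboardData = GlobalAlloc(GMEM_MOVEABLE, SizeInBytes);
         LPWSTR Clipboard = hClipboardData ? static_cast<LPWSTR>(GlobalLock(hClipboardData)) : NULL;
         if(Clipboard)
         {
            memcpy(Clipboard, str_utf16, SizeInBytes);
            GlobalUnlock(hClipboardData);
            // On success the system takes ownership of the memory
            SetClipboardData(CF_UNICODETEXT, hClipboardData);
         }
         else if(hClipboardData)
            GlobalFree(hClipboardData);
         CloseClipboard();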

         // Free temporary UTF-16 buffer
         CEGUI::System::getSingleton().getStringTranscoder().deleteUTF16Buffer(str_utf16);
      }
   }
}

void Win32ClipboardProvider::retrieveFromClipboard(CEGUI::String& mimeType, void*& buffer, size_t& size)
{
   if(OpenClipboard(NULL))
   {
      // Open & read UTF-16 clipboard data
      HGLOBAL hClipboardData = GetClipboardData(CF_UNICODETEXT);
      const CEGUI::uint16* Clipboard = static_cast<const CEGUI::uint16*>(GlobalLock(hClipboardData));
      if(Clipboard)
      {
         // Transcode UTF-16 to native format and copy to local buffer
         CEGUI::String str = CEGUI::System::getSingleton().getStringTranscoder().stringFromUTF16(Clipboard);
         _allocateBuffer(strlen(str.c_str())); // We need the actual byte count which can be different from str.size() when using UTF-8!
         memcpy(m_Buffer, str.c_str(), m_BufferSize);

         mimeType = "text/plain";
         buffer = m_Buffer;
         size = m_BufferSize;
         // Unlock only after a successful lock
         GlobalUnlock(hClipboardData);
      }
      // Close clipboard
      CloseClipboard();
   }
}
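
A side note on buffer sizes when copying UTF-16 data: the number of UTF-16 code units is not always the number of characters, because characters outside the Basic Multilingual Plane take two units (a surrogate pair). The following sketch counts units directly from a null-terminated buffer. The helper names utf16Length and utf16ByteSize are hypothetical, and std::uint16_t stands in for CEGUI::uint16.

```cpp
#include <cstddef>
#include <cstdint>

// Number of UTF-16 code units before the terminating 0.
std::size_t utf16Length(const std::uint16_t* s)
{
    std::size_t n = 0;
    while (s[n])
        ++n;
    return n;
}

// Bytes needed to copy the string including its terminating 0.
std::size_t utf16ByteSize(const std::uint16_t* s)
{
    return (utf16Length(s) + 1) * sizeof(std::uint16_t);
}
```

For a single character such as U+1F600 (encoded as 0xD83D 0xDE00), this gives 6 bytes, whereas a size derived from a character count of 1 would give only 4 and cut off the terminator.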


Hope that helps :D

Greetings,
- BigG

Re: [0.8.2] Clipboard::getText() and UTF-8

Posted: Tue Sep 03, 2013 09:21
by Kulik
This helps tremendously, thanks!

Are you OK with us polishing this and putting it into CEGUI? That means it would be licensed under MIT. I would like to credit you by your full name as the original author; could you send it to me if it's not a secret? Otherwise I will just put BigG there.

Re: [0.8.2] Clipboard::getText() and UTF-8

Posted: Tue Sep 03, 2013 20:39
by BigG
I'm glad I could help - of course I'm totally OK with it ;) I'll PM you my full name.

Re: [0.8.2] Clipboard::getText() and UTF-8

Posted: Wed Sep 18, 2013 14:00
by lindbes
Thanks for this, I've been trying to find a solution everywhere. You're the best!