Not only did you necro an ancient thread, you also necro'd the forum-corpse of CrazyEddie!
I did not read the entire thread, but I would like to add something: I rewrote CEGUI::String almost entirely for C++11 and tried to use the standard library's string conversion functionality. It is absolutely unusable. codecvt is very poorly supported in Visual Studio 2013 and 2015 (known bugs, linker problems, etc.), to the point where it makes absolutely no sense to use it.
What we have by default, and I would like to keep that, is 3 options for CEGUI::String:
ASCII
UTF_8
UTF_32
The ASCII type is just a typedef of std::string.
The UTF_8 type replicates the entire std::basic_string interface, is UTF-8 and UTF-32 aware for all inputs, and has additional convenience functionality for this purpose. It can be used like an std::basic_string and internally stores a std::string, where each char is a code unit. As always, UTF-8 is backwards compatible with ASCII. This is my personal favourite as a solution, but of course it is slower than raw ASCII strings, since we always need to check the code units when it comes to iteration, length checks, etc.
The UTF_32 type stores code points and expects normalised input. We do not normalise anywhere atm, but afaik you can't keyboard-input combined characters anyways (they are received already normalised), so this would mainly be an issue for clipboards. This one is also UTF-8 aware and has convenience functions as well. Internally it stores everything as std::u32string. It will of course waste a ton of memory, but computations will most likely tend to be faster than in the UTF-8 version. I would still prefer the UTF-8 string.
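To make the trade-off between the two Unicode variants a bit more concrete, here is a minimal sketch. It is not the actual CEGUI::String code; the function names countCodePointsUtf8, countCodePointsUtf32 and payloadBytes are made up for illustration, and the UTF-8 counter assumes the input is already valid UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not CEGUI API: counts code points in a UTF-8 encoded
// std::string by skipping continuation bytes (bit pattern 10xxxxxx).
// Assumes valid UTF-8 input. Has to touch every code unit, so it is O(n).
std::size_t countCodePointsUtf8(const std::string& utf8)
{
    std::size_t count = 0;
    for (unsigned char byte : utf8)
    {
        if ((byte & 0xC0) != 0x80) // not a continuation byte: a new code point starts here
            ++count;
    }
    return count;
}

// With UTF-32 every element is one code point, so the length is a direct lookup.
std::size_t countCodePointsUtf32(const std::u32string& utf32)
{
    return utf32.size();
}

// Bytes used by the character data alone (ignores allocator and object overhead).
std::size_t payloadBytes(const std::string& s)
{
    return s.size() * sizeof(char);
}

std::size_t payloadBytes(const std::u32string& s)
{
    return s.size() * sizeof(char32_t);
}
```

For mostly-ASCII GUI text the UTF-8 payload stays close to one byte per character, while the UTF-32 payload is typically four bytes per character regardless of content. That is the memory versus speed trade-off described above.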
It is best to implement the conversions between UTF-8 and UTF-32 ourselves in our library, since they are relatively simple. Adding a library for this alone is quite ridiculous, and we already had existing code for it, which I refactored and (hopefully) improved based on what I learned from other sources. codecvt would be nice, but like I said it is broken, and I remember there were also some caveats.
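For anyone wondering how simple "relatively simple" is, here is a rough sketch of both directions. This is not the code that is in CEGUI; the names appendUtf8, utf32ToUtf8 and utf8ToUtf32 are hypothetical, and replacing invalid input with U+FFFD is just one possible policy that I picked for the example.

```cpp
#include <cstddef>
#include <string>

namespace {
const char32_t kReplacement = 0xFFFD; // U+FFFD REPLACEMENT CHARACTER

bool isValidCodePoint(char32_t cp)
{
    // Valid scalar values: up to U+10FFFF, excluding UTF-16 surrogates.
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}
}

// Appends the UTF-8 encoding of a single code point (1 to 4 code units).
void appendUtf8(std::string& out, char32_t cp)
{
    if (!isValidCodePoint(cp))
        cp = kReplacement;
    if (cp < 0x80)
    {
        out += static_cast<char>(cp);
    }
    else if (cp < 0x800)
    {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else if (cp < 0x10000)
    {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    else
    {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

std::string utf32ToUtf8(const std::u32string& in)
{
    std::string out;
    for (char32_t cp : in)
        appendUtf8(out, cp);
    return out;
}

std::u32string utf8ToUtf32(const std::string& in)
{
    // Smallest code point that legitimately needs a sequence of each length;
    // anything below it is an overlong encoding.
    static const char32_t minForLength[] = { 0, 0, 0x80, 0x800, 0x10000 };

    std::u32string out;
    std::size_t i = 0;
    while (i < in.size())
    {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else { out += kReplacement; ++i; continue; } // stray continuation or invalid lead byte

        if (i + len > in.size()) { out += kReplacement; break; } // truncated sequence

        bool ok = true;
        for (std::size_t k = 1; k < len; ++k)
        {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!ok) { out += kReplacement; ++i; continue; } // resynchronise at the next byte

        if (cp < minForLength[len] || !isValidCodePoint(cp))
            cp = kReplacement; // overlong encoding, surrogate, or out of range
        out += cp;
        i += len;
    }
    return out;
}
```

Note that neither function does any normalisation; they only translate between code units and code points.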
Imho, wide string and UTF-16 conversions should be done using the libraries present in your OS or wherever the strings come from. Clearly those libraries are using UTF-16 if they hand it to you, so this duty is something we leave to the Windows (etc.) functions. Since both UTF_32 and UTF_8 are supported in the Unicode CEGUI::String types, it should be no problem to convert to at least one Unicode type we support.
Edit:
Added a ticket to remind me of the normalisation stuff; we should do this for 1.0 if possible:
https://bitbucket.org/cegui/cegui/issue ... ter-inputs