YYCCommonplace/src/EncodingHelper.hpp

81 lines
3.8 KiB
C++
Raw Normal View History

2024-04-25 10:38:13 +08:00
#pragma once
#include "YYCCInternal.hpp"
2024-04-26 15:37:28 +08:00
#include <string>
#if YYCC_OS == YYCC_OS_WINDOWS
2024-04-25 10:38:13 +08:00
#include "WinImportPrefix.hpp"
#include <Windows.h>
#include "WinImportSuffix.hpp"
#endif
2024-04-25 10:38:13 +08:00
/**
* @brief The namespace handling encoding issues.
* @details
* \par Windows Encoding Convertion
* This namespace provides the convertion between wchar_t, UTF8 and code-page-based string:
* The function name has following format: \c AAAToBBB.
* AAA is the source string and BBB is target string.
* AAA and BBB has following possible value:
* \li \c Char: Code-page-based string. Usually it will add a code page parameter for function to get the code page of this string. For code page, please see Microsoft document.
* \li \c UTF8: UTF8 string.
* \li \c Wchar: wchar_t string.
* \par
* For example: \c WcharToUTF8 will perform the convertion from wchar_t to UTF8,
* and \c CharToChar will perform the convertion between 2 code-page-based string and caller can specify individual code page for these 2 string.
* \par
* These functions are Windows specific and are unavailable on other platforms.
* Becasue Windows use wchar_t string as its function arguments for globalization, and this library use UTF8 everywhere.
* So it should have a bidirectional way to do convertion between wchar_t string and UTF8 string.
*
* \par UTF32, UTF16 and UTF8 Convertion
* This namespace also provide the convertion among UTF32, UTF16 and UTF8.
* These convertion functions are suit for all platforms, not Windows oriented.
* \par
* Due to implementation, this library assume all non-Windows system use UTF8 as their C locale.
* Otherwise these functions will produce wrong result.
*
* \par Function Parameters
* We provide these encoding convertion functions with following 2 types:
* \li Function returns \c bool and its parameter order source string pointer and a corresponding \c std::basic_string container for receiving result.
* \li Function returns corresponding \c std::basic_string result, and its parameter only order source string pointer.
* \par
* For these 2 declarations, both of them will not throw any exception and do not accept nullptr as source string.
* The only difference is that the way to indicate convertion error.
* \par
* First declaration will return false to indicate there is an error when doing convertion. Please note that the content of string container passing in may still be changed!
* Last declaration will return empty string to indicate error. Please note if you pass empty string in, they still will output empty string but it doesn't mean an error.
* So last declaration is used in the scenario that we don't care whether the convertion success did. For example, output something to console.
*
*/
2024-04-25 10:38:13 +08:00
namespace YYCC::EncodingHelper {
#if YYCC_OS == YYCC_OS_WINDOWS
2024-05-20 21:41:48 +08:00
bool WcharToChar(const wchar_t* src, std::string& dest, UINT codepage);
bool WcharToUTF8(const wchar_t* src, std::string& dest);
std::string WcharToChar(const wchar_t* src, UINT codepage);
std::string WcharToUTF8(const wchar_t* src);
bool CharToWchar(const char* src, std::wstring& dest, UINT codepage);
bool UTF8ToWchar(const char* src, std::wstring& dest);
std::wstring CharToWchar(const char* src, UINT codepage);
std::wstring UTF8ToWchar(const char* src);
bool CharToChar(const char* src, std::string& dest, UINT src_codepage, UINT dest_codepage);
std::string CharToChar(const char* src, UINT src_codepage, UINT dest_codepage);
2024-04-25 10:38:13 +08:00
#endif
bool UTF8ToUTF16(const char* src, std::u16string& dest);
std::u16string UTF8ToUTF16(const char* src);
bool UTF8ToUTF32(const char* src, std::u32string& dest);
std::u32string UTF8ToUTF32(const char* src);
2024-04-26 15:37:28 +08:00
bool UTF16ToUTF8(const char16_t* src, std::string& dest);
std::string UTF16ToUTF8(const char16_t* src);
bool UTF32ToUTF8(const char32_t* src, std::string& dest);
std::string UTF32ToUTF8(const char32_t* src);
}