refactor: rename Native String to Ordinary String.
- rename Native to Ordinary in code and documentation. - fulfill some documentations.
This commit is contained in:
@ -4,16 +4,22 @@
|
||||
|
||||
YYCC::EncodingHelper namespace include all encoding related functions:
|
||||
|
||||
\li The convertion between native string and UTF8 string which has been introduced in chapter \ref library_encoding.
|
||||
\li The convertion between ordinary string and UTF8 string which has been introduced in chapter \ref library_encoding.
|
||||
\li Windows specific convertion between \c WCHAR, UTF8 string and string encoded by other encoding.
|
||||
\li The convertion among UTF8, UTF16 and UTF32.
|
||||
|
||||
|
||||
\section encoding_helper__native_utf8_conv Native & UTF8 Convertion
|
||||
\section encoding_helper__ordinary_utf8_conv Ordinary & UTF8 Convertion
|
||||
|
||||
These convertion functions have been introduced in previous page.
|
||||
See \ref library_encoding for more infomation.
|
||||
|
||||
YYCC supports following convertions:
|
||||
|
||||
\li YYCC::EncodingHelper::ToUTF8: Convert ordinary string to UTF8 string.
|
||||
\li YYCC::EncodingHelper::ToUTF8View: Same as ToUTF8, but return string view instead.
|
||||
\li YYCC::EncodingHelper::ToOrdinary: Convert UTF8 string to ordinary string.
|
||||
\li YYCC::EncodingHelper::ToOrdinaryView: Same as ToOrdinary, but return string view instead.
|
||||
|
||||
\section encoding_helper__win_conv Windows Specific Convertion
|
||||
|
||||
During Windows programming, the convertion between Microsoft specified \c wchar_t and \c char is an essential operation.
|
||||
@ -26,11 +32,11 @@ Please use them carefully (make sure that you are using them only in Windows env
|
||||
|
||||
YYCC supports following convertions:
|
||||
|
||||
\li \c WcharToChar: Convert \c wchar_t string to code page specified string.
|
||||
\li \c CharToWchar: The reversed convertion of WcharToChar.
|
||||
\li \c CharToChar: Convert string between 2 different code pages. It's a shortcut of calling CharToWchar and WcharToChar successively.
|
||||
\li \c WcharToUTF8: Convert \c wchar_t string to UTF8 string.
|
||||
\li \c UTF8ToWchar: The reversed convertion of WcharToUTF8.
|
||||
\li YYCC::EncodingHelper::WcharToChar: Convert \c wchar_t string to code page specified string.
|
||||
\li YYCC::EncodingHelper::CharToWchar: The reversed convertion of WcharToChar.
|
||||
\li YYCC::EncodingHelper::CharToChar: Convert string between 2 different code pages. It's a shortcut of calling CharToWchar and WcharToChar successively.
|
||||
\li YYCC::EncodingHelper::WcharToUTF8: Convert \c wchar_t string to UTF8 string.
|
||||
\li YYCC::EncodingHelper::UTF8ToWchar: The reversed convertion of WcharToUTF8.
|
||||
|
||||
Code Page is a Windows concept.
|
||||
If you don't understand it, please view corresponding Microsoft documentation.
|
||||
@ -47,14 +53,14 @@ They can be used in any platform, not confined in Windows platforms.
|
||||
|
||||
YYCC supports following convertions:
|
||||
|
||||
\li \c UTF8ToUTF16: Convert UTF8 string to UTF16 string.
|
||||
\li \c UTF16ToUTF8: The reversed convertion of UTF8ToUTF16.
|
||||
\li \c UTF8ToUTF32: Convert UTF8 string to UTF32 string.
|
||||
\li \c UTF32ToUTF8: The reversed convertion of UTF8ToUTF32.
|
||||
\li YYCC::EncodingHelper::UTF8ToUTF16: Convert UTF8 string to UTF16 string.
|
||||
\li YYCC::EncodingHelper::UTF16ToUTF8: The reversed convertion of UTF8ToUTF16.
|
||||
\li YYCC::EncodingHelper::UTF8ToUTF32: Convert UTF8 string to UTF32 string.
|
||||
\li YYCC::EncodingHelper::UTF32ToUTF8: The reversed convertion of UTF8ToUTF32.
|
||||
|
||||
\section encoding_helper__overloads Function Overloads
|
||||
|
||||
Every encoding convertion functions (except the convertion between UTF8 and native string) have 4 different overloads for different scenarios.
|
||||
Every encoding convertion functions (except the convertion between UTF8 and ordinary string) have 4 different overloads for different scenarios.
|
||||
Take YYCC::EncodingHelper::WcharToChar for example.
|
||||
There are following 4 overloads:
|
||||
|
||||
|
@ -59,6 +59,23 @@ I notice standard library change UTF8 related functions frequently and its API a
|
||||
For example, standard library brings \c std::codecvt_utf8 in C++ 11, deprecate it in C++ 17 and even remove it in C++ 26.
|
||||
That's unacceptable! So I create my own UTF8 type to avoid the scenario that standard library remove \c char8_t in future.
|
||||
|
||||
\section library_encoding__concept Concepts
|
||||
|
||||
In following content, you may be face with 2 words: ordinary string and UTF8 string.
|
||||
|
||||
UTF8 string, as its name, is the string encoded with UTF8.
|
||||
The char type of it must is \c yycc_char8_t.
|
||||
(equivalent to \c char8_t after C++ 20.)
|
||||
|
||||
Ordinary string means the plain, native string.
|
||||
The result of C++ string literal without any prefix \c "foo bar" is a rdinary string.
|
||||
The char type of it is \c char.
|
||||
Its encoding depends on compiler and environment.
|
||||
(UTF8 in Linux, or system code page in Windows if UTF8 switch was not enabled in MSVC.)
|
||||
|
||||
For more infomation, please browse CppReference:
|
||||
https://en.cppreference.com/w/cpp/language/string_literal
|
||||
|
||||
\section library_encoding__utf8_literal UTF8 Literal
|
||||
|
||||
String literal is a C++ concept.
|
||||
@ -123,35 +140,35 @@ char* mutable_utf8 = const_cast<char*>(absolutely_is_utf8); // This is not safe.
|
||||
yycc_char8_t* mutable_converted = YYCC::EncodingHelper::ToUTF8(mutable_utf8);
|
||||
\endcode
|
||||
|
||||
YYCC::EncodingHelper::ToUTF8 has 2 overloads which can handle const and mutable stirng pointer convertion respectively.
|
||||
YYCC::EncodingHelper::ToUTF8 has 2 overloads which can handle constant and mutable stirng pointer convertion respectively.
|
||||
|
||||
YYCC also has ability that convert YYCC UTF8 char type to native char type by YYCC::EncodingHelper::ToNative.
|
||||
YYCC also has ability that convert YYCC UTF8 char type to ordinary char type by YYCC::EncodingHelper::ToOrdinary.
|
||||
Here is an exmaple:
|
||||
|
||||
\code
|
||||
const yycc_char8_t* yycc_utf8 = YYCC_U8("I am UTF8 string.");
|
||||
const char* converted = YYCC::EncodingHelper::ToNative(yycc_utf8);
|
||||
const char* converted = YYCC::EncodingHelper::ToOrdinary(yycc_utf8);
|
||||
|
||||
yycc_char8_t* mutable_yycc_utf8 = const_cast<char*>(yycc_utf8); // Not safe. Also just for example.
|
||||
char* mutable_converted = YYCC::EncodingHelper::ToNative(mutable_yycc_utf8);
|
||||
char* mutable_converted = YYCC::EncodingHelper::ToOrdinary(mutable_yycc_utf8);
|
||||
\endcode
|
||||
|
||||
Same as YYCC::EncodingHelper::ToUTF8, YYCC::EncodingHelper::ToNative also has 2 overloads to handle const and mutable string pointer.
|
||||
Same as YYCC::EncodingHelper::ToUTF8, YYCC::EncodingHelper::ToOrdinary also has 2 overloads to handle constant and mutable string pointer.
|
||||
|
||||
\section library_encoding__utf8_container UTF8 String Container
|
||||
|
||||
String container usually means the standard library string container, such as \c std::string, \c std::wstring, \c std::u32string and etc.
|
||||
|
||||
In many personal project, programmer may use \c std::string everywhere because \c std::u8string may not be presented when writing peoject.
|
||||
How to do convertion between native string container and YYCC UTF8 string container?
|
||||
How to do convertion between ordinary string container and YYCC UTF8 string container?
|
||||
It is definitely illegal that directly do force convertion. Because they may have different class layout.
|
||||
Calm down and I will tell you how to do correct convertion.
|
||||
YYCC provides YYCC::EncodingHelper::ToUTF8 to convert native string container to YYCC UTF8 string container.
|
||||
YYCC provides YYCC::EncodingHelper::ToUTF8 to convert ordinary string container to YYCC UTF8 string container.
|
||||
There is an exmaple:
|
||||
|
||||
\code
|
||||
std::string native_string("I am UTF8");
|
||||
yycc_u8string yycc_string = YYCC::EncodingHelper::ToUTF8(native_string);
|
||||
std::string ordinary_string("I am UTF8");
|
||||
yycc_u8string yycc_string = YYCC::EncodingHelper::ToUTF8(ordinary_string);
|
||||
auto result = YYCC::EncodingHelper::UTF8ToUTF32(yycc_string);
|
||||
\endcode
|
||||
|
||||
@ -160,19 +177,19 @@ However, there is a implicit convertion from \c std::string to \c std::string_vi
|
||||
so you can directly pass a \c std::string instance to it.
|
||||
|
||||
String view will reduce unnecessary memory copy.
|
||||
If you just want to pass native string container to function, and this function accepts \c yycc_u8string_view as its argument,
|
||||
If you just want to pass ordinary string container to function, and this function accepts \c yycc_u8string_view as its argument,
|
||||
you can use alternative YYCC::EncodingHelper::ToUTF8View.
|
||||
|
||||
\code
|
||||
std::string native_string("I am UTF8");
|
||||
yycc_u8string_view yycc_string = YYCC::EncodingHelper::ToUTF8View(native_string);
|
||||
std::string ordinary_string("I am UTF8");
|
||||
yycc_u8string_view yycc_string = YYCC::EncodingHelper::ToUTF8View(ordinary_string);
|
||||
auto result = YYCC::EncodingHelper::UTF8ToUTF32(yycc_string);
|
||||
\endcode
|
||||
|
||||
Comparing with previous one, this example use less memory.
|
||||
The reduced memory is the content of \c yycc_string because string view is a view, not the copy of original string.
|
||||
|
||||
Same as UTF8 string pointer, we also have YYCC::EncodingHelper::ToNative and YYCC::EncodingHelper::ToNativeView do correspondant reverse convertion.
|
||||
Same as UTF8 string pointer, we also have YYCC::EncodingHelper::ToOrdinary and YYCC::EncodingHelper::ToOrdinaryView do correspondant reverse convertion.
|
||||
Try to do your own research and figure out how to use them.
|
||||
It's pretty easy.
|
||||
|
||||
|
Reference in New Issue
Block a user