doc: update documentation

2024-06-28 11:38:19 +08:00
parent 73ef8af56c
commit 91ba0c22d6
6 changed files with 136 additions and 18 deletions

@ -32,7 +32,7 @@ That's unacceptable! So I create my own UTF8 type to avoid the scenario that sta
\section library_encoding_utf8_literal UTF8 Literal
The C++ standard allows programmers to declare a UTF8 literal explicitly by writing code like this:
The C++ standard allows programmers to declare an UTF8 literal explicitly by writing code like this:
\code
u8"foo bar"
@ -41,11 +41,10 @@ u8"foo bar"
This is okay, but it may be incompatible with the YYCC UTF8 char type.
According to the C++ standard, this UTF8 literal syntax yields \c const \c char8_t* only if your C++ standard is C++20 or later;
otherwise it yields \c const \c char*.
As a result, you cannot assign this UTF8 literal to yycc_u8string in an environment that does not support \c char8_t,
As a result, you cannot assign this UTF8 literal to \c yycc_u8string in an environment that does not support \c char8_t,
because their types are different.
Therefore you cannot use the functions provided by this library, because they all use the YYCC-defined UTF8 char type.
So I will tell you how to create a UTF8 literal in the following content of this section.
So I will tell you how to correctly create a UTF8 literal in the following content.
YYCC provides a macro \c YYCC_U8 to resolve this issue.
You can declare a UTF8 literal like this:
@ -58,11 +57,12 @@ You don't need add extra \c u8 prefix in string given to the macro.
This macro will do this automatically.
In detail, this macro performs a \c reinterpret_cast to forcibly change the type of the given argument to \c const \c yycc_char8_t*.
This ensures that the declared UTF8 literal is compatible with YYCC defined UTF8 types.
This ensures that the declared UTF8 literal is compatible with YYCC UTF8 types.
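
To make the idea concrete, here is a minimal sketch of how such a macro could work. It is not the real YYCC definition: \c demo_char8_t, \c DEMO_U8 and \c demo_u8_length are hypothetical names I made up for illustration, and I assume the library char type is \c char8_t when the compiler has it and \c unsigned \c char otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for the library's UTF8 char type
// (an assumption for this sketch, not YYCC's actual definition).
#if defined(__cpp_char8_t)
using demo_char8_t = char8_t;        // char8_t is available: use it directly
#else
using demo_char8_t = unsigned char;  // no char8_t: use another byte-sized type
#endif

// Hypothetical macro: paste the u8 prefix onto the given literal,
// then cast the resulting pointer to the library's char type.
#define DEMO_U8(strl) (reinterpret_cast<const demo_char8_t*>(u8 ## strl))

// The macro yields the same pointer type whether or not char8_t exists.
static_assert(std::is_same<decltype(DEMO_U8("foo bar")), const demo_char8_t*>::value,
              "DEMO_U8 should yield const demo_char8_t*");

// Byte length of a NUL-terminated string of demo_char8_t.
inline std::size_t demo_u8_length(const demo_char8_t* str) {
    assert(str != nullptr);
    return std::strlen(reinterpret_cast<const char*>(str));
}
```

With this sketch, <tt>demo_u8_length(DEMO_U8("foo bar"))</tt> compiles both before and after C++20, because the caller never names the type that the \c u8 prefix produces.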
\section library_encoding_utf8_pointer UTF8 String Pointer
Besides UTF8 literals, another issue you may face is how to convert a native UTF8 string pointer to the YYCC UTF8 type.
Besides UTF8 literals, another issue you may face is how to convert a native UTF8 string pointer to the YYCC UTF8 type
(\e native means \c const \c char* or \c char*, that is, a string using \c char as its char type).
Much legacy code assumes \c char* is encoded in UTF8 (the exception is Windows). But \c char* is incompatible with yycc_char8_t.
YYCC provides YYCC::EncodingHelper::ToUTF8 to resolve this issue. Here is an example:
@ -75,7 +75,7 @@ char* mutable_utf8 = const_cast<char*>(absolutely_is_utf8); // This is not safe.
yycc_char8_t* mutable_converted = YYCC::EncodingHelper::ToUTF8(mutable_utf8);
\endcode
YYCC::EncodingHelper::ToUTF8 has 2 overloads which handle const and non-const string pointer conversion respectively.
YYCC::EncodingHelper::ToUTF8 has 2 overloads which handle const and mutable string pointer conversion respectively.
YYCC also provides the ability to convert the YYCC UTF8 char type to the native char type via YYCC::EncodingHelper::ToNative.
Here is an example:
@ -88,12 +88,12 @@ yycc_char8_t* mutable_yycc_utf8 = const_cast<char*>(yycc_utf8); // Not safe. Als
char* mutable_converted = YYCC::EncodingHelper::ToNative(mutable_yycc_utf8);
\endcode
Like YYCC::EncodingHelper::ToUTF8, YYCC::EncodingHelper::ToNative also has 2 overloads to handle const and non-const string pointers.
Like YYCC::EncodingHelper::ToUTF8, YYCC::EncodingHelper::ToNative also has 2 overloads to handle const and mutable string pointers.
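
The const and mutable overload pairs described above can be sketched as follows. This is only an illustration under my own assumptions: the \c demo namespace and \c demo_char8_t are hypothetical stand-ins, and the real YYCC functions may be implemented differently. The sketch only shows why two overloads are enough to keep const-correctness in both directions.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the library's UTF8 char type (an assumption).
#if defined(__cpp_char8_t)
using demo_char8_t = char8_t;
#else
using demo_char8_t = unsigned char;
#endif

namespace demo {
    // const pointer in, const pointer out
    inline const demo_char8_t* ToUTF8(const char* src) {
        return reinterpret_cast<const demo_char8_t*>(src);
    }
    // mutable pointer in, mutable pointer out
    inline demo_char8_t* ToUTF8(char* src) {
        return reinterpret_cast<demo_char8_t*>(src);
    }
    // The reverse direction uses the same pair of overloads.
    inline const char* ToNative(const demo_char8_t* src) {
        return reinterpret_cast<const char*>(src);
    }
    inline char* ToNative(demo_char8_t* src) {
        return reinterpret_cast<char*>(src);
    }
}

// Round trip through both directions; no bytes are copied,
// so the pointer we get back is the one we started with.
inline void demo_round_trip() {
    char buffer[] = "abc";
    demo_char8_t* as_utf8 = demo::ToUTF8(buffer);  // picks the mutable overload
    char* back = demo::ToNative(as_utf8);
    assert(back == buffer);
}
```

Note that these functions only change the pointer type. They do not check that the bytes really are UTF8, so the caller is still responsible for that.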
\section library_encoding_utf8_container UTF8 String Container
The final issue you may face is the string container.
In many personal projects, you may use \c std::string everywhere because \c std::u8string may not have been available when you were writing your project.
In many personal projects, programmers may use \c std::string everywhere because \c std::u8string may not have been available when the project was written.
How do you convert between a native string container and a YYCC UTF8 string container?
Directly casting one to the other is definitely illegal, because they may have different class layouts.
@ -108,12 +108,12 @@ yycc_u8string yycc_string = YYCC::EncodingHelper::ToUTF8(native_string);
auto result = YYCC::EncodingHelper::UTF8ToUTF32(yycc_string);
\endcode
Actually, YYCC::EncodingHelper::ToUTF8 accepts one \c std::string_view as its argument.
Actually, YYCC::EncodingHelper::ToUTF8 accepts a reference to \c std::string_view as its argument.
However, there is an implicit conversion from \c std::string to \c std::string_view,
so you can directly pass a \c std::string instance to it.
Using a string view avoids unnecessary memory copies.
If you just want to pass a native string container to a function, and this function accepts yycc_u8string_view as its argument,
If you just want to pass a native string container to a function, and this function accepts \c yycc_u8string_view as its argument,
you can use the alternative YYCC::EncodingHelper::ToUTF8View.
\code
@ -126,6 +126,8 @@ Comparing with previous one, this example use less memory.
The memory saved is the content of \c yycc_string, because a string view is a view, not a copy of the original string.
As with UTF8 string pointers, YYCC::EncodingHelper::ToNative and YYCC::EncodingHelper::ToNativeView perform the corresponding reverse conversions.
Try to do your own research and figure out how to use them.
It's pretty easy.
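
As a starting point for that research, here is a rough sketch of what the four container conversions might look like. Everything in it is an assumption for illustration: the \c demo namespace, \c demo_char8_t, \c demo_u8string and \c demo_u8string_view are hypothetical stand-ins, the parameters are taken by value for simplicity, and the real YYCC signatures may differ. It needs C++17 for \c std::string_view.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-ins for the library's UTF8 types (assumptions).
#if defined(__cpp_char8_t)
using demo_char8_t = char8_t;
#else
using demo_char8_t = unsigned char;
#endif
using demo_u8string = std::basic_string<demo_char8_t>;
using demo_u8string_view = std::basic_string_view<demo_char8_t>;

namespace demo {
    // Copying conversion: builds a new container holding the same bytes.
    inline demo_u8string ToUTF8(std::string_view src) {
        demo_u8string out;
        out.reserve(src.size());
        for (char c : src) out.push_back(static_cast<demo_char8_t>(c));
        return out;
    }
    // Non-copying conversion: the result points into the caller's buffer,
    // so it must not outlive the original string.
    inline demo_u8string_view ToUTF8View(std::string_view src) {
        return demo_u8string_view(
            reinterpret_cast<const demo_char8_t*>(src.data()), src.size());
    }
    // Reverse conversions, mirroring the two functions above.
    inline std::string ToNative(demo_u8string_view src) {
        std::string out;
        out.reserve(src.size());
        for (demo_char8_t c : src) out.push_back(static_cast<char>(c));
        return out;
    }
    inline std::string_view ToNativeView(demo_u8string_view src) {
        return std::string_view(
            reinterpret_cast<const char*>(src.data()), src.size());
    }
}

// The view shares storage with the source string; the copy does not.
inline void demo_view_shares_storage() {
    std::string native_string = "hello";
    demo_u8string copied = demo::ToUTF8(native_string);
    demo_u8string_view viewed = demo::ToUTF8View(native_string);
    assert(static_cast<const void*>(viewed.data()) ==
           static_cast<const void*>(native_string.data()));
    assert(static_cast<const void*>(copied.data()) !=
           static_cast<const void*>(native_string.data()));
}
```

The view variants are the cheaper choice when the original string stays alive for the whole call. The copying variants are the safer choice when the result has to be stored.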
\section library_encoding_windows Warnings to Windows Programmer