Unicode

Unicode Character Table

Useful characters

UTF-8, UTF-16, and UTF-32?

Stick to UTF-8 and these three character sets - coding policy for software development

UTF-8 has an advantage when ASCII characters make up the majority of a block of text, because it encodes each of them in a single byte (like ASCII). It is also advantageous in that a UTF-8 file containing only ASCII characters is byte-for-byte identical to the same file encoded as ASCII.
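The ASCII compatibility can be checked in a few lines. This is a minimal sketch in Python; the sample string and variable names are only illustrative.

```python
# An ASCII-only string, chosen arbitrarily for the illustration.
text = "Hello, world"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# The two encodings produce exactly the same bytes for ASCII-only text.
same_bytes = ascii_bytes == utf8_bytes

# One byte per character: byte length equals character count.
one_byte_each = len(utf8_bytes) == len(text)

print(same_bytes, one_byte_each)
```

Any tool that only understands ASCII can therefore read such a file unchanged.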

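UTF-16 is better where ASCII is not predominant, since it uses 2 bytes for most characters (everything in the Basic Multilingual Plane). UTF-8 needs 3 bytes for code points from U+0800 to U+FFFF, which covers most CJK text, where UTF-16 stays at 2 bytes. Characters outside the Basic Multilingual Plane, such as emoji, take 4 bytes in both encodings.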
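To see the size trade-off concretely, here is a small sketch that compares encoded lengths for a few sample strings. The samples and the helper name `encoded_sizes` are made up for the illustration; `utf-16-le` is used so that no byte order mark is added to the count.

```python
def encoded_sizes(s: str) -> dict:
    """Return the encoded length in bytes of s for UTF-8 and UTF-16."""
    return {
        "utf-8": len(s.encode("utf-8")),
        # The "-le" variant fixes the byte order and omits the 2-byte BOM.
        "utf-16": len(s.encode("utf-16-le")),
    }

# Illustrative samples: 5 characters each, except the single emoji.
samples = {
    "ascii": "hello",
    "greek": "αβγδε",
    "japanese": "こんにちは",
    "emoji": "😀",
}

for name, s in samples.items():
    print(name, encoded_sizes(s))
```

ASCII text is half the size in UTF-8, Japanese text is a third smaller in UTF-16, and the emoji costs 4 bytes either way.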

UTF-32 encodes every code point in exactly 4 bytes. This makes it pretty bloated. Apart from the fixed width, which makes indexing by code point trivial, I can’t think of any advantage to using it.
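The fixed 4-byte width is easy to verify. The sketch below also shows the one thing that width buys: the n-th code point sits at a known byte offset. The helper names `utf32_size` and `code_point_at` are hypothetical, and `utf-32-le` is used so that no byte order mark is included.

```python
def utf32_size(s: str) -> int:
    """Return the size in bytes of s encoded as UTF-32 (no BOM)."""
    return len(s.encode("utf-32-le"))

def code_point_at(data: bytes, index: int) -> str:
    """Return the code point at position index in UTF-32-LE data."""
    # Every code point occupies exactly 4 bytes, so the offset is 4 * index.
    return data[4 * index : 4 * index + 4].decode("utf-32-le")

# Illustrative string mixing 1-, 3- and 4-byte UTF-8 characters.
data = "a€😀".encode("utf-32-le")

print(utf32_size("hello"))     # 5 code points, 4 bytes each
print(code_point_at(data, 2))  # third code point, read without scanning
```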

see also

Written on August 13, 2020, Last update on March 30, 2024
ascii utf8 binary text encoding online software