What is Unicode?
Significant expansion of the original ASCII character set to cover many more scripts, symbols, and emoji
- As of February 12, 2024, Unicode supports 149,813 characters
In C, there is no native Unicode type or encoding, so Unicode text is stored as bytes in the underlying character data type (char).
Completely backwards compatible with ASCII: the first 128 code points (0 to 127) match ASCII exactly
UTF-8 Encoding
“8-bit Unicode Transformation Format”
Uses a variable-length character encoding: each code point takes 1, 2, 3, or 4 bytes