UTF-8

UTF-8 is a character encoding standard. It is defined by the Unicode Standard. The acronym stands for Unicode Transformation Format 8-bit.

PHP handles the UTF-8 encoding natively: at input time, with the input_encoding directive; at output time, with the output_encoding and internally with the internal_encoding. All of them can use independant values, yet they all default to default_charset, which uses UTF-8 as default value.

For other manipulations, such as converting sources such as file or databases content to and from UTF-8, PHP relies on mbstring and iconv libraries.

Special functions, such as utf8_encode and utf8_decode() are deprecated since PHP 8.2, and should be be used.

Documentation