A use case for multibyte-safe programming
You won't notice anything is wrong until a Norwegian guy called Øystein Øvretveit signs up and your website breaks. What could have happened?
Let me explain. In PHP, a technique to get a single character from a string is to treat the string as an array, or to use a function like substr(). Here is the array approach:
$string = 'This is a string';
$string[0];  // first character: T
$string[1];  // second: h
$string[-1]; // last: g (negative offsets need PHP 7.1 or later)
That works for plain ASCII text, but look at what happens with accented characters:
$string1 = 'Café';
$string2 = 'Österreich';
$string1[0];  // C
$string1[-1]; // �
$string2[0];  // �
$string2[-1]; // h
If these � characters end up in an array or string that is used as input for a JSON endpoint, they could break the complete REST service, because JSON encoders reject malformed UTF-8 instead of passing it through. In Drupal you could get a PHP error like this one:
Symfony\Component\Serializer\Exception\NotEncodableValueException: Malformed UTF-8 characters, possibly incorrectly encoded in Symfony\Component\Serializer\Encoder\JsonEncode->encode() (line 63 of /Users/dries/sites/🦄/vendor/symfony/serializer/Encoder/JsonEncode.php).
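The same failure can be reproduced without Drupal or Symfony. The following is a minimal sketch using plain json_encode(), assuming the file is saved as UTF-8 and the mbstring extension is available; the variable names are only illustrative.

```php
<?php
// Assumption: this file is saved as UTF-8, so 'Ø' is the two bytes 0xC3 0x98.
$name = 'Øystein';

// Byte-based access returns only the first byte of 'Ø',
// which is not valid UTF-8 on its own.
$broken = $name[0];

// Without extra flags, json_encode() returns false on malformed UTF-8.
$result = json_encode(['initial' => $broken]);
$error  = json_last_error_msg();
// $result is false
// $error is "Malformed UTF-8 characters, possibly incorrectly encoded"

// mb_check_encoding() can detect the problem before encoding.
$isValid = mb_check_encoding($broken, 'UTF-8'); // false

// The multibyte-safe variant yields a complete character that encodes fine.
$safe = mb_substr($name, 0, 1, 'UTF-8');
$json = json_encode(['initial' => $safe], JSON_UNESCAPED_UNICODE);
// $json is {"initial":"Ø"}
```

The error text matches the one in the exception above, which suggests the serializer is reporting the underlying json_encode() failure.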
Maybe it's the array technique; how about substr() to get single characters and substrings? Let's try:
substr('Österreich', 0, 100); // Österreich
substr('Österreich', 0, 10);  // Österreic
substr('Österreich', 0, 3);   // Ös
substr('Österreich', 0, 2);   // Ö
substr('Österreich', 0, 1);   // �
substr('Österreich', 0, 0);   // empty string
substr('Café', 0, 4);         // Caf�
substr('Café', 0, 5);         // Café
strlen('Café');               // 5
strlen('é');                  // 2
strlen('🤯');                 // 4
There is something strange going on. Characters with diacritics like 'é' and 'Ö' are not seen as a single character. Functions like substr() and strlen(), as well as array-style access, count bytes rather than characters, completely ignoring the encoding of the string. In UTF-8, 'é' and 'Ö' each take two bytes, so cutting between those two bytes leaves an invalid fragment behind. There are functions in PHP that fix this behavior, the so-called 'Multibyte String Functions':
mb_substr('Österreich', 0, 1); // Ö
mb_substr('Café', 0, 4);       // Café
mb_strlen('Café');             // 4
mb_strlen('é');                // 1
mb_strlen('👍');               // 1
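To see why the byte-based and multibyte functions disagree, it can help to look at the raw bytes. This is a small sketch, assuming the file is saved as UTF-8 and that 'é' is stored as the single precomposed code point U+00E9; mb_str_split() requires PHP 7.4 or later.

```php
<?php
// Assumption: UTF-8 source file, 'é' stored as the precomposed U+00E9.
$word = 'Café';

$bytes = strlen($word);             // 5: counts bytes
$chars = mb_strlen($word, 'UTF-8'); // 4: counts characters

// 'é' is stored as the two bytes 0xC3 0xA9.
$hex = bin2hex('é');                // "c3a9"

// Cutting after 4 bytes keeps only the first half of 'é'.
$cut = bin2hex(substr($word, 0, 4)); // "436166c3"

// mb_str_split() splits on characters instead of bytes.
$letters = mb_str_split($word, 1, 'UTF-8'); // ['C', 'a', 'f', 'é']
```

The dangling c3 byte at the end of the cut string is what shows up as � on screen.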
So, referring back to the first paragraph: what could have happened? Maybe there was a block on the homepage populated by an API, a block containing all the recently registered users. As a nice bonus, the initials were displayed for every user without an avatar. The function taking the first character of each part of the full name, ØØ in this case, returned �� instead, which threw an exception in JsonEncode.php and set off a cascade of errors that crashed the homepage.
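A multibyte-safe version of such an initials function could look like the sketch below. The helper name initials() and its exact behavior are hypothetical, not taken from any real site; it assumes UTF-8 input and the mbstring extension.

```php
<?php
/**
 * Hypothetical helper: returns the upper-cased initials of a full name.
 * Assumes $fullName is valid UTF-8.
 */
function initials(string $fullName): string
{
    // Split on runs of whitespace; the /u flag makes the pattern UTF-8 aware.
    $parts = preg_split('/\s+/u', trim($fullName), -1, PREG_SPLIT_NO_EMPTY);
    if ($parts === false) {
        // preg_split() fails on invalid UTF-8 input; return nothing rather than garbage.
        return '';
    }

    $initials = '';
    foreach ($parts as $part) {
        // mb_substr() counts characters, so 'Ø' is returned whole.
        $initials .= mb_strtoupper(mb_substr($part, 0, 1, 'UTF-8'), 'UTF-8');
    }
    return $initials;
}

$safe   = initials('Øystein Øvretveit');   // "ØØ"
$unsafe = 'Øystein'[0] . 'Øvretveit'[0];   // two stray 0xC3 bytes, invalid UTF-8
```

Only the byte-based $unsafe value would trip up a JSON encoder; the mb_ version passes through without trouble.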
It's good practice to use these 'mb_' functions whenever you manipulate strings, especially when they come from user input. Diacritics are rare in the English language, but not non-existent. Words like the aforementioned café, and names like Chloë or Renée, could break your website. For most other languages it's even more important to be aware of this potential problem.