As a PHP developer, you probably use strlen often to check the length of strings. However, strlen does not return the number of characters in a string; it returns the number of bytes. PHP treats a string as a sequence of bytes, so for text made up only of ASCII characters (code points 0 to 127), which UTF-8 encodes as one byte each, all seems well: the string length matches the number of bytes. There is nothing wrong with using strlen if the string contains only such characters, for example plain unaccented Latin text that you typed into your program yourself.
The problem with strlen appears when a string contains a character that takes more than one byte. strlen then returns a value greater than the number of characters, which can lead to bugs and general confusion. The solution is to use mb_strlen (from the mbstring extension), which counts characters according to the encoding you pass in or, if you pass none, the internal encoding, which is UTF-8 by default in current PHP versions. The snippet below shows the difference:
// strlen okay: Result is 4
echo strlen('Rose');
// strlen not okay: Result is 7
echo strlen('Michał');
// mb_strlen okay: Result is 6
echo mb_strlen('Michał');
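To see where the extra byte comes from, here is a minimal sketch that dumps the bytes of 'ł' and passes the encoding to mb_strlen explicitly. It assumes PHP 7 or later (for the "\u{...}" escape) and that the mbstring extension is loaded; the variable name is only for illustration.

```php
<?php
// "\u{0142}" is the code point escape for 'ł', so the string holds UTF-8 bytes
// regardless of how this file itself is saved.
$name = "Micha\u{0142}";

echo bin2hex("\u{0142}"), PHP_EOL;        // c582: 'ł' takes two bytes in UTF-8
echo strlen($name), PHP_EOL;              // 7: number of bytes
echo mb_strlen($name, 'UTF-8'), PHP_EOL;  // 6: number of characters
```

Passing 'UTF-8' explicitly means the result does not depend on the server's configured internal encoding.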
As a rule of thumb, I use mb_strlen when checking the length of character input from a user, especially input coming from a web browser. When I need the actual size of the string in bytes, I use strlen, for example when transferring strings or saving them in a database, or when it is a string I typed out myself.
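The rule of thumb above could be sketched roughly like this. The helper names withinCharLimit and withinByteLimit and the limit of 6 are hypothetical, chosen only for illustration; the sketch assumes the input is UTF-8 and that mbstring is available.

```php
<?php
// Hypothetical helpers illustrating the rule of thumb; names and limits are assumptions.

// Character limit for user-facing input (e.g. a name typed into a form).
function withinCharLimit(string $input, int $maxChars): bool
{
    return mb_strlen($input, 'UTF-8') <= $maxChars;
}

// Byte limit for storage or transfer, where the raw size is what matters.
function withinByteLimit(string $data, int $maxBytes): bool
{
    return strlen($data) <= $maxBytes;
}

$name = "Micha\u{0142}"; // 'Michał'

var_dump(withinCharLimit($name, 6)); // bool(true): 6 characters
var_dump(withinByteLimit($name, 6)); // bool(false): 7 bytes in UTF-8
```

The same string passes one check and fails the other, so it helps to decide up front whether a limit is meant in characters or in bytes.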
Next time you want to check the length of a string, in PHP or any other language, make sure you know which function is appropriate for the kind of characters the string contains.