  Andre Polykanine A.K.A. Menelion Elensúlë - 2014-08-04 19:57:20  
And still not a word about good native Unicode support, as far as I can see. The performance is great, but the lack of Unicode support drags the level of the language as a whole down considerably. Personally I like PHP, which is why it's doubly painful for me. 
  
  Manuel Lemos - 2014-08-04 20:08:39 -  In reply to message 1 from Andre Polykanine A.K.A. Menelion Elensúlë 
Nothing has been said about Unicode, at least not for PHP 7. Maybe for PHP 8 somebody brave will face that problem again. 
 
I remember Rasmus mentioning they may have a go at it in the future using a simpler library than ICU, but that is all I can remember. 
  
  Joeri Sebrechts - 2014-08-06 07:30:13 -  In reply to message 1 from Andre Polykanine A.K.A. Menelion Elensúlë 
To be fair, you don't need it. PHP basically has the same level of Unicode support as C and C++. The built-in strings are byte arrays, and you can use ICU (the intl extension), iconv or mbstring to treat them as Unicode in the places where you care that one character != one byte. 
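
A minimal sketch of that byte/character distinction, assuming the mbstring extension is loaded. The sample string is made up for illustration and is written with byte escapes so it does not depend on the file's encoding:

```php
<?php
// "Elensúlë" as UTF-8 bytes: 8 characters, two of which take 2 bytes each.
$s = "Elens\xC3\xBAl\xC3\xAB";

// strlen() and substr() see the string as a plain byte array.
var_dump(strlen($s));                            // int(10)
var_dump(bin2hex(substr($s, 5, 1)));             // string(2) "c3" (half of "ú")

// The mb_* functions decode the bytes as UTF-8 and work on characters.
var_dump(mb_strlen($s, 'UTF-8'));                // int(8)
var_dump(bin2hex(mb_substr($s, 5, 1, 'UTF-8'))); // string(4) "c3ba" (all of "ú")
```

The byte-oriented functions are still fine wherever you only copy, concatenate or compare whole strings; the mb_* calls matter when you count or cut characters.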
 
Admittedly, it is annoying that sort() can't actually sort UTF-8 properly on Windows machines, but with the Collator class in intl you now have a cross-platform sorting solution, so the gaps have been filled. 
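
A rough sketch of the difference, assuming the intl extension is loaded. The word list and the 'en_US' locale are arbitrary choices for illustration, and sort() is shown here in plain byte-order mode:

```php
<?php
// UTF-8 sample words, written with byte escapes ("éclair", "Elensúlë").
$words = ["zebra", "\xC3\xA9clair", "apple", "Elens\xC3\xBAl\xC3\xAB"];

// Byte-wise comparison: uppercase before lowercase, accented letters last.
$byteOrder = $words;
sort($byteOrder, SORT_STRING);
print_r($byteOrder); // Elensúlë, apple, zebra, éclair

// Locale-aware comparison through ICU; sorts the array in place.
$collated = $words;
$collator = new Collator('en_US');
$collator->sort($collated);
print_r($collated);  // apple, éclair, Elensúlë, zebra
```

The Collator result depends on the locale you pass in, not on the operating system, which is what makes it portable.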
 
So, yeah, it's a bit awkward to work with Unicode, and you need to know what you're doing, but nothing is missing that you would need to handle Unicode absolutely perfectly. See this presentation I made, which explains how to work with strings in PHP: http://sebrechts.net/slides/strings/ 
 
  
  Andre Polykanine A.K.A. Menelion Elensúlë - 2014-08-06 20:12:37 -  In reply to message 3 from Joeri Sebrechts 
Of course I use mbstring. However, I believe it's slower than native Unicode support in the language core would be. Am I wrong? 
  
  Manuel Lemos - 2014-08-06 20:37:34 -  In reply to message 4 from Andre Polykanine A.K.A. Menelion Elensúlë 
I think manipulating text in any multi-byte encoding is slower than manipulating regular single-byte text. 
 
If I am not mistaken, the original PHP 6 plans were to use UTF-16 for all text strings. 
 
This means that single-byte text would be slower to manipulate than what we have today. That could hurt PHP speed in general. 
 
So I am afraid the transparent Unicode support that some developers desire comes at a price, in either speed or memory usage. 
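
To put a number on the memory side, here is a small sketch, assuming the mbstring extension is loaded. It only measures the storage size of one made-up ASCII string in two encodings; it says nothing about how PHP 6 would actually have stored strings internally:

```php
<?php
// Pure ASCII text: one byte per character in UTF-8.
$ascii = "PHP 7 performance";

// The same text re-encoded as UTF-16 (little endian, no byte order mark).
$utf16 = mb_convert_encoding($ascii, 'UTF-16LE', 'UTF-8');

var_dump(strlen($ascii)); // int(17)
var_dump(strlen($utf16)); // int(34), twice the bytes for the same text
```

For mostly-ASCII content such as markup and source code, that doubling applies to nearly every string.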
  