Unicode

Subject:	Unicode
Summary:	Still no unicode support?
Messages:	5
Author:	Andre Polykanine A.K.A. Menelion Elensúlë
Date:	2014-08-04 14:13:46
Update:	2014-08-06 20:37:34

1. Unicode

Andre Polykanine A.K.A. Menelion Elensúlë - 2014-08-04 19:57:20

And still not a word about good native Unicode support, as far as I can see. The performance is great, but the lack of Unicode support drags down the language as a whole. Personally I like PHP, which is why this is doubly painful for me.

2. Re: Unicode

Manuel Lemos - 2014-08-04 20:08:39 - In reply to message 1 from Andre Polykanine A.K.A. Menelion Elensúlë

Nothing has been said about Unicode, at least not for PHP 7. Maybe for PHP 8 somebody brave will take on that problem again.

I remember Rasmus mentioning they may have a go at it in the future using a simpler library than ICU, but that is all I can remember.

3. Re: Unicode

Joeri Sebrechts - 2014-08-06 07:30:13 - In reply to message 1 from Andre Polykanine A.K.A. Menelion Elensúlë

To be fair, you don't need it. PHP basically has the same level of Unicode support as C and C++. The built-in strings are byte arrays, and you can use ICU (the intl extension), iconv or mbstring to treat them as Unicode in the places where you care that one character != one byte.
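A minimal sketch of the "one character != one byte" point, assuming the mbstring extension is loaded. The sample word and variable names are only illustrative. The UTF-8 bytes are written out as escapes so the result does not depend on how the file itself is saved.

```php
<?php
// Sketch only: assumes the mbstring extension is available.
$word = "na\xC3\xAFve"; // "naïve" as UTF-8: the "ï" occupies two bytes

$byteCount = strlen($word);             // counts bytes: 6
$charCount = mb_strlen($word, 'UTF-8'); // counts code points: 5

// Byte-oriented substr() can cut a character in half; mb_substr() cannot.
$byteSlice = substr($word, 0, 3);             // "na" plus a dangling lead byte
$charSlice = mb_substr($word, 0, 3, 'UTF-8'); // "naï"

echo $byteCount, " bytes, ", $charCount, " characters\n";
echo mb_check_encoding($byteSlice, 'UTF-8') ? "valid" : "invalid", " UTF-8 after substr()\n";
echo $charSlice, "\n";
```

The byte functions stay correct for anything that only moves or compares whole strings. The mb_* variants matter once you count, slice or pad by character.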

Admittedly, it is annoying that sort() can't actually sort UTF-8 properly on Windows machines, but with the Collator class in intl you now have a cross-platform sorting solution, so the gaps have been filled.
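As a rough illustration of the sorting difference, assuming the intl extension is loaded: the word list and the 'en_US' locale below are arbitrary choices for the example, not something from the posts above.

```php
<?php
// Sketch only: assumes the intl extension is available.
$words = ["zebra", "\xC3\xA9clair", "apple"]; // "éclair" written as UTF-8 bytes

// Plain sort() with SORT_STRING compares raw bytes, so the lead byte of
// "é" (0xC3) sorts after "z" (0x7A).
$byteSorted = $words;
sort($byteSorted, SORT_STRING);

// Collator applies the collation rules of the chosen locale instead.
$collated = $words;
$collator = new Collator('en_US'); // locale picked arbitrarily for this sketch
$collator->sort($collated);

print_r($byteSorted); // apple, zebra, éclair
print_r($collated);   // apple, éclair, zebra
```

Because the Collator ordering comes from ICU's own data rather than the operating system's locale support, it should give the same result on every platform that has intl installed.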

So, yeah, it's a bit awkward to work with Unicode, and you need to know what you're doing, but nothing is missing that you need to handle Unicode absolutely perfectly. See this presentation I made, which explains how to work with strings in PHP: http://sebrechts.net/slides/strings/

4. Re: Unicode

Andre Polykanine A.K.A. Menelion Elensúlë - 2014-08-06 20:12:37 - In reply to message 3 from Joeri Sebrechts

Of course I use mbstring. However, I believe it's slower than native Unicode support in the language core would be. Am I wrong?

5. Re: Unicode

Manuel Lemos - 2014-08-06 20:37:34 - In reply to message 4 from Andre Polykanine A.K.A. Menelion Elensúlë

I think manipulating text in any multi-byte encoding is slower than manipulating regular single-byte text.

If I am not mistaken, the original PHP 6 plans were using UTF-16 to manipulate all text strings.

This means that single-byte text would be slower to manipulate than what we have today. That could hurt PHP speed in general.
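The memory side of that trade-off can be sketched with mbstring, assuming the extension is loaded. This only measures the size of the encoded bytes. It says nothing about how the abandoned PHP 6 engine actually stored strings.

```php
<?php
// Sketch only: assumes the mbstring extension is available.
$ascii = "Hello, world"; // 12 ASCII characters

$utf8Bytes  = strlen($ascii);                                           // 12
$utf16Bytes = strlen(mb_convert_encoding($ascii, 'UTF-16LE', 'UTF-8')); // 24

echo "UTF-8: $utf8Bytes bytes, UTF-16: $utf16Bytes bytes\n";
```

Plain ASCII text doubles in size under UTF-16, so every copy, comparison or scan has twice as many bytes to touch.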

So I am afraid the transparent Unicode support that some developers desire comes at a price in speed, memory usage, or both.
