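PHP regular expressions do not officially support the \u escape; use \x instead. The principle is the same: each byte is written as a two-digit hexadecimal value. A typical Chinese character occupies three bytes in UTF-8 and two bytes in GB encodings (such as GB2312 or GBK).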
utf-8: [\x80-\xff]{3}
gb: [\x80-\xff]{2}
$str="哈噢哦呃哇哟"; //utf8
$str=preg_replace('/[\x80-\xff]{3}/','呵',$str);
Result: 呵呵呵呵呵呵
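The byte-level replacement above can be compared with a code-point match using the /u modifier. This is a minimal sketch, assuming the source file and the string are UTF-8 encoded; the variable names $byBytes and $byCodePoints are illustrative only.

```php
<?php
// Assumption: this file is saved as UTF-8, so each character below is 3 bytes.
$str = "哈噢哦呃哇哟";

// strlen() counts bytes, not characters: 6 characters x 3 bytes each.
$byteLength = strlen($str);

// Byte-oriented match (no /u): every run of three high bytes is one character.
$byBytes = preg_replace('/[\x80-\xff]{3}/', '呵', $str);

// Code-point match (/u): \x{4e00}-\x{9fa5} is the common CJK ideograph range.
$byCodePoints = preg_replace('/[\x{4e00}-\x{9fa5}]/u', '呵', $str);

echo $byteLength, "\n";
echo $byBytes, "\n";
echo $byCodePoints, "\n";
```

The byte-oriented form only works when every non-ASCII character in the input is exactly three bytes long; the /u form does not depend on the byte width.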
Extract the individual "words" from a mixed Chinese and English string; full-width spaces are handled.
$str='你是 我的小苹果 how are you? 弄啥呢 记得q我 q你妹 @陈sir';
preg_match_all('/([^ ]*?[\x{4e00}-\x{9fa5}]+[^ ]*)|(([0-9A-Za-z~@#\$%\^&\*\(\)\-_\+=\[\]{}:;\"\'\|\\\<>,\.\?\/]+ ?)+)/u',$str,$match);
// Result:
[0] => array(7) {
  [0] => string(6) "你是"
  [1] => string(15) "我的小苹果"
  [2] => string(13) "how are you? "
  [3] => string(9) "弄啥呢"
  [4] => string(10) "记得q我"
  [5] => string(7) "q你妹"
  [6] => string(7) "@陈sir"
}
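One way to make the full-width space handling explicit is to normalize it before matching. The following is only a sketch under stated assumptions: it requires PHP 7 or later, extractWords is a hypothetical helper name, and the ASCII class is simplified to [\x21-\x7e] (all printable non-space ASCII), which is slightly broader than the character list in the pattern above.

```php
<?php
// Hypothetical helper: split a mixed Chinese/English string into "words".
function extractWords(string $text): array
{
    // Assumption: treat the full-width space (U+3000) like an ASCII space.
    $normalized = str_replace("\u{3000}", ' ', $text);

    // First branch: a space-free token containing at least one CJK ideograph.
    // Second branch: a run of ASCII tokens separated by single spaces.
    $pattern = '/[^ ]*?[\x{4e00}-\x{9fa5}]+[^ ]*|(?:[\x21-\x7e]+ ?)+/u';
    preg_match_all($pattern, $normalized, $matches);

    // Drop the trailing space that the second branch may capture.
    return array_map('trim', $matches[0]);
}

$words = extractWords("你是\u{3000}我的小苹果 how are you? 弄啥呢");
print_r($words);
```

Normalizing first keeps the pattern itself free of invisible characters, at the cost of one extra pass over the string.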