
PHP: Removing Invalid UTF-8 Characters

Date: 2015-09-15 14:19:59


// Reject stray control characters, overlong 2-byte sequences, and malformed sequences,
// as well as characters at or above U+10000, and replace each with ?
$some_string = preg_replace('/[\x00-\x08\x10\x0B\x0C\x0E-\x19\x7F]'.
 '|[\x00-\x7F][\x80-\xBF]+'.
 '|([\xC0\xC1]|[\xF0-\xFF])[\x80-\xBF]*'.
 '|[\xC2-\xDF]((?![\x80-\xBF])|[\x80-\xBF]{2,})'.
 '|[\xE0-\xEF](([\x80-\xBF](?![\x80-\xBF]))|(?![\x80-\xBF]{2})|[\x80-\xBF]{3,})/S',
 '?', $some_string );

// Reject overlong 3-byte sequences and UTF-16 surrogates, and replace each with ?
$some_string = preg_replace('/\xE0[\x80-\x9F][\x80-\xBF]'.
 '|\xED[\xA0-\xBF][\x80-\xBF]/S', '?', $some_string );
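
The two passes above can be wrapped in a single function for reuse. The following is a minimal sketch, not a definitive implementation: the names `replace_invalid_utf8` and `is_valid_utf8` are hypothetical, and the `preg_match('//u', ...)` validity check is an addition that does not appear in the original snippet. Note that, as the first comment says, the first pass also replaces well-formed 4-byte characters (U+10000 and above) with `?`.

```php
<?php
// Hypothetical helper: applies the two preg_replace passes from the snippet above.
function replace_invalid_utf8(string $input): string
{
    // Pass 1: control characters, overlong 2-byte sequences, malformed
    // sequences, and any sequence starting with a byte in 0xF0-0xFF.
    $pass1 = '/[\x00-\x08\x10\x0B\x0C\x0E-\x19\x7F]'
        . '|[\x00-\x7F][\x80-\xBF]+'
        . '|([\xC0\xC1]|[\xF0-\xFF])[\x80-\xBF]*'
        . '|[\xC2-\xDF]((?![\x80-\xBF])|[\x80-\xBF]{2,})'
        . '|[\xE0-\xEF](([\x80-\xBF](?![\x80-\xBF]))|(?![\x80-\xBF]{2})|[\x80-\xBF]{3,})/S';

    // Pass 2: overlong 3-byte sequences and UTF-16 surrogates.
    $pass2 = '/\xE0[\x80-\x9F][\x80-\xBF]'
        . '|\xED[\xA0-\xBF][\x80-\xBF]/S';

    foreach ([$pass1, $pass2] as $pattern) {
        $result = preg_replace($pattern, '?', $input);
        if ($result === null) {
            // preg_replace returns null when the regex engine reports an error.
            throw new RuntimeException('preg_replace failed, error code ' . preg_last_error());
        }
        $input = $result;
    }
    return $input;
}

// Assumption (not in the original): with the /u modifier, preg_match
// returns false when the subject is not valid UTF-8.
function is_valid_utf8(string $s): bool
{
    return preg_match('//u', $s) === 1;
}

$dirty = "abc\xC0\xAFdef";                // contains an overlong 2-byte sequence
var_dump(is_valid_utf8($dirty));          // bool(false)
echo replace_invalid_utf8($dirty), "\n";  // abc?def
```

Because the patterns run in byte mode (no `/u` modifier), they can inspect malformed input that a UTF-8-aware pattern would refuse to process.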

 


Original source: http://www.cnblogs.com/chenshuo/p/4809950.html
