The \p Property in Perl Regular Expressions

Date: 2015-08-06 16:53:10

Tags: perl, regex, \p

While studying a Perl script today, I came across several regular expressions that puzzled me:

    # Question or exclamation mark followed by a sentence starter
    $text =~ s/([?!]) +([\'\"\(\[\?\?\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # Multi-dots followed by sentence starters
    $text =~ s/(\.[\.]+) +([\'\"\(\[\?\?\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # Add breaks for sentences that end with some sort of punctuation inside a quote or parenthetical and are followed by a possible sentence starter punctuation and upper case
    $text =~ s/([?!\.][\ ]*[\'\"\)\]\p{IsPf}]+) +([\'\"\(\[\?\?\p{IsPi}]*[\ ]*[\p{IsUpper}])/$1\n$2/g;

    # Add breaks for sentences that end with some sort of punctuation and are followed by a sentence starter punctuation and upper case
    $text =~ s/([?!\.]) +([\'\"\(\[\?\?\p{IsPi}]+[\ ]*[\p{IsUpper}])/$1\n$2/g;
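
To see what these substitutions do, here is a minimal sketch that wraps them in a helper and runs them on short strings. The name split_sentences and the sample inputs are hypothetical and not part of the original script; the patterns are copied from the listing above, with the apostrophe written as a plain ASCII '.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): applies the four
# substitutions in order and returns the text with "\n" inserted at each
# sentence boundary they detect.
sub split_sentences {
    my ($text) = @_;

    # ? or ! followed by optional opening punctuation and an uppercase letter
    $text =~ s/([?!]) +([\'\"\(\[\?\?\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # Two or more dots followed by a sentence starter
    $text =~ s/(\.[\.]+) +([\'\"\(\[\?\?\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # End punctuation, then closing quote/bracket, then a sentence starter
    $text =~ s/([?!\.][\ ]*[\'\"\)\]\p{IsPf}]+) +([\'\"\(\[\?\?\p{IsPi}]*[\ ]*[\p{IsUpper}])/$1\n$2/g;

    # End punctuation, then at least one opening punctuation mark and an
    # uppercase letter
    $text =~ s/([?!\.]) +([\'\"\(\[\?\?\p{IsPi}]+[\ ]*[\p{IsUpper}])/$1\n$2/g;

    return $text;
}

binmode STDOUT, ':encoding(UTF-8)';

# \x{201C} and \x{201D} are the curly double quotes (opening and closing).
my @samples = (
    "Is it done? Yes it is.",
    "Wait... Then go.",
    "It ended. \x{201C}Why\x{201D} she asked.",
);

for my $s (@samples) {
    print split_sentences($s), "\n--\n";
}
```

Note that a single period followed directly by a capital letter (as in "Yes it is. Next one") is not split by these four rules alone: the third rule requires closing punctuation after the period, and the fourth requires at least one opening punctuation mark before the capital.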

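The letters after \p name a Unicode property. Every Unicode code point carries a set of properties (general category, case, script and so on), and \p{...} matches any character that has the named property, while \P{...} matches any character that lacks it. In the script above, \p{IsUpper} matches uppercase letters, \p{IsPi} matches initial (opening) quotation punctuation such as “ and «, and \p{IsPf} matches final (closing) quotation punctuation such as ” and ». The Is prefix is optional, so \p{Pi} is equivalent to \p{IsPi}.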
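
As a small illustration, the sketch below checks a few characters against the three properties used in the script. The helper name props_of and the choice of sample characters are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a comma-separated list of which of the three
# properties (IsUpper, IsPi, IsPf) the given single character has, or an
# empty string if it has none of them.
sub props_of {
    my ($ch) = @_;
    my @hits;
    push @hits, 'IsUpper' if $ch =~ /^\p{IsUpper}$/;
    push @hits, 'IsPi'    if $ch =~ /^\p{IsPi}$/;
    push @hits, 'IsPf'    if $ch =~ /^\p{IsPf}$/;
    return join ',', @hits;
}

# Characters are given as code point escapes so the file needs no "use utf8".
my @chars = (
    "A",          # U+0041 LATIN CAPITAL LETTER A
    "a",          # U+0061 LATIN SMALL LETTER A
    "\x{201C}",   # LEFT DOUBLE QUOTATION MARK
    "\x{201D}",   # RIGHT DOUBLE QUOTATION MARK
    "\x{AB}",     # LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
    "\x{BB}",     # RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
    '"',          # U+0022 QUOTATION MARK (ASCII)
);

for my $ch (@chars) {
    printf "U+%04X %s\n", ord($ch), props_of($ch) || '(none)';
}
```

The ASCII double quote belongs to neither IsPi nor IsPf (its general category is "other punctuation"), which is presumably why the script lists \' and \" explicitly alongside the properties in its character classes.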
For an introduction to Unicode properties, see:
http://shouce.jb51.net/perl/PatternMatching.html
http://blog.csdn.net/wushuai1346/article/details/7206749
http://perldoc.perl.org/perluniprops.html

Copyright notice: this is an original article by the blog author and may not be reproduced without the author's permission.

Original article: http://blog.csdn.net/lampqiu/article/details/47317951
