The \p property in Perl regular expressions

Posted: 2015-08-06 16:53:10

While studying a Perl script today, I ran into several regular expressions that puzzled me:

    # ? or ! followed by a sentence starter
    $text =~ s/([?!]) +([\'\"\(\[\¿\¡\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # multiple dots followed by a sentence starter
    $text =~ s/(\.[\.]+) +([\'\"\(\[\¿\¡\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # add breaks for sentences that end with punctuation inside a quote or parenthetical and are followed by optional sentence-starter punctuation and an upper-case letter
    $text =~ s/([?!\.][\ ]*[\'\"\)\]\p{IsPf}]+) +([\'\"\(\[\¿\¡\p{IsPi}]*[\ ]*[\p{IsUpper}])/$1\n$2/g;

    # add breaks for sentences that end with punctuation and are followed by sentence-starter punctuation and an upper-case letter
    $text =~ s/([?!\.]) +([\'\"\(\[\¿\¡\p{IsPi}]+[\ ]*[\p{IsUpper}])/$1\n$2/g;
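To get a feel for what these substitutions do, here is a minimal sketch that applies simplified variants of two of them to a sample string. The helper name break_sentences and the sample text are made up for illustration, and the character classes are trimmed to ASCII openers plus \p{IsPi}, so this is not the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): applies simplified
# variants of two of the substitutions and returns the result.
sub break_sentences {
    my ($text) = @_;

    # ? or ! followed by spaces, optional opening punctuation, then an uppercase character
    $text =~ s/([?!]) +([\'\"\(\[\p{IsPi}]*[\p{IsUpper}])/$1\n$2/g;

    # sentence-ending punctuation closed by a quote or bracket, followed by
    # spaces, optional opening punctuation, then an uppercase character
    $text =~ s/([?!\.][\ ]*[\'\"\)\]\p{IsPf}]+) +([\'\"\(\[\p{IsPi}]*[\ ]*[\p{IsUpper}])/$1\n$2/g;

    return $text;
}

binmode(STDOUT, ':encoding(UTF-8)');   # allow printing of non-ASCII characters

# \x{201C} and \x{201D} are the left and right curly double quotes
my $sample = "Really? Yes! \x{201C}It works.\x{201D} \x{201C}Next one.\x{201D}";
print break_sentences($sample), "\n";
```

Each substitution replaces the run of spaces between two sentences with a newline, so the sample ends up with one sentence per line.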

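In these expressions, the name that follows \p denotes a Unicode property. Every Unicode code point carries a set of properties, and \p{...} matches any character that has the named property: \p{IsPi} matches initial (opening) quotation punctuation, \p{IsPf} matches final (closing) quotation punctuation, and \p{IsUpper} matches uppercase characters.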
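A small sketch can make the three properties concrete. It checks a handful of characters against \p{IsPi}, \p{IsPf}, and \p{IsUpper}. The helper name unicode_props and the choice of sample characters are assumptions made for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names of the three properties
# (among IsPi, IsPf, IsUpper) that a single character has.
sub unicode_props {
    my ($ch) = @_;
    my @props;
    push @props, 'IsPi'    if $ch =~ /\p{IsPi}/;     # Initial_Punctuation, e.g. opening quotes
    push @props, 'IsPf'    if $ch =~ /\p{IsPf}/;     # Final_Punctuation, e.g. closing quotes
    push @props, 'IsUpper' if $ch =~ /\p{IsUpper}/;  # uppercase characters
    return @props;
}

binmode(STDOUT, ':encoding(UTF-8)');   # allow printing of non-ASCII characters

# Sample characters, written as escapes so the file itself stays ASCII:
# curly double quotes, guillemets, A, E with acute accent, a, and (
my @samples = ("\x{201C}", "\x{201D}", "\x{AB}", "\x{BB}", "A", "\x{C9}", "a", "(");

for my $ch (@samples) {
    my @props = unicode_props($ch);
    printf "U+%04X %s: %s\n", ord($ch), $ch, (@props ? join(', ', @props) : 'none of the three');
}
```

Note that an ordinary opening parenthesis has none of the three properties. It belongs to the Open_Punctuation category (Ps), not Initial_Punctuation, which is why the original expressions list \( and \[ explicitly next to \p{IsPi}.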
For an introduction to Unicode properties, see:
http://shouce.jb51.net/perl/PatternMatching.html
http://blog.csdn.net/wushuai1346/article/details/7206749
http://perldoc.perl.org/perluniprops.html

Original article: http://blog.csdn.net/lampqiu/article/details/47317951
