
The Various Filters in HtmlParser (1)

Date: 2016-02-22 20:44:01


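Every filter implements the NodeFilter interface, which declares a single method, boolean accept(Node node), that decides whether a given node falls within the filter's scope. HtmlParser defines 16 filters in the org.htmlparser.filters package, and they can be grouped into a few categories.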

Predicate filters:

                  TagNameFilter

                  HasAttributeFilter

                  HasChildFilter

                  HasParentFilter

                  HasSiblingFilter

                  IsEqualFilter

Logical-operation filters:

                  AndFilter

                  NotFilter

                  OrFilter

                  XorFilter
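
The four logical filters wrap other filters and combine their accept() results. The sketch below illustrates that idea only: Node, NodeFilter, and the helper methods and(), or(), not(), xor() are hypothetical stand-ins defined here so the example runs without the HtmlParser jar. They are not the library's classes, and the library's own constructors may differ.

```java
import java.util.Arrays;
import java.util.List;

public class LogicFilterSketch {
    /** Hypothetical stand-in for a parsed node: a tag name plus an "id present" flag. */
    static final class Node {
        final String tag;
        final boolean hasId;

        Node(String tag, boolean hasId) {
            this.tag = tag;
            this.hasId = hasId;
        }
    }

    /** Stand-in with the same single-method shape as NodeFilter. */
    interface NodeFilter {
        boolean accept(Node node);
    }

    // Each combinator takes existing filters and returns a new filter.
    static NodeFilter and(NodeFilter a, NodeFilter b) {
        return n -> a.accept(n) && b.accept(n);
    }

    static NodeFilter or(NodeFilter a, NodeFilter b) {
        return n -> a.accept(n) || b.accept(n);
    }

    static NodeFilter not(NodeFilter a) {
        return n -> !a.accept(n);
    }

    static NodeFilter xor(NodeFilter a, NodeFilter b) {
        return n -> a.accept(n) ^ b.accept(n);
    }

    /** Counts the nodes that the filter accepts. */
    static int count(List<Node> nodes, NodeFilter filter) {
        int hits = 0;
        for (Node n : nodes) {
            if (filter.accept(n)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("DIV", true), new Node("DIV", false),
            new Node("SPAN", true), new Node("P", false));
        NodeFilter isDiv = n -> "DIV".equals(n.tag);
        NodeFilter hasId = n -> n.hasId;

        System.out.println("and: " + count(nodes, and(isDiv, hasId)));
        System.out.println("or:  " + count(nodes, or(isDiv, hasId)));
        System.out.println("not: " + count(nodes, not(isDiv)));
        System.out.println("xor: " + count(nodes, xor(isDiv, hasId)));
    }
}
```

Because every combinator is itself a filter, combinations nest freely, for example and(isDiv, not(hasId)).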

Other filters:

                 NodeClassFilter

                 StringFilter

                 LinkStringFilter

                 LinkRegexFilter

                 RegexFilter

                 CssSelectorNodeFilter

Beyond these, you can define custom filters to handle special filtering requirements.
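
A custom filter is simply another implementation of the single accept() method. The sketch below is a minimal illustration under stated assumptions: Node and NodeFilter are simplified stand-ins defined here (not the org.htmlparser types), and MinTextLengthFilter and extract() are hypothetical names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CustomFilterSketch {
    /** Hypothetical stand-in for a parsed node: a tag name and its text. */
    static final class Node {
        final String tagName;
        final String text;

        Node(String tagName, String text) {
            this.tagName = tagName;
            this.text = text;
        }
    }

    /** Stand-in with the same single-method shape as NodeFilter. */
    interface NodeFilter {
        boolean accept(Node node);
    }

    /** Custom filter (hypothetical): accepts nodes whose trimmed text has at least minLength characters. */
    static final class MinTextLengthFilter implements NodeFilter {
        private final int minLength;

        MinTextLengthFilter(int minLength) {
            this.minLength = minLength;
        }

        @Override
        public boolean accept(Node node) {
            return node.text != null && node.text.trim().length() >= minLength;
        }
    }

    /** Returns the nodes accepted by the filter, preserving their order. */
    static List<Node> extract(List<Node> nodes, NodeFilter filter) {
        List<Node> result = new ArrayList<>();
        for (Node n : nodes) {
            if (filter.accept(n)) {
                result.add(n);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
            new Node("P", "short"),
            new Node("P", "a much longer paragraph"),
            new Node("DIV", ""));
        for (Node n : extract(nodes, new MinTextLengthFilter(10))) {
            System.out.println(n.tagName + ": " + n.text);
        }
    }
}
```

With the real library, the same pattern would presumably apply: a class implementing org.htmlparser.NodeFilter can be passed wherever one of the built-in filters is accepted.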

 

Tag classes

  Mainly used together with NodeClassFilter.

         Remark: comment

         AppletTag:

         BaseHrefTag:

         BodyTag:"BODY"; // getBody() internally calls toPlainTextString()

         Bullet:"LI"

         BulletList:"UL","OL"

         CompositeTag:

         DefinitionList:"DL"

         DefinitionListBullet:"DD","DT"

         Div:"DIV"

         DoctypeTag:"!DOCTYPE"

         FormTag:

         FrameSetTag:

         FrameTag:

         HeadingTag:"H1","H2","H3","H4","H5","H6"

         HeadTag:"HEAD"

         Html:"HTML"

         ImageTag:

         InputTag:"INPUT"

         JspTag:"%","%=","%@"

         LabelTag:"LABEL"

        

         LinkTag:

         MetaTag:

         ObjectTag:

         OptionTag:

         ParagraphTag:"P"

         ProcessingInstructionTag:"?"

         ScriptTag:

         SelectTag:"SELECT"

         Span:"SPAN"

         StyleTag:"STYLE"

         TableColumn:"TD"

         TableHeader:"TH"

         TableRow:"TR"

         TableTag:"TABLE"

         TagNode:

         TextareaTag:"TEXTAREA"

         TitleTag:"TITLE"

         TextNode:
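
Pairing a tag class with NodeClassFilter amounts to selecting nodes by their Java type. The sketch below shows that idea with an instance-of style check. Every type in it is a simplified stand-in defined for this example (the real classes live in the HtmlParser jar and have a richer hierarchy), and the assumption that subclasses also match is mine, not something stated in this article.

```java
import java.util.Arrays;
import java.util.List;

public class ClassFilterSketch {
    // Simplified stand-in hierarchy; the real library's classes are not used here.
    static class Node {}
    static class TagNode extends Node {}
    static class Div extends TagNode {}
    static class TableTag extends TagNode {}
    static class TextNode extends Node {}

    /** Stand-in with the same single-method shape as NodeFilter. */
    interface NodeFilter {
        boolean accept(Node node);
    }

    /** Stand-in class filter: accepts nodes that are instances of the given class (assumed to include subclasses). */
    static final class NodeClassFilter implements NodeFilter {
        private final Class<?> matchClass;

        NodeClassFilter(Class<?> matchClass) {
            this.matchClass = matchClass;
        }

        @Override
        public boolean accept(Node node) {
            return matchClass.isAssignableFrom(node.getClass());
        }
    }

    /** Counts the nodes that the filter accepts. */
    static int count(List<Node> nodes, NodeFilter filter) {
        int hits = 0;
        for (Node n : nodes) {
            if (filter.accept(n)) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(new Div(), new TableTag(), new TextNode(), new Div());
        System.out.println("Div:     " + count(nodes, new NodeClassFilter(Div.class)));
        System.out.println("TagNode: " + count(nodes, new NodeClassFilter(TagNode.class)));
        System.out.println("Node:    " + count(nodes, new NodeClassFilter(Node.class)));
    }
}
```

Filtering by a base class such as TagNode selects every tag type at once, while a leaf class such as Div narrows the match to one kind of element.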

        

 

Original article: http://www.cnblogs.com/fighting-ayong/p/5208016.html
