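Preface: HTML parsing comes up fairly often in development. If the backend only returns HTML, or the app has to pull data out of a web front end's HTML, you need an HTML parsing tool.
For an overview of HTML parsing libraries, see the article "收集几个Objective-C的HTML解析库" (A Collection of Objective-C HTML Parsing Libraries). While studying the open-source project Coding, I found that it uses the hpple parser, so I pulled hpple out to study it on its own and wrote up this summary.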
Create a plain new project, then integrate the hpple library with CocoaPods (pod).
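The original post does not show its Podfile, so the following is only a minimal sketch of what the integration step might look like. The target name `hppleTest` is taken from the project path used later in this post; the platform version is an assumption and should be adjusted to your own deployment target.

```ruby
# Podfile (sketch; the platform version is an assumption)
platform :ios, '8.0'

target 'hppleTest' do
  # hpple wraps libxml2 and exposes XPath queries over HTML/XML
  pod 'hpple'
end
```

After running `pod install`, open the generated `.xcworkspace` rather than the `.xcodeproj`, so that the hpple headers are visible to the app target.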
Next, I parse the index.html file.
The main code:
    #import "TFHpple.h"

    - (void)touchesBegan:(NSSet<UITouch *> *)touches withEvent:(UIEvent *)event {
        // Read the raw HTML bytes from disk (an absolute path on the author's machine).
        NSData *data = [NSData dataWithContentsOfFile:@"/Users/HeYang/Documents/Xcode/object-c/第三方库hpple的学习/hppleTest/hppleTest/index.html"];
        NSLog(@"数据:%@", data); // "data:"
        TFHpple *doc = [[TFHpple alloc] initWithHTMLData:data];

        // Select every <a> element in the document.
        NSArray *elements = [doc searchWithXPathQuery:@"//a"];

        NSLog(@"a节点个数:%lu", (unsigned long)elements.count); // "number of <a> nodes:"
        for (NSUInteger i = 0; i < elements.count; i++) {
            TFHppleElement *e = [elements objectAtIndex:i];
            NSLog(@"1:%@", [e text]);                        // The text inside the HTML element (the content of the first text node)
            NSLog(@"2:%@", [e tagName]);                     // "a"
            NSLog(@"3:%@", [e attributes]);                  // NSDictionary of href, class, id, etc.
            NSLog(@"4:%@", [e objectForKey:@"href"]);        // Easy access to a single attribute
            NSLog(@"5:%@", [e firstChildWithTagName:@"img"]); // The first "img" child node
            NSLog(@"===========解析完毕一次===========");      // "finished parsing one element"
        }
    }
The printed output is:
2016-07-10 23:36:00.251 hppleTest[4478:107019] 数据:<3c703ee4 b88be58d 88e4ba94 e782b9e5 a49ae5b0 b1e588b0 e8bebee4 ba86e79b aee79a84 e59cb0e7 a8bbe59f 8ee58ebf e59f8ee5 a283e586 85efbc8c e4bd8fe5 9ca8e4ba 86e4b880 e4b8aae5 85b7e69c 89e5be88 e5a5bde5 9cb0e790 86e4bd8d e7bdaee7 9a84e985 92e5ba97 efbc8ce4 bd86e698 afe6b2a1 e69c89e7 bd91e7bb 9ce6b2a1 e69c89e7 83ade6b0 b4efbc8c e68891e5 a5bde583 8fe9ab98 e58f8de4 ba86efbc 8ce4b880 e5a4a9e5 9d90e8bd a6efbc8c e79c8be5 88b0e4ba 86e8aeb8 e5a49ae7 9a84e7be 8ee4b8bd e699afe8 89b2efbc 8ce5a5bd e6bc82e4 baaee5a5 bde6bc82 e4baaeef bc8ce698 8ee5a4a9 e699afe5 8cbae79a 84e69c80 e4bd8ee6 b5b7e68b 94e59b9b e58d83e4 b8a4e799 bee5a49a e7b1b3ef bc8ce5b8 8ce69c9b e4b88de8 a681e9ab 98e58f8d e4ba86e2 80a6e280 a60a2020 20203c61 20687265 663d2268 74747073 3a2f2f64 6e2d636f 64696e67 2d6e6574 2d70726f 64756374 696f6e2d 70702e71 626f782e 6d652f62 65663363 3537392d 64633964 2d343633 642d6235 34662d35 31646133 33393163 6665622e 6a706722 20746172 6765743d 225f626c 616e6b22 20636c61 73733d22 62756262 6c652d6d 61726b64 6f776e2d 696d6167 652d6c69 6e6b2220 72656c3d 226e6f66 6f6c6c6f 77223e0a 20202020 20202020 3c696d67 20737263 3d226874 7470733a 2f2f646e 2d636f64 696e672d 6e65742d 70726f64 75637469 6f6e2d70 702e7162 6f782e6d 652f6265 66336335 37392d64 6339642d 34363364 2d623534 662d3531 64613333 39316366 65622e6a 70672220 616c743d 22222063 6c617373 3d226275 62626c65 2d6d6172 6b646f77 6e2d696d 61676522 3e0a2020 20203c2f 613e0a20 2020203c 61206872 65663d22 68747470 733a2f2f 646e2d63 6f64696e 672d6e65 742d7072 6f647563 74696f6e 2d70702e 71626f78 2e6d652f 30353330 39633035 2d636263 642d3466 63612d38 3632372d 39313433 62323762 39323266 2e6a7067 22207461 72676574 3d225f62 6c616e6b 2220636c 6173733d 22627562 626c652d 6d61726b 646f776e 2d696d61 67652d6c 696e6b22 2072656c 3d226e6f 666f6c6c 6f77223e 0a202020 203c696d 67207372 633d2268 74747073 3a2f2f64 6e2d636f 64696e67 2d6e6574 2d70726f 64756374 696f6e2d 70702e71 626f782e 6d652f30 35333039 6330352d 63626364 
2d346663 612d3836 32372d39 31343362 32376239 3232662e 6a706722 20616c74 3d222220 636c6173 733d2262 7562626c 652d6d61 726b646f 776e2d69 6d616765 223e0a20 2020203c 2f613e0a 3c2f703e>
2016-07-10 23:36:00.255 hppleTest[4478:107019] a节点个数:2
2016-07-10 23:36:00.256 hppleTest[4478:107019] 1:
2016-07-10 23:36:00.256 hppleTest[4478:107019] 2:a
2016-07-10 23:36:00.258 hppleTest[4478:107019] 5:{
    nodeAttributeArray = (
        {
            attributeName = src;
            nodeContent = "https://dn-coding-net-production-pp.qbox.me/bef3c579-dc9d-463d-b54f-51da3391cfeb.jpg";
        },
        {
            attributeName = alt;
            nodeContent = "";
        },
        {
            attributeName = class;
            nodeContent = "bubble-markdown-image";
        }
    );
    nodeName = img;
    raw = "<img src=\"https://dn-coding-net-production-pp.qbox.me/bef3c579-dc9d-463d-b54f-51da3391cfeb.jpg\" alt=\"\" class=\"bubble-markdown-image\"/>";
}
2016-07-10 23:36:00.258 hppleTest[4478:107019] ===========解析完毕一次===========
2016-07-10 23:36:00.258 hppleTest[4478:107019] 1:
2016-07-10 23:36:00.259 hppleTest[4478:107019] 2:a
2016-07-10 23:36:00.259 hppleTest[4478:107019] 3:{
    class = "bubble-markdown-image-link";
    href = "https://dn-coding-net-production-pp.qbox.me/05309c05-cbcd-4fca-8627-9143b27b922f.jpg";
    rel = nofollow;
    target = "_blank";
}
2016-07-10 23:36:00.259 hppleTest[4478:107019] 4:https://dn-coding-net-production-pp.qbox.me/05309c05-cbcd-4fca-8627-9143b27b922f.jpg
2016-07-10 23:36:00.260 hppleTest[4478:107019] 5:{
    nodeAttributeArray = (
        {
            attributeName = src;
            nodeContent = "https://dn-coding-net-production-pp.qbox.me/05309c05-cbcd-4fca-8627-9143b27b922f.jpg";
        },
        {
            attributeName = alt;
            nodeContent = "";
        },
        {
            attributeName = class;
            nodeContent = "bubble-markdown-image";
        }
    );
    nodeName = img;
    raw = "<img src=\"https://dn-coding-net-production-pp.qbox.me/05309c05-cbcd-4fca-8627-9143b27b922f.jpg\" alt=\"\" class=\"bubble-markdown-image\"/>";
}
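The code in this post reads index.html from an absolute path on the author's Mac, which only resolves when running in the simulator on that machine. As a hedged variation, the sketch below assumes index.html has been added to the app target's bundle resources, loads it through `NSBundle`, and collects the `src` of the first `<img>` inside each `<a>` (the output above shows that each link wraps one image). The helper name `HYImageURLsInBundledHTML` is hypothetical and not part of hpple or of the original project.

```objc
#import <Foundation/Foundation.h>
#import "TFHpple.h"

// Hypothetical helper: returns the src attribute of the first <img> child
// of every <a> element found in the bundled index.html.
static NSArray<NSString *> *HYImageURLsInBundledHTML(void) {
    // Assumption: index.html is listed under "Copy Bundle Resources".
    NSString *path = [[NSBundle mainBundle] pathForResource:@"index" ofType:@"html"];
    NSData *data = path ? [NSData dataWithContentsOfFile:path] : nil;
    if (data == nil) {
        NSLog(@"index.html was not found in the app bundle");
        return @[];
    }

    TFHpple *doc = [TFHpple hppleWithHTMLData:data];
    NSMutableArray<NSString *> *urls = [NSMutableArray array];

    for (TFHppleElement *link in [doc searchWithXPathQuery:@"//a"]) {
        // firstChildWithTagName: returns nil when there is no such child,
        // and messaging nil yields nil, so src is simply skipped below.
        TFHppleElement *img = [link firstChildWithTagName:@"img"];
        NSString *src = [img objectForKey:@"src"];
        if (src != nil) {
            [urls addObject:src];
        }
    }
    return urls;
}
```

Calling `HYImageURLsInBundledHTML()` from `touchesBegan:withEvent:` and logging the result would be one way to try it. Checking `data` for nil before parsing avoids handing hpple an empty document when the file is missing.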
Project source code: http://pan.baidu.com/s/1eSK58eY (password: g4ic)
Original article: http://www.cnblogs.com/goodboy-heyang/p/5658936.html