码迷,mamicode.com

Fetching an HTML page with three functions: curl, fsockopen, and socket

Date: 2015-08-10 12:04:42

Tags: socket   curl   fsockopen   scraping   crawler

(1) php - curl 

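<?php
    // Fetch a page with the cURL extension.
    $ch_article = curl_init();
    $url        = 'www.baidu.com';
    curl_setopt($ch_article, CURLOPT_URL, $url);
    // Return the response as a string instead of printing it directly;
    // with 0 here, curl_exec() prints the page itself and returns true.
    curl_setopt($ch_article, CURLOPT_RETURNTRANSFER, 1);
    // Do not include the response headers in the result.
    curl_setopt($ch_article, CURLOPT_HEADER, 0);
    $article_output = curl_exec($ch_article);
    if ($article_output === false) {
        // curl_exec() returns false on failure.
        $article_output = 'curl error: ' . curl_error($ch_article);
    }
    curl_close($ch_article);
    echo $article_output;
?>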


(2) php - fsockopen

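<?php
// Open a TCP connection to port 80 with a 30-second connect timeout.
$fp = fsockopen("www.baidu.com", 80, $errno, $errstr, 30);
if (!$fp) {
    // fsockopen() returns false on failure and fills $errno / $errstr.
    die("$errstr ($errno)\n");
}
// Build a minimal HTTP/1.1 GET request by hand.
$out = "GET / HTTP/1.1\r\n";
$out .= "Host: www.baidu.com\r\n";
$out .= "Connection: Close\r\n\r\n";
fwrite($fp, $out);
// Read until the server closes the connection.
// The output includes the response headers as well as the HTML body.
while (!feof($fp)) {
    echo fgets($fp, 128);
}
fclose($fp);
?>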


(3) php - socket

<?php
// Requires the PHP sockets extension.
$host     = 'www.baidu.com';
$Port     = 80;
$host_ip  = gethostbyname($host);
// Build the request headers ($Header must be initialised before using .=).
$Header   = "Host: " . $host . "\r\n";
$Header  .= "Connection: Close\r\n";
$method   = 'GET';
$Request  = $method . " / HTTP/1.1\r\n";
$Request .= $Header;
$Request .= "\r\n";
$sockHttp = socket_create(AF_INET, SOCK_STREAM, SOL_TCP);
if ($sockHttp === false || !socket_connect($sockHttp, $host_ip, $Port)) {
    die(socket_strerror(socket_last_error()) . "\n");
}
socket_write($sockHttp, $Request, strlen($Request));
// Read until the server closes the connection.
// The result includes the response headers as well as the HTML body.
$Response = '';
while ($Read_data = socket_read($sockHttp, 4096)) {
    $Response .= $Read_data;
}
socket_close($sockHttp);
echo $Response;
?>
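
Unlike the curl version, the fsockopen and socket versions print the raw HTTP response, status line and headers included. The following is a minimal sketch of how that raw string could be split into headers and body. The helper name split_http_response and the sample response are hypothetical and not from the original post, and the sketch assumes the body is not sent with chunked transfer encoding.

```php
<?php
// Hypothetical helper (not part of the original post): split a raw HTTP
// response string into its status line, headers, and body.
function split_http_response(string $raw): array
{
    // The header block ends at the first blank line (CRLF CRLF).
    $parts = explode("\r\n\r\n", $raw, 2);
    $head  = $parts[0];
    $body  = $parts[1] ?? '';

    $lines      = explode("\r\n", $head);
    $statusLine = array_shift($lines);   // first line, e.g. "HTTP/1.1 200 OK"

    $headers = [];
    foreach ($lines as $line) {
        $pair = explode(':', $line, 2);
        if (count($pair) === 2) {
            // Header names are case-insensitive, so normalise to lower case.
            $headers[strtolower(trim($pair[0]))] = trim($pair[1]);
        }
    }

    return ['status' => $statusLine, 'headers' => $headers, 'body' => $body];
}

// Usage with a made-up sample response instead of a live connection.
$sample = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\nConnection: close\r\n\r\n<html></html>";
$result = split_http_response($sample);
echo $result['status'], "\n";
echo $result['headers']['content-type'], "\n";
echo $result['body'], "\n";
```

In the socket example, $Response could be passed to this helper in place of the final echo. A server that answers with Transfer-Encoding: chunked would need extra decoding that this sketch does not attempt.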


Copyright notice: This is an original article by the blog author and may not be reproduced without the author's permission.

Original article: http://blog.csdn.net/wab5168/article/details/47395655