
[PHP Tutorial] 006. Fetching Web Page Content (URL Content)

Date: 2015-09-13 17:24:48


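Introduction: Fetching the content of a web page is the most basic building block of any web operation. In this lesson we go through several basic ways to request a page's content.

Almost every day we do the same thing: open a browser, type in a URL, and press Enter, and a blank page instantly fills with content. It might be a search page such as Baidu, or a portal site crammed with text and images.

This process can be understood from three angles: the browser, the server, and the protocol the browser and server use to communicate.

Of course, the web page request process itself is not today's topic.

This time, we look at how to request a page's content from PHP code.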


 

Methods for fetching web page content


1. file_get_contents + [Request method: GET]

<?php
$url  = 'http://do.org.cn';
$html = file_get_contents($url);
echo $html;
?>
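
file_get_contents returns false when the request fails, so the result is worth checking before it is echoed. A minimal sketch of that check follows; the helper name fetch_or_null and the 5-second timeout are assumptions made for illustration and do not come from the original article.

```php
<?php
// Hypothetical helper (not from the original article): read a URL or a local
// path and return null instead of false when the read fails.
function fetch_or_null($url, $timeoutSeconds = 5)
{
    // The 'http' options apply to http:// and https:// URLs only;
    // other stream wrappers simply ignore them.
    $context = stream_context_create(array(
        'http' => array('timeout' => $timeoutSeconds),
    ));
    // @ silences the warning; failure is reported through the return value.
    $body = @file_get_contents($url, false, $context);
    return $body === false ? null : $body;
}
```

It could be used as `$html = fetch_or_null('http://do.org.cn');` followed by a check for null before printing.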

 

2. file_get_contents + [Request method: POST]

If no cookies are needed, use the following:

<?php
$url    = 'http://do.org.cn/upload.php';
$data   = http_build_query(array('foo' => 'bar'));
$params = array(
    'http' => array(
        'method'  => 'POST',
        'content' => $data,
        'header'  =>
            "Content-type: application/x-www-form-urlencoded\r\n" .
            "Content-Length: " . strlen($data) . "\r\n"
    )
);
$context = stream_context_create($params);
// The second argument is $use_include_path and must be a boolean, not ''
$html    = @file_get_contents($url, false, $context);
?>

If cookies are needed, append a line like the following to the 'header' value in $params:

"Cookie: cookie1=c1; cookie2=c2\r\n";
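
When there are several cookies, the header line can be built from an array instead of written by hand. The sketch below is one way to do it; build_cookie_header is a hypothetical name, and URL-encoding the values is an assumption that holds only if the server decodes them (PHP's own $_COOKIE does).

```php
<?php
// Hypothetical helper: turn array('cookie1' => 'c1', ...) into one
// "Cookie: ..." header line terminated by \r\n.
function build_cookie_header(array $cookies)
{
    $pairs = array();
    foreach ($cookies as $name => $value) {
        // rawurlencode keeps ';', spaces and line breaks out of the value.
        $pairs[] = $name . '=' . rawurlencode($value);
    }
    return 'Cookie: ' . implode('; ', $pairs) . "\r\n";
}
```

The returned string can be concatenated onto the 'header' value shown above.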

 

3. fopen + [Request method: GET]

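<?php
$url = 'http://do.org.cn';
// Try to open the page
$fp = fopen($url, 'r');
if (!$fp) {
    die("Unable to open $url");
}
// Read the response header information
$header = stream_get_meta_data($fp);
$result = '';
while (!feof($fp)) {
    $result .= fgets($fp, 1024);
}
fclose($fp);
// Print the result; $header is an array, so dump it with print_r
echo "url header: " . print_r($header, true) . " <br/>";
echo "url body: $result";
?>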
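
For an http:// stream, the array returned by stream_get_meta_data holds the raw response header lines under the 'wrapper_data' key. The sketch below pulls the status code out of such a list; last_status_code is a hypothetical helper name, not something from the original article.

```php
<?php
// Hypothetical helper: return the status code of the last "HTTP/x.y NNN"
// line, or null if there is none. When redirects are followed, several
// status lines can appear and the last one belongs to the final response.
function last_status_code(array $headerLines)
{
    $status = null;
    foreach ($headerLines as $line) {
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}
```

With the code above it could be called as `last_status_code($header['wrapper_data'])`.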

 

4. fopen + [Request method: POST]

<?php
$data   = http_build_query(array('foo1' => 'bar1', 'foo2' => 'bar2'));
$params = array(
    'http' => array(
        'method'  => 'POST',
        'content' => $data,
        'header'  => "Content-type: application/x-www-form-urlencoded\r\nCookie: cook1=c3; cook2=c4\r\n" .
        "Content-Length: " . strlen($data) . "\r\n"
    )
);

$context = stream_context_create($params);
$fp      = fopen('http://do.org.cn/upload.php', 'rb', false, $context);
if (!$fp) {
    die('Unable to open the URL');
}
// fread($fp, 1024) returns at most 1024 bytes; read the whole response instead
$content = stream_get_contents($fp);
fclose($fp);
echo $content;
?>
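
Methods 2 and 4 build the same context options by hand. A small helper can keep the body and its Content-Length in sync; the sketch below is one possible shape, with build_post_options as a hypothetical name and the cookie passed as an already formatted string.

```php
<?php
// Hypothetical helper: build the options array for stream_context_create()
// for a form-encoded POST, optionally with a Cookie header.
function build_post_options(array $fields, $cookie = '')
{
    // Pass '&' explicitly so the result does not depend on php.ini settings.
    $body   = http_build_query($fields, '', '&');
    $header = "Content-Type: application/x-www-form-urlencoded\r\n"
            . "Content-Length: " . strlen($body) . "\r\n";
    if ($cookie !== '') {
        $header .= "Cookie: " . $cookie . "\r\n";
    }
    return array(
        'http' => array(
            'method'  => 'POST',
            'header'  => $header,
            'content' => $body,
        ),
    );
}
```

The result can be handed to stream_context_create and then to either file_get_contents or fopen.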

 

5. fsockopen + [Request method: GET]

Open the URL with the fsockopen function and request the complete data, header and body included, using GET.

<?php
function get_url($url, $cookie = false) {
    $url   = parse_url($url);
    // Array keys must be quoted; path, query and port may be missing
    $query = isset($url['path']) ? $url['path'] : '/';
    if (isset($url['query'])) {
        $query .= '?' . $url['query'];
    }
    $port = isset($url['port']) ? $url['port'] : 80;
    $fp = fsockopen($url['host'], $port, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    }
    // An HTTP/1.0 request asks for an unchunked body;
    // with HTTP/1.1 the body may arrive chunk-encoded
    $request  = "GET $query HTTP/1.0\r\n";
    $request .= "Host: {$url['host']}\r\n";
    $request .= "Connection: Close\r\n";
    if ($cookie) {
        $request .= "Cookie: $cookie\r\n";
    }
    $request .= "\r\n";
    fwrite($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}
// Get the HTML part of the URL, with the header stripped
function get_html($url, $cookie = false) {
    $data = get_url($url, $cookie);
    if ($data) {
        // The header ends at the first blank line
        $pos = strpos($data, "\r\n\r\n");
        return $pos === false ? false : substr($data, $pos + 4);
    }
    return false;
}
?>
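
Because fsockopen hands back the raw response, the header and body have to be separated by hand. The sketch below goes one step beyond stripping the header and also keeps the status line and header fields; split_http_response is a hypothetical helper, and it assumes the body is not chunk-encoded.

```php
<?php
// Hypothetical helper: split a raw HTTP response into status line,
// header fields and body. Returns null if no blank line is found.
function split_http_response($raw)
{
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        return null;
    }
    $lines  = explode("\r\n", substr($raw, 0, $pos));
    $status = array_shift($lines);
    $headers = array();
    foreach ($lines as $line) {
        // Split on the first ':' only, since values may contain colons.
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            // Header names are case-insensitive; a repeated header
            // overwrites the earlier value in this simple sketch.
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return array(
        'status'  => $status,
        'headers' => $headers,
        'body'    => (string) substr($raw, $pos + 4),
    );
}
```

It could be applied to the string returned by get_url.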

 

6. fsockopen + [Request method: POST]

Open the URL with the fsockopen function and request the complete data, header and body included, using POST.

<?php
function post_url($url, $data, $cookie, $referrer = "") {
    // parse the given URL
    $url_info = parse_url($url);

    // build the referrer: if none is given, fall back to a placeholder value
    if ($referrer == "") {
        $referrer = "111";
    }

    // build the URL-encoded body from $data
    $values = array();
    foreach ($data as $key => $value) {
        $values[] = "$key=" . urlencode($value);
    }
    $data_string = implode("&", $values);

    // find out which port is needed; if not given, use the standard one (80)
    if (!isset($url_info["port"])) {
        $url_info["port"] = 80;
    }
    // default to the root path when the URL has none
    if (!isset($url_info["path"])) {
        $url_info["path"] = "/";
    }

    // build the POST request; every header line must end with \r\n
    $request  = "POST " . $url_info["path"] . " HTTP/1.0\r\n";
    $request .= "Host: " . $url_info["host"] . "\r\n";
    $request .= "Referer: $referrer\r\n";
    $request .= "Content-type: application/x-www-form-urlencoded\r\n";
    $request .= "Content-length: " . strlen($data_string) . "\r\n";
    $request .= "Connection: close\r\n";
    $request .= "Cookie: $cookie\r\n";
    $request .= "\r\n";
    // no trailing newline: the body must be exactly Content-length bytes
    $request .= $data_string;

    $fp = fsockopen($url_info["host"], $url_info["port"]);
    if (!$fp) {
        return false;
    }
    fputs($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);

    return $result;
}
?>
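
Two details of a hand-written POST request are easy to get wrong: every header line ends with \r\n, and Content-Length must equal the byte length of the body. Keeping the request assembly separate from the socket code makes both easy to inspect. The sketch below shows the idea; build_post_request is a hypothetical name, and it leaves out the Referer and Cookie headers for brevity.

```php
<?php
// Hypothetical helper: assemble a raw HTTP/1.0 POST request as a string,
// without opening any connection.
function build_post_request($host, $path, array $fields)
{
    // Pass '&' explicitly so the result does not depend on php.ini settings.
    $body  = http_build_query($fields, '', '&');
    $lines = array(
        "POST $path HTTP/1.0",
        "Host: $host",
        "Content-Type: application/x-www-form-urlencoded",
        "Content-Length: " . strlen($body),
        "Connection: close",
    );
    // A blank line separates the header from the body.
    return implode("\r\n", $lines) . "\r\n\r\n" . $body;
}
```

The resulting string can be written to a socket opened with fsockopen, or simply printed to see exactly what would be sent.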

 

7. The curl library

Note: before using the curl library, check php.ini to make sure the curl extension is enabled.

<?php
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, 'http://do.org.cn/');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$content = curl_exec($ch);
curl_close($ch);

echo $content;
?>
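
curl_exec returns false on failure, and curl_error explains why. The sketch below sets the same options through curl_setopt_array and reports failures; the function names curl_options_for and fetch_with_curl, the overall 30-second CURLOPT_TIMEOUT, and the decision to follow redirects are all assumptions added for illustration.

```php
<?php
// Hypothetical helper: the option set used by fetch_with_curl() below.
function curl_options_for($url, $timeout = 5)
{
    return array(
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,      // return the body instead of printing it
        CURLOPT_CONNECTTIMEOUT => $timeout,  // seconds allowed for connecting
        CURLOPT_TIMEOUT        => 30,        // assumed cap for the whole transfer
        CURLOPT_FOLLOWLOCATION => true,      // assumed: follow redirects
    );
}

// Hypothetical helper: fetch a URL and describe the outcome in an array.
function fetch_with_curl($url, $timeout = 5)
{
    $ch = curl_init();
    curl_setopt_array($ch, curl_options_for($url, $timeout));
    $body = curl_exec($ch);
    if ($body === false) {
        // Transport-level failure (DNS, connection refused, timeout, ...).
        return array('ok' => false, 'status' => 0, 'error' => curl_error($ch), 'body' => null);
    }
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.
    return array('ok' => $status >= 200 && $status < 300, 'status' => $status, 'error' => '', 'body' => $body);
}
```

A caller could then check `$r['ok']` before echoing `$r['body']`.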

 

 

 


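This article is an original work by 宝宝巴士 SD.Team. When reposting, clearly credit the source (author's official website: 宝宝巴士).
Reposted from the 宝宝巴士 SuperDo team. Original link: http://www.cnblogs.com/superdo/p/4805187.html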

 

 
