[PHP Tutorial] 006. Fetching Web Page Content (URL Content)

Date: 2015-09-13 17:24:48

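Introduction: Fetching the content of a web page is the most basic building block of any web automation. In this lesson we walk through several basic ways to request a page's content.

Almost every day we do the same thing: open a browser, type in an address, press Enter, and a blank page fills with content. It may be a search page such as Baidu, or a portal site crammed with text and images.

This process can be understood from three angles: the browser, the server, and the protocol the two use to communicate.

The page request process itself, however, is not today's topic.

This time we look at how to request web page content from PHP code.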

Methods for fetching web page content

1. file_get_contents + [request method: GET]

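<?php
$url  = 'http://do.org.cn';
// file_get_contents() returns false on failure, so check before printing.
$html = file_get_contents($url);
if ($html === false) {
    exit("Failed to fetch $url");
}
echo $html;
?>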
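
As a rough sketch of how the GET call above could be hardened, the helper below wraps file_get_contents with a timeout and reports failure through its return value. The function name fetch_page and the 5-second default are assumptions made for illustration and do not come from the original tutorial; 'timeout' is a standard option of PHP's http stream context.

```php
<?php
// Hypothetical helper: the name and default timeout are illustrative assumptions.
// Returns the response body as a string, or null if the request fails.
function fetch_page(string $url, float $timeout = 5.0): ?string
{
    // 'timeout' is the read timeout, in seconds, for http:// streams.
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'timeout' => $timeout],
    ]);
    // @ hides the warning; failure is signalled by the null return value.
    $html = @file_get_contents($url, false, $context);
    return $html === false ? null : $html;
}
```

A caller would then write something like fetch_page('http://do.org.cn') and compare the result against null before echoing it.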

2. file_get_contents + [request method: POST]

If no cookie handling is needed, use the following:

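<?php
$url    = 'http://do.org.cn/upload.php';
// Encode the form fields as an application/x-www-form-urlencoded body.
$data   = http_build_query(array('foo' => 'bar'));
$params = array(
    'http' => array(
        'method'  => 'POST',
        'content' => $data,
        'header'  =>
            "Content-Type: application/x-www-form-urlencoded\r\n" .
            "Content-Length: " . strlen($data) . "\r\n"
    )
);
$context = stream_context_create($params);
// The second argument is $use_include_path (a bool), so pass false, not ''.
// @ suppresses the warning; $html is false if the request fails.
$html    = @file_get_contents($url, false, $context);
?>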

If cookies are needed, append a line like the following to the 'header' entry in $params:

"Cookie: cookie1=c1; cookie2=c2\r\n"
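
To make the cookie line less error-prone to write by hand, one option is to build it from an array. The sketch below is only an illustration: build_cookie_header is a hypothetical name, and URL-encoding the values with rawurlencode is an assumption about what the target server expects.

```php
<?php
// Hypothetical helper: turns ['cookie1' => 'c1', ...] into a Cookie header line.
// Assumption: values are URL-encoded, which many servers (including PHP) expect.
function build_cookie_header(array $cookies): string
{
    $pairs = [];
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . rawurlencode((string) $value);
    }
    // Pairs are separated by "; " and the header line ends with CRLF.
    return 'Cookie: ' . implode('; ', $pairs) . "\r\n";
}

// Build POST context options that include the cookie line.
$data   = http_build_query(['foo' => 'bar']);
$params = [
    'http' => [
        'method'  => 'POST',
        'content' => $data,
        'header'  =>
            "Content-Type: application/x-www-form-urlencoded\r\n" .
            build_cookie_header(['cookie1' => 'c1', 'cookie2' => 'c2']),
    ],
];
$context = stream_context_create($params);
```

The resulting $context can be passed as the third argument of file_get_contents, exactly as in the cookie-free version.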

3. fopen + [request method: GET]

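<?php
$url = 'http://do.org.cn';
// Try to open the page
$fp = fopen($url, 'r');
if ($fp === false) {
    exit("Failed to open $url");
}
// Get the stream metadata; for http:// streams the response headers
// are listed under the 'wrapper_data' key
$meta   = stream_get_meta_data($fp);
$result = '';
while (!feof($fp)) {
    $result .= fgets($fp, 1024);
}
fclose($fp);
// Print the result ($meta is an array, so dump it with print_r)
echo "url header: " . print_r($meta['wrapper_data'], true) . " <br/>";
echo "url body: $result";
?>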
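
The header information returned by stream_get_meta_data is a list of raw header lines, so a small helper is handy for pulling out the HTTP status code. The following is a minimal sketch; http_status_from_headers is a hypothetical name, and it assumes the lines look like the 'wrapper_data' array of an http:// stream, where a redirect adds one status line per hop.

```php
<?php
// Hypothetical helper: returns the status code of the final response,
// or null if no status line is found among the header lines.
function http_status_from_headers(array $lines): ?int
{
    $status = null;
    foreach ($lines as $line) {
        // A status line looks like "HTTP/1.1 200 OK".
        if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1]; // keep the last one seen (after redirects)
        }
    }
    return $status;
}
```

With the fopen example, this could be called as http_status_from_headers($meta['wrapper_data']) if the metadata array is stored in a variable named $meta.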

4. fopen + [request method: POST]

<?php
$data   = http_build_query(array('foo1' => 'bar1', 'foo2' => 'bar2'));
$params = array(
    'http' => array(
        'method'  => 'POST',
        'content' => $data,
        'header'  =>
            "Content-Type: application/x-www-form-urlencoded\r\n" .
            "Cookie: cook1=c3; cook2=c4\r\n" .
            "Content-Length: " . strlen($data) . "\r\n"
    )
);

$context = stream_context_create($params);
$fp      = fopen('http://do.org.cn/upload.php', 'rb', false, $context);
if ($fp === false) {
    exit('Request failed');
}
// Read the whole response; a single fread($fp, 1024) would stop after
// at most 1024 bytes.
$content = stream_get_contents($fp);
fclose($fp);
echo $content;
?>

5. fsockopen + [request method: GET]

Open the URL with fsockopen and send a GET request to retrieve the complete response, including both header and body.

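<?php
// Fetch the raw HTTP response (headers + body) for $url, or false on failure.
function get_url($url, $cookie = false) {
    $url   = parse_url($url);
    // Array keys must be quoted; path and query may be absent from the URL.
    $query = isset($url['path']) ? $url['path'] : '/';
    if (isset($url['query'])) {
        $query .= '?' . $url['query'];
    }
    $port = isset($url['port']) ? $url['port'] : 80;
    $fp   = fsockopen($url['host'], $port, $errno, $errstr, 30);
    if (!$fp) {
        return false;
    }
    // Note: an HTTP/1.1 server may answer with chunked transfer encoding,
    // which this simple reader does not decode.
    $request  = "GET $query HTTP/1.1\r\n";
    $request .= "Host: {$url['host']}\r\n";
    $request .= "Connection: Close\r\n";
    if ($cookie) {
        $request .= "Cookie: $cookie\r\n";
    }
    $request .= "\r\n";
    fwrite($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);
    return $result;
}
// Get the HTML part of the URL, with the header stripped
function get_html($url, $cookie = false) {
    $data = get_url($url, $cookie);
    if ($data) {
        // The header ends at the first blank line (\r\n\r\n).
        $pos = strpos($data, "\r\n\r\n");
        return $pos === false ? false : substr($data, $pos + 4);
    }
    return false;
}
?>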
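
Because fsockopen hands back the header and body as one string, it can help to split the raw response into its parts. The sketch below is one possible way to do that; split_http_response is a hypothetical name, and the code assumes a well-formed response whose header section ends with a blank line (\r\n\r\n). It does not decode chunked bodies.

```php
<?php
// Hypothetical helper: splits a raw HTTP response into status line,
// headers and body. Returns null if no header/body separator is found.
function split_http_response(string $raw): ?array
{
    $pos = strpos($raw, "\r\n\r\n");
    if ($pos === false) {
        return null;
    }
    $lines  = explode("\r\n", substr($raw, 0, $pos));
    $status = array_shift($lines); // e.g. "HTTP/1.1 200 OK"
    $headers = [];
    foreach ($lines as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) === 2) {
            // Header names are case-insensitive, so store them lowercased.
            // Repeated headers (e.g. Set-Cookie) keep only the last value.
            $headers[strtolower(trim($parts[0]))] = trim($parts[1]);
        }
    }
    return [
        'status'  => $status,
        'headers' => $headers,
        'body'    => (string) substr($raw, $pos + 4),
    ];
}
```

Applied to the string returned by a raw socket request, the 'body' entry holds the HTML part with the header stripped.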

6. fsockopen + [request method: POST]

Open the URL with fsockopen and send a POST request to retrieve the complete response, including both header and body.

<?php
// Send $data (an array of form fields) as a POST request and return the
// raw response (headers + body), or false if the connection fails.
function post_url($url, $data, $cookie, $referrer = "") {
    // Parse the given URL
    $url_info = parse_url($url);

    // Build the referrer: if none is given, fall back to a placeholder value
    if ($referrer == "") {
        $referrer = "111";
    }

    // Build the urlencoded body string from $data
    $data_string = http_build_query($data);

    // Find out which port is needed; if not given, use the standard one (80)
    if (!isset($url_info["port"])) {
        $url_info["port"] = 80;
    }
    $path = isset($url_info["path"]) ? $url_info["path"] : "/";

    // Build the POST request; HTTP header lines must end with \r\n
    $request  = "POST $path HTTP/1.1\r\n";
    $request .= "Host: " . $url_info["host"] . "\r\n";
    $request .= "Referer: $referrer\r\n";
    $request .= "Content-Type: application/x-www-form-urlencoded\r\n";
    $request .= "Content-Length: " . strlen($data_string) . "\r\n";
    $request .= "Connection: close\r\n";
    if ($cookie) {
        $request .= "Cookie: $cookie\r\n";
    }
    $request .= "\r\n";
    $request .= $data_string;

    $fp = fsockopen($url_info["host"], $url_info["port"]);
    if (!$fp) {
        return false;
    }
    fputs($fp, $request);
    $result = '';
    while (!feof($fp)) {
        $result .= fgets($fp, 1024);
    }
    fclose($fp);

    return $result;
}
?>

7. The cURL library

Note: before using the cURL library, check that the curl extension is enabled in php.ini.

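<?php
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, 'http://do.org.cn/');
// Return the response as a string instead of printing it directly
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Give up if the connection is not established within $timeout seconds
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$content = curl_exec($ch);
if ($content === false) {
    // curl_exec() returns false on failure; curl_error() explains why
    $content = 'cURL error: ' . curl_error($ch);
}
curl_close($ch);

echo $content;
?>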
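
The cURL example above only covers GET. For symmetry with the POST variants shown for the other methods, a POST request with cURL might look like the sketch below. The function name curl_post, its parameters and the timeout values are assumptions chosen for illustration; the CURLOPT_* constants are standard options of the curl extension, which must be enabled for the function to run.

```php
<?php
// Hypothetical helper: POSTs $fields as a urlencoded form and returns
// the response body, or null if the transfer fails.
function curl_post(string $url, array $fields, string $cookie = ''): ?string
{
    $ch = curl_init();
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_POST           => true,
        // A string body is sent as application/x-www-form-urlencoded.
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_CONNECTTIMEOUT => 5,   // seconds to wait for the connection
        CURLOPT_TIMEOUT        => 10,  // seconds allowed for the whole transfer
    ];
    if ($cookie !== '') {
        // Cookie string in the form "name1=value1; name2=value2".
        $options[CURLOPT_COOKIE] = $cookie;
    }
    curl_setopt_array($ch, $options);
    $body = curl_exec($ch);
    return $body === false ? null : $body;
}
```

A call such as curl_post('http://do.org.cn/upload.php', ['foo' => 'bar'], 'cookie1=c1; cookie2=c2') would mirror the earlier POST examples.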

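Articles on this site are original works by 宝宝巴士 SD.Team. When reprinting, please credit the source prominently (author's official website: 宝宝巴士).
Reprinted from [宝宝巴士 SuperDo Team]. Original link: http://www.cnblogs.com/superdo/p/4805187.html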
