Scraping Data from a Web Page

Posted: 2014-06-27 07:51:35

Tags: fetching data from web pages, web crawler

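Here is an example that fetches data from a web page.

The example only downloads the page. It does no processing at all and simply saves the raw data to a txt file.

The example was written inside an Android project.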
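The core of the example is a line-by-line copy from a reader to a writer with nothing changed in between. A minimal sketch of just that step, runnable on a desktop JVM without Android or a network connection, might look like the following. The class `LineCopySketch` and the helper `copyLines` are hypothetical names used only for illustration, and the sketch writes a fixed `\n` after each line (instead of the platform line separator) so that the result is the same on every system.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;

public class LineCopySketch {

    // Hypothetical helper: copies every line from "in" to "out" unchanged
    // and returns the number of lines copied.
    public static int copyLines(Reader in, Writer out) throws IOException {
        BufferedReader reader = new BufferedReader(in);
        BufferedWriter writer = new BufferedWriter(out);
        int count = 0;
        String line;
        // readLine() returns null once the end of the input is reached.
        while ((line = reader.readLine()) != null) {
            writer.write(line);
            writer.write('\n'); // fixed separator, an assumption of this sketch
            count++;
        }
        // Flush once at the end so buffered text reaches the underlying writer.
        writer.flush();
        return count;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory string stands in for the downloaded page.
        StringWriter saved = new StringWriter();
        int n = copyLines(new StringReader("<html>\n<body>01 02 03</body>\n</html>"), saved);
        System.out.println(n + " lines copied");
        System.out.print(saved);
    }
}
```

The Android class below does the same thing, with the reader attached to an HTTP response and the writer attached to a file.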

package com.example.creepertest;


import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;


public class Controller
{
    // Page to download.
    public static final String SD_LOTTERY_URL =
            "http://www.sdticai.com/zjhfx/cpmain.asp?cptype=115";

    // Output file, hard-coded to the app's private data directory.
    public static final String SD_LOTTERY_FILE_PATH = "/data/data/com.example.creepertest/test.txt";

    private BufferedWriter mBufferedWriter = null;
    private BufferedReader mBufferedReader = null;

    // Android does not allow network access on the main thread,
    // so the download runs on a worker thread.
    public Controller() {
        Runnable runnable = new Runnable() {
            @Override
            public void run() {
                captureHtml();
            }
        };
        Thread thread = new Thread(runnable);
        thread.start();
    }

    // Downloads the page and appends its raw HTML, line by line, to the output file.
    private void captureHtml() {
        HttpURLConnection httpConn = null;
        try {
            URL sdLotterUrl = new URL(SD_LOTTERY_URL);
            httpConn = (HttpURLConnection) sdLotterUrl.openConnection();

            // The response is decoded as UTF-8; change this if the page uses another charset.
            InputStreamReader inputStreamReader =
                    new InputStreamReader(httpConn.getInputStream(), "utf-8");
            mBufferedReader = new BufferedReader(inputStreamReader);

            // The second argument (true) opens the file in append mode.
            OutputStream outputStream =
                    new FileOutputStream(SD_LOTTERY_FILE_PATH, true);
            OutputStreamWriter outputStreamWriter =
                    new OutputStreamWriter(outputStream, "utf-8");
            mBufferedWriter = new BufferedWriter(outputStreamWriter);

            // readLine() returns null at the end of the stream.
            String lineStr;
            while ((lineStr = mBufferedReader.readLine()) != null) {
                mBufferedWriter.write(lineStr);
                mBufferedWriter.newLine();
            }
            mBufferedWriter.flush();
        } catch (MalformedURLException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Close each resource separately so that one failure does not skip the other.
            try {
                if (mBufferedWriter != null)
                    mBufferedWriter.close();
            } catch (IOException exception) {
                exception.printStackTrace();
            }
            try {
                if (mBufferedReader != null)
                    mBufferedReader.close();
            } catch (IOException exception) {
                exception.printStackTrace();
            }
            if (httpConn != null)
                httpConn.disconnect();
        }
    }
}
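
The class above always decodes the response as UTF-8. If a server declares a different charset in its Content-Type header, the saved text would come out garbled. One possible way around this, shown here only as a sketch, is to read the charset from the header and fall back to a default when none is given. `CharsetSketch` and `charsetFromContentType` are hypothetical names; in the class above the header value could be obtained from `httpConn.getContentType()` and the result passed to the `InputStreamReader` constructor.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class CharsetSketch {

    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // value such as "text/html; charset=UTF-8". Returns "fallback" when the
    // header is missing, carries no charset, or names one the JVM does not support.
    public static String charsetFromContentType(String contentType, String fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim();
                // The value may be wrapped in double quotes.
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    if (Charset.isSupported(name)) {
                        return name;
                    }
                } catch (IllegalArgumentException e) {
                    // The name is not a legal charset name; use the fallback.
                }
                return fallback;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetFromContentType("text/html; charset=UTF-8", "ISO-8859-1"));
        System.out.println(charsetFromContentType("text/html", "ISO-8859-1"));
    }
}
```

This only covers the HTTP header. A page can also declare its charset in a meta tag inside the HTML, which this sketch does not look at.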

Original source: http://blog.csdn.net/u011882998/article/details/34878687
