
Scraping web page data into the static template file newslist.html (regular expressions)

Date: 2015-09-20 20:28:17


 

Data source to scrape: http://www.sgcc.com.cn/xwzx/gsyw/

 

        // Download the full HTML of the page at the given URL.
        // "type" is the name of the page's character encoding, e.g. "utf-8" or "gb2312".
        public static string GetUrltoHtml(string Url, string type)
        {
            try
            {
                System.Net.WebRequest wReq = System.Net.WebRequest.Create(Url);
                // Get the response; the using blocks release the connection and the stream.
                using (System.Net.WebResponse wResp = wReq.GetResponse())
                using (System.IO.Stream respStream = wResp.GetResponseStream())
                using (System.IO.StreamReader reader = new System.IO.StreamReader(respStream, System.Text.Encoding.GetEncoding(type)))
                {
                    return reader.ReadToEnd();
                }
            }
            catch (System.Exception)
            {
                // Any failure (network error, unknown encoding name) is swallowed;
                // the caller receives an empty string.
            }
            return "";
        }
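
The effect of the type argument can be checked without a network call. The sketch below repeats the decoding step of GetUrltoHtml on an in-memory stream; the class name EncodingDemo, the helper Decode, and the sample markup are hypothetical and not part of the original post.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDemo
{
    // Same decoding step as GetUrltoHtml, but reading from an in-memory stream.
    public static string Decode(byte[] raw, string type)
    {
        using (var stream = new MemoryStream(raw))
        using (var reader = new StreamReader(stream, Encoding.GetEncoding(type)))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Hypothetical page fragment, encoded as UTF-8 bytes.
        byte[] raw = Encoding.UTF8.GetBytes("<title>公司要闻</title>");

        // Encoding name matches the bytes: the text comes back intact.
        Console.WriteLine(Decode(raw, "utf-8"));
        // Wrong encoding name: the Chinese characters come back garbled.
        Console.WriteLine(Decode(raw, "iso-8859-1"));
    }
}
```

If the scraped text looks garbled, the encoding name passed as type most likely does not match the charset the page declares.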

  

        /// <summary>
        /// GetSubString: returns the text between two marker strings.
        /// </summary>
        /// <param name="strSource">Source string</param>
        /// <param name="strIndexOf">Start marker</param>
        /// <param name="strLastOf">End marker</param>
        /// <returns>The text between the markers, or an empty string if either marker is missing</returns>
        public static string GetSubString(string strSource, string strIndexOf, string strLastOf)
        {
            string strResult = string.Empty;
            int indexOf = strSource.IndexOf(strIndexOf);
            if (indexOf > -1)
            {
                string strTemp = strSource.Substring(indexOf + strIndexOf.Length);
                // Guard against a missing end marker: Substring(0, -1) would throw.
                int lastOf = strTemp.IndexOf(strLastOf);
                if (lastOf > -1)
                {
                    strResult = strTemp.Substring(0, lastOf);
                }
            }
            return strResult;
        }
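
The post shows only the two helpers above; the step named in the title (extracting items with a regular expression and filling newslist.html) is not included. The following is a minimal sketch of how that step might look, not the author's implementation. The class NewsListDemo, the method BuildItems, the link pattern, the sample HTML, and the {newslist} placeholder are all assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class NewsListDemo
{
    // Pull (href, title) pairs out of an HTML fragment and render them as <li> rows.
    public static string BuildItems(string listHtml)
    {
        var sb = new StringBuilder();
        // Assumed link markup: <a href="...">title</a>. A real page may need a different pattern.
        var linkPattern = new Regex(
            "<a\\s+href=\"(?<url>[^\"]+)\"[^>]*>(?<title>[^<]+)</a>",
            RegexOptions.IgnoreCase);
        foreach (Match m in linkPattern.Matches(listHtml))
        {
            sb.AppendLine("<li><a href=\"" + m.Groups["url"].Value + "\">"
                + m.Groups["title"].Value.Trim() + "</a></li>");
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Hypothetical stand-in for the HTML that GetUrltoHtml would return.
        string html = "<div class=\"list\">"
            + "<a href=\"a.shtml\">First story</a>"
            + "<a href=\"b.shtml\">Second story</a></div>";
        // Hypothetical template containing a {newslist} placeholder.
        string template = "<html><body><ul>\n{newslist}</ul></body></html>";

        // Fill the placeholder and write the static page.
        string page = template.Replace("{newslist}", BuildItems(html));
        File.WriteAllText("newslist.html", page, Encoding.UTF8);
        Console.WriteLine(page);
    }
}
```

In a full run, html would come from GetUrltoHtml, and GetSubString could first narrow it to the news list region so that the pattern does not pick up navigation links.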

  


Original article: http://www.cnblogs.com/500k/p/4824086.html
