
URL Deduplication

Date: 2014-06-28 21:53:30


import socket

# Maps each resolved IP address to the first host name that resolved to it.
dictlist = {}

def ReadHost():
    """Read host names from the input file, one per line."""
    hosts = []
    with open('d:/sss.txt', 'r') as obn:
        for line in obn:
            # strip() removes \r and \n as well as surrounding spaces
            line = line.strip()
            if line:
                hosts.append(line)
    return hosts

def SysDNS():
    """Resolve every host and keep one host name per distinct IP address."""
    hosts = ReadHost()

    for host in hosts:
        try:
            myaddrs = socket.getaddrinfo(host, None)
        except (socket.herror, socket.gaierror):
            # Skip hosts that cannot be resolved
            continue
        if not myaddrs:
            continue
        # Only the first returned address is used as the deduplication key
        addrs = myaddrs[0][4][0]
        if addrs not in dictlist:
            dictlist[addrs] = host

def showDict():
    """Write the deduplicated host names to the output file, one per line."""
    with open('d:/out.txt', 'w') as fw:
        for k, v in dictlist.items():
            fw.write(v + '\n')

if __name__ == "__main__":
    SysDNS()
    showDict()
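
The script above deduplicates hosts by the first IP address each one resolves to. The same idea can be sketched as a single function that takes the resolver as a parameter, which makes it possible to try the logic without touching the network. The names `dedupe_hosts_by_ip`, `fake_dns`, and `fake_resolve`, along with the `.example` host names and the 192.0.2.x addresses, are assumptions made for illustration and are not part of the original script.

```python
import socket


def dedupe_hosts_by_ip(hosts, resolve=None):
    """Return {ip: first host that resolved to it}; unresolvable hosts are skipped."""
    if resolve is None:
        # Default resolver: first address returned by the system resolver
        def resolve(host):
            return socket.getaddrinfo(host, None)[0][4][0]

    seen = {}
    for host in hosts:
        try:
            ip = resolve(host)
        except (socket.herror, socket.gaierror):
            continue
        # Keep only the first host seen for each IP address
        seen.setdefault(ip, host)
    return seen


# Stub resolver with made-up hosts and documentation-range addresses
fake_dns = {
    "a.example": "192.0.2.1",
    "b.example": "192.0.2.1",
    "c.example": "192.0.2.2",
}


def fake_resolve(host):
    try:
        return fake_dns[host]
    except KeyError:
        raise socket.gaierror("unknown host")


result = dedupe_hosts_by_ip(
    ["a.example", "b.example", "c.example", "missing.example"], fake_resolve
)
print(result)  # {'192.0.2.1': 'a.example', '192.0.2.2': 'c.example'}
```

Passing the resolver in keeps the deduplication rule (first host wins per IP) separate from the DNS lookup. Note that two hosts sharing one IP, for example behind a shared hosting server or a CDN, are treated as duplicates under this rule.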

  


Original article: http://www.cnblogs.com/xiaobaichuangtianxia/p/3794453.html
