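I have been sending mail from Python for a while, and it is quite convenient. Receiving mail from Python is something I never got around to, though. Lately I have been busy with written tests and interviews, and I can't keep a browser open or keep checking my phone for new mail (especially notices about interviews or written tests). So the plan is to write a script that runs as a scheduled task and, as soon as a mail with a relevant subject arrives, raises a reminder using a GUI widget and a sound.
That sounds like a decent idea.
Receiving mail is supported by Python out of the box through the poplib module. It ships with the standard library, so nothing extra needs installing, and it is easy to use. Let's work through today's task step by step.
Because the mail is fetched by something other than the official client, the corresponding service has to be switched on first. I use a 163 mailbox; other providers work in much the same way. As follows:
Enable the POP/SMTP/IMAP protocols
Send a mail first
Send a mail to the target mailbox. Mine looks like this:
With that, the preparation is done.
A first try should naturally be simple. Here I start by fetching some commonly used information: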
# coding: utf8
import poplib
# Mailbox credentials
useraccount = 'your mailbox address'
password = 'your password (this is the authorization code, not the password you log in with from a client)'
# Mail server address. For a 163 mailbox this works; for QQ Mail it is pop.qq.com
pop3_server = 'pop.163.com'
# Connect to the server
server = poplib.POP3(pop3_server)
# Optional: switch debug output on or off; 1 prints the client/server exchange to the console
server.set_debuglevel(1)
# Optional: print the POP3 server's welcome text to confirm the connection is correct
print(server.getwelcome().decode('utf8'))
# Authenticate
server.user(useraccount)
server.pass_(password)
# stat() returns the number of mails and the space they occupy on the server (in bytes)
print("Mail counts: {0}, Storage Size: {1}".format(*server.stat()))
# list() returns the number and size of every mail, as bytes strings
resp, mails, octets = server.list()
print("Response: ", resp)
print("Summary of all mails: ", mails)
print("Size of the data returned by list (bytes): ", octets)
# Close the connection to the server and release resources
server.close()
And the result? Here it is:
+OK Welcome to coremail Mail Pop3 Server (163coms[726cd87d72d896a1ac393507346040fas])
*cmd* 'USER my mailbox account'
*cmd* 'PASS not showing you'
*cmd* 'STAT'
*stat* [b'+OK', b'9', b'52140']
Mail counts: 9, Storage Size: 52140
*cmd* 'LIST'
Response:  b'+OK 9 52140'
Summary of all mails:  [b'1 1595', b'2 1631', b'3 1568', b'4 26710', b'5 2851', b'6 6856', b'7 1494', b'8 6685', b'9 2750']
Size of the data returned by list (bytes):  73
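Each entry that list() returns is a bytes string of the form b'<number> <size>'. Here is a minimal sketch of turning those entries into integer pairs. The helper name parse_list_entries is my own and is not part of poplib:

```python
def parse_list_entries(mails):
    """Convert entries such as b'1 1595' into (message_number, size_in_bytes) tuples."""
    parsed = []
    for entry in mails:
        # Each entry is ASCII text: the message number, a space, then the size
        number, size = entry.decode('ascii').split()
        parsed.append((int(number), int(size)))
    return parsed


# Sample entries in the same shape as the output above
sample = [b'1 1595', b'2 1631', b'3 1568']
print(parse_list_entries(sample))
```

With the pairs in hand it is easy, for example, to skip very large mails before calling retr().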
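After that first try, talking to the mail server should feel familiar. Now let's actually fetch a mail.
To fetch one particular mail from the server, just pass its index.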
# coding: utf8
import poplib
from email.parser import Parser
import base64
def get_parsed_msg():
    # Mailbox credentials
    useraccount = 'my mailbox account'
    password = 'not telling you'
    # Mail server address
    pop3_server = 'pop.163.com'
    # Connect to the server
    server = poplib.POP3(pop3_server)
    # Optional: switch debug output on or off; 1 prints the client/server exchange to the console
    server.set_debuglevel(1)
    # Optional: print the POP3 server's welcome text to confirm the connection is correct
    print(server.getwelcome().decode('utf8'))
    # Authenticate
    server.user(useraccount)
    server.pass_(password)
    # list() returns the number and size of every mail, as bytes strings
    resp, mails, octets = server.list()
    print('Total number of mails: {}'.format(len(mails)))
    # Fetch only the newest mail
    total_mail_numbers = len(mails)
    # A larger index means a newer mail, so total_mail_numbers refers to the newest one
    response_status, mail_message_lines, octets = server.retr(total_mail_numbers)
    print('Retrieval status: {}'.format(response_status))
    print('Raw mail data:\n{}'.format(mail_message_lines))
    print('Size of this mail in bytes: {}'.format(octets))
    # Assumes the raw bytes can be decoded as GBK, which holds for this test mail
    msg_content = b'\r\n'.join(mail_message_lines).decode('gbk')
    # The raw data is hard to read, so parse it into a message object
    msg = Parser().parsestr(text=msg_content)
    print('Parsed mail:\n{}'.format(msg))
    # Close the connection to the server and release resources
    server.close()
    return msg
msg = get_parsed_msg()
print(msg)
The result is as follows:
+OK Welcome to coremail Mail Pop3 Server (163coms[726cd87d72d896a1ac393507346040fas])
*cmd* 'USER my mailbox account'
*cmd* 'PASS not telling you'
*cmd* 'LIST'
Total number of mails: 9
*cmd* 'RETR 9'
Retrieval status: b'+OK 2750 octets'
Raw mail data:
[b'Received: from smtpbg323.qq.com (unknown [14.17.32.33])', b'\tby mx38 (Coremail) with SMTP id WMCowEAJ7HMTgthYHvSICA--.9482S3;', b'\tMon, 27 Mar 2017 11:08:03 +0800 (CST)', b'DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qq.com; s=s201512;', b'\tt=1490584083; bh=5mkEI/McRebiKTOSeIfoEIueMTpC8wHHmBUHOC6EIeY=;', b'\th=From:To:Subject:Mime-Version:Content-Type:Content-Transfer-Encoding:Date:Message-ID;', b'\tb=DOHiu0sjQqdYNTqsgSnoUcWztwB0g1xTHHdTJXXShRp8R72USTGblJP6lRU02p2JR', b'\t U4oG22TWhv3IJ3Or9qd1cKqJ8W/3Ya1ih+L1BTfEXhbUE59v1HDA5GjpCc/Cg7aMgA', b'\t PdtmwvW6H45brmgj3P8/KFeOz2GVKTsdZqV8VK1Y=', b'X-QQ-FEAT: Gf8h89u9tNzDNu+6K07CGaVRAG8UpkukGtC6J/Do7Z8trKQlxG+/B3qJrb7U5', b'\tp1QE1l6aary3W8oy+/VAtrDPVFS54LQa27g7fce+ra/0dXGlXVZsqlieRerMeDF/AgwswQF', b'\twZxxr068ee9tfDe5jX7JccTWC1uZPlqzuks9BPjfYmmnzjayMEYch+msiLwNMwLOZba24mn', b'\tucXTswC0032crI2RaLmiBzCuAdeKmZa+L9J6aS9JUD6zihYObJ6l4P/ps97QHqGBEs4MP4c', b'\tFKzdVlvFxtRg3X', b'X-QQ-SSF: 00010000000000F000000000000000Z', b'X-HAS-ATTACH: no', b'X-QQ-BUSINESS-ORIGIN: 2', b'X-Originating-IP: 111.117.136.219', b'X-QQ-STYLE: ', b'X-QQ-mid: webmail585t1490584082t821567', b'From: "=?gb18030?B?ufnosQ==?=" <1064319632@qq.com>', b'To: "=?gb18030?B?c3BpZGVyc21hbGw=?=" <spidersmall@163.com>', b'Subject: Test for poplib in Python3', b'Mime-Version: 1.0', b'Content-Type: multipart/alternative;', b'\tboundary="----=_NextPart_58D88212_0AF3CB08_64DBF547"', b'Content-Transfer-Encoding: 8Bit', b'Date: Mon, 27 Mar 2017 11:08:02 +0800', b'X-Priority: 3', b'Message-ID: <tencent_00BB432438B2D5FF27AA917D@qq.com>', b'X-QQ-MIME: TCMime 1.0 by Tencent', b'X-Mailer: QQMail 2.x', b'X-QQ-Mailer: QQMail 2.x', b'X-QQ-SENDSIZE: 520', b'Feedback-ID: webmail:qq.com:bgweb:bgweb125', b'X-CM-TRANSID:WMCowEAJ7HMTgthYHvSICA--.9482S3', b'Authentication-Results: mx38; spf=pass smtp.mail=1064319632@qq.com; dk', b'\tim=pass header.i=@qq.com', b'X-Coremail-Antispam: 1Uf129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7v73', 
b'\tVFW2AGmfu7bjvjm3AaLaJ3UbIYCTnIWIevJa73UjIFyTuYvjxUDMUqUUUUU', b'', b'This is a multi-part message in MIME format.', b'', b'------=_NextPart_58D88212_0AF3CB08_64DBF547', b'Content-Type: text/plain;', b'\tcharset="gb18030"', b'Content-Transfer-Encoding: base64', b'', b'SGkgYnJvLg0KDQoNCiBUaGlzIGlzIGEgc2ltcGxlIHRleHQgZm9yIHRlc3RpbmcgcmVjZWl2', b'aW5nIG1haWwgaW4gUHl0aG9uMy4NCiAgICA8YSBocmVmPSdodHRwOi8vYmxvZy5jc2RuLm5l', b'dC9tYXJrc2lub2JlcmcnPk15IEJsb2cgU2l0ZS48L2E+', b'', b'------=_NextPart_58D88212_0AF3CB08_64DBF547', b'Content-Type: text/html;', b'\tcharset="gb18030"', b'Content-Transfer-Encoding: base64', b'', b'PGRpdj48ZGl2PkhpIGJyby48L2Rpdj48ZGl2Pjxicj48L2Rpdj48ZGl2PiZuYnNwO1RoaXMg', b'aXMgYSBzaW1wbGUgdGV4dCBmb3IgdGVzdGluZyByZWNlaXZpbmcgbWFpbCBpbiBQeXRob24z', b'LjwvZGl2PjxkaXY+Jm5ic3A7ICZuYnNwOyAmbHQ7YSBocmVmPSdodHRwOi8vYmxvZy5jc2Ru', b'Lm5ldC9tYXJrc2lub2JlcmcnJmd0O015IEJsb2cgU2l0ZS4mbHQ7L2EmZ3Q7PC9kaXY+PC9k', b'aXY+', b'', b'------=_NextPart_58D88212_0AF3CB08_64DBF547--', b'']
Size of this mail in bytes: 2750
Parsed mail:
Received: from smtpbg323.qq.com (unknown [14.17.32.33])
by mx38 (Coremail) with SMTP id WMCowEAJ7HMTgthYHvSICA--.9482S3;
Mon, 27 Mar 2017 11:08:03 +0800 (CST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=qq.com; s=s201512;
t=1490584083; bh=5mkEI/McRebiKTOSeIfoEIueMTpC8wHHmBUHOC6EIeY=;
h=From:To:Subject:Mime-Version:Content-Type:Content-Transfer-Encoding:Date:Message-ID;
b=DOHiu0sjQqdYNTqsgSnoUcWztwB0g1xTHHdTJXXShRp8R72USTGblJP6lRU02p2JR
U4oG22TWhv3IJ3Or9qd1cKqJ8W/3Ya1ih+L1BTfEXhbUE59v1HDA5GjpCc/Cg7aMgA
PdtmwvW6H45brmgj3P8/KFeOz2GVKTsdZqV8VK1Y=
X-QQ-FEAT: Gf8h89u9tNzDNu+6K07CGaVRAG8UpkukGtC6J/Do7Z8trKQlxG+/B3qJrb7U5
p1QE1l6aary3W8oy+/VAtrDPVFS54LQa27g7fce+ra/0dXGlXVZsqlieRerMeDF/AgwswQF
wZxxr068ee9tfDe5jX7JccTWC1uZPlqzuks9BPjfYmmnzjayMEYch+msiLwNMwLOZba24mn
ucXTswC0032crI2RaLmiBzCuAdeKmZa+L9J6aS9JUD6zihYObJ6l4P/ps97QHqGBEs4MP4c
FKzdVlvFxtRg3X
X-QQ-SSF: 00010000000000F000000000000000Z
X-HAS-ATTACH: no
X-QQ-BUSINESS-ORIGIN: 2
X-Originating-IP: 111.117.136.219
X-QQ-STYLE:
X-QQ-mid: webmail585t1490584082t821567
From: "=?gb18030?B?ufnosQ==?=" <1064319632@qq.com>
To: "=?gb18030?B?c3BpZGVyc21hbGw=?=" <spidersmall@163.com>
Subject: Test for poplib in Python3
Mime-Version: 1.0
Content-Type: multipart/alternative;
boundary="----=_NextPart_58D88212_0AF3CB08_64DBF547"
Content-Transfer-Encoding: 8Bit
Date: Mon, 27 Mar 2017 11:08:02 +0800
X-Priority: 3
Message-ID: <tencent_00BB432438B2D5FF27AA917D@qq.com>
X-QQ-MIME: TCMime 1.0 by Tencent
X-Mailer: QQMail 2.x
X-QQ-Mailer: QQMail 2.x
X-QQ-SENDSIZE: 520
Feedback-ID: webmail:qq.com:bgweb:bgweb125
X-CM-TRANSID: WMCowEAJ7HMTgthYHvSICA--.9482S3
Authentication-Results: mx38; spf=pass smtp.mail=1064319632@qq.com; dk
im=pass header.i=@qq.com
X-Coremail-Antispam: 1Uf129KBjDUn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7v73
VFW2AGmfu7bjvjm3AaLaJ3UbIYCTnIWIevJa73UjIFyTuYvjxUDMUqUUUUU
This is a multi-part message in MIME format.
------=_NextPart_58D88212_0AF3CB08_64DBF547
Content-Type: text/plain;
charset="gb18030"
Content-Transfer-Encoding: base64
SGkgYnJvLg0KDQoNCiBUaGlzIGlzIGEgc2ltcGxlIHRleHQgZm9yIHRlc3RpbmcgcmVjZWl2
aW5nIG1haWwgaW4gUHl0aG9uMy4NCiAgICA8YSBocmVmPSdodHRwOi8vYmxvZy5jc2RuLm5l
dC9tYXJrc2lub2JlcmcnPk15IEJsb2cgU2l0ZS48L2E+
------=_NextPart_58D88212_0AF3CB08_64DBF547
Content-Type: text/html;
charset="gb18030"
Content-Transfer-Encoding: base64
PGRpdj48ZGl2PkhpIGJyby48L2Rpdj48ZGl2Pjxicj48L2Rpdj48ZGl2PiZuYnNwO1RoaXMg
aXMgYSBzaW1wbGUgdGV4dCBmb3IgdGVzdGluZyByZWNlaXZpbmcgbWFpbCBpbiBQeXRob24z
LjwvZGl2PjxkaXY+Jm5ic3A7ICZuYnNwOyAmbHQ7YSBocmVmPSdodHRwOi8vYmxvZy5jc2Ru
Lm5ldC9tYXJrc2lub2JlcmcnJmd0O015IEJsb2cgU2l0ZS4mbHQ7L2EmZ3Q7PC9kaXY+PC9k
aXY+
------=_NextPart_58D88212_0AF3CB08_64DBF547--
(The final print(msg) at the end of the script prints the same message a second time. That repeat is identical to the output above, so it is left out here.)
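That gives us the real data. Comparing the two, the raw list of byte lines and the parsed message look very different.
Python anticipated this: the raw data is parsed into an email.message.Message object, so values can be looked up by header name. That is far friendlier than working with the raw encoded data.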
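The listing above decodes the raw bytes as GBK before parsing, which works for this test mail but may fail for mails in other encodings. As a hedged alternative, the standard library's email.message_from_bytes can parse the byte lines directly, so no charset has to be guessed up front. The function name parse_raw_lines and the sample lines are made up for illustration:

```python
from email import message_from_bytes


def parse_raw_lines(lines):
    """Join the byte lines returned by retr() and parse them into a Message object."""
    return message_from_bytes(b'\r\n'.join(lines))


# Made-up sample in the same shape as the second value returned by server.retr()
sample_lines = [
    b'From: sender@example.com',
    b'Subject: Test for poplib in Python3',
    b'',
    b'Hi bro.',
]
parsed = parse_raw_lines(sample_lines)
print(parsed['Subject'])
print(parsed.get_payload())
```

Header values are then read by name, exactly as with the Message object produced by Parser().parsestr().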
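Let's dig a little deeper. Now that we have the parsed message, how do we pull out the parts we want?
What do we actually need? Just the sender, the recipient, the subject and the body. The rest does not seem very important.
Look at the printed data again and you will find these lines:
From: "=?gb18030?B?ufnosQ==?=" <1064319632@qq.com>
To: "=?gb18030?B?c3BpZGVyc21hbGw=?=" <spidersmall@163.com>
Subject: Test for poplib in Python3
Compare them with the mail sent during the preparation stage. Look familiar?
On closer inspection, fields such as From have been encoded. Comparing the From and To fields shows that the format is
=?charset?B?base64-encoded string?=
The B stands for Base64 (the other option defined by RFC 2047 is Q, for quoted-printable). So the key pieces to extract are the charset and the Base64-encoded string, and the split method makes that easy. Since this code is needed several times, it is best wrapped in a function.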
def decode_base64(s, charset='utf8'):
    # s is the Base64 text (plain ASCII); charset is the encoding of the decoded bytes
    return str(base64.decodebytes(s.encode('ascii')), encoding=charset)
Getting the decoded result
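The split-based idea described above can be sketched as follows. This is only an illustration under the assumption that the input is a single =?charset?B?...?= token; the name decode_encoded_word is mine, and Q-encoded words are deliberately not handled:

```python
import base64


def decode_encoded_word(word):
    """Decode one '=?charset?B?text?=' token by splitting it on '?'."""
    # Splitting yields ['=', charset, encoding, text, '=']
    _, charset, encoding, text, _ = word.split('?')
    if encoding.upper() != 'B':
        raise ValueError('only Base64 (B) encoded words are handled here')
    return base64.b64decode(text).decode(charset)


print(decode_encoded_word('=?gb18030?B?c3BpZGVyc21hbGw=?='))
```

For real mail, the standard library's email.header.decode_header covers both B and Q encodings and headers that mix several encoded words, so it is probably the safer choice.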
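Good. Since that works, the To and Subject fields can be handled the same way. The body is more troublesome, so more on that later.
Here is the code first (not refactored yet, so bear with me).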
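def get_details(msg):
    # Dictionary holding the key information, to be returned
    details = {}
    # Sender details
    fromstr = msg.get('From')
    print(fromstr)
    from_nickname, from_account = get_mail_info(fromstr)
    print(from_nickname, from_account)
    # Recipient details
    tostr = msg.get('To')
    to_nickname, to_account = get_mail_info(tostr)
    print(to_account, to_nickname)
    # Subject, i.e. the title of the mail
    subject = msg.get('Subject')
    print(subject)
    # Store the values and return them (the key names are illustrative)
    details['from'] = (from_nickname, from_account)
    details['to'] = (to_nickname, to_account)
    details['subject'] = subject
    return details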
A quick look at the output:
"=?gb18030?B?ufnosQ==?=" <1064319632@qq.com>
郭璞 1064319632@qq.com
spidersmall@163.com spidersmall
Test for poplib in Python3
Good, that works. Everything we wanted has been extracted.
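The get_mail_info helper called in get_details is not shown in this post. One possible version, offered purely as an assumption about what it might look like, uses email.utils.parseaddr to separate the display name from the address and email.header.decode_header to decode the name:

```python
from email.header import decode_header
from email.utils import parseaddr


def get_mail_info(header_value):
    """Return (nickname, address) for a From/To header value (hypothetical helper)."""
    raw_name, address = parseaddr(header_value)
    pieces = []
    for text, charset in decode_header(raw_name):
        # decode_header yields bytes for encoded words and str for plain text
        if isinstance(text, bytes):
            text = text.decode(charset or 'ascii', errors='replace')
        pieces.append(text)
    return ''.join(pieces), address


print(get_mail_info('"=?gb18030?B?c3BpZGVyc21hbGw=?=" <spidersmall@163.com>'))
```

It returns the nickname first and the account second, matching the way get_details unpacks the result.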
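Now for the most important part: the body. It is a special case, so it needs to be handled piece by piece.
The mail received in the 163 mailbox is split into two parts (its Content-Type is multipart/alternative): one is the plain text and the other is the same content marked up as HTML. Either way, both are worth understanding.
Take another look at the parsed data from earlier.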
This is a multi-part message in MIME format.
------=_NextPart_58D88212_0AF3CB08_64DBF547
Content-Type: text/plain;
charset="gb18030"
Content-Transfer-Encoding: base64
SGkgYnJvLg0KDQoNCiBUaGlzIGlzIGEgc2ltcGxlIHRleHQgZm9yIHRlc3RpbmcgcmVjZWl2
aW5nIG1haWwgaW4gUHl0aG9uMy4NCiAgICA8YSBocmVmPSdodHRwOi8vYmxvZy5jc2RuLm5l
dC9tYXJrc2lub2JlcmcnPk15IEJsb2cgU2l0ZS48L2E+
------=_NextPart_58D88212_0AF3CB08_64DBF547
Content-Type: text/html;
charset="gb18030"
Content-Transfer-Encoding: base64
PGRpdj48ZGl2PkhpIGJyby48L2Rpdj48ZGl2Pjxicj48L2Rpdj48ZGl2PiZuYnNwO1RoaXMg
aXMgYSBzaW1wbGUgdGV4dCBmb3IgdGVzdGluZyByZWNlaXZpbmcgbWFpbCBpbiBQeXRob24z
LjwvZGl2PjxkaXY+Jm5ic3A7ICZuYnNwOyAmbHQ7YSBocmVmPSdodHRwOi8vYmxvZy5jc2Ru
Lm5ldC9tYXJrc2lub2JlcmcnJmd0O015IEJsb2cgU2l0ZS4mbHQ7L2EmZ3Q7PC9kaXY+PC9k
aXY+
------=_NextPart_58D88212_0AF3CB08_64DBF547--
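Notice the familiar Content-Type and charset fields? They describe the body.
The body is a Base64-encoded string. Base64 is a transfer encoding that lets arbitrary bytes travel through mail systems as plain ASCII text. It is not encryption and provides no security, so keep that in mind.
After consulting the official documentation, I found this fairly easy to handle: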
parts = msg.get_payload()
# print('8'*9, parts[0].as_string())
content_type = parts[0].get_content_type()
content_charset = parts[0].get_content_charset()
# For this mail, parts[0] is the plain-text part and parts[1] is the HTML part
content = parts[0].as_string().split('base64')[-1]
print('Content*********', decode_base64(content, content_charset))
content = parts[1].as_string().split('base64')[-1]
print('HTML Content:', decode_base64(content, content_charset))
And the printed result? Here it is:
Content********* Hi bro.
This is a simple text for testing receiving mail in Python3.
<a href='http://blog.csdn.net/marksinoberg'>My Blog Site.</a>
HTML Content: <div><div>Hi bro.</div><div><br></div><div> This is a simple text for testing receiving mail in Python3.</div><div> <a href='http://blog.csdn.net/marksinoberg'>My Blog Site.</a></div></div>
Compared with the mail we actually sent, everything checks out.
With that, fetching a simple text mail is done.
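Splitting as_string() on the word 'base64' works for this mail, but it depends on the body being Base64-encoded. A more general sketch, assuming only the standard email package, lets get_payload(decode=True) undo whatever transfer encoding each part declares. The function name extract_bodies and the demo message are illustrative:

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def extract_bodies(msg):
    """Return {content_type: decoded_text} for every text/* part of msg."""
    bodies = {}
    for part in msg.walk():
        if part.get_content_maintype() != 'text':
            continue  # skip the multipart container and non-text parts
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        charset = part.get_content_charset() or 'utf-8'
        bodies[part.get_content_type()] = payload.decode(charset, errors='replace')
    return bodies


# Build a small multipart/alternative message to try the function on
demo = MIMEMultipart('alternative')
demo.attach(MIMEText('Hi bro.', 'plain', 'utf-8'))
demo.attach(MIMEText('<div>Hi bro.</div>', 'html', 'utf-8'))
print(extract_bodies(demo))
```

The same function can be called on the Message object returned by get_parsed_msg().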
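As for more complex mails such as those with attachments, I have not tested them yet, but I did skim the official documentation.
It shouldn't be too hard. This is Python, after all.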
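For what it is worth, here is an untested-against-a-real-mailbox sketch of how attachments might be collected with the standard email package: walk every part and keep those that carry a filename. The function name list_attachments and the demo message are assumptions for illustration:

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def list_attachments(msg):
    """Return [(filename, raw_bytes)] for every part that carries a filename."""
    found = []
    for part in msg.walk():
        filename = part.get_filename()
        if filename:
            found.append((filename, part.get_payload(decode=True)))
    return found


# Build a demo mail with one text body and one attachment
demo_mail = MIMEMultipart()
demo_mail.attach(MIMEText('See the attached file.', 'plain', 'utf-8'))
attachment = MIMEApplication(b'hello attachment')
attachment.add_header('Content-Disposition', 'attachment', filename='note.txt')
demo_mail.attach(attachment)
print(list_attachments(demo_mail))
```

Filenames containing non-ASCII characters may arrive encoded and need an extra decoding step, which this sketch does not attempt.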
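Getting back on track: now that the details of a mail are available, checking the sender against a whitelist or blacklist becomes straightforward.
For the content, a Bayesian filter could pick out spam, and poplib's delete method (dele) could then remove it automatically.
Or count word frequencies in the body and do some simple statistics with nltk. And so on.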
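The whitelist/blacklist idea could look roughly like this. The function names and the shape of the inputs are my own assumptions; the only poplib call used is dele(), which marks a mail for deletion:

```python
def classify_sender(address, whitelist, blacklist):
    """Return 'allow', 'block' or 'unknown' for a sender address."""
    address = address.strip().lower()
    if address in whitelist:
        return 'allow'
    if address in blacklist:
        return 'block'
    return 'unknown'


def delete_blocked(server, numbered_senders, whitelist, blacklist):
    """Mark mails from blocked senders for deletion.

    server is an already logged-in poplib.POP3 object; numbered_senders is an
    iterable of (message_number, sender_address) pairs; both lists are
    expected to hold lowercase addresses.
    """
    deleted = []
    for number, sender in numbered_senders:
        if classify_sender(sender, whitelist, blacklist) == 'block':
            server.dele(number)  # only marks the mail; see the note below
            deleted.append(number)
    return deleted


print(classify_sender('Spam@Example.com', {'hr@example.com'}, {'spam@example.com'}))
```

Note that POP3 removes marked mails only when the session ends with quit(). The listings in this post end with close(), which drops the connection without committing deletions.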
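Finally, I could not get pywin32 installed on Python 3.6, so the GUI widget part is on hold for now.
One last thing: the use cases for receiving mail. It is not needed all that often; day to day, sending mail is far more common.
But when you can't keep checking for important mail, for example while waiting for interview or written-test notices, it is quite practical.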
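The notification check itself can be kept separate from the network code. A small sketch, with made-up function names and sample subjects, of deciding which newly seen mails deserve a reminder:

```python
def subject_matches(subject, keywords):
    """Return True when any keyword occurs in the subject, ignoring case."""
    lowered = (subject or '').lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def find_new_matches(subjects_by_number, seen_numbers, keywords):
    """Return the numbers of unseen mails whose subject matches a keyword."""
    return [number
            for number, subject in sorted(subjects_by_number.items())
            if number not in seen_numbers and subject_matches(subject, keywords)]


# Made-up subjects keyed by message number
subjects = {8: 'Weekly newsletter', 9: 'Interview invitation'}
print(find_new_matches(subjects, seen_numbers={8}, keywords=['interview', 'written test']))
```

A scheduled task would call this after each fetch and trigger the reminder for every number returned. POP3 message numbers are only valid within one session, so a long-running script would probably track the identifiers from server.uidl() instead.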
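If you would like the source code, leave your email address in the comments below the blog post, or find my contact details in the left-hand column of the blog and get in touch.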
Original article: http://blog.csdn.net/marksinoberg/article/details/66969620