Automatic find-and-replace in Word documents with Python (including headers and footers)

Date: 2019-11-26 19:42:13


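The often-recommended python-docx library did not work well for me: when I used it, some of the text was not read at all. It is entirely possible to do without the package.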

A .docx file is itself a ZIP archive. Change the extension to .zip and you can unpack it and look at the contents:

[Image: contents of the unpacked archive]
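The same listing can be produced without renaming or unpacking anything. A minimal sketch, assuming only the standard-library zipfile module; the helper name list_docx_parts and the file name "report.docx" are made up for illustration:

```python
import zipfile

def list_docx_parts(path):
    """Return the names of all files stored inside a .docx archive."""
    # A .docx opens like any other ZIP file; no renaming is required.
    with zipfile.ZipFile(path) as archive:
        return archive.namelist()

# Hypothetical usage ("report.docx" is an assumed file name):
# for name in list_docx_parts("report.docx"):
#     print(name)
```

For a typical document the list should include word/document.xml along with any header and footer parts.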

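I am not familiar with the XML format, but essentially all of the body text is in document.xml, and the headers and footers are in the header and footer files. Writing a script that changes the values in those files is all that is needed.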
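Picking out those files by name can be sketched as below. This assumes the parts are named word/header1.xml, word/footer2.xml and so on, which is what Word usually writes; the helper is_text_part is a made-up name, not part of any library:

```python
import re

# Assumption: the body is word/document.xml and headers/footers are
# numbered parts such as word/header1.xml or word/footer2.xml.
_TEXT_PART = re.compile(r"word/(document|header\d*|footer\d*)\.xml")

def is_text_part(name):
    """Return True if the archive member is expected to hold visible text."""
    return _TEXT_PART.fullmatch(name) is not None
```

Matching the full name is a little stricter than a substring check, so related files such as word/_rels/header1.xml.rels are left alone.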

The zipfile library makes this quick and saves the tedious steps of unpacking and repacking the archive by hand.

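import zipfile

def docx_replace(old_file, new_file, rep):
    # Copy every member of old_file into new_file, editing the text parts on the way.
    with zipfile.ZipFile(old_file, 'r') as zin, zipfile.ZipFile(new_file, 'w') as zout:
        for item in zin.infolist():
            buffer = zin.read(item.filename)
            name = item.filename
            # Body text, headers and footers are the parts that get rewritten.
            if name == 'word/document.xml' or 'header' in name or 'footer' in name:
                res = buffer.decode('utf-8')
                for old, new in rep.items():
                    res = res.replace(old, new)
                buffer = res.encode('utf-8')
            # Passing the original ZipInfo keeps the member name and compression type.
            zout.writestr(item, buffer)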

This does the replacement; rep is a dict mapping each original string to its replacement.
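One thing to watch: the replacement runs on raw XML, so characters such as & and < appear in the file in their escaped form. A minimal sketch of preparing rep for that, assuming the standard-library escape function from xml.sax.saxutils is sufficient here; the helper name escape_rep is made up for illustration:

```python
from xml.sax.saxutils import escape

def escape_rep(rep):
    """Return a copy of rep with XML special characters escaped in keys and values."""
    # escape() turns & into &amp;, < into &lt; and > into &gt;
    return {escape(old): escape(new) for old, new in rep.items()}

# Hypothetical usage (the strings are assumed examples):
# rep = escape_rep({"R&D Dept": "Research & Development"})
```

The escaped dict can then be passed as the rep argument in place of the plain one.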

so easy


Original article: https://www.cnblogs.com/waldenlake/p/11937567.html
