Python multiprocessing.pool.apply_async: how to fix high memory usage

Date: 2020-06-11 19:31:35


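multiprocessing.pool.apply_async runs tasks in parallel, but every task is queued inside the pool as soon as it is submitted. For a modest number of tasks this is not a problem. If there are very many, say 1 million or 10 million, and they cannot finish quickly, the queued tasks take up a lot of memory and can even exhaust it. How can memory usage be limited? Check the length of pool._cache: once it exceeds a chosen threshold, make the submitting loop wait on the most recently submitted task before adding more, which keeps memory usage down.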

from multiprocessing import Pool
import time

def downloadGif(arg):
    # Stand-in for a slow download: print the file name, then sleep
    print(arg[0])
    time.sleep(1)

def downloading_over(arg):
    # Callback invoked in the parent process when a task finishes
    pass

def foo(num):
    # Lazily yield one (pic_info, txt_info) pair per item
    for i in range(num,1000001):
        pic_info=[]
        pic_info.append(str(i)+'gif')

        txt_info=[]
        txt_info.append(str(i)+'txt')
        yield pic_info,txt_info

if __name__ == '__main__':
    pool = Pool(processes=5)    # set the processes max number
    count=1
    for download in foo(2):
        pool.apply_async(func=downloadGif, args=(download[0],),callback=downloading_over)
        last=pool.apply_async(func=downloadGif, args=(download[1],),callback=downloading_over)

        count=count+1
        print(count)

        # pool._cache is a private attribute holding the results that are still pending
        if len(pool._cache) > 1e3:
            print("waiting for cache to clear...")
            last.wait()

    # Memory usage for different thresholds:
    # 1e3: 500 items, 10 MB of memory
    # 1e4: 5000 items, 20 MB of memory
    # 1e5: 50000 items, 200 MB of memory
    # 1e6: 500000 items, 2000 MB of memory

    pool.close()
    pool.join()

Core code:

        if len(pool._cache) > 1e3:
            print("waiting for cache to clear...")
            last.wait()
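
To illustrate why this check works, the sketch below records len(pool._cache) right after a few tasks are submitted and again after the pool has been closed and joined. Keep in mind that _cache is a private CPython implementation detail, not documented API, so its behavior could change between versions. The names nap and cache_sizes are hypothetical and exist only for this illustration.

```python
import time
from multiprocessing import Pool


def nap(seconds):
    """Hypothetical stand-in for a slow task."""
    time.sleep(seconds)
    return seconds


def cache_sizes(pool, n_tasks=8):
    """Return len(pool._cache) after submitting and after close()/join().

    Assumption: pool._cache holds one entry per result that has not
    arrived yet (private CPython detail, not public API).
    """
    for _ in range(n_tasks):
        pool.apply_async(nap, (0.2,))
    pending = len(pool._cache)      # tasks submitted but not yet finished
    pool.close()
    pool.join()                     # all results have been handled by now
    return pending, len(pool._cache)


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(cache_sizes(pool))
```

The first number is normally close to the number of submitted tasks, and the second is 0 once every result has arrived.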

last is an instance of AsyncResult, the value returned by pool.apply_async().
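
An alternative that avoids the private _cache attribute is to bound the number of outstanding tasks with a semaphore that the callbacks release. The following is a minimal sketch of that idea, not the method used in the code above. The names submit_bounded, work and max_pending are hypothetical.

```python
import threading
import time
from multiprocessing import Pool


def work(item):
    """Hypothetical stand-in for the real task."""
    time.sleep(0.01)
    return item * 2


def submit_bounded(pool, func, items, max_pending=4):
    """Run func over items with at most max_pending tasks outstanding."""
    slots = threading.BoundedSemaphore(max_pending)
    results = []

    def on_done(value):
        # Runs in the parent process when a task succeeds
        results.append(value)
        slots.release()

    def on_error(exc):
        # Free the slot on failure too, otherwise the loop could stall
        slots.release()

    for item in items:
        slots.acquire()             # blocks while max_pending tasks are in flight
        pool.apply_async(func, (item,), callback=on_done,
                         error_callback=on_error)

    # Taking every slot back means all submitted tasks have completed
    for _ in range(max_pending):
        slots.acquire()
    return results


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(sorted(submit_bounded(pool, work, range(20))))
```

Because items is consumed one element at a time, a generator such as foo() is never materialized in full, and results arrive in completion order rather than submission order.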

https://docs.python.org/3/library/multiprocessing.html

class multiprocessing.pool.AsyncResult

The class of the result returned by Pool.apply_async() and Pool.map_async().

get([timeout]): Return the result when it arrives. If timeout is not None and the result does not arrive within timeout seconds then multiprocessing.TimeoutError is raised. If the remote call raised an exception then that exception will be reraised by get().

wait([timeout]): Wait until the result is available or until timeout seconds pass.

ready(): Return whether the call has completed.

successful(): Return whether the call completed without raising an exception. Will raise ValueError if the result is not ready.
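
The four methods above can be tried out with a short sketch like the one below. It submits one task that succeeds and one that raises, then inspects both results. The names slow_square, fail and inspect_results are hypothetical, and the timeout check depends on timing, so its outcome is reported rather than guaranteed.

```python
import multiprocessing
import time
from multiprocessing import Pool


def slow_square(x):
    """Hypothetical task that takes a moment to finish."""
    time.sleep(0.3)
    return x * x


def fail(_):
    """Hypothetical task that always raises."""
    raise ValueError("boom")


def inspect_results(pool):
    """Exercise get(), wait(), ready() and successful() on two results."""
    ok = pool.apply_async(slow_square, (3,))
    bad = pool.apply_async(fail, (0,))
    report = {}

    try:
        ok.get(timeout=0.01)        # too short for slow_square in most runs
        report["timed_out"] = False
    except multiprocessing.TimeoutError:
        report["timed_out"] = True

    ok.wait()                       # block until each result is available
    bad.wait()
    report["ok_ready"] = ok.ready()
    report["ok_successful"] = ok.successful()
    report["ok_value"] = ok.get()
    report["bad_successful"] = bad.successful()
    try:
        bad.get()                   # re-raises the exception from the task
    except ValueError as exc:
        report["bad_error"] = str(exc)
    return report


if __name__ == "__main__":
    with Pool(processes=2) as pool:
        print(inspect_results(pool))
```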

This post is based on the answer at the following link:

https://stackoverflow.com/questions/18414020/memory-usage-keep-growing-with-pythons-multiprocessing-pool


Original article: https://www.cnblogs.com/yanghao2008/p/13095368.html
