Python Processes and Threads (Part 2)

Date: 2018-11-22 20:39:12

I. Multithreading versus multiprocessing

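As briefly noted in "Python Processes and Threads (Part 1)", the GIL in CPython allows only one thread to run at any given moment, so threads execute concurrently rather than in parallel. Even on a multi-core CPU, the GIL prevents the threads of a single process from being mapped onto multiple cores. This design was originally chosen for safety, but over time it has become a limit on CPython's performance.
In effect, a thread must acquire the GIL before it can execute; without it, the thread gets no CPU time. Holding the GIL is not permanent, though: it may be released during execution and acquired by another thread, so the other threads compete with the current one for the CPU.
Understanding the GIL (http://www.dabeaz.com/python/UnderstandingGIL.pdf) gives an overview of the GIL and of when it is released.
Multithreading in Python 2: a thread releases the lock when it performs I/O, and also when the tick count reaches 100. (Ticks can be seen as a counter internal to Python, roughly corresponding to bytecode instructions; it exists only for the GIL, is reset to zero after each release, and can be adjusted with sys.setcheckinterval.) Once the lock is released, thread scheduling, lock acquisition, and thread switching follow. All of this consumes CPU time, which hurts performance and adds latency, especially in CPU-bound code.
The GIL does not constrain multiple processes, however: they can run on multiple CPUs and take full advantage of multiple cores. There is a trade-off, though: switching between processes costs more than switching between threads.
So the choice between multithreading and multiprocessing comes down to weighing these costs against the workload at hand.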
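
The tick-based check interval described above is a Python 2 mechanism. As an aside for readers on Python 3: since version 3.2 the interpreter asks the running thread to give up the GIL after a time interval (5 ms by default) rather than after a tick count, and the setting is read and changed with sys.getswitchinterval and sys.setswitchinterval. The sketch below wraps those two calls in a context manager. The name switch_interval is a hypothetical helper introduced here for illustration, not part of the standard library.

```python
import sys
from contextlib import contextmanager


@contextmanager
def switch_interval(seconds):
    """Temporarily change the thread switch interval (hypothetical helper)."""
    previous = sys.getswitchinterval()
    sys.setswitchinterval(seconds)
    try:
        # Hand the old value to the caller in case it is useful
        yield previous
    finally:
        # Always restore the interpreter-wide setting
        sys.setswitchinterval(previous)


if __name__ == "__main__":
    print("current interval:", sys.getswitchinterval())
    with switch_interval(0.01):
        print("inside the block:", sys.getswitchinterval())
    print("restored:", sys.getswitchinterval())
```

A longer interval means fewer forced switches for CPU-bound threads, at the price of slower response from the other threads; it does not let threads run in parallel.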

1. CPU-bound code

The following uses the Fibonacci sequence to simulate a CPU-bound computation.

def fib(n):
    # Return the n-th Fibonacci number (naive recursion, deliberately slow)
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)
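
To see why this function is a reasonable stand-in for CPU-bound work, it helps to count how many calls the naive recursion makes. The sketch below is an illustration only: fib_with_count is a hypothetical variant of fib that returns the number of calls alongside the value.

```python
def fib_with_count(n):
    """Return (fib(n), number of calls made), using the same recursion as fib."""
    if n <= 2:
        return 1, 1
    a, calls_a = fib_with_count(n - 1)
    b, calls_b = fib_with_count(n - 2)
    # One call for this frame plus everything below it
    return a + b, calls_a + calls_b + 1


if __name__ == "__main__":
    value, calls = fib_with_count(25)
    print(value, calls)  # 75025 150049
```

The call count is 2 * fib(n) - 1, so it grows as fast as the Fibonacci numbers themselves, and the work is pure computation with no I/O.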

<1> Multiprocessing

Print the 25th through 34th Fibonacci numbers and measure the program's run time.

import time
from concurrent.futures import ProcessPoolExecutor, as_completed


def fib(n):
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    # Process pool with 3 workers: at most 3 tasks run at the same time
    with ProcessPoolExecutor(3) as executor:
        # Start the timer before submitting, because work begins on submit
        start_time = time.time()
        all_task = [executor.submit(fib, num) for num in range(25, 35)]
        for future in as_completed(all_task):
            data = future.result()
            print("exe result: {}".format(data))

        print("last time is: {}".format(time.time() - start_time))

# Output
exe result: 75025
exe result: 121393
exe result: 196418
exe result: 317811
exe result: 514229
exe result: 832040
exe result: 1346269
exe result: 2178309
exe result: 3524578
exe result: 5702887
last time is: 4.457437038421631

In the output, the exe result lines appear about three at a time, ten in total; the multiprocess run took 4.45 seconds.
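
One detail of the code above: as_completed yields futures in the order they finish, not in the order they were submitted, so a printed result does not say which input produced it. A common way to keep that link is a dictionary from future to input. The sketch below assumes the same fib function; fib_by_input and its executor_cls parameter are hypothetical names introduced here for illustration.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor, as_completed


def fib(n):
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def fib_by_input(numbers, executor_cls=ProcessPoolExecutor, workers=3):
    """Return {n: fib(n)}, pairing every result with the input that produced it."""
    results = {}
    with executor_cls(workers) as executor:
        # Remember which input each future belongs to
        future_to_n = {executor.submit(fib, n): n for n in numbers}
        for future in as_completed(future_to_n):
            results[future_to_n[future]] = future.result()
    return results


if __name__ == "__main__":
    for n, value in sorted(fib_by_input(range(25, 30)).items()):
        print("fib({}) = {}".format(n, value))
```

When only the values are needed and input order matters, executor.map is a simpler alternative, since it returns results in input order.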

<2> Multithreading

import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fib(n):
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


if __name__ == "__main__":
    # Thread pool with 3 workers: at most 3 tasks are in progress at once
    with ThreadPoolExecutor(3) as executor:
        # Start the timer before submitting, because work begins on submit
        start_time = time.time()
        all_task = [executor.submit(fib, num) for num in range(25, 35)]
        for future in as_completed(all_task):
            data = future.result()
            print("exe result: {}".format(data))

        print("last time is: {}".format(time.time() - start_time))

# Output
exe result: 121393
exe result: 75025
exe result: 196418
exe result: 317811
exe result: 514229
exe result: 832040
exe result: 1346269
exe result: 2178309
exe result: 3524578
exe result: 5702887
last time is: 7.3467772006988525

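The program ran for 7.34 seconds in total.

Run time depends on the machine's performance, so the numbers will differ from one computer to another. Even so, the results above clearly show multithreading taking longer than multiprocessing. For CPU-bound code (heavy computation, loops, and so on), the tick count quickly reaches 100 in Python 2 (Python 3.2 and later force a switch after a time interval instead), so the GIL is repeatedly released and fought over and the threads switch frequently. Since only one thread can run at a time, the threads gain no parallelism and still pay for the switching, which is why multithreading performs worse than multiprocessing on CPU-bound code.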
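
The two timings are easier to judge against a plain sequential run. The sketch below times the same fib function three ways with one harness. It is an illustration under stated assumptions, not the article's measurement: time_sequential and time_pool are hypothetical helpers, and the inputs are smaller than the article's 25 to 34 so that the script finishes quickly. Absolute numbers depend on the machine.

```python
import time
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def fib(n):
    if n <= 2:
        return 1
    return fib(n - 1) + fib(n - 2)


def time_sequential(func, inputs):
    """Run func over inputs one after another; return (results, seconds)."""
    start = time.perf_counter()
    results = [func(n) for n in inputs]
    return results, time.perf_counter() - start


def time_pool(executor_cls, func, inputs, workers=3):
    """Run func over inputs in a pool; return (results, seconds)."""
    start = time.perf_counter()
    with executor_cls(workers) as executor:
        # map() returns results in input order, unlike as_completed()
        results = list(executor.map(func, inputs))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    inputs = list(range(22, 30))  # assumption: reduced range to keep the run short
    _, t_seq = time_sequential(fib, inputs)
    _, t_thr = time_pool(ThreadPoolExecutor, fib, inputs)
    _, t_proc = time_pool(ProcessPoolExecutor, fib, inputs)
    print("sequential: {:.2f} s".format(t_seq))
    print("threads:    {:.2f} s".format(t_thr))
    print("processes:  {:.2f} s".format(t_proc))
```

On CPython with the GIL, the threaded time would be expected to land near the sequential time rather than near the process-pool time, which is the pattern the article's numbers suggest.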

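2. I/O-bound code

When a thread blocks on I/O, it releases the GIL and is suspended, and the other threads then compete for the CPU. This involves thread switches, but their cost is negligible compared with the high latency of the I/O itself.
Below, the sleep function simulates I/O-bound work.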

import time


def random_sleep(n):
    # Sleep for n seconds to stand in for a blocking I/O call
    time.sleep(n)
    return n
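
That a blocked thread really does let the others proceed can be checked directly: if several threads each sleep for the same duration, the total wall time should be close to one sleep, not the sum. The sketch below uses the threading module for this; run_sleepers is a hypothetical helper introduced here for illustration.

```python
import threading
import time


def random_sleep(n):
    time.sleep(n)
    return n


def run_sleepers(count, seconds):
    """Start `count` threads that each sleep `seconds`; return elapsed wall time."""
    threads = [
        threading.Thread(target=random_sleep, args=(seconds,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_sleepers(5, 0.5)
    # Five sleeps of 0.5 s overlap instead of adding up to 2.5 s
    print("5 threads x 0.5 s sleep took {:.2f} s".format(elapsed))
```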

<1> Multiprocessing

import time
from concurrent.futures import ProcessPoolExecutor, as_completed


def random_sleep(n):
    time.sleep(n)
    return n


if __name__ == "__main__":
    # 5 worker processes; 30 tasks that each sleep for 2 seconds
    with ProcessPoolExecutor(5) as executor:
        start_time = time.time()
        all_task = [executor.submit(random_sleep, num) for num in [2] * 30]
        for future in as_completed(all_task):
            data = future.result()
            print("exe result: {}".format(data))

        print("last time is: {}".format(time.time() - start_time))
# Output
exe result: 2
exe result: 2
... (30 lines in total)
exe result: 2
exe result: 2
last time is: 12.412866353988647

Results arrive five at a time, thirty in total; the multiprocess run took 12.41 seconds.

<2> Multithreading

import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def random_sleep(n):
    time.sleep(n)
    return n


if __name__ == "__main__":
    # 5 worker threads; 30 tasks that each sleep for 2 seconds
    with ThreadPoolExecutor(5) as executor:
        start_time = time.time()
        all_task = [executor.submit(random_sleep, num) for num in [2] * 30]
        for future in as_completed(all_task):
            data = future.result()
            print("exe result: {}".format(data))

        print("last time is: {}".format(time.time() - start_time))

# Output
exe result: 2
exe result: 2
... (30 lines in total)
exe result: 2
exe result: 2
last time is: 12.004231214523315

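With I/O-bound work, multithreading performs slightly better than multiprocessing here (12.00 seconds versus 12.41 seconds). For I/O-bound code (file processing, web crawlers, and so on), multithreading improves efficiency: a single thread that performs I/O must sit idle waiting for it, whereas with multiple threads, execution switches automatically to thread B while thread A waits, so CPU time is not wasted. Python's multithreading is therefore well suited to I/O-bound code.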
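
Both timings sit just above 12 seconds for a simple reason: 30 tasks of 2 seconds each, spread over 5 workers, need 6 rounds. The small calculation below makes that floor explicit. The function ideal_wall_time is a hypothetical helper introduced here, and it assumes that all tasks take the same time and that the pool itself costs nothing.

```python
import math


def ideal_wall_time(task_count, workers, seconds_per_task):
    """Lower bound on wall time when equal-length tasks run `workers` at a time."""
    rounds = math.ceil(task_count / workers)
    return rounds * seconds_per_task


if __name__ == "__main__":
    print(ideal_wall_time(30, 5, 2))  # 6 rounds x 2 s = 12
```

Whatever each run spends beyond 12 seconds is pool overhead: roughly 0.41 seconds for the process pool and 0.004 seconds for the thread pool in the measurements above.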

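3. Summary

For CPU-bound code (loops, counting, and so on), multithreading performs worse than multiprocessing.
For I/O-bound code (file processing, web crawlers, and so on), multiprocessing performs worse than multithreading.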
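
Because ThreadPoolExecutor and ProcessPoolExecutor share the same interface, the rule of thumb above can be written as a one-line choice of class. The sketch below is only an illustration: pick_executor and its "cpu" and "io" labels are hypothetical names introduced here, and real workloads are often mixed and worth measuring.

```python
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def pick_executor(workload):
    """Map a workload label to an executor class (hypothetical helper)."""
    if workload == "cpu":
        # Separate processes are not limited by a shared GIL
        return ProcessPoolExecutor
    if workload == "io":
        # Threads are cheap and release the GIL while blocked on I/O
        return ThreadPoolExecutor
    raise ValueError("workload must be 'cpu' or 'io', got {!r}".format(workload))


if __name__ == "__main__":
    with pick_executor("io")(5) as executor:
        print(list(executor.map(len, ["a", "bb", "ccc"])))
```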


Original article: https://www.cnblogs.com/welan/p/10003312.html
