python - 多线程计算 Pi（无加速 - 怎么办）

Question

我正在尝试使用 Ramanujan 的公式之一在 Python 上以任意精度计算 pi：http ://en.wikipedia.org/wiki/Approximations_of_%CF%80#20th_century 。它基本上需要大量的阶乘和高精度的浮点数除法。

我正在使用多个线程来划分无限级数计算，方法是为每个线程提供在除以线程数时具有一定模数的所有成员。所以如果你有 3 个线程，总和应该像这样划分线程 1 ---> 1, 4, 7... 成员线程 2 ---->2, 5, 8... 线程 3 ---- >3、6、9...

到目前为止，这是我的代码：

from decimal   import *
from math      import sqrt, ceil
from time      import clock
from threading import *
import argparse

memoizedFactorials = []
memoizedFactorials.append( 1 )
memoizedFactorials.append( 1 )

class Accumulator:
    def __init__( self ):
        self._sum = Decimal( 0 )

    def accumulate( self, decimal ):
        self._sum += decimal

    def sum( self ):
        return self._sum

def factorial( k ):
    if k < 2: return 1
    elif len(memoizedFactorials) <= k:
        product = memoizedFactorials[ - 1 ] #last element 
        for i in range ( len(memoizedFactorials), k+1 ):
            product *= i;
            memoizedFactorials.append(product)

    return memoizedFactorials[ k ]

class Worker(Thread):
    def __init__( self, startIndex, step, precision, accumulator ):
        Thread.__init__( self, name = ("Thread - {0}".format( startIndex ) ) )
        self._startIndex = startIndex
        self._step = step
        self._precision = precision
        self._accumulator = accumulator

    def run( self ):
        sum = Decimal( 0 )
        result = Decimal( 1 )
        zero = Decimal( 0 )

        delta = Decimal(1)/( Decimal(10)**self._precision + 1 )
        #print "Delta - {0}".format( delta ) 
        i = self._startIndex
        while( result - zero > delta ):
            numerator = Decimal(factorial(4 * i)*(1103 + 26390 * i))
            denominator = Decimal((factorial(i)**4)*(396**(4*i)))
            result =  numerator / denominator
            print "Thread - {2} --- Iteration - {0:3} --->{1:3}".format( i, result, self._startIndex )
            sum += result
            i += self._step

        self._accumulator.accumulate( sum ) 
        print 

def main( args ):
    numberOfDigits = args.numberOfDigits;
    getcontext().prec = numberOfDigits + 8
    zero = Decimal(1) / Decimal( 10**( numberOfDigits + 1 ) )

    start = clock()
    accumulator = Accumulator()

    threadsCount = args.numberOfThreads;
    threadPool = []
    for i in range(0, threadsCount ):
        worker = Worker( i, threadsCount, numberOfDigits, accumulator )
        worker.start()
        threadPool.append( worker )

    for worker in threadPool:
        worker.join()

    sum = accumulator.sum();

    rootOfTwo = Decimal(2).sqrt()

    result = Decimal( 9801 ) / ( Decimal( 2 ) * rootOfTwo * sum ) 
    end = clock();

    delta = end - start;

    print result;
    print ("Took it {0} second to finish".format( delta ) )

    #testing the results
    #realPiFile = open("pi.txt");
    #myPi = str(result)
    #realPi = realPiFile.read( len(myPi) - 1 )

    #if ( myPi[:-1] != realPi ):
    #    print "Answer not correct!"
    #    print "My pi   - {0}".format(myPi)
    #    print "Real pi - {0}".format(realPi)

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description = 'Calculate Pi at with arbitrary precision')
    parser.add_argument('-p',            dest = 'numberOfDigits',  default=20, type = int, help ='Number of digits in pi ')
    parser.add_argument('-t', '--tasks', dest = 'numberOfThreads', default=1,  type = int, help ='Number of tasks( threads )')
    parser.add_argument('-o',            dest = 'outputFileName',  type = str,             help ='Connect to VCS testing servers')
    parser.add_argument('-q', '--quet',  dest = 'quetMode'      ,  action='store_true',    help ='Run in quet mode')

    args = parser.parse_args()

    print args
    main(args)
    a = raw_input("Press any key to continue...")

让我担心的是，在使用多个线程时，我的时间加速非常小或没有。例如 1000 位 pi：1 线程 --> 0.68 秒 2 线程 --> 0.74 秒 4 线程 --> 0.75 秒 10 线程 --> 0.96 秒

你对如何减少时间有什么想法吗？我在任务管理器上看到，当使用四个线程时，我的两个内核都 100% 参与其中。然而时间似乎是一样的。

PS：这是一个家庭作业，所以我不能使用其他公式。PSS：我正在使用 python 2.7

谢谢：）

score 5 · Accepted Answer

Python 有一个 GIL（全局解释器锁），它可以防止多个线程同时执行 python 代码，也就是说，您无法使用多个线程来加速 CPU 密集型任务。您必须使用多个进程。

score 3 · Accepted Answer

由于 GIL（全局解释锁），线程不会加快速度。用于multiprocessing此类任务。它的用法与线程非常相似。

score 2 · Accepted Answer

而不是暴力破解系列和所有那些讨厌的阶乘，你一定要了解二进制拆分算法。

http://numbers.computation.free.fr/Constants/Algorithms/splitting.html

这个人已经为你完成了工作。它具有适用于 Chudnovsky 公式的二进制拆分结构的 python 实现：
http ://www.craig-wood.com/nick/articles/pi-chudnovsky/

这种结构的主要优点是它在计算级数时消除了除法、阶乘和任何浮点计算的需要。然后你在分子和分母之间执行一个单一的、最终的、超大质量的除法。实际上我不知道如何对其进行多线程处理，但这是一个开始。

python - 多线程计算 Pi（无加速 - 怎么办）

3 回答 3

Related

Reference