1

在我完成最后一些更改后,我的代码变得非常慢。一个搜索任务需要 102 秒而不是 2-3 秒。

我尝试使用profile该类来查找限制函数,这是输出:

>>> import WebParser
>>>
>>> w = WebParser.LinksGrabber
>>>
>>> import cProfile
>>> cProfile.run("w.search('nicki minaj', 15)")
         50326 function calls in 102.745 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000  102.745  102.745 <string>:1(<module>)
        6    0.000    0.000    0.000    0.000 Config.py:110(__getattr__)
        1    0.000    0.000  102.745  102.745 LinksGrabber.py:427(search)
        5    0.000    0.000    0.002    0.000 Queue.py:107(put)
      911    0.040    0.000  102.726    0.113 Queue.py:150(get)
       ..................................
      }
     6836    0.022    0.000    0.022    0.000 {min}
      917    0.009    0.000    0.009    0.000 {thread.allocate_lock}
        3    0.000    0.000    0.000    0.000 {thread.get_ident}
        3    0.000    0.000    0.000    0.000 {thread.start_new_thread}
    6835  100.458    0.015  100.458    0.015 {time.sleep}
    11346    0.035    0.000    0.035    0.000 {time.time}

它显示time.sleep代码正在等待,但我在课堂100.458s上找不到代码。WebParser.LinksGrabber

如何使用profile来获取有关慢速代码的更多信息?

4

1 回答 1

2

我有一段样板代码添加到我需要分析的程序中,我可以通过将PROFILE变量更改为True或轻松启用或禁用它False...

#!/usr/bin/env python

# Should we profile the code?
PROFILE = True

# What proportion of the profile data should we keep?
PROFILE_LIMIT = 0.5         # 50%

# Where to store the raw profile data
PROFILE_DAT = 'profile.dat'

# Where to store the formatted profile data
PROFILE_TXT = 'profile.txt'


# Main code starts here
import time

def do_something():
    for i in range(5):
        time.sleep(0.1)

def do_something_else():
    for i in range(10):
        time.sleep(0.1)

def main():
    do_something()
    do_something_else()


if __name__ == '__main__':

    if PROFILE:
        import os, cProfile, pstats
        cProfile.runctx('main()', globals(), locals(), PROFILE_DAT)
        f = open(PROFILE_TXT, 'wb')
        for sort_key in 'time', 'cumulative':
            stats = pstats.Stats(PROFILE_DAT, stream=f)
            stats.sort_stats(sort_key)
            stats.print_stats(PROFILE_LIMIT)
            stats.strip_dirs()
            stats.sort_stats(sort_key)
            if sort_key == 'time':
                stats.print_callers(PROFILE_LIMIT)
            else:
                stats.print_callees(PROFILE_LIMIT)
        f.close()
        os.unlink(PROFILE_DAT)

    else:
        main()

...创建一个profile.txt包含...的文本文件

  1. 最慢的函数(内部)以及调用它们的位置
  2. 最慢的函数(累积)以及它们所调用的函数

...看起来像这样...

[...]

   Ordered by: internal time
   List reduced from 7 to 4 due to restriction <0.5>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
       15    1.627    0.108    1.627    0.108 {time.sleep}
        1    0.000    0.000    1.092    1.092 foo.py:23(do_something_else)
        1    0.000    0.000    0.536    0.536 foo.py:19(do_something)
        1    0.000    0.000    1.628    1.628 foo.py:27(main)


[...]

Function                      was called by...
                                  ncalls  tottime  cumtime
{time.sleep}                  <-       5    0.535    0.535  foo.py:19(do_something)
                                      10    1.092    1.092  foo.py:23(do_something_else)
foo.py:23(do_something_else)  <-       1    0.000    1.092  foo.py:27(main)
foo.py:19(do_something)       <-       1    0.000    0.536  foo.py:27(main)
foo.py:27(main)               <-       1    0.000    1.628  <string>:1(<module>)


[...]

   Ordered by: cumulative time
   List reduced from 7 to 4 due to restriction <0.5>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    1.628    1.628 <string>:1(<module>)
        1    0.000    0.000    1.628    1.628 foo.py:27(main)
       15    1.627    0.108    1.627    0.108 {time.sleep}
        1    0.000    0.000    1.092    1.092 foo.py:23(do_something_else)


[...]

Function                      called...
                                  ncalls  tottime  cumtime
<string>:1(<module>)          ->       1    0.000    1.628  foo.py:27(main)
foo.py:27(main)               ->       1    0.000    0.536  foo.py:19(do_something)
                                       1    0.000    1.092  foo.py:23(do_something_else)
{time.sleep}                  ->
foo.py:23(do_something_else)  ->       1    0.000    0.000  {range}
                                      10    1.092    1.092  {time.sleep}

...这通常足以追踪代码需要改进的地方。

于 2013-06-02T14:01:50.450 回答