I have a Python script that analyzes Perforce server output. With it, I have two different ways of printing which processes are running at a given time.

1. Match each process's indentation to the indentation it had when it started (P120 will have 119 indents)

+P1 +P2 -P1 +P3 -P3 -P2

2. Match each process's indentation to the number of processes currently running

+P1 +P2 +P3 -P2 -P3 -P1

For both functions, the input is a list of process objects that I created.
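To make the two layouts concrete, here is a rough sketch on a toy input. The `events` tuples and both `render_*` functions are hypothetical stand-ins for illustration, not my real process objects or functions.

```python
# Hypothetical event stream: ("+", n) means process n starts, ("-", n) means it ends.
def render_by_process_number(events):
    """Type 1: indent every line by (process number - 1) tabs."""
    return "\n".join("\t" * (n - 1) + sign + "P" + str(n) for sign, n in events)


def render_by_running_count(events):
    """Type 2: indent every line by the number of processes currently running."""
    lines = []
    depth = 0
    for sign, n in events:
        if sign == "-":
            depth -= 1  # an ending process drops back one level before printing
        lines.append("\t" * depth + sign + "P" + str(n))
        if sign == "+":
            depth += 1  # a starting process pushes later lines one level deeper
    return "\n".join(lines)


events = [("+", 1), ("+", 2), ("-", 1), ("+", 3), ("-", 3), ("-", 2)]
type1 = render_by_process_number(events)
type2 = render_by_running_count(events)
print(type1)
print(type2)
```

In type 1 the indentation only ever depends on the process number, so it keeps growing with the length of the log; in type 2 it is bounded by how many processes overlap.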

Code for type 1: (slow)

def visual_output_match(processes):
    output_string = ""
    process_number = 1
    indent_number = 0
    running_processes = []

    # Go through each process
    for process in processes:
        # Assign a process number to the new process that is about to start
        process.set_process_number(process_number)

        # Append the process to the list of processes that are currently running
        running_processes.append(process)

        # Get a list of processes that have ended before the new process started
        ending_processes = check_ended_processes(running_processes, process)

        # Delete the processes that have ended from the running_processes list
        running_processes = remove_running_processes(running_processes, ending_processes)

        # Print all the processes that finished before the new process began
        for p in ending_processes:
            # Indent by the process's own number (P1 gets no tabs, P120 gets 119)
            indent_number = p.get_process_number() - 1
            output_string += ('\t' * indent_number)
            output_string += ("-P" + str(p.get_process_number()) + "   " + p.short_srting_summary_end() + '\n')

        # Indent by the process's own number (P1 gets no tabs, P120 gets 119)
        indent_number = process.get_process_number() - 1
        output_string += ('\t' * indent_number)
        output_string += ("+P" + str(process.get_process_number()) + "   " + process.short_srting_summary_start() + '\n')
        process_number += 1
    return output_string

Code for type 2: (fast)

def visual_output_not_match(processes):
    output_string = ""
    number_tabs = 0
    process_number = 1
    running_processes = []

    # Go through each process
    for process in processes:
        # Assign a process number to the new process that is about to start
        process.set_process_number(process_number)

        # Append the process to the list of processes that are currently running
        running_processes.append(process)

        # Get a list of processes that have ended before the new process started
        ending_processes = check_ended_processes(running_processes, process)

        # Delete the processes that have ended from the running_processes list
        running_processes = remove_running_processes(running_processes, ending_processes)

        # Print all the processes that finished before the new process began
        for p in ending_processes:
            number_tabs -= 1
            # Indent by the number of processes that are still running
            output_string += ('\t' * number_tabs)
            output_string += ("-P" + str(p.get_process_number()) + "   " + p.short_srting_summary_end() + '\n')

        # Indent by the number of processes that are currently running
        output_string += ("\t" * number_tabs)
        output_string += ("+P" + str(process.get_process_number()) + "   " + process.short_srting_summary_start() + '\n')
        number_tabs += 1
        process_number += 1
    return output_string

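For the exact same set of processes, type 1 takes more than 11 minutes, while type 2 takes only about 1 second. I realize that with type 1, if I have 11,000 processes, at some point I will have 11,000 tabs on a single line, which is not the case for type 2. But is that the only thing slowing my script down? Does anyone see any other serious mistakes? If you need to see some of the other methods I call in this script, let me know.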
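As a back-of-the-envelope check on how many tabs type 1 actually writes, here is a small sketch. It assumes every process prints both a start line and an end line (a slight overcount if a few processes never end), and `total_tabs_type1` is a hypothetical helper, not part of my script.

```python
def total_tabs_type1(n_processes):
    """Tabs written if process k gets k - 1 tabs on its start line and its end line."""
    per_kind = n_processes * (n_processes - 1) // 2  # 0 + 1 + ... + (n - 1)
    return 2 * per_kind  # once for the "+P" lines, once for the "-P" lines


n = 13706  # number of processes in the run I profiled
tabs = total_tabs_type1(n)
print(tabs)  # 187840730
```

So the output string grows to roughly 188 million tab characters, and every `+=` on a string that large may have to copy it, which could plausibly account for much of the gap. Type 2 avoids this because its indentation is bounded by the number of overlapping processes.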

I did run cProfile on both functions, and this is what I got:

1: (slow)

     3381490 function calls in 681.195 seconds

     Ordered by: internal time

     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
          1  679.175  679.175  681.182  681.182 visual_output.py:107(visual_output_match)
      13706    0.568    0.000    0.769    0.000 visual_output.py:147(check_ended_processes)
      13706    0.404    0.000    0.657    0.000 visual_output.py:162(remove_running_processes)
      13706    0.258    0.000    0.264    0.000 server_info.py:398(short_srting_summary_start)
      13702    0.257    0.000    0.264    0.000 server_info.py:402(short_srting_summary_end)
     787125    0.206    0.000    0.206    0.000 server_info.py:67(__eq__)
     800837    0.118    0.000    0.118    0.000 server_info.py:267(get_datetime_end)
     800837    0.080    0.000    0.080    0.000 server_info.py:264(get_datetime_start)
     800828    0.038    0.000    0.038    0.000 {len}
      54816    0.028    0.000    0.028    0.000 server_info.py:308(get_process_number)
      13706    0.018    0.000    0.018    0.000 server_info.py:406(set_process_number)
          1    0.013    0.013  681.195  681.195 <string>:1(<module>)
      27408    0.013    0.000    0.013    0.000 {method 'time' of 'datetime.datetime' objects}
      27408    0.011    0.000    0.011    0.000 {method 'append' of 'list' objects}
      13702    0.010    0.000    0.010    0.000 {method 'pop' of 'list' objects}
          1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

2: (fast)

           3354269 function calls (3354219 primitive calls) in 1.981 seconds

     Ordered by: internal time

     ncalls  tottime  percall  cumtime  percall filename:lineno(function)
          1    0.688    0.688    1.981    1.981 visual_output.py:67(visual_output_not_match)
      13706    0.407    0.000    0.563    0.000 visual_output.py:147(check_ended_processes)
      13706    0.375    0.000    0.603    0.000 visual_output.py:162(remove_running_processes)
     787125    0.189    0.000    0.189    0.000 server_info.py:67(__eq__)
     800837    0.079    0.000    0.079    0.000 server_info.py:267(get_datetime_end)
     800837    0.075    0.000    0.075    0.000 server_info.py:264(get_datetime_start)
      13706    0.059    0.000    0.061    0.000 server_info.py:398(short_srting_summary_start)
      13702    0.053    0.000    0.055    0.000 server_info.py:402(short_srting_summary_end)
     800850    0.035    0.000    0.035    0.000 {len}
      13706    0.005    0.000    0.005    0.000 server_info.py:406(set_process_number)
      27408    0.005    0.000    0.005    0.000 server_info.py:308(get_process_number)
      13702    0.004    0.000    0.004    0.000 {method 'pop' of 'list' objects}
      27434    0.003    0.000    0.003    0.000 {method 'append' of 'list' objects}
      27408    0.003    0.000    0.003    0.000 {method 'time' of 'datetime.datetime' objects}
          1    0.000    0.000    1.981    1.981 <string>:1(<module>)
          2    0.000    0.000    0.000    0.000 {method 'send' of '_socket.socket' objects}
       24/2    0.000    0.000    0.000    0.000 brine.py:202(_dump)
       12/2    0.000    0.000    0.000    0.000 brine.py:179(_dump_tuple)
         10    0.000    0.000    0.000    0.000 brine.py:106(_dump_int)
       10/2    0.000    0.000    0.000    0.000 brine.py:360(dumpable)
          2    0.000    0.000    0.000    0.000 protocol.py:220(_send)
          2    0.000    0.000    0.000    0.000 protocol.py:227(_send_request)
          2    0.000    0.000    0.000    0.000 channel.py:56(send)
          2    0.000    0.000    0.000    0.000 <string>:531(write)
          2    0.000    0.000    0.000    0.000 stream.py:173(write)
          2    0.000    0.000    0.000    0.000 brine.py:332(dump)
       14/8    0.000    0.000    0.000    0.000 brine.py:369(<genexpr>)
          5    0.000    0.000    0.000    0.000 {method 'pack' of 'Struct' objects}
          2    0.000    0.000    0.000    0.000 protocol.py:438(_async_request)
          2    0.000    0.000    0.000    0.000 brine.py:150(_dump_str)
        6/2    0.000    0.000    0.000    0.000 {all}
          2    0.000    0.000    0.000    0.000 protocol.py:241(_box)
         24    0.000    0.000    0.000    0.000 {method 'get' of 'dict' objects}
          2    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
          2    0.000    0.000    0.000    0.000 brine.py:173(_dump_long)
          2    0.000    0.000    0.000    0.000 {method 'acquire' of 'thread.lock' objects}
          2    0.000    0.000    0.000    0.000 {next}
          4    0.000    0.000    0.000    0.000 compat.py:17(BYTES_LITERAL)
          2    0.000    0.000    0.000    0.000 {method 'release' of 'thread.lock' objects}
          1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

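I tried to make sense of this to see where it slows down. The only real difference I can see is in the first row, the function itself, which has a very different tottime (679.175 s versus 0.688 s).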
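One likely reason the time lands in that row: cProfile records function calls only, so work done by operators such as `+=` and `*` on strings is charged to the tottime of the function that contains them. A minimal sketch of that effect, assuming Python 3 and a made-up function `build_with_inline_work`:

```python
import cProfile
import io
import pstats


def build_with_inline_work(n):
    """Build a string using only operators, with no calls the profiler can see."""
    out = ""
    for i in range(n):
        out += "\t" * i  # operator work: counted in this function's own tottime
        out += "x\n"
    return out


profiler = cProfile.Profile()
profiler.enable()
build_with_inline_work(2000)
profiler.disable()

# Render the report into a string, sorted by internal time as in the output above.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("tottime").print_stats(5)
report = stream.getvalue()
print(report)
```

The report shows nearly all of the time in `build_with_inline_work` itself, which is the same shape as the slow profile above.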

Is there a better code profiler I could use?
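Since cProfile works at function granularity, one low-tech option is to time the suspect statements in isolation with the standard `timeit` module. This is only a sketch of that idea: `time_indent` and the widths used are assumptions chosen for illustration, not measurements from my script.

```python
import timeit


def time_indent(width, repeats=200):
    """Seconds taken to build `repeats` lines that each start with `width` tabs."""
    def build():
        out = ""
        for _ in range(repeats):
            out += "\t" * width
            out += "+P\n"
        return out

    # Run the builder a few times and return the total elapsed seconds.
    return timeit.timeit(build, number=5)


shallow = time_indent(10)     # indentation like type 2 (few overlapping processes)
deep = time_indent(10000)     # indentation like type 1 late in the log
print(shallow, deep)
```

Comparing the two numbers gives a rough idea of how much of the run time the indentation alone could explain.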

Let me know if you need anything else.

1 Answer

You might want to try building a lookup table of "tab strings".

Instead of:

output_string += ('\t' * indent_number)

Try:

output_string += indent_string[indent_number]

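Obviously this needs some logic to pre-populate the array only up to a certain level and to keep building it dynamically once that level is exceeded. But if this is your problem, a quick hack should let you find out.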
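A minimal sketch of such a table, assuming a small helper class (`IndentCache` is a hypothetical name) that prefills a few levels and extends itself when a deeper level is requested:

```python
class IndentCache(object):
    """Hypothetical helper: caches tab strings by level and grows on demand."""

    def __init__(self, prefill=64):
        # Always keep at least level 0 (the empty string) so the table can grow from it.
        self._table = ["\t" * n for n in range(max(prefill, 1))]

    def __getitem__(self, level):
        # Extend the table one level at a time until the requested level exists.
        while level >= len(self._table):
            self._table.append(self._table[-1] + "\t")
        return self._table[level]


indent_string = IndentCache(prefill=4)
output_string = ""
for indent_number in (0, 2, 6):
    output_string += indent_string[indent_number] + "+P\n"
print(repr(output_string))
```

Note that caching every level up to 11,000 keeps about 60 million tab characters in memory, and the output string is just as large as before, so this mainly tells you whether building the tab strings is the bottleneck rather than the growing output.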

Answered 2014-06-26T06:33:53.560