This is mostly to make sure my methodology is correct, but my basic question is whether it's worth checking outside a function whether I need to call it at all. I know, I know, premature optimization, but in many cases it's the difference between putting an if statement inside the function to decide whether to run the rest of the code, or putting it before the function call. In other words, it takes no effort to do it one way or the other. Right now the checks are mixed between both, and I'd like to get it all nice and standardized.

The main reason I asked is because the other answers I saw mostly referenced timeit, but that gave me negative numbers, so I switched to this:

import timeit
import cProfile

def aaaa(idd):
    return idd

def main():
    #start = timeit.timeit()
    for i in range(9999999):
        a = 5
    #end = timeit.timeit()
    #print("1", end - start)

def main2():
    #start = timeit.timeit()
    for i in range(9999999):
        aaaa(5)
    #end = timeit.timeit()
    #print("2", end - start)

cProfile.run('main()', sort='cumulative')
cProfile.run('main2()', sort='cumulative')

and got this output:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.310    0.310 {built-in method exec}
        1    0.000    0.000    0.310    0.310 <string>:1(<module>)
        1    0.310    0.310    0.310    0.310 test.py:7(main)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.044    2.044 {built-in method exec}
        1    0.000    0.000    2.044    2.044 <string>:1(<module>)
        1    1.522    1.522    2.044    2.044 test.py:14(main2)
  9999999    0.521    0.000    0.521    0.000 test.py:4(aaaa)

To me that shows that not calling the function takes 0.31 seconds, and calling it takes 1.52 seconds, which is almost 5 times slower. But like I said, I got negative numbers with timeit, so I want to make sure it's actually that slow.

Also, from what I gather, the reason function calls are so slow is that Python has to look up the function to make sure it still exists before it can run it, or something like that? Isn't there any way to just tell it to, like, assume that everything is still there so that it doesn't have to do unnecessary work that (apparently) slows it down 5x?

1 Answer

You are comparing apples and pears here. One method does simple assignment, the other calls a function. Yes, function calls will add overhead.

You should strip this down to the bare minimum for timeit:

>>> import timeit
>>> timeit.timeit('a = 5')
0.03456282615661621
>>> timeit.timeit('foo()', 'def foo(): a = 5')
0.14389896392822266

Now all we did was add a function call (foo does the same thing), so you can measure the extra time a function call takes. You cannot conclude from this that the function is nearly 4 times slower; rather, the function call adds a fixed overhead of about 0.11 seconds per 1,000,000 iterations.

If instead of a = 5 we do something that takes 0.5 seconds to execute one million iterations, moving it into a function won't make it take 2 seconds. It'll take about 0.61 seconds, because the function call overhead doesn't grow with the amount of work inside the function.
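The comparison above can be reproduced with a minimal sketch along these lines. The helper best_of, the iteration count N, and the "heavier" statement are made up for illustration and are not part of the original measurements; the absolute numbers will differ on every machine.

```python
import timeit

N = 200_000  # iterations per measurement; kept small so the sketch runs quickly


def best_of(stmt, setup="pass", repeat=3):
    """Return the fastest of `repeat` timing runs of `stmt`, in seconds."""
    return min(timeit.repeat(stmt, setup=setup, number=N, repeat=repeat))


# Cheap body: a bare assignment, inline versus wrapped in a function.
inline_cheap = best_of("a = 5")
called_cheap = best_of("foo()", setup="def foo(): a = 5")

# Heavier body (an arbitrary choice), so the call cost is a smaller share.
heavy_body = "a = sum(range(50))"
inline_heavy = best_of(heavy_body)
called_heavy = best_of("foo()", setup="def foo(): " + heavy_body)

print(f"cheap body: inline {inline_cheap:.4f}s, in a function {called_cheap:.4f}s")
print(f"heavy body: inline {inline_heavy:.4f}s, in a function {called_heavy:.4f}s")
print(f"call overhead with cheap body: {called_cheap - inline_cheap:.4f}s")
print(f"call overhead with heavy body: {called_heavy - inline_heavy:.4f}s")
```

The two overhead figures should come out in the same ballpark, even though the ratio between "inline" and "in a function" is very different for the two bodies. Note also that timeit.timeit() called with no arguments times a pass statement and returns how long that took; it does not return a clock reading, so subtracting two such results does not give an elapsed time and can easily be negative.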

A function call needs to manipulate the stack, pushing the current local frame onto it and creating a new frame, then clearing it all up again when the function returns.

In other words, moving statements to a function adds a small overhead, and the more statements you move to that function, the smaller the overhead becomes as a percentage of the total work done. A function never makes those statements themselves slower.

A Python function is just an object stored in a variable; you can assign a function to a different variable, replace it with something completely different, or delete it at any time. When you invoke a function, you first reference the name under which it is stored (foo) and then call the function object (the (arguments) part); that lookup has to happen every single time in a dynamic language.
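A small sketch of why the lookup cannot be skipped: the name can be rebound between two calls, and the caller picks up whatever object the name refers to at call time. The names call_foo, alias and replacement are hypothetical and only serve the illustration.

```python
def foo():
    return "original"


def call_foo():
    # The global name `foo` is looked up each time this function runs.
    return foo()


first = call_foo()

alias = foo  # a function is an ordinary object; bind it to a second name


def replacement():
    return "replaced"


foo = replacement    # rebind the global name to a different function object
second = call_foo()  # same caller, but the name now leads to the new object

print(first, second, alias())
```

The old function object is still alive under the name alias; only the binding of foo changed, which is exactly what the interpreter cannot know ahead of time.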

You can see this in the bytecode generated for a function:

>>> def foo():
...     pass
... 
>>> def bar():
...     return foo()
... 
>>> import dis
>>> dis.dis(bar)
  2           0 LOAD_GLOBAL              0 (foo)
              3 CALL_FUNCTION            0
              6 RETURN_VALUE        

The LOAD_GLOBAL opcode looks up the name (foo) in the global namespace (basically a hash table lookup), and pushes the result onto the stack. CALL_FUNCTION then invokes whatever is on the stack, replacing it with the return value. RETURN_VALUE returns from a function call, again taking whatever is topmost on the stack as the return value.
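The listing above comes from the Python version current when this was written; opcode names and offsets change between CPython releases (newer versions use a differently named call opcode and add extra instructions). A minimal sketch for checking what your own interpreter produces, using the same foo and bar:

```python
import dis


def foo():
    pass


def bar():
    return foo()


# Collect only the opcode names; offsets and exact names vary by version.
opnames = [instr.opname for instr in dis.get_instructions(bar)]
print(opnames)
```

Whatever the version, the same three steps should be recognisable: a global name lookup, a call instruction, and a return.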

Answered 2013-02-01T14:33:43.410