First, try enabling compiler optimizations with -O3 and/or -fast. A quick test on my system showed a factor of 3 performance improvement.
Also, when experimenting with code changes to improve performance, it helps to shorten the run time, for example by temporarily changing your main loop to for(a=0;a<10 /* 512*/ ;a++) so that each trial finishes quickly.
Also note that GCC supports C99 complex numbers: see the man pages for complex, cpow, and cexp, and the include file /usr/include/complex.h.
I profiled the application and saw that it spends most of its time in powc(). Unfortunately, when I changed powc() to use cpow() from the math library, it ran slower than your implementation.
If the system you are running on has multiple cores, wall-clock time could probably be brought down fairly easily by parallelizing the outer main loop with OpenMP. However, when you are generating image frames for the animation, it will likely be most efficient to generate each frame in a separate process (I like xargs -P # -n 1 for this type of coarse-grained parallelization).
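A minimal sketch of the OpenMP idea, assuming the outer loop iterations are independent of each other (each one writes only its own row of the output). The names render and pixel_value are hypothetical stand-ins, and the kernel here is a placeholder, not your actual per-pixel computation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder for the real per-pixel computation. */
static double pixel_value(int a, int b, int size)
{
    return (double)a * size + b;
}

/* Fill a size x size image, splitting the outer loop across threads.
   Each iteration of 'a' writes a distinct row, so no locking is needed.
   Without OpenMP enabled, the pragma is ignored and the loop runs serially. */
void render(double *out, int size)
{
    assert(out != NULL && size > 0);

    #pragma omp parallel for schedule(dynamic)
    for (int a = 0; a < size; a++) {
        for (int b = 0; b < size; b++) {
            out[(size_t)a * size + b] = pixel_value(a, b, size);
        }
    }
}
```

With GCC, OpenMP is enabled by compiling with -fopenmp. The schedule(dynamic) clause is a guess that rows take uneven amounts of time; if they cost about the same, the default schedule may do just as well.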