With the abundance of techniques being employed to increase parallelization in today's compiler-tools (especially auto-parallelization of certain viable for
-constructs, c.f. the Intel C++ Compiler, Microsoft Visual Studio 2011, alongside various others), I wondered if parallelization is always guaranteed to improve or have no impact on performance.
Are there any cases in which parallelization would have a distinctly negative impact on performance?
A quick internet search didn't yield much hope, so I decided to turn here to see if anyone has any knowledge of cases where parallelization has a detrimental impact on performance, or better yet, experience in a project where parallelization actually caused difficulties.
I am also curious about whether there are any negative performance implication of auto-vectorization, although I find it quite unlikely that there would be.
Thanks in advance!