ASTs represent the structure of a program. For complex languages,
your AST will necessarily be complicated. So don't assume this
should be "easy".
Many folks assume that once they have an AST, life will be easier.
But parsing is just the foothills of the Himalayas.
ASTs don't directly represent the common inferences,
such as what an identifier means, which statement executes next,
or where a piece of data is consumed.
Unless you have all these available, you're not going to
be able to do much with a real language, let alone
do it easily.
It is best to compute those inference results once and keep them cached
or explicit: symbol tables, control flows, data flows, ...
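
To make "explicit" concrete, here is a minimal sketch in Python, using the standard `ast` module as a stand-in for whatever parser you have. The function name `def_use_table` and the flat, scope-free table are assumptions for illustration; a real symbol table has to model scopes, types, and much more.

```python
import ast
from collections import defaultdict


def def_use_table(source: str) -> dict[str, dict[str, list[int]]]:
    """Build a flat (scope-ignoring) table of where each name is
    assigned ("defs") and where it is read ("uses"), by line number."""
    tree = ast.parse(source)
    defs: dict[str, list[int]] = defaultdict(list)
    uses: dict[str, list[int]] = defaultdict(list)
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defs[node.id].append(node.lineno)
            elif isinstance(node.ctx, ast.Load):
                uses[node.id].append(node.lineno)
    names = set(defs) | set(uses)
    # Sort line numbers because ast.walk does not visit nodes in source order.
    return {
        name: {"defs": sorted(defs[name]), "uses": sorted(uses[name])}
        for name in names
    }


table = def_use_table("x = 1\ny = x + 2\nx = y\n")
```

The AST alone answers none of these questions directly; the table is derived once and can then be consulted by every later analysis or transformation.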
One can add pattern matching languages to enable the
recognition of syntax structures, or even to write transformation
rules:
optimize_increment(x:exp):statement
= " \x=\x+1; " --> " \x++ " if no_side_effects(x);
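
For readers without such a rule language, a rough analogue of the rule above can be hand-coded against Python's standard `ast` module. This is only a sketch under stated assumptions: Python has no `x++`, so it rewrites `x = x + 1` to `x += 1` instead, and the `no_side_effects` helper here is a hypothetical, very conservative stand-in (it only checks for calls) for a real side-effect analysis.

```python
import ast


def no_side_effects(expr: ast.AST) -> bool:
    # Hypothetical stand-in for a real analysis: treat an expression
    # as side-effect free only if it contains no call nodes.
    return not any(isinstance(n, ast.Call) for n in ast.walk(expr))


class IncrementRewriter(ast.NodeTransformer):
    """Rewrite `t = t + 1` to `t += 1` when `t` has no side effects."""

    def visit_Assign(self, node: ast.Assign) -> ast.AST:
        value = node.value
        if (
            len(node.targets) == 1
            and isinstance(node.targets[0], (ast.Name, ast.Attribute, ast.Subscript))
            and isinstance(value, ast.BinOp)
            and isinstance(value.op, ast.Add)
            and isinstance(value.right, ast.Constant)
            and type(value.right.value) is int
            and value.right.value == 1
            # Compare as source text so Load/Store context is ignored.
            and ast.unparse(node.targets[0]) == ast.unparse(value.left)
            and no_side_effects(node.targets[0])
        ):
            new_node = ast.AugAssign(
                target=node.targets[0], op=ast.Add(), value=value.right
            )
            return ast.copy_location(new_node, node)
        return node


def rewrite(source: str) -> str:
    tree = IncrementRewriter().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


print(rewrite("x = x + 1"))
print(rewrite("a[f()] = a[f()] + 1"))
```

The second statement is left alone because rewriting it would change how many times `f` is called. Note how much code one small rule takes when written by hand, and that its correctness rests entirely on the quality of the side-effect check.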
Such rules need to draw on the cached inferences (e.g.,
"side_effects").
Trying to build all this is really hard. One way to make
this practical is to amortize the infrastructure cost across
many languages and transformation applications.
Our DMS Software Reengineering Toolkit has all this machinery, to varying degrees, for a wide variety of languages, including C, C++, Java, C# and COBOL.