The problem is the following:
I developed an expression evaluation engine that provides a XPath-like language to the user so he can build the expressions. These expressions are then parsed and stored as an expression tree. There are many kinds of expressions, including logic (and/or/not), relational (=, !=, >, <, >=, <=), arithmetic (+, -, *, /) and if/then/else expressions.
Besides these operations, the expression can have constants (numbers, strings, dates, etc) and also access to external information by using a syntax similar to XPath to navigate in a tree of Java objects.
Given the above, we can build expressions like:
/some/value and /some/other/value
/some/value or /some/other/value
if (<evaluate some expression>) then
<evaluate some other expression>
else
<do something else>
Since the then-part and the else-part of the if-then-else expressions are expressions themselves, and everything is considered to be an expression, then anything can appear there, including other if-then-else's, allowing the user to build large decision trees by nesting if-then-else's.
As these expressions are built manually and prone to human error, I decided to build an automatic learning process capable of optimizing these expression trees based on the analysis of common external data. For example: in the first expression above (/some/value and /some/other/value), if the result of /some/other/value is false most of the times, we can rearrange the tree so this branch will be the left branch to take advantage of short-circuit evaluation (the right side of the AND is not evaluated since the left side already determined the result).
Another possible optimization is to rearrange nested if-then-else expressions (decision trees) so the most frequent path taken, based on the most common external data used, will be executed sooner in the future, avoiding unnecessary evaluation of some branches most of the times.
Do you have any ideas on what would be the best or recommended approach/algorithm to use to perform this automatic refactoring of these expression trees?