Hi boost::xpressive users,

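I'm getting a stack overflow error when trying to parse some decision trees with boost::xpressive. It works for trees up to a certain size but fails on "large" trees, where "large" seems to mean roughly 3000 nodes, at which point gdb reports a stack depth of 133979 frames. I assume I need to optimize the regex somehow, but there is no .* anywhere in it, so I don't know where to go from here.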

#include <boost/regex.hpp>
#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>

using namespace boost::xpressive;
using namespace regex_constants;


sregex integral_number;
sregex floating_point_number;

sregex bid;
sregex ask;
sregex side;
sregex value_on_market_limit_ratio_gt;
sregex value_on_market_delta_ratio_gt;

sregex stdevs_from_mean_auction_time_gt;
sregex no_orders_on_opposite_side;
sregex is_pushing_price;
sregex is_desired;

sregex predicate, leaf, branch, tree;

integral_number = sregex_compiler().compile("[-+]?[0-9]+");
floating_point_number = sregex_compiler().compile("[-+]?[0-9]*\\.?[0-9]+([eE][-+]?[0-9]+)?");
stdevs_from_mean_auction_time_gt = "StdevsFromMeanAuctionTimeGT(" >> floating_point_number >> ")";
side = sregex_compiler().compile("def::BID|def::ASK");
value_on_market_limit_ratio_gt = "ValueOnMarketLimitRatioGT<" >> side >> ">(" >> floating_point_number >> ")";
value_on_market_delta_ratio_gt = "ValueOnMarketDeltaRatioGT(" >> floating_point_number >> ")";
no_orders_on_opposite_side = sregex_compiler().compile("NoOrdersOnOppositeSide");
is_pushing_price = sregex_compiler().compile("IsPushingPrice");
is_desired = sregex_compiler().compile("IsDesired");
predicate = value_on_market_limit_ratio_gt | value_on_market_delta_ratio_gt | stdevs_from_mean_auction_time_gt | no_orders_on_opposite_side | is_pushing_price | is_desired;
leaf = sregex_compiler().compile("SEARCH_TO_MAX|AMEND_TO_AVAILABLE|AMEND_TO_AVAILABLE_MINUS_RECENT_ORDER_SIZE|AMEND_TO_CURRENT_MINUS_RECENT_ORDER_SIZE|SEARCH_BY_RECENT_ORDER_SIZE|PULL|DO_NOTHING");
branch = "Branch(" >> predicate >> "," >> by_ref(tree) >> "," >> by_ref(tree) >> ")";
tree = leaf | branch;

smatch what;
regex_match(s, what, tree);

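Here, s is left undefined because it is a 75000-character string that won't fit in the question. How can I modify these expressions so that the match runs in less space?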

1 Answer

I found a solution to this problem: change the definition of branch to

branch = "Branch(" >> keep(predicate) >> "," >> keep(by_ref(tree)) >> "," >> keep(by_ref(tree)) >> ")";

This limits backtracking, and with it memory usage.

Answered 2018-06-01T04:42:34.793