
I have recently been using Boost.Xpressive to parse files. Each file is about 10 MB, and there will be hundreds of them to parse.

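Xpressive is pleasant to use and its syntax is clear, but performance is the problem. It crawls unbelievably in debug builds, and in release builds it takes more than a second per file. I have tested it against plain old get_line(), find() and sscanf() code, and that beats Xpressive easily.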
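A hand-written baseline of that kind could look roughly like the following. It is only a sketch, not the code that was actually timed: parse_line, parse_all and the local Sequence struct are hypothetical names, std::getline and strstr stand in for get_line() and find(), and only SEQUENCE lines are handled.

```cpp
#include <cstdio>
#include <cstring>
#include <sstream>
#include <string>
#include <vector>

struct Sequence {
    int ms, driver, sequence;
    double time, vel, km;
    std::string date, road;
};

// Parses a single log line. Returns true only for well-formed SEQUENCE lines.
bool parse_line(const char* line, Sequence& out) {
    // Cheap rejection before paying for sscanf.
    if (!std::strstr(line, " - SEQUENCE: ")) return false;

    char date[32] = {0};
    char road[32] = {0};
    const int n = std::sscanf(line,
        "[%31[^.].%d] - %lf s => Driver: %d - Speed: %lf - Road: %31s - Km: %lf - SEQUENCE: %d",
        date, &out.ms, &out.time, &out.driver, &out.vel, road, &out.km, &out.sequence);
    if (n != 8) return false; // not all eight fields were converted

    out.date = date;
    out.road = road;
    return true;
}

// Splits the text into lines and keeps the ones that parse.
std::vector<Sequence> parse_all(const std::string& text) {
    std::vector<Sequence> result;
    std::istringstream in(text);
    std::string line;
    Sequence s{};
    while (std::getline(in, line))
        if (parse_line(line.c_str(), s)) result.push_back(s);
    return result;
}
```

Calling parse_all on the embedded input shown further down would collect only the SEQUENCE lines.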

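I understand that type checking, backtracking and so on have a cost, but this seems excessive to me. So I wonder: am I doing something wrong? Is there any way to optimize it so that it runs at a decent speed? Would it be worth the effort to migrate the code to boost::spirit?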

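I have prepared a stripped-down version of the code, with a few lines of a real file embedded, in case someone wants to test it and help.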

Note: as a requirement, VS 2010 must be used (unfortunately it is not fully C++11 compliant).

#include <iterator> // std::begin, std::end, std::distance
#include <string>
#include <vector>

#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>

const char input[] = "[2018-Mar-13 13:13:59.580482] - 0.200 s => Driver: 0 - Speed: 0.0 - Road: BTN-1002 - Km: 90.0 - SWITCH_ON: 1\n\
[2018-Mar-13 13:13:59.580482] - 0.200 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - SLOPE: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - GEAR: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - GEAR: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - SEQUENCE: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - CLUTCH: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.540 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - SEQUENCE: 4\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.3 - Road: A-11 - Km: 90.0 - SEQUENCE: 8\n\
[2018-Mar-13 13:14:01.819966] - 3.110 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.620424] - 3.240 s => Driver: 0 - Speed: 0.4 - Road: A-11 - Km: 90.1 - SEQUENCE: 15\n\
[2018-Mar-13 13:14:02.829983] - 3.450 s => Driver: 0 - Speed: 0.6 - Road: B-302 - Km: 90.1 - SLOPE: -5\n\
[2018-Mar-13 13:14:03.039600] - 3.660 s => Driver: 0 - Speed: 0.8 - Road: B-302 - Km: 90.1 - SEQUENCE: 21\n\
[2018-Mar-13 13:14:03.250451] - 3.870 s => Driver: 0 - Speed: 1.2 - Road: B-302 - Km: 90.2 - GEAR: 2\n\
[2018-Mar-13 13:14:03.460012] - 4.080 s => Driver: 0 - Speed: 1.7 - Road: B-302 - Km: 90.3 - SEQUENCE: 29\n\
[2018-Mar-13 13:14:03.669448] - 4.290 s => Driver: 0 - Speed: 2.2 - Road: B-302 - Km: 90.4 - SEQUENCE: 34\n\
[2018-Mar-13 13:14:03.880066] - 4.500 s => Driver: 0 - Speed: 2.8 - Road: B-302 - Km: 90.5 - CLUTCH: 1\n\
[2018-Mar-13 13:14:04.090444] - 4.710 s => Driver: 0 - Speed: 3.5 - Road: B-302 - Km: 90.7 - SEQUENCE: 45\n\
[2018-Mar-13 13:14:04.300160] - 4.920 s => Driver: 0 - Speed: 4.2 - Road: B-302 - Km: 90.9 - SLOPE: 10\n\
[2018-Mar-13 13:14:04.510025] - 5.130 s => Driver: 0 - Speed: 4.9 - Road: B-302 - Km: 91.1 - GEAR: 3";

// Subtract 1 so the range does not include the terminating NUL of the literal.
const auto len = std::distance(std::begin(input), std::end(input)) - 1;

struct Sequence
{
    int ms;
    int driver;
    int sequence;
    double time;
    double vel;
    double km;
    std::string date;
    std::string road;
};

namespace xp = boost::xpressive;

int main()
{
    Sequence data;
    std::vector<Sequence> sequences;

    using namespace xp;

    cregex real = (+_d >> '.' >> +_d);
    cregex keyword = " - SEQUENCE: " >> (+_d)[xp::ref(data.sequence) = as<int>(_)];
    cregex date = repeat<4>(_d) >> '-' >> repeat<3>(alpha) >> '-' >> repeat<2>(_d) >> _s >> repeat<2>(_d) >> ':' >> repeat<2>(_d) >> ':' >> repeat<2>(_d);

    cregex header = '[' >> date[xp::ref(data.date) = _] >> '.' >> (+_d)[xp::ref(data.ms) = as<int>(_)] >> "] - "
                    >> real[xp::ref(data.time) = as<double>(_)]
                    >> " s => Driver: " >> (+_d)[xp::ref(data.driver) = as<int>(_)]
                    >> " - Speed: " >> real[xp::ref(data.vel) = as<double>(_)]
                    >> " - Road: " >> (+set[alnum | '-'])[xp::ref(data.road) = _]
                    >> " - Km: " >> real[xp::ref(data.km) = as<double>(_)];

    xp::cregex parser = (header >> keyword >> _ln);

    xp::cregex_iterator cur(input, input + len, parser);
    xp::cregex_iterator end;

    for (; cur != end; ++cur)
        sequences.emplace_back(data);

    return 0;
}

Please keep the VS 2010 restriction in mind.


2 Answers


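I see roughly two areas for improvement:

  • You essentially parse all lines, including the ones you are not interested in
  • You allocate a lot of strings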

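I'd suggest using string views to fix the allocations. Next, you could try to avoid parsing lines that do not match the SEQUENCE pattern. In principle there is no reason this could not be done with Boost Xpressive, but my weapon of choice happens to be Boost Spirit, so I'll include that as well.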
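To illustrate why a string view avoids allocation, here is a minimal sketch that uses std::string_view (C++17, so not available in VS 2010) as a stand-in for boost::string_view. The helper field_after is a hypothetical name introduced only for this illustration: it returns a slice of the original buffer instead of copying characters into a new std::string.

```cpp
#include <string_view>

// Returns a view of the token that follows `key` in `line`, up to the next
// space or the end of the line. The view points into the buffer behind
// `line`; no characters are copied and nothing is allocated.
// Returns an empty view when `key` does not occur.
std::string_view field_after(std::string_view line, std::string_view key) {
    const auto pos = line.find(key);
    if (pos == std::string_view::npos) return {};
    std::string_view rest = line.substr(pos + key.size());
    return rest.substr(0, rest.find(' '));
}
```

The returned view is only valid while the underlying buffer is alive, which is the price paid for not copying.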

Being selective

You can detect the interesting lines before spending more effort on them, like so:

cregex signature = -*~_n >> " - SEQUENCE: " >> (+_d) >> before(_ln|eos); 
for (xp::cregex_iterator cur(b, e, signature), end; cur != end; ++cur) {
    std::cout << "'" << cur->str() << "'\n";
}

This prints

'[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - SEQUENCE: 1'
'[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - SEQUENCE: 4'
'[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.3 - Road: A-11 - Km: 90.0 - SEQUENCE: 8'
'[2018-Mar-13 13:14:02.620424] - 3.240 s => Driver: 0 - Speed: 0.4 - Road: A-11 - Km: 90.1 - SEQUENCE: 15'
'[2018-Mar-13 13:14:03.039600] - 3.660 s => Driver: 0 - Speed: 0.8 - Road: B-302 - Km: 90.1 - SEQUENCE: 21'
'[2018-Mar-13 13:14:03.460012] - 4.080 s => Driver: 0 - Speed: 1.7 - Road: B-302 - Km: 90.3 - SEQUENCE: 29'
'[2018-Mar-13 13:14:03.669448] - 4.290 s => Driver: 0 - Speed: 2.2 - Road: B-302 - Km: 90.4 - SEQUENCE: 34'
'[2018-Mar-13 13:14:04.090444] - 4.710 s => Driver: 0 - Speed: 3.5 - Road: B-302 - Km: 90.7 - SEQUENCE: 45'

Nothing gets allocated. This should be pretty fast.

Reducing allocations

For this I will switch to Spirit, because it makes things easier.

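Note: the real reason I switch here is that, unlike Boost Spirit, Xpressive does not seem to have extensible attribute propagation traits. That may just be my lack of experience with it.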

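The alternative approach would almost certainly replace the actions with manual propagation code, which in turn would call for named capture groups to keep things legible. I am not sure about the performance overhead of those, so let's not use them for now.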

You can use a trait to "teach" Qi how to assign text to a boost::string_view:

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

With that in place, the Qi grammar looks like this:

template <typename It> struct QiParser : qi::grammar<It, Sequence()> {
    QiParser() : QiParser::base_type(line) {
        using namespace qi;
        auto date_time = copy(
            repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
            repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

        line = '[' >> raw[date_time] >> "] - "
            >> double_ >> " s"
            >> " => Driver: "  >> int_
            >> " - Speed: "    >> double_
            >> " - Road: "     >> raw[+graph]
            >> " - Km: "       >> double_
            >> " - SEQUENCE: " >> int_
            >> (eol|eoi);
    }
  private:
    qi::rule<It, Sequence()> line;
};

Using it is quite simple, especially without the "selective" step.

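This happens to be the "winning" configuration. Here is a standalone, simplified version of the algorithm with all benchmark-related generics and options removed: Live on Coliru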

Benchmark results: a surprise

Using the selective parsing approach only makes the Xpressive approach slower: Interactive


With Spirit I also started out with the selective approach (fully expecting it to be faster). Here are the not-so-encouraging results: Interactive


Oops. The original Xpressive approach still comes out ahead!

Adjusting the hypothesis

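OK, so apparently doing a shallow scan first and a "full parse" afterwards hurts performance. In theory this may well come down to cache/prefetch effects. In addition, the linear approach may win because it is easier to detect that a line does not start with a '[' character than to check whether it ends with the SEQUENCE pattern.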

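So I decided to adapt the Spirit approach to the linear mode as well, to see whether the win from reduced allocations would still be worth it: Interactive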


Now we are getting results. Let's look in detail at the difference between the std::string and boost::string_view approaches: Interactive


Summary / conclusion

The reduced allocations are good for about 30% more efficiency. Overall, that is roughly a 10x improvement over the original approach.
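These figures can be checked against the mean timings in the sample text report further down. The following is just the arithmetic written out; the constant and function names are made up for this illustration, and the values are copied from that report.

```cpp
// Mean timings in microseconds, copied from the sample text report.
constexpr double xpressive_linear_us     = 156.418;  // parse_xpressive_linear-std::string
constexpr double spirit_linear_string_us = 21.2533;  // parse_spirit_linear-std::string
constexpr double spirit_linear_view_us   = 14.6677;  // parse_spirit_linear-boost::string_view

// How many times faster `after` is compared to `before`.
constexpr double speedup(double before, double after) { return before / after; }

// Percentage of the run time saved when going from `before` to `after`.
constexpr double percent_saved(double before, double after) {
    return 100.0 * (before - after) / before;
}
```

With these numbers, percent_saved(spirit_linear_string_us, spirit_linear_view_us) comes out at roughly 31, and speedup(xpressive_linear_us, spirit_linear_view_us) at roughly 10.7.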

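Note that the benchmark code goes to great lengths to eliminate unfair differences between the implementations (for example, by precompiling everything for both Spirit and Xpressive). See the full benchmark code below.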

The winning implementation in isolation: Live on Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
#include <cstring> // strlen
#include <iostream>
#include <iterator> // std::distance
#include <vector>

using It = char const*;

struct Sequence {
    int driver;
    int sequence;
    double time;
    double vel;
    double km;
    boost::string_view date;
    boost::string_view road;
};

BOOST_FUSION_ADAPT_STRUCT(::Sequence, date, time, driver, vel, road, km, sequence)

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

std::vector<Sequence> parse_spirit(It b, It e) {

    qi::rule<It, Sequence()> static const line = []{
        using namespace qi;
        auto date_time = copy(
            repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
            repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

        qi::rule<It, Sequence()> r = '[' >> raw[date_time] >> "] - "
            >> double_ >> " s"
            >> " => Driver: "  >> int_
            >> " - Speed: "    >> double_
            >> " - Road: "     >> raw[+graph]
            >> " - Km: "       >> double_
            >> " - SEQUENCE: " >> int_
            >> (eol|eoi);

        return r;
    }();

    std::vector<Sequence> sequences;

    parse(b, e, *boost::spirit::repository::qi::seek[line], sequences);

    return sequences;
}

static char input[] = /*... see question ...*/;
static const size_t len = strlen(input);

int main() {
    auto sequences = parse_spirit(input, input+len);
    std::cout << "Parsed: " << sequences.size() << " sequence lines\n";
}

Full benchmark code

The benchmark uses Nonius for measurement and statistical analysis.

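#include <cstring> // strlen
#include <functional> // std::function (mock nonius)
#include <string>
#include <utility> // std::forward, std::move
#include <vector>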

static char input[] = 
"[2018-Mar-13 13:13:59.580482] - 0.200 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - SLOPE: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - GEAR: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - GEAR: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - SEQUENCE: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - CLUTCH: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.540 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - SEQUENCE: 4\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.3 - Road: A-11 - Km: 90.0 - SEQUENCE: 8\n\
[2018-Mar-13 13:14:01.819966] - 3.110 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.620424] - 3.240 s => Driver: 0 - Speed: 0.4 - Road: A-11 - Km: 90.1 - SEQUENCE: 15\n\
[2018-Mar-13 13:14:02.829983] - 3.450 s => Driver: 0 - Speed: 0.6 - Road: B-302 - Km: 90.1 - SLOPE: -5\n\
[2018-Mar-13 13:14:03.039600] - 3.660 s => Driver: 0 - Speed: 0.8 - Road: B-302 - Km: 90.1 - SEQUENCE: 21\n\
[2018-Mar-13 13:14:03.250451] - 3.870 s => Driver: 0 - Speed: 1.2 - Road: B-302 - Km: 90.2 - GEAR: 2\n\
[2018-Mar-13 13:14:03.460012] - 4.080 s => Driver: 0 - Speed: 1.7 - Road: B-302 - Km: 90.3 - SEQUENCE: 29\n\
[2018-Mar-13 13:14:03.669448] - 4.290 s => Driver: 0 - Speed: 2.2 - Road: B-302 - Km: 90.4 - SEQUENCE: 34\n\
[2018-Mar-13 13:14:03.880066] - 4.500 s => Driver: 0 - Speed: 2.8 - Road: B-302 - Km: 90.5 - CLUTCH: 1\n\
[2018-Mar-13 13:14:04.090444] - 4.710 s => Driver: 0 - Speed: 3.5 - Road: B-302 - Km: 90.7 - SEQUENCE: 45\n\
[2018-Mar-13 13:14:04.300160] - 4.920 s => Driver: 0 - Speed: 4.2 - Road: B-302 - Km: 90.9 - SLOPE: 10\n\
[2018-Mar-13 13:13:59.580482] - 0.200 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - SLOPE: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.0 - Road: A-11 - Km: 90.0 - GEAR: 0\n\
[2018-Mar-13 13:14:01.170203] - 1.790 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - GEAR: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.1 - Road: A-11 - Km: 90.0 - SEQUENCE: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.440 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - CLUTCH: 1\n\
[2018-Mar-13 13:14:01.819966] - 2.540 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.2 - Road: A-11 - Km: 90.0 - SEQUENCE: 4\n\
[2018-Mar-13 13:14:02.409855] - 3.030 s => Driver: 0 - Speed: 0.3 - Road: A-11 - Km: 90.0 - SEQUENCE: 8\n\
[2018-Mar-13 13:14:01.819966] - 3.110 s => Backup to regestry\n\
[2018-Mar-13 13:14:02.620424] - 3.240 s => Driver: 0 - Speed: 0.4 - Road: A-11 - Km: 90.1 - SEQUENCE: 15\n\
[2018-Mar-13 13:14:02.829983] - 3.450 s => Driver: 0 - Speed: 0.6 - Road: B-302 - Km: 90.1 - SLOPE: -5\n\
[2018-Mar-13 13:14:03.039600] - 3.660 s => Driver: 0 - Speed: 0.8 - Road: B-302 - Km: 90.1 - SEQUENCE: 21\n\
[2018-Mar-13 13:14:03.250451] - 3.870 s => Driver: 0 - Speed: 1.2 - Road: B-302 - Km: 90.2 - GEAR: 2\n\
[2018-Mar-13 13:14:03.460012] - 4.080 s => Driver: 0 - Speed: 1.7 - Road: B-302 - Km: 90.3 - SEQUENCE: 29\n\
[2018-Mar-13 13:14:03.669448] - 4.290 s => Driver: 0 - Speed: 2.2 - Road: B-302 - Km: 90.4 - SEQUENCE: 34\n\
[2018-Mar-13 13:14:03.880066] - 4.500 s => Driver: 0 - Speed: 2.8 - Road: B-302 - Km: 90.5 - CLUTCH: 1\n\
[2018-Mar-13 13:14:04.090444] - 4.710 s => Driver: 0 - Speed: 3.5 - Road: B-302 - Km: 90.7 - SEQUENCE: 45\n\
[2018-Mar-13 13:14:04.300160] - 4.920 s => Driver: 0 - Speed: 4.2 - Road: B-302 - Km: 90.9 - SLOPE: 10\n\
[2018-Mar-13 13:14:04.510025] - 5.130 s => Driver: 0 - Speed: 4.9 - Road: B-302 - Km: 91.1 - GEAR: 3";
static const size_t len = strlen(input);

#include <boost/utility/string_view.hpp>
#include <boost/fusion/adapted/struct.hpp>

template <typename String> struct Sequence {
    int driver;
    int sequence;
    double time;
    double vel;
    double km;
    String date;
    String road;
};

BOOST_FUSION_ADAPT_TPL_STRUCT((T),(Sequence)(T), date, time, driver, vel, road, km, sequence)

// Declare implementations under test:
using It = char const*;
template <typename S> std::vector<S> parse_xpressive_linear(It b, It e);
template <typename S> std::vector<S> parse_xpressive_selective(It b, It e);
template <typename S> std::vector<S> parse_spirit_linear(It b, It e);
template <typename S> std::vector<S> parse_spirit_selective(It b, It e);

#ifdef VERIFY_OUTPUT
    #include <boost/fusion/include/io.hpp>
    using boost::fusion::operator<<;
    #include <iostream>

    #define VERIFY()                                                                    \
        do {                                                                            \
            std::cout << "L:" << __LINE__ << " Parsed: " << sequences.size() << "\n";   \
            for (auto r : sequences) {                                                  \
                std::cout << r << "\n";                                                 \
            }                                                                           \
        } while (0)
#else
    #define VERIFY() do { } while (0)
#endif

#ifdef USE_NONIUS
    #include <nonius/benchmark.h++>
    #define NONIUS_RUNNER
    #include <nonius/main.h++>
#else
    // mock nonius
    namespace nonius {
        struct chronometer{
            template <typename F> static inline void measure(F&& f) { std::forward<F>(f)(); }
        };
        static std::vector<std::function<void(chronometer)>> s_benchmarks;
        #define TOKENPASTE(x, y) x ## y
        #define TOKENPASTE2(x, y) TOKENPASTE(x, y)
        #define NONIUS_BENCHMARK(name, f) static auto TOKENPASTE2(s_reg_, __LINE__) = []{ ::nonius::s_benchmarks.push_back(f); return 42; }();

        void run() { for (auto& b : s_benchmarks) b({}); }
    }

    int main() {
        nonius::run();
    }
#endif

template <typename R>
void do_test_kernel(nonius::chronometer& cm, std::vector<R> (*f)(It, It)) {
    std::vector<R> sequences;
    cm.measure([&sequences,f]{ sequences = f(input, input + len); });
    VERIFY();
}

#define TEST_CASE(name, string) NONIUS_BENCHMARK(#name"-"#string, [](nonius::chronometer cm) { do_test_kernel(cm, &name<Sequence<string> >); })
// Xpressive doesn't support string_view
TEST_CASE(parse_xpressive_linear,    std::string)
TEST_CASE(parse_xpressive_selective, std::string)

TEST_CASE(parse_spirit_linear,       std::string)
TEST_CASE(parse_spirit_linear,       boost::string_view)
TEST_CASE(parse_spirit_selective,    std::string)
TEST_CASE(parse_spirit_selective,    boost::string_view)

#include <boost/xpressive/xpressive.hpp>
#include <boost/xpressive/regex_actions.hpp>

namespace xp = boost::xpressive;

namespace XpressiveDetail {
    using namespace xp;

    struct Scanner {
        cregex scan {-*~xp::_n >> " - SEQUENCE: " >> (+xp::_d) >> xp::_ln};
    };

    template <typename Seq> struct Parser : Scanner {
        mutable Seq seq; // non-thread-safe, but fairer to compare to Spirit

        cregex real    = (+_d >> '.' >> +_d);
        cregex keyword = " - SEQUENCE: " >> (+_d)[xp::ref(seq.sequence) = as<int>(_)];
        cregex date    = repeat<4>(_d) >> '-' 
            >> repeat<3>(alpha) >> '-' 
            >> repeat<2>(_d) 
            >> _s 
            >> repeat<2>(_d) >> ':' 
            >> repeat<2>(_d) >> ':' 
            >> repeat<2>(_d)
            >> '.' >> (+_d);

        cregex header = '[' >> date[xp::ref(seq.date) = _] >> "] - "
            >> real[xp::ref(seq.time) = as<double>(_)]
            >> " s => Driver: " >> (+_d)             [ xp ::ref(seq.driver) = as<int>(_) ]
            >> " - Speed: "     >> real              [ xp ::ref(seq.vel)    = as<double>(_) ]
            >> " - Road: "      >> (+set[alnum|'-']) [ xp ::ref(seq.road)   = _ ]
            >> " - Km: "        >> real              [ xp ::ref(seq.km)     = as<double>(_) ];

        cregex parser = (header >> keyword >> _ln);
    };
}

template <typename Seq>
std::vector<Seq> parse_xpressive_linear(It b, It e) {
    std::vector<Seq> sequences;
    using namespace xp;

    static const XpressiveDetail::Parser<Seq> precompiled{};

    for (xp::cregex_iterator cur(b, e, precompiled.parser), end; cur != end; ++cur)
        sequences.push_back(std::move(precompiled.seq));

    return sequences;
}

template <typename Seq>
std::vector<Seq> parse_xpressive_selective(It b, It e) {
    std::vector<Seq> sequences;
    using namespace xp;

    static const XpressiveDetail::Parser<Seq> precompiled{};
    xp::match_results<It> m;

    for (auto& match : boost::make_iterator_range(xp::cregex_iterator{b, e, precompiled.scan}, {})) {
        if (xp::regex_match(match[0].first, match[0].second, m, precompiled.parser))
            sequences.push_back(std::move(precompiled.seq));
    }

    return sequences;
}

//#define BOOST_SPIRIT_DEBUG
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix.hpp>
namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

template <typename It, typename Attribute> struct QiParser : qi::grammar<It, Attribute()> {
    QiParser() : QiParser::base_type(line) {
        using namespace qi;
        auto date_time = copy(
            repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
            repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

        line = '[' >> eps(clear(_val)) >> raw[date_time] >> "] - "
            >> double_ >> " s"
            >> " => Driver: "  >> int_
            >> " - Speed: "    >> double_
            >> " - Road: "     >> raw[+graph]
            >> " - Km: "       >> double_
            >> " - SEQUENCE: " >> int_
            >> (eol|eoi);

        BOOST_SPIRIT_DEBUG_NODES((line))
    }
  private:
    struct clear_f {
        // only required for linear approach to std::string-based
        bool operator()(Sequence<std::string>& v)      const { v = {};      return true; }
        bool operator()(Sequence<boost::string_view>&) const { /*no_op();*/ return true; }
    };
    boost::phoenix::function<clear_f> clear;

    qi::rule<It, Attribute()> line;
};

template <typename Seq = Sequence<std::string> >
std::vector<Seq> parse_spirit_selective(It b, It e) {
    static QiParser<It, Seq> const qi_parser{};
    static XpressiveDetail::Scanner const precompiled{};

    std::vector<Seq> sequences;

    for (auto& match : boost::make_iterator_range(xp::cregex_iterator{b, e, precompiled.scan}, {})) {
        Seq r;
        if (parse(match[0].first, match[0].second, qi_parser, r))
            sequences.push_back(r);
    }

    return sequences;
}

#include <boost/spirit/repository/include/qi_seek.hpp>

template <typename Seq = Sequence<std::string> >
std::vector<Seq> parse_spirit_linear(It b, It e) {
    using boost::spirit::repository::qi::seek;

    static QiParser<It, Seq> const qi_parser{};

    std::vector<Seq> sequences;
    parse(b, e, *seek[qi_parser], sequences);
    return sequences;
}
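When Nonius is not available, the listing above falls back to a mock that simply runs each benchmark once without timing it. For a crude measurement in that situation, something along these lines could be used. It is a minimal sketch based on std::chrono, with none of the statistical analysis Nonius performs; mean_microseconds is a hypothetical helper and not part of Nonius.

```cpp
#include <chrono>
#include <cstddef>

// Runs `f` `iterations` times and returns the mean wall-clock time per call
// in microseconds. No warm-up, outlier detection or confidence intervals.
template <typename F>
double mean_microseconds(F&& f, std::size_t iterations) {
    using clock = std::chrono::steady_clock;
    if (iterations == 0) return 0.0;

    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) f();
    const auto stop = clock::now();

    const std::chrono::duration<double, std::micro> total = stop - start;
    return total.count() / static_cast<double>(iterations);
}
```

A call such as mean_microseconds([&]{ sequences = f(input, input + len); }, 1000) would give a rough per-run figure, though the optimizer and cache state can distort such simple loops.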

Sample text report:

clock resolution: mean is 17.7534 ns (40960002 iterations)

benchmarking parse_xpressive_linear-std::string
collecting 100 samples, 1 iterations each, in estimated 15.7252 ms
mean: 156.418 μs, lb 155.863 μs, ub 158.24 μs, ci 0.95
std dev: 4.62848 μs, lb 1637.89 ns, ub 10.4043 μs, ci 0.95
found 4 outliers among 100 samples (4%)
variance is moderately inflated by outliers

benchmarking parse_xpressive_selective-std::string
collecting 100 samples, 1 iterations each, in estimated 31.5459 ms
mean: 313.992 μs, lb 313.39 μs, ub 315.599 μs, ci 0.95
std dev: 4.5415 μs, lb 1105.98 ns, ub 9.07809 μs, ci 0.95
found 11 outliers among 100 samples (11%)
variance is slightly inflated by outliers

benchmarking parse_spirit_linear-std::string
collecting 100 samples, 1 iterations each, in estimated 2.1556 ms
mean: 21.2533 μs, lb 21.1623 μs, ub 21.6854 μs, ci 0.95
std dev: 870.481 ns, lb 53.2809 ns, ub 2.0738 μs, ci 0.95
found 7 outliers among 100 samples (7%)
variance is moderately inflated by outliers

benchmarking parse_spirit_linear-boost::string_view
collecting 100 samples, 2 iterations each, in estimated 2.944 ms
mean: 14.6677 μs, lb 14.6342 μs, ub 14.8279 μs, ci 0.95
std dev: 318.252 ns, lb 22.5097 ns, ub 757.555 ns, ci 0.95
found 5 outliers among 100 samples (5%)
variance is moderately inflated by outliers

benchmarking parse_spirit_selective-std::string
collecting 100 samples, 1 iterations each, in estimated 27.5512 ms
mean: 273.052 μs, lb 272.77 μs, ub 273.952 μs, ci 0.95
std dev: 2.31473 μs, lb 835.184 ns, ub 5.1322 μs, ci 0.95
found 10 outliers among 100 samples (10%)
variance is unaffected by outliers

benchmarking parse_spirit_selective-boost::string_view
collecting 100 samples, 1 iterations each, in estimated 27.0766 ms
mean: 269.446 μs, lb 269.208 μs, ub 270.268 μs, ci 0.95
std dev: 2.01634 μs, lb 627.834 ns, ub 4.56949 μs, ci 0.95
found 10 outliers among 100 samples (10%)
variance is unaffected by outliers

answered 2018-03-14T22:58:36.390

You can use Fusion together with Spirit traits (see e.g. "Parsing into several vector members"), but I would consider using semantic actions.

This is the design conundrum:

Separate vectors, using a trait

Live On Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
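#include <boost/fusion/include/io.hpp> // operator<< for adapted structs
#include <boost/variant.hpp>
#include <cassert>
#include <cstring> // strlen
#include <iostream>
#include <vector>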

using It = char const*;

struct BaseEvent {
    int driver;
    int sequence;
    double time;
    double vel;
    double km;
    boost::string_view date;
    boost::string_view road;
};
struct Sequence : BaseEvent{};
struct Clutch : BaseEvent{};
struct Gear : BaseEvent{};

BOOST_FUSION_ADAPT_STRUCT(::Sequence, date, time, driver, vel, road, km, sequence)
BOOST_FUSION_ADAPT_STRUCT(::Clutch, date, time, driver, vel, road, km, sequence)
BOOST_FUSION_ADAPT_STRUCT(::Gear, date, time, driver, vel, road, km, sequence)

struct LogEvents {
    std::vector<Sequence> sequence;
    std::vector<Clutch> clutch;
    std::vector<Gear> gear;

    void add(Sequence const& s) { sequence.push_back(s); }
    void add(Clutch   const& c) { clutch.push_back(c);   }
    void add(Gear     const& g) { gear.push_back(g);     }
};

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };

    template <> struct is_container<LogEvents> : std::true_type {};

    template <> struct container_value<LogEvents> {
        using type = boost::variant<::Sequence, ::Clutch, ::Gear>;
    };

    template <typename T> struct push_back_container<LogEvents, T> {
        struct Visitor {
            LogEvents& _log;
            template <typename U> void operator()(U const& ev) const { _log.add(ev); }
            using result_type = void;
        };

        template <typename... U>
        static bool call(LogEvents& log, boost::variant<U...> const& attribute) {
            boost::apply_visitor(Visitor{log}, attribute);
            return true;
        }
    };
} } }


namespace QiParsers {
    template <typename It, typename Attribute>
    struct BaseEventParser : qi::grammar<It, Attribute()> {
        BaseEventParser(std::string const& event_type) : BaseEventParser::base_type(start) {
            using namespace qi;
            auto date_time = copy(
                    repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
                    repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

            start 
                = '[' >> raw[date_time] >> "] - "
                >> double_ >> " s"
                >> " => Driver: "  >> int_
                >> " - Speed: "    >> double_
                >> " - Road: "     >> raw[+graph]
                >> " - Km: "       >> double_
                >> " - " >> lit(event_type) >> ": " >> int_
                >> (eol|eoi);
        }

      private:
        qi::rule<It, Attribute()> start;
    };
}

LogEvents parse_spirit(It b, It e) {
    QiParsers::BaseEventParser<It, ::Sequence> sequence("SEQUENCE");
    QiParsers::BaseEventParser<It, ::Clutch>   clutch("CLUTCH");
    QiParsers::BaseEventParser<It, ::Gear>     gear("GEAR");

    LogEvents events;
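    // Keep the parse call outside assert(): with NDEBUG defined the assert
    // expression is not evaluated and nothing would be parsed.
    bool const ok = parse(b, e, *boost::spirit::repository::qi::seek[sequence|clutch|gear], events);
    assert(ok);
    (void) ok;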

    return events;
}

static char input[] = /* see question */;
static const size_t len = strlen(input);

int main() {
    auto events = parse_spirit(input, input+len);
    std::cout << "Events: "
        << events.sequence.size() << " sequence, "
        << events.clutch.size() << " clutch, "
        << events.gear.size() << " gear events\n";

    using boost::fusion::operator<<;
    for (auto& s : events.sequence) { std::cout << "SEQUENCE: " <<  s << "\n"; }
    for (auto& c : events.clutch)   { std::cout << "CLUTCH:   " <<  c << "\n"; }
    for (auto& g : events.gear)     { std::cout << "GEAR:     " <<  g << "\n"; }
}

Flipping it around: one vector<variant<>>

Wouldn't it make more sense to have a single vector of variants?

Live On Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
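#include <boost/fusion/include/io.hpp> // operator<< for adapted structs
#include <boost/variant.hpp>
#include <cstring> // strlen
#include <iostream>
#include <string>
#include <vector>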

using It = char const*;

namespace MyEvents {
    struct BaseEvent {
        int driver;
        int sequence;
        double time;
        double vel;
        double km;
        boost::string_view date;
        boost::string_view road;
    };
    struct Sequence : BaseEvent{};
    struct Clutch : BaseEvent{};
    struct Gear : BaseEvent{};

    using LogEvent = boost::variant<Sequence, Clutch, Gear>;
    using LogEvents = std::vector<LogEvent>;
}

BOOST_FUSION_ADAPT_STRUCT(MyEvents::Sequence, date, time, driver, vel, road, km, sequence)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::Clutch,   date, time, driver, vel, road, km, sequence)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::Gear,     date, time, driver, vel, road, km, sequence)

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

namespace QiParsers {
    template <typename It, typename Attribute>
    struct BaseEventParser : qi::grammar<It, Attribute()> {
        BaseEventParser(std::string const& event_type) : BaseEventParser::base_type(start) {
            using namespace qi;
            auto date_time = copy(
                    repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
                    repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit);

            start 
                = '[' >> raw[date_time] >> "] - "
                >> double_ >> " s"
                >> " => Driver: "  >> int_
                >> " - Speed: "    >> double_
                >> " - Road: "     >> raw[+graph]
                >> " - Km: "       >> double_
                >> " - " >> lit(event_type) >> ": " >> int_
                >> (eol|eoi);
        }

      private:
        qi::rule<It, Attribute()> start;
    };

    template <typename It>
    struct LogParser : qi::grammar<It, MyEvents::LogEvents()> {
        LogParser() : LogParser::base_type(start) {
            using namespace qi;
            using boost::spirit::repository::qi::seek;

            event = sequence | clutch | gear ; // TODO add types
            start = *seek[event];
        }

      private:
        qi::rule<It, MyEvents::LogEvents()> start;
        qi::rule<It, MyEvents::LogEvent()> event;
        BaseEventParser<It, MyEvents::Sequence> sequence{"SEQUENCE"};
        BaseEventParser<It, MyEvents::Clutch>   clutch{"CLUTCH"};
        BaseEventParser<It, MyEvents::Gear>     gear{"GEAR"};
    };
}

MyEvents::LogEvents parse_spirit(It b, It e) {
    static QiParsers::LogParser<It> const parser {};

    MyEvents::LogEvents events;
    parse(b, e, parser, events);

    return events;
}

static char input[] = /* see question */;
static const size_t len = strlen(input);

namespace MyEvents { // for debug/demo
    using boost::fusion::operator<<;
    static inline char const* kind(Sequence const&) { return "SEQUENCE"; }
    static inline char const* kind(Clutch   const&) { return "CLUTCH"; }
    static inline char const* kind(Gear     const&) { return "GEAR"; }

    struct KindVisitor : boost::static_visitor<char const*> {
        template <typename T> char const* operator()(T const& ev) const { return kind(ev); }
    };
    static inline char const* kind(LogEvent const& ev) {
        return boost::apply_visitor(KindVisitor{}, ev);
    }
}

int main() {
    auto events = parse_spirit(input, input+len);
    std::cout << "Parsed: " << events.size() << " events\n";

    for (auto& e : events)
        std::cout << kind(e) << ": " << e << "\n"; 
}
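Stripped of the Spirit grammar, the data-structure side of this design can be sketched with the standard library alone. The block below uses std::variant (C++17) as a stand-in for boost::variant and reduces each event to a single hypothetical value field; count_of is likewise a name introduced only for this illustration.

```cpp
#include <cstddef>
#include <variant>
#include <vector>

// Simplified event types: one payload field each instead of the full record.
struct Sequence { int value; };
struct Clutch   { int value; };
struct Gear     { int value; };

using LogEvent  = std::variant<Sequence, Clutch, Gear>;
using LogEvents = std::vector<LogEvent>;

inline const char* kind(const Sequence&) { return "SEQUENCE"; }
inline const char* kind(const Clutch&)   { return "CLUTCH"; }
inline const char* kind(const Gear&)     { return "GEAR"; }

// Dispatches to the overload that matches the alternative currently held.
inline const char* kind(const LogEvent& ev) {
    return std::visit([](const auto& e) { return kind(e); }, ev);
}

// Counts the events of type T while the log keeps its original order.
template <typename T>
std::size_t count_of(const LogEvents& events) {
    std::size_t n = 0;
    for (const auto& ev : events)
        if (std::holds_alternative<T>(ev)) ++n;
    return n;
}
```

Compared with three separate vectors, the single vector preserves the order in which events appeared in the log, and per-kind views can still be derived when needed.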

Generalizing: common fields and other events

This holds especially if you go on to generalize further:

Live On Coliru

#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
#include <boost/variant.hpp>
#include <cstring> // strlen
#include <iostream>
#include <string>
#include <vector>

using It = char const*;

namespace MyEvents {
    enum class Kind { Sequence, Clutch, Gear, Slope, Other };

    struct CommonFields {
        boost::string_view date;
        double duration;
    };

    struct BaseEvent {
        CommonFields common;
        int driver;
        int event_id;
        double vel;
        double km;
        boost::string_view road;
        Kind kind;
    };

    struct OtherEvent {
        CommonFields common;
        std::string message;
    };

    using LogEvent = boost::variant<BaseEvent, OtherEvent>;
    using LogEvents = std::vector<LogEvent>;
}

BOOST_FUSION_ADAPT_STRUCT(MyEvents::CommonFields, date, duration)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::BaseEvent, common, driver, vel, road, km, kind, event_id)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::OtherEvent, common, message)

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

namespace QiParsers {
    template <typename It>
    struct LogParser : qi::grammar<It, MyEvents::LogEvents()> {
        using Kind = MyEvents::Kind;

        LogParser() : LogParser::base_type(start) {
            using namespace qi;

            kind.add
                ("SEQUENCE", Kind::Sequence)
                ("CLUTCH", Kind::Clutch)
                ("GEAR", Kind::Gear)
                ("SLOPE", Kind::Slope)
                ;

            common_fields
                = '[' >> raw[
                        repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
                        repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit
                ] >> "]"
                >> " - " >> double_ >> " s";

            base_event
                = common_fields
                >> " => Driver: "  >> int_
                >> " - Speed: "    >> double_
                >> " - Road: "     >> raw[+graph]
                >> " - Km: "       >> double_
                >> " - " >> kind >> ": " >> int_;

            other_event
                = common_fields
                >> " => " >> *~char_("\r\n");

            event 
                = (base_event | other_event) 
                >> (eol|eoi);

            start = *boost::spirit::repository::qi::seek[event];
        }

      private:
        qi::rule<It, MyEvents::LogEvents()> start;
        qi::rule<It, MyEvents::LogEvent()> event;

        qi::rule<It, MyEvents::CommonFields()> common_fields;
        qi::rule<It, MyEvents::BaseEvent()> base_event;
        qi::rule<It, MyEvents::OtherEvent()> other_event;

        qi::symbols<char, MyEvents::Kind> kind;
    };
}

MyEvents::LogEvents parse_spirit(It b, It e) {
    static QiParsers::LogParser<It> const parser {};

    MyEvents::LogEvents events;
    parse(b, e, parser, events);

    return events;
}

static char input[] = /* see question */;
static const size_t len = strlen(input);

namespace MyEvents { // for debug/demo
    using boost::fusion::operator<<;

    static inline Kind getKind(BaseEvent const& be) { return be.kind; }
    static inline Kind getKind(OtherEvent const&) { return Kind::Other; }

    struct KindVisitor : boost::static_visitor<Kind> {
        template <typename T> Kind operator()(T const& ev) const { return getKind(ev); }
    };
    static inline Kind getKind(LogEvent const& ev) {
        return boost::apply_visitor(KindVisitor{}, ev);
    }

    static inline std::ostream& operator<<(std::ostream& os, Kind k) {
        switch(k) {
            case Kind::Sequence: return os << "SEQUENCE";
            case Kind::Clutch:   return os << "CLUTCH";
            case Kind::Gear:     return os << "GEAR";
            case Kind::Slope:    return os << "SLOPE";
            case Kind::Other:    return os << "(Other)";
        }
        return os;
    }
}

int main() {
    auto events = parse_spirit(input, input+len);
    std::cout << "Parsed: " << events.size() << " events\n";

    for (auto& e : events)
        std::cout << getKind(e) << ": " << e << "\n"; 
}

Prints e.g.

Parsed: 37 events
SLOPE: ((2018-Mar-13 13:13:59.580482 0.2) 0 0 A-11 90 SLOPE 0)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0 A-11 90 GEAR 0)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0.1 A-11 90 GEAR 1)
SEQUENCE: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.1 A-11 90 SEQUENCE 1)
CLUTCH: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.2 A-11 90 CLUTCH 1)
(Other): ((2018-Mar-13 13:14:01.819966 2.54) Backup to regestry)
[...]

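Bonus: Multi-Index

If you use a multi-index container, you can have your cake and eat it too.

Here is a sample definition that lets you index the vector by a few rather arbitrarily chosen features: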

#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/composite_key.hpp>
#include <boost/multi_index/global_fun.hpp>
#include <functional> // std::reference_wrapper

namespace Indexing {
    namespace bmi = boost::multi_index;

    using MyEvents::LogEvent;

    double getDuration(LogEvent const& ev) { return getCommon(ev).duration; }

    using Table = bmi::multi_index_container<
        std::reference_wrapper<LogEvent const>, //LogEvent,
        bmi::indexed_by<
            bmi::ordered_non_unique<
                bmi::tag<struct primary>,
                bmi::composite_key<
                    LogEvent,
                    bmi::global_fun<LogEvent const&, MyEvents::Kind, MyEvents::getKind>,
                    bmi::global_fun<LogEvent const&, int,            MyEvents::getEventId>
                >
            >,
            bmi::ordered_non_unique<
                bmi::tag<struct duration>,
                bmi::global_fun<LogEvent const&, double, getDuration>
            >
        >
    >;
}

Now you can do fun things like:

Indexing::Table idx(events.begin(), events.end());

/*
 * // To print all events, grouped by kind and event id:
 * for (MyEvents::LogEvent const& e : idx)
 *     std::cout << getKind(e) << ": " << e << "\n"; 
 *
 * // Ordered by duration:
 * for (MyEvents::LogEvent const& e : idx.get<Indexing::duration>())
 *     std::cout << getKind(e) << ": " << e << "\n"; 
 */

std::cout << "\nAll GEAR events ordered by event id:\n";
for (MyEvents::LogEvent const& e : make_iterator_range(idx.equal_range(make_tuple(Kind::Gear))))
    std::cout << getKind(e) << ": " << e << "\n"; 

std::cout << "\nOnly the SLOPE events with id 10:\n";
for (MyEvents::LogEvent const& e : make_iterator_range(idx.equal_range(make_tuple(Kind::Slope, 10))))
    std::cout << getKind(e) << ": " << e << "\n"; 

std::cout << "\nEvents with durations in [2s..3s):\n";
auto& by_dur = idx.get<Indexing::duration>();

// lower_bound(3) as the end keeps the range half-open: 2 <= duration < 3
for (MyEvents::LogEvent const& e : make_iterator_range(by_dur.lower_bound(2), by_dur.lower_bound(3)))
    std::cout << getKind(e) << ": " << e << "\n"; 
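
To make the lookups above less magical, here is a rough standard-library approximation of what the two indexes do. It is only a sketch under stated assumptions: `Ev` is a hypothetical stripped-down event carrying just the key fields, and `index_by_kind_id`, `of_kind` and `in_duration_range` are made-up helper names, not part of Boost.MultiIndex. The point is that a composite key orders by `(kind, event_id)`, so a lookup on the first component alone still yields one contiguous range.

```cpp
#include <algorithm>
#include <functional>
#include <tuple>
#include <utility>
#include <vector>

enum class Kind { Sequence, Clutch, Gear, Slope, Other };

// Hypothetical stand-in for a parsed event: only the fields used as keys.
struct Ev {
    Kind   kind;
    int    event_id;
    double duration;
};

// Non-owning references, like the reference_wrapper elements of the Table.
// The referenced vector must outlive any Refs built from it.
using Ref  = std::reference_wrapper<Ev const>;
using Refs = std::vector<Ref>;

// View ordered by (kind, event_id), mimicking the `primary` index.
inline Refs index_by_kind_id(std::vector<Ev> const& events) {
    Refs idx(events.begin(), events.end());
    std::stable_sort(idx.begin(), idx.end(), [](Ev const& a, Ev const& b) {
        return std::tie(a.kind, a.event_id) < std::tie(b.kind, b.event_id);
    });
    return idx;
}

// Partial-key lookup: all entries of one kind, still ordered by event id.
inline std::pair<Refs::const_iterator, Refs::const_iterator>
of_kind(Refs const& idx, Kind k) {
    struct ByKind {
        bool operator()(Ev const& e, Kind key) const { return e.kind < key; }
        bool operator()(Kind key, Ev const& e) const { return key < e.kind; }
    };
    return std::equal_range(idx.begin(), idx.end(), k, ByKind{});
}

// Entries with lo <= duration < hi, mimicking a range query on `duration`.
inline Refs in_duration_range(std::vector<Ev> const& events, double lo, double hi) {
    Refs idx(events.begin(), events.end());
    std::stable_sort(idx.begin(), idx.end(),
        [](Ev const& a, Ev const& b) { return a.duration < b.duration; });
    auto less_than = [](Ev const& e, double d) { return e.duration < d; };
    auto first = std::lower_bound(idx.begin(), idx.end(), lo, less_than);
    auto last  = std::lower_bound(first, idx.end(), hi, less_than);
    return Refs(first, last);
}
```

Unlike this sketch, the multi-index container maintains all of its orderings at once and keeps them consistent as elements are inserted or removed, which is the reason to prefer it once more than one lookup pattern is needed.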

Live On Coliru

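#include <boost/fusion/adapted/struct.hpp>
#include <boost/fusion/include/io.hpp> // fusion::operator<< used for printing
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/repository/include/qi_seek.hpp>
#include <boost/utility/string_view.hpp>
#include <boost/variant.hpp>
#include <cstring> // strlen
#include <iostream>
#include <string>
#include <vector>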

using It = char const*;

namespace MyEvents {
    enum class Kind { Sequence, Clutch, Gear, Slope, Other };

    struct CommonFields {
        boost::string_view date;
        double duration;
    };

    struct BaseEvent {
        CommonFields common;
        int driver;
        int event_id;
        double vel;
        double km;
        boost::string_view road;
        Kind kind;
    };

    struct OtherEvent {
        CommonFields common;
        std::string message;
    };

    using LogEvent = boost::variant<BaseEvent, OtherEvent>;
    using LogEvents = std::vector<LogEvent>;
}

BOOST_FUSION_ADAPT_STRUCT(MyEvents::CommonFields, date, duration)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::BaseEvent, common, driver, vel, road, km, kind, event_id)
BOOST_FUSION_ADAPT_STRUCT(MyEvents::OtherEvent, common, message)

namespace qi = boost::spirit::qi;

namespace boost { namespace spirit { namespace traits {
    template <typename It>
    struct assign_to_attribute_from_iterators<boost::string_view, It, void> {
        static inline void call(It f, It l, boost::string_view& attr) { attr = boost::string_view { &*f, size_t(std::distance(f,l)) }; }
    };
} } }

namespace QiParsers {
    template <typename It>
    struct LogParser : qi::grammar<It, MyEvents::LogEvents()> {
        using Kind = MyEvents::Kind;

        LogParser() : LogParser::base_type(start) {
            using namespace qi;

            kind.add
                ("SEQUENCE", Kind::Sequence)
                ("CLUTCH", Kind::Clutch)
                ("GEAR", Kind::Gear)
                ("SLOPE", Kind::Slope)
                ;

            common_fields
                = '[' >> raw[
                        repeat(4)[digit] >> '-' >> repeat(3)[alpha] >> '-' >> repeat(2)[digit] >> ' ' >> 
                        repeat(2)[digit] >> ':' >> repeat(2)[digit] >> ':' >> repeat(2)[digit] >> '.' >> +digit
                ] >> "]"
                >> " - " >> double_ >> " s";

            base_event
                = common_fields
                >> " => Driver: "  >> int_
                >> " - Speed: "    >> double_
                >> " - Road: "     >> raw[+graph]
                >> " - Km: "       >> double_
                >> " - " >> kind >> ": " >> int_;

            other_event
                = common_fields
                >> " => " >> *~char_("\r\n");

            event 
                = (base_event | other_event) 
                >> (eol|eoi);

            start = *boost::spirit::repository::qi::seek[event];
        }

      private:
        qi::rule<It, MyEvents::LogEvents()> start;
        qi::rule<It, MyEvents::LogEvent()> event;

        qi::rule<It, MyEvents::CommonFields()> common_fields;
        qi::rule<It, MyEvents::BaseEvent()> base_event;
        qi::rule<It, MyEvents::OtherEvent()> other_event;

        qi::symbols<char, MyEvents::Kind> kind;
    };
}

MyEvents::LogEvents parse_spirit(It b, It e) {
    static QiParsers::LogParser<It> const parser {};

    MyEvents::LogEvents events;
    parse(b, e, parser, events);

    return events;
}

static char input[] = /* see question */;
static const size_t len = strlen(input);

namespace MyEvents { // for debug/demo
    using boost::fusion::operator<<;

    static inline CommonFields const& getCommon(BaseEvent const& be) { return be.common; }
    static inline CommonFields const& getCommon(OtherEvent const& oe) { return oe.common; }
    static inline Kind getKind(BaseEvent const& be) { return be.kind; }
    static inline Kind getKind(OtherEvent const&) { return Kind::Other; }
    static inline int getEventId(BaseEvent const& be) { return be.event_id; }
    static inline int getEventId(OtherEvent const&) { return 0; }

#define IMPL_DISPATCH(name, T)                                                                     \
    struct name##Visitor : boost::static_visitor<T> {                                              \
        template <typename E> T operator()(E const &ev) const { return name(ev); }                 \
    };                                                                                             \
    static inline T name(LogEvent const &ev) { return boost::apply_visitor(name##Visitor{}, ev); }

    IMPL_DISPATCH(getCommon, CommonFields const&)
    IMPL_DISPATCH(getKind, Kind)
    IMPL_DISPATCH(getEventId, int)

    static inline std::ostream& operator<<(std::ostream& os, Kind k) {
        switch(k) {
            case Kind::Sequence: return os << "SEQUENCE";
            case Kind::Clutch:   return os << "CLUTCH";
            case Kind::Gear:     return os << "GEAR";
            case Kind::Slope:    return os << "SLOPE";
            case Kind::Other:    return os << "(Other)";
        }
        return os;
    }
}

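#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/composite_key.hpp>
#include <boost/multi_index/global_fun.hpp>
#include <boost/range/iterator_range.hpp> // boost::make_iterator_range
#include <boost/tuple/tuple.hpp>          // boost::make_tuple
#include <functional>                     // std::reference_wrapper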

namespace Indexing {
    namespace bmi = boost::multi_index;

    using MyEvents::LogEvent;

    double getDuration(LogEvent const& ev) { return getCommon(ev).duration; }

    using Table = bmi::multi_index_container<
        std::reference_wrapper<LogEvent const>, //LogEvent,
        bmi::indexed_by<
            bmi::ordered_non_unique<
                bmi::tag<struct primary>,
                bmi::composite_key<
                    LogEvent,
                    bmi::global_fun<LogEvent const&, MyEvents::Kind, MyEvents::getKind>,
                    bmi::global_fun<LogEvent const&, int,            MyEvents::getEventId>
                >
            >,
            bmi::ordered_non_unique<
                bmi::tag<struct duration>,
                bmi::global_fun<LogEvent const&, double, getDuration>
            >
        >
    >;
}

using boost::make_iterator_range;
using boost::make_tuple;

int main() {
    using MyEvents::LogEvent;
    using MyEvents::Kind;

    auto events = parse_spirit(input, input+len);
    std::cout << "Parsed: " << events.size() << " events\n";

    Indexing::Table idx(events.begin(), events.end());

    /*
     * // To print all events, grouped by kind and event id:
     * for (MyEvents::LogEvent const& e : idx)
     *     std::cout << getKind(e) << ": " << e << "\n"; 
     *
     * // Ordered by duration:
     * for (MyEvents::LogEvent const& e : idx.get<Indexing::duration>())
     *     std::cout << getKind(e) << ": " << e << "\n"; 
     */

    std::cout << "\nAll GEAR events ordered by event id:\n";
    for (MyEvents::LogEvent const& e : make_iterator_range(idx.equal_range(make_tuple(Kind::Gear))))
        std::cout << getKind(e) << ": " << e << "\n"; 

    std::cout << "\nOnly the SLOPE events with id 10:\n";
    for (MyEvents::LogEvent const& e : make_iterator_range(idx.equal_range(make_tuple(Kind::Slope, 10))))
        std::cout << getKind(e) << ": " << e << "\n"; 

    std::cout << "\nEvents with durations in [2s..3s):\n";
    auto& by_dur = idx.get<Indexing::duration>();

    // lower_bound(3) as the end keeps the range half-open: 2 <= duration < 3
    for (MyEvents::LogEvent const& e : make_iterator_range(by_dur.lower_bound(2), by_dur.lower_bound(3)))
        std::cout << getKind(e) << ": " << e << "\n"; 
}

Prints:

Parsed: 37 events

All GEAR events ordered by event id:
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0 A-11 90 GEAR 0)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0 A-11 90 GEAR 0)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0.1 A-11 90 GEAR 1)
GEAR: ((2018-Mar-13 13:14:01.170203 1.79) 0 0.1 A-11 90 GEAR 1)
GEAR: ((2018-Mar-13 13:14:03.250451 3.87) 0 1.2 B-302 90.2 GEAR 2)
GEAR: ((2018-Mar-13 13:14:03.250451 3.87) 0 1.2 B-302 90.2 GEAR 2)
GEAR: ((2018-Mar-13 13:14:04.510025 5.13) 0 4.9 B-302 91.1 GEAR 3)

Only the SLOPE events with id 10:
SLOPE: ((2018-Mar-13 13:14:04.300160 4.92) 0 4.2 B-302 90.9 SLOPE 10)
SLOPE: ((2018-Mar-13 13:14:04.300160 4.92) 0 4.2 B-302 90.9 SLOPE 10)

Events with durations in [2s..3s):
SEQUENCE: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.1 A-11 90 SEQUENCE 1)
CLUTCH: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.2 A-11 90 CLUTCH 1)
SEQUENCE: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.1 A-11 90 SEQUENCE 1)
CLUTCH: ((2018-Mar-13 13:14:01.819966 2.44) 0 0.2 A-11 90 CLUTCH 1)
(Other): ((2018-Mar-13 13:14:01.819966 2.54) Backup to regestry)
(Other): ((2018-Mar-13 13:14:01.819966 2.54) Backup to regestry)
Answered 2018-03-16T02:41:53.380