c++ - C++ "yield" keyword: how do I return an iterator from my function?

Question

Consider the following code.

std::vector<result_data> do_processing() 
{
    pqxx::result input_data = get_data_from_database();
    return process_data(input_data);
}

std::vector<result_data> process_data(pqxx::result const & input_data)
{
    std::vector<result_data> ret;
    pqxx::result::const_iterator row;
    for (row = input_data.begin(); row != input_data.end(); ++row) 
    {
        // somehow populate output vector
    }
    return ret;
}

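While I was wondering whether I could expect return value optimization (RVO) to happen, I found this answer by Jerry Coffin [emphasis mine]:

At least IMO, it's usually a poor idea, but not for efficiency reasons. It's a poor idea because the function in question should usually be written as a generic algorithm that produces its output via an iterator. Almost any code that accepts or returns a container instead of operating on iterators should be considered suspect.

Don't get me wrong: there are times it makes sense to pass around collection-like objects (e.g., strings), but for the example cited, I'd consider passing or returning the vector a poor idea.

Having some Python background, I am very fond of generators. In fact, if this were Python, I would have written the function above as a generator, i.e. to avoid having to process the entire data set before anything else can happen. For example, like this: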

def process_data(input_data):
    for item in input_data:
        # somehow process items
        yield result_data

If I have interpreted Jerry Coffin's remark correctly, this is what he was suggesting, isn't it? If so, how can I implement it in C++?

score 17 · Accepted Answer

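No, that's not what Jerry means, at least not directly.

yield in Python implements coroutines. C++ doesn't have them (they can of course be emulated, but doing it cleanly is a bit involved).

What Jerry means is simply that you should pass in an output iterator, which the function then writes to: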

template <typename O>
void process_data(pqxx::result const & input_data, O iter) {
    // some_value stands for whatever is computed from the current row
    for (pqxx::result::const_iterator row = input_data.begin(); row != input_data.end(); ++row)
        *iter++ = some_value;
}

And call it like this:

std::vector<result_data> result;
process_data(input, std::back_inserter(result));

I don't believe that this is in general better than just returning the vector.
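To make the output-iterator idea concrete without a database, here is a minimal sketch that assumes a plain range of ints as a stand-in for pqxx::result and "double the value" as a stand-in for the real per-row processing. The helper names collect, stream_out and demo are hypothetical. What it shows is that the same algorithm can fill a vector or write straight to a stream, depending only on the iterator the caller passes in.

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-row processing step: doubles each input.
// Results are written through any output iterator instead of being returned
// in a container.
template <typename InputIt, typename OutputIt>
OutputIt process_data(InputIt first, InputIt last, OutputIt out) {
    for (; first != last; ++first)
        *out++ = *first * 2;
    return out;
}

// Caller 1: collect the results in a vector.
std::vector<int> collect(const std::vector<int>& input) {
    std::vector<int> result;
    process_data(input.begin(), input.end(), std::back_inserter(result));
    return result;
}

// Caller 2: write the results as text, with no intermediate container.
std::string stream_out(const std::vector<int>& input) {
    std::ostringstream os;
    process_data(input.begin(), input.end(),
                 std::ostream_iterator<int>(os, " "));
    return os.str();
}

// Small self-check running both callers on the same input.
void demo() {
    const std::vector<int> input{1, 2, 3};
    assert((collect(input) == std::vector<int>{2, 4, 6}));
    assert(stream_out(input) == "2 4 6 ");
}
```

Note that this is still not a generator: the algorithm runs to completion once called. The caller only gets to choose where each result goes as it is produced.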

score 12 · Accepted Answer

Chris Kohlhoff, the author of Boost.Asio, has a blog post about this: http://blog.think-async.com/2009/08/secret-sauce-revealed.html

He emulates yield with a macro:

#define yield \
  if ((_coro_value = __LINE__) == 0) \
  { \
    case __LINE__: ; \
    (void)&you_forgot_to_add_the_entry_label; \
  } \
  else \
    for (bool _coro_bool = false;; \
         _coro_bool = !_coro_bool) \
      if (_coro_bool) \
        goto bail_out_of_coroutine; \
      else

This has to be used together with a coroutine class. See the blog post for more details.
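For readers who want to see the underlying trick in isolation, below is a simplified sketch of the switch/__LINE__ technique that macros of this kind build on. It is not the code from the blog post: the macro names CORO_BEGIN, CORO_YIELD and CORO_END, as well as range_generator and drain, are made up for this illustration. The key assumption is that all state that must survive a yield is kept in members, because ordinary locals are lost when the function returns.

```cpp
#include <cassert>
#include <vector>

// Simplified illustration only; NOT the macros from the blog post.
// The resume point is stored as a line number and re-entered via switch.
#define CORO_BEGIN(state) switch (state) { case 0:
#define CORO_YIELD(state, value) \
    do { state = __LINE__; return (value); case __LINE__:; } while (0)
#define CORO_END(state) } state = -1

// Yields first, first+1, ..., last-1, one value per call to next().
struct range_generator {
    int state = 0;    // resume point: 0 = not started, -1 = finished
    int current = 0;  // loop variable, kept as a member so it survives yields
    int last = 0;

    range_generator(int first, int last_) : current(first), last(last_) {}

    // Stores the next value in out and returns true, or returns false
    // once the sequence is exhausted.
    bool next(int& out) {
        CORO_BEGIN(state);
        for (; current < last; ++current) {
            out = current;
            CORO_YIELD(state, true);  // execution resumes here on next call
        }
        CORO_END(state);
        return false;
    }
};

// Pulls every value out of a generator into a vector.
std::vector<int> drain(range_generator gen) {
    std::vector<int> values;
    int v = 0;
    while (gen.next(v))
        values.push_back(v);
    return values;
}

// Small self-check of the generator.
void demo_generator() {
    assert((drain(range_generator(3, 6)) == std::vector<int>{3, 4, 5}));
}
```

The consumer decides when to call next(), so each value is computed only when it is asked for, which is the behaviour the Python generator in the question relies on.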

score 3 · Accepted Answer

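When you parse something recursively, or when the processing carries state, the generator pattern can be a good idea and can simplify the code greatly. In those cases you cannot easily iterate, and callbacks are usually the alternative. I wanted to have yield, and found that Boost.Coroutine2 now seems to work well.

The code below is an example that cats files. Of course it is pointless until you want to process the lines of text further: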

#include <fstream>
#include <functional>
#include <iostream>
#include <string>
#include <boost/coroutine2/all.hpp>

using namespace std;

typedef boost::coroutines2::coroutine<const string&> coro_t;

void cat(coro_t::push_type& yield, int argc, char* argv[])
{
    for (int i = 1; i < argc; ++i) {
        ifstream ifs(argv[i]);
        for (;;) {
            string line;
            if (getline(ifs, line)) {
                yield(line);
            } else {
                break;
            }
        }
    }
}

int main(int argc, char* argv[])
{
    using namespace std::placeholders;
    coro_t::pull_type seq(
            boost::coroutines2::fixedsize_stack(),
            bind(cat, _1, argc, argv));
    for (auto& line : seq) {
        cout << line << endl;
    }
}

score 0 · Accepted Answer

I found that istream-like behaviour comes close to what I had in mind. Consider the following (untested) code:

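struct data_source {
public:
    // for delivering data items; sets the fail flag when nothing is left,
    // the way std::istream reports a failed extraction
    data_source& operator>>(input_data_t & i) {
        if (input_data.empty()) {
            failed = true;
        } else {
            i = input_data.front();
            input_data.pop_front();
        }
        return *this;
    }
    // for boolean evaluation: true while the last extraction succeeded
    // (testing empty() here instead would drop the final item)
    explicit operator bool() const { return !failed; }

private:
    std::deque<input_data_t> input_data;
    bool failed = false;

    // appends new data to private input_data
    // potentially asynchronously
    void get_data_from_database();
};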

Now I can do things as shown in the following example:

int main () {
    data_source d;
    input_data_t i;
    while (d >> i) {
        // somehow process items
        result_data_t r(i);
        cout << r << endl;
    }
}

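This way, data acquisition is somewhat decoupled from processing, which allows it to happen lazily or asynchronously. That is, I can process items as they arrive, and I don't have to wait until the vector is completely filled as in the other example.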
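To illustrate the "process items as they arrive" part, here is a self-contained sketch of a variant that fetches data in batches only when the consumer asks for more. Everything in it is an assumption made for the example: ints stand in for input_data_t, a callback named fetch_batch stands in for get_data_from_database, and lazy_source and run_demo are hypothetical names. It is synchronous, so it shows lazy rather than asynchronous acquisition.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical istream-like source: items are fetched in batches only when
// the consumer asks for more, so acquisition and processing are interleaved.
class lazy_source {
public:
    // fetch_batch returns the next batch, or an empty vector once exhausted.
    explicit lazy_source(std::function<std::vector<int>()> fetch_batch)
        : fetch_batch_(std::move(fetch_batch)) {}

    // Extracts the next item; sets the fail flag if no more data is available.
    lazy_source& operator>>(int& item) {
        if (buffer_.empty()) {
            std::vector<int> batch = fetch_batch_();
            buffer_.insert(buffer_.end(), batch.begin(), batch.end());
        }
        if (buffer_.empty()) {
            failed_ = true;
        } else {
            item = buffer_.front();
            buffer_.pop_front();
        }
        return *this;
    }

    // True while the most recent extraction succeeded.
    explicit operator bool() const { return !failed_; }

private:
    std::function<std::vector<int>()> fetch_batch_;
    std::deque<int> buffer_;
    bool failed_ = false;
};

// Consumes a source backed by two fake "database" batches and records the
// order of events: negative numbers mark a fetch, positive ones a processed
// item.
std::vector<int> run_demo() {
    std::vector<int> events;
    int calls = 0;
    lazy_source src([&]() -> std::vector<int> {
        ++calls;
        events.push_back(-calls);
        if (calls == 1) return {1, 2};
        if (calls == 2) return {3};
        return {};
    });
    int item = 0;
    while (src >> item)
        events.push_back(item);
    return events;
}

// Self-check: the second fetch happens only after items 1 and 2 were handled.
void demo_lazy_source() {
    assert((run_demo() == std::vector<int>{-1, 1, 2, -2, 3, -3}));
}
```

The recorded event order shows that the second batch is requested only after the first one has been consumed, and that the last item is still delivered before the loop ends.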
