c++ - 一种在 C++ 中实现惰性求值的方法

Question

所以我在回答一个关于惰性求值的问题（在这里，我的回答对于这种情况来说有点矫枉过正，但这个想法似乎很有趣），这让我想到了如何在 C++ 中进行惰性求值。我想出了一个方法，但我不确定其中的所有陷阱。还有其他实现惰性评估的方法吗？如何做到这一点？有什么陷阱以及这个和其他设计？

这是我的想法：

#include <iostream>
#include <functional>
#include <memory>
#include <string>

#define LAZY(E) lazy<decltype((E))>{[&](){ return E; }}

template<class T>
class lazy {
private:
    typedef std::function<std::shared_ptr<T>()> thunk_type;
    mutable std::shared_ptr<thunk_type> thunk_ptr;
public:
    lazy(const std::function<T()>& x)
        : thunk_ptr(
            std::make_shared<thunk_type>([x](){
                return std::make_shared<T>(x());
            })) {}
    const T& operator()() const {
        std::shared_ptr<T> val = (*thunk_ptr)();
        *thunk_ptr = [val](){ return val; };
        return *val;
    }
    T& operator()() {
        std::shared_ptr<T> val = (*thunk_ptr)();
        *thunk_ptr = [val](){ return val; };
        return *val;
    }
};

void log(const lazy<std::string>& msg) {
    std::cout << msg() << std::endl;
}

int main() {
    std::string hello = "hello";
    std::string world = "world";
    auto x = LAZY((std::cout << "I was evaluated!\n", hello + ", " + world + "!"));
    log(x);
    log(x);
    log(x);
    log(x);
    return 0;
}

我在设计中关心的一些事情。

decltype 有一些奇怪的规则。我对 decltype 的使用有什么问题吗？我在 LAZY 宏中的 E 周围添加了额外的括号，以确保单个名称得到公平对待，就像 vec[10] 一样的引用。还有其他我没有考虑的事情吗？
在我的示例中有很多间接层。这似乎是可以避免的。
这是否正确地记住了结果，以便无论有什么或有多少东西引用了惰性值，它只会评估一次（我对此非常有信心，但惰性评估加上大量共享指针可能会让我陷入循环）

你怎么认为？

score 12 · Accepted Answer

是的，你所拥有的是懒惰的。基本上，只需传递一个计算参数而不是参数的函数。评估后，对象被计算值替换。基本上就是这样，是的，使用引用计数指针实现这样的方法非常昂贵。

记忆是一个古老的术语，它通常意味着将函数的结果表化。没有现代语言可以做到这一点（也许是 PROLOG），它非常昂贵。完全惰性（从不计算一件事两次）在 lambda 提升的过程中实现，即消除自由变量（通过将它们作为参数）。在完全惰性的 lambda 提升中，提升的是最大的自由表达式（假设 x 是自由的，因此 sqrt x 的出现被新参数 sqrtx 替换）。还有所谓的最优还原。

我认为没有其他方法可以做到这一点。为什么在 Haskell 等惰性函数式语言中速度要快得多？好吧，基本上，没有引用计数的指针，然后是严格分析（严格与惰性相反），它允许编译器事先知道某些东西最好严格评估，将严格评估且机器类型已知的值拆箱...更不用说其他典型的函数式编程语言优化...但本质上，如果您查看图形缩减机的实现，如果您查看堆栈如何演变，您会发现基本上您是在传递函数堆栈而不是参数，基本上就是这样。

现在，在这些机器中，计算参数的节点被其值覆盖。所以你可能错过了一个优化，但在类型安全的上下文中是不可能的。

假设您的所有“节点”都是主超类的子类，称为“节点”，它只有一个计算值的虚函数......然后它可能被另一个返回已计算值的“覆盖”。那件事，带有函数指针，就是为什么他们说 Haskell 的 STG 机器是“无标签的”（Spineless Tagless G-Machine），因为它们不标记数据元素，而是使用计算或返回的函数指针价值。

我不认为它在 C++ 中的效率不如在 Haskell 中的效率高……除非我们开始考虑以完全不同的方式实现 C++（可以而且应该这样做）。我们已经习惯了如此复杂的序言和结语以及复杂的调用约定等……函数调用在 C/C++ 中过于官僚。

现在，当你感到懒惰的时候，这本书绝对是 Simon Peyton-Jones 的《函数式编程语言的实现》。然而，现代实现在免费提供的文章“在库存硬件上实现功能语言：Spineless Tagless G-machine”中进行了描述，该文章非常适合阅读实现优化，但另一篇是阅读以了解基本面。

score 3 · Accepted Answer

您可能希望将thunk_type其作为单独的对象进行引用。现在，副本lazy<T>将不会从原产地评估中获得任何收益。但在这种情况下，您将获得额外的间接访问权限。
有时您可以通过简单地使用模板来摆脱对 std::function 的包装。
我不确定该值是否需要为 shared_ptr。也许来电者应该决定。
您将在每次访问时生成新的闭包。

考虑下一个修改：

template<typename F>
lazy(const F& x) :
  thunk_ptr([&x,&this](){
    T val = (*x)();
    thunk_ptr = [val]() { return val; };
    return val;
  })
{}

或者替代实现可能如下所示：

template<typename F>
auto memo(const F &x) -> std::function<const decltype(x()) &()> {
    typedef decltype(x()) return_type;
    typedef std::function<const return_type &()> thunk_type;
    auto thunk_ptr = std::make_shared<thunk_type>();
    auto *thunk_cptr = thunk_ptr.get();

    // note that this lambda is called only from scope which holds thunk_ptr
    *thunk_ptr = [thunk_cptr, &x]() {
        auto val = std::move(x());
        auto &thunk = *thunk_cptr;
        thunk = [val]() { return val; };
        // at this moment we can't refer to catched vars
        return thunk();
    };

    return [thunk_ptr]() { return (*thunk_ptr)(); };
};

score 2 · Accepted Answer

这是我需要的懒惰的另一个方面。

// REMARK: Always use const for lazy objects. Any, even const operation coming from ValueType called over Lazy<ValueType> freezes it.
template < typename ValueType >
struct Lazy
{
    typedef ValueType              Value;
    typedef std::function<Value()> Getter;

    Lazy( const Value& value = Value() )
        : m_value( value )
    { }

    Lazy( Value&& value )
        : m_value( value )
    { }

    Lazy( Lazy& other )
        : Lazy( const_cast<const Lazy&>(other) )
    { }

    Lazy( const Lazy&  other ) = default;
    Lazy(       Lazy&& other ) = default;
    Lazy& operator = ( const Lazy&  other ) = default;
    Lazy& operator = (       Lazy&& other ) = default;


    template < typename GetterType,
               typename = typename std::enable_if<std::is_convertible<GetterType,Getter>::value>::type >
    Lazy( GetterType&& getter )
        : m_pGetter( std::make_shared<Getter>( std::move(getter) ) )
    { }

    void Freeze()
    {
        if ( m_pGetter )
        {
            m_value = (*m_pGetter)();
            m_pGetter.reset();
        }
    }

    operator Value () const
    {
        return m_pGetter ? (*m_pGetter)() : m_value;
    }

    operator Value& ()
    {
        Freeze();
        return m_value;
    }

private:
    Value                   m_value;
    std::shared_ptr<Getter> m_pGetter;
};

像这样使用：

template < typename VectorType,
           typename VectorIthValueGetter = std::function<typename VectorType::const_reference (const size_t)>
         >
static auto MakeLazyConstRange( const VectorType& vector )
    -> decltype( boost::counting_range( Lazy<size_t>(), Lazy<size_t>() ) | boost::adaptors::transformed( VectorIthValueGetter() ) )
{
    const Lazy<size_t> bb( 0 ) ;
    const Lazy<size_t> ee( [&] () -> size_t { return vector.size(); } );
    const VectorIthValueGetter tt( [&] (const size_t i) -> typename VectorType::const_reference { return vector[i]; } );
    return boost::counting_range( bb, ee ) | boost::adaptors::transformed( tt );
}

然后：

std::vector<std::string> vv;
boost::any_range<const std::string&, boost::forward_traversal_tag, const std::string&, int>
    rr = MakeLazyConstRange( vv );

vv.push_back( "AA" );
vv.push_back( "BB" ); 
vv.push_back( "CC" ); 
vv.push_back( "DD" ); 

for ( const auto& next : rr )
    std::cerr << "---- " << next << std::endl;

score 0 · Accepted Answer

Boost Phoenix 库实现了 Lazyness，以及其他 FP 的优点，但我自己没有使用过，我不确定它在 C++ 11 上的表现如何，或者它至少部分符合 2011 标准。

http://www.boost.org/doc/libs/1_43_0/libs/spirit/phoenix/doc/html/index.html

score 0 · Accepted Answer

在我的 Lazy 类的实现中，我采用了一些其他方式——lambda 函数不返回值，它把它作为参数。它有助于实现一些好处：

节省了调用封装类型的移动构造函数的时间（当初始化函数返回结果时）。
不需要封装类型的复制构造函数和赋值运算符（仅当您想为惰性类型执行此操作时）。

另外，这个版本应该是线程安全的（如果我做错了请纠正我）。仍然存在的一项要求 - 默认构造函数。

#pragma once
#include <mutex>
#include <atomic>
#include <functional>

template <typename T>
struct Lazy
{
    using value_type = T;

    Lazy() : mInitializer(nullptr) {}

    Lazy(const std::function<void(T&)>& initializer)
        : mInitializer(std::move(initializer))
        , mInitFlag(false)
    { }

    Lazy(const Lazy& other)
        : mInitializer(other.mInitializer)
        , mInitFlag(other.mInitFlag.load())
        , mValue(other.mValue)
    { }

    Lazy(Lazy&& other)
        : mInitializer(std::move(other.mInitializer))
        , mInitFlag(other.mInitFlag.load())
        , mValue(std::move(other.mValue))
    { }

    Lazy& operator=(const std::function<void(T&)>& initializer)
    {
        mInitFlag.store(false);
        mInitializer = initializer;
        return *this;
    };

    Lazy& operator=(const Lazy& rhs)
    {
        if (this != &rhs)
        {
            std::lock_guard<std::mutex> lock(mMutex);

            mInitializer = rhs.mInitializer;
            mInitFlag = rhs.mInitFlag.load();
            if (mInitFlag) {
                mValue = rhs.mValue;
            }
        }
        return *this;
    };

    Lazy& operator=(Lazy&& rhs)
    {
        if (this != &rhs)
        {
            std::lock_guard<std::mutex> lock(mMutex);

            mInitializer = std::move(rhs.mInitializer);
            mInitFlag = rhs.mInitFlag.load();
            if (mInitFlag) {
                mValue = std::move(rhs.mValue);
            }
        }
        return *this;
    };

    inline operator T&()                { return get(); }
    inline operator const T&() const    { return get(); }

    inline T& get()             { return const_cast<T&>(_getImpl()); }
    inline const T& get() const { return _getImpl(); }

private:
    const T& _getImpl() const
    {
        if (mInitializer != nullptr && mInitFlag.load() == false)
        {
            std::lock_guard<std::mutex> lock(mMutex);
            if (mInitFlag.load() == false)
            {
                mInitializer(mValue);
                mInitFlag.store(true);
            }
        }
        return mValue;
    }

    mutable std::mutex mMutex;
    std::function<void(T&)> mInitializer;
    mutable std::atomic_bool mInitFlag;
    mutable T mValue;   // Value should be after mInitFlag due initialization order
};

使用示例：

using ValuesList = std::vector<int>;
Lazy<ValuesList> lazyTest = [](ValuesList& val) { val.assign({1, 2, 3, 4, 5}); };
const Lazy<ValuesList> lazyTestConst = lazyTest;

ValuesList& value = lazyTest;
const ValuesList& cvalue = lazyTestConst;

c++ - 一种在 C++ 中实现惰性求值的方法

5 回答 5

Related

Reference