halide - Scheduling a weighted sum in Halide

Question

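I am implementing a radial basis function in Halide, and although it runs correctly, it is slow. For each pixel I compute the distance to each control point, then take the weighted sum of those distances to produce the output. To iterate over the weights I use an RDom (shown below). In this implementation, every pixel computation has to reload all (3000+) weights, which is why it is slow.

My question is how to make use of Halide's scheduling features in this situation. What I want is to load some of the weights, compute partial weighted sums for a subset of the pixels, load the next group of weights, and continue until done. That preserves locality for each smaller group of weights, and this kind of thing is exactly what Halide is meant for. Unfortunately, I have not found anything that addresses this specific problem. RDom seems to sit at a lower level of abstraction than the scheduling primitives, so it is not clear how to schedule it.

Any alternative suggestions for implementing a weighted sum in Halide are welcome. It does not have to use RDom; I just don't know of any other way.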

    Func rbf_ctrl_pts("rbf_ctrl_pts");
    // Initialize to all zeros
    rbf_ctrl_pts(x,y,c) = cast<float>(0);
    // Index to iterate with
    RDom idx(0,num_ctrl_pts);
    // Loop code
    // Subtract the vectors 
    Expr red_sub   = (*in_func)(x,y,0) - (*ctrl_pts_h)(0,idx);
    Expr green_sub = (*in_func)(x,y,1) - (*ctrl_pts_h)(1,idx);
    Expr blue_sub  = (*in_func)(x,y,2) - (*ctrl_pts_h)(2,idx);
    // Take the L2 norm to get the distance
    Expr dist      = sqrt( red_sub*red_sub +
                              green_sub*green_sub +
                              blue_sub*blue_sub );
    // Update persistent loop variables
    rbf_ctrl_pts(x,y,c) = select( c == 0, rbf_ctrl_pts(x,y,c) +
                                    ( (*weights_h)(0,idx) * dist),
                                  c == 1, rbf_ctrl_pts(x,y,c) +
                                    ( (*weights_h)(1,idx) * dist),
                                          rbf_ctrl_pts(x,y,c) +
                                    ( (*weights_h)(2,idx) * dist));

score 1 · Accepted Answer

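You can use split (or tile) together with rfactor on the idx dimension of rbf_ctrl_pts to break up and schedule the reduction. It should be possible to get locality on the weights through these mechanisms. I'm not 100% sure the associativity prover will handle the select, so it may be necessary to unroll over channels or switch to using a Tuple across channels. That said, in the code above I'm not sure the select does anything beyond passing c through.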
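To make the target loop order concrete, here is a minimal sketch in plain C++ (no Halide) of the loop nest that splitting the reduction and reordering it around a tile of pixels is meant to produce. Everything in it is an assumption for illustration: the `RbfInputs` struct, the flat row-major layouts, and the function names `rbf_naive` and `rbf_blocked` are not from the question. Since the select only passes c through, the weight is indexed by channel directly.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical flat layouts, chosen only for this sketch:
//   pixels:   num_pixels x 3 (RGB), row-major
//   ctrl_pts: num_ctrl   x 3 (RGB), row-major
//   weights:  num_ctrl   x 3 (one weight per output channel), row-major
struct RbfInputs {
    std::vector<float> pixels;
    std::vector<float> ctrl_pts;
    std::vector<float> weights;
};

// L2 distance between two RGB triples.
static float rgb_distance(const float *p, const float *q) {
    const float dr = p[0] - q[0];
    const float dg = p[1] - q[1];
    const float db = p[2] - q[2];
    return std::sqrt(dr * dr + dg * dg + db * db);
}

// Original order: every pixel walks all control points and weights.
std::vector<float> rbf_naive(const RbfInputs &in) {
    assert(in.pixels.size() % 3 == 0);
    assert(in.ctrl_pts.size() % 3 == 0);
    assert(in.ctrl_pts.size() == in.weights.size());
    const std::size_t num_pixels = in.pixels.size() / 3;
    const std::size_t num_ctrl = in.ctrl_pts.size() / 3;
    std::vector<float> out(num_pixels * 3, 0.0f);
    for (std::size_t p = 0; p < num_pixels; ++p) {
        for (std::size_t k = 0; k < num_ctrl; ++k) {
            const float d = rgb_distance(&in.pixels[p * 3], &in.ctrl_pts[k * 3]);
            for (std::size_t c = 0; c < 3; ++c) {
                out[p * 3 + c] += in.weights[k * 3 + c] * d;
            }
        }
    }
    return out;
}

// Blocked order: for a tile of pixels, load one block of control points
// and weights, add its partial sums into every pixel of the tile, then
// move on to the next block.
std::vector<float> rbf_blocked(const RbfInputs &in,
                               std::size_t pixel_tile,
                               std::size_t ctrl_block) {
    assert(pixel_tile > 0 && ctrl_block > 0);
    assert(in.pixels.size() % 3 == 0);
    assert(in.ctrl_pts.size() % 3 == 0);
    assert(in.ctrl_pts.size() == in.weights.size());
    const std::size_t num_pixels = in.pixels.size() / 3;
    const std::size_t num_ctrl = in.ctrl_pts.size() / 3;
    std::vector<float> out(num_pixels * 3, 0.0f);
    for (std::size_t p0 = 0; p0 < num_pixels; p0 += pixel_tile) {
        const std::size_t p_end = std::min(p0 + pixel_tile, num_pixels);
        for (std::size_t k0 = 0; k0 < num_ctrl; k0 += ctrl_block) {
            const std::size_t k_end = std::min(k0 + ctrl_block, num_ctrl);
            for (std::size_t p = p0; p < p_end; ++p) {
                for (std::size_t k = k0; k < k_end; ++k) {
                    const float d =
                        rgb_distance(&in.pixels[p * 3], &in.ctrl_pts[k * 3]);
                    for (std::size_t c = 0; c < 3; ++c) {
                        out[p * 3 + c] += in.weights[k * 3 + c] * d;
                    }
                }
            }
        }
    }
    return out;
}
```

Each pixel still accumulates its control points in the same order, so the blocked version should match the naive one; only the order in which pixels and weight blocks are visited changes. In Halide I would expect something like a split of idx on rbf_ctrl_pts.update(), plus a reorder that places the outer reduction variable outside a tile of pixels, to yield this nest, but I have not benchmarked it, and the tile and block sizes would need tuning. Using rfactor instead changes the summation order, so small floating-point differences are possible there.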
