numpy - xtensor equivalent of numpy a[a>3] = 1

Question

The title says it all: what is the xtensor equivalent of numpy's

# set all elements > 3 to 1
sometensor[sometensor > 3] = 1

?

It looks like xt::filter works:

xt::filter(sometensor, sometensor > 3) = 1

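But the numpy version seems to be much faster. I have built xtensor with xsimd, but it does not seem to help in this case. Is there a better, simpler way to do this?

EDIT

I found filtration, which is indeed faster (by about 3x), but still slower than numpy (by about 10x)...

SOLUTION (thanks Tom!)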

a = xt::where(a > 0.5, 1.0, a);

is the fastest: about 10x faster than filtration, so it looks like it is SIMD-ed!

score 2 · Accepted Answer

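xt::filter appears to be a view, and views are (currently) not very efficient in xtensor. I would use xt::where. It may create a temporary, though, which might not be the case in NumPy. Since I don't know the details of that temporary, let's at least do some timings:

1. NumPy indexing: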

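import time
import numpy as np

a = np.random.random([1000000])
start = time.perf_counter()
a[a > 0.5] = 1.0  # boolean-mask assignment, in place
stop = time.perf_counter()
# perf_counter() returns seconds; timedelta.microseconds would only give
# the sub-second component, so convert the full duration instead
print((stop - start) * 1e6)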

About 5000 microseconds on my system.

2. NumPy where

import time
import numpy as np

a = np.random.random([1000000])
start = time.perf_counter()
a = np.where(a > 0.5, 1.0, a)  # builds a new array and rebinds a
stop = time.perf_counter()
# perf_counter() returns seconds; convert the full duration to microseconds
print((stop - start) * 1e6)

About 2500 microseconds on my system.

3. xtensor where

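#include <iostream>
#include <chrono>
// Header names assume the classic xtensor layout; they may differ between versions
#include <xtensor/xtensor.hpp>
#include <xtensor/xrandom.hpp>

int main()
{
    xt::xtensor<double, 1> a = xt::random::rand<double>({1000000});

    auto start = std::chrono::high_resolution_clock::now();
    a = xt::where(a > 0.5, 1.0, a);  // 1.0 where a > 0.5, otherwise keep a
    auto stop = std::chrono::high_resolution_clock::now();
    // duration_cast lives in std::chrono, so it is qualified explicitly
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    std::cout << duration.count() << std::endl;
}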

On my system this takes between 2500 and 5000 microseconds with xsimd (a much wider spread than NumPy), and about twice as long without xsimd.
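Since single-shot timings scatter this much, one option is to repeat the measurement and report the median. The following is a minimal sketch using only the standard library. `median_microseconds` is a hypothetical helper (not part of xtensor), and the `setup` callback is assumed to restore the input before every timed run.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: calls setup() (untimed) and then work() (timed)
// `repeats` times and returns the median duration in microseconds.
template <class Setup, class Work>
long long median_microseconds(std::size_t repeats, Setup setup, Work work)
{
    if (repeats == 0)
    {
        return 0;  // nothing was measured
    }
    std::vector<long long> samples;
    samples.reserve(repeats);
    for (std::size_t i = 0; i < repeats; ++i)
    {
        setup();  // e.g. restore the input so every run sees the same data
        auto start = std::chrono::steady_clock::now();
        work();
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count());
    }
    std::sort(samples.begin(), samples.end());
    return samples[samples.size() / 2];  // middle element of the sorted samples
}
```

Usage might look like `median_microseconds(21, [&] { a = original; }, [&] { a = xt::where(a > 0.5, 1.0, a); })`, where `original` is an untouched copy of the input. steady_clock is used here because it is guaranteed to be monotonic.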

4. xtensor filter

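#include <iostream>
#include <chrono>
// Header names assume the classic xtensor layout; they may differ between versions
#include <xtensor/xtensor.hpp>
#include <xtensor/xrandom.hpp>
#include <xtensor/xindex_view.hpp>  // xt::filter

int main()
{
    xt::xtensor<double, 1> a = xt::random::rand<double>({1000000});

    auto start = std::chrono::high_resolution_clock::now();
    xt::filter(a, a > 0.5) = 1.0;  // assign through a view of the matching elements
    auto stop = std::chrono::high_resolution_clock::now();
    // duration_cast lives in std::chrono, so it is qualified explicitly
    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    std::cout << duration.count() << std::endl;
}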

About 30000 microseconds on my system, both with and without xsimd.
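One way to picture the gap between cases 3 and 4 is the plain C++ sketch below. It is only a mental model, not a description of xtensor's internals: it assumes an index-based view first gathers the matching positions and then writes through them, whereas a where-style select makes a single pass and produces a new array (the temporary mentioned earlier). Both function names are made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Two-pass model: collect the indices where the condition holds,
// then write the new value through those indices.
void assign_via_indices(std::vector<double>& a, double threshold, double value)
{
    std::vector<std::size_t> indices;
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        if (a[i] > threshold)
        {
            indices.push_back(i);
        }
    }
    for (std::size_t i : indices)
    {
        a[i] = value;
    }
}

// Single-pass model: one select per element, written to a fresh array.
// A loop of this shape is something compilers can often vectorize.
std::vector<double> select_where(const std::vector<double>& a, double threshold, double value)
{
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        out[i] = a[i] > threshold ? value : a[i];
    }
    return out;
}
```

Both functions give the same values. They differ in how memory is touched: the first needs an extra index buffer and scattered writes, the second streams through the data once.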

Compilation

I use

cmake_minimum_required(VERSION 3.1)

project(Run)

set(CMAKE_BUILD_TYPE Release)

find_package(xtensor REQUIRED)
find_package(xsimd REQUIRED)
add_executable(${PROJECT_NAME} main.cpp)
target_link_libraries(${PROJECT_NAME} xtensor xtensor::optimize xtensor::use_xsimd)

Without xsimd I omitted the last line.
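To check what a given binary was actually built with, a small helper like the following can be compiled with the same CMake target. It is a sketch: `xsimd_requested` and `simd_target` are hypothetical names, `XTENSOR_USE_XSIMD` is assumed to be the compile definition that `xtensor::use_xsimd` sets, and the architecture macros are the usual compiler-predefined ones.

```cpp
#include <string>

// True if this translation unit was compiled with XTENSOR_USE_XSIMD defined
// (assumed to come from linking the xtensor::use_xsimd target).
bool xsimd_requested()
{
#ifdef XTENSOR_USE_XSIMD
    return true;
#else
    return false;
#endif
}

// Names the SIMD family the compiler is targeting, based on predefined macros.
std::string simd_target()
{
#if defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE2__) || defined(_M_X64)
    return "SSE2";
#elif defined(__ARM_NEON)
    return "NEON";
#else
    return "unknown";
#endif
}
```

Printing both values from main() next to the timing makes it clear which configuration produced a given number.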

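Rosetta / native

I am running on a Mac with the M1 chip. The timings listed above were measured under Rosetta (i.e. x86). For the native build, the timings for the four cases above are:

1. 4500 microseconds.
2. 1500 microseconds.
3. 2000 microseconds, both with and without xsimd (I think xsimd simply does not work on that chip yet!).
4. 15000 microseconds.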
