filter - tf.boolean_mask: must the mask dimension be specified?

Question

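When I use tf.boolean_mask(), a ValueError is raised. It says: "Number of mask dimensions must be specified, even if some dimensions are None. E.g. shape=[None] is ok, but shape=None is not."

I suspect something goes wrong when I create the boolean mask, because everything works fine when I build the mask by hand. So far, though, I have checked the shape and dtype of s and found nothing suspicious. Both appear to be identical to the shape and dtype of the boolean mask I created by hand.

Please see the screenshot of the problem. The following should let you reproduce the error on your machine. You need tensorflow, numpy and scipy.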

with tf.Session() as sess:
    # receive five embedded vectors
    v0 = tf.constant([[3.0,1.0,2.,4.,2.]])
    v1 = tf.constant([[4.0,0,1.0,4,1.]])
    v2 = tf.constant([[1.0,1.0,0.0,4.,8.]])
    v3 = tf.constant([[1.,4,2.,5.,2.]])
    v4 = tf.constant([[3.,2.,3.,2.,5.]])

    # concatenate the five embedded vectors into a matrix
    VT = tf.concat([v0,v1,v2,v3,v4],axis=0)

    # perform SVD on the concatenated matrix
    s, u1, u2   = tf.svd(VT)
    e = tf.square(s) # list of eigenvalues
    v = u1 # eigenvectors as column vectors

    # sample a set
    s = tf.py_func(sample_dpp_bin,[e,v],tf.bool)
    X = tf.boolean_mask(VT,s)
    print(X.eval())

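Here is the code that generates s. s is a sample from a determinantal point process (for those interested in the math). Note that I use tf.py_func to wrap this Python function: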

import tensorflow as tf
import numpy as np
from scipy.linalg import orth

def sample_dpp_bin(e_val,e_vec):
    # e_val = np.array of eigenvalues
    # e_vec = array of eigenvectors (= column vectors)
    eps = 0.01

    # sample a set of eigenvectors
    ind = (np.random.rand(len(e_val)) <= (e_val)/(1+e_val))
    k = sum(ind)
    if k == e_val.size:
        return np.ones(e_val.size,dtype=bool) # check for full set
    if k == 0:
        return np.zeros(e_val.size,dtype=bool)
    V = e_vec[:,np.array(ind)]

    # sample a set of k items 
    sample = np.zeros(e_val.size,dtype=bool)
    for l in range(k-1,-1,-1):
        p = np.sum(V**2,axis=1)
        p = np.cumsum(p / np.sum(p)) # item cumulative probabilities
        i = int((np.random.rand() <= p).argmax()) # choose random item
        sample[i] = True

        j = (np.abs(V[i,:])>eps).argmax() # pick an eigenvector not orthogonal to e_i
        Vj = V[:,j]
        V = orth(V - (np.outer(Vj,(V[i,:]/Vj[i]))))

    return sample

If I print s and tf.shape(s), I get:

[False  True  True  True  True]
[5]

If I print VT and tf.shape(VT), I get:

[[ 3.  1.  2.  4.  2.]
 [ 4.  0.  1.  4.  1.]
 [ 1.  1.  0.  4.  8.]
 [ 1.  4.  2.  5.  2.]
 [ 3.  2.  3.  2.  5.]]
[5 5]

Any help is greatly appreciated.

score 2 · Accepted Answer

The following example works for me.

import tensorflow as tf
import numpy as np

tensor = [[1, 2], [3, 4], [5, 6]]
mask = np.array([True, False, True])

t_m = tf.boolean_mask(tensor, mask)
sess = tf.Session()
print(sess.run(t_m))

Output:

[[1 2]
 [5 6]]

Please provide a runnable code snippet that reproduces the error. I think you may be doing something wrong with s.

Update:

s = tf.py_func(sample_dpp_bin,[e,v],tf.bool)
s_v = (s.eval())
X = tf.boolean_mask(VT,s_v)
print(X.eval())

The mask should be a NumPy array rather than a TF tensor. You don't have to use tf.py_func.
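As a side note, a tensor mask also seems to be accepted as long as its number of dimensions is known when the graph is built. The following is a minimal sketch of that case, not part of the original answer. It assumes the TF 1.x graph API is available through `tensorflow.compat.v1`; the function name `mask_with_tensor` is made up for illustration.

```python
import tensorflow.compat.v1 as tf  # assumption: TF 1.13+ or TF 2.x with the v1 compatibility API


def mask_with_tensor():
    """Apply a boolean mask that is a TF tensor with a known static shape."""
    graph = tf.Graph()
    with graph.as_default():
        tensor = tf.constant([[1, 2], [3, 4], [5, 6]])
        # tf.constant records the static shape (3,), so the mask rank is known
        mask = tf.constant([True, False, True])
        t_m = tf.boolean_mask(tensor, mask)
        with tf.Session(graph=graph) as sess:
            return sess.run(t_m)


print(mask_with_tensor())
```

The difference from the failing case is that the output of tf.py_func carries no static shape at all, whereas a constant does.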

score 0 · Accepted Answer

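The error message says that the shape of the mask is undefined. What do you get if you print tf.shape(s)? I bet the problem with your code is that the shape of s is completely unknown, and you can fix that with a simple call to s.set_shape([None]) (which simply declares s to be a 1-D tensor; note that (None) without a trailing comma is just None and would leave the rank unknown). Consider this snippet: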

import numpy as np
import tensorflow as tf

X = np.random.randint(0, 2, (100, 100, 3))
with tf.Session() as sess:
    X_tf = tf.placeholder(tf.int8)
    # X_tf.set_shape((None, None, None))
    y = tf.greater(tf.reduce_max(X_tf, axis=(0, 1)), 0)
    print(tf.shape(y))
    z = tf.boolean_mask(X_tf, y, axis=2)
    print(sess.run(z, feed_dict={X_tf: X}))

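This prints a shape Tensor("Shape_3:0", shape=(?,), dtype=int32) (i.e. even the number of dimensions of y is unknown) and raises the same error as yours. However, if you uncomment the set_shape line, X_tf is known to be 3-dimensional, and y is therefore known to be 1-dimensional. Then the code works. So I think all you need to do is add an s.set_shape([None]) call after the py_func call.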
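Applied to a py_func mask like the one in the question, the fix could look roughly like the sketch below. It is only an illustration under stated assumptions: `fixed_mask` is a hypothetical stand-in for sample_dpp_bin that returns a deterministic mask, `masked_rows` is a made-up helper name, and the TF 1.x graph API is reached through `tensorflow.compat.v1`. It also shows why evaluating tf.shape(s) looks fine (the runtime shape is known) while the static rank that tf.boolean_mask checks at graph-construction time is missing.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # assumption: TF 1.13+ or TF 2.x with the v1 compatibility API


def fixed_mask(n_rows):
    # Hypothetical stand-in for sample_dpp_bin: drop the first row, keep the rest
    mask = np.ones(n_rows, dtype=bool)
    mask[0] = False
    return mask


def masked_rows():
    graph = tf.Graph()
    with graph.as_default():
        VT = tf.constant(np.arange(25, dtype=np.float32).reshape(5, 5))
        s = tf.py_func(fixed_mask, [tf.shape(VT)[0]], tf.bool)
        rank_before = s.shape.ndims  # None: py_func outputs have no static shape
        s.set_shape([None])          # declare s as 1-D with unknown length
        rank_after = s.shape.ndims   # 1: enough for tf.boolean_mask
        X = tf.boolean_mask(VT, s)
        with tf.Session(graph=graph) as sess:
            return rank_before, rank_after, sess.run(X)


rank_before, rank_after, rows = masked_rows()
print(rank_before, rank_after, rows.shape)
```

Without the set_shape call, tf.boolean_mask would be expected to raise the ValueError quoted in the question, because it needs the rank of the mask while the graph is being built.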
