我必须在 CNN 模型中添加一个 k-max 池化层来检测虚假评论。请让我知道如何使用 keras 实现它。
我搜索了互联网,但没有找到好的资源。
我必须在 CNN 模型中添加一个 k-max 池化层来检测虚假评论。请让我知道如何使用 keras 实现它。
我搜索了互联网,但没有找到好的资源。
根据本文,k-Max Pooling 是一种池化操作,它是 Max-TDNN 句子模型中使用的时间维度上最大池化的泛化,不同于用于对象识别的卷积网络中应用的局部最大池化操作。 LeCun 等人,1998 年)。
k-max 池化操作可以将 p 中可能相隔多个位置的 k 个最活跃的特征池化;它保留了特征的顺序,但对它们的特定位置不敏感。
很少有资源可以展示如何在 tensorflow 或 keras 中实现它:
正如@Anubhav_Singh 建议的那样,这里似乎有一个解决方案。在 github keras 问题链接上,此响应获得的赞数 (24) 几乎是反对数 (5) 的 5 倍。我只是在as-is
这里引用它,让人们尝试一下,然后说它是否对他们有用。
原作者:arbackus
from keras.engine import Layer, InputSpec
from keras.layers import Flatten
import tensorflow as tf
class KMaxPooling(Layer):
"""
K-max pooling layer that extracts the k-highest activations from a sequence (2nd dimension).
TensorFlow backend.
"""
def __init__(self, k=1, **kwargs):
super().__init__(**kwargs)
self.input_spec = InputSpec(ndim=3)
self.k = k
def compute_output_shape(self, input_shape):
return (input_shape[0], (input_shape[2] * self.k))
def call(self, inputs):
# swap last two dimensions since top_k will be applied along the last dimension
shifted_input = tf.transpose(inputs, [0, 2, 1])
# extract top_k, returns two tensors [values, indices]
top_k = tf.nn.top_k(shifted_input, k=self.k, sorted=True, name=None)[0]
# return flattened output
return Flatten()(top_k)
注意:据报道它运行得非常慢(尽管它对人有用)。
这是我在上面@Anubhav Singh 的评论中解释的 k-max 池的实现(保留了 topk 的顺序)
def test60_simple_test(a):
# swap last two dimensions since top_k will be applied along the last dimension
#shifted_input = tf.transpose(a) #[0, 2, 1]
# extract top_k, returns two tensors [values, indices]
res = tf.nn.top_k(a, k=3, sorted=True, name=None)
b = tf.sort(res[1],axis=0,direction='ASCENDING',name=None)
e=tf.gather(a,b)
#e=e[0:3]
return (e)
a = tf.constant([7, 2, 3, 9, 5], dtype = tf.float64)
print('*input:',a)
print('**output', test60_simple_test(a))
结果:
*input: tf.Tensor([7. 2. 3. 9. 5.], shape=(5,), dtype=float64)
**output tf.Tensor([7. 9. 5.], shape=(3,), dtype=float64)
看一下这个。没有经过彻底测试,但对我来说效果很好。让我知道你的想法。PS 最新的张量流版本。
tf.nn.top_k 不保留值的出现顺序。所以,这就是需要努力的想法
import tensorflow as tf
from tensorflow.keras import layers
class KMaxPooling(layers.Layer):
"""
K-max pooling layer that extracts the k-highest activations from a sequence (2nd dimension).
TensorFlow backend.
"""
def __init__(self, k=1, axis=1, **kwargs):
super(KMaxPooling, self).__init__(**kwargs)
self.input_spec = layers.InputSpec(ndim=3)
self.k = k
assert axis in [1,2], 'expected dimensions (samples, filters, convolved_values),\
cannot fold along samples dimension or axis not in list [1,2]'
self.axis = axis
# need to switch the axis with the last elemnet
# to perform transpose for tok k elements since top_k works in last axis
self.transpose_perm = [0,1,2] #default
self.transpose_perm[self.axis] = 2
self.transpose_perm[2] = self.axis
def compute_output_shape(self, input_shape):
input_shape_list = list(input_shape)
input_shape_list[self.axis] = self.k
return tuple(input_shape_list)
def call(self, x):
# swap sequence dimension to get top k elements along axis=1
transposed_for_topk = tf.transpose(x, perm=self.transpose_perm)
# extract top_k, returns two tensors [values, indices]
top_k_vals, top_k_indices = tf.math.top_k(transposed_for_topk,
k=self.k, sorted=True,
name=None)
# maintain the order of values as in the paper
# sort indices
sorted_top_k_ind = tf.sort(top_k_indices)
flatten_seq = tf.reshape(transposed_for_topk, (-1,))
shape_seq = tf.shape(transposed_for_topk)
len_seq = tf.shape(flatten_seq)[0]
indices_seq = tf.range(len_seq)
indices_seq = tf.reshape(indices_seq, shape_seq)
indices_gather = tf.gather(indices_seq, 0, axis=-1)
indices_sum = tf.expand_dims(indices_gather, axis=-1)
sorted_top_k_ind += indices_sum
k_max_out = tf.gather(flatten_seq, sorted_top_k_ind)
# return back to normal dimension but now sequence dimension has only k elements
# performing another transpose will get the tensor back to its original shape
# but will have k as its axis_1 size
transposed_back = tf.transpose(k_max_out, perm=self.transpose_perm)
return transposed_back
这是Pytorch
k-max pooling的一个版本实现:
import torch
def kmax_pooling(x, dim, k):
index = x.topk(k, dim = dim)[1].sort(dim = dim)[0]
return x.gather(dim, index)
希望它会有所帮助。