I'm trying to write a custom layer that takes variable-length vectors and reduces them to vectors of the same length. The lengths are known in advance: the variation comes from having several different data types, each of which I encode with a different number of features. In a sense it is similar to an Embedding, only for numeric values. I tried padding, but the results were poor, so I'm trying this approach instead.

So, for example, suppose I have 3 data types that I encode with vectors of length 3, 4 and 6.

arr = [
    # example one (data type 1 [len()==3], data type 3 [len()==6]) - force values as floats
    [[1.0,2.0,3],[1,2,3,4,5,6]],
    # example two (data type 2 [len()==4], data type 3 [len()==6]) - force values as floats
    [[1.0,2,3,4],[1,2,3,4,5,6]],
]
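
For reference, this is what the data looks like once converted to a RaggedTensor (a quick check; the lengths in the comments are what I expect for the arr above):

import tensorflow as tf

rt = tf.ragged.constant(arr)
print(rt.shape)                 # (2, None, None): 2 examples, ragged inner dimensions
print(rt.nested_row_lengths())  # segments per example ([2, 2]) and per-segment lengths ([3, 6, 4, 6])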

I tried to implement a custom layer like this:

class DimensionReducer(tf.keras.layers.Layer):
    def __init__(self, output_dim, expected_lengths):
        super(DimensionReducer, self).__init__()
        self._supports_ragged_inputs = True
        self.output_dim = output_dim
        # one projection matrix and bias per expected input length, keyed by the length
        for l in expected_lengths:
            setattr(self,f'w_{l}', self.add_weight(shape=(l, self.output_dim),initializer='random_normal',trainable=True))
            setattr(self, f'b_{l}',self.add_weight(shape=(self.output_dim,), initializer='random_normal',trainable=True))

    def call(self, inputs):
        print(inputs.shape)

        # batch
        if len(inputs.shape) == 3:
            print("batch")
            result = []
            for i,x in enumerate(inputs):
                _result = []
                for v in x:
                    l = len(v)
                    print(l)
                    print(v)
                    # pick the weight/bias created for this vector's length
                    w = getattr(self, f'w_{l}')
                    b = getattr(self, f'b_{l}')
                    out = tf.matmul([v],w) + b
                    _result.append(out)

                result.append(tf.concat(_result, 0))
            r = tf.stack(result)
            print("batch output:",r.shape)
            return r

It seems to work when I call it directly:

dim = DimensionReducer(3, [3,4,6])
dim(tf.ragged.constant(arr))
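
Called eagerly like this, the layer returns a dense tensor with one length-3 output vector per encoded segment per example; checking the shape:

print(dim(tf.ragged.constant(arr)).shape)  # expecting (2, 2, 3): 2 examples, 2 segments each, output_dim=3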

But when I try to incorporate it into a model, it fails:

import tensorflow as tf

val_ragged = tf.ragged.constant(arr)

inputs_ragged = tf.keras.layers.Input(shape=(None,None), ragged=True)
outputs_ragged = DimensionReducer(3, [3,4,6])(inputs_ragged)
model_ragged = tf.keras.Model(inputs=inputs_ragged, outputs=outputs_ragged)

# this call with the RaggedTensor doesn't work
print(model_ragged(val_ragged))

AttributeError: 'DimensionReducer' object has no attribute 'w_Tensor("dimension_reducer_98/strided_slice:0", shape=(), dtype=int32)'

I'm not sure how to implement a layer like this, or what I'm doing wrong. From the error it looks like, when the layer is traced on the symbolic Keras Input, the per-vector length l is a symbolic tensor rather than a Python int, so the f-string builds an attribute name that was never created.
