As a complete beginner in machine learning, I decided to try implementing a convolutional neural network from scratch using numpy.

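I have finished the layers and the training algorithm, and everything works.
I am now trying to implement mini-batch training. I may be wrong, but I heard somewhere that the computations for a mini-batch all happen at the same time. At the moment I run the forward and backward pass for each sample in the mini-batch one at a time, then update the trainable parameters using the sum of those gradients.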
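
To make "sum of the per-sample gradients" versus "whole batch at once" concrete, here is a minimal sketch for a single dense layer. The layer sizes and the random data are assumptions made up for illustration and are not taken from my network; the point is only that the accumulated per-sample weight gradient and one batched matrix product should agree.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_in, n_out = 64, 10, 3             # hypothetical sizes

x = rng.normal(size=(batch_size, n_in))         # one row per sample
grad_out = rng.normal(size=(batch_size, n_out)) # dLoss/dOutput, one row per sample

# One sample at a time: accumulate the outer product of input and output gradient
grad_w_loop = np.zeros((n_in, n_out))
for xi, gi in zip(x, grad_out):
    grad_w_loop += np.outer(xi, gi)

# Whole batch at once: a single matrix product sums over the batch axis
grad_w_batch = x.T @ grad_out

print(np.allclose(grad_w_loop, grad_w_batch))
```

If that holds for every layer, batching should change only the speed of a training step, not its result.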

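If this is possible, how can I compute the whole batch at once instead of iterating over it? The input to my model is a 3D array of shape (height, width, channels). Could I pass a 3D array of shape (height, width, channels) in which each element along the channels axis is one image of the batch (meaning batch_size == channels)?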
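
For reference, the layout I see most often is not to reuse the channels axis but to add a leading batch axis, giving a 4D array of shape (batch, height, width, channels). Below is a rough sketch of what a batched forward pass for one convolution layer could look like under that assumption. The function names, the kernel layout (KH, KW, C, F), and the "valid" padding with stride 1 are all assumptions for illustration, not my actual layer code. It relies on `sliding_window_view`, which needs NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv_forward_single(x, kernels):
    """Valid cross-correlation for one image. x: (H, W, C), kernels: (KH, KW, C, F)."""
    kh, kw, _, n_filters = kernels.shape
    out_h, out_w = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((out_h, out_w, n_filters))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i:i + kh, j:j + kw, :]                  # (KH, KW, C)
            out[i, j] = np.tensordot(patch, kernels, axes=3)  # (F,)
    return out


def conv_forward_batch(x, kernels):
    """Same operation for a whole batch. x: (N, H, W, C), kernels: (KH, KW, C, F)."""
    kh, kw = kernels.shape[:2]
    # Window axes are appended at the end: (N, out_h, out_w, C, KH, KW)
    windows = sliding_window_view(x, (kh, kw), axis=(1, 2))
    # Contract the window and channel axes against the kernels for every image at once
    return np.einsum('nijckl,klcf->nijf', windows, kernels)


rng = np.random.default_rng(0)
batch = rng.normal(size=(4, 8, 8, 3))    # hypothetical batch of 4 images
kernels = rng.normal(size=(3, 3, 3, 5))  # hypothetical 3x3 kernels, 5 filters

looped = np.stack([conv_forward_single(img, kernels) for img in batch])
batched = conv_forward_batch(batch, kernels)
print(batched.shape, np.allclose(looped, batched))
```

Keeping the batch on its own axis means the channels axis still carries the per-image channels, so the kernels do not need to know the batch size.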

    def train(self, train_x, train_y, test_x, test_y, epochs=1000, learning_rate=0.01, verbose=True, seed=99):
        np.random.seed(seed)
        print("Started Training!")
        for epoch in range(epochs):
            start_time = dt.now()
            for batch in self.iterate_minibatches(train_x, train_y, 64, to_shuffle=True):
                for x, y in zip(batch[0], batch[1]):
                    # Feed forwards
                    output = self.predict(x, True)
                    # Compute Error
                    output_gradient = self.loss.compute_derivative(y, output)
                    # Feed backwards
                    self._back_propagate(output_gradient)

                self._update(learning_rate)

            cost, accuracy = self.evaluate(test_x, test_y)
            self.cost_history.append(cost)
            self.accuracy_history.append(accuracy)

            if verbose:
                # total_seconds() keeps fractions of a second; .seconds truncates to whole seconds
                epoch_time = (dt.now() - start_time).total_seconds()
                print(f"Epoch: {epoch + 1} / {epochs} | cost: {cost} | accuracy: {accuracy} | time: {epoch_time:.1f}s")

**Helper functions**


import numpy as np
from typing import Optional, Tuple


def shuffle(x: np.ndarray, y: np.ndarray, seed: Optional[int] = None) -> Tuple[np.ndarray, np.ndarray]:
    """
    Shuffles two arrays of the same length in unison
    :param x: images
    :param y: one-hot encoding of the labels
    :param seed: optional seed; with a fixed seed every call returns the same order
    :return: Shuffled x, y
    """
    if len(x) != len(y):
        raise ValueError('x, y cannot have different lengths!')
    # Reseed only on request, so each epoch can see a different order
    if seed is not None:
        np.random.seed(seed)

    # All indexes in random order
    permutation = np.random.permutation(len(x))
    # Indexing with the same permutation keeps x and y aligned and returns copies
    return x[permutation], y[permutation]

# Thanks to @dsachar
def iterate_minibatches(x, y, batchsize, to_shuffle=False):
    assert x.shape[0] == y.shape[0]
    if to_shuffle:
        x, y = shuffle(x, y)
    for start_idx in range(0, x.shape[0], batchsize):
        end_idx = min(start_idx + batchsize, x.shape[0])
        excerpt = slice(start_idx, end_idx)
        yield x[excerpt], y[excerpt]