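As a complete beginner in machine learning, I decided to try implementing a convolutional neural network from scratch using numpy.
I have finished the layers and the training algorithm, and everything works.
I am now trying to implement mini-batch training. I may be wrong, but I heard somewhere that the computations for a mini-batch all happen at once. At the moment I compute the gradient for each sample in a mini-batch one at a time, and then update the trainable parameters with the sum of those gradients.
If it is possible, how can I compute a whole batch at once instead of iterating over it? The input to my model is a 3D array of shape (height, width, channels). Could I pass a 3D array of shape (height, width, channels) in which each element along the channel axis is one image of the batch (meaning batch_size == channels)?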

    # Module-level imports used by this method
    import numpy as np
    from datetime import datetime as dt

    def train(self, train_x, train_y, test_x, test_y, epochs=1000, learning_rate=0.01, verbose=True, seed=99):
        np.random.seed(seed)
        print("Started Training!")
        idx = 0
        for epoch in range(epochs):
            start_time = dt.now()
            for batch in self.iterate_minibatches(train_x, train_y, 64, to_shuffle=True):
                # Process the samples of the mini-batch one at a time
                for x, y in zip(batch[0], batch[1]):
                    # Feed forwards
                    output = self.predict(x, True)
                    # Compute Error
                    output_gradient = self.loss.compute_derivative(y, output)
                    # Feed backwards
                    self._back_propagate(output_gradient)
                # Update the parameters once per mini-batch
                self._update(learning_rate)
            # Evaluate on the test set once per epoch
            cost, accuracy = self.evaluate(test_x, test_y)
            self.cost_history.append(cost)
            self.accuracy_history.append(accuracy)
            if verbose:
                epoch_time = (dt.now() - start_time).seconds
                print(f"Epoch: {epoch + 1} / {epochs} | cost: {cost} | accuracy: {accuracy} | time: {epoch_time}")

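
To make the question concrete: the inner loop above handles one sample at a time. Below is a minimal sketch of what I understand by "computing the batch at once". It uses a hypothetical fully connected layer `y = x @ W` rather than my actual layers, and the names `x`, `grad_out`, and the shapes are made up for illustration. The point is only that summing per-sample weight gradients gives the same result as one matrix product over a leading batch axis.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_in, n_out = 4, 3, 2  # hypothetical sizes, for illustration only

x = rng.normal(size=(batch_size, n_in))          # one row per sample
grad_out = rng.normal(size=(batch_size, n_out))  # dLoss/dOutput, one row per sample

# Per-sample loop: accumulate the weight gradient one sample at a time.
grad_w_loop = np.zeros((n_in, n_out))
for x_i, g_i in zip(x, grad_out):
    grad_w_loop += np.outer(x_i, g_i)

# Batched: a single matrix product sums over the batch axis.
grad_w_batched = x.T @ grad_out

print(np.allclose(grad_w_loop, grad_w_batched))
```

If this reading is right, the batch would live on its own axis rather than being packed into the channel axis.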
**Helper functions**

    import numpy as np
    from typing import Tuple


    def shuffle(x: np.ndarray, y: np.ndarray, seed: int = 99) -> Tuple[np.ndarray, np.ndarray]:
        """
        Randomizes two ndarrays with the same length in unison
        :param x: images
        :param y: one-hot encoding of the labels
        :param seed: seed for numpy's global random generator
        :return: Randomized x, y
        """
        # Note: re-seeding with the same value yields the same permutation on every call
        np.random.seed(seed)
        if len(x) != len(y):
            raise ValueError('x, y cannot have different lengths!')
        # Allocate space
        shuffled_x = np.empty(x.shape, dtype=x.dtype)
        shuffled_y = np.empty(y.shape, dtype=y.dtype)
        # All indexes in random order
        permutation = np.random.permutation(len(x))
        # Shuffle: element i of the input goes to position permutation[i]
        shuffled_x[permutation] = x
        shuffled_y[permutation] = y
        return shuffled_x, shuffled_y


    # Thanks to @dsachar
    def iterate_minibatches(x, y, batchsize, to_shuffle=False):
        assert x.shape[0] == y.shape[0]
        if to_shuffle:
            x, y = shuffle(x, y)
        # Yield consecutive slices; the last batch may be smaller than batchsize
        for start_idx in range(0, x.shape[0], batchsize):
            end_idx = min(start_idx + batchsize, x.shape[0])
            excerpt = slice(start_idx, end_idx)
            yield x[excerpt], y[excerpt]
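
If `train_x` is stored as (num_samples, height, width, channels), the batches yielded above are already 4D arrays with a leading batch axis. As a rough sketch of a forward pass that consumes such an array directly, here is a stride-1 "valid" convolution (cross-correlation) over a whole batch. The function `conv_forward_batched` and the kernel layout (k_height, k_width, channels, filters) are assumptions for illustration, not my actual layer code, and `sliding_window_view` requires NumPy 1.20 or newer.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def conv_forward_batched(images, kernels):
    """Valid cross-correlation, stride 1, for a whole batch at once.

    images:  (batch, height, width, channels)
    kernels: (k_height, k_width, channels, filters)
    returns: (batch, out_height, out_width, filters)
    """
    k_h, k_w = kernels.shape[:2]
    # windows: (batch, out_height, out_width, channels, k_height, k_width)
    windows = sliding_window_view(images, (k_h, k_w), axis=(1, 2))
    # Sum over channels and both kernel axes for every window and filter.
    return np.einsum("nhwcij,ijcf->nhwf", windows, kernels)


rng = np.random.default_rng(0)
batch = rng.normal(size=(8, 6, 6, 3))    # 8 images of 6x6 with 3 channels
kernels = rng.normal(size=(3, 3, 3, 4))  # four 3x3 filters

out_batched = conv_forward_batched(batch, kernels)

# Same result as feeding the images one at a time (each as a batch of 1).
out_loop = np.stack([conv_forward_batched(img[None], kernels)[0] for img in batch])

print(out_batched.shape)
print(np.allclose(out_batched, out_loop))
```

In this layout the channel axis keeps its usual meaning, and the batch size is free to differ from the number of channels.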