math - How is Hard Sigmoid defined

Question

I am working on deep nets using Keras. There is an activation called "hard sigmoid". What is its mathematical definition?

I know what a sigmoid is. Someone asked a similar question on Quora: https://www.quora.com/What-is-hard-sigmoid-in-artificial-neural-networks-Why-is-it-faster-than-standard-sigmoid-Are-there-any-disadvantages-over-the-standard-sigmoid

But I could not find a precise mathematical definition anywhere.

score 13 · Accepted Answer

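Since Keras supports both TensorFlow and Theano, the exact implementation may differ between backends, so I'll cover Theano only. For the Theano backend, Keras uses T.nnet.hard_sigmoid, which is in turn a linear approximation of the standard sigmoid: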

slope = tensor.constant(0.2, dtype=out_dtype)
shift = tensor.constant(0.5, dtype=out_dtype)
x = (x * slope) + shift
x = tensor.clip(x, 0, 1)

In other words, it is: max(0, min(1, x*0.2 + 0.5))
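
To make the formula above concrete, here is a minimal NumPy sketch of the same computation. It is not the Theano or Keras source; the function names `hard_sigmoid_theano_style` and `sigmoid` are made up for illustration, and only the slope 0.2 and shift 0.5 come from the snippet above. With those constants the line 0.2*x + 0.5 reaches 0 at x = -2.5 and 1 at x = 2.5, so the output is clipped outside that interval.

```python
import numpy as np

def hard_sigmoid_theano_style(x):
    """Piecewise-linear sigmoid with slope 0.2 and shift 0.5 (illustrative name)."""
    x = np.asarray(x, dtype=np.float64)
    # Scale and shift, then clamp the result to the interval [0, 1].
    return np.clip(0.2 * x + 0.5, 0.0, 1.0)

def sigmoid(x):
    """Standard logistic sigmoid, included only for comparison."""
    x = np.asarray(x, dtype=np.float64)
    return 1.0 / (1.0 + np.exp(-x))

xs = np.array([-3.0, -2.5, 0.0, 2.5, 3.0])
hard = hard_sigmoid_theano_style(xs)  # saturates at 0 and 1 outside [-2.5, 2.5]
soft = sigmoid(xs)                    # approaches 0 and 1 without reaching them
```

Both functions agree at x = 0, where each returns 0.5; elsewhere the hard version trades accuracy for avoiding the exponential.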

score 2 · Accepted Answer

For reference, the hard sigmoid function may be defined differently in different places. In Courbariaux et al. 2016 [1] it is defined as:

σ is the "hard sigmoid" function: σ(x) = clip((x + 1)/2, 0, 1) = max(0, min(1, (x + 1)/2))

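The intent is to provide a probability value (hence constraining it to lie between 0 and 1) for use in stochastic binarization of neural network parameters (e.g. weights, activations, gradients). You use the probability p = σ(x) returned by the hard sigmoid function to set the parameter x to +1 with probability p, or to -1 with probability 1-p.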
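
The binarization rule described above can be sketched in NumPy as follows. This is only an illustration of the sampling step, not code from the paper; the names `hard_sigmoid_bnn` and `stochastic_binarize`, the seed, and the sample count are assumptions made for the example.

```python
import numpy as np

def hard_sigmoid_bnn(x):
    """sigma(x) = clip((x + 1) / 2, 0, 1), the definition quoted above (illustrative name)."""
    x = np.asarray(x, dtype=np.float64)
    return np.clip((x + 1.0) / 2.0, 0.0, 1.0)

def stochastic_binarize(x, rng):
    """Return +1 with probability p = sigma(x) and -1 with probability 1 - p."""
    p = hard_sigmoid_bnn(x)
    draws = rng.random(p.shape)  # uniform samples in [0, 1)
    return np.where(draws < p, 1.0, -1.0)

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility
x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])

# Repeat the draw many times to estimate how often each entry becomes +1.
samples = np.stack([stochastic_binarize(x, rng) for _ in range(2000)])
frac_plus_one = (samples == 1.0).mean(axis=0)
```

Entries with x <= -1 always come out as -1 and entries with x >= 1 always come out as +1, because p is clipped to 0 or 1 there; in between, the fraction of +1 draws should track (x + 1)/2.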

[1] https://arxiv.org/abs/1602.02830 - "Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1", Matthieu Courbariaux, Itay Hubara, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio (submitted on 9 Feb 2016 (v1), last revised 17 Mar 2016 (this version, v3))

score 1 · Accepted Answer

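The hard sigmoid is normally a piecewise linear approximation of the logistic sigmoid function. Depending on which properties of the original sigmoid you want to keep, you can use a different approximation.

I personally like to keep the function correct at zero, i.e. σ(0) = 0.5 (shift) and σ'(0) = 0.25 (slope). This could be coded as follows: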

import numpy as np

def hard_sigmoid(x):
    return np.maximum(0, np.minimum(1, (x + 2) / 4))
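
The constants follow from the tangent line of the logistic sigmoid at zero: since σ'(x) = σ(x)(1 - σ(x)), we get σ'(0) = 0.25, and the tangent through (0, 0.5) is x/4 + 1/2 = (x + 2)/4. Here is a small numerical check of that claim, written as a sketch; the helper names `hard_sigmoid_tangent` and `slope_at_zero` and the step size are assumptions for the example.

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=np.float64)))

def hard_sigmoid_tangent(x):
    """Clipped tangent line of the sigmoid at zero: (x + 2) / 4 limited to [0, 1]."""
    x = np.asarray(x, dtype=np.float64)
    return np.maximum(0.0, np.minimum(1.0, (x + 2.0) / 4.0))

def slope_at_zero(f, h=1e-4):
    """Central finite-difference estimate of f'(0) (hypothetical helper)."""
    return float((f(h) - f(-h)) / (2.0 * h))

value_soft = float(sigmoid(0.0))
value_hard = float(hard_sigmoid_tangent(0.0))
slope_soft = slope_at_zero(sigmoid)
slope_hard = slope_at_zero(hard_sigmoid_tangent)
```

Both functions should give a value of 0.5 and a slope of about 0.25 at zero. This variant saturates at x = -2 and x = 2, which is a narrower linear region than the 0.2 slope version and a wider one than (x + 1)/2.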

score -3 · Accepted Answer

It is

  clip((x + 1)/2, 0, 1)

In coding parlance:

  max(0, min(1, (x + 1)/2))

answered 2018-02-28T13:18:31.587
