我试图通过进行以下更改来使用 tflearn 构建自定义激活函数:
将我的自定义激活函数添加到activation.py
def my_activation(x):
return tf.where(x >= 0.0, tf.div( x**2 , x + tf.constant(0.6)) , 0.01*x)
并将其添加到__init__.py
from .activations import linear, tanh, sigmoid, softmax, softplus, softsign,\
relu, relu6, leaky_relu, prelu, elu, crelu, selu, my_activation
由于 tensorflow 可以自动进行梯度计算,所以我不需要实现梯度函数。正如文章深度学习编程风格所指出的,
过去,每当有人定义一个新模型时,他们都必须手动进行导数计算。虽然数学相当简单,但对于复杂的模型,它可能是耗时且乏味的工作。所有现代深度学习库都通过自动解决梯度计算问题,使从业者/研究人员的工作变得更加轻松。
我使用以下代码在 cifar10 数据集上训练了模型:https ://github.com/tflearn/tflearn/blob/master/examples/images/convnet_cifar10.py 但将所有relu激活更改为my_activation。
可悲的是,这个简单的修改导致网络无法学习任何东西:
Training Step: 46 | total loss: 0.00002 | time: 1.434s
| Adam | epoch: 001 | loss: 0.00002 - acc: 0.0885 -- iter: 04416/50000
Training Step: 47 | total loss: 0.00002 | time: 1.448s
| Adam | epoch: 001 | loss: 0.00002 - acc: 0.0945 -- iter: 04512/50000
Training Step: 48 | total loss: 0.00001 | time: 1.462s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0927 -- iter: 04608/50000
Training Step: 49 | total loss: 0.00001 | time: 1.476s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0896 -- iter: 04704/50000
Training Step: 50 | total loss: 0.00001 | time: 1.491s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0919 -- iter: 04800/50000
Training Step: 51 | total loss: 0.00001 | time: 1.504s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0890 -- iter: 04896/50000
Training Step: 52 | total loss: 0.00001 | time: 1.518s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0944 -- iter: 04992/50000
Training Step: 53 | total loss: 0.00001 | time: 1.539s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0989 -- iter: 05088/50000
Training Step: 54 | total loss: 0.00001 | time: 1.553s
| Adam | epoch: 001 | loss: 0.00001 - acc: 0.0951 -- iter: 05184/50000
Training Step: 55 | total loss: 0.00000 | time: 1.567s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0964 -- iter: 05280/50000
Training Step: 56 | total loss: 0.00000 | time: 1.580s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0931 -- iter: 05376/50000
Training Step: 57 | total loss: 0.00000 | time: 1.594s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0903 -- iter: 05472/50000
Training Step: 58 | total loss: 0.00000 | time: 1.613s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0851 -- iter: 05568/50000
Training Step: 59 | total loss: 0.00000 | time: 1.641s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0835 -- iter: 05664/50000
Training Step: 60 | total loss: 0.00000 | time: 1.674s
| Adam | epoch: 001 | loss: 0.00000 - acc: 0.0834 -- iter: 05760/50000
由于我只是一个初学者,我不知道导致网络成为零损失和低准确率的原因(NaN输出?无谓?)。谁能告诉我如何解决这个问题?谢谢!
请注意,我不是在问如何构建自定义激活函数。关于如何制作自定义函数的问题: