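I'm working through a Python book but using Julia instead, as a way to learn the language. I've run into another area that I don't quite understand.
Things broke down once I started experimenting with more complex matrices.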
include("activation_function_exercise/spiral_data.jl")
include("activation_function_exercise/dense_layer.jl")
include("activation_function_exercise/activation_relu.jl")
include("activation_function_exercise/activation_softmax.jl")
coords, color = spiral_data(100, 3)
dense1 = LayerDense(2,3)
dense2 = LayerDense(3,3)
forward(dense1, coords)
println("Forward 1 layer")
activated_output = relu_activation(dense1.output)
forward(dense2, activated_output)
println("Forward 2 layer")
activated_output2 = softmax_activation(dense2.output)
println("\n", activated_output2)
I get back a proper-looking matrix:
julia> activated_output2
300×3 Matrix{Float64}:
0.00333346 0.00333337 0.00333335
0.00333345 0.00333337 0.00333335
0.00333345 0.00333336 0.00333335
0.00333344 0.00333336 0.00333335
0.00333343 0.00333336 0.00333334
0.00333311 0.00333321 0.00333322
But the book has:
>>>
[[0.33333 0.3333 0.3333]
...
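It looks like my values are two orders of magnitude lower than the book's (0.0033 vs. 0.333), even when using FluxML's softmax function.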
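One observation that may help narrow this down: 0.00333 is roughly 1/300 and 0.333 is roughly 1/3. That is the pattern you would see if the normalization ran down the 300 rows of each column instead of across the 3 columns of each row. The sketch below, in plain Python with made-up near-zero inputs (the `logits` values and the helper names are hypothetical, not from the book or from this post), shows both variants. If that turns out to be the cause, note that NNlib's `softmax` takes a `dims` keyword that defaults to 1, so `softmax(x; dims=2)` would normalize per row. This is only a guess, since activation_softmax.jl is not shown here.

```python
import math


def softmax(values):
    """Numerically stable softmax over a flat list of numbers."""
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_rows(matrix):
    """Normalize each row (one sample) so that it sums to 1."""
    return [softmax(row) for row in matrix]


def softmax_cols(matrix):
    """Normalize each column (one class) so that it sums to 1."""
    cols = [softmax(list(col)) for col in zip(*matrix)]
    # Transpose back to the original samples x classes layout
    return [list(row) for row in zip(*cols)]


# Hypothetical stand-in for dense2.output: 300 samples x 3 classes, all near zero
logits = [[1e-5 * (i % 7), 0.0, -1e-5 * (i % 5)] for i in range(300)]

per_row = softmax_rows(logits)
per_col = softmax_cols(logits)

print(per_row[0])  # each entry is close to 1/3
print(per_col[0])  # each entry is close to 1/300
```

With tiny weights (0.01 * randn) the layer outputs are all close to zero, so a softmax over N entries gives about 1/N for each entry. That makes the size of the values a quick hint about which dimension was normalized.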
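Edit:
I thought my ReLU activation code might be causing the difference, so I tried switching to the FluxML NNlib version, but I get the same activated_output2 of 0.0033333 instead of 0.333333.
I'll keep checking other parts, such as my forward function.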
Edit 2:
Adding my DenseLayer implementation for completeness.
Dense layer
# see https://github.com/FluxML/Flux.jl/blob/b78a27b01c9629099adb059a98657b995760b617/src/layers/basic.jl#L71-L111
using Base: Integer, Float64
mutable struct LayerDense
    weights::Matrix{Float64}
    biases::Matrix{Float64}
    num_inputs::Integer
    num_neurons::Integer
    output::Matrix{Float64}
    # `output` is left uninitialized here; it is set by `forward`
    LayerDense(num_inputs::Integer, num_neurons::Integer) =
        new(0.01 * randn(num_inputs, num_neurons), zeros((1, num_neurons)), num_inputs, num_neurons)
end

function forward(layer::LayerDense, inputs::Matrix{Float64})
    layer.output = inputs * layer.weights .+ layer.biases
end
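For comparison, here is a minimal plain-Python sketch of the same forward arithmetic (inputs times weights, plus a bias row added to every output row). The function name `dense_forward` and all the values are made up for illustration. The point is only that the output keeps one row per sample and one column per neuron, so any per-sample step that follows has to work across each row.

```python
def dense_forward(inputs, weights, biases):
    """Compute inputs @ weights + biases, adding the bias row to every output row."""
    n_in = len(weights)
    n_out = len(weights[0])
    assert all(len(row) == n_in for row in inputs), "inputs need one column per weight row"
    return [
        [sum(row[k] * weights[k][j] for k in range(n_in)) + biases[j] for j in range(n_out)]
        for row in inputs
    ]


# Hypothetical small example: 4 samples x 2 features, 2 inputs x 3 neurons
inputs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
weights = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
biases = [0.0, 0.0, 1.0]

output = dense_forward(inputs, weights, biases)
print(len(output), len(output[0]))  # 4 3
```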
Edit 3:
Using the library, I started checking my spiral_data implementation. It seems to be within reason.
Python
import numpy as np
import nnfs
from nnfs.datasets import spiral_data
nnfs.init()
X, y = spiral_data(samples=100, classes=3)
print(X[:4])  # just check the first couple
>>>
[[0. 0. ]
[0.00299556 0.00964661]
[0.01288097 0.01556285]
[0.02997479 0.0044481 ]]
Julia
include("activation_function_exercise/spiral_data.jl")
coords, color = spiral_data(100, 3)
julia> coords
300×2 Matrix{Float64}:
0.0 0.0
-0.00133462 0.0100125
0.00346739 0.0199022
-0.00126302 0.0302767
0.00184948 0.0403617
0.0113095 0.0492225
0.0397276 0.0457691
0.0144484 0.0692151
0.0181726 0.0787382
0.0320308 0.0850793