
I'm working through a Python book, but doing everything in Julia to learn the language, and I've hit another area I can't quite figure out.

But when I start pushing more complex matrices through, it falls apart:

include("activation_function_exercise/spiral_data.jl")
include("activation_function_exercise/dense_layer.jl")
include("activation_function_exercise/activation_relu.jl")
include("activation_function_exercise/activation_softmax.jl")

coords, color = spiral_data(100, 3)

dense1 = LayerDense(2,3)
dense2 = LayerDense(3,3)

forward(dense1, coords)
println("Forward 1 layer")
activated_output = relu_activation(dense1.output)
forward(dense2, activated_output)
println("Forward 2 layer")
activated_output2 = softmax_activation(dense2.output)

println("\n", activated_output2)

I get a matrix of the right shape:

julia> activated_output2
300×3 Matrix{Float64}:
 0.00333346  0.00333337  0.00333335
 0.00333345  0.00333337  0.00333335
 0.00333345  0.00333336  0.00333335
 0.00333344  0.00333336  0.00333335
 0.00333343  0.00333336  0.00333334
 0.00333311  0.00333321  0.00333322

but the book shows:

>>>
[[0.33333 0.3333 0.3333]
...

It looks like I'm two orders of magnitude (a factor of 100) below the book? Even when using FluxML's softmax function.
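For reference, my activation_softmax.jl just wraps the NNlib function; the file isn't pasted here, so this is a hypothetical sketch of how it was originally written:

using NNlib

# Hypothetical sketch of the original activation_softmax.jl:
# calls softmax with its defaults, i.e. no explicit dims argument.
softmax_activation(inputs) = softmax(inputs)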

Edit:

I thought maybe my ReLU activation code was causing the difference, so I tried switching to the FluxML NNlib version, but I get the same activated_output2: 0.0033333 instead of 0.333333.
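For reference, relu_activation is just an elementwise clamp; my actual activation_relu.jl isn't pasted here, so this is a sketch:

# Hypothetical sketch of activation_relu.jl:
# ReLU replaces every negative entry with zero, elementwise.
relu_activation(inputs) = max.(inputs, 0.0)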

I'll keep checking the other pieces, such as my forward function.

Edit 2:

Adding my LayerDense implementation for completeness.

Dense layer:

# see https://github.com/FluxML/Flux.jl/blob/b78a27b01c9629099adb059a98657b995760b617/src/layers/basic.jl#L71-L111
using Base: Integer, Float64

mutable struct LayerDense
    weights::Matrix{Float64}
    biases::Matrix{Float64}
    num_inputs::Integer
    num_neurons::Integer
    output::Matrix{Float64}
    # Incomplete constructor: `output` stays undefined until `forward` assigns it.
    # Weights start as small random values, biases as a single row of zeros.
    LayerDense(num_inputs::Integer, num_neurons::Integer) =
        new(0.01 * randn(num_inputs, num_neurons), zeros(1, num_neurons), num_inputs, num_neurons)
end


function forward(layer::LayerDense, inputs::Matrix{Float64})
    # (samples × num_inputs) * (num_inputs × num_neurons) .+ (1 × num_neurons);
    # the bias row broadcasts across every sample.
    layer.output = inputs * layer.weights .+ layer.biases
end
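A quick shape check on its own (this snippet isn't part of the original script) confirms the layer does what I expect:

layer = LayerDense(2, 3)
forward(layer, randn(5, 2))   # 5 samples × 2 features
size(layer.output)            # → (5, 3)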

Edit 3:

With the library versions in place, I started checking my spiral_data implementation; it seems to be in the right ballpark.

Python

import numpy as np
import nnfs

from nnfs.datasets import spiral_data

nnfs.init()


X, y = spiral_data(samples=100, classes=3)

print(X[:4])  # just check the first few

>>>
[[0.         0.        ]
 [0.00299556 0.00964661]
 [0.01288097 0.01556285]
 [0.02997479 0.0044481 ]]

Julia

include("activation_function_exercise/spiral_data.jl")

coords, color = spiral_data(100, 3)

julia> coords
300×2 Matrix{Float64}:
  0.0         0.0
 -0.00133462  0.0100125
  0.00346739  0.0199022
 -0.00126302  0.0302767
  0.00184948  0.0403617
  0.0113095   0.0492225
  0.0397276   0.0457691
  0.0144484   0.0692151
  0.0181726   0.0787382
  0.0320308   0.0850793
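My spiral_data.jl isn't pasted above either; a straight port of nnfs.datasets.spiral_data (a sketch, assuming I've read the reference source correctly) looks roughly like this:

# Hypothetical Julia port of nnfs.datasets.spiral_data.
function spiral_data(samples::Integer, classes::Integer)
    X = zeros(samples * classes, 2)
    y = zeros(Int, samples * classes)
    for class in 0:(classes - 1)
        ix = (samples * class + 1):(samples * (class + 1))
        r = range(0.0, 1.0; length=samples)                    # radius grows outward
        t = range(class * 4.0, (class + 1) * 4.0; length=samples) .+
            0.2 .* randn(samples)                              # angle, with noise
        X[ix, 1] .= r .* sin.(t .* 2.5)
        X[ix, 2] .= r .* cos.(t .* 2.5)
        y[ix] .= class
    end
    return X, y
end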

1 Answer


It turns out I was using NNlib's softmax with its default dims, which normalizes down each column of the whole matrix instead of across each row the way the Python book does. With 300 rows per column, every entry comes out around 1/300 ≈ 0.00333, which is exactly the factor-of-100 discrepancy. All that was needed was to change my softmax() call like so:

using NNlib

function softmax_activation(inputs)
    return softmax(inputs, dims=2)
end
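A minimal check (not from the original post) makes the dims difference visible:

using NNlib

A = [1.0 2.0 3.0;
     4.0 5.0 6.0]                   # 2 samples (rows) × 3 classes (columns)
sum(softmax(A), dims=1)             # default dims=1: each *column* sums to 1
sum(softmax(A; dims=2), dims=2)     # dims=2: each *row* (sample) sums to 1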

With that change, the output at the end of my long-winded example comes out as expected:

#using Pkg
#Pkg.add("Plots")

include("activation_function_exercise/spiral_data.jl")
include("activation_function_exercise/dense_layer.jl")
include("activation_function_exercise/activation_relu.jl")
include("activation_function_exercise/activation_softmax.jl")

coords, color = spiral_data(100, 3)

dense1 = LayerDense(2,3)
dense2 = LayerDense(3,3)

# Julia doesn't lend itself to OO programming...
# so the following will just be plain functions
# activation1 = activation_relu
# activation2 = activation_softmax

forward(dense1, coords)
activated_output = relu_activation(dense1.output)
forward(dense2, activated_output)
activated_output2 = softmax_activation(dense2.output)


using Plots

#scatter(coords[:,1], coords[:,2])
scatter(coords[:,1], coords[:,2], zcolor=color, framestyle=:box)

display(activated_output2)

300×3 Matrix{Float64}:
 0.333333  0.333333  0.333333
 0.333336  0.333334  0.33333
 0.333338  0.333339  0.333323
 0.33334   0.333344  0.333316
 0.333339  0.333361  0.3333
 0.333341  0.333365  0.333294
 0.333345  0.333362  0.333293
 0.333345  0.333374  0.333281
 0.333349  0.33337   0.333281
 0.333347  0.33339   0.333262
 ⋮                   
 0.333564  0.332673  0.333764
 0.333583  0.332885  0.333532
 0.333588  0.332967  0.333445
 0.333587  0.333148  0.333265
 0.333593  0.332935  0.333472
 0.333596  0.333006  0.333398
 0.333583  0.33333   0.333086
 0.3336    0.333062  0.333338
 0.333603  0.333082  0.333316