I have an array containing error values as a function of two different quantities (alpha and eigRange).
I fill my array like this:
    for j in range(n):
        for i in range(alphaLen):
            alpha = alpha_list[i]
            c = train.eig(xt_, yt_, m - j, m, alpha, "cpu")
            costListTrain[j, i] = cost.err(xt_, xt_, yt_, c)

    normedValues = costListTrain / np.max(costListTrain.ravel())
where

    n = 20
    alpha_list = [0.0001, 0.0003, 0.0008, 0.001, 0.003, 0.006, 0.01, 0.03, 0.05]

My costListTrain array contains values that differ only very slightly, for example:
    2.809458902485728 2.809458905776425 2.809458913576337 2.809459011062461
    2.030326752376704 2.030329906064879 2.030337351188699 2.030428976282031
    1.919840839066182 1.919846470077076 1.919859731440199 1.920021453630778
    1.858436351617677 1.858444223016128 1.858462730482461 1.858687054377165
    1.475871326997542 1.475901926855846 1.475973476249240 1.476822830933632
    ...
    1.450581041157659 1.452770209885761 1.454835202839513 1.459676311335618
    1.450581041157643 1.452770209885764 1.454835202839484 1.459676311335624
    1.450581041157571 1.452770209885700 1.454835202839498 1.459676311335622
As you can see, the values are very, very close to each other!
I am trying to plot this data so that the two quantities are on the x and y axes, and the error value is represented by the color of the points.
This is how I plot the data:
    alpha_list = np.log(alpha_list)
    eigenvalues, alphaa = np.meshgrid(eigRange, alpha_list)
    vMin = np.min(costListTrain)
    vMax = np.max(costListTrain)
    plt.scatter(eigenvalues, alphaa, s=70, c=normedValues, vmin=vMin, vmax=vMax, alpha=0.50)
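For reference, here is a minimal self-contained sketch of this kind of plot. Since `train.eig` and `cost.err` are not available, it uses synthetic error values; the array shapes and `alpha_list` follow the question, everything else is made up for illustration. Note that `vmin`/`vmax` are computed on the *same* array that is passed to `c=` (in the code above they come from `costListTrain` while the colors come from `normedValues`, which shifts the color range):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch
import matplotlib.pyplot as plt

# Synthetic stand-in for costListTrain; the real values come from cost.err.
n, alphaLen = 20, 9
rng = np.random.RandomState(0)
costListTrain = 1.4 + 0.05 * rng.rand(n, alphaLen)

alpha_list = np.log([0.0001, 0.0003, 0.0008, 0.001, 0.003,
                     0.006, 0.01, 0.03, 0.05])
eigRange = np.arange(n)  # hypothetical eigenvalue counts

# meshgrid(alpha_list, eigRange) yields two (n, alphaLen) grids that line
# up cell-for-cell with costListTrain[j, i].
alphaa, eigenvalues = np.meshgrid(alpha_list, eigRange)

normedValues = costListTrain / costListTrain.max()

# vmin/vmax must describe the SAME array passed to c=...
plt.scatter(eigenvalues.ravel(), alphaa.ravel(), s=70,
            c=normedValues.ravel(),
            vmin=normedValues.min(), vmax=normedValues.max(),
            alpha=0.50)
plt.xlabel("no. of eigenvalues")
plt.ylabel("log(alpha)")
plt.colorbar(label="normalized error")
plt.savefig("plot1.png")
```

The `.ravel()` calls flatten the grids and the color array into equally long 1-D vectors, which is the shape `scatter` expects.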
But the result is not correct.
I tried to normalize my error values by dividing everything by the max, but it did not work! The only way I could make it work (which is not correct) is to normalize my data in two different ways. One is based on each column (meaning factor 1 is constant and factor 2 varies), and the other is based on each row (meaning factor 2 is constant and factor 1 varies). But that does not really make sense, because I need a single plot that shows the trade-off between the two quantities with respect to the error value.
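One likely reason dividing by the global max "did not work" is that it only rescales the values: differences in the 9th decimal place stay differences in the 9th decimal place, so nearly identical errors still map to the same color. A rank transform is one way to spread almost-equal values over the full color range, at the cost of discarding absolute scale. A toy sketch (the numbers below are made up to mimic the near-duplicates in the question):

```python
import numpy as np

# Hypothetical stand-in for costListTrain with near-duplicate values.
vals = np.array([[2.809458902, 2.809458905],
                 [1.450581041157659, 1.450581041157643]])

# Dividing by the global max preserves the tiny relative differences,
# so the colors barely change:
global_norm = vals / vals.max()

# Rank transform: argsort().argsort() gives each element's rank (0..n-1);
# dividing by n-1 spreads the values evenly over [0, 1].
flat = vals.ravel()
ranks = flat.argsort().argsort() / (flat.size - 1)
rank_norm = ranks.reshape(vals.shape)
```

With the rank transform every point gets a distinct, evenly spaced color, which makes the trade-off visible even when the raw errors are almost equal.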
Update
This is what I meant in my last paragraph. Normalizing the values based on the maximum of each row, corresponding to the eigenvalues:
    maxsEigBasedTrain = np.amax(costListTrain.T, 1)[:, np.newaxis]
    maxsEigBasedTest = np.amax(costListTest.T, 1)[:, np.newaxis]
    normEigCostTrain = costListTrain.T / maxsEigBasedTrain
    normEigCostTest = costListTest.T / maxsEigBasedTest
Normalizing the values based on the maximum of each column, corresponding to alpha:
    maxsAlphaBasedTrain = np.amax(costListTrain, 1)[:, np.newaxis]
    maxsAlphaBasedTest = np.amax(costListTest, 1)[:, np.newaxis]
    normAlphaCostTrain = costListTrain / maxsAlphaBasedTrain
    normAlphaCostTest = costListTest / maxsAlphaBasedTest
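The two normalizations above can be written more compactly with `keepdims`, which keeps the reduced axis so that NumPy broadcasting divides each row (or column) by its own maximum, without the transposes. A toy sketch:

```python
import numpy as np

# Toy matrix standing in for costListTrain
# (rows = eigenvalue index j, columns = alpha index i).
cost = np.array([[2.0, 4.0, 8.0],
                 [1.0, 5.0, 10.0]])

# Per-row max: axis=1 with keepdims gives shape (2, 1), so broadcasting
# divides every row by that row's maximum (each row now peaks at 1).
row_max = np.amax(cost, axis=1, keepdims=True)
norm_rows = cost / row_max

# Per-column max: axis=0 gives shape (1, 3), dividing each column
# by that column's maximum (each column now peaks at 1).
col_max = np.amax(cost, axis=0, keepdims=True)
norm_cols = cost / col_max
```

This also makes the underlying problem explicit: after a per-row normalization, colors are comparable only *within* a row, and after a per-column normalization only within a column, which is why neither version shows the joint trade-off.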
Plot 1:
Plot 2, where the number of eigenvalues is fixed at 10 and alpha varies (this should correspond to the 10th column of plot 1):
Plot 3, where alpha = 0.0001 and the eigenvalues vary (this should correspond to the first row of plot 1):
But as you can see, the results are different from plot 1!
Update:
Just to clarify things a bit more, this is how I read the data:
    import math
    import numpy as np
    from sklearn import datasets

    rng = np.random.RandomState(0)
    diabetes = datasets.load_diabetes()
    X_diabetes, y_diabetes = diabetes.data, diabetes.target
    X_diabetes = np.c_[np.ones(len(X_diabetes)), X_diabetes]  # prepend a bias column
    ind = np.arange(X_diabetes.shape[0])
    rng.shuffle(ind)
    #===============================================================================
    # Split Data
    #===============================================================================
    cross = math.ceil(0.7 * len(X_diabetes))
    ind_train = ind[:cross]
    X_train, y_train = X_diabetes[ind_train], y_diabetes[ind_train]
    ind_val = ind[cross:]
    X_val, y_val = X_diabetes[ind_val], y_diabetes[ind_val]
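As an aside, the same kind of 70/30 shuffled split can be produced with scikit-learn's `train_test_split`; a sketch assuming the standard sklearn API (the exact row assignment will differ from the manual index shuffle above, but the proportions match):

```python
import numpy as np
from sklearn import datasets
from sklearn.model_selection import train_test_split

diabetes = datasets.load_diabetes()
X = np.c_[np.ones(len(diabetes.data)), diabetes.data]  # bias column
y = diabetes.target

# 70/30 shuffled split, seeded for reproducibility.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, train_size=0.7, random_state=0)
```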
I have also uploaded the .csv files here:
log.csv, which contains the raw values of plot 1 before normalization
normalizedLog.csv, for plot 1
eigenConst.csv, for plot 2
alphaConst.csv, for plot 3