I am trying to follow the multiclass classification example in GPflow (using v2.1.3) described here:
https://gpflow.readthedocs.io/en/master/notebooks/advanced/multiclass_classification.html
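The difference from the example is that the X vector is 10-dimensional and the number of classes to predict is 5. However, there seems to be a dimensionality error when using inducing variables. I changed the kernel and use dummy data for reproducibility, just to get this code to run. I have included the dimensions of the variables below in case they matter. Any loss computation results in an error like: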
ValueError: Dimensions must be equal, but are 10 and 5 for '{{node truediv}} = RealDiv[T=DT_DOUBLE](strided_slice_2, truediv/softplus/forward/IdentityN)' with input shapes: [200,10], [5].
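It is as if it requires Y values for the inducing variables, although the example on the GPflow site does not need them, or as if it confuses the length of the X input with the number of classes to predict.
I tried expanding the dimensions of Y as in the GPflow classification implementation, but that did not help.
Reproducible code: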
import gpflow
from gpflow.utilities import ops, print_summary, set_trainable
from gpflow.config import set_default_float, default_float, set_default_summary_fmt
from gpflow.ci_utils import ci_niter
import random
import numpy as np
import tensorflow as tf
random.seed(0)  # random.choices below draws from Python's own RNG, so seed it too
np.random.seed(0)
tf.random.set_seed(123)
num_classes = 5
num_of_data_points = 1000
num_of_functions = num_classes
num_of_independent_vars = 10
data_gp_train = np.random.rand(num_of_data_points, num_of_independent_vars)
data_gp_train_target_hot = np.eye(num_classes)[np.array(random.choices(list(range(num_classes)), k=num_of_data_points))].astype(bool)
data_gp_train_target = np.apply_along_axis(np.argmax, 1, data_gp_train_target_hot)
data_gp_train_target = np.expand_dims(data_gp_train_target, axis=1)
data_gp = ( data_gp_train, data_gp_train_target )
lengthscales = [0.1]*num_classes
variances = [1.0]*num_classes
kernel = gpflow.kernels.Matern32(variance=variances, lengthscales=lengthscales)
# Robustmax Multiclass Likelihood
invlink = gpflow.likelihoods.RobustMax(num_of_functions) # Robustmax inverse link function
likelihood = gpflow.likelihoods.MultiClass(num_of_functions, invlink=invlink) # Multiclass likelihood
inducing_inputs = data_gp_train[::5].copy() # inducing inputs (20% of obs are inducing)
# inducing_inputs = data_gp_train[:200,:].copy() # inducing inputs (20% of obs are inducing)
m = gpflow.models.SVGP(
    kernel=kernel,
    likelihood=likelihood,
    inducing_variable=inducing_inputs,
    num_latent_gps=num_of_functions,
    whiten=True,
    q_diag=True,
)
set_trainable(m.inducing_variable, False)
print_summary(m)
opt = gpflow.optimizers.Scipy()
opt_logs = opt.minimize(
    m.training_loss_closure(data_gp), m.trainable_variables, options=dict(maxiter=ci_niter(1000))
)
print_summary(m, fmt="notebook")
Dimensions:
data_gp[0].shape
Out[132]: (1000, 10)
data_gp[1].shape
Out[133]: (1000, 5)
inducing_inputs.shape
Out[134]: (200, 10)
Error:
ValueError: Dimensions must be equal, but are 10 and 5 for '{{node truediv}} = RealDiv[T=DT_DOUBLE](strided_slice_2, truediv/softplus/forward/IdentityN)' with input shapes: [200,10], [5].
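
One possible reading of this error, offered as a sketch rather than a confirmed diagnosis: the shapes [200,10] and [5] match the inducing inputs and the 5-element lengthscales list, and the `truediv/softplus` node looks like the kernel dividing its inputs by a positive-constrained parameter. The NumPy snippet below only imitates that division. `scale_inputs` is a hypothetical helper, not part of GPflow, and `Z` is a stand-in for `inducing_inputs`.

```python
import numpy as np

num_classes = 5
num_of_independent_vars = 10

# Stand-in for the inducing inputs from the question: 200 points, 10 features.
Z = np.random.rand(200, num_of_independent_vars)


def scale_inputs(X, lengthscales):
    """Hypothetical helper: divide each input column by its lengthscale."""
    return X / np.asarray(lengthscales, dtype=float)


# One lengthscale per class, as in the question: (200, 10) / (5,) cannot broadcast.
try:
    scale_inputs(Z, [0.1] * num_classes)
    per_class_ok = True
except ValueError as err:
    per_class_ok = False
    print("per-class lengthscales fail:", err)

# One lengthscale per input dimension: (200, 10) / (10,) broadcasts column-wise.
scaled = scale_inputs(Z, [0.1] * num_of_independent_vars)
print(scaled.shape)
```

If that reading is right, the change to try in the code above would be `lengthscales = [0.1] * num_of_independent_vars` together with a scalar `variance=1.0`. This is an assumption about how the kernel uses these parameters and has not been verified against v2.1.3.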