I am trying to solve the following one-dimensional least absolute deviations (LAD) optimization problem:

    minimize over beta:  sum_i |x_i * beta - y_i|

and I am using bisection to find the best beta (a scalar). I have the following code:
    import numpy as np

    x = np.random.randn(10)
    y = np.sign(np.random.randn(10)) + np.random.randn(10)

    def find_best(x, y):
        best_obj = np.inf
        for beta in np.linspace(-100, 100, 1000000):
            obj = np.abs(beta * x - y).sum()
            if obj < best_obj:
                best_obj = obj
                best_beta = beta
        return best_obj, best_beta
    def solve_bisect(x, y):
        it = 0
        u = 100
        l = -100
        while True:
            it += 1
            if it > 40:
                # maxIter reached
                return obj, beta, subgrad
            # select the mid-point
            beta = (l + u) / 2
            # subgradient: d|x*beta - y|/d(beta) = sign(x*beta - y) * x away
            # from the kinks; the np.abs(x*beta - y) > 1e-5 mask is there to
            # avoid numerical issues at the kinks
            subgrad = (np.sign(x * beta - y) * (np.abs(x * beta - y) > 1e-5) * x).sum()
            obj = np.sum(np.abs(x * beta - y))
            print('obj = %f, subgrad = %f, current beta = %f' % (obj, subgrad, beta))
            # bisect: check the sign of the subgradient to decide which half
            # of the interval to cut out
            if np.abs(subgrad) < 1e-3:
                return obj, beta, subgrad
            elif subgrad > 0:
                u = beta + 1e-3
            else:
                l = beta - 1e-3
    brute_sol = find_best(x, y)
    bisect_sol = solve_bisect(x, y)
    print('brute_sol: obj = %f, beta = %f' % (brute_sol[0], brute_sol[1]))
    print('bisect_sol: obj = %f, beta = %f, subgrad = %f' % (bisect_sol[0], bisect_sol[1], bisect_sol[2]))
As you can see, I also have a brute-force implementation that scans the space to get an oracle answer (up to some numerical error). Every run finds the optimal beta and the minimal objective value. However, subgrad is not 0 (not even close). For example, one of my runs gave the following:
brute_sol: obj = 10.974381, beta = -0.440700
bisect_sol: obj = 10.974374, beta = -0.440709, subgrad = 0.840753
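My understanding of why this might happen: f(beta) = sum_i |x_i * beta - y_i| is convex and piecewise linear, with kinks at the breakpoints beta_i = y_i / x_i, and a minimizer always lands on one of those kinks, which is exactly where sign(x * beta - y) jumps. A quick sketch that checks this (the seeded data below is a stand-in, not the actual run shown above):

```python
import numpy as np

# Seeded stand-in data (an assumption, not the run printed above)
rng = np.random.default_rng(0)
x = rng.standard_normal(10)
y = np.sign(rng.standard_normal(10)) + rng.standard_normal(10)

# The kinks of f are the breakpoints beta_i = y_i / x_i, and a convex
# piecewise-linear function attains its minimum at a kink, so the best
# breakpoint is a global minimizer.
breakpoints = y / x
objs = np.abs(x[None, :] * breakpoints[:, None] - y).sum(axis=1)
best_beta = breakpoints[np.argmin(objs)]
best_obj = objs.min()

# Sanity check: no point on a fine grid beats the best breakpoint.
grid = np.linspace(-100, 100, 200001)
grid_objs = np.abs(x[None, :] * grid[:, None] - y).sum(axis=1)
print(best_obj <= grid_objs.min() + 1e-9)  # True
```

Since the minimizer sits on a kink, the one-sided subgradient sampled just next to it jumps between a negative and a positive value without ever passing near 0, which would match what I observe.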
The objective value and the best beta are correct, but subgrad is nowhere near 0. So my questions:

- Why is subgrad not close to 0? Isn't the optimality condition that beta is optimal if and only if 0 is in the subdifferential?
- What stopping criterion should we use instead?
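For the second question, this is the kind of test I was imagining as an alternative (a sketch only; `subdiff_interval` is a helper name I am making up here): at a kink, each term |x_i * beta - y_i| contributes the whole interval [-|x_i|, +|x_i|] to the subdifferential, so instead of checking one sampled subgradient one could check whether 0 lies between the left and right derivatives:

```python
import numpy as np

def subdiff_interval(x, y, beta, tol=1e-8):
    """Endpoints of the subdifferential of f(b) = sum(|x*b - y|) at beta.

    Smooth terms contribute sign(x*beta - y) * x; terms sitting at a kink
    (|x*beta - y| <= tol) contribute the whole interval [-|x_i|, +|x_i|].
    """
    r = x * beta - y
    smooth = np.abs(r) > tol
    g = (np.sign(r[smooth]) * x[smooth]).sum()
    kink = np.abs(x[~smooth]).sum()
    return g - kink, g + kink

# Toy check: for x = (1, 1, 1), f(beta) = |beta| + |beta - 1| + |beta - 2|
# is minimized at the median beta = 1, where the subdifferential is [-1, 1].
lo, hi = subdiff_interval(np.array([1.0, 1.0, 1.0]),
                          np.array([0.0, 1.0, 2.0]), 1.0)
print(lo, hi)         # -1.0 1.0
print(lo <= 0 <= hi)  # True
```

Would "stop when lo <= 0 <= hi" (or when the bracket [l, u] is shorter than some tolerance) be the right criterion here?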