5

I am trying to implement NMF with Alternating Least Squares method. I am just curious about the following basic implementation of the problem: enter image description here

If I understand correctly, we can solve for each matrix equation stated in this pseudocode without nonnegativity constraints, with closed form solution and set the negative entries to 0, in a brute force way. Is this understanding correct? Is this a basic alternative to more complicated, constrained optimization problems, where we use projected gradient descent, for example? More importantly, if implemented in this basic way, will the algorithm have any practical value? I want to use NMF for variable reduction purposes and it is important that I use NMF, since my data is by definition non-negative. I am looking for opinions on this one.

4

3 回答 3

3
  1. 如果我理解正确,我们可以在没有非负约束的情况下求解此伪代码中所述的每个矩阵方程,使用封闭形式的解并将负条目设置为 0,以蛮力方式。这种理解正确吗?是的。

  2. 这是更复杂、受约束的优化问题的基本替代方案,例如,我们使用投影梯度下降?---从某种意义上说,是的。这确实是一种快速的非负因式分解方法。然而,与 NMF 相关的文章会指出,这种方法虽然速度快,但并不能保证非负因子的收敛。更好的实现是 NMF 的分层交替最小二乘法 (HALS-NMF)。查看这篇论文以比较一些流行的 NMF 算法:http ://www.cc.gatech.edu/~hpark/papers/jgo.pdf

  3. 更重要的是,如果按照这种基本的方式实现,算法会有什么实用价值吗?仅根据我的经验,我会说结果不如 HALS 或 BPP(Block Pivoting Principle)好。

于 2015-12-14T14:11:55.700 回答
1

在这个算法中使用非负最小二乘法而不是剪掉负值在这个算法中显然会更好,但总的来说我不会推荐这种基本的 ALS/ANNLS 方法,因为它的收敛性很差(它经常波动甚至可以显示发散) - 一种更好的方法的最小 Matlab 实现,NMF 的加速分层交替最小二乘法(Cichocki 等人的),这是目前最快的方法之一,如下所示(Nicolas Gillis 的代码):

% Accelerated hierarchical alternating least squares (HALS) algorithm of
% Cichocki et al. 
%
% See N. Gillis and F. Glineur, "Accelerated Multiplicative Updates and 
% Hierarchical ALS Algorithms for Nonnegative Matrix Factorization”, 
% Neural Computation 24 (4), pp. 1085-1105, 2012. 
% See http://sites.google.com/site/nicolasgillis/ 
%
% [U,V,e,t] = HALSacc(M,U,V,alpha,delta,maxiter,timelimit)
%
% Input.
%   M              : (m x n) matrix to factorize
%   (U,V)          : initial matrices of dimensions (m x r) and (r x n)
%   alpha          : nonnegative parameter of the accelerated method
%                    (alpha=0.5 seems to work well)
%   delta          : parameter to stop inner iterations when they become
%                    inneffective (delta=0.1 seems to work well). 
%   maxiter        : maximum number of iterations
%   timelimit      : maximum time alloted to the algorithm
%
% Output.
%   (U,V)    : nonnegative matrices s.t. UV approximate M
%   (e,t)    : error and time after each iteration, 
%               can be displayed with plot(t,e)
%
% Remark. With alpha = 0, it reduces to the original HALS algorithm.  

function [U,V,e,t] = HALSacc(M,U,V,alpha,delta,maxiter,timelimit)

% Initialization
etime = cputime; nM = norm(M,'fro')^2; 
[m,n] = size(M); [m,r] = size(U);
a = 0; e = []; t = []; iter = 0; 

if nargin <= 3, alpha = 0.5; end
if nargin <= 4, delta = 0.1; end
if nargin <= 5, maxiter = 100; end
if nargin <= 6, timelimit = 60; end

% Scaling, p. 72 of the thesis
eit1 = cputime; A = M*V'; B = V*V'; eit1 = cputime-eit1; j = 0;
scaling = sum(sum(A.*U))/sum(sum( B.*(U'*U) )); U = U*scaling; 
% Main loop
while iter <= maxiter && cputime-etime <= timelimit
    % Update of U
    if j == 1, % Do not recompute A and B at first pass
        % Use actual computational time instead of estimates rhoU
        eit1 = cputime; A = M*V'; B = V*V'; eit1 = cputime-eit1; 
    end
    j = 1; eit2 = cputime; eps = 1; eps0 = 1;
    U = HALSupdt(U',B',A',eit1,alpha,delta); U = U';
    % Update of V
    eit1 = cputime; A = (U'*M); B = (U'*U); eit1 = cputime-eit1;
    eit2 = cputime; eps = 1; eps0 = 1; 
    V = HALSupdt(V,B,A,eit1,alpha,delta); 
    % Evaluation of the error e at time t
    if nargout >= 3
        cnT = cputime;
        e = [e sqrt( (nM-2*sum(sum(V.*A))+ sum(sum(B.*(V*V')))) )]; 
        etime = etime+(cputime-cnT);
        t = [t cputime-etime];
    end
    iter = iter + 1; j = 1; 
end

% Update of V <- HALS(M,U,V)
% i.e., optimizing min_{V >= 0} ||M-UV||_F^2 
% with an exact block-coordinate descent scheme
function V = HALSupdt(V,UtU,UtM,eit1,alpha,delta)
[r,n] = size(V); 
eit2 = cputime; % Use actual computational time instead of estimates rhoU
cnt = 1; % Enter the loop at least once
eps = 1; eps0 = 1; eit3 = 0;
while cnt == 1 || (cputime-eit2 < (eit1+eit3)*alpha && eps >= (delta)^2*eps0)
    nodelta = 0; if cnt == 1, eit3 = cputime; end
        for k = 1 : r
            deltaV = max((UtM(k,:)-UtU(k,:)*V)/UtU(k,k),-V(k,:));
            V(k,:) = V(k,:) + deltaV;
            nodelta = nodelta + deltaV*deltaV'; % used to compute norm(V0-V,'fro')^2;
            if V(k,:) == 0, V(k,:) = 1e-16*max(V(:)); end % safety procedure
        end
    if cnt == 1
        eps0 = nodelta; 
        eit3 = cputime-eit3; 
    end
    eps = nodelta; cnt = 0; 
end

有关完整代码和与其他方法的比较,请参阅 https://sites.google.com/site/nicolasgillis/code (用于 NMF 的加速 MU 和 HALS 算法部分)和 N. Gillis 和 F. Glineur,“加速乘法更新和分层ALS 非负矩阵分解算法”,神经计算 24 (4),第 1085-1105 页,2012 年。

于 2017-03-22T08:44:30.240 回答
0

是的,这可以做到,但不,你不应该这样做。

NMF 的瓶颈不是非负最小二乘计算,而是最小二乘方程右侧的计算和损失计算(如果用于确定收敛)。根据我的经验,使用快速 NNLS 求解器,与基本最小二乘求解相比,NNLS 增加了不到 1% 的相对运行时间。现在(也许当你问这个问题时不是)有非常快的方法,例如 TNT-NN 和顺序坐标下降,它们使事情变得非常快。

我试过这种方法,模型质量真的很差。它几乎不会让人想起 HALS 或乘法更新。

于 2021-03-06T23:53:40.147 回答