cpNonNeg

Mirror from this repository

This is an implementation of the CANDECOMP/PARAFAC (CP) model for tensor factorization of non-negative data. Missing values are handled by marginalization, i.e. they are ignored during optimization. The code was used in a project at the Technical University of Denmark as part of one of the authors' master's degree, which ultimately led to the publication "Non-negative Tensor Factorization with missing data for the modeling of gene expressions in the Human Brain". The implementation is entirely in MATLAB.
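
As a rough illustration of what marginalization means here, assume a three-way tensor and a least-squares fit (the script below reports a sum of squared errors, but the exact objective used by the implementation is not stated in this README). The symbols $A$, $B$, $C$ for the factor matrices and $\mathcal{O}$ for the set of observed (non-NaN) entries are notation introduced only for this sketch:

$$
\min_{A,B,C \,\ge\, 0} \; \sum_{(i,j,k)\in\mathcal{O}} \Big( x_{ijk} - \sum_{d=1}^{D} a_{id}\, b_{jd}\, c_{kd} \Big)^{2}
$$

Entries outside $\mathcal{O}$ contribute nothing to the sum, so they do not influence the estimated factors.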

Includes:

  • cpNonNeg.m - main MATLAB function
  • cpNonNeg_sub.m - NMF solver for CP-subproblem
  • krprod.m - Khatri-Rao product for tensors
  • matricizing.m - matricizing operation
  • tmult.m - tensor multiplication (mode specific)
  • unmatricizing.m - tensor reconstruction from matrix
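
To show how the Khatri-Rao product and matricization fit together in a CP model, here is a minimal, self-contained sketch. It does not use or reproduce the repository's krprod.m or matricizing.m; it relies only on built-in functions, and the factor names A, B, C and the sizes are assumptions chosen for illustration. It builds the mode-1 unfolding of a small three-way CP tensor and checks one entry against the element-wise definition.

```matlab
% Minimal sketch (not the repository's krprod.m / matricizing.m):
% Khatri-Rao product and mode-1 unfolding for a small 3-way CP tensor.
A = rand(4,2);  % mode-1 factor (4 x D), hypothetical sizes
B = rand(3,2);  % mode-2 factor (3 x D)
C = rand(5,2);  % mode-3 factor (5 x D)
D = size(A,2);

% Khatri-Rao product: column-wise Kronecker product of C and B
KR = zeros(size(C,1)*size(B,1), D);
for d = 1:D
    KR(:,d) = kron(C(:,d), B(:,d));
end

% Mode-1 unfolding of the CP tensor, then fold back to 4 x 3 x 5
X1 = A*KR';
X  = reshape(X1, [size(A,1) size(B,1) size(C,1)]);

% Check one entry against the element-wise CP definition
i = 2; j = 3; k = 4;
x_direct = sum(A(i,:).*B(j,:).*C(k,:));
err = abs(X(i,j,k) - x_direct);
```

The ordering kron(C, B) matches MATLAB's column-major reshape, where the second index varies faster than the third in the unfolded columns.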

Example function call:

The following script (available in the repository) shows basic usage of the code. NB! The script terminates quickly and does not necessarily give meaningful results, because the data are random.

% example script
%% Generate synthetic data
D_true = 5;
N = [1000 50 25]; % Tensor dimensions
Nx = length(N);
F = cell(Nx,1);
for i = 1:Nx
        F{i} = rand(N(i),D_true);
end

% Diagonal (identity) core tensor; the indexing I(j,j,j) below assumes Nx = 3
I=zeros(D_true*ones(1,Nx));
for j=1:D_true
        I(j,j,j)=1;
end

% Data tensor: multiply the identity tensor by each factor along its mode
Y=tmult(I,F{1},1);
for ip = 2:Nx
        Y=tmult(Y,F{ip},ip);
end

sig2 = 0.5; % noise level
C = 5; % constant offset to ensure non-negativity
X = Y + sqrt(sig2)*randn(N) + C*ones(N);

assert(min(X(:))>0);

%% Holdout missing data
p = 0.20; % holdout fraction (missing data)
NE = numel(X); % number of elements in tensor
R = rand(NE,1)>(1-p); % holdout logical indices
X(R) = nan; % missing values are treated as NaN

%% Model specification
D = 5; % number of latent components in the model
Finit = cell(Nx,1); % initialization of factors (default)
scale = nanstd(X(:)); % scale of data (nanstd requires the Statistics and Machine Learning Toolbox)

for i = 1:Nx
   Finit{i}=(scale.^(1/Nx))*rand(N(i),D); 
end

% options
options.maxiter = 250; % maximum number of iterations
options.mu = 0; % no multiplicative update steps are taken
options.hals = 1; % hierarchical alternating least squares steps are taken

%% Run
[FACT,SSEv,CPUt]=cpNonNeg(X,D,Finit,options);

% FACT returns the factors in a cell array with the same layout as Finit
% SSEv is a vector with the sum of squared (reconstruction) errors in each
% iteration
% CPUt is the CPU time used in each iteration
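
Because the script above overwrites the held-out entries with NaN, evaluating a fit on them requires keeping a copy of the data first. The following self-contained sketch shows one way to reconstruct a three-way tensor from a factor cell array laid out like FACT and to compute the squared error separately on observed and held-out entries. It does not call cpNonNeg; the factors, sizes, noise level, and holdout mask are small hypothetical stand-ins, and the reconstruction via kron is an assumption of this sketch rather than the repository's own routine.

```matlab
% Minimal sketch, assuming a 3-way tensor and factors stored as in FACT.
% All data here are hypothetical stand-ins, not output of cpNonNeg.
N = [6 5 4];   % tensor dimensions
D = 2;         % number of components
F = cell(3,1);
for i = 1:3
    F{i} = rand(N(i), D);
end

% Reconstruct the CP tensor: the mode-1 unfolding is F{1}*KR', where KR
% is the column-wise Kronecker (Khatri-Rao) product of F{3} and F{2}
KR = zeros(N(3)*N(2), D);
for d = 1:D
    KR(:,d) = kron(F{3}(:,d), F{2}(:,d));
end
Xhat = reshape(F{1}*KR', N);

% Noisy data; keep a full copy before marking entries as missing
X = Xhat + 0.1*randn(N);
X_full = X;
R = false(N);
R(1:5:end) = true;   % deterministic holdout mask (every 5th entry)
X(R) = nan;          % missing values are NaN, as in the script above

% Squared error over observed entries only (NaN entries are skipped)
obs = ~isnan(X);
sse_obs = sum((X(obs) - Xhat(obs)).^2);

% Squared error on the held-out entries, using the saved copy
sse_held = sum((X_full(R) - Xhat(R)).^2);
```

With factors returned by a real run, Xhat would be built from FACT in the same way, and sse_held would give an estimate of how well the model predicts the missing entries.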

Written by: Søren Føns Vind Nielsen and Morten Mørup, CogSys, Technical University of Denmark, May 2014

About

Canonical Polyadic Non Negative Tensor Factorization
