vrnmf: Volume-regularized NMF

The R package vrnmf implements a set of methods to perform non-negative matrix decomposition with minimum volume constraints. The general problem is to decompose a non-negative matrix X into a product of a non-negative matrix C and a matrix D of lower rank r: $X = C \cdot D$. With an additional non-negativity constraint on the matrix D, the problem is known as NMF.
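
As a minimal illustration of this decomposition, the base R sketch below builds a low-rank non-negative matrix from two simulated factors. The dimensions, seed, and variable names are assumptions chosen for illustration, and the sketch does not use the vrnmf API.

```r
set.seed(42)
n <- 8; m <- 10; r <- 3                  # illustrative sizes (assumed)

C <- matrix(runif(n * r), nrow = n)      # n x r non-negative factor
D <- matrix(runif(r * m), nrow = r)      # r x m factor, non-negative here (the NMF case)
X <- C %*% D                             # n x m non-negative matrix

dim(X)                                   # n rows, m columns
qr(X)$rank                               # numerical rank, at most r
all(X >= 0)                              # product of non-negative factors is non-negative
```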

This problem, with NMF as a particular case, is not identifiable in general: potentially many different solutions deliver the same decomposition quality [1]. This both makes interpretation of the factorized matrices challenging and limits applications of NMF to instrumental dimensionality reduction. However, recent theoretical advances have shown that the issue can be overcome under a relatively mild assumption that the column vectors of C are “sufficiently spread” [2-3]: if the matrix C is non-negative and has sufficiently spread column vectors, then volume minimization of the matrix D delivers a correct and unique solution (C, D), up to scaling and permutation.
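
The non-identifiability can be seen in a small base R sketch: mixing the factors with an invertible matrix Q yields a second non-negative pair that reproduces X equally well. The matrix Q, the simulated data, and the use of det(D Dᵀ) as a volume measure are all assumptions made for illustration; the package may define volume differently.

```r
set.seed(1)
n <- 6; m <- 5; r <- 2
C <- matrix(runif(n * r), nrow = n)                    # non-negative n x r factor
D <- matrix(runif(r * m, min = 1, max = 2), nrow = r)  # entries in [1, 2] keep D2 below non-negative
X <- C %*% D

Q  <- matrix(c(0.9, 0.1, 0.1, 0.9), nrow = r)          # invertible mixing matrix (assumed)
C2 <- C %*% Q                                          # alternative first factor
D2 <- solve(Q) %*% D                                   # alternative second factor

max(abs(X - C2 %*% D2))                                # numerically zero: same fit
all(C2 >= 0) && all(D2 >= 0)                           # both alternatives are non-negative

vol <- function(M) det(M %*% t(M))                     # illustrative volume measure
vol(D2) / vol(D)                                       # equals 1 / det(Q)^2
```

Both pairs fit X exactly, yet their volumes differ, which is the kind of information a volume criterion can use to choose among otherwise equivalent solutions.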

The AnchorFree approach enables efficient estimation of matrix C by reformulating the problem in the covariance domain and applying the volume minimization criterion there [4]. A short walkthrough can be found here.
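
To give a rough sense of what working in the covariance domain means, the sketch below checks that the second-order matrix of X factorizes through C. Using the uncentered, unnormalized product X Xᵀ is an assumption made for illustration; the exact statistic and normalization used by AnchorFree [4] and by the package may differ.

```r
set.seed(7)
n <- 5; m <- 200; r <- 3
C <- matrix(runif(n * r), nrow = n)   # n x r non-negative factor
D <- matrix(runif(r * m), nrow = r)   # r x m factor
X <- C %*% D

P <- X %*% t(X)                       # n x n second-order matrix of X
E <- D %*% t(D)                       # r x r second-order matrix of D

max(abs(P - C %*% E %*% t(C)))        # numerically zero: P = C E C^T
dim(P)                                # size depends on n only, not on m
```

Because P is n x n regardless of the number of columns m, estimating C from P can be much cheaper than working with X directly when m is large.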

A more general formulation of the problem that accounts for noise in matrix X, such that $X \approx C \cdot D$ holds only approximately, is called volume-regularized NMF (vrnmf). To balance the goodness of the matrix approximation against the volume of matrix D, vrnmf minimizes the following objective function [5-6]:

$$\| X - C \cdot D \|_F^2 + \lambda \cdot \mathrm{Vol}(D)$$

We provide an implementation of the vrnmf approach and devise its reformulation in the covariance domain.
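
The trade-off between approximation quality and the volume of D can be sketched as a small base R function. The function name `vol_reg_objective`, the squared Frobenius fit term, the log-determinant volume surrogate, and the parameters `lambda` and `delta` are all assumptions for illustration; this is not the package's implementation or API.

```r
# Hypothetical helper: fit term plus a weighted volume penalty on D.
vol_reg_objective <- function(X, C, D, lambda, delta = 1e-8) {
  fit <- sum((X - C %*% D)^2)                        # squared Frobenius norm of the residual
  G   <- D %*% t(D) + delta * diag(nrow(D))          # small ridge keeps G positive definite
  vol <- as.numeric(determinant(G, logarithm = TRUE)$modulus)  # log det as volume surrogate
  fit + lambda * vol
}

set.seed(3)
C <- matrix(runif(6 * 2), nrow = 6)
D <- matrix(runif(2 * 5), nrow = 2)
X <- C %*% D

vol_reg_objective(X, C, D, lambda = 0)     # fit term only, numerically zero for exact factors
vol_reg_objective(X, C, D, lambda = 0.1)   # adds the weighted volume term
```

Larger values of `lambda` put more weight on a small volume at the expense of the fit.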

Installation

To install the stable version from CRAN, use:

install.packages('vrnmf')

To install the latest version from GitHub, use:

install.packages('devtools')
devtools::install_github('kharchenkolab/vrnmf')

References

R package

The R package can be cited as:

Ruslan Soldatov, Peter Kharchenko, Viktor Petukhov, and Evan Biederstedt (2021). vrnmf:
Volume-regularized structured matrix factorization. R package version
1.0.2. https://github.com/kharchenkolab/vrnmf

Publication

If you find this software useful for your research, please cite the corresponding paper:

Vladimir B. Seplyarskiy, Ruslan A. Soldatov, et al.
Population sequencing data reveal a compendium of mutational processes in the human germ line.
Science, 12 Aug 2021. doi: 10.1126/science.aba7408