Matrix completion is a procedure for imputing the missing entries of a matrix by using the information in its observed entries.

Matrix completion has attracted a lot of attention and is widely applied in:

- tabular data imputation: recover the missing elements in a data table;
- recommender systems: estimate users' potential preferences for items they have not yet purchased;
- image inpainting: fill in the missing pixels in digital images.

**eimpute** is a computationally efficient R package developed for
matrix completion. In **eimpute**, the matrix completion problem is
solved by iteratively performing low-rank approximation and data
calibration, which offers two advantages:

- unbiased low-rank approximation for incomplete matrices

- lower time cost via truncated SVD
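
The general idea of alternating a low-rank approximation with a calibration step can be sketched in base R. This is only an illustration and not the code inside **eimpute**: the function name `lowrank_impute_sketch` and its arguments are hypothetical, missing cells are assumed to start at zero, and a full `svd()` is used in place of a truncated one, so it does not show the speed advantage listed above.

```r
# Hypothetical helper, not part of eimpute: rank-r iterative SVD imputation.
lowrank_impute_sketch <- function(x, r, max_iter = 100, tol = 1e-6) {
  miss <- is.na(x)
  z <- x
  z[miss] <- 0                                  # assumed starting values
  for (iter in seq_len(max_iter)) {
    s <- svd(z, nu = r, nv = r)                 # full SVD, keep r leading factors
    low_rank <- s$u %*% (s$d[seq_len(r)] * t(s$v))  # rank-r approximation
    z_new <- x
    z_new[miss] <- low_rank[miss]               # calibration: observed cells stay fixed
    change <- sqrt(sum((z_new - z)^2)) / max(sqrt(sum(z^2)), 1e-12)
    z <- z_new
    if (change < tol) break                     # stop once the fill-in stabilizes
  }
  z
}

# Small rank-1 matrix with two entries removed.
truth <- outer(1:4, 1:3)
x <- truth
x[2, 3] <- NA
x[4, 1] <- NA
filled <- lowrank_impute_sketch(x, r = 1)
```

Only the missing cells of `filled` change between iterations; the observed cells are copied back from `x` at every step, which is the role played by the calibration step in this sketch.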

We compare **eimpute** and **softimpute** on synthetic datasets
\(X_{m \times m}\) with a proportion \(p\) of missing observations. The
square matrix \(X_{m \times m}\) is generated by \(X = UV + \epsilon\),
where \(U\) and \(V\) are \(m \times r\) and \(r \times n\) matrices
(with \(n = m\) here) whose entries are i.i.d. samples from the standard
normal distribution, and \(\epsilon \sim N(0, r/3)\).

- \(m\) is chosen as 1000, 2000, 3000, 4000;
- \(p\) is chosen as 0.1, 0.5, 0.9.
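
One way to generate a dataset of this form in base R is sketched below. Several choices are assumptions that the text does not fix: the rank `r = 5`, the reduced size `m = 200` (chosen so the example runs quickly), reading \(N(0, r/3)\) as mean 0 and variance \(r/3\), and selecting the missing positions uniformly at random.

```r
set.seed(2024)   # arbitrary seed, for reproducibility only
m <- 200         # smaller than the sizes used in the comparison
r <- 5           # assumed rank; the value used in the comparison is not stated
p <- 0.5         # proportion of missing entries

u <- matrix(rnorm(m * r), nrow = m, ncol = r)
v <- matrix(rnorm(r * m), nrow = r, ncol = m)
# Assumption: N(0, r/3) means variance r/3, so the sd is sqrt(r/3).
eps <- matrix(rnorm(m * m, mean = 0, sd = sqrt(r / 3)), nrow = m)
x <- u %*% v + eps

# Assumption: missing positions are drawn uniformly at random.
miss_idx <- sample(m * m, size = round(p * m * m))
x_na <- x
x_na[miss_idx] <- NA
mean(is.na(x_na))  # realized proportion of missing entries
```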

In the high-dimensional case, the als method in **softimpute** is
slightly faster than **eimpute** when the proportion of missing
observations is low. As the proportion of missing observations
increases, the rsvd method in **eimpute** outperforms **softimpute**
in both time cost and test error. Of the two methods in **eimpute**,
rsvd has a lower time cost than tsvd.

Install the stable version from CRAN:

`install.packages("eimpute")`

Install the development version from GitHub:

```
library(devtools)
install_github("Mamba413/eimpute", build_vignettes = TRUE)
```

We start with a toy example. Let us generate a small matrix with some
values missing via the **incomplete.generator** function.

```
library(eimpute)

# Dimensions (m x n) and rank r of the toy matrix
m <- 6
n <- 5
r <- 3
x_na <- incomplete.generator(m, n, r)
x_na
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] -0.8269428 1.2228586 NA NA NA
#> [2,] -2.2410010 4.5095165 NA NA NA
#> [3,] 0.4499102 NA -0.2818085 0.7718102 -0.8364048
#> [4,] NA 1.7167365 0.9480745 NA 3.5680208
#> [5,] NA 0.7240437 NA NA 0.2633712
#> [6,] NA -2.8879249 NA 1.2027552 NA
```

Use the **eimpute** function to impute the missing values.

```
x_impute <- eimpute(x_na, r)
x_impute[["x.imp"]]
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] -0.8269428 1.2228586 0.19035820 0.9514541 0.2994880
#> [2,] -2.2410010 4.5095165 0.39560039 0.7295574 0.4911418
#> [3,] 0.4499102 -1.2083884 -0.28180850 0.7718102 -0.8364048
#> [4,] -0.3408353 1.7167365 0.94807452 0.1835412 3.5680208
#> [5,] -0.3669454 0.7240437 0.11988844 0.3294654 0.2633712
#> [6,] 1.3875965 -2.8879249 0.01871091 1.2027552 0.4512052
```