HotellingEllipse

Christian L. Goueguel

The HotellingEllipse package helps draw the Hotelling ellipse on a PCA or PLS score scatterplot. It computes the Hotelling’s T\(^2\) value, the semi-major axis (denoted a), and the semi-minor axis (denoted b), along with the x-y coordinates for drawing a confidence ellipse based on Hotelling’s T\(^2\). Specifically, there are two functions available: ellipseParam(), which returns the T\(^2\) values and the lengths of the semi-axes, and ellipseCoord(), which returns the x-y coordinates of the ellipse.

Data

library(HotellingEllipse)
library(FactoMineR)  # PCA()
library(tidyverse)   # %>%, select(), pluck(), as_tibble()
data("specData")

In this example, we use FactoMineR::PCA() to perform a principal component analysis (PCA) on specData, a LIBS spectral dataset, and then extract the PCA scores as a tibble with tibble::as_tibble().

set.seed(123)
pca_mod <- specData %>%
  select(where(is.numeric)) %>%
  PCA(scale.unit = FALSE, graph = FALSE)
pca_scores <- pca_mod %>%
  pluck("ind", "coord") %>%
  as_tibble()
pca_scores
#> # A tibble: 171 x 5
#>      Dim.1   Dim.2   Dim.3   Dim.4   Dim.5
#>      <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#>  1 144168. -36399.   2228.   -670.  13805.
#>  2 118520. -31465.  16300. -20686. -13872.
#>  3  90303. -28356.  31340. -60615.  15157.
#>  4 107107. -38209.  24897. -60366.  19449.
#>  5  74350.  -2148.  29814.  -8351.    494.
#>  6  97511. -17932.  22254. -15406.  -4195.
#>  7  82142.  19297. -34299. -12498.   -648.
#>  8  76261.  16566. -34382. -16293.    137.
#>  9  73705.  31091. -22577. -17182.   2438.
#> 10  68042.  25124. -26063. -19389.   6051.
#> # … with 161 more rows

Hotelling ellipse: semi-axes

To add a confidence ellipse, we use the function ellipseParam(). We want to compute the length of the ellipse semi-axes for bivariate data within the PC1-PC2 subspace. To do this, we set the number of components, k, to 2, while the pcx and pcy inputs are respectively set to 1 and 2.

res <- ellipseParam(data = pca_scores, k = 2, pcx = 1, pcy = 2)
str(res)
#> List of 4
#>  $ Tsquare     : tibble[,1] [171 × 1] (S3: tbl_df/tbl/data.frame)
#>   ..$ value: num [1:171] 2.28 2.65 8 8.63 1.05 ...
#>  $ Ellipse     : tibble[,4] [1 × 4] (S3: tbl_df/tbl/data.frame)
#>   ..$ a.99pct: num 319536
#>   ..$ b.99pct: num 91816
#>   ..$ a.95pct: num 256487
#>   ..$ b.95pct: num 73699
#>  $ cutoff.99pct: num 9.52
#>  $ cutoff.95pct: num 6.14
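
The two cutoffs in this output can be reproduced, to the displayed precision, from the F-distribution. The following is a minimal sketch in base R, assuming the limit takes the common form k(n - 1)/(n - k) multiplied by the F quantile with k and n - k degrees of freedom. This formula is an assumption checked against the printed values, not taken from the package source, and the helper name t2_cutoff is hypothetical.

```r
# Hypothetical helper: Hotelling's T^2 limit for n observations and k components,
# assuming the form k * (n - 1) / (n - k) * F_quantile(k, n - k).
t2_cutoff <- function(n, k, conf.limit) {
  (k * (n - 1) / (n - k)) * qf(conf.limit, df1 = k, df2 = n - k)
}

n <- 171  # number of rows in pca_scores
k <- 2    # number of components passed to ellipseParam()

cutoff_99 <- t2_cutoff(n, k, 0.99)
cutoff_95 <- t2_cutoff(n, k, 0.95)
round(c(cutoff_99, cutoff_95), 2)
```

Under this assumption, each semi-axis would scale with the square root of the cutoff, which agrees with the ratio a.99pct / a.95pct of about 1.25 in the output above.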

We can extract parameters for further use:

a1 <- pluck(res, "Ellipse", "a.99pct")
b1 <- pluck(res, "Ellipse", "b.99pct")
a2 <- pluck(res, "Ellipse", "a.95pct")
b2 <- pluck(res, "Ellipse", "b.95pct")
Tsq <- pluck(res, "Tsquare", "value")
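
One possible use of these values is to flag observations whose T\(^2\) exceeds a cutoff. The sketch below is only an illustration in base R: it hard-codes the first five T\(^2\) values and the two cutoffs printed by str(res), and the names tsq_head and status are hypothetical.

```r
# First five T^2 values and the two cutoffs, copied from the str(res) output.
tsq_head  <- c(2.28, 2.65, 8, 8.63, 1.05)
cutoff_95 <- 6.14
cutoff_99 <- 9.52

# Classify each observation by the highest limit it exceeds.
status <- cut(
  tsq_head,
  breaks = c(-Inf, cutoff_95, cutoff_99, Inf),
  labels = c("inside 95%", "between 95% and 99%", "outside 99%")
)
data.frame(obs = seq_along(tsq_head), Tsq = tsq_head, status = status)
```

For the full dataset, tsq_head would be replaced by Tsq, and the cutoffs by pluck(res, "cutoff.95pct") and pluck(res, "cutoff.99pct").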

Hotelling ellipse: x and y coordinates

Another way to add the Hotelling ellipse is to use the function ellipseCoord(). This function provides the x and y coordinates of the confidence ellipse at a user-defined confidence level. The confidence level, conf.limit, is set to 95% by default. Below, the x-y coordinates are estimated from data projected into the PC1-PC3 subspace.

xy_coord <- ellipseCoord(data = pca_scores, pcx = 1, pcy = 3, conf.limit = 0.95, pts = 500)
str(xy_coord)
#> tibble[,2] [500 × 2] (S3: tbl_df/tbl/data.frame)
#>  $ x: num [1:500] 256487 256466 256405 256304 256161 ...
#>  $ y: num [1:500] -1.73e-12 7.93e+02 1.59e+03 2.38e+03 3.17e+03 ...
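
To see how coordinates of this kind relate to the semi-axes, the sketch below generates points with the standard parametric form of an ellipse in base R. It assumes an ellipse centered at the origin and aligned with the score axes. That is an assumption for illustration, not a description of how ellipseCoord() is implemented. The semi-axes are the 95% values for the PC1-PC2 subspace reported by ellipseParam(), and the name ellipse_xy is hypothetical.

```r
# 95% semi-axes for the PC1-PC2 subspace, copied from the str(res) output.
a   <- 256487
b   <- 73699
pts <- 500

# Parametric form of an axis-aligned ellipse centered at the origin.
theta <- seq(0, 2 * pi, length.out = pts)
ellipse_xy <- data.frame(x = a * cos(theta), y = b * sin(theta))
head(ellipse_xy, 3)
```

The first point is (a, 0), in line with the x column above, which starts at 256487, the value of a.95pct.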