Fitting 2D Gaussians to Data

Vikram B. Baliga

2021-01-10

Overview

The function fit_gaussian_2D() can be used fit 2D Gaussians to data, and has several methods for how the fitting is implemented. This vignette will run you through what these methods mean with worked examples.

We’ll begin by loading gaussplotR and loading the sample data set provided within. The raw data we’d like to use are in columns 1:3, so we’ll shave the data set down to those columns before running through the examples.

library(gaussplotR)

## We'll also use lattice, ggplot2 and metR
library(lattice); library(ggplot2); library(metR)

## Load the sample data set
data(gaussplot_sample_data)

## The raw data we'd like to use are in columns 1:3
samp_dat <-
  gaussplot_sample_data[,1:3]

It generally helps to plot the data beforehand to get a sense of its overall shape. We’ll simply produce a contour plot.

lattice::levelplot(
  response ~ X_values * Y_values,
  data = samp_dat,
  col.regions = colorRampPalette(
    c("white", "blue")
    )(100),
  xlim = c(-5, 0),
  ylim = c(-1, 4), 
  asp = 1
)

The method and constrain_orientation arguments

method

gaussplotR::fit_gaussian_2D() has three main options for its method argument: 1) "elliptical", 2) "elliptical_log", or 3) "circular".

The most generic method (and the default) is method = "elliptical". This allows the fitted 2D Gaussian to take an ellipsoid shape. If you would like the best-fitting 2D Gaussian, this is most likely your best bet.

A slightly-altered method to fit an ellipsoid Gaussian is available in method = "elliptical_log". This method follows Priebe et al. 20031 and is geared towards use with log2-transformed data.

A third option is method = "circular". This produces a very simple 2D Gaussian that is constrained to have to have a roughly circular shape (i.e. spread in X- and Y- are roughly equal).

An additional argument, constrain_orientation gives additional control over the orientation of the fitted Gaussian. By default, the constrain_orientation is "unconstrained", meaning that the best-fit orientation is returned.

constrain_orientation

Setting constrain_orientation to a numeric (e.g. constrain_orientation = pi/2) will force the orientation of the Gaussian to the specified value, but this is only available when using method = "elliptical" or method = "elliptical_log"

Note that supplying a numeric to constrain_orientation is handled differently by method = "elliptical" vs method = "elliptical_log". With method = "elliptical", a theta parameter dictates the rotation, in radians, from the x-axis in the clockwise direction. Thus, using method = "elliptical", constrain_orientation = pi/2 will return parameters for an elliptical 2D Gaussian that is constrained to a 90-degree (pi/2) orientation. In contrast, the method = "elliptical_log" procedure uses a Q parameter to determine the orientation of the 2D Gaussian. Setting method = "elliptical_log", constrain_orientation = 0 will result in a diagonally-oriented Gaussian, whereas setting constrain_orientation = -1 will result in horizontal orientation. Again, see Priebe et al. 2003 for more details.

Example 1: Unconstrained elliptical

Unconstrained ellipticals are the default option and are generally recommended for most purposes. Here’s an example:

gauss_fit_ue <-
    fit_gaussian_2D(samp_dat)

gauss_fit_ue
#> $coefs
#>         A_o      Amp    theta    X_peak   Y_peak         a         b
#> 1 0.8272942 32.25132 3.581234 -2.638124 2.021262 0.9072492 0.9611345
#> 
#> $model
#> Nonlinear regression model
#>   model: response ~ A_o + Amp * exp(-((((X_values - X_peak) * cos(theta) -     (Y_values - Y_peak) * sin(theta))/a)^2 + (((X_values - X_peak) *     sin(theta) - (Y_values - Y_peak) * cos(theta))/b)^2)/2)
#>    data: data
#>     A_o     Amp   theta  X_peak  Y_peak       a       b 
#>  0.8273 32.2513  3.5812 -2.6381  2.0213  0.9072  0.9611 
#>  residual sum-of-squares: 156.2
#> 
#> Number of iterations to convergence: 13 
#> Achieved convergence tolerance: 3.935e-06
#> 
#> $model_error_stats
#>        rss     rmse deviance      AIC
#> 1 156.2272 2.083181 156.2272 171.0041
#> 
#> $fit_method
#>          method       amplitude     orientation 
#>    "elliptical" "unconstrained" "unconstrained" 
#> 
#> attr(,"gaussplotR")
#> [1] "gaussplotR_fit"
attributes(gauss_fit_ue)
#> $names
#> [1] "coefs"             "model"             "model_error_stats"
#> [4] "fit_method"       
#> 
#> $gaussplotR
#> [1] "gaussplotR_fit"

Fitting an unconstrained ellipse returns an object (here: gauss_fit_ue) that is a data.frame with one column per fitted parameter. The fitted parameters are: A_o (a constant term), Amp (amplitude), theta (rotation, in radians, from the x-axis in the clockwise direction), X_peak (x-axis peak location), Y_peak (y-axis peak location), a (width of Gaussian along x-axis), and b (width of Gaussian along y-axis).

Note that the attribute fit_method indicates that an "elliptical_unconstr" approach was used. This attribute is used by predict_gaussian_2D() to automatically determine what method (and therefore, identity of parameters) was used and then sample points from that fitted Gaussian.

We can elect to sample more points from the fitted Gaussian by feeding in a grid of x- and y- values on which to predict (via expand.grid().

Then, the fitted object gauss_fit_ue along with the grid of points can be supplied to predict_gaussian_2D to sample more points from the fit, which can be useful for plotting.

## Generate a grid of x- and y- values on which to predict
grid <-
  expand.grid(X_values = seq(from = -5, to = 0, by = 0.1),
              Y_values = seq(from = -1, to = 4, by = 0.1))

## Predict the values using predict_gaussian_2D
gauss_data_ue <-
  predict_gaussian_2D(
    fit_object = gauss_fit_ue,
    X_values = grid$X_values,
    Y_values = grid$Y_values,
  )

## Plot via ggplot2 and metR
ggplot_gaussian_2D(gauss_data_ue)

Example 2: Constrained elliptical

As noted above, the constrain_orientation can be used to dictate the orientation. Please note that this will very likely result in a poorer fit, but may be useful for certain types of analyses.

Here we’ll force the Gaussian to be horizontally-oriented.

gauss_fit_ce <-
    fit_gaussian_2D(samp_dat, 
                    constrain_orientation = 0)

gauss_fit_ce
#> $coefs
#>        A_o      Amp theta    X_peak   Y_peak        a        b
#> 1 1.259775 24.76887     0 -2.526168 1.949814 1.199245 1.327056
#> 
#> $model
#> Nonlinear regression model
#>   model: response ~ A_o + Amp * exp(-((((X_values - X_peak) * cos(constrain_orientation) -     (Y_values - Y_peak) * sin(constrain_orientation))/a)^2 +     (((X_values - X_peak) * sin(constrain_orientation) - (Y_values -         Y_peak) * cos(constrain_orientation))/b)^2)/2)
#>    data: data
#>    A_o    Amp X_peak Y_peak      a      b 
#>  1.260 24.769 -2.526  1.950  1.199  1.327 
#>  residual sum-of-squares: 885.8
#> 
#> Number of iterations to convergence: 157 
#> Achieved convergence tolerance: 9.813e-06
#> 
#> $model_error_stats
#>       rss     rmse deviance      AIC
#> 1 885.831 4.960486  885.831 231.4718
#> 
#> $fit_method
#>          method       amplitude     orientation 
#>    "elliptical" "unconstrained"   "constrained" 
#> 
#> attr(,"gaussplotR")
#> [1] "gaussplotR_fit"

We’ll use the same grid of x- and y- points as above

## Predict the values using predict_gaussian_2D
gauss_data_ce <-
  predict_gaussian_2D(
    fit_object = gauss_fit_ce,
    X_values = grid$X_values,
    Y_values = grid$Y_values,
  )

## Plot via ggplot2 and metR
ggplot_gaussian_2D(gauss_data_ce)

Example 3: Unconstrained elliptical_log

This procedure follows the formula used in Priebe et al. 2003 and is geared towards log2-transformed data (which the example data are).

Parameters from this model include: Amp (amplitude), Q (orientation parameter), X_peak (x-axis peak location), Y_peak (y-axis peak location), X_sig (spread along x-axis), and Y_sig (spread along y-axis).

gauss_fit_uel <-
    fit_gaussian_2D(samp_dat, 
                    method = "elliptical_log")

gauss_fit_uel
#> $coefs
#>        Amp          Q    X_peak   Y_peak    X_sig    Y_sig
#> 1 32.23457 -0.1942029 -2.634429 2.007712 2.108038 1.424922
#> 
#> $model
#> Nonlinear regression model
#>   model: response ~ Amp * exp(-((X_values - X_peak)^2)/(X_sig^2)) * exp(-(Y_values -     ((Q + 1) * (X_values - X_peak) + Y_peak))^2/(Y_sig^2))
#>    data: data
#>     Amp       Q  X_peak  Y_peak   X_sig   Y_sig 
#> 32.2346 -0.1942 -2.6344  2.0077  2.1080  1.4249 
#>  residual sum-of-squares: 169.9
#> 
#> Number of iterations to convergence: 13 
#> Achieved convergence tolerance: 5.196e-06
#> 
#> $model_error_stats
#>        rss     rmse deviance      AIC
#> 1 169.8812 2.172308 169.8812 172.0205
#> 
#> $fit_method
#>           method        amplitude      orientation 
#> "elliptical_log"  "unconstrained"  "unconstrained" 
#> 
#> attr(,"gaussplotR")
#> [1] "gaussplotR_fit"

## Predict the values using predict_gaussian_2D
gauss_data_uel <-
  predict_gaussian_2D(
    fit_object = gauss_fit_uel,
    X_values = grid$X_values,
    Y_values = grid$Y_values,
  )

## Plot via ggplot2 and metR
ggplot_gaussian_2D(gauss_data_uel)

Example 4: Constrained elliptical_log

Similar to the above, but here the constrain_orientation can be used to dictate the value of the Q parameter used in Priebe et al. 2003. Setting Q to 0 will result in a diagonally-oriented Gaussian, whereas setting Q to -1 will result in horizontal orientation. Q is a continuous parameter, so values in between may be used as well, such as in this example:

gauss_fit_cel <-
    fit_gaussian_2D(
      samp_dat,
      method = "elliptical_log",
      constrain_orientation = -0.66
    )

gauss_fit_cel
#> $coefs
#>        Amp     Q    X_peak   Y_peak    X_sig    Y_sig
#> 1 40.37259 -0.66 -2.603592 2.055755 1.853229 1.245801
#> 
#> $model
#> Nonlinear regression model
#>   model: response ~ Amp * exp(-((X_values - X_peak)^2)/(X_sig^2)) * exp(-(Y_values -     ((constrain_orientation + 1) * (X_values - X_peak) + Y_peak))^2/(Y_sig^2))
#>    data: data
#>    Amp X_peak Y_peak  X_sig  Y_sig 
#> 40.373 -2.604  2.056  1.853  1.246 
#>  residual sum-of-squares: 406.1
#> 
#> Number of iterations to convergence: 14 
#> Achieved convergence tolerance: 4.714e-06
#> 
#> $model_error_stats
#>        rss     rmse deviance      AIC
#> 1 406.1011 3.358658 406.1011 201.3946
#> 
#> $fit_method
#>           method        amplitude      orientation 
#> "elliptical_log"  "unconstrained"    "constrained" 
#> 
#> attr(,"gaussplotR")
#> [1] "gaussplotR_fit"

## Predict the values using predict_gaussian_2D
gauss_data_cel <-
  predict_gaussian_2D(
    fit_object = gauss_fit_cel,
    X_values = grid$X_values,
    Y_values = grid$Y_values,
  )

## Plot via ggplot2 and metR
ggplot_gaussian_2D(gauss_data_cel)

Again, setting the value of Q via constrain_orientation will very likely result in poorer-fitting Gaussians. See the analyses in Priebe et al. 2003 to get a sense of useful applications of this approach. Forcing Q = -0.66 in the above example isn’t all that useful, but goes to show that it can be done.

Example 5: Circular

Using method = "circular" constrains the Gaussian to have a roughly circular shape (i.e. spread in X- and Y- are roughly equal).

If this method is used, the fitted parameters are: Amp (amplitude), X_peak (x-axis peak location), Y_peak (y-axis peak location), X_sig (spread along x-axis), and Y_sig(spread along y-axis).

gauss_fit_cir <-
    fit_gaussian_2D(samp_dat, 
                    method = "circular")

gauss_fit_cir
#> $coefs
#>        Amp    X_peak   Y_peak    X_sig    Y_sig
#> 1 23.18005 -2.546643 1.811152 1.316288 1.642641
#> 
#> $model
#> Nonlinear regression model
#>   model: response ~ Amp * exp(-((((X_values - X_peak)^2)/(2 * X_sig^2) +     ((Y_values - Y_peak)^2)/(2 * Y_sig^2))))
#>    data: data
#>    Amp X_peak Y_peak  X_sig  Y_sig 
#> 23.180 -2.547  1.811  1.316  1.643 
#>  residual sum-of-squares: 899.6
#> 
#> Number of iterations to convergence: 22 
#> Achieved convergence tolerance: 9.057e-06
#> 
#> $model_error_stats
#>       rss     rmse deviance      AIC
#> 1 899.613 4.998925  899.613 230.0276
#> 
#> $fit_method
#>          method       amplitude     orientation 
#>      "circular" "unconstrained"              NA 
#> 
#> attr(,"gaussplotR")
#> [1] "gaussplotR_fit"

## Predict the values using predict_gaussian_2D
gauss_data_cir <-
  predict_gaussian_2D(
    fit_object = gauss_fit_cir,
    X_values = grid$X_values,
    Y_values = grid$Y_values,
  )

## Plot via ggplot2 and metR
ggplot_gaussian_2D(gauss_data_cir)

That’s all!

🐢


  1. Priebe NJ, Cassanello CR, Lisberger SG. The neural representation of speed in macaque area MT/V5. J Neurosci. 2003 Jul 2;23(13):5650-61. doi: 10.1523/JNEUROSCI.23-13-05650.2003.