Fitting Example Using dfddm

Kendal Foster and Henrik Singmann

November 04, 2020

Function dfddm evaluates the density function (or probability density function, PDF) for the Ratcliff diffusion decision model (DDM) using different methods for approximating the full PDF, which contains an infinite sum. An overview of the mathematical details of the different approximations is provided in the Math Vignette. An empirical validation of the implemented methods is provided in the Validity Vignette. Timing benchmarks for the present methods and comparison with existing methods are provided in the Benchmark Vignette.

Our implementation of the DDM has the following parameters: \(a \in (0, \infty)\) (threshold separation), \(v \in (-\infty, \infty)\) (drift rate), \(t_0 \in [0, \infty)\) (non-decision time/response time constant), \(w \in (0, 1)\) (relative starting point), and \(sv \in (0, \infty)\) (inter-trial-variability of drift).

Introduction


This vignette contains two examples of how to use fddm, in particular the dfddm function, in fitting the DDM to real-world data. We will load a dataset that is included in the fddm package and fit the Ratcliff DDM to the response time data contained within the dataset. We will show a simple fitting procedure for estimating the DDM parameter values for only a single individual in the study in addition to a more involved fitting procedure that includes DDM parameter estimation for all of the individuals in the study. After running this more involved optimization, we provide a rudimentary analysis of the fitted parameter estimates that groups the parameter estimates by the expertise of the study’s participants.

Example Fitting


In this example, we will fit the DDM to the med_dec data that comes with fddm. This dataset contains the accuracy condition reported in Trueblood et al. (2018), which investigates medical decision making among medical professionals (pathologists) and novices (i.e., undergraduate students). The task of participants was to judge whether pictures of blood cells show cancerous cells (i.e., blast cells) or non-cancerous cells (i.e., non-blast cells). The dataset contains 200 decisions per participant, based on pictures of 100 true cancerous cells and pictures of 100 true non-cancerous cells. We load the fddm package, read the data, and remove any invalid responses from the data.

library("fddm")
data(med_dec, package = "fddm")
med_dec <- med_dec[which(med_dec[["rt"]] >= 0), ]

Log-likelihood Function

Our approach will be a straightforward maximum likelihood estimation (MLE). Since we will be using the optimization function nlminb, we must write an objective function for it to optimize. Because nlminb minimizes its objective function rather than maximizing it, we will simply negate our likelihood function. In addition, we will employ the common practice of using the log-likelihood, as this tends to be more numerically stable while still attaining its optimum at the same parameter values as the regular likelihood function.
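The idea of minimizing a negated log-likelihood can be tried on a toy problem before turning to the DDM. The following is a minimal sketch that uses only base R and fits a normal distribution rather than the DDM; the data, the function name neg_ll, and the starting values are illustrative assumptions and are not part of fddm.

```r
# Toy data: 500 draws from a normal distribution (illustrative only)
set.seed(1)
x <- rnorm(500, mean = 2, sd = 1.5)

# Negative log-likelihood of the normal model; pars = c(mean, sd)
neg_ll <- function(pars, x) {
  -sum(dnorm(x, mean = pars[[1]], sd = pars[[2]], log = TRUE))
}

# nlminb minimizes neg_ll, which is the same as maximizing the likelihood
toy_fit <- nlminb(c(0, 1), objective = neg_ll, x = x,
                  lower = c(-Inf, 1e-6), upper = c(Inf, Inf))

toy_fit[["par"]]         # estimates of the mean and standard deviation
toy_fit[["convergence"]] # 0 indicates successful convergence
```

For this model the maximum likelihood estimates are known in closed form (the sample mean and the uncorrected sample standard deviation), so the output of nlminb can be checked against them.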

We are going to be fitting the parameters \(v\), \(a\), \(t_0\), \(w\), and \(sv\); however, we want to fit two distinct drift rates, one for the upper boundary (\(v_u\)) and one for the lower boundary (\(v_\ell\)). In order to make this distinction, we require the input of the truthful classification of each decision (i.e., what the correct response is for each entry). Note that our log-likelihood function depends on the number of response times, the number of responses, and the number of truthful classifications all being equal.

As we are using the optimization function nlminb, the first argument to our log-likelihood function needs to be a vector of the six parameters that are being optimized: \(v_u\), \(v_\ell\), \(a\), \(t_0\), \(w\), and \(sv\). The optimizer starts from the initial values that we supply and passes the current parameter values to this argument at every evaluation. The rest of the arguments will be the other necessary inputs to dfddm that are not optimized: the vector of response times, the vector of responses, the vector of the truthful classifications, and the allowable error tolerance for the density function (optional). Details on all of these inputs can be found in the dfddm documentation.

Upon being called, the log-likelihood function first builds a vector of drift rates with one entry per trial: trials whose truthful classification is “upper” receive \(v_u\), and trials whose truthful classification is “lower” receive \(v_\ell\). The response times, the responses, and this vector of drift rates are then passed to dfddm, which returns the log-density of each trial. The log-likelihood function heavily penalizes any combination of parameters that returns a non-finite log-density, such as \(-\infty\) (equivalent to a regular density of \(0\)), by returning a large constant instead. Otherwise, the function returns the negative of the sum of all of the log-densities, that is, the negative log-likelihood.

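ll_fun <- function(pars, rt, resp, truth, err_tol = 1e-6) {
  # one drift rate per trial, chosen by the truthful classification
  v <- numeric(length(rt))

  # the truth is "upper" so use vu
  v[truth == "upper"] <- pars[[1]]
  # the truth is "lower" so use vl
  v[truth == "lower"] <- pars[[2]]

  # log-density of every trial; err_tol is passed through to dfddm
  dens <- dfddm(rt = rt, response = resp, a = pars[[3]], v = v, t0 = pars[[4]],
                w = pars[[5]], sv = pars[[6]], log = TRUE, err_tol = err_tol)

  # penalize parameter sets that give a non-finite log-density;
  # otherwise return the negative log-likelihood
  return( ifelse(any(!is.finite(dens)), 1e6, -sum(dens)) )
}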
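The two mechanical steps inside ll_fun, assigning a drift rate to each trial and penalizing non-finite log-densities, can be inspected in isolation. The sketch below uses only base R and does not call dfddm; the parameter values, the log-densities, and the helper name penalized_nll are hypothetical and chosen purely for illustration.

```r
# Hypothetical truthful classifications for four trials
truth <- c("upper", "lower", "lower", "upper")
pars <- c(2.5, -1.5) # hypothetical values for vu and vl

# One drift rate per trial, selected by the truthful classification
v <- numeric(length(truth))
v[truth == "upper"] <- pars[[1]]
v[truth == "lower"] <- pars[[2]]
v
#> [1]  2.5 -1.5 -1.5  2.5

# Return a large penalty if any log-density is non-finite,
# otherwise the negative sum of the log-densities
penalized_nll <- function(log_dens, penalty = 1e6) {
  if (any(!is.finite(log_dens))) penalty else -sum(log_dens)
}

penalized_nll(log(c(0.5, 0.25))) # all finite: equals log(8)
penalized_nll(log(c(0.5, 0)))    # log(0) is -Inf, so the penalty 1e6 is returned
```

The penalty keeps the optimizer away from parameter regions in which at least one observation has zero density, without ever handing it an infinite objective value.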

Simple Fitting Routine

As an intermediate step, we will fit the DDM to only one participant from the med_dec data. We select the individual whose data we will use for fitting before preparing the data by defining upper and lower responses and the correct response bounds.

onep <- med_dec[ med_dec[["id"]] == "2" & med_dec[["group"]] == "experienced", ]
onep[["resp"]] <- ifelse(onep[["response"]] == "blast", "upper", "lower")
onep[["truth"]] <- ifelse(onep[["classification"]] == "blast", "upper", "lower")
str(onep)
#> 'data.frame':    200 obs. of  11 variables:
#>  $ id            : int  2 2 2 2 2 2 2 2 2 2 ...
#>  $ group         : chr  "experienced" "experienced" "experienced" "experienced" ...
#>  $ block         : int  3 3 3 3 3 3 3 3 3 3 ...
#>  $ trial         : int  1 2 3 4 5 6 7 8 9 10 ...
#>  $ classification: chr  "blast" "non-blast" "non-blast" "non-blast" ...
#>  $ difficulty    : chr  "easy" "easy" "hard" "hard" ...
#>  $ response      : chr  "blast" "non-blast" "blast" "non-blast" ...
#>  $ rt            : num  0.853 0.575 1.136 0.875 0.748 ...
#>  $ stimulus      : chr  "blastEasy/BL_10166384.jpg" "nonBlastEasy/16258001115A_069.jpg" "nonBlastHard/BL_11504083.jpg" "nonBlastHard/MY_9455143.jpg" ...
#>  $ resp          : chr  "upper" "lower" "upper" "lower" ...
#>  $ truth         : chr  "upper" "lower" "lower" "lower" ...

We then pass the data and log-likelihood function with the necessary additional arguments to an optimization function. As we are using the optimization function nlminb for this example, we must input as the first argument the initial values of the DDM parameters that we want optimized. These are input in the order: \(v_u\), \(v_\ell\), \(a\), \(t_0\), \(w\), and \(sv\); we also need to define upper and lower bounds for each parameter. Fitting the DDM to this dataset is basically instantaneous using this setup.

fit <- nlminb(c(0, 0, 1, 0, 0.5, 0), objective = ll_fun,
              rt = onep[["rt"]], resp = onep[["resp"]], truth = onep[["truth"]],
              # limits:   vu,   vl,   a,  t0, w,  sv
              lower = c(-Inf, -Inf, .01,   0, 0,   0),
              upper = c( Inf,  Inf, Inf, Inf, 1, Inf))
fit
#> $par
#> [1]  5.6813 -2.1887  2.7909  0.3764  0.4010  2.2813
#> 
#> $objective
#> [1] 42.47
#> 
#> $convergence
#> [1] 0
#> 
#> $iterations
#> [1] 39
#> 
#> $evaluations
#> function gradient 
#>       56      289 
#> 
#> $message
#> [1] "relative convergence (4)"

Fitting the Entire Dataset

Here we will run a more rigorous fitting on the entire med_dec dataset to obtain parameter estimates for each participant in the study. To do this, we define a function to run the data fitting for us; we want it to output a dataframe containing the parameter estimates for each individual in the data. The inputs will be the dataset, how the “upper” response is presented in the dataset, and indices of the columns in the dataset containing: identification of the individuals in the dataset, the response times, the responses, and the truthful classifications.

After some data checking, the fitting function will extract the unique individuals from the dataset and run the parameter optimization for the responses and response times for each individual. The optimizations themselves are initialized with random initial parameter values to reduce the risk of stopping at a local minimum instead of the global minimum. Moreover, the optimization will run 5 times for each individual, with 5 different sets of random initial parameter values. The value of the minimized negative log-likelihood function will be compared across all 5 runs, and the smallest such value will indicate the best fit. The parameter estimates, convergence code, and minimized value of the negative log-likelihood function produced by this best fit will be saved for that individual.

rt_fit <- function(data, id_idx = NULL, rt_idx = NULL, response_idx = NULL,
                   truth_idx = NULL, response_upper = NULL) {

  # Format data for fitting
  if (all(is.null(id_idx), is.null(rt_idx), is.null(response_idx),
      is.null(truth_idx), is.null(response_upper))) {
    df <- data # assume input data is already formatted
  } else {
    if(any(data[,rt_idx] < 0)) {
      stop("Input data contains negative response times; fit will not be run.")
    }
    if(any(is.na(data[,response_idx]))) {
      stop("Input data contains invalid responses (NA); fit will not be run.")
    }

    nr <- nrow(data)
    df <- data.frame(id = character(nr),
                     rt = double(nr),
                     response = character(nr),
                     truth = character(nr),
                     stringsAsFactors = FALSE)

    if (!is.null(id_idx)) { # relabel identification tags
      for (i in seq_along(id_idx)) {
        idi <- unique(data[,id_idx[i]])
        for (j in seq_along(idi)) {
          df[["id"]][data[,id_idx[i]] == idi[j]] <- paste(
            df[["id"]][data[,id_idx[i]] == idi[j]], idi[j], sep = " ")
        }
      }
      df[["id"]] <- trimws(df[["id"]], which = "left")
    }

    df[["rt"]] <- as.double(data[,rt_idx])

    df[["response"]] <- "lower"
    df[["response"]][data[,response_idx] == response_upper] <- "upper"

    df[["truth"]] <- "lower"
    df[["truth"]][data[,truth_idx] == response_upper] <- "upper"
  }

  # Preliminaries
  ids <- unique(df[["id"]])
  nids <- max(length(ids), 1) # if ids is empty, treat the data as one individual
  ninit_vals <- 5

  # Initialize the output dataframe
  cnames <- c("ID", "Convergence", "Objective",
              "vu_fit", "vl_fit", "a_fit", "t0_fit", "w_fit", "sv_fit")
  out <- data.frame(matrix(ncol = length(cnames), nrow = nids))
  colnames(out) <- cnames
  temp <- data.frame(matrix(ncol = length(cnames)-1, nrow = ninit_vals))
  colnames(temp) <- cnames[-1]

  # Loop through each individual and starting values
  for (i in 1:nids) {
    out[["ID"]][i] <- ids[i]

    # extract data for id i
    dfi <- df[df[["id"]] == ids[i],]
    rti <- dfi[["rt"]]
    respi <- dfi[["response"]]
    truthi <- dfi[["truth"]]

    # starting value for t0 must be smaller than the smallest rt
    min_rti <- min(rti)

    # create initial values for this individual
    init_vals <- data.frame(vu = rnorm(n = ninit_vals, mean = 4, sd = 2),
                            vl = rnorm(n = ninit_vals, mean = -4, sd = 2),
                            a  = runif(n = ninit_vals, min = 0.5, max = 5),
                            t0 = runif(n = ninit_vals, min = 0, max = min_rti),
                            w  = runif(n = ninit_vals, min = 0, max = 1),
                            sv = runif(n = ninit_vals, min = 0, max = 5))

    # loop through all of the starting values
    for (j in 1:ninit_vals) {
      mres <- nlminb(unlist(init_vals[j, ]), ll_fun,
                     rt = rti, resp = respi, truth = truthi,
                     # limits:   vu,   vl,   a,  t0, w,  sv
                     lower = c(-Inf, -Inf, .01,   0, 0,   0),
                     upper = c( Inf,  Inf, Inf, Inf, 1, Inf))
      temp[["Convergence"]][j] <- mres[["convergence"]]
      temp[["Objective"]][j] <- mres[["objective"]]
      temp[j, -c(1, 2)] <- mres[["par"]]
    }

    # determine best fit for the individual
    min_idx <- which.min(temp[["Objective"]])
    out[i, -1] <- temp[min_idx,]
  }
  return(out)
}
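The multi-start strategy used in rt_fit, running the optimizer from several random starting points and keeping the run with the smallest objective value, can be illustrated on its own. The following is a minimal sketch in base R; the objective f is a hypothetical function with two local minima, and the number of starts and the sampling range are arbitrary choices.

```r
# Hypothetical objective with two local minima (one on each side of zero)
f <- function(x) x^4 - 3 * x^2 + x

set.seed(42)
n_starts <- 5
starts <- runif(n_starts, min = -2, max = 2)

# Run the optimizer once per starting value
runs <- lapply(starts, function(s) nlminb(s, objective = f))

# Keep the run with the smallest objective value
objectives <- vapply(runs, function(r) r[["objective"]], numeric(1))
best <- runs[[which.min(objectives)]]

best[["par"]]
best[["objective"]]
```

A single run can only be expected to reach the minimum nearest to its starting value. Comparing several runs makes it more likely, though not certain, that the best one corresponds to the global minimum.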

We load the dataset, remove any invalid rows from the dataset, and run the fitting; the dataframe of the fitting results is output below.

data(med_dec, package = "fddm")
med_dec <- med_dec[which(med_dec[["rt"]] >= 0),]
fit <- rt_fit(med_dec, id_idx = c(2,1), rt_idx = 8, response_idx = 7,
              truth_idx = 5, response_upper = "blast")
fit
#>                  ID Convergence Objective   vu_fit vl_fit  a_fit t0_fit  w_fit    sv_fit
#> 1     experienced 2           0   154.157  4.17399 -7.714 2.6092 0.4151 0.4993 5.805e+00
#> 2     experienced 6           0   119.675  4.38532 -3.651 2.3638 0.3739 0.5930 5.380e+00
#> 3     experienced 7           0    74.397  3.53678 -4.468 1.3876 0.4535 0.4499 1.945e+00
#> 4     experienced 9           0   162.764  6.51790 -3.193 2.5723 0.4724 0.4730 5.316e+00
#> 5    experienced 12           0   122.762  1.33693 -3.003 2.1233 0.3871 0.5431 4.559e+00
#> 6    experienced 14           0   107.097  2.78487 -1.645 1.5003 0.4060 0.5481 1.785e+00
#> 7    experienced 16           0   316.742  3.27385 -2.998 2.0504 0.5071 0.4397 2.766e-01
#> 8    experienced 17           0   238.977  2.39573 -4.758 2.2852 0.4859 0.4117 2.462e+00
#> 9   inexperienced 3           0    90.848  3.24622 -4.778 1.2960 0.4461 0.6715 6.416e-09
#> 10  inexperienced 4           0   186.478  6.97799 -4.776 1.5964 0.4145 0.4630 7.141e-01
#> 11  inexperienced 5           0   182.976  6.77768 -6.582 2.8352 0.4174 0.3807 5.718e+00
#> 12  inexperienced 8           0   219.081  4.10477 -1.176 2.0509 0.1464 0.6034 7.728e-01
#> 13 inexperienced 10           0    70.452  0.27386 -2.748 1.4830 0.4132 0.4507 3.113e+00
#> 14 inexperienced 11           0   268.962  4.67309 -5.569 2.4425 0.1317 0.5135 1.300e+00
#> 15 inexperienced 13           0   129.581  4.66018 -4.103 2.0976 0.3946 0.6284 4.207e+00
#> 16 inexperienced 15           0    83.415  2.62908 -1.669 1.2326 0.3726 0.5849 1.424e-06
#> 17 inexperienced 18           0   119.633  4.69842 -2.696 1.4366 0.5222 0.4980 1.559e+00
#> 18 inexperienced 19           1   233.401  6.19088 -3.076 2.6755 0.3876 0.4712 3.926e+00
#> 19         novice 1           0   169.743  7.60253 -7.865 1.6938 0.3987 0.4472 1.132e+00
#> 20         novice 2           0   234.286  2.36825 -3.736 1.7176 0.5541 0.3470 1.992e-08
#> 21         novice 3           0   157.031  5.92329 -1.003 1.3439 0.4973 0.5263 1.279e-08
#> 22         novice 4           0    89.457 -2.82336 -4.864 1.3713 0.3775 0.4943 1.274e+00
#> 23         novice 5           0   173.613  5.87468 -1.947 1.4204 0.5130 0.4725 3.787e-06
#> 24         novice 6           0   142.353  0.43022 -4.752 1.5024 0.4409 0.5888 8.112e-01
#> 25         novice 7           0   372.724  3.10723 -1.414 2.4687 0.4413 0.3790 1.802e-08
#> 26         novice 8           0    77.527  5.88828 -1.538 1.4522 0.1090 0.6044 8.038e-01
#> 27         novice 9           0    62.579  4.05412 -2.453 1.2418 0.3777 0.4889 1.193e+00
#> 28        novice 10           0   100.287  3.41411 -1.984 1.2963 0.4663 0.4521 1.270e+00
#> 29        novice 11           0    77.262  7.79878 -4.638 1.4512 0.3841 0.6106 2.072e+00
#> 30        novice 12           0    85.624  3.89526 -3.118 1.4676 0.3523 0.3428 1.113e-07
#> 31        novice 13           0    84.946  0.05435 -5.146 1.4569 0.3979 0.6826 1.265e+00
#> 32        novice 14           0    72.945  4.26029 -3.621 1.6538 0.3755 0.4775 4.044e+00
#> 33        novice 15           0   149.422  6.04330 -1.608 1.9703 0.4065 0.4891 3.531e+00
#> 34        novice 16           0   150.412  5.78248 -4.231 1.7086 0.3357 0.6148 1.435e+00
#> 35        novice 17           0   264.390  0.60819 -3.596 2.3208 0.1914 0.4617 1.132e+00
#> 36        novice 18           0   104.866  2.98699 -3.698 1.3891 0.4650 0.4954 1.399e+00
#> 37        novice 19           0   209.592  2.88666 -3.321 2.1568 0.4349 0.5002 2.472e+00
#> 38        novice 20           0   234.731  5.28688 -3.655 2.0437 0.0000 0.5734 6.434e-01
#> 39        novice 21           0     8.786  3.69148 -2.802 1.0759 0.4179 0.5128 1.384e+00
#> 40        novice 22           0    57.884  4.27358 -1.952 1.3349 0.3313 0.5455 2.219e+00
#> 41        novice 23           0   112.185  1.36608 -2.797 1.3942 0.4165 0.5527 1.358e+00
#> 42        novice 24           0    98.604  6.30011 -5.531 1.6113 0.3361 0.6214 1.385e+00
#> 43        novice 25           0   101.399  3.34894 -3.816 1.5317 0.4189 0.4984 2.184e+00
#> 44        novice 26           0   187.189  4.19690 -6.786 1.6987 0.4637 0.3486 6.712e-01
#> 45        novice 27           0    64.849  2.35903 -4.826 1.2238 0.5150 0.4185 1.101e+00
#> 46        novice 28           0   143.266  4.70078 -2.994 1.6261 0.1316 0.4763 9.851e-01
#> 47        novice 29           0    -8.025  6.80557 -3.548 0.9979 0.3695 0.5096 1.222e+00
#> 48        novice 30           0   131.602  4.83717 -6.063 1.5847 0.3331 0.5513 1.135e+00
#> 49        novice 31           0   232.037  2.91416 -4.781 1.8749 0.4700 0.4490 1.079e+00
#> 50        novice 32           0   101.678  4.16731 -5.802 1.3497 0.5102 0.4355 1.523e+00
#> 51        novice 33           0   111.875  4.48771 -6.427 1.2291 0.3395 0.5337 4.369e-07
#> 52        novice 34           0    99.261  4.95866 -3.199 1.2717 0.4964 0.5752 5.818e-01
#> 53        novice 35           0   176.715  4.91534 -7.004 2.1075 0.5099 0.4705 3.545e+00
#> 54        novice 36           0    92.863  5.94223 -5.594 1.3239 0.5260 0.4143 1.049e+00
#> 55        novice 37           0   101.640  3.89518 -3.090 1.4562 0.3345 0.6031 9.149e-01

Rudimentary Analysis

To show some basic results of our fitting, we will plot the fitted values of \(v_u\) and \(v_\ell\) grouped by the experience level of the participant to demonstrate how these parameters differ among novices, inexperienced professionals, and experienced professionals.

library("reshape2")
library("ggplot2")

fitp <- data.frame(fit[, c(1, 4, 5)]) # make a copy to manipulate for plotting
colnames(fitp)[-1] <- c("vu", "vl")

for (i in seq_len(nrow(fitp))) {
  first <- substr(fitp[["ID"]][i], 1, 1)
  if (first == "n") {
    fitp[["ID"]][i] <- "novice"
  } else if (first == "i") {
    fitp[["ID"]][i] <- "inexperienced"
  } else {
    fitp[["ID"]][i] <- "experienced"
  }
}

fitp <- melt(fitp, id.vars = "ID", measure.vars = c("vu", "vl"),
             variable.name = "vuvl", value.name = "estimate")

ggplot(fitp, aes(x = factor(ID, levels = c("novice", "inexperienced", "experienced")),
                 y = estimate,
                 color = factor(vuvl, levels = c("vu", "vl")))) +
  geom_point(alpha = 0.4, size = 4) +
  labs(title = "Parameter Estimates for vu and vl",
       x = "Experience Level", y = "Parameter Estimate",
       color = "Drift Rate") +
  theme_bw() +
  theme(panel.border = element_blank(),
        plot.title = element_text(size = 23),
        plot.subtitle = element_text(size = 16),
        axis.text.x = element_text(size = 16),
        axis.text.y = element_text(size = 16),
        axis.title.x = element_text(size = 20,
                                    margin = margin(10, 5, 5, 5, "pt")),
        axis.title.y = element_text(size = 20),
        legend.title = element_text(size = 20),
        legend.text = element_text(size = 16))
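The relabeling loop above can also be written without a loop. The following is a minimal sketch in base R, assuming that every ID has the form of a group name followed by a space and a participant number, as in the fitting output; the helper name group_from_id is hypothetical, and the example IDs are copied from the results table.

```r
# Strip the trailing participant number to recover the group label
group_from_id <- function(id) {
  sub(" [0-9]+$", "", id)
}

ids <- c("experienced 2", "inexperienced 13", "novice 7", "novice 21")
groups <- group_from_id(ids)
groups
#> [1] "experienced"   "inexperienced" "novice"        "novice"

table(groups) # number of participants per group in this small example
```

Because sub is vectorized, the whole ID column can be converted in a single call, for example with fitp[["ID"]] <- group_from_id(fitp[["ID"]]) before the call to melt.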

Before we begin analysis of this plot, note that the drift rate corresponding to the upper threshold should always be positive, and the drift rate corresponding to the lower threshold should always be negative. A few fitted values among the novice participants violate this convention, which suggests that these participants consistently responded incorrectly to the corresponding stimuli. In contrast, both the inexperienced and experienced participants show a clean division of drift rates around zero.

In addition, we notice that the more experienced participants tend to have higher fitted drift rates in absolute value. A more extreme drift rate means that the participant receives and processes information more efficiently than a participant with a milder drift rate. The overall pattern is that the novices are on average the worst at receiving information, the experienced professionals are the best, and the inexperienced professionals are somewhere in the middle. This pattern indicates that experienced professionals are indeed better at this task than untrained undergraduate students!

R Session Info

sessionInfo()
#> R version 4.0.2 (2020-06-22)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19041)
#> 
#> Matrix products: default
#> 
#> locale:
#> [1] LC_COLLATE=C                            LC_CTYPE=English_United Kingdom.1252   
#> [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
#> [5] LC_TIME=English_United Kingdom.1252    
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] ggplot2_3.3.2        reshape2_1.4.4       microbenchmark_1.4-7 RWiener_1.3-3       
#> [5] rtdists_0.11-2       fddm_0.2-2          
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.5       pillar_1.4.6     compiler_4.0.2   plyr_1.8.6       tools_4.0.2     
#>  [6] digest_0.6.25    evd_2.3-3        evaluate_0.14    lifecycle_0.2.0  tibble_3.0.3    
#> [11] gtable_0.3.0     lattice_0.20-41  pkgconfig_2.0.3  rlang_0.4.7      Matrix_1.2-18   
#> [16] yaml_2.2.1       mvtnorm_1.1-1    expm_0.999-5     xfun_0.18        withr_2.3.0     
#> [21] dplyr_1.0.2      stringr_1.4.0    knitr_1.30       generics_0.0.2   vctrs_0.3.4     
#> [26] tidyselect_1.1.0 grid_4.0.2       ggnewscale_0.4.3 glue_1.4.2       R6_2.4.1        
#> [31] survival_3.2-7   rmarkdown_2.4    farver_2.0.3     purrr_0.3.4      magrittr_1.5    
#> [36] scales_1.1.1     htmltools_0.5.0  ellipsis_0.3.1   splines_4.0.2    colorspace_1.4-1
#> [41] labeling_0.3     stringi_1.5.3    gsl_2.1-6        munsell_0.5.0    msm_1.6.8       
#> [46] crayon_1.3.4

References

Trueblood, Jennifer S., William R. Holmes, Adam C. Seegmiller, Jonathan Douds, Margaret Compton, Eszter Szentirmai, Megan Woodruff, Wenrui Huang, Charles Stratton, and Quentin Eichbaum. 2018. “The Impact of Speed and Bias on the Cognitive Processes of Experts and Novices in Medical Image Decision-Making.” Cognitive Research: Principles and Implications 3 (1): 28. https://doi.org/10.1186/s41235-018-0119-2.