Validity of Methods

July 08, 2020

Function dfddm evaluates the density function (or probability density function, PDF) for the Ratcliff diffusion decision model (DDM) using different methods for approximating the full PDF, which contains an infinite sum. An overview of the mathematical details of the different approximations is provided in the Math Vignette. Timing benchmarks for the present methods and comparison with existing methods are provided in the Benchmark Vignette. Two examples of using dfddm for parameter estimation are provided in the Example Vignette.

Our implementation of the DDM has the following parameters: $$a \in (0, \infty)$$ (threshold separation), $$v \in (-\infty, \infty)$$ (drift rate), $$t_0 \in [0, \infty)$$ (non-decision time/response time constant), $$w \in (0, 1)$$ (relative starting point), and $$sv \in (0, \infty)$$ (inter-trial-variability of drift).

Background

This vignette is more technical in nature and will demonstrate not only the consistency across the different implemented methods but also show that dfddm is in accordance with other implementations in the literature. In addition to testing all of the methods available in dfddm, we also include three functions that are considered to be current and widely used. First, we include the function ddiffusion from the package rtdists as it is well known for being not only user friendly and feature-rich but also designed specifically for handling data with regard to distributions for response time models. Second, we also test the dwiener function from the RWiener package, which is mainly aimed at providing an R language interface to calculate various functions from the Wiener process. Third, we include some raw R code that is bundled with the paper written by Gondan, Blurton, and Kesselmeier (2014) about improving the approximation to the density function. As this code was not yet available in an R package, it is part of fddm and can be accessed after running source on the corresponding file as shown below.

First, we perform a simple test in Validating the Density Functions, evaluating each algorithm for the density function of the DDM and comparing their consistency. To ensure rigorous concordance across all algorithms, we test each algorithm throughout a sufficiently large and granular parameter space that can be found fully defined in the subsection Generating Data. Of course these algorithms are approximations, and as such they will provide slightly different results given the same parameter inputs. To ensure that these small differences in density output do not affect the results of fitting parameters to real-world data, we also include testing in an optimization setting in Validating Fitting (Optimization) Using the Density Functions. We use the various algorithms as the bases for log-likelihood functions for the optimization process and include a variety of starting points in the parameter space to ensure rigorous consistency testing.

Validating the Density Functions

It is imperative to show that our density functions produce the same results as the current standards. To demonstrate this, we calculate the lower probability density for a granular parameter space and show that all of the results are within an acceptable error tolerance. We only calculate the lower probability density because the upper probability density negates $$v$$ ($$v' = -v$$) and takes the complement of $$w$$ ($$w' = 1-w$$); our parameter space already includes all of these values so calculating the upper probability density would be redundant. The fully defined parameter space can be found below, and it includes both realistically feasible values and some extreme values for each parameter. Since each of these functions approximates an infinite summation to a desired precision of $$\epsilon$$, we allow for a difference of $$2 \cdot \epsilon$$ between any pair of calculated densities to account for convergence from above and below the limit of the summation.

Note that we include $$sv$$ in all the functions tested below, even those that do not contain $$sv$$ themselves (i.e., RWiener and Gondan, Blurton, and Kesselmeier (2014)). This is possible because variability in drift rate can be added to those functions via the constant $$M$$ as described in the Math Vignette.

Generating Data

First we load the necessary packages and code available from the current literature.

The following code chunk stores the lower probability densities calculated across the parameter space using the different methods; we also calculate and store the log-probability for consistency testing.

# Define parameter space
RT <- c(0.001, 0.1, 1, 10)
A <- c(0.5, 1, 5)
V <- c(-5, 0, 5)
t0 <- 1e-4 # must be nonzero for RWiener
W <- c(0.2, 0.5, 0.8)
SV <- c(0, 0.5, 1.5)
SV_THRESH <- 1e-6
eps <- 1e-6 # this is the setting from rtdists

nRT <- length(RT)
nA <- length(A)
nV <- length(V)
nW <- length(W)
nSV <- length(SV)
resp <- rep("lower", nRT) # for RWiener

fnames <- c("fs_SWSE_17", "fs_SWSE_14",
"fs_Gon_17", "fs_Gon_14", "fs_Nav_17", "fs_Nav_14",
"fb_Gon_17", "fb_Gon_14", "fb_Nav_17", "fb_Nav_14",
"fl_Nav_09", "RWiener", "Gondan", "rtdists")
nf <- length(fnames)

res <- data.frame(matrix(ncol = 9, nrow = nf*nRT*nA*nV*nW*nSV))
colnames(res) <- c('rt', 'a', 'v', 'w', 'sv', 'FuncName', 'res', 'dif',
'log_res')
start <- 1
stop <- nf

# Loop through each combination of parameters and record results
for (rt in 1:nRT) {
for (a in 1:nA) {
for (v in 1:nV) {
for (w in 1:nW) {
for (sv in 1:nSV) {
# add the rt, v, a, w, and function names to the dataframe
res[start:stop, 1] <- rep(RT[rt], nf)
res[start:stop, 2] <- rep(A[a]  , nf)
res[start:stop, 3] <- rep(V[v]  , nf)
res[start:stop, 4] <- rep(W[w]  , nf)
res[start:stop, 5] <- rep(SV[sv], nf)
res[start:stop, 6] <- fnames

# calculate "lower" density
res[start,    7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "SWSE",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+1,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "SWSE",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+2,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Gondan",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+3,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Gondan",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+4,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Navarro",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+5,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Navarro",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+6,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Gondan",
summation_small = "2017", scale = "both",
err_tol = eps)
res[start+7,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Gondan",
summation_small = "2014", scale = "both",
err_tol = eps)
res[start+8,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Navarro",
summation_small = "2017", scale = "both",
err_tol = eps)
res[start+9,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "Navarro",
summation_small = "2014", scale = "both",
err_tol = eps)
res[start+10,  7] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = FALSE, n_terms_small = "",
summation_small = "", scale = "large",
err_tol = eps)
res[start+11, 7] <- dwiener(RT[rt], resp = resp[rt], alpha = A[a],
delta = V[v], tau = t0, beta = W[w],
give_log = FALSE)
res[start+12, 7] <- fs(t = RT[rt]-t0, a = A[a], v = V[v],
w = W[w], eps = eps)
res[start+13, 7] <- ddiffusion(RT[rt], resp[rt], a = A[a], v = V[v],
t0 = t0, z = W[w]*A[a], sv = SV[sv])
if (sv > SV_THRESH) { # multiply to get density with sv
t <- RT[rt] - t0
M <- exp(V[v] * A[a] * W[w] + V[v]*V[v] * t / 2 +
(SV[sv]*SV[sv] * A[a]*A[a] * W[w]*W[w] -
2 * V[v] * A[a] * W[w] - V[v]*V[v] * t) /
(2 + 2 * SV[sv]*SV[sv] * t)) / sqrt(1 + SV[sv]*SV[sv] * t)
res[start+11, 7] <- M * res[start+11, 7] # RWiener
res[start+12, 7] <- M * res[start+12, 7] # Gondan_R
}

# calculate differences
ans <- res[start, 7] # use Foster_2017_small as truth
res[start,    8] <- abs(res[start,    7] - ans)
res[start+1,  8] <- abs(res[start+1,  7] - ans)
res[start+2,  8] <- abs(res[start+2,  7] - ans)
res[start+3,  8] <- abs(res[start+3,  7] - ans)
res[start+4,  8] <- abs(res[start+4,  7] - ans)
res[start+5,  8] <- abs(res[start+1,  7] - ans)
res[start+6,  8] <- abs(res[start+6,  7] - ans)
res[start+7,  8] <- abs(res[start+7,  7] - ans)
res[start+8,  8] <- abs(res[start+8,  7] - ans)
res[start+9,  8] <- abs(res[start+9,  7] - ans)
res[start+10, 8] <- abs(res[start+10, 7] - ans)
res[start+11, 8] <- abs(res[start+11, 7] - ans)
res[start+12, 8] <- abs(res[start+12, 7] - ans)
res[start+13, 8] <- abs(res[start+13, 7] - ans)

# calculate log of "lower" density
res[start,    9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "SWSE",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+1,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "SWSE",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+2,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Gondan",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+3,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Gondan",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+4,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Navarro",
summation_small = "2017", scale = "small",
err_tol = eps)
res[start+5,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Navarro",
summation_small = "2014", scale = "small",
err_tol = eps)
res[start+6,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Gondan",
summation_small = "2017", scale = "both",
err_tol = eps)
res[start+7,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Gondan",
summation_small = "2014", scale = "both",
err_tol = eps)
res[start+8,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Navarro",
summation_small = "2017", scale = "both",
err_tol = eps)
res[start+9,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "Navarro",
summation_small = "2014", scale = "both",
err_tol = eps)
res[start+10,  9] <- dfddm(rt = RT[rt], response = resp[rt], a = A[a],
v = V[v], t0 = t0, w = W[w], sv = SV[sv],
log = TRUE, n_terms_small = "",
summation_small = "", scale = "large",
err_tol = eps)
res[start+11, 9] <- dwiener(RT[rt], resp = resp[rt], alpha = A[a],
delta = V[v], tau = t0, beta = W[w],
give_log = TRUE)
res[start+12, 9] <- log(fs(t = RT[rt]-t0, a = A[a], v = V[v],
w = W[w], eps = eps))
res[start+13, 9] <- log(ddiffusion(RT[rt], resp[rt], a = A[a],
v = V[v], t0 = t0, z = W[w]*A[a],
sv = SV[sv]))
if (sv > SV_THRESH) { # add to get log of density with sv
t <- RT[rt] - t0
M <- V[v] * A[a] * W[w] + V[v]*V[v] * t / 2 +
(SV[sv]*SV[sv] * A[a]*A[a] * W[w]*W[w] -
2 * V[v] * A[a] * W[w] - V[v]*V[v] * t) /
(2 + 2 * SV[sv]*SV[sv] * t) - 0.5 * log(1 + SV[sv]*SV[sv] * t)
res[start+11, 9] <- M + res[start+11, 9] # RWiener
res[start+12, 9] <- M + res[start+12, 9] # Gondan_R
}

# iterate start and stop values
start = start + nf
stop = stop + nf
}
}
}
}
}

Testing the Density Functions

Now we test the consistency of the various density function approximations by using the testthat package. As discussed above, we allow the difference between any two outputs to be twice the original desired accuracy to allow for convergence from either above or below. We have to amend some of the checks because of the instability of some of the methods, and these are documented in the Known Errors section below. First we check that all of the approximations produce densities that are non-negative (we allow a density equal to zero). Next, we check the consistency of the internal dfddm methods before confirming their accuracy with external code from established packages. Lastly, we run similar tests to check the consistency of the logged results; however, we do not test the logs of densities that are very close to zero (i.e., less than $$\epsilon^2$$) because even a small difference in approximated density can lead to a very large difference in logged density. If all tests pass correctly, there should be no output.

library("testthat")

# Subset results
SWSE <- subset(res, FuncName %in% fnames[c(1,2)])
Gondan_s <- subset(res, FuncName %in% fnames[c(3,4)])
Gondan_b <- subset(res, FuncName %in% fnames[c(7,8)])
Navarro_s <- subset(res, FuncName %in% fnames[c(5,6)])
Navarro_l <- subset(res, FuncName %in% fnames[11])
Navarro_b <- subset(res, FuncName %in% fnames[c(9,10)])
rtdists <- subset(res, FuncName == "rtdists")
RWiener <- subset(res, FuncName == "RWiener")
Gondan_R <- subset(res, FuncName == "Gondan")

# Compensate for KE 1, 2
Nav_s_res_0 <- subset(Navarro_s, res < 0) # KE 1
Nav_s_0 <- min(Nav_s_res_0$rt / Nav_s_res_0$a / Nav_s_res_0$a) # = 12 Nav_l_res_0 <- subset(Navarro_l, res < 0) # KE 2 Nav_l_0 <- max(Nav_l_res_0$rt / Nav_l_res_0$a / Nav_l_res_0$a) # = 0.04
Nav_l_dif_2eps <- subset(Navarro_l, dif > 2*eps) # KE 2
Nav_l_2 <- max(Nav_l_dif_2eps$rt / Nav_l_dif_2eps$a / Nav_l_dif_2eps$a) # = 0.004 # Ensure all densities are non-negative test_that("Non-negativity of densities", { expect_true(all(SWSE$res >= 0))
expect_true(all(Gondan_s$res >= 0)) expect_true(all(Gondan_b$res >= 0))
expect_true(all(subset(Navarro_s, rt/a/a < Nav_s_0)$res >= 0)) # see KE 1 expect_true(all(subset(Navarro_l, rt/a/a > Nav_l_0)$res >= 0)) # see KE 2
expect_true(all(subset(Navarro_b, rt/a/a > Nav_l_0)$res >= 0)) # see KE 3 expect_true(all(rtdists$res >= 0))
expect_true(all(RWiener$res >= 0)) expect_true(all(Gondan_R$res >= 0))
})

# Test accuracy within 2*eps (allows for convergence from above and below)
test_that("Consistency among internal methods", {
expect_true(all(SWSE$dif < 2*eps)) expect_true(all(Gondan_s$dif < 2*eps))
expect_true(all(Gondan_b$dif < 2*eps)) expect_true(all(Navarro_s$dif < 2*eps)) # see KE 1
expect_true(all(subset(Navarro_l, rt/a/a > Nav_l_2)$dif < 2*eps)) # see KE 2 expect_true(all(Navarro_b$dif < 2*eps)) # see KE 1,2
})

test_that("Accuracy relative to established packages", {
expect_true(all(rtdists$dif < 2*eps)) expect_true(all(RWiener$dif < 2*eps))
expect_true(all(subset(Gondan_R, sv < SV_THRESH)$dif < 2*eps)) # see KE 4 }) # Test consistency in log vs non-log test_that("Log-Consistency among internal methods", { expect_equal(subset(SWSE, res > eps*eps)$log_res,
log(subset(SWSE, res > eps*eps)$res)) expect_equal(subset(Gondan_s, res > eps*eps)$log_res,
log(subset(Gondan_s, res > eps*eps)$res)) expect_equal(subset(Gondan_b, res > eps*eps)$log_res,
log(subset(Gondan_b, res > eps*eps)$res)) expect_equal(subset(Navarro_s, res > eps*eps)$log_res,
log(subset(Navarro_s, res > eps*eps)$res)) expect_equal(subset(Navarro_l, res > eps*eps)$log_res,
log(subset(Navarro_l, res > eps*eps)$res)) expect_equal(subset(Navarro_b, res > eps*eps)$log_res,
log(subset(Navarro_b, res > eps*eps)$res)) }) test_that("Log-Consistency of established packages", { expect_equal(subset(rtdists, res > eps*eps)$log_res,
log(subset(rtdists, res > eps*eps)$res)) expect_equal(subset(RWiener, res > eps*eps)$log_res,
log(subset(RWiener, res > eps*eps)$res)) expect_equal(subset(Gondan_R, res > eps*eps)$log_res,
log(subset(Gondan_R, res > eps*eps)$res)) }) Known Errors (KE) 1. Navarro small-time approximation is unstable for “large” effective response times, $$\frac{rt}{a^2} \geq 12$$. It gives slightly negative densities for effective response times above this threshold. These parameter values should not have an effect on the “both” time scale because the large-time approximation should handle such locations in the parameter space. The negative results, however, are still within $$2 \cdot \epsilon$$ of the accepted results (basically zero). 2. Navarro large-time approximation is unstable for “small” effective response times, $$\frac{rt}{a^2} \leq 0.04$$. Similarly to the Navarro small-time approximation above, the large-time approximation gives slightly negative densities for effective response times in this range. Furthermore, it yields inaccurate densities for effective response times of $$\frac{rt}{a^2} \leq 0.004$$. These parameter values should not have an effect on the “both” time scale because the small-time approximation should handle such locations in the parameter space. 3. The Navarro “both” time scale switches between the small-time and large-time approximations. This method relies too much on the unstable large time approximation, leading to its instability; thus we only need to subset to correct for the large time. 4) The Gondan_R approximation divides the error tolerance by the multiplicative term outside of the summation. Since the outside term is different when $$sv > 0$$, the approximation uses the incorrect error tolerance for $$sv > 0$$. This affects the number of terms required in the summation to achieve the desired precision, thus not actually achieving that desired precision. This issue is fixed in our implementation of the Gondan method (n_terms_small = "Gondan", scale = "small"). For an example of this discrepancy, see the following code: Validating Fitting (Optimization) Using the Density Functions Perhaps the most practical use of dfddm is to use it in an optimization setting, such as fitting DDM parameters to real-world data, and this section will show that all of the methods implemented in dfddm yield the same results in this setting. The parameters that we will fit are: $$a$$ (threshold separation), $$v$$ (drift rate), $$t_0$$ (non-decision time/response time constant), $$w$$ (relative starting point), and $$sv$$ (inter-trial-variability of drift). Since the example data we are using consists of two different item types that each require a different correct response (i.e., for one item type the correct response is mapped to the lower boundary and for the other item type the correct response is mapped to the upper boundary), the model includes two different versions of $$v$$ (drift rate): $$v_\ell$$ for fitting to the truthful lower boundary, and $$v_u$$ for fitting to the truthful upper boundary. As many of the approximations in dfddm have different styles, we only test a subset of all the available methods in dfddm to avoid unnecessary testing. After using the various density function approximations in fitting the DDM to real-world data, we then validate that the produced parameter estimates are consistent across the various methods within a given error tolerance. Generating the Parameter Estimates Using Real-World Data We use a range of different initial parameter values for the optimization function to ensure that none of the methods encounters a problem in the parameter space. However, we need to slightly restrict the starting values to prevent fitting issues with some of the small-time methods; these restrictions are discussed in a later subsection and in Known Error 5. We allow a difference of $$0.0001$$ across the methods for each combination of parameters since the density functions can return slightly different results (within $$2 \cdot \epsilon$$), as discussed earlier in this vignette. The following subsections will define all of the functions used to generate the fittings and provide the code to run the full fitting. Since running the full fitting for all of the individuals in the data takes a long time, we will forgo running the fitting in this vignette and instead read pre-fit parameter estimates that used the provided code. To avoid too many repetitious plots, we will only include six of the methods from the previous section that illustrate the different variations and implementations of the density functions. We include: all three small-time implementations in dfddm, the two implementations in dfddm that combine the small-time and large-time, and the one method available in the rtdists package. We only use a selection of the methods available in dfddm to avoid redundancy in our testing since most of the methods have multiple sibling approximations that are extremely similar. Moreover, we do not require the RWiener package nor the Gondan raw R code because the associated density function approximations do not include an option for variability in drift rate. While we can convert the constant drift rate density to the variable drift rate density using a multiplicative factor, the densities are still potentially calculated incorrectly. For more details on the differences between the constant drift rate density functions and their variable drift rate counterparts, see the Math Vignette. First, we load the necessary packages. Log-Likelihood Functions This code chunk defines the log-likelihood functions used in the optimization algorithm. The log-likelihood functions are fairly straightforward and split the responses and associated response times by the true item status (i.e., the correct response) to enable fitting distinct drift rates ($$v_\ell$$ for the items in which the correct response is the lower boundary and $$v_u$$ for the items for which the correct response is the upper boundary). In addition, the log-likelihood functions heavily penalize any combination of parameters that returns a log-density of $$-\infty$$ (equivalent to a regular density of $$0$$). ll_fs_SWSE_17 <- function(pars, rt, resp, truth, err_tol) { rtu <- rt[truth == "upper"] rtl <- rt[truth == "lower"] respu <- resp[truth == "upper"] respl <- resp[truth == "lower"] # the truth is "upper" so use vu densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "SWSE", summation_small = "2017", scale = "small", err_tol = err_tol) # the truth is "lower" so use vl densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "SWSE", summation_small = "2017", scale = "small", err_tol = err_tol) densities <- c(densu, densl) if (any(!is.finite(densities))) return(1e6) return(-sum(densities)) } ll_fs_Gon_17 <- function(pars, rt, resp, truth, err_tol) { rtu <- rt[truth == "upper"] rtl <- rt[truth == "lower"] respu <- resp[truth == "upper"] respl <- resp[truth == "lower"] # the truth is "upper" so use vu densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "Gondan", summation_small = "2017", scale = "small", err_tol = err_tol) # the truth is "lower" so use vl densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "Gondan", summation_small = "2017", scale = "small", err_tol = err_tol) densities <- c(densu, densl) if (any(!is.finite(densities))) return(1e6) return(-sum(densities)) } ll_fs_Nav_17 <- function(pars, rt, resp, truth, err_tol) { rtu <- rt[truth == "upper"] rtl <- rt[truth == "lower"] respu <- resp[truth == "upper"] respl <- resp[truth == "lower"] # the truth is "upper" so use vu densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "Navarro", summation_small = "2017", scale = "small", err_tol = err_tol) # the truth is "lower" so use vl densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "Navarro", summation_small = "2017", scale = "small", err_tol = err_tol) densities <- c(densu, densl) if (any(!is.finite(densities))) return(1e6) return(-sum(densities)) } ll_fb_Gon_17 <- function(pars, rt, resp, truth, err_tol) { rtu <- rt[truth == "upper"] rtl <- rt[truth == "lower"] respu <- resp[truth == "upper"] respl <- resp[truth == "lower"] # the truth is "upper" so use vu densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "Gondan", summation_small = "2017", scale = "both", err_tol = err_tol) # the truth is "lower" so use vl densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "Gondan", summation_small = "2017", scale = "both", err_tol = err_tol) densities <- c(densu, densl) if (any(!is.finite(densities))) return(1e6) return(-sum(densities)) } ll_fb_Nav_17 <- function(pars, rt, resp, truth, err_tol) { rtu <- rt[truth == "upper"] rtl <- rt[truth == "lower"] respu <- resp[truth == "upper"] respl <- resp[truth == "lower"] # the truth is "upper" so use vu densu <- dfddm(rt = rtu, response = respu, a = pars[[3]], v = pars[[1]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "Navarro", summation_small = "2017", scale = "both", err_tol = err_tol) # the truth is "lower" so use vl densl <- dfddm(rt = rtl, response = respl, a = pars[[3]], v = pars[[2]], t0 = pars[[4]], w = pars[[5]], sv = pars[[6]], log = TRUE, n_terms_small = "Navarro", summation_small = "2017", scale = "both", err_tol = err_tol) densities <- c(densu, densl) if (any(!is.finite(densities))) return(1e6) return(-sum(densities)) } ll_RTDists <- function(pars, rt, resp, truth) { rtu <- rt[truth == "upper"] rtl <- rt[truth == "lower"] respu <- resp[truth == "upper"] respl <- resp[truth == "lower"] # the truth is "upper" so use vu densu <- ddiffusion(rtu, respu, a = pars[[3]], v = pars[[1]], z = pars[[5]]*pars[[3]], t0 = pars[[4]], sv = pars[[6]]) # the truth is "lower" so use vl densl <- ddiffusion(rtl, respl, a = pars[[3]], v = pars[[2]], z = pars[[5]]*pars[[3]], t0 = pars[[4]], sv = pars[[6]]) densities <- c(densu, densl) if (any(densities <= 0)) return(1e6) return(-sum(log(densities))) } Fitting Function We use a different set of initial parameter values for each combination of data set and method. In a real-life situation we would probably do so using random initial parameter values to avoid local minima. Here we do this with a fixed set of initial parameter values to ensure that the different method produce the same results for different starting points. However, there are a couple of restrictions to these initial values that we must address (we would also have to do so if we picked the initial values randomly). The parameter $$t_0$$ must be less than the response time, so we set the initial values for $$t_0$$ to be strictly less than the minimum response time according to each individual in the input data.frame. In addition, the optimization algorithm encounters problems for specific values of $$a$$ when using the log-likelihood function that is based on the Navarro small-time density function approximation, and this is explained in Known Errors. Furthermore, we must place a lower bound on $$a$$ because the optimization algorithm occasionally evaluates the log-likelihood functions (and thus the underlying density function approximations) using values of $$a$$ that are very close to zero in an effort to fully explore the parameter space. In these cases, the small-time approximations struggle because $$a$$ inversely scales the input response time and effectively changes the input response time into a very large one. In particular, an extremely small value of $$a$$ can cause the “SWSE” small-time density approximation to run for a very long time; to prevent this, we bound $$a$$ below by $$0.05$$ as it is a very small value such that anything less would be of no psychological significance to the model. In common use this is not an issue because very small values of $$a$$ do not make any sense with regard to the psychological interpretation of the parameter, but this issue can arise in an exploratory optimization environment. The following code chunk defines the function that will run the optimization and produce the fitted parameter estimates for $$v_u$$, $$v_\ell$$, $$a$$, $$t_0$$, $$w$$, and $$sv$$. As discussed above, we only use a selection of methods from those available in dfddm. This fitting function will run the optimization for each set of initial parameter values and store the resulting convergence code (either $$0$$ indicating successful convergence or $$1$$ indicating unsuccessful convergence), minimized value of the objective log-likelihood function, and parameter estimates. rt_fit <- function(data, id_idx = NULL, rt_idx = NULL, response_idx = NULL, truth_idx = NULL, response_upper = NULL, err_tol = 1e-6) { # Format data for fitting if (all(is.null(id_idx), is.null(rt_idx), is.null(response_idx), is.null(truth_idx), is.null(response_upper))) { df <- data # assume input data is already formatted } else { if(any(data[,rt_idx] < 0)) { stop("Input data contains negative response times; fit will not be run.") } if(any(is.na(data[,response_idx]))) { stop("Input data contains invalid responses (NA); fit will not be run.") } nr <- nrow(data) df <- data.frame(id = character(nr), rt = double(nr), response = character(nr), truth = character(nr), stringsAsFactors = FALSE) if (!is.null(id_idx)) { # relabel identification tags for (i in 1:length(id_idx)) { idi <- unique(data[,id_idx[i]]) for (j in 1:length(idi)) { df$id[data[,id_idx[i]] == idi[j]] <- paste(df$id[data[,id_idx[i]] == idi[j]], idi[j], sep = " ") } } df$id <- trimws(df$id, which = "left") } df$rt <- as.double(data[,rt_idx])

df$response <- "lower" df$response[data[,response_idx] == response_upper] <- "upper"

df$truth <- "lower" df$truth[data[,truth_idx] == response_upper] <- "upper"
}

# Preliminaries
ids <- unique(df$id) nids <- max(length(ids), 1) # if inds is null, there is only one individual init_vals <- data.frame(vu = c( 0, 10, -.5, 0, 0, 0, 0, 0, 0, 0, 0), vl = c( 0, -10, .5, 0, 0, 0, 0, 0, 0, 0, 0), a = c( 1, 1, 1, .5, 5, 1, 1, 1, 1, 1, 1), t0 = c( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), w = c(.5, .5, .5, .5, .5, .5, .5, .2, .8, .5, .5), sv = c( 1, 1, 1, 1, 1, 1, 1, 1, 1, .05, 5)) ninit_vals <- nrow(init_vals) algo_names <- c(rep("fs_SWSE_17", ninit_vals), rep("fs_Gon_17", ninit_vals), rep("fs_Nav_17", ninit_vals), rep("fb_Gon_17", ninit_vals), rep("fb_Nav_17", ninit_vals), rep("rtdists", ninit_vals)) nalgos <- length(unique(algo_names)) ni <- nalgos*ninit_vals # Initilize the result dataframe cnames <- c("ID", "Algorithm", "Convergence", "Objective", "vu_init", "vl_init", "a_init", "t0_init", "w_init", "sv_init", "vu_fit", "vl_fit", "a_fit", "t0_fit", "w_fit", "sv_fit") res <- data.frame(matrix(ncol = length(cnames), nrow = nids*ninit_vals*nalgos)) colnames(res) <- cnames # label the result dataframe res$Algorithm <- algo_names # label algorithms
res$vu_init <- init_vals$vu # label initial vu
res$vl_init <- init_vals$vl # label initial vl
res$a_init <- init_vals$a  # label initial a
res$w_init <- init_vals$w  # label initial w
res$sv_init <- init_vals$sv # label initial sv

# Loop through each individual and starting values
for (i in 1:nids) {
# extract data for id i
dfi <- df[df$id == ids[i],] rti <- dfi$rt
respi <- dfi$response truthi <- dfi$truth

# starting value for t0 must be smaller than the smallest rt
min_rti <- min(rti)
t0_lo <- 0.01*min_rti
t0_me <- 0.50*min_rti
t0_hi <- 0.99*min_rti
init_vals$t0 <- c(rep(t0_me, 5), t0_lo, t0_hi, rep(t0_me, 4)) # label the result dataframe res$ID[((i-1)*ni+1):(i*ni)] <- ids[i] # label individuals
res$t0_init[((i-1)*ni+1):(i*ni)] <- init_vals$t0 # label initial t0

# loop through all of the starting values
for (j in 1:ninit_vals) {
temp <- nlminb(init_vals[j,], ll_fs_SWSE_17,
rt = rti, resp = respi, truth = truthi, err_tol = err_tol,
# limits:   vu,   vl,   a,  t0, w,  sv
lower = c(-Inf, -Inf, .05,   0, 0,   0),
upper = c( Inf,  Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+0*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+0*ninit_vals+j] <- temp$objective
res[(i-1)*ni+0*ninit_vals+j, 11:16] <- temp$par temp <- nlminb(init_vals[j,], ll_fs_Gon_17, rt = rti, resp = respi, truth = truthi, err_tol = err_tol, # limits: vu, vl, a, t0, w, sv lower = c(-Inf, -Inf, .05, 0, 0, 0), upper = c( Inf, Inf, Inf, Inf, 1, Inf)) res$Convergence[(i-1)*ni+1*ninit_vals+j] <- temp$convergence res$Objective[(i-1)*ni+1*ninit_vals+j] <- temp$objective res[(i-1)*ni+1*ninit_vals+j, 11:16] <- temp$par

temp <- nlminb(init_vals[j,], ll_fs_Nav_17,
rt = rti, resp = respi, truth = truthi, err_tol = err_tol,
# limits:   vu,   vl,   a,  t0, w,  sv
lower = c(-Inf, -Inf, .05,   0, 0,   0),
upper = c( Inf,  Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+2*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+2*ninit_vals+j] <- temp$objective
res[(i-1)*ni+2*ninit_vals+j, 11:16] <- temp$par temp <- nlminb(init_vals[j,], ll_fb_Gon_17, rt = rti, resp = respi, truth = truthi, err_tol = err_tol, # limits: vu, vl, a, t0, w, sv lower = c(-Inf, -Inf, .05, 0, 0, 0), upper = c( Inf, Inf, Inf, Inf, 1, Inf)) res$Convergence[(i-1)*ni+3*ninit_vals+j] <- temp$convergence res$Objective[(i-1)*ni+3*ninit_vals+j] <- temp$objective res[(i-1)*ni+3*ninit_vals+j, 11:16] <- temp$par

temp <- nlminb(init_vals[j,], ll_fb_Nav_17,
rt = rti, resp = respi, truth = truthi, err_tol = err_tol,
# limits:   vu,   vl,   a,  t0, w,  sv
lower = c(-Inf, -Inf, .05,   0, 0,   0),
upper = c( Inf,  Inf, Inf, Inf, 1, Inf))
res$Convergence[(i-1)*ni+4*ninit_vals+j] <- temp$convergence
res$Objective[(i-1)*ni+4*ninit_vals+j] <- temp$objective
res[(i-1)*ni+4*ninit_vals+j, 11:16] <- temp$par temp <- nlminb(init_vals[j,], ll_RTDists, rt = rti, resp = respi, truth = truthi, # limits: vu, vl, a, t0, w, sv lower = c(-Inf, -Inf, .05, 0, 0, 0), upper = c( Inf, Inf, Inf, Inf, 1, Inf)) res$Convergence[(i-1)*ni+5*ninit_vals+j] <- temp$convergence res$Objective[(i-1)*ni+5*ninit_vals+j] <- temp$objective res[(i-1)*ni+5*ninit_vals+j, 11:16] <- temp$par
}
}
return(res)
}

Running the Fitting

As an example data set, we use the med_dec data that comes with fddm. This data contains the accuracy condition reported in Trueblood et al. (2018) investigating medical decision making among medical professionals (pathologists) and novices (i.e., undergraduate students). The task of participants was to judge whether pictures of blood cells show cancerous cells (i.e., blast cells) or non-cancerous cells (i.e., non-blast cells). The data set contains 200 decision per participant, based on pictures of 100 true cancerous cells and pictures of 100 true non-cancerous cells. We remove the trials without response (indicated by rt < 0 in the data) before fitting.

Having set up the fitting functions in the above chunks of code, we could pass the med_dec data to this function to get the parameter estimates. However, as this takes a long time we skip the fitting in this vignette and instead read the pre-fit parameter estimates in the next section.

Testing the Fitted Parameters

Now we test the consistency of the fitted parameters using the various density function approximations. As discussed above, we use an allowable error tolerance of $$0.0001$$ to account for the slight differences in the output of the different density approximations. First, we define a function to determine the differences in the objective value of the log-likelihood function and the parameter estimates. For each set of initial parameter values, there is one objective value corresponding to each algorithm; we take the minimum of these objective values and use the associated parameter estimates as the best estimates. Then we use this set of best estimates to compare against each set parameter estimates that use the same initial parameter values. At the end of this prep function, we subset the fitted data to remove those fits that used the Navarro small-time approximation with an initial value of $$a \leq .75$$ (see Known Error 5 below for more details). The following code chunk defines this function, which will be used next.

As previously mentioned, we will read the pre-fit data from file as the fits can take a long time to run. Then we run our prep function to expose any significant differences in the fits across the algorithms. As an example of the fitting comparison, we print the results for the first ID in the data set (ID = "experienced 2"). This sample shows that the agreement across methods for this set of initial parameter values is rather good for the first participant in the study.

# load data, will be in the variable 'fit'
load(system.file("extdata", "valid_fit.Rds", package = "fddm", mustWork = TRUE))
fit <- fit_prep(fit)

print("Results for ID = experienced 2")
#> [1] "Results for ID = experienced 2"
fit[(0:5)*11+1,]
#>               ID  Algorithm Convergence Objective vu_init vl_init a_init t0_init w_init sv_init
#> 1  experienced 2 fs_SWSE_17           0  42.47181       0       0      1  0.2305    0.5       1
#> 12 experienced 2  fs_Gon_17           0  42.47181       0       0      1  0.2305    0.5       1
#> 23 experienced 2  fs_Nav_17           0  42.47181       0       0      1  0.2305    0.5       1
#> 35 experienced 2  fb_Gon_17           0  42.47181      10     -10      1  0.2305    0.5       1
#> 46 experienced 2  fb_Nav_17           0  42.47181      10     -10      1  0.2305    0.5       1
#> 57 experienced 2    rtdists           0  42.47181      10     -10      1  0.2305    0.5       1
#>      vu_fit    vl_fit    a_fit    t0_fit     w_fit   sv_fit     Obj_diff       vu_diff
#> 1  5.681305 -2.188657 2.790911 0.3764464 0.4010114 2.281295 1.660993e-09  4.213544e-06
#> 12 5.681307 -2.188655 2.790915 0.3764461 0.4010111 2.281291 1.562547e-09  6.397442e-06
#> 23 5.681299 -2.188660 2.790908 0.3764466 0.4010116 2.281296 1.728765e-09 -1.982117e-06
#> 35 5.681304 -2.188660 2.790912 0.3764464 0.4010115 2.281298 0.000000e+00  0.000000e+00
#> 46 5.681305 -2.188660 2.790913 0.3764464 0.4010115 2.281299 1.469883e-08  1.119686e-06
#> 57 5.681305 -2.188660 2.790912 0.3764464 0.4010115 2.281299 1.469822e-08  1.029976e-06
#>          vl_diff        a_diff       t0_diff        w_diff       sv_diff
#> 1   8.397604e-07  1.408102e-06 -6.937031e-08 -1.609263e-07 -9.734412e-07
#> 12  3.374579e-06  5.186847e-06 -3.937648e-07 -3.794396e-07 -5.048216e-06
#> 23 -2.418119e-06 -1.014296e-06  7.146087e-08  6.097234e-08  2.327210e-07
#> 35  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00  0.000000e+00
#> 46 -3.610748e-07  3.172648e-07  4.550910e-09  1.692955e-09  8.149610e-07
#> 57 -3.626724e-07  2.295751e-07  3.599953e-09 -7.131966e-09  6.125464e-07

We can see that the fits for this individual participant and this set of initial parameter values are very similar across the different algorithms. As there are many individuals in this dataset, we will not print this style of fit comparison for each combination of participant and initial parameter values; instead we will print only those fits that are worse than the best fit. Output below are all of the runs where either:

1. the optimization run did not converge (i.e. Convergence = 0); or

2. the log-likelihood objective value differs from the smallest produced of any method for that specific set of initial parameter values; or

3. or the parameter estimates differ from the smallest produced of any method for that specific set of initial parameter values.

# Define error tolerance
eps <- 1e-4

out <- fit[unique(which(abs(fit[, c(3, 17:23)]) > eps, arr.ind = TRUE)[, 1]),]
out[, -c(1:2)] <- zapsmall(out[, -c(1:2)])
out
#>                   ID  Algorithm Convergence Objective vu_init vl_init a_init t0_init w_init sv_init
#> 411   experienced 16  fs_Gon_17           1 2966.5346     0.0     0.0    0.5  0.2895    0.5       1
#> 1588        novice 7 fs_SWSE_17           1 2961.5563     0.0     0.0    0.5  0.2690    0.5       1
#> 1597        novice 7  fs_Gon_17           1  414.7291    10.0   -10.0    1.0  0.2690    0.5       1
#> 1599        novice 7  fs_Gon_17           1 2954.4444     0.0     0.0    0.5  0.2690    0.5       1
#> 2400       novice 19  fs_Nav_17           1  819.4197    10.0   -10.0    1.0  0.2520    0.5       1
#> 771  inexperienced 8  fb_Nav_17           0  149.9361     0.0     0.0    1.0  0.0885    0.5       1
#> 1465        novice 5  fs_Gon_17           0  133.0414    10.0   -10.0    1.0  0.2815    0.5       1
#> 1697        novice 8  fb_Nav_17           0   28.7188    -0.5     0.5    1.0  0.0620    0.5       1
#> 1943       novice 12  fs_Nav_17           0   21.6547     0.0     0.0    1.0  0.3623    0.5       1
#> 3513       novice 36  fs_Gon_17           0    7.2392     0.0     0.0    0.5  0.2730    0.5       1
#> 3335       novice 33  fb_Gon_17           0   86.0960    10.0   -10.0    1.0  0.1875    0.5       1
#>       vu_fit  vl_fit  a_fit t0_fit  w_fit sv_fit  Obj_diff vu_diff vl_diff  a_diff t0_diff  w_diff
#> 411   0.0001 -0.0001 0.4921 0.3045 0.5000 0.9990 2712.8383 -1.2601  1.3891 -2.2641 -0.1647  0.0599
#> 1588  0.0000  0.0000 0.4915 0.2721 0.5001 0.9997 2617.7618 -0.4114  0.9253 -2.4014 -0.1172  0.0864
#> 1597 -0.0703 -0.8049 2.2278 0.2269 0.6624 0.0064   70.9346 -0.4817  0.1205 -0.6652 -0.1625  0.2487
#> 1599  0.0000  0.0000 0.4999 0.2690 0.5000 1.0000 2610.6499 -0.4114  0.9253 -2.3931 -0.1203  0.0863
#> 2400  1.4632 -1.5287 0.7102 0.4944 0.6476 0.6965  664.3276 -0.3301  0.5084 -1.5907  0.0666  0.1377
#> 771   1.8936 -0.0809 2.1561 0.1313 0.4815 0.0000    1.1808 -0.2726 -0.0229 -0.1671  0.0040  0.0081
#> 1465  0.4940 -1.3330 1.5444 0.5027 0.5181 0.0000    3.2519 -0.2709  0.5393 -0.1972  0.0071 -0.0105
#> 1697  1.9776  1.2560 1.4352 0.0995 0.4137 0.0000    5.0284 -0.7610 -0.6282 -0.2765  0.0065  0.0414
#> 1943 -0.7418 -2.6739 1.5598 0.3376 0.5256 0.0000    1.8148  0.3297  0.5947 -0.1746  0.0053 -0.0254
#> 3513  1.5424 -1.9930 1.5396 0.5175 0.4319 0.0000    1.5007 -0.2841  0.3311 -0.1393  0.0026  0.0039
#> 3335  1.7151 -0.1983 1.3949 0.3317 0.4688 1.1782    0.0000  0.0001  0.0000  0.0000  0.0000  0.0000
#>      sv_diff
#> 411  -0.0445
#> 1588  0.3384
#> 1597 -0.6549
#> 1599  0.3387
#> 2400 -1.1265
#> 771  -0.5912
#> 1465 -1.1449
#> 1697 -1.3573
#> 1943 -1.0363
#> 3513 -0.8562
#> 3335  0.0001

We can see that it is mostly the small time methods that have occasional issues with data fitting. Although there are two methods that appear in the above output that use both the small-time and large-time approximations, the method that combines the large-time approximation of Navarro with the small-time approximation of Gondan is actually quite stable. Using this method, there are two offending fitted parameters whose difference from the fits given using rtdists are not smaller than our given error tolerance; in this case, the differences are only marginally greater than our given error tolerance. These results suggest that the most stable methods are those that use both the small-time and large-time approximations to the density, and in particular the default option for dfddm (scale = "both", n_terms_small = "Gondan") is very stable .

Known Errors

1. If the starting parameter value for $$a \leq 0.75$$ then the optimization that uses Navarro small-time density approximation yields inaccurate parameter estimates for all of the fitted parameters, yet nlminb claims to have converged. This is likely due to the small value of $$a$$ creating a larger “effective response time” $$\frac{rt}{a^2}$$, which the small-time approximation may not handle well and thus cause the optimizer to get stuck. This issue does not appear for the density function that uses both the Navarro small-time and large-time approximations (n_terms_small = "Navarro", scale = "both") because the large-time approximation will handle this region of the parameter space and produce more accurate results.

References

Gondan, Matthias, Steven P Blurton, and Miriam Kesselmeier. 2014. “Even Faster and Even More Accurate First-Passage Time Densities and Distributions for the Wiener Diffusion Model.” Journal of Mathematical Psychology 60: 20–22.

Trueblood, Jennifer S., William R. Holmes, Adam C. Seegmiller, Jonathan Douds, Margaret Compton, Eszter Szentirmai, Megan Woodruff, Wenrui Huang, Charles Stratton, and Quentin Eichbaum. 2018. “The Impact of Speed and Bias on the Cognitive Processes of Experts and Novices in Medical Image Decision-Making.” Cognitive Research: Principles and Implications 3 (1): 28. https://doi.org/10.1186/s41235-018-0119-2.