Mathematical Description of Methods

August 05, 2021

Function dfddm evaluates the density function (or probability density function, PDF) for the Ratcliff diffusion decision model (DDM) using different methods for approximating the full PDF, which contains an infinite sum. An empirical validation of the implemented methods is provided in the Validity Vignette. Timing benchmarks for the present methods and comparison with existing methods are provided in the Benchmark Vignette. Two examples of using dfddm for parameter estimation are provided in the Example Vignette.

Our implementation of the DDM has the following parameters: $$a \in (0, \infty)$$ (threshold separation), $$v \in (-\infty, \infty)$$ (drift rate), $$t_0 \in [0, \infty)$$ (non-decision time/response time constant), $$w \in (0, 1)$$ (relative starting point), $$sv \in (0, \infty)$$ (inter-trial-variability of drift), and $$\sigma \in (0, \infty)$$ (diffusion coefficient of the underlying Wiener Process). Please note that for only this vignette, we will refer to the inter-trial variability of drift as $$\eta$$ instead of $$sv$$ to make the notation in the equations less confusing.

Mathematical Background

There are several different methods for approximating the PDF of the DDM, and there are three optional parameters in dfddm that can be used to indicate which method should be in the function call: scale, n_terms_small, and summation_small. For each method we describe, we include the parameter settings for the function call to dfddm so that it uses the desired method. As these parameters are optional, leaving them blank results in the default implementation that is indicated later in this vignette. For general purpose use, we recommend ignoring these optional parameters so that the default settings are used as this will be the fastest and most stable algorithm. Note that precedence for the optional parameters is first given to checking if the default implementation is selected. If not, precedence is then given to the scale parameter value; for example, scale = "large" will ignore the summation_small input value.

Note that there are actually two probability density functions of the DDM: the PDF with respect to the upper boundary, and the PDF with respect to the lower boundary. Following the precedent set by the literature, we use the general terminology “PDF” or “density function” to mean the probability density function with respect to the lower boundary. Should the probability density function with respect to the upper boundary be required, it may be calculated using the simple transformation $$f_\text{upper}(t ~|~ v, \eta, a, w) = f_\text{lower}(t ~|~ -v, \eta, a, 1-w)$$.

Since the DDM is widely used in parameter estimation usually involving numerical optimization, significant effort has been put into making the evaluation of its density as fast as possible. However, the density function for the DDM is notorious for containing an unavoidable infinite sum; hence, the literature has produced a few different methods of approximating the density. This vignette details the various methods used in the literature to approximate the infinite sum in the density function for the DDM.

The author of the seminal book where the density function originates, Feller (1968) explains the derivation from first principles. In this derivation there is a step that requires taking a limit, and Feller (1968) provides two different – but equivalent – limiting processes that yield two different – but equal – forms of the density function. Each of these forms contains an infinite sum, and they are known individually as the large-time approximation and the small-time approximation because the former is on average faster when calculating the density for large response times and the latter is on average faster when calculating the density for small response times.

When the drift rate is held constant (i.e. $$\eta = 0$$), the density function for the DDM is often written in a factorized form (Navarro and Fuss 2009): $$$f(t ~|~ v, a, w) = \frac{1}{a^2} \exp \left( -vaw -\frac{v^2 t}{2} \right) f_i \left( \frac{t}{a^2} ~\Big\vert~ 0, 1, w \right), \nonumber$$$ where $$f_i(\frac{t}{a^2} | 0, 1, w)$$ determines whether the large-time or small-time model will be used: \begin{aligned} f_{i=\ell} (t ~|~ 0, 1, w) &= \pi \sum_{j = 1}^{\infty} j \exp \left( -\frac{j^2 \pi^2 t}{2a^2} \right) \sin \left( j w \pi \right),\\ f_{i=s} (t ~|~ 0, 1, w) &= \frac{1}{\sqrt{2 \pi t^3}} \sum_{j = -\infty}^{\infty} (w + 2j) \exp \left( -\frac{(w + 2j)^2}{2t} \right). \end{aligned} \nonumber

In an effort to simplify the terms inside the infinite summations as much as possible, we instead rewrite the constant drift rate density function as two separate functions without the factorization: \begin{align} f_\ell(t | v, a, w) &= \frac{\pi}{a^2} e^{ \left( -vaw-\frac{v^2 t}{2} \right)} \sum_{j = 1}^{\infty} j \sin \left( j w \pi \right) \exp{ \left( -\frac{j^2 \pi^2 t}{2a^2} \right)}, \label{eq:con-l} \\ f_s(t | v, a, w) &= \frac{a}{\sqrt{2 \pi t^3}} e^{ \left( -vaw-\frac{v^2 t}{2} \right)} \sum_{j = -\infty}^{\infty} (w + 2j) \exp{ \left( -\frac{a^2}{2t} \left( w + 2j \right)^2 \right)}. \label{eq:con-s} \end{align}

In addition to having large-time and small-time variants, there exist two mathematically equivalent formulations for the infinite summation in the small-time density functions. The details and proof of equivalence of these two formulations are provided in the paper accompanying fddm, but we will continue to use the traditional formulation for the remainder of this vignette.

Now allowing the drift rate to vary across trials (i.e. $$\eta > 0$$), we should have two density functions. However, as only the small-time variable drift rate density function has been available in the literature (Blurton, Kesselmeier, and Gondan 2017), we provide the derivation of the large-time variable drift rate density function in the fddm paper. The large-time and small-time variable drift rate density function are: \begin{align} f_\ell(t | v, \eta, a, w) &= \frac{\pi}{a^2 \sqrt{1 + \eta^2 t}} \exp{ \left( \frac{\eta^2 a^2 w^2 -2vaw -v^2 t}{2 (1 + \eta^2 t)} \right)} \sum_{j = 1}^{\infty} j \sin \left( j w \pi \right) \exp{ \left( -\frac{j^2 \pi^2 t}{2a^2} \right)}, \label{eq:var-l}\\ f_s(t | v, \eta, a, w) &= \frac{a}{\sqrt{2 \pi t^3 \left( 1 + \eta^2 t \right)}} \exp{ \left( \frac{\eta^2 a^2 w^2 -2vaw -v^2 t}{2 (1 + \eta^2 t)} \right)} \sum_{j = -\infty}^{\infty} (w + 2j) \exp{ \left( -\frac{a^2}{2t} \left( w + 2j \right)^2 \right)}. \label{eq:var-s} \end{align}

Immediately of note is that the infinite summation for each time scale is the same regardless of the inclusion of variability in the drift rate. It then follows that there exists a term $$M$$ such that the density function for the constant drift rate multiplied by $$M$$ yields the density function for the variable drift rate. That is, $$M \cdot f(t | v, a, w) = f(t | v, a, w, \eta^2)$$ from the above equations; this value $$M$$ works for converting both the large-time and small-time constant drift rate densities to variable drift rate densities. Although we do not use this term, it may be useful in adapting current algorithms to easily outputting the density with variable drift rate. Note that there are some issues with simply scaling the constant drift rate density, so please see the Validity Vignette for more information about the potential problems with this conversion. The multiplicative term $$M$$ is given below: $$$M = \frac{1}{\sqrt{1 + \eta^2 t}} \exp \left( vaw + \frac{v^2 t}{2} + \frac{\eta^2 a^2 w^2 -2vaw -v^2 t}{2 (1 + \eta^2 t)} \right). \nonumber$$$

The Density Function Approximations

The main issue of these families of density functions is that they all contain an infinite sum that must be approximated. Since there is no closed form analytical solution to this infinite sum, we instead calculate only a partial sum by truncating the sequence of terms after a certain point. We cannot actually calculate the true value of the density function, but we can mathematically prove that we can get arbitrarily close to the true value; the proof of this fact is provided in the paper accompanying the fddm package. The nature of this truncation has been the topic of many papers in the literature, but the underlying idea supporting all of the methods is the same: the user specifies an allowable error tolerance, and the algorithm calculates terms of the infinite sum until the result is within the allowed error tolerance of the true value.

The methods in the literature pre-calculate the number of terms required for the infinite sum to converge within the allowed error tolerance, and this number of terms is referred to as $$k_\ell$$ and $$k_s$$ for the large-time and small-time infinite sums, respectively. Navarro and Fuss (2009) include a method for calculating $$k_c$$, the number of required terms for the infinite sum when combining the density functions of the two time scales. In addition to these existing methods, we add a novel method that does not perform this pre-calculation, and we also provide two new combinations of the large-time and small-time density functions. Note that in each method that pre-calculates the number of terms, the response time $$t$$ is scaled inversely by $$a^2$$, that is $$t' := \tfrac{t}{a^2}$$. Also note that for the rest of this vignette, the ceiling function will be denoted by $$\lceil \cdot \rceil$$.

Large-Time

The large-time density functions, Equations $$\eqref{eq:con-l}$$ and $$\eqref{eq:var-l}$$, have an infinite sum that runs for all of the positive integers. For a given error tolerance $$\epsilon$$, Navarro and Fuss (2009) provide an expression for $$k_\ell$$, the number of terms required for the large-time infinite sum to be within $$\epsilon$$ of the true value of the density function. Thus the infinite sum becomes finite: $$$\label{eq:kl} \sum_{j = 1}^{k_\ell^\text{Nav}} j \sin \left( j w \pi \right) \exp{ \left( -\frac{j^2 \pi^2 t'}{2} \right)}. \nonumber$$$

It remains to find the value of $$k_\ell^\text{Nav}$$ that ensures the truncated sum is $$\epsilon$$-close to the true value. Navarro and Fuss (2009) provide a derivation in their paper that finds an upper bound for the tail of the sum, the sum of all terms greater than $$k_\ell^\text{Nav}$$ (i.e., the error). Then they back-calculate the number of terms required to force this upper bound on the error to be less than $$\epsilon$$, since then the actual error must also be less than $$\epsilon$$. The resulting number of terms is: $$$\label{eq:kl-Nav} k_\ell^{\text{Nav}} \left( t', \epsilon \right) = \left\lceil \max \left\{ \sqrt{\frac{-2 \log(\pi t' \epsilon)}{\pi^2 t'}}, \frac{1}{\pi \sqrt{t'}} \right\} \right\rceil. \tag{L.1}$$$

This method is often viewed as the most inefficient of the available options in the literature; however, this method proves to be extremely efficient in particular areas of the parameter space (typically for large $$t'$$). The argument used to implement this method in dfddm is to set the parameter scale = "large" in the function call. In this case the other parameters, n_terms_small and summation_small, must be set to any string that does not start with S or s to avoid any potential clashes with overridden options.

Small-Time

The small-time approximations, Equations $$\eqref{eq:con-s}$$ and $$\eqref{eq:var-s}$$, also contain an infinite sum, but this sum runs over all of the integers – from negative infinity to positive infinity. Given this infinite nature in both directions, it is impossible to rigorously define the number of terms required to achieve the $$\epsilon$$-accuracy because we don’t know where to start counting the terms. To solve this issue, we rearrange the terms in the sum into the sequence $$\left\{ b_0, b_{-1}, b_1, \dots, b_{-j}, b_j, b_{-(j+1)}, b_{j+1}, \dots \right\}$$; this allows us not only to count the terms in a sensible manner but also to define $$k_s$$ as the index of the sequence where the truncation should occur. Then we can write the truncated version of the sum: $$$\label{eq:ks} \sum_{j = -\left\lfloor\frac{k_s}{2}\right\rfloor}^{\left\lfloor\frac{k_s}{2}\right\rfloor} (w + 2j) \exp{ \left( -\frac{a^2}{2t} \left( w + 2j \right)^2 \right)}. \nonumber$$$

To choose the small-time methods when using dfddm, set the optional parameter scale = "small" in the function call. You can also set the optional parameter summation_small = "2017" or summation_small = "2014", but it is recommended to ignore this parameter so it retains its default value of “2017” that evaluates slightly faster than its counterpart. This parameter controls the style of summation used in the small-time approximation, and more details on the differences between these two styles can be found in the paper accompanying fddm. The final parameter, n_terms_small, will be discussed in the following three subsections.

Navarro & Fuss

Similarly to their large-time solution, Navarro and Fuss (2009) provide an expression for $$k_s$$ given an error tolerance $$\epsilon$$. They follow a similar idea to their large-time derivation by bounding the error of the truncation with integrals then back-calculating the number of terms required to keep the error less than $$\epsilon$$. The resulting approximation yields: $$$\label{eq:ks-Nav} k_s^{\text{Nav}} \left( t', \epsilon \right) = \left\lceil \max \left\{ 2 + \sqrt{-2t' \log(2 \epsilon \sqrt{2 \pi t'})}, 1 + \sqrt{t'} \right\} \right\rceil. \tag{S.1}$$$

To use this method, set n_terms_small = "Navarro" and keep scale = "small" in the function call. The parameter summation_small should be ignored so that it retains its default value to obtain the best performance.

Gondan, Blurton, Kesselmeier

After Navarro and Fuss (2009) published their paper, Gondan, Blurton, and Kesselmeier (2014) introduced another method for calculating the required number of terms in the truncated small-time summation. It is important to note, however, that Gondan, Blurton, and Kesselmeier (2014) provided the number of required pairs of terms in the $$S_{14}$$ summation style, and not the number of required individual terms. As we want the number of individual terms, we adapt their formula and define it given a desired precision $$\epsilon$$: \label{eq:ks-Gon} \begin{aligned} k_s^{\text{Gon}} \left( t', w, \epsilon \right) &= \left\lceil \max \left\{ \tfrac{1}{2} \left( \sqrt{2t'} - w \right), \tfrac{1}{2} \left( \sqrt{-t' (u_\epsilon - \sqrt{-2 u_\epsilon -2})} - w \right) \right\} \right\rceil,\\ u_\epsilon &= \min \left\{ -1, \log(2 \pi t'^2 \epsilon^2) \right\}. \end{aligned} \tag{S.2}

To use this method, set n_terms_small = "Gondan" and keep scale = "small" in the function call. The parameter summation_small should be ignored so that it retains its default value to obtain the best performance.

Stop When Small Enough (SWSE)

If we consider the terms of the infinite sum as the sequence defined above, the series alternates in sign $$(+, -, +, \dots)$$; moreover, the series eventually decreases monotonically (in absolute value) due to the exponential term. Combining and exploiting these two mathematical properties has been the cornerstone of the previous approximations, but we will instead truncate the sum using a method suggested by Gondan, Blurton, and Kesselmeier (2014). This method does not pre-calculate the number of terms required to achieve the given error tolerance. Instead, the general idea of this method is to take full advantage of the alternating and decreasing nature of the terms in the infinite sum by applying a handy theorem (commonly known as the alternating series test) to place an upper bound on the truncation error after including so many terms. It has been proven that this upper bound is in fact the absolute value of the next term in the sequence, thus we can truncate the infinite sum once one of its terms is less than the desired error tolerance, $$\epsilon$$. Hence we do not consider the number of terms in the sum, rather just that the terms in the summation will eventually be small enough. The validity of this method is proven in the paper that accompanies the fddm package.

To use this method, set n_terms_small = "SWSE" and keep scale = "small" in the function call. The parameter summation_small should be ignored so that it retains its default value to obtain the best performance.

Combining Large-Time and Small-Time

A sensible next approach to approximating the density of the DDM is to use some combination of the large-time and small-time density functions. As their names suggest, each density function approximation has a section of the parameter space where it outperforms the other one. Essentially these methods involve calculating the number of terms required for both the large-time and small-time density functions, then using whichever approximation requires fewer terms. The goal is to use each approximation where it is efficient and avoid the areas of the parameter space where the approximations perform poorly.

To use this method of evaluating the PDF, set the optional parameter scale = "both" in the function call. These methods include similar options to the small-time approximations from the previous section, and thus it is possible to set the optional parameters in a similar fashion to above. The value of summation_small triggers the same effect as above since it only affects the small-time approximation to the PDF; we recommend ignoring this parameter so it assumes its default value of “2017” that evaluates slightly faster than its counterpart. The final parameter, n_terms_small, will be discussed in the following three subsections. As there is only one option for the large-time approximation, there are no optional parameters to set for this part of the combined time scale approximation.

Navarro Small & Navarro Large

Navarro and Fuss (2009) initially suggested an algorithm wherein both $$k_s^\text{Nav}$$ and $$k_l^\text{Nav}$$ were calculated, and whichever one has a smaller value was used. However, one issue with their original method arises when the user inputs a vector of response times. They calculate the maximum value of $$k_s^\text{Nav}$$ for all of the input response times and then compare that to the maximum value of $$k_l^\text{Nav}$$ for all of the input response times. They end up with one $$k_c^\text{Nav}$$ for all of the input response times, and this one value can be higher than normal for a response time due to the maximization of the approximation over all of the response times. This over-accuracy is not necessarily a bad thing, but it does mean that it is possible to get different densities for the same input response time depending on the other response times that have been input. To correct for this, we simply do not maximize $$k_s^\text{Nav}$$ or $$k_l^\text{Nav}$$ over all of the input response times and instead calculate one value of $$k_c^\text{Nav}$$ for each input response time. For more details, see the “Implementation” section in the paper accompanying fddm. $$$\label{eq:kc-Nav} k_c^\text{Nav} \left( t', w, \epsilon \right) = \min \left\{ k_s^{\text{Nav}}, k_l^{\text{Nav}} \right\}. \tag{B.1}$$$

To use this method, set n_terms_small = "Navarro" and keep scale = "both" in the function call. The parameter summation_small should be ignored so that it retains its default value to obtain the best performance.

Gondan Small & Navarro Large

This combination of methods has not been explored in the literature before, but it works very similarly to the combination above. The only difference is that we use the Gondan, Blurton, and Kesselmeier (2014) approximation for the small-time instead of the one provided by Navarro and Fuss (2009). Since $$k_s^\text{Gon} \leq k_s^\text{Nav}$$. $$$\label{eq:kc-Gon} k_c^{\text{Gon}} \left( t', w, \epsilon \right) = \min \left\{ k_s^{\text{Gon}}, k_l^{\text{Nav}} \right\}. \tag{B.2}$$$

To use this method, set n_terms_small = "Gondan" and keep scale = "both" in the function call. The parameter summation_small should be ignored so that it retains its default value to obtain the best performance.

Stop When Small Enough (SWSE) Small & Navarro Large

This method uses the SWSE approximation to the small-time density function in conjunction with the Navarro large-time approximation to achieve consistent performance across the parameter space. Since the SWSE approximation to the small-time density function does not explicitly calculate $$k_s$$, we only calculate $$k_\ell^{\text{Nav}}$$ and compare it to the new optional parameter max_terms_large. If $$k_\ell^{\text{Nav}}$$ $$\le$$ max_terms_large, then the Navarro large-time approximation is used. On the other hand, if $$k_\ell^{\text{Nav}}$$ $$>$$ max_terms_large, then the SWSE small-time approximation is used. The user can set the optional parameter max_terms_large to any non-negative integer to alter the behavior of this method; the default value for this parameter is $$1$$. Again, the validity of this method is proven in the paper that accompanies the fddm package. Note that in the fddm paper, the parameter max_terms_large is labelled $$\delta$$.

Since this is the default implementation, all three of the optional parameters (n_terms_small, scale, and summation_small) can be ignored. For completeness, set n_terms_small = "SWSE" and keep scale = "both" in the function call. The parameter summation_small should be ignored so that it retains its default value to obtain the best performance. This method uses a fourth optional parameter, max_terms_large, and we recommend ignoring this parameter so that it retains its default value.

R Session Info

sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19043)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=C
#> [2] LC_CTYPE=English_United Kingdom.1252
#> [3] LC_MONETARY=English_United Kingdom.1252
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United Kingdom.1252
#>
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base
#>
#> other attached packages:
#> [1] ggforce_0.3.3        ggplot2_3.3.5        reshape2_1.4.4
#> [4] microbenchmark_1.4-7 RWiener_1.3-3        rtdists_0.11-2
#> [7] fddm_0.4-0
#>
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.1.1  xfun_0.24         bslib_0.2.5.1     purrr_0.3.4
#>  [5] splines_4.1.0     lattice_0.20-44   colorspace_2.0-2  vctrs_0.3.8
#>  [9] generics_0.1.0    expm_0.999-6      htmltools_0.5.1.1 yaml_2.2.1
#> [13] utf8_1.2.1        survival_3.2-11   rlang_0.4.11      jquerylib_0.1.4
#> [17] pillar_1.6.1      withr_2.4.2       glue_1.4.2        DBI_1.1.1
#> [21] tweenr_1.0.2      lifecycle_1.0.0   plyr_1.8.6        stringr_1.4.0
#> [25] munsell_0.5.0     gtable_0.3.0      mvtnorm_1.1-2     evaluate_0.14
#> [29] labeling_0.4.2    knitr_1.33        fansi_0.5.0       evd_2.3-3
#> [33] highr_0.9         Rcpp_1.0.6        scales_1.1.1      jsonlite_1.7.2
#> [37] farver_2.1.0      digest_0.6.27     stringi_1.6.2     msm_1.6.8
#> [41] dplyr_1.0.7       polyclip_1.10-0   gsl_2.1-6         grid_4.1.0
#> [45] tools_4.1.0       magrittr_2.0.1    sass_0.4.0        tibble_3.1.2
#> [49] crayon_1.4.1      pkgconfig_2.0.3   MASS_7.3-54       ellipsis_0.3.2
#> [53] Matrix_1.3-4      ggnewscale_0.4.5  assertthat_0.2.1  rmarkdown_2.9
#> [57] R6_2.5.0          compiler_4.1.0

References

Blurton, Steven P, Miriam Kesselmeier, and Matthias Gondan. 2017. “The First-Passage Time Distribution for the Diffusion Model with Variable Drift.” Journal of Mathematical Psychology 76: 7–12.
Feller, William. 1968. An Introduction to Probability Theory and Its Applications: Volume i. Vol. 1. John Wiley & Sons.
Gondan, Matthias, Steven P Blurton, and Miriam Kesselmeier. 2014. “Even Faster and Even More Accurate First-Passage Time Densities and Distributions for the Wiener Diffusion Model.” Journal of Mathematical Psychology 60: 20–22.
Navarro, Daniel J, and Ian G Fuss. 2009. “Fast and Accurate Calculations for First-Passage Times in Wiener Diffusion Models.” Journal of Mathematical Psychology 53 (4): 222–30.