The R extension of the markdown language (Xie,
Dervieux, and Riederer 2020) enables reproducible statistical
reports with nice typesetting in HTML, Microsoft Word, and LaTeX. Since
version 4.2 (R Core Team 2022),
R's manual pages include support for mathematical expressions (Sarkar and Hornik 2022; Viechtbauer 2022),
which is already a big improvement. However, rules for the mapping of
built-in language elements to their mathematical representation are
still lacking. So far, R expressions such as `pbinom(k, N, p)` are
printed as they are; pretty
mathematical formulae such as \(P_{\mathrm{Bi}}(X \le k; N, p)\) require
explicit LaTeX commands, that is,
`P_{\mathrm{Bi}}\left(X \le k; N, p\right)`. Except for
minimalistic use cases, these commands are tedious to type in and their
source code is hard to read.

The R package `mathml` defines a set of rules for the
automatic translation of R expressions to mathematical output in
RMarkdown documents (Xie, Dervieux, and Riederer
2020) and ShinyApp webpages (Chang et al.
2022). The translation is done by an embedded Prolog interpreter
that maps nested expressions recursively to MathML and LaTeX/MathJax,
respectively. User-defined hooks make it possible to extend the set of
rules, for example, to represent specific R elements by custom
mathematical signs.

The main feature of the package is that the same R expressions and equations can be used for both mathematical typesetting and calculations. This saves time and reduces mistakes, as will be illustrated below.

The translation of R expressions to mathematical output is achieved
through a Prolog interpreter from R package *rolog* (Gondan 2022). Prolog is a classical logic
programming language with many applications in expert systems, computer
linguistics and symbolic artificial intelligence. The main strength of
Prolog is its concise representation of facts and rules for knowledge
and grammar, as well as its efficient built-in search engine for closed
world domains. As is well known, R is a statistical programming
language for data analysis and statistical modeling. Whereas Prolog is
weak in statistical computation but strong in symbolic manipulation,
the converse may be said for the R language. The *rolog* package
bridges this gap by providing an interface to a SWI-Prolog distribution
(Wielemaker et al. 2012) in R. The
communication between the two systems is mainly in the form of queries
from R to Prolog, but two predicates allow Prolog to call back and
evaluate terms in R.

For a first illustration of the *mathml* package, we consider
the binomial probability.

```
term <- quote(pbinom(k, N, p))
term
```

`## pbinom(k, N, p)`

The term is quoted to avoid its immediate evaluation (which would
raise an error anyway since the variables `k`, `N`, `p`
have not yet been defined). Experienced
readers will remember that the quoted expression above is a short form
for

`term <- call("pbinom", as.name("k"), as.name("N"), as.name("p"))`

As is seen from the output above, the variable `term` is
not assigned the result of the calculation, but an R call (see, e.g., Wickham 2019, for details on “non-standard
evaluation”). Such a call can eventually be evaluated with
`eval()`,

```
k <- 10
N <- 22
p <- 0.4
eval(term)
```

`## [1] 0.77195`
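The equivalence of the two ways of building the call can be checked in base R alone. The following is a minimal sketch; the variable names `term_quoted`, `term_built`, `same_call` and `value` are chosen here for illustration. Instead of defining `k`, `N` and `p` globally, the bindings are passed to `eval()` as a list.

```r
# Two equivalent ways to build the unevaluated call
term_quoted <- quote(pbinom(k, N, p))
term_built  <- call("pbinom", as.name("k"), as.name("N"), as.name("p"))
same_call   <- identical(term_quoted, term_built)

# Evaluate the call with bindings supplied in a list
# (no global variables k, N, p are needed)
value <- eval(term_quoted, list(k = 10, N = 22, p = 0.4))

print(same_call)
print(value)
```

Passing the bindings as a list keeps the workspace clean, which can be convenient when the same symbols are reused with different values.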

The R package *mathml* can now be used to render the call in
MathML, that is, the dialect for mathematical elements on HTML webpages,
or in MathJax/LaTeX, as shown below (some of the curly braces are not
really needed in this simple example, but are necessary in edge
cases).

```
library(mathml)
mathjax(term)
```

`## [1] "${P}_{\\mathrm{Bi}}{\\left({{X}{\\le}{k}}{{;}{{N}{{,}{p}}}}\\right)}$"`

We can include the output in an RMarkdown document by specifying
`results='asis'` in the R code chunk, as is done in the next
example. The R function `mathout()` has been defined in this
vignette; it invokes `mathml()` for HTML output and
`mathjax()` for LaTeX output.

`mathout(term)`

${P}_{\text{Bi}}\left(X\le k;N,p\right)$

At the Prolog end, a predicate `math/2` translates the
call `pbinom(K, N, Pi)` into a “function” `fn/2`
with the name `P_Bi`, one argument `X =< K`,
and the two parameters `N` and `Pi`.

```
math(pbinom(K, N, Pi), M)
 => M = fn(subscript('P', "Bi"), (['X' =< K] ; [N, Pi])).
```

Thus, the predicate `math/2` could be considered a “macro”
that translates a mathematical element (here,
`pbinom(K, N, Pi)`) to a different mathematical element,
namely `fn(Name, (Args ; Pars))`. The low-level predicate
`ml/3` is used to convert these basic elements to MathML.

```
ml(Flags, fn(Name, (Args ; Pars)), M)
 => ml(Flags, Name, N),
    ml(Flags, paren(list(op(;), [list(op(','), Args), list(op(','), Pars)])), X),
    M = mrow([N, mo(&(af)), X]).
```

The relevant rule for `ml/3` builds the MathML entity
`mrow([N, mo(&(af)), X])`, with `N`
representing the name of the function and `X` its arguments
and parameters, enclosed in parentheses. A corresponding rule
`jax/3` does the same for MathJax/LaTeX. A list of flags can
be used for context-sensitive translation (see, e.g., the section on
errors below).
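The two-step idea (a macro that rewrites a call, followed by a recursive low-level translation) can be imitated in a few lines of R. This is only a toy sketch for illustration: the helper `to_tex()` is hypothetical, it is not part of *mathml*, and the package itself performs the translation in Prolog with far more rules.

```r
# Hypothetical helper (not part of mathml): translate a tiny subset of
# R calls to a LaTeX string by recursing over the nested call
to_tex <- function(x) {
  if (is.name(x)) return(as.character(x))
  if (is.numeric(x)) return(format(x))
  if (is.call(x)) {
    f <- as.character(x[[1]])
    args <- vapply(as.list(x)[-1], to_tex, character(1))
    # "Macro" step, loosely analogous to math/2
    if (f == "pbinom" && length(args) == 3)
      return(sprintf("P_{\\mathrm{Bi}}\\left(X \\le %s; %s, %s\\right)",
                     args[1], args[2], args[3]))
    # Binary arithmetic operators are written infix
    if (f %in% c("+", "-", "*", "/") && length(args) == 2)
      return(paste(args[1], f, args[2]))
    # Default: name(arguments)
    return(sprintf("\\mathrm{%s}\\left(%s\\right)", f,
                   paste(args, collapse = ", ")))
  }
  stop("unsupported element")
}

tex <- to_tex(quote(pbinom(k, N, p) + 1))
cat(tex, "\n")
```

The sketch ignores operator precedence, parentheses and flags, which is precisely the bookkeeping that the Prolog rules take care of.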

*mathml* is an R package for pretty mathematical
representation of R functions and objects in data analysis, scientific
reports and interactive web content. The currently supported features
are listed below, roughly following the order proposed by Murrell and Ihaka (2000).

*mathml* handles the basic elements of everyday mathematical
expressions, such as numbers, Latin and Greek letters, multi-letter
identifiers, accents, subscripts, and superscripts.

```
term <- quote(1 + -2L + a + abc + "a" + phi + Phi + varphi + roof(b)[i, j]^2L)
mathout(term)
```

$1.00+-2+a+\mathrm{abc}+\text{a}+\phi +\Phi +\varphi +{\widehat{b}}_{i\text{}j}^{2}$

```
term <- quote(round(3.1415, 3L) + NaN + NA + TRUE + FALSE + Inf + (-Inf))
mathout(term)
```

$3.142+\mathrm{nan}+\mathrm{na}+T+F+\infty +\left(-\infty \right)$

An expression such as `1 + -2` may be considered
unsatisfactory from an aesthetic perspective. It is correct R syntax,
though, and is reproduced accordingly, without the parentheses.
Parentheses around negated numbers or symbols can be added as shown
above for `+ (-Inf)`.

To avoid name clashes with package *stats*,
`roof()` is used instead of `hat()` to put a hat
on a symbol (see next section for further decorations). Note that an
R function `roof()` does not exist in base R; it is provided
by the package for convenience and points to the identity function.

The package offers some support for different fonts as well as accents and boxes etc. Internally, these decorations are implemented as identity functions, so they can be introduced into R expressions without side-effects.
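That an identity function does not interfere with calculations can be seen in a small base R sketch. The function `my_roof()` below is a hypothetical stand-in defined only for this illustration; it is assumed to behave like the package's decorations in that it simply returns its argument.

```r
# Hypothetical stand-in for a decoration such as roof():
# it returns its argument unchanged
my_roof <- function(x) x

b <- c(2, 4, 6)
plain     <- eval(quote(mean(b) + 1))
decorated <- eval(quote(my_roof(mean(b)) + 1))

print(c(plain = plain, decorated = decorated))
```

Both calls yield the same number, so a decoration only changes how the term is displayed, not what it computes.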

```
term <- quote(bold(b[x, 5L]) + bold(b[italic(x)]) + italic(ab) + italic(42L))
mathout(term)
```

${\mathbf{b}}_{\mathbf{x}\text{}5}+{\mathbf{b}}_{\mathit{x}}+\mathit{ab}+42$

```
term <- quote(tilde(a) + mean(X) + boxed(c) + cancel(d) + phantom(e) + prime(f))
mathout(term)
```

$\tilde{a}+\overline{X}+\boxed{c}+\cancel{d}+\phantom{e}+{f}^{\prime}$

Note that the font styles only affect the display of identifiers, whereas numbers, character strings etc. are left untouched.

Arithmetic operators and parentheses are translated as they are, as illustrated below.

```
term <- quote(a - ((b + c)) - d*e + f*(g + h) + i/j + k^(l + m) + (n*o)^{p + q})
mathout(term)
```

$a-\left[\left(b+c\right)\right]-de+f\cdot \left(g+h\right)+i/j+{k}^{\left(l+m\right)}+{\left(no\right)}^{p+q}$

```
term <- quote(dot(a, b) + frac(1L, nodot(c, d + e)) + dfrac(1L, times(g, h)))
mathout(term)
```

$a\cdot b+\frac{1}{c\left(d+e\right)}+{\displaystyle \frac{1}{g\times h}}$

For multiplications involving only numbers and symbols, the
multiplication sign is omitted. This heuristic does not always produce
the desired result; therefore, *mathml* defines alternative
R functions `dot()`, `nodot()`, and
`times()`. These functions calculate a product and produce
the respective multiplication signs. Similarly, `frac()` and
`dfrac()` can be used for small and large fractions.

For standard operators with known precedence, *mathml* is
generally able to detect if parentheses are needed; for example,
parentheses are automatically placed around `d + e` in the
`nodot`-example. However, we note unnecessary parentheses
around `l + m` above. These parentheses are a consequence of
`quote(a^(b + c))` actually producing a nested R call of the
form `'^'(a, (b + c))` instead of
`'^'(a, b + c)`:

```
term <- quote(a^(b + c))
paste(term)
```

`## [1] "^" "a" "(b + c)"`

For the present purpose, this feature is unfortunate because extra
parentheses around `b + c` are not needed. The preferred
result is obtained by using the functional form
`quote('^'(k, l + m))` of the power, or curly braces as a
workaround (see `p + q` above).
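The nesting can be inspected directly in base R. In the following sketch (the variable names are chosen here for illustration), the third element of each call is the exponent, and its first element is the "head" of that sub-call: a parenthesis, the plus operator, or a curly brace.

```r
with_paren <- quote(a^(b + c))
functional <- quote(`^`(a, b + c))
with_brace <- quote(a^{b + c})

# Head of the exponent in each of the three variants
heads <- c(paren      = as.character(with_paren[[3]][[1]]),
           functional = as.character(functional[[3]][[1]]),
           brace      = as.character(with_brace[[3]][[1]]))
print(heads)

# All three variants evaluate to the same number
env <- list(a = 2, b = 1, c = 2)
values <- c(eval(with_paren, env), eval(functional, env), eval(with_brace, env))
print(values)
```

The three terms differ only in their syntax tree, not in their value, which is why a renderer can treat the parenthesis and the brace differently without affecting calculations.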

Whereas in standard infix operators, the parentheses typically follow the rules for precedence, undesirable results may be obtained in custom operators.

```
term <- quote(mean(X) %+-% 1.96 * s / sqrt(N))
mathout(term)
```

$\left(\overline{X}\pm 1.96\right)\cdot s/\sqrt{N}$

```
term <- quote('%+-%'(mean(X), 1.96 * s / sqrt(N))) # functional form of '%+-%'
term <- quote(mean(X) %+-% {1.96 * s / sqrt(N)}) # the same
mathout(term)
```

$\overline{X}\pm 1.96s/\sqrt{N}$

The example is a reminder that it is not possible to define the precedence of custom operators in R: all `%...%` operators share a single precedence level that is higher than that of `*` and `/`, and they are evaluated from left to right. Again, the problem can be worked around by the functional form of the operator, or a curly brace to hide the parenthesis but enforce the correct operator precedence.
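The way R parses such an expression can be verified without rendering anything. In this sketch, the operator `%+-%` is given a hypothetical definition that returns the lower and upper bound of an interval (an assumption made here for illustration), and the symbols `m`, `z` and `s` are placeholders.

```r
# Hypothetical definition of a plus-minus operator, for illustration only
`%+-%` <- function(m, e) c(lower = m - e, upper = m + e)

infix  <- quote(m %+-% z * s)     # parsed as (m %+-% z) * s
braced <- quote(m %+-% {z * s})   # the brace keeps z * s together

# The outermost function of each parsed expression
top_infix  <- as.character(infix[[1]])
top_braced <- as.character(braced[[1]])
print(c(top_infix, top_braced))

vals  <- list(m = 10, z = 1.96, s = 2)
wrong <- eval(infix, vals)    # (10 -/+ 1.96) * 2
right <- eval(braced, vals)   # 10 -/+ (1.96 * 2)
print(rbind(wrong, right))
```

The unbraced version therefore yields not only undesired typesetting but also a different numerical result, so the workaround matters for calculations as well.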

More operators are shown in Table 1, including the suggestions by Murrell and Ihaka (2000) for graphical annotations and arrows in R figures.

Operator | Output | Operator | Output | Operator | Arrow |
---|---|---|---|---|---|
A %*% B | $A\times B$ | A != B | $A\ne B$ | A %<->% B | $A\leftrightarrow B$ |
A %.% B | $A\cdot B$ | A ~ B | $A\sim B$ | A %->% B | $A\to B$ |
A %x% B | $A\otimes B$ | A %~~% B | $A\approx B$ | A %<-% B | $A\leftarrow B$ |
A %/% B | $\lfloor A/B\rfloor $ | A %==% B | $A\equiv B$ | A %up% B | $A\uparrow B$ |
A %% B | $\mathrm{mod}\left(A,B\right)$ | A %=~% B | $A\cong B$ | A %down% B | $A\downarrow B$ |
A & B | $A\wedge B$ | A %prop% B | $A\propto B$ | A %<=>% B | $A\iff B$ |
A \| B | $A\vee B$ | A %in% B | $A\in B$ | A %=>% B | $A\Rightarrow B$ |
xor(A, B) | $A\veebar B$ | intersect(A, B) | $A\cap B$ | A %<=% B | $A\Leftarrow B$ |
!A | $\neg A$ | union(A, B) | $A\cup B$ | A %dblup% B | $A\Uparrow B$ |
A == B | $A=B$ | crossprod(A, B) | ${A}^{\text{T}}\times B$ | A %dbldown% B | $A\Downarrow B$ |
A <- B | $A=B$ | is.null(A) | $A=\varnothing $ |  |  |

There is support for most functions from package *base*, with
adequate use and omission of parentheses.

```
term <- quote(sin(x) + sin(x)^2L + cos(pi/2L) + tan(2L*pi) * expm1(x))
mathout(term)
```

$\mathrm{sin}x+{\left(\mathrm{sin}x\right)}^{2}+\mathrm{cos}\left(\pi /2\right)+\mathrm{tan}\left(2\pi \right)\cdot \left(\mathrm{exp}x-1\right)$

```
term <- quote(choose(N, k) + abs(x) + sqrt(x) + floor(x) + exp(frac(x, y)))
mathout(term)
```

$\left(\genfrac{}{}{0ex}{}{N}{k}\right)+\left|x\right|+\sqrt{x}+\lfloor x\rfloor +\mathrm{exp}\left(\frac{x}{y}\right)$

A few more examples are shown in Table 2, including functions from
*stats*.

Function | Output | Function | Output |
---|---|---|---|
sin(x) | $\mathrm{sin}x$ | dbinom(k, N, pi) | ${P}_{\text{Bi}}\left(X=k;N,\pi \right)$ |
cosh(x) | $\mathrm{cosh}x$ | pbinom(k, N, pi) | ${P}_{\text{Bi}}\left(X\le k;N,\pi \right)$ |
tanpi(alpha) | $\mathrm{tan}\left(\alpha \pi \right)$ | qbinom(p, N, pi) | ${\mathrm{argmin}}_{k}\left[{P}_{\text{Bi}}\left(X\le k;N,\pi \right)>p\right]$ |
asinh(x) | ${\mathrm{sinh}}^{-1}x$ | dpois(k, lambda) | ${P}_{\text{Po}}\left(X=k;\lambda \right)$ |
log(p) | $\mathrm{log}p$ | ppois(k, lambda) | ${P}_{\text{Po}}\left(X\le k;\lambda \right)$ |
log1p(x) | $\mathrm{log}\left(1+x\right)$ | qpois(p, lambda) | ${\mathrm{argmax}}_{k}\left[{P}_{\text{Po}}\left(X\le k;\lambda \right)>p\right]$ |
logb(x, e) | ${\mathrm{log}}_{e}x$ | dexp(x, lambda) | ${f}_{\text{Exp}}\left(x;\lambda \right)$ |
exp(x) | $\mathrm{exp}x$ | pexp(x, lambda) | ${F}_{\text{Exp}}\left(x;\lambda \right)$ |
expm1(x) | $\mathrm{exp}x-1$ | qexp(p, lambda) | ${F}_{\text{Exp}}^{-1}\left(p;\lambda \right)$ |
choose(n, k) | $\left(\genfrac{}{}{0ex}{}{n}{k}\right)$ | dnorm(x, mu, sigma) | $\phi \left(x;\mu ,\sigma \right)$ |
lchoose(n, k) | $\mathrm{log}\left(\genfrac{}{}{0ex}{}{n}{k}\right)$ | pnorm(x, mu, sigma) | $\Phi \left(x;\mu ,\sigma \right)$ |
factorial(n) | $n!$ | qnorm(alpha/2L) | ${\Phi}^{-1}\left(\alpha /2\right)$ |
lfactorial(n) | $\mathrm{log}n!$ | 1L - pchisq(x, 1L) | $1-{F}_{{\chi}^{2}\left(1\phantom{\rule{thinmathspace}{0ex}}\text{df}\right)}\left(x\right)$ |
sqrt(x) | $\sqrt{x}$ | qchisq(1L - alpha, 1L) | ${F}_{{\chi}^{2}\left(1\phantom{\rule{thinmathspace}{0ex}}\text{df}\right)}^{-1}\left(1-\alpha \right)$ |
mean(X) | $\overline{X}$ | pt(t, N - 1L) | $P\left(T\le t;N-1\phantom{\rule{thinmathspace}{0ex}}\text{df}\right)$ |
abs(x) | $\left|x\right|$ | qt(alpha/2L, N - 1L) | ${T}_{\alpha /2}\left(N-1\phantom{\rule{thinmathspace}{0ex}}\text{df}\right)$ |

For self-written functions, matters are a bit more complicated. For a
function such as `g <- function(...) ...`, the name
*g* is not transparent to R, because only the function body is
represented, as illustrated by the redefinition of `sign` below.

```
sgn <- function(x)
{
  if(x == 0L) return(0L)
  if(x < 0L) return(-1L)
  if(x > 0L) return(1L)
}
mathout(sgn)
```

$\left\{\begin{array}{l}0\text{, if }x=0\\ -1\text{, if }x<0\\ 1\text{, if }x>0\end{array}\right.$

`mathout(call("<-", quote(sgn(x)), sgn))`

$\mathrm{sgn}x=\left\{\begin{array}{l}0\text{, if }x=0\\ -1\text{, if }x<0\\ 1\text{, if }x>0\end{array}\right.$

As shown in the example, we can still display functions in the form
`head(x) = body` if we embed the object to be shown into a
call `"<-"(head, body)`.
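Why the name is lost can be checked in base R: a function object consists of its formal arguments and its body, and the symbol it happens to be assigned to is not stored in it. The sketch below uses illustrative variable names (`arg_names`, `mentions_name`, `eq`) that are not part of the package.

```r
sgn <- function(x) {
  if (x == 0L) return(0L)
  if (x < 0L) return(-1L)
  if (x > 0L) return(1L)
}

# The function object knows its arguments and its body ...
arg_names <- names(formals(sgn))

# ... but the name "sgn" does not occur anywhere in it
mentions_name <- any(grepl("sgn", deparse(sgn), fixed = TRUE))

# Pairing the head sgn(x) with the function object, as in the text
eq <- call("<-", quote(sgn(x)), sgn)

print(arg_names)
print(mentions_name)
print(eq[[2]])
```

The explicit call to `"<-"` thus supplies the piece of information (the head) that the function object itself cannot provide.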

The function body is generally a nested R call of the form
`'{'(L)`, with `L` being a list of commands (the
semicolon, not necessary in R, is translated to a newline). As
illustrated in the example, *mathml* provides limited support for
control structures such as `if`.

Indices in square brackets are rendered as subscripts, powers are
rendered as superscripts. Moreover, *mathml* defines the functions
`sum_over(x, from, to)` and
`prod_over(x, from, to)` that simply return their first
argument. The other two arguments serve as decorations (*to* is
optional), for example, for summation and product signs.

```
term <- quote(S[Y]^2L <- frac(1L, N) * sum(Y[i] - mean(Y))^2L)
mathout(term)
```

${S}_{Y}^{2}=\frac{1}{N}\cdot {\sum \left({Y}_{i}-\overline{Y}\right)}^{2}$

```
term <- quote(log(prod_over(L[i], i==1L, N)) <- sum_over(log(L[i]), i==1L, N))
mathout(term)
```

$\mathrm{log}{\prod}_{i=1}^{N}{L}_{i}={\sum}_{i=1}^{N}\mathrm{log}{L}_{i}$

R's `integrate` function takes a number of arguments, the
most important ones being the function to integrate, and the lower and
the upper bound of the integration.

```
term <- quote(integrate(sin, 0L, 2L*pi))
mathout(term)
```

$\underset{0}{\overset{2\pi}{\int}}\mathrm{sin}x\phantom{\rule{thinmathspace}{0ex}}dx$

`eval(term)`

`## 2.221501e-16 with absolute error < 4.4e-14`

For mathematical typesetting in the form of \(\int f(x)\, dx\), *mathml* needs to
find out the name of the integration variable. For that purpose, the
underlying Prolog bridge provides a predicate `r_eval/3` that
calls R from Prolog. In the example above, this predicate evaluates
`formalArgs(args(sin))`, which returns the names of the
arguments of `sin`, namely, `x`.
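The lookup of the argument name can be reproduced directly in R. The sketch below calls `formalArgs()` with an explicit `methods::` prefix so that it also works when the *methods* package is not attached; the equivalent base R idiom with `formals()` is shown for comparison.

```r
# Name of the (single) argument of sin, as used for the integration variable
arg_sin <- methods::formalArgs(args(sin))

# Equivalent idiom using formals() directly, here for exp
arg_exp <- names(formals(args(exp)))

print(arg_sin)
print(arg_exp)
```

Note that `sin` and `exp` are primitive functions, which is why `args()` is needed to obtain a closure with inspectable formal arguments.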

Note that in the example above, the quoted term is an abbreviation
for `call("integrate", quote(sin), ...)`, with
`sin` being an R symbol, not a function. While the R function
`integrate()` can handle both symbols and functions,
*mathml* needs the symbol because it is unable to determine the
function name of custom functions.

One of R’s great features is the possibility to refer to function
arguments by their names, not only by their position in the list of
arguments. At the other end, Prolog does not have such a feature.
Therefore, the Prolog handlers for R calls are rather rigid. For
example, `integrate/3` accepts exactly three arguments in a
particular order and without names, so that
`integrate(lower=0L, upper=2L*pi, sin)` would not print the
desired result.

To “canonicalize” function calls with named arguments and arguments
in unusual order, *mathml* provides an auxiliary R function
`canonical(f, drop)` that reorders the argument list of calls
to known R functions and, if `drop=TRUE` (which is the
default), also removes the names of the arguments.

```
term <- quote(integrate(lower=0L, upper=2L*pi, sin))
canonical(term)
```

`## integrate(sin, 0L, 2L * pi)`

`mathout(canonical(term))`

$\underset{0}{\overset{2\pi}{\int}}\mathrm{sin}x\phantom{\rule{thinmathspace}{0ex}}dx$

This function can be used to feed mixtures of partially named and
positional arguments into the renderer. For details, see the R function
`match.call()`.
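The reordering step can be sketched with `match.call()` alone. The following is an approximation of the idea in base R and is not claimed to be the implementation of `canonical()`; the variable names `matched` and `canon` are chosen here for illustration.

```r
term <- quote(integrate(lower = 0L, upper = 2L * pi, sin))

# Match the arguments against the definition of integrate();
# this reorders them and attaches the full argument names
matched <- match.call(integrate, term)
print(matched)

# Drop the argument names to obtain a purely positional call
canon <- as.call(unname(as.list(matched)))
print(canon)

# The canonical call is still an ordinary R call and can be evaluated
res <- eval(canon)
print(res$value)
```

Because `match.call()` needs the function definition, this approach only works for functions that are visible in the current R session.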

Of course, *mathml* also supports matrices and vectors.

```
v <- 1:3
mathout(call("t", v))
```

${\left(1\phantom{\rule{thinmathspace}{0ex}}2\phantom{\rule{thinmathspace}{0ex}}3\right)}^{\text{T}}$

```
A <- matrix(data=11:16, nrow=2, ncol=3)
B <- matrix(data=21:26, nrow=2, ncol=3)
term <- call("+", A, B)
mathout(term)
```

$\left(\begin{array}{lll}11& 13& 15\\ 12& 14& 16\end{array}\right)+\left(\begin{array}{lll}21& 23& 25\\ 22& 24& 26\end{array}\right)$

Note that the seemingly more convenient
`term <- quote(A + B)` yields \(A + B\) in the output instead of the
desired matrix representation. This behavior is expected because
quotation of R calls also quotes the components of the call (here,
*A* and *B*).
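The difference between quoting the names and embedding the values can be made visible in base R. As a further option that is not mentioned in the text, `bquote()` splices the values of selected objects into an otherwise quoted expression; the variable names below are chosen for illustration.

```r
A <- matrix(data = 11:16, nrow = 2, ncol = 3)
B <- matrix(data = 21:26, nrow = 2, ncol = 3)

by_name  <- quote(A + B)          # components are the symbols A and B
by_value <- call("+", A, B)       # components are the matrices themselves
spliced  <- bquote(.(A) + .(B))   # values spliced into a quoted term

print(class(by_name[[2]]))    # "name"
print(is.matrix(by_value[[2]]))
print(is.matrix(spliced[[2]]))
```

All three terms evaluate to the same sum; they differ only in what a renderer gets to see when it walks through the call.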

In typical R functions, variable names are longer than just single letters, which may yield unsatisfactory results in the mathematical output.

```
hook(successes, k)
hook(quote(Ntotal), quote(N), quote=FALSE)
hook(prob, pi)
term <- quote(dbinom(successes, Ntotal, prob))
mathout(term)
```

${P}_{\text{Bi}}\left(X=k;N,\pi \right)$

To improve the situation, *mathml* provides a simple hook that
can be used to replace elements (e.g., verbose variable names) of the
code by concise mathematical symbols, as illustrated in the example. To
simplify notation, the `quote` flag of `hook()`
defaults to TRUE, and `hook()` uses non-standard evaluation
to unpack its arguments. If quote is FALSE, as shown above, the user has
to provide the quoted expressions. Care should be taken to avoid
recursive hooks such as `hook(s, s["A"])` that endlessly
replace the \(s\) from \(s_{\mathrm{A}}\) as in \(s_{\mathrm{A}_{\mathrm{A}_{\mathrm{A}\cdots}}}\).
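The effect of such simple hooks resembles a symbol substitution, which base R offers via `substitute()`. The sketch below is only an analogy under that assumption: the hooks of *mathml* are applied at the Prolog end during rendering, whereas `substitute()` rewrites the R call itself, and it does so in a single pass.

```r
term <- quote(dbinom(successes, Ntotal, prob))

# Replacement table: verbose names on the left, short symbols on the right
renames <- list(successes = quote(k), Ntotal = quote(N), prob = quote(pi))

# do.call() is used so that substitute() sees the call stored in `term`
short <- do.call(substitute, list(term, renames))
print(short)

# A single pass replaces s once and does not descend into the replacement
once <- do.call(substitute, list(quote(s + 1), list(s = quote(s["A"]))))
print(once)
```

The second example shows why a one-pass substitution is immune to the endless replacement described above, in contrast to rules that are applied repeatedly.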

The hooks can also be used for more complex elements such as R calls, with dotted symbols representing Prolog variables.

```
hook(pbinom(.K, .N, .P), sum_over(dbinom(i, .N, .P), i=0L, .K))
mathout(term)
```

${P}_{\text{Bi}}\left(X=k;N,\pi \right)$

We consider the \(t\)-statistic for
independent samples with equal variance. To avoid clutter in the
equation, the pooled variance \(s^2_{\mathrm{pool}}\) is abbreviated, and a
comment is given with the expression for \(s^2_{\mathrm{pool}}\). For this purpose,
*mathml* provides a function
`denote(abbr, expr, info)`, with `expr` actually
being evaluated, `abbr` being rendered, plus a comment of the
form “with `expr` denoting `info`”.

```
hook(m_A, mean(X)["A"]) ; hook(s2_A, s["A"]^2L)
hook(n_A, n["A"])
hook(m_B, mean(X)["B"]) ; hook(s2_B, s["B"]^2L)
hook(n_B, n["B"]) ; hook(s2_p, s["pool"]^2L)
term <- quote(t <- dfrac(m_A - m_B,
  sqrt(denote(s2_p, frac((n_A - 1L)*s2_A + (n_B - 1L)*s2_B, n_A + n_B - 2L),
    "the pooled variance.") * (frac(1L, n_A) + frac(1L, n_B)))))
mathout(term)
```

$t={\displaystyle \frac{{\overline{X}}_{\text{A}}-{\overline{X}}_{\text{B}}}{\sqrt{{s}_{\text{pool}}^{2}\cdot \left(\frac{1}{{n}_{\text{A}}}+\frac{1}{{n}_{\text{B}}}\right)}}}$, with ${s}_{\text{pool}}^{2}=\frac{\left({n}_{\text{A}}-1\right)\cdot {s}_{\text{A}}^{2}+\left({n}_{\text{B}}-1\right)\cdot {s}_{\text{B}}^{2}}{{n}_{\text{A}}+{n}_{\text{B}}-2}$ denoting the pooled variance.

The term is evaluated below. `print()` is needed because
the return value of an assignment of the form
`t <- dfrac(...)` is not visible in R.

```
m_A <- 1.5; s2_A <- 2.4^2; n_A <- 27; m_B <- 3.9; s2_B <- 2.8^2; n_B <- 20
print(eval(term))
```

`## [1] -3.157427`
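The result can be cross-checked with plain arithmetic, without any of the helper functions from the package. The sketch below simply spells out the pooled variance and the \(t\)-ratio from the equation above; `t_stat` is a name chosen here to avoid masking the base function `t()`.

```r
m_A <- 1.5; s2_A <- 2.4^2; n_A <- 27
m_B <- 3.9; s2_B <- 2.8^2; n_B <- 20

# Pooled variance, weighted by the degrees of freedom of each sample
s2_p <- ((n_A - 1) * s2_A + (n_B - 1) * s2_B) / (n_A + n_B - 2)

# t-statistic for independent samples with equal variance
t_stat <- (m_A - m_B) / sqrt(s2_p * (1 / n_A + 1 / n_B))
print(t_stat)
```

Agreement between the two computations is what one would expect if the rendered term and the evaluated term are indeed one and the same R object.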

Consider an educational scenario in which we want to highlight a certain element of a term, for example, that a student has forgotten to subtract the null hypothesis in a \(t\)-ratio:

```
t <- quote(dfrac(omit_right(mean(D) - mu[0L]), s / sqrt(N)))
mathout(t, flags=list(error="highlight"))
```

$\frac{\overline{D}\phantom{\rule{thinmathspace}{0ex}}\overline{)-\phantom{\rule{thinmathspace}{0ex}}{\mu}_{0}}}{s/\sqrt{N}}$

`mathout(t, flags=list(error="fix"))`

$\frac{\overline{D}\phantom{\rule{thinmathspace}{0ex}}\overline{)-\phantom{\rule{thinmathspace}{0ex}}{\mu}_{0}}}{s/\sqrt{N}}$

The R function `omit_right(a + b)` uses non-standard
evaluation techniques (e.g., Wickham 2019)
to return only the left part of an operation, and cancels the right part.
This may not always be desired, for example, when illustrating how to
fix the mistake.
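How a function can return only the left part of an operation is sketched below with base R tools. The function `omit_right_sketch()` is hypothetical and written for this illustration only; it is not the implementation used in *mathml*, but it relies on the same kind of non-standard evaluation.

```r
# Hypothetical re-implementation for illustration (not the package's code)
omit_right_sketch <- function(expr) {
  op <- substitute(expr)                     # unevaluated call, e.g. a + b
  stopifnot(is.call(op), length(op) == 3L)   # expects a binary operation
  eval(op[[2]], parent.frame())              # evaluate the left operand only
}

a <- 3
b <- 4
res <- omit_right_sketch(a + b)
print(res)
```

Since the right operand is never evaluated, the "mistake" is reproduced in the calculation as well, which is what the educational scenario calls for.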

For this purpose, the functions `mathml()` or
`mathjax()` have an optional argument `flags`
which is a list with named elements. In this example, we use this
argument to tell *mathml* how to render such erroneous
expressions using the flag `error` which is one of asis,
highlight, fix, or ignore. For more examples, see Table 3.

Operation | error = asis | highlight | fix | ignore |
---|---|---|---|---|
omit_left(a + b) | $b$ | $\overline{)a\phantom{\rule{thinmathspace}{0ex}}+}\phantom{\rule{thinmathspace}{0ex}}b$ | $\overline{)a\phantom{\rule{thinmathspace}{0ex}}+}\phantom{\rule{thinmathspace}{0ex}}b$ | $a+b$ |
omit_right(a + b) | $a$ | $a\phantom{\rule{thinmathspace}{0ex}}\overline{)+\phantom{\rule{thinmathspace}{0ex}}b}$ | $a\phantom{\rule{thinmathspace}{0ex}}\overline{)+\phantom{\rule{thinmathspace}{0ex}}b}$ | $a+b$ |
list(quote(a), quote(omit(b))) | $a\phantom{\rule{thinmathspace}{0ex}}\text{}$ | $a\phantom{\rule{thinmathspace}{0ex}}\overline{)b}$ | $a\phantom{\rule{thinmathspace}{0ex}}\overline{)b}$ | $a\phantom{\rule{thinmathspace}{0ex}}b$ |
add_left(a + b) | $a+b$ | $\overline{)a\phantom{\rule{thinmathspace}{0ex}}+}\phantom{\rule{thinmathspace}{0ex}}b$ | $\overline{)a\phantom{\rule{thinmathspace}{0ex}}+}\phantom{\rule{thinmathspace}{0ex}}b$ | $b$ |
add_right(a + b) | $a+b$ | $a\phantom{\rule{thinmathspace}{0ex}}\overline{)+\phantom{\rule{thinmathspace}{0ex}}b}$ | $a\phantom{\rule{thinmathspace}{0ex}}\overline{)+\phantom{\rule{thinmathspace}{0ex}}b}$ | $a$ |
list(quote(a), quote(add(b))) | $a\phantom{\rule{thinmathspace}{0ex}}b$ | $a\phantom{\rule{thinmathspace}{0ex}}\overline{)b}$ | $a\phantom{\rule{thinmathspace}{0ex}}\overline{)b}$ | $a\phantom{\rule{thinmathspace}{0ex}}\text{}$ |
instead(a, b) + c | $a+c$ | $\underbrace{a}_{\text{instead of }b}+c$ | $\overline{)b}+c$ | $b+c$ |

Further customization requires the assertion of new Prolog rules
`math/2`, `ml/3`, `jax/3`, as shown in
the Appendix.

This package allows R to render its terms in pretty mathematical
equations. It extends the current features of R and existing packages
for displaying mathematical formulas in R (Murrell and Ihaka 2000; Allaire et al. 2018),
but most importantly, *mathml* bridges the gap between
computational needs, presentation of results, and their reproducibility.
The package supports both MathML and LaTeX/MathJax for use in RMarkdown
documents, presentations and ShinyApp webpages.

Researchers or teachers can already use RMarkdown to conduct analyses
and show results, and *mathml* smooths this process and allows
for integrated calculations and output. As shown in the case study of
the previous section, *mathml* can help to improve data analyses
and statistical reports from an aesthetic perspective, as well as
regarding reproducibility of research.

Furthermore, the package may also allow for a better detection of possible mistakes in R programs. Similar to most programming languages (Green 1977), R code is notoriously hard to read, and the poor legibility of the language is one of the main sources of mistakes. For illustration, we consider a complicated equation (e.g., Schwarz 1994).

```
f1 <- function(tau)
{ dfrac(c, mu["A"]) + (dfrac(1L, mu["A"]) - dfrac(1L, mu["A"] + mu["B"]) *
    ((mu["A"]*tau - c) * pnorm(dfrac(c - mu["A"]*tau, sqrt(sigma["A"]^2L*tau)))
      - (mu["A"]*tau + c) * exp(dfrac(2L*mu["A"]*tau, sigma["A"]^2L))
        * pnorm(dfrac(-c - mu["A"]*tau, sqrt(sigma["A"]^2L*tau)))))
}
mathout(f1)
```

${\displaystyle \frac{c}{{\mu}_{\text{A}}}}+\left\{{\displaystyle \frac{1}{{\mu}_{\text{A}}}}-{\displaystyle \frac{1}{{\mu}_{\text{A}}+{\mu}_{\text{B}}}}\cdot \left[\left({\mu}_{\text{A}}\tau -c\right)\cdot \Phi \left({\displaystyle \frac{c-{\mu}_{\text{A}}\tau}{\sqrt{{\sigma}_{\text{A}}^{2}\tau}}}\right)-\left({\mu}_{\text{A}}\tau +c\right)\cdot \mathrm{exp}\left({\displaystyle \frac{2{\mu}_{\text{A}}\tau}{{\sigma}_{\text{A}}^{2}}}\right)\cdot \Phi \left({\displaystyle \frac{-c-{\mu}_{\text{A}}\tau}{\sqrt{{\sigma}_{\text{A}}^{2}\tau}}}\right)\right]\right\}$

The first version has a wrong parenthesis, which is barely visible in the code, whereas in the mathematical representation, the wrong curly brace is immediately obvious (the correct version is shown below for comparison).

```
f2 <- function(tau)
{ dfrac(c, mu["A"]) + (dfrac(1L, mu["A"]) - dfrac(1L, mu["A"] + mu["B"])) *
    ((mu["A"]*tau - c) * pnorm(dfrac(c - mu["A"]*tau, sqrt(sigma["A"]^2L*tau)))
      - (mu["A"]*tau + c) * exp(dfrac(2L*mu["A"]*tau, sigma["A"]^2L))
        * pnorm(dfrac(-c - mu["A"]*tau, sqrt(sigma["A"]^2L*tau))))
}
mathout(f2)
```

${\displaystyle \frac{c}{{\mu}_{\text{A}}}}+\left({\displaystyle \frac{1}{{\mu}_{\text{A}}}}-{\displaystyle \frac{1}{{\mu}_{\text{A}}+{\mu}_{\text{B}}}}\right)\cdot \left[\left({\mu}_{\text{A}}\tau -c\right)\cdot \Phi \left({\displaystyle \frac{c-{\mu}_{\text{A}}\tau}{\sqrt{{\sigma}_{\text{A}}^{2}\tau}}}\right)-\left({\mu}_{\text{A}}\tau +c\right)\cdot \mathrm{exp}\left({\displaystyle \frac{2{\mu}_{\text{A}}\tau}{{\sigma}_{\text{A}}^{2}}}\right)\cdot \Phi \left({\displaystyle \frac{-c-{\mu}_{\text{A}}\tau}{\sqrt{{\sigma}_{\text{A}}^{2}\tau}}}\right)\right]$

As the reader may know from their own experience, missed parentheses are frequent causes of wrong results and errors that are hard to locate in programming code. This particular example shows that mathematical rendering can help to substantially reduce the number of careless errors in programming.

In its current version, *mathml* has some limitations. For
example, it is currently not possible to use the functions of
*mathml* for writing inline formulas; here, the user has to adopt
the usual LaTeX notation. Moreover, for long equations, we did not yet
find a convenient way to insert line breaks. This is mostly due to
lacking support by MathML and LaTeX renderers. For example, in its
current stage, the LaTeX package *breqn* (Robertson et al. 2021) is mostly a proof of
concept.

The package *mathml* is available for R version (…) and later,
and can be easily installed using the usual
`install.packages("mathml")`. The source code of the package
is found at https://github.com/mgondan/mathml.

Several ways exist for translating new R terms to their mathematical representation. We have already seen above how to use “hooks” to translate long variable names from R to compact mathematical signs, as well as functions such as cumulative probabilities \(P(X \le k)\) to different representations like \(\sum_{i=0}^k P(X = i)\). Obviously, the hooks require that there already exists a rule to translate the target representation into MathML and MathJax.

In this appendix we describe a few more ways to extend the set of
translations according to a user’s needs. As stated in the background
section, the Prolog end provides two classes of rules for translation,
macros `math/2,3,4` mirroring the R hooks mentioned above,
and the low-level predicates `ml/3` and `jax/3`
that create proper MathML and LaTeX terms.

To render the model equation of a linear model such as
`lm(EOT ~ T0 + Therapy, data=d)` in mathematical form, it is
sufficient to map the `Formula` in
`lm(Formula, Data)` to its respective equation. This can be done in
two ways, using either the hooks described above, or a new
`math/2` macro at the Prolog end.

`hook(lm(.Formula, .Data), .Formula)`

The hook is simple, but is a bit limited because only R’s tilde-form of linear models is shown, and it only works for a call with exactly two arguments.

Below is an example of how to build a linear equation of the form \(Y = b_0 + b_1X_1 + ...\) using the Prolog
macros from *mathml*.

```
math_hook(LM, M) :-
    compound(LM),
    LM =.. [lm, ~(Y, Sum) | _Tail],
    summands(Sum, Predictors),
    findall(subscript(b, X) * X, member(X, Predictors), Terms),
    summands(Model, Terms),
    M = (Y == subscript(b, 0) + Model + epsilon).
```

The predicate `summands/2` unpacks an expression
`A + B + C` to a list `[C, B, A]` and vice-versa
(see the file `lm.pl` for details).

```
rolog::consult(system.file(file.path("pl", "lm.pl"), package="mathml"))

term <- quote(lm(EOT ~ T0 + Therapy, data=d, na.action=na.fail))
mathout(term)
```

$\mathrm{EOT}={b}_{0}+{b}_{\mathrm{T0}}\mathrm{T0}+{b}_{\mathrm{Therapy}}\mathrm{Therapy}+\epsilon $

Base R does not provide a function like `cuberoot(x)` or
`nthroot(x, n)`, and the present package does not support the
respective representation. To obtain a cube root, a programmer would
typically type `x^(1/3)` or better `x^{1/3}` (see
the practice section for why the curly brace is preferred in an exponent),
resulting in \(x^{1/3}\) which may
still not match everyone’s taste. Here we describe the steps needed to
represent the \(n\)-th root as \(\sqrt[n]x\).

We assume that `nthroot(x, n)` is available in the current
namespace (manually defined, or from R package
*pracma*, Borchers 2022), so that the names of the
arguments and their order are accessible to `canonical()` if
needed. As we can see below, *mathml* uses a default
representation `name(arguments)` for such unknown
functions.

```
nthroot <- function(x, n)
  x^{1L/n}

term <- canonical(quote(nthroot(n=3L, 2L)))
mathout(term)
```

$\mathrm{nthroot}\left(2,3\right)$

A proper MathML term is obtained by `mlx/3` (the x in mlx
indicates that it is an extension and is prioritized over the default
`ml/3` rules). `mlx/3` recursively invokes `ml/3`
for translating the function arguments *X* and *N*, and
then constructs the correct MathML entity
`<mroot>...</mroot>`.

```
mlx(nthroot(X, N), M, Flags) :-
    ml(X, X1, Flags),
    ml(N, N1, Flags),
    M = mroot([X1, N1]).
```

The explicit unification `M = ...` in the last line serves
to avoid clutter in the head of `mlx/3`. The Prolog file
`nthroot.pl` also includes the respective rule for LaTeX and
can be consulted from the package folder via the underlying package
*rolog*.

```
rolog::consult(system.file(file.path("pl", "nthroot.pl"), package="mathml"))

term <- quote(nthroot(a * (b + c), 3L)^2L)
mathout(term)
```

${\left[\sqrt[3]{a\cdot \left(b+c\right)}\right]}^{2}$

```
term <- quote(a^(1L/3L) + a^{1L/3L} + a^(1.0/3L))
mathout(term)
```

$\sqrt[3]{a}+{a}^{1/3}+{a}^{\left(1.00/3\right)}$

The file `nthroot.pl` includes three more statements,
`precx/3` and `parenx/3`, as well as a
`math_hook/2` macro. The first sets the operator precedence
of the cubic root above the power, thereby putting parentheses around
nthroot in \((\sqrt[3]{\ldots})^2\).
The second tells the system to increase the counter of the parentheses
below the root, such that the outer parenthesis becomes a square
bracket.

The last rule maps powers like `a^(1L/3L)` to
`nthroot/2`, as shown in the first summand. Of course,
*mathml* is not a proper computer algebra system. As is
illustrated by the other terms in the sum, such macros are limited to
purely syntactical matching, and terms like `a^{1L/3L}` with
the curly brace or `a^(1.0/3L)` with a floating point number
in the numerator are not detected.
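The limits of purely syntactical matching can be demonstrated in R with `identical()`, which compares the structure of calls rather than their numerical value. The helper `inner()` below is hypothetical and only strips one level of round parentheses; it is meant as an illustration of the principle, not as a description of the Prolog macro.

```r
# Pattern to be detected in the exponent: integer 1 divided by integer 3
target <- quote(1L/3L)

exps <- list(quote(a^(1L/3L)), quote(a^{1L/3L}), quote(a^(1.0/3L)))

# Hypothetical helper: return the exponent, stripping round parentheses only
inner <- function(e) {
  x <- e[[3]]
  if (is.call(x) && identical(x[[1]], as.name("("))) x[[2]] else x
}

hits <- vapply(exps, function(e) identical(inner(e), target), logical(1))
print(hits)

# Numerically, all three exponents are the same
vals <- vapply(exps, function(e) eval(e, list(a = 8)), numeric(1))
print(vals)
```

Only the first term matches the pattern, although all three evaluate to the same number; recognizing the other two would require some form of algebraic normalization.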

Supported by the Erasmus+ program of the European Commission (2019-1-EE01-KA203-051708).

Allaire, JJ, Rich Iannone, Alison Presmanes Hill, and Yihui Xie. 2018.
“Distill for R Markdown.” https://rstudio.github.io/distill/.

Borchers, Hans W. 2022. *Pracma: Practical Numerical Math
Functions*. https://CRAN.R-project.org/package=pracma.

Chang, Winston, Joe Cheng, JJ Allaire, Carson Sievert, Barret Schloerke,
Yihui Xie, Jeff Allen, Jonathan McPherson, Alan Dipert, and Barbara
Borges. 2022. *Shiny: Web Application Framework for R*. https://CRAN.R-project.org/package=shiny.

Gondan, Matthias. 2022. *Rolog: Query ’SWI’-’Prolog’ from R*. https://github.com/mgondan/rolog.

Green, T. R. G. 1977. “Conditional Program Statements and Their
Comprehensibility to Professional Programmers.” *Journal of
Occupational Psychology* 50: 93–109.

Murrell, Paul, and Ross Ihaka. 2000. “An Approach to Providing
Mathematical Annotation in Plots.” *Journal of Computational
and Graphical Statistics* 9: 582–99.

R Core Team. 2022. *R: A Language and Environment for Statistical
Computing*. Vienna, Austria: R Foundation for Statistical Computing.
https://www.R-project.org/.

Robertson, Will, Joseph Wright, Frank Mittelbach, and Ulrike Fischer.
2021. *Breqn: Automatic Line Breaking of Displayed Equations*. https://www.ctan.org/pkg/breqn.

Sarkar, Deepayan, and Kurt Hornik. 2022. *Enhancements to
HTML Documentation*. https://blog.r-project.org/2022/04/08/enhancements-to-html-documentation/index.html.

Schwarz, Wolf. 1994. “Diffusion, Superposition, and the
Redundant-Targets Effect.” *Journal of Mathematical
Psychology* 38: 504–20.

Viechtbauer, Wolfgang. 2022. *Mathjaxr: Using ’Mathjax’ in Rd
Files*. https://CRAN.R-project.org/package=mathjaxr.

Wickham, H. 2019. *Advanced R*. Cambridge: Chapman
and Hall/CRC.

Wielemaker, Jan, Tom Schrijvers, Markus Triska, and Torbjörn Lager.
2012. “SWI-Prolog.” *Theory and Practice of
Logic Programming* 12 (1-2): 67–96.

Xie, Y., C. Dervieux, and E. Riederer. 2020. *R Markdown
Cookbook*. Cambridge: Chapman and Hall/CRC.