Using valaddin

Eugene Ha


valaddin is a lightweight R package that enables you to transform an existing function into a function with input validation checks. It does so without requiring you to modify the body of the function, in contrast to doing input validation using stop or stopifnot, and is therefore suitable for both programmatic and interactive use.

This document illustrates the use of valaddin by example. For usage details, see the main documentation page, ?firmly.

Use cases

The workhorse of valaddin is the function firmly, which applies input validation to a function, in situ. It can be used to:

Enforce types for arguments

For example, to require that all arguments of a function f are numerical, apply firmly with the check formula ~is.numeric (see note 1). The resulting function, call it ff, behaves just like f, but with a constraint on the type of its arguments.
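A minimal sketch of this use case, assuming valaddin is installed. The difference-quotient function f below is a hypothetical stand-in chosen for illustration; any function with numerical arguments would do.

```r
library(valaddin)

# Hypothetical example function: difference quotient of sin at x with step h
f <- function(x, h) (sin(x + h) - sin(x)) / h

# Require every argument of f to satisfy is.numeric
ff <- firmly(f, ~is.numeric)

# With valid inputs, ff returns what f returns
ff(0, 0.1)

# With a non-numeric input, ff stops with a validation error
result <- tryCatch(ff("0", 0.1), error = function(e) conditionMessage(e))
print(result)
```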

Enforce constraints on argument values

For example, use firmly to put a cap on potentially long-running computations:
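One way this might look, assuming valaddin is installed. The naive Fibonacci function fib, the name capped_fib, and the cap of 30 are illustrative assumptions.

```r
library(valaddin)

# Naive recursive Fibonacci: running time grows exponentially in n
fib <- function(n) {
  if (n <= 1L) return(1L)
  fib(n - 1) + fib(n - 2)
}

# Reject any call whose (rounded-up) argument exceeds 30
capped_fib <- firmly(fib, list("n capped at 30" ~ ceiling(n)) ~ {. <= 30L})

capped_fib(10)
#> [1] 89

# A call above the cap fails immediately, before any recursion starts
outcome <- tryCatch(capped_fib(50), error = function(e) conditionMessage(e))
print(outcome)
```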

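A value-constraining check formula has the form list(message ~ expression) ~ predicate, and the role of each part is evident: the expression is the quantity to check, the predicate is the condition it must satisfy, and the message is the error reported when it does not.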

Warn about pitfalls

If the default behavior of a function is problematic, or unexpected, you can use firmly to warn you. Consider the function as.POSIXct, which creates a date-time object, such as d, from a string like "2017-01-01 09:30:00".

The problem is that d is a potentially ambiguous object (with hidden state), because it is not explicitly assigned a time zone. If you compute the local hour of d using as.POSIXlt, you get an answer that interprets d according to your current time zone; another user (or you, in another country, in the future) may get a different result.

Sys.setenv(TZ = "EST")
d <- as.POSIXct("2017-01-01 09:30:00")
as.POSIXlt(d, tz = "EST")$hour
#> [1] 9
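The dependence on time zone becomes concrete when the same instant is read in a second zone. A small sketch in base R; the choice of UTC as the second zone is an illustrative assumption.

```r
# Create d without an explicit time zone, so it inherits the session's zone
Sys.setenv(TZ = "EST")
d <- as.POSIXct("2017-01-01 09:30:00")

# The same instant, read in two different time zones
hour_est <- as.POSIXlt(d, tz = "EST")$hour
hour_utc <- as.POSIXlt(d, tz = "UTC")$hour

hour_est
#> [1] 9
hour_utc
#> [1] 14
```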

To warn yourself about this pitfall, you can modify as.POSIXct to complain when you’ve forgotten to specify a time zone. Then, whenever you call as.POSIXct without a time zone, you get a cautionary reminder.
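A sketch of the modification, assuming valaddin is installed and that the .warn_missing argument of firmly names the arguments whose absence should trigger a warning. The warning is captured with tryCatch only so that its text can be inspected.

```r
library(valaddin)

# Mask as.POSIXct with a version that warns when `tz` is not supplied
as.POSIXct <- firmly(as.POSIXct, .warn_missing = "tz")

# Omitting `tz` now triggers a warning; capture its message
reminder <- tryCatch(
  as.POSIXct("2017-01-01 09:30:00"),
  warning = function(w) conditionMessage(w)
)
print(reminder)

# Supplying `tz` explicitly proceeds without a reminder
d <- as.POSIXct("2017-01-01 09:30:00", tz = "EST")
```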

NB: The missing-argument warning is implemented by wrapping functions. The underlying function base::as.POSIXct is called unmodified.

Use loosely to access the original function

Though reassigning as.POSIXct may seem risky, it is not, for the behavior is unchanged (aside from the extra precaution), and the original as.POSIXct remains accessible:

  • With a namespace prefix: base::as.POSIXct
  • By applying loosely to strip input validation: loosely(as.POSIXct)
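Both routes can be sketched as follows, assuming valaddin is installed and that loosely, with its default arguments, strips the missing-argument warning along with any checks. The name original is hypothetical.

```r
library(valaddin)

# Mask as.POSIXct with the cautious version
as.POSIXct <- firmly(as.POSIXct, .warn_missing = "tz")

# Route 1: bypass the mask with a namespace prefix
d1 <- base::as.POSIXct("2017-01-01 09:30:00", tz = "UTC")

# Route 2: strip the added behavior with loosely
original <- loosely(as.POSIXct)
d2 <- original("2017-01-01 09:30:00", tz = "UTC")
```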

Decline handouts

R tries to help you express your ideas as concisely as possible. Suppose you want to truncate negative values of a vector w:

ifelse assumes (correctly) that you intend the 0 to be repeated 5 times, and does that for you, automatically.
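For instance, in base R, with a hypothetical vector w of length 5:

```r
# Hypothetical data: a numeric vector of length 5
w <- c(-1.5, 2, 0.3, -0.7, 4)

# The scalar 0 is recycled to the length of w
truncated <- ifelse(w > 0, w, 0)
truncated
#> [1] 0.0 2.0 0.3 0.0 4.0
```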

Nonetheless, R’s good intentions have a darker side:

This smells like a coding error. Instead of complaining that pos is too short, ifelse recycles it to line it up with z. The result is probably not what you wanted.
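The recycling can be reproduced in base R with hypothetical vectors: z has length 6, while pos has only 5 elements.

```r
# Hypothetical inputs: pos is one element shorter than z and neg
z   <- rep(1, 6)
pos <- 1:5
neg <- -6:-1

# No error and no warning: pos is silently recycled, so the sixth value wraps to 1
recycled <- ifelse(z > 0, pos, neg)
recycled
#> [1] 1 2 3 4 5 1
```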

In this case, you don’t need a helping hand, but rather a firm one:

ifelse_f is more pedantic than ifelse. But it also spares you the consequences of invalid inputs:
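A sketch of such a firm version, assuming valaddin is installed. The two checks (equal length and equal type of yes and no) and their messages are illustrative assumptions about what ifelse_f enforces.

```r
library(valaddin)

# Require `yes` and `no` to agree in length and in type
chk_length_type <- list(
  "'yes', 'no' differ in length" ~ length(yes) == length(no),
  "'yes', 'no' differ in type"   ~ typeof(yes) == typeof(no)
) ~ isTRUE

ifelse_f <- firmly(ifelse, chk_length_type)

w <- c(-1.5, 2, 0.3, -0.7, 4)

# Pedantic: the scalar 0 is no longer recycled on your behalf
pedantic <- tryCatch(ifelse_f(w > 0, w, 0), error = function(e) conditionMessage(e))
print(pedantic)

# Spelling out the intent passes the checks
ifelse_f(w > 0, w, rep(0, 5))

# The mismatched inputs are now caught instead of recycled
z <- rep(1, 6); pos <- 1:5; neg <- -6:-1
caught <- tryCatch(ifelse_f(z > 0, pos, neg), error = function(e) conditionMessage(e))
print(caught)
```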

Reduce the risks of a lazy-evaluation style

When R makes a function call, say, f(a), the value of the argument a is not materialized in the body of f until it is actually needed. Usually, you can safely ignore this as a technicality of R’s evaluation model; but in some situations, it can be problematic if you’re not mindful of it.

Consider a bank that waives fees for students. A function to make deposits might look like this (see note 2):

Suppose Bob is an account holder, currently not in school:

If Bob were to deposit an amount to cover a future fee payment, his account balance would be updated to:

Bob goes back to school and informs the bank, so that his fees will be waived:

But now suppose that, somewhere in the bowels of the bank’s software, the type of Bob’s account object is converted from a list to an environment:

If Bob were to deposit an amount to cover a future fee payment, his account balance would now be updated to:

Becoming a student has cost Bob money. What happened to the amount deposited?

The culprit is lazy evaluation and the modify-in-place semantics of environments. In the call deposit(account = bobs_acct, value = bobs_acct$fee), the value of the argument value is only set when it’s used, which comes after the object fee in the environment bobs_acct has already been zeroed out.
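The effect can be reproduced in base R. The sketch below is a reconstruction under stated assumptions: the field names balance, fee and is_student, the helper is_student, and the amounts are all hypothetical.

```r
# Hypothetical helper: TRUE only if the account is flagged as a student account
is_student <- function(account) isTRUE(account$is_student)

# Deposit function that also waives fees for students
deposit <- function(account, value) {
  if (is_student(account)) {
    account$fee <- 0              # runs before `value` is evaluated
  }
  account$balance <- account$balance + value
  account
}

# Account as a list: `account$fee <- 0` changes only a local copy
bobs_list <- list(balance = 100, fee = 5, is_student = TRUE)
balance_list <- deposit(account = bobs_list, value = bobs_list$fee)$balance
balance_list
#> [1] 105

# Account as an environment: `account$fee <- 0` changes the shared object,
# so the lazily evaluated bobs_env$fee is already 0 when it is finally used
bobs_env <- list2env(list(balance = 100, fee = 5, is_student = TRUE))
balance_env <- deposit(account = bobs_env, value = bobs_env$fee)$balance
balance_env
#> [1] 100
```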

To minimize such risks, forbid account from being an environment:

This makes Bob a happy customer, and reduces the bank’s liability:
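A sketch of the safeguard, assuming valaddin is installed. The deposit function, the account fields and the error message are hypothetical, as in any reconstruction of this example.

```r
library(valaddin)

# Hypothetical helper and deposit function
is_student <- function(account) isTRUE(account$is_student)
deposit <- function(account, value) {
  if (is_student(account)) account$fee <- 0
  account$balance <- account$balance + value
  account
}

# Forbid `account` from being an environment
deposit <- firmly(
  deposit,
  list("`account` should not be an environment" ~ account) ~ Negate(is.environment)
)

# A list-based account is processed as before
bobs_list <- list(balance = 100, fee = 5, is_student = TRUE)
new_balance <- deposit(account = bobs_list, value = bobs_list$fee)$balance

# An environment-based account is rejected before any fee is zeroed out
bobs_env <- list2env(list(balance = 100, fee = 5, is_student = TRUE))
rejected <- tryCatch(
  deposit(account = bobs_env, value = bobs_env$fee),
  error = function(e) conditionMessage(e)
)
print(rejected)
```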

Prevent self-inflicted wounds

You don’t mean to shoot yourself, but sometimes it happens, nonetheless:
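A typical mishap, sketched in base R with a temporary file standing in for a precious data file (the object x and its contents are hypothetical):

```r
# Temporary file standing in for a valuable data file
path <- tempfile(fileext = ".rda")

x <- "An expensive object"
save(x, file = path)

# Later, a bug or lapse changes x, and the file is silently overwritten
x <- "Oops! A lapse has tarnished your expensive object"
save(x, file = path)

# Loading the file brings back the tarnished version, not the original
rm(x)
load(path)
x
```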

firmly can safeguard you from such mishaps: implement a safety procedure, gather your safety gear, then put it on.

Now save and load engage safety features that prevent you from inadvertently destroying your data:
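A simplified sketch of one such safety feature, assuming valaddin is installed. Rather than reassigning save itself, it creates a separate function under the hypothetical name safe_save, and the single check (refuse to write to an existing file) is an illustrative assumption.

```r
library(valaddin)

# Hypothetical safety check: refuse to write to a path that already exists
no_overwrite <- list("Won't overwrite `file`" ~ file) ~ Negate(file.exists)
safe_save <- firmly(save, no_overwrite)

path <- tempfile(fileext = ".rda")

x <- "An expensive object"
safe_save(x, file = path)     # the file does not exist yet, so this proceeds

# A second save to the same path is refused, so the first copy survives
x <- "Oops! A lapse has tarnished your expensive object"
outcome <- tryCatch(safe_save(x, file = path), error = function(e) "refused")
outcome
```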

NB: Input validation is implemented by wrapping functions; thus, if the arguments are valid, the underlying functions base::save, base::load are called unmodified.

Toolbox of input checkers

valaddin provides a collection of over 50 pre-made input checkers to facilitate typical kinds of argument checks. These checkers are prefixed by vld_, for convenient browsing and look-up in editors and IDEs that support name completion.

For example, to create a type-checked version of the function upper.tri, which returns an upper-triangular logical matrix, apply the checkers vld_matrix, vld_boolean (here “boolean” is shorthand for “logical vector of length 1”):

library(valaddin)

upper_tri <- firmly(upper.tri, vld_matrix(~x), vld_boolean(~diag))

# upper.tri assumes you mean a vector to be a column matrix
upper.tri(1:2)
#>       [,1]
#> [1,] FALSE
#> [2,] FALSE

# upper_tri insists on a matrix
upper_tri(1:2, diag = FALSE)
#> Error: upper_tri(x = 1:2, diag = FALSE)
#> Not matrix: x

# But say you actually meant (1, 2) to be a diagonal matrix
upper_tri(diag(1:2))
#>       [,1]  [,2]
#> [1,] FALSE  TRUE
#> [2,] FALSE FALSE

upper_tri(diag(1:2), diag = "true")
#> Error: upper_tri(x = diag(1:2), diag = "true")
#> Not boolean: diag

upper_tri(diag(1:2), TRUE)
#>       [,1] [,2]
#> [1,]  TRUE TRUE
#> [2,] FALSE TRUE

Check anything with vld_true

Any input validation can be expressed as an assertion that “such and such must be true”; to apply it as such, use vld_true (or its complement, vld_false).

For example, the above hardening of ifelse can be redone as:
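A sketch of that restatement, assuming valaddin is installed. The two conditions and their messages are illustrative assumptions about what the hardened ifelse should enforce.

```r
library(valaddin)

# Each formula asserts that its right-hand side must be TRUE
chk_length_type <- vld_true(
  "'yes', 'no' differ in length" ~ length(yes) == length(no),
  "'yes', 'no' differ in type"   ~ typeof(yes) == typeof(no)
)
ifelse_f <- firmly(ifelse, chk_length_type)

# Matching length and type: passes the checks
ok <- ifelse_f(c(TRUE, FALSE), c(1, 2), c(-1, -2))

# Mismatched lengths: rejected
bad <- tryCatch(ifelse_f(c(TRUE, FALSE), c(1, 2), 0), error = function(e) "rejected")
```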

Make your own input checker with localize

A check formula such as ~ is.numeric (or "Not number" ~ is.numeric, if you want a custom error message) imposes its condition “globally”:
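For instance, assuming valaddin is installed, a global check applies to every argument of the function. The function difference is a hypothetical example.

```r
library(valaddin)

# The check "Not number" ~ is.numeric is applied to both x and y
difference <- firmly(function(x, y) x - y, "Not number" ~ is.numeric)

difference(3, 1)
#> [1] 2

# Either argument failing is.numeric triggers the error
failure <- tryCatch(difference(3, "1"), error = function(e) conditionMessage(e))
print(failure)
```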

With localize, you can concentrate a globally applied check formula to specific expressions. The result is a reusable custom checker:

(In fact, chk_numeric is equivalent to the pre-built checker vld_numeric.)
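A sketch of localize in use, assuming valaddin is installed. The function secant is a hypothetical example in which only x and h, and not the function argument f, should be numeric.

```r
library(valaddin)

# Turn a global check formula into a checker that targets chosen arguments
chk_numeric <- localize("Not numeric" ~ is.numeric)

# Check only x and h; f is a function and is left unchecked
secant <- firmly(
  function(f, x, h) (f(x + h) - f(x)) / h,
  chk_numeric(~x, ~h)
)

slope <- secant(sin, 0, 0.1)

# A non-numeric x is rejected
failure <- tryCatch(secant(sin, "0", 0.1), error = function(e) conditionMessage(e))
print(failure)
```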

Conversely, apply globalize to impose your localized checker globally:
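A sketch of the reverse direction, assuming valaddin is installed. The checker chk_numeric and the function difference are hypothetical examples.

```r
library(valaddin)

# A localized checker, made from a global check formula
chk_numeric <- localize("Not numeric" ~ is.numeric)

# globalize recovers a check that applies to every argument
difference <- firmly(function(x, y) x - y, globalize(chk_numeric))

difference(3, 1)
#> [1] 2

failure <- tryCatch(difference(3, "1"), error = function(e) conditionMessage(e))
print(failure)
```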

  1. The inspiration to use ~ as a quoting operator came from the vignette Non-standard evaluation, by Hadley Wickham.

  2. Adapted from an example in Section 6.3 of Chambers, Extending R, CRC Press, 2016. For the sake of the example, ignore the fact that logic to handle fees does not belong in a function for deposits!