This vignette illustrates the various functions of PointedSDMs by using three datasets of the solitary tinamou (Tinamus solitarius) – a species of ground bird found on the eastern side of Brazil. Due to package dependencies, this vignette is not run. However the data and R script are available such that the user may carry out inference.
library(PointedSDMs) library(raster) library(ggpolypath) library(INLA) library(ggplot2)
Firstly, we load in the datasets and objects required for this
vignette. The SolitaryTinamou dataset attached to this package
list of four objects; for ease of use, we make
new objects for the items in this
data('SolitaryTinamou') <- CRS("+proj=longlat +ellps=WGS84") projection <- SolitaryTinamou$datasets datasets <- SolitaryTinamou$covariates covariates <- SolitaryTinamou$region region <- SolitaryTinamou$mesh mesh $crs <- projectionmesh
The first item is a
list of three datasets:
eBird, Gbif and Parks. The first two are
data.frame objects containing only two variables:
Y describing the latitude and longitude
coordinates of the species location respectively. As a result of this,
these two datasets are considered to be present only datasets
in our integrated model.
The other dataset (Parks) is also a
object. It contains the two coordinate variables present in the first
two datasets, but contains two additional variables:
Present, a binary variable describing the presence
(1) or absence (0) of the species at the given
area describing the area of the park.
Since we have information on the presences and absences of the species
in this dataset, we consider it a presence absence dataset.
Region is a
SpatialPolygons object which
give the boundary of the park containing the species; it was used in the
mesh construction and for the plots in this vignette.
The next object is
Raster objects of the covariates (Forest, NPP and
Altitude) describing the area of the parks. We stack these
three objects together using the
stack function, and then
<- scale(stack(covariates)) covariates crs(covariates) <- projection plot(covariates)
Finally we require a Delaunay triangulated mesh for the construction of the spatial field. A plot of the mesh used for this vignette is provided below.
To set up an integrated species distribution model with
PointedSDMs, we initialize it with the
intModel function – which results in an R6 objects
with additional slot functions to further customize the model. The base
model we run for these data comprises of the spatial covariates and an
intercept term for each dataset.
<- intModel(datasets, spatialCovariates = covariates, Coordinates = c('X', 'Y'), base Projection = projection, responsePA = 'Present', Offset = 'area', Mesh = mesh, pointsSpatial = NULL)
.$plot function produces a gg object
of the points used in this analysis by dataset; from this plot, we see
that most of the species locations are found towards the eastern and
south-central part of the park.
$plot(Boundary = FALSE) + basegg(region) + ggtitle('Plot of the species locations by dataset')
In this model, we also include prior information for the
Forest effect using
$priorsFixed(Effect = 'Forest', mean.linear = 0.5, prec.linear = 0.01)base
To run the integrated model, we use the
function with the
data argument as the object created with
intModel function above.
<- runModel(data = base) baseModel baseModel
Spatial fields are fundamental in our spatial species distribution
models, and so we include them in the model by setting
pointsSpatial = TRUE in
we will remove the intercept terms by specifying
pointsIntercept = FALSE
<- intModel(datasets, Coordinates = c('X', 'Y'), fields Projection = projection, Mesh = mesh, responsePA = 'Present', Offset = 'area', pointsIntercept = FALSE)
To specify the spatial field in the model, we use the slot function
$specifySpatial. This in turn will call R-INLA’s
inla.spde2.pcmatern function, which is used to specify
penalizing complexity (PC) priors for the parameters of the field. If we
PC = FALSE in this function, our shared spatial
field would be specified with R-INLA’s
$specifySpatial(sharedSpatial = TRUE, prior.range = c(1,0.01), fieldsprior.sigma = c(0.75, 0.05))
We furthermore include an additional spatial field (deemed the
bias field) for our citizen science eBird observations
$addBias slot function.
Finally we run the integrated model, again with
but this time we specify options with R-INLA’s empirical
Bayes integration strategy to help with computation time.
<- runModel(fields, options = list(control.inla = list(int.strategy = 'eb')))fieldsModel
If we wanted to make predictions of the shared spatial random field
across the map, we set
spatial = TRUE in the generic
<- predict(fieldsModel, mesh = mesh, spatial_predictions mask = region, spatial = TRUE, fun = 'linear')
And subsequently plot using the generic
However if we wanted to make predictions of the bias field, we would
do this by setting
biasfield = TRUE.
<- predict(fieldsModel, bias_predictions mesh = mesh, mask = region, biasfield = TRUE, fun = 'linear')
The last function of interest is
removes a dataset out of the full model, and then calculates a
cross-validation score with this reduced model. In this case, we try the
function out by removing the eBird dataset.
<- datasetOut(model = fieldsModel, dataset = 'eBird')eBird_out