# Background

This vignette aims to demonstrate the workflows used to perform contact analysis using the wildlifeDI package in R. Specifically, two datasets are used to show how the different functions for contact analysis can be used. The main contact analysis functions in the wildilfeDI package have all been given a ‘con’ prefix (e.g., conProcess) so that they clearly stand apart from the dynamic interaction indices and other functions available in the package.

# Set Up

## Libraries and Packages

library(wildlifeDI)
library(ggplot2)
library(sf)
library(igraph)
library(nlme)

# Contact Analysis - Doe Deer Data

First let’s take a look at the doe deer data.

data(does)
does
##
## *********** List of class ltraj ***********
##
## Type of the traject: Type II (time recorded)
## * Time zone unspecified: dates printed in user time zone *
## Irregular traject. Variable time lag between two locs
##
## Characteristics of the bursts:
##            id       burst nb.reloc NAs          date.begin            date.end
## 1 d16241y2011 d16241y2011     1486   0 2011-04-30 19:02:37 2011-05-31 18:32:39
## 2 d16243y2011 d16243y2011     1480   0 2011-04-30 19:02:40 2011-05-31 18:32:37
## 3 d16244y2011 d16244y2011     1474   0 2011-04-30 19:02:36 2011-05-31 18:32:39
## 4 d16246y2011 d16246y2011     1481   0 2011-04-30 19:02:36 2011-05-31 18:32:40
## 5 d16247y2011 d16247y2011     1482   0 2011-04-30 19:02:37 2011-05-31 18:32:40
## 6 d16250y2011 d16250y2011     1474   0 2011-04-30 19:02:37 2011-05-31 18:32:37
## 7 d16252y2011 d16252y2011     1487   0 2011-04-30 19:02:40 2011-05-31 18:32:36
##
##
##  infolocs provided. The following variables are available:
## [1] "Elev"    "Slope"   "pForest" "dPond"   "dRoads"
plot(does)

## Processing contacts

We use the conProcess function to identify contacts first. We use a temporal threshold of $$t_c$$ = 15 minutes (based on the 30 minute tracking data) to define simultaneous fixes. A distance threshold of $$d_c$$ = 50 m (based on biologically relevant signals between deer and previous literature) was used to define contacts. The parameter dc must be specified in the correct units (i.e., those associated with the tracking dataset). The parameter tc needs to be specified in seconds. We can look at the distribution of all paired fixes (based on tc) to further explore whether our choice for dc makes sense.

plt <- dcPlot(does,tc=15*60,dmax=1000)

plt
##  [1]  55.55449 126.26021 227.26837 287.87327 398.98226 459.58716 580.79696
##  [8] 671.70431 702.00676 833.31737

The red lines in the dcPlot are automatically generated using a natural breaks algorithm to find local minima. They are more for reference, and not to be used for empirical assessment. That being said, it appears that a choice of $$d_c$$=50 corresponds to the first local minima.

doecons <- conProcess(does,dc=50,tc=15*60)  

Next we can arrange contacts between does into phases of continuous interaction using the function conPhase. A parameter $$p_c$$ is used to group contacts as belonging to the same phases separated by $$p_c$$ units in time. The parameter $$p_c$$ must be specified in seconds. The function conSummary can be used to summarize contacts and phases within the entire dataset to get some basic statistics. It computes how many contacts are observed, and in how many unique segments these occur in, as well as some other values regarding contact phase duration. Here $$p_c$$ = 60 minutes.

doephas <- conPhase(doecons, pc=60*60)
conSummary(doephas)
##                   stat result
## 1              n Fixes  10364
## 2           n Contacts    493
## 3             n Phases     99
## 4 Longest Phase (secs)  66574
## 5    Mean Phase (secs)   8154
## 6  Median Phase (secs)   3601
## 7   no. one-fix Phases     33

The summary stats suggest contacts between does are often over short bursts, but in some cases can be continuous over much longer periods of time.

## Temporal Analysis of Contacts

The conPairs and conTemporal functions can be used to extract more detailed information about the timing and duration of phases within the dataset. We plot the frequency histogram of contacts throughout the day (by hour), then the histogram of contacts throughout the month of May (by day), and the histogram of the duration of all contact phases.

doepair <- conPairs(doephas)
doetemp <- conTemporal(doephas,units='mins')

doepair$hod <- as.POSIXlt(doepair$date)$hour + as.POSIXlt(doepair$date)$min / 60 #convert POSIX to hours doetemp$hod <- as.POSIXlt(doetemp$start_time)$hour + as.POSIXlt(doetemp$start_time)$min / 60  #convert POSIX to hours
doepair$dom <- as.POSIXlt(doepair$date)$mday hist(doepair$dom,breaks=0:31)

We can see a clear cluster of contacts occurring near the end of the month, this is likely associated with parturition cycles, and the onset of denning behaviour.

hist(doepair$hod,breaks=0:24) #Figure 2b We can see the clear diurnal pattern in when contacts occur, which corresponds to known activity peaks in deer. There is a curious minimum occurrig at approximately 4-5 am right before the morning activity peak. hist(doetemp$hod,breaks=0:24) #Figure 2c

Again a similar pattern emerges when we only look at the timing of the initiation of contact phases. Finally, we can look at the distribution of the duration of the contact phases.

## Comparing contacts to random fixes

We can statistically test if the does behaved differently during contacts compared to other times. To do this we can compare contact fixes to random (non-contact) fixes. Here we show two variables: percent forest cover (related to habitat) and movement step-length (related to behaviour).

#Use ConContext for randomization Analysis
doe_rand <- conContext(doephas,var=c('pForest','dist'),nrand=1000)

g1 = ggplot(doe_rand, aes(x=dt_lev, y=pForest)) +
geom_boxplot() +
labs(x='',y='Forest Cover (%)')

g2 = ggplot(doe_rand, aes(x=dt_lev, y=dist)) +
geom_boxplot() +
labs(x='',y='Step-Length (m)')

g1

g2

The boxplots show a visible difference for percent forest cover between contacts and randomly chosen non-contact fixes. However, for step-length we see no evidence of a visible difference in behaviour.

The summary statistics show a similar picture.

tapply(doe_rand$dist, doe_rand$dt_lev, mean)
##      Con      Rnd
## 54.34104 52.28285
tapply(doe_rand$dist, doe_rand$dt_lev, sd) 
##      Con      Rnd
## 64.64867 80.01611
tapply(doe_rand$pForest, doe_rand$dt_lev, mean)
##       Con       Rnd
##  3.269952 13.446281
tapply(doe_rand$pForest, doe_rand$dt_lev, sd)
##       Con       Rnd
##  5.809105 14.103965

We fit generalized linear mixed models with the random vs contact indicator as a covariate and percent forest and step-length as different dependent variables. We specified the individual ID as a random effect. The models were fit using the package ‘nlme’.

m1 = lme(pForest ~ dt_lev,random = ~1|ID, data = doe_rand)
summary(m1)
## Linear mixed-effects model fit by REML
##   Data: doe_rand
##        AIC      BIC    logLik
##   11033.83 11055.19 -5512.914
##
## Random effects:
##  Formula: ~1 | ID
##         (Intercept) Residual
## StdDev:    10.34022 8.546956
##
## Fixed effects:  pForest ~ dt_lev
##                 Value Std.Error   DF  t-value p-value
## (Intercept) 10.826604  3.935059 1534 2.751319   6e-03
## dt_levRnd    2.151121  0.541637 1534 3.971521   1e-04
##  Correlation:
##           (Intr)
## dt_levRnd -0.101
##
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max
## -3.12283455 -0.40059256 -0.03950743  0.38505312  6.32908311
##
## Number of Observations: 1542
## Number of Groups: 7
m2 = lme(dist ~ dt_lev, random= ~1|ID, data = doe_rand ,na.action=na.exclude)
summary(m2)
## Linear mixed-effects model fit by REML
##   Data: doe_rand
##        AIC      BIC    logLik
##   17688.35 17709.71 -8840.177
##
## Random effects:
##  Formula: ~1 | ID
##         (Intercept) Residual
## StdDev:    2.305465 74.94907
##
## Fixed effects:  dist ~ dt_lev
##                Value Std.Error   DF   t-value p-value
## (Intercept) 54.80103  3.457116 1534 15.851660  0.0000
## dt_levRnd   -2.58290  4.113137 1534 -0.627962  0.5301
##  Correlation:
##           (Intr)
## dt_levRnd -0.791
##
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max
## -0.7334525 -0.6083967 -0.3892580  0.2027686 10.0317087
##
## Number of Observations: 1542
## Number of Groups: 7

# Contact Analysis - Mock Hunters and Deer Bucks

The initial steps of contact analysis are very similar to the contact analysis for the does. Here, instead of providing the raw data (which was too large) we provide the data processed after running the function conContext. The steps are shown below but not run. The parameters are defined as specified in the manuscript, and finally we chose three contextual/behavioural variables to explore: step length (termed dist), displacement from contact, and percent forest cover.

The parameters of the conContext function can be chosen to evaluate before and after contact events in a systematic way. Here we look at each fix-segment (8 minute interval) up to 96 minutes before and after a contact to study how behaviour changes during these periods.

# NOT RUN
mca <- conProcess(deer,hunters,dc=150,tc=4*60) # process contacts, tc=4 min, dc=150m
mcp <- conPhase(mca,pc=16*60)                  # group into phases pc=16 min

mcp <- conDisplacement(mcp,def='first')    # calculate displacement

#Context Analysis
mockhunt <- conContext(mcp,var=c('dist','displacement','Forest_Perc'),def='first',nlag=12,lag=8*60,gap=4*60,idcol='burst',nrand=NA)

We can load in these data, to take a look at the structure of this dataframe.

data(mockhunt)
head(mockhunt)
##                     date       dist displacement Forest_Perc dt_con dt_lev
## 2757 2008-12-16 15:38:37  10.440307     0.000000       26.03      0    Con
## 2756 2008-12-16 15:30:36   2.236068     2.236068       26.03   -481     B1
## 2758 2008-12-16 15:46:36   6.708204    10.440307       26.03    479     A1
## 2755 2008-12-16 15:22:36  15.524175    14.142136       26.65   -961     B2
## 2759 2008-12-16 15:54:36 365.531120     7.615773       26.03    959     A2
## 2754 2008-12-16 15:14:37   9.848858     5.385165       26.03  -1440     B3
##      phaid     ID
## 2757     1 2008_1
## 2756     1 2008_1
## 2758     1 2008_1
## 2755     1 2008_1
## 2759     1 2008_1
## 2754     1 2008_1

The output from conContext can be easily used to study before and after contact events by looking at the dt_con and dt_level columns which provide the time (in seconds) before or after the contact, and a factor ‘level’ based on the lags that were input into the conContext function. Here we define a contact as occurring at the first fix in a contact phase. Other definitions are of course possible. Then we look at 12 x 8 minute lags before and after each contact. The gap argument is useful to center the lags at the regular fix recording times so small deviations in the GPS times recorded are kept within the correct lag. To do this set the gap argument to 1/2 the fix interval.

There are two useful ways to look at these data:

1. a lineplot
2. a boxplot

The lineplot can be used to look at patterns that change over time individually, whereas a boxplot can be used to look at general trends grouping all the individuals together.

If we first look at the line plots for each of the three variables we can get an idea of any changes in behaviour or context before and after contact events. In such a plot, each line represents a single contact event (phase; $$n=47$$). The x-axis is the time before and after the first fix in the contact phase. The y-axis is any one of the behaviour or contextual variables; here we will first look at step-length.

ggplot(data=mockhunt, aes(x=dt_lev, y=dist, group=phaid)) +
geom_line(col='grey32') +
labs(x='Time to contact (min)',y='Step-length (m)') +
scale_x_discrete(labels=c(as.character(seq(-96,96,by=8))))
## Warning: Removed 4 row(s) containing missing values (geom_path).

Here we can see evidence of increased moment following contacts with mock-hunters, but a high degree of variation between individuals. It is unclear precisely how long after a contact with a mock-hunters the effect lasts.

We can look at the spatial displacement (e.g., actual geographic distance) a deer makes following a contact as well.

ggplot(data=mockhunt, aes(x=dt_lev, y=displacement, group=phaid)) +
geom_line(col='grey32') +
labs(x='Time to contact (min)',y='Displacement (m)') +
scale_x_discrete(labels=c(as.character(seq(-96,96,by=8))))

In the above plot we can see some unusually high movement behaviour by one or two individuals, but also that there is clearly larger displacement away from contact locations after contacts compared to before.

Last, we can look at how animals use forest cover before and after contacts.

ggplot(data=mockhunt, aes(x=dt_lev, y=Forest_Perc, group=phaid)) +
geom_line(col='grey32') +
labs(x='Time to contact (min)',y='Forest Cover (%)') +
scale_x_discrete(labels=c(as.character(seq(-96,96,by=8))))

From this analysis it is less clear how forest coverage is used by deer before and after contacts. It appears that the increased movement behaviour shown through step-length and displacement may result in changes in forest coverage levels. However, it does not appear that deer prefer or avoid forest coverage in response to contacts from hunters based on these data.

A perhaps cleaner way to look at these same patterns is to use boxplots.

ggplot(mockhunt, aes(x=dt_lev, y=dist)) +
geom_boxplot() +
coord_cartesian(ylim=c(0,1000)) +
labs(x='Time to contact (min)',y='Step length (m)') +
scale_x_discrete(labels=c(as.character(seq(-96,96,by=8))))
## Warning: Removed 4 rows containing non-finite values (stat_boxplot).

From this boxplot we can see the clear pattern of increased movement after contact events. It appears based on this boxplot that the effect of the contact manifests in increased movement (step lengths) for about $$6 \times 8 = 48$$ minutes post contact.

Next, we will look at the displacement (distance-to-contact) effect using the box plots.

ggplot(mockhunt, aes(x=dt_lev, y=displacement)) +
geom_boxplot() +
coord_cartesian(ylim=c(0,2000)) +
labs(x='Time to contact (min)',y='Distance to contact (m)') +
scale_x_discrete(labels=c(as.character(seq(-96,96,by=8))))

In this graph, it can be clearly seen that contacts result in a spatial displacament by white-tailed deer on average about 500 m displacement from their pre-contact position. This is important information to consider, as it means that white-tailed deer typically move ~500m following contact with a human hunter.

Of interest is whether there are any environmental changes in the habitats chosen following contacts, here we will look at the percent forest cover as one such example.

ggplot(mockhunt, aes(x=dt_lev, y=Forest_Perc)) +
geom_boxplot() +
labs(x='Time to contact (min)',y='Forest Cover (%)') +
scale_x_discrete(labels=c(as.character(seq(-96,96,by=8))))

The visual evidence here suggests (as noted with the line graph) that deer do not preferentially select or avoid forest habitat before, during, or after contacts. The distribution of forest habitat found before, during, and after contacts is comparable at all times.

### Longitudinal models for Before-After contact analysis

We fit longitudinal regression models with an intervention dummy term (associated with the contact event) to test how the contact influenced three outcome variables: Step-length, displacement from contact, and percent forest cover, as shown in the above box-plots. We again include the individual ID as a random effect. The models were fit using the ‘nlme’ package.

These models can be interpreted as follows:

• The intercept term refers to the pre-contact intercept.
• The con coefficient is added to the pre-contact intercept to represent the post-contact intercept term.
• The t coefficient represents the slope of the relationship with time prior to the contact
• the t:con coefficient represents how that original time relationship slope changes after the contact (i.e., it is additive).
mockhunt$t = as.integer(mockhunt$dt_lev)
mockhunt$con = 0 mockhunt$con[which(mockhunt$t > 12)] = 1 mh = mockhunt[!is.na(mockhunt$dist),]

GLMM for step-length shows significant increase in step-length post contact. It also shows that their was no significant time trend pre-contact, but a negative time trend post contact, which aligns with the visual description of the data.

distlme <- lme(dist ~ t*con , random = ~ 1|ID, correlation=corAR1(),data = mh)
summary(distlme)
## Linear mixed-effects model fit by REML
##   Data: mh
##        AIC      BIC    logLik
##   14730.28 14765.41 -7358.142
##
## Random effects:
##  Formula: ~1 | ID
##         (Intercept) Residual
## StdDev:    55.29269 170.5807
##
## Correlation Structure: AR(1)
##  Formula: ~1 | ID
##  Parameter estimate(s):
##        Phi
## 0.03149203
## Fixed effects:  dist ~ t * con
##                Value Std.Error   DF   t-value p-value
## (Intercept)  29.6138  19.50429 1095  1.518320  0.1292
## t             1.2567   2.10434 1095  0.597184  0.5505
## con         375.0853  40.40403 1095  9.283362  0.0000
## t:con       -16.4909   2.91292 1095 -5.661306  0.0000
##  Correlation:
##       (Intr) t      con
## t     -0.699
## con   -0.331  0.385
## t:con  0.528 -0.757 -0.873
##
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max
## -1.4944524 -0.4289121 -0.1152831  0.1111148 12.2314216
##
## Number of Observations: 1121
## Number of Groups: 23

GLMM for displacement shows that the contact has a significant impact on the displacement intercept and slope, suggesting a change in directionality of the relationship again observed in the box-plots. The magnitude of this relationship increases after the contact as well.

displme <- lme(displacement ~ t*con , random = ~ 1|ID, correlation=corAR1(),data = mh)
summary(displme)
## Linear mixed-effects model fit by REML
##   Data: mh
##       AIC      BIC    logLik
##   16489.2 16524.33 -8237.598
##
## Random effects:
##  Formula: ~1 | ID
##         (Intercept) Residual
## StdDev:    195.0454 375.5195
##
## Correlation Structure: AR(1)
##  Formula: ~1 | ID
##  Parameter estimate(s):
##       Phi
## 0.1305572
## Fixed effects:  displacement ~ t * con
##                 Value Std.Error   DF   t-value p-value
## (Intercept)  276.5632  53.85759 1095  5.135083  0.0000
## t            -15.3532   4.68729 1095 -3.275509  0.0011
## con         -416.3911  92.98177 1095 -4.478202  0.0000
## t:con         49.9622   6.93164 1095  7.207859  0.0000
##  Correlation:
##       (Intr) t      con
## t     -0.565
## con   -0.329  0.513
## t:con  0.455 -0.807 -0.905
##
## Standardized Within-Group Residuals:
##        Min         Q1        Med         Q3        Max
## -2.6498008 -0.5714637 -0.1255741  0.4269929 10.4585368
##
## Number of Observations: 1121
## Number of Groups: 23

Finally, the GLMM for percent forest cover effectively shows that there was no significant impact of the contact event in the percent forest cover variable.

pForlme <- lme(Forest_Perc ~ t*con , random = ~ 1|ID, correlation=corAR1(),data = mh)
summary(pForlme)
## Linear mixed-effects model fit by REML
##   Data: mh
##        AIC      BIC    logLik
##   8299.792 8334.921 -4142.896
##
## Random effects:
##  Formula: ~1 | ID
##         (Intercept) Residual
## StdDev:    7.830022 9.620532
##
## Correlation Structure: AR(1)
##  Formula: ~1 | ID
##  Parameter estimate(s):
##        Phi
## -0.2209123
## Fixed effects:  Forest_Perc ~ t * con
##                 Value Std.Error   DF   t-value p-value
## (Intercept) 22.939602 1.8593111 1095 12.337689  0.0000
## t           -0.119169 0.1200057 1095 -0.993027  0.3209
## con         -2.470816 2.1036970 1095 -1.174512  0.2404
## t:con        0.272337 0.1315123 1095  2.070808  0.0386
##  Correlation:
##       (Intr) t      con
## t     -0.416
## con   -0.055 -0.007
## t:con  0.246 -0.598 -0.753
##
## Standardized Within-Group Residuals:
##         Min          Q1         Med          Q3         Max
## -3.28714151 -0.49969518 -0.02643809  0.52668920  5.64325441
##
## Number of Observations: 1121
## Number of Groups: 23

# Summary

The wildlifeDI package can be used to tackle a wide range of problems when performing contact analysis using wildlife tracking data. Specifically, it provides tools to process, manage, and analyze contacts spatially and temporally. It provides output data structures that are useful for integration in R’s well established statistical modelling packages facilitating further statistical analyses.

## Session Information

sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 17763)
##
## Matrix products: default
##
## locale:
##
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base
##
## other attached packages:
##  [1] nlme_3.1-152        igraph_1.2.6        sf_1.0-0
##
## loaded via a namespace (and not attached):
##  [1] Rcpp_1.0.6         lattice_0.20-44    prettyunits_1.1.1  class_7.3-19
##  [5] assertthat_0.2.1   digest_0.6.27      utf8_1.2.1         R6_2.5.0
##  [9] evaluate_0.14      e1071_1.7-7        highr_0.9          pillar_1.6.1
## [13] rlang_0.4.10       progress_1.2.2     jquerylib_0.1.4    rmarkdown_2.9
## [17] labeling_0.4.2     stringr_1.4.0      munsell_0.5.0      proxy_0.4-26
## [21] compiler_4.1.0     xfun_0.24          pkgconfig_2.0.3    rgeos_0.5-5
## [25] htmltools_0.5.1.1  tidyselect_1.1.1   tibble_3.1.2       codetools_0.2-18
## [29] fansi_0.5.0        crayon_1.4.1       dplyr_1.0.6        withr_2.4.2
## [33] grid_4.1.0         jsonlite_1.7.2     gtable_0.3.0       lifecycle_1.0.0
## [37] DBI_1.1.1          magrittr_2.0.1     units_0.7-2        scales_1.1.1
## [41] KernSmooth_2.23-20 stringi_1.5.3      farver_2.1.0       bslib_0.2.5.1
## [45] ellipsis_0.3.2     generics_0.1.0     vctrs_0.3.8        tools_4.1.0
## [49] glue_1.4.2         purrr_0.3.4        hms_1.1.0          yaml_2.2.1
## [53] colorspace_2.0-1   classInt_0.4-3     knitr_1.33         sass_0.4.0