*Community trajectory analysis* (CTA) is a framework to analyze community dynamics described as trajectories in a chosen space of community resemblance. CTA takes trajectories as objects to be analyzed and compared geometrically. Given a distance matrix between community states and vectors defining sites (i.e. sampling units) and surveys (temporal replications), package **vegclust** offers a set of functions to perform CTA. In this vignette you will learn how to conduct CTA using different package functions. First of all, we load `vegclust`

and other libraries needed in this document:

```
library(vegclust)
library(RColorBrewer)
library(smacof)
```

`## Loading required package: plotrix`

```
##
## Attaching package: 'smacof'
```

```
## The following object is masked from 'package:base':
##
## transform
```

`library(MASS)`

In this section we describe how to study the trajectories of three sites that have been surveyed four times each. We use a small dataset where trajectories occur in a space of two dimensions, so that geometric calculations can be followed more easily.

To specify community dynamics, we need three data items:

- A set of community states (i.e. coordinates in a space \(\Omega\)), described using a distance matrix \(d\);
- A vector specifying the site (i.e. sampling unit) corresponding to each community state;
- A vector specifying the survey (i.e. time point) corresponding to the sampling of each community state.

In CTA sampling units do not need to be surveyed synchronously nor the same number of times. However, in this example we assume all three sites have been surveyed synchronously four times.

Let us first define the vectors that describe the site and the survey of each community state:

```
#Description of sites and surveys
sites = c(1,1,1,1,2,2,2,2,3,3,3,3)
surveys=c(1,2,3,4,1,2,3,4,1,2,3,4)
```

We then build a matrix with the coordinates corresponding to the community states of the three sites during each survey. We assume that the community space \(\Omega\) has two dimensions:

```
#Raw data table
xy<-matrix(0, nrow=12, ncol=2)
xy[2,2]<-1
xy[3,2]<-2
xy[4,2]<-3
xy[5:6,2] <- xy[1:2,2]
xy[7,2]<-1.5
xy[8,2]<-2.0
xy[5:6,1] <- 0.25
xy[7,1]<-0.5
xy[8,1]<-1.0
xy[9:10,1] <- xy[5:6,1]+0.25
xy[11,1] <- 1.0
xy[12,1] <-1.5
xy[9:10,2] <- xy[5:6,2]
xy[11:12,2]<-c(1.25,1.0)
cbind(sites,surveys,xy)
```

```
## sites surveys
## [1,] 1 1 0.00 0.00
## [2,] 1 2 0.00 1.00
## [3,] 1 3 0.00 2.00
## [4,] 1 4 0.00 3.00
## [5,] 2 1 0.25 0.00
## [6,] 2 2 0.25 1.00
## [7,] 2 3 0.50 1.50
## [8,] 2 4 1.00 2.00
## [9,] 3 1 0.50 0.00
## [10,] 3 2 0.50 1.00
## [11,] 3 3 1.00 1.25
## [12,] 3 4 1.50 1.00
```

The matrix of Euclidean distances \(d\) between community states in \(\Omega\) is then:

```
#Distance matrix
D = dist(xy)
D
```

```
## 1 2 3 4 5 6 7
## 2 1.0000000
## 3 2.0000000 1.0000000
## 4 3.0000000 2.0000000 1.0000000
## 5 0.2500000 1.0307764 2.0155644 3.0103986
## 6 1.0307764 0.2500000 1.0307764 2.0155644 1.0000000
## 7 1.5811388 0.7071068 0.7071068 1.5811388 1.5206906 0.5590170
## 8 2.2360680 1.4142136 1.0000000 1.4142136 2.1360009 1.2500000 0.7071068
## 9 0.5000000 1.1180340 2.0615528 3.0413813 0.2500000 1.0307764 1.5000000
## 10 1.1180340 0.5000000 1.1180340 2.0615528 1.0307764 0.2500000 0.5000000
## 11 1.6007811 1.0307764 1.2500000 2.0155644 1.4577380 0.7905694 0.5590170
## 12 1.8027756 1.5000000 1.8027756 2.5000000 1.6007811 1.2500000 1.1180340
## 8 9 10 11
## 2
## 3
## 4
## 5
## 6
## 7
## 8
## 9 2.0615528
## 10 1.1180340 1.0000000
## 11 0.7500000 1.3462912 0.5590170
## 12 1.1180340 1.4142136 1.0000000 0.5590170
```

CTA is based on the analysis of information in \(d\). Therefore, it does not require explicit coordinates. This is an advantage because it allows the analysis to be conducted on arbitrary metric (or semi-metric) spaces. The choice of \(d\) is left to the user and will depend on the problem at hand.

To begin our analysis of the three trajectories, we display them in an ordination space, using function `trajectoryPCoA`

. Since \(\Omega\) has only two dimensions in this example, the Principal Coordinates Analysis (PCoA) on \(d\) displays the complete space:

```
par(mar=c(4,4,1,1))
trajectoryPCoA(D, sites, surveys, traj.colors = c("black","red", "blue"), lwd = 2)
legend("topleft", col=c("black","red", "blue"),
legend=c("Trajectory 1", "Trajectory 2", "Trajectory 3"), bty="n", lty=1, lwd = 2)
```

While trajectory of site ‘1’ (black arrows) is made of three segments of the same length and direction, trajectory of site ‘2’ (red arrows) has a second and third segments that bend and are shorter than that of the segond segment of site ‘1’. Trajectory of site ‘3’ includes a stronger change in direction and shorter segments.

As this example has two dimensions and we used Euclidean distance, the same plot (albeit rotated) can be straightforwardly obtained using matrix `xy`

and function `trajectoryPlot`

:

```
par(mar=c(4,4,1,1))
trajectoryPlot(xy, sites, surveys, traj.colors = c("black","red", "blue"), lwd = 2)
legend("topright", col=c("black","red", "blue"),
legend=c("Trajectory 1", "Trajectory 2", "Trajectory 3"), bty="n", lty=1, lwd = 2)
```

While `trajectoryPCoA`

uses PCoA (also known as classical Multidimensional Scaling) to display trajectories, users can display community trajectories using other ordination techniques such as metric Multidimensional Scaling (mMDS; see function *mds* of package **smacof**) or non-metric MDS (nMDS; see function *metaMDS* in package **vegan** or function *isoMDS* in package **MASS**). Function `trajectoryPlot`

will help drawing arrows between segments to represent trajectories on the ordination space given by any of these methods.

Functions `trajectoryPCoA`

and `trajectoryPlot`

allow selecting the subset of trajectories to be displayed, so that the representation can focus on particular pathways (see also function `trajectorySelection`

).

One may be interested in studying the geometric properties of particular trajectories. For example, one can obtain the length of trajectory segments and the total pathway length using function `trajectoryLengths`

:

`trajectoryLengths(D, sites, surveys)`

```
## S1 S2 S3 Trajectory
## 1 1 1.000000 1.0000000 3.000000
## 2 1 0.559017 0.7071068 2.266124
## 3 1 0.559017 0.5590170 2.118034
```

In CTA, angles are measured using triplets of time-ordered community states (a pair of consecutive segments is an example of such triplets). As \(d\) may represent a space of multiple dimensions, angles cannot be calculated with respect to a single plane. Instead, each angle is measured on the plane defined by each triplet. Zero angles indicate that the three points (e.g. the two consecutive segments) are in a straight line. The larger the angle value, the more is trajectory changing in direction. Mean and standard deviation statistics of angles are calculated according to circular statistics (see R package `circular`

). Function `trajectoryAngles`

allows calculating the angles between consecutive segments:

`trajectoryAngles(D, sites, surveys)`

```
## S1-S2 S2-S3 mean sd rho
## 1 0.00000 0.00000 0.00000 0.00000000 1.0000000
## 2 26.56505 18.43495 22.50000 0.07097832 0.9974842
## 3 63.43495 53.13010 58.28253 0.08998746 0.9959593
```

While site ‘1’ follows a straight pathway, angles are > 0 for trajectories of site ‘2’ and ‘3’, denoting the change in direction. In this case, the same information could be obtained by inspecting the previous plots, but in a case where \(\Omega\) has many dimensions, the representation will correspond to a reduced (ordination) space and hence, angles and lengths in the plot will not correspond exactly to those of functions `trajectoryLengths`

and `trajectoryAngles`

, which take into account the complete \(\Omega\) space.

Angles can be calculated not only for all consecutive segments but for all four triplets of ordered community states, whether of consecutive segments or not (i.e., between points 1-2-3, 1-2-4, 1-3-4 and 2-3-4). This is done by specifying `all=TRUE`

in `trajectoryAngles`

:

`trajectoryAngles(D, sites, surveys, all=TRUE)`

```
## A1 A2 A3 A4 mean sd rho
## 1 0.00000 0.0000 0.00000 0.00000 0.00000 0.0000000 1.0000000
## 2 26.56505 36.8699 35.53768 18.43495 29.36033 0.1300790 0.9915754
## 3 63.43495 90.0000 94.76364 53.13010 75.36015 0.3078934 0.9537066
```

The *mean resultant length* of circular statistics (column `rho`

of the previous result), which takes values between 0 and 1, can be used to assess the degree of homogeneity of angle values and it will take a value of 1 if all angles are the same. This approach, however, uses only angular information and does not take into account the length of segments.

To measure the overall directionality of a community trajectory (i.e. if the pathway consistently follows the same direction in \(\Omega\) ), we recommend using another statistic that is sensitive to both angles and segment lengths and is implemented in function `trajectoryDirectionality`

:

`trajectoryDirectionality(D, sites, surveys)`

```
## 1 2 3
## 1.0000000 0.8274026 0.5620859
```

As known from previous plots, trajectory of site ‘2’ is less straight than trajectory of site ‘1’ and that of site ‘3’ has the lowest directionality value. Note that function only computes a descriptive statistic, i.e. it does not perform any statistical test on directionality.

Community states occupy a position within their trajectory that depends on the total pathway length of the trajectory. By adding the length of segments prior to a given state and dividing the sum by the total length of the trajectory we obtain the relative position of the community state. Function `trajectoryProjection`

allows obtaining the relative position of each point of a trajectory. To use it for this purpose one should use as parameters the distance matrix between state and the indices that conform the trajectory. For example for the two example trajectories we would have:

`trajectoryProjection(D, 1:4, 1:4)`

```
## distanceToTrajectory segment relativePosition
## 1 0 1 0.0000000
## 2 0 1 0.3333333
## 3 0 2 0.6666667
## 4 0 3 1.0000000
```

`trajectoryProjection(D, 5:8, 5:8)`

```
## distanceToTrajectory segment relativePosition
## 5 0 1 0.0000000
## 6 0 1 0.4412822
## 7 0 2 0.6879664
## 8 NA NA NA
```

Because the trajectory of site ‘2’ as the second and third segments that are longer than the first, the relative position of state ‘6’ is lower than 1/3 and that of state ‘7’ is lower than 2/3.

Function `trajectoryProjection`

can also be used to **project** arbitrary community states onto a given reference trajectory. For example we can study the projection of third state of the trajectory of site ‘1’ (i.e. state 3) onto the trajectory of site ‘2’ (i.e. states 5 to 8):

`trajectoryProjection(D, 3, 5:8)`

```
## distanceToTrajectory segment relativePosition
## 3 0.7071068 3 0.6879664
```

When trajectories have been sampled the same number of times, function `trajectoryConvergence`

allows performing tests of convergence based on the trend analysis of the sequences of distances between points of the two trajectories (i.e. first-first, second-second, …):

`trajectoryConvergence(D, sites, surveys, symmetric = TRUE)`

```
## $tau
## 1 2 3
## 1 NA 0.9128709 0.9128709
## 2 0.9128709 NA 0.9128709
## 3 0.9128709 0.9128709 NA
##
## $p.value
## 1 2 3
## 1 NA 0.1485617 0.1485617
## 2 0.1485617 NA 0.1485617
## 3 0.1485617 0.1485617 NA
```

The function performs the Mann-Whitney trend test. Values of the statistic (‘tau’) larger than 0 correspond to trajectories that are diverging, whereas values lower than 0 correspond to trajectories that are converging. By setting `symmetric = FALSE`

the convergence test becomes asymmetric. In this case the sequence of distances *between every point of one trajectory and the other*:

`trajectoryConvergence(D, sites, surveys, symmetric = FALSE)`

`## WARNING: Error exit, tauk2. IFAULT = 12`

```
## $tau
## 1 2 3
## 1 NA 0.9128709 0.9128709
## 2 0.9128709 NA 0.8164966
## 3 0.9128709 0.6666666 NA
##
## $p.value
## 1 2 3
## 1 NA 0.1485617 0.1485617
## 2 0.1485617 NA 1.0000000
## 3 0.1485617 0.3081795 NA
```

The asymmetric test is useful to determine if one trajectory is becoming closer to the other or if it is departing from the other.

To start comparing trajectories between sites (i.e. between sampling units), one important step is the calculation of distances between directed segments, which can be obtained by calling function `segmentDistances`

:

```
Ds = segmentDistances(D, sites, surveys)$Dseg
Ds
```

```
## 1[1-2] 1[2-3] 1[3-4] 2[1-2] 2[2-3] 2[3-4]
## 1[2-3] 1.0000000
## 1[3-4] 2.0000000 1.0000000
## 2[1-2] 0.2500000 1.0307764 2.0155644
## 2[2-3] 1.0307764 0.7071068 1.5811388 1.0000000
## 2[3-4] 1.5811388 1.0000000 1.4142136 1.5206906 0.7071068
## 3[1-2] 0.5000000 1.1180340 2.0615528 0.2500000 1.0307764 1.5000000
## 3[2-3] 1.1180340 1.1180340 2.0124612 1.0307764 0.5590170 0.7500000
## 3[3-4] 1.6007811 1.5590170 2.0155644 1.4577380 1.1180340 1.0606602
## 3[1-2] 3[2-3]
## 1[2-3]
## 1[3-4]
## 2[1-2]
## 2[2-3]
## 2[3-4]
## 3[1-2]
## 3[2-3] 1.0000000
## 3[3-4] 1.5590170 0.5590170
```

Distances between segments are affected by differences in both *position*, *size* and *direction*. Hence, among the six segments of this example, the distance is maximum between the last segment of trajectory ‘1’ and the first segment of trajectory ‘3’.

One can display distances between segments in two dimensions using mMDS.

```
mMDS = mds(Ds)
mMDS
```

```
##
## Call:
## mds(delta = Ds)
##
## Model: Symmetric SMACOF
## Number of objects: 9
## Stress-1 value: 0.062
## Number of iterations: 15
```

```
xret = mMDS$conf
par(mar=c(4,4,1,1))
plot(xret, xlab="axis 1", ylab = "axis 2", asp=1, pch=21,
bg=c(rep("black",3), rep("red",3), rep("blue",3)),
xlim=c(-1.5,1), ylim=c(-1,1.5))
text(xret, labels=rep(paste0("s",1:3),3), pos=1)
legend("topleft", pt.bg=c("black","red","blue"), pch=21, bty="n", legend=c("Trajectory 1", "Trajectory 2", "Trajectory 3"))
```

Distances between segments are internally calculated when comparing whole trajectories using function `trajectoryDistances`

. Here we show the dissimilarity between the two trajectories as assessed using either the *Hausdorff* distance (equal to the maximum distance between directed segments) or the *directed segment path distance* (an average of distances between segments):

`trajectoryDistances(D, sites, surveys, distance.type = "Hausdorff")`

```
## 1 2
## 2 2.015564
## 3 2.061553 1.500000
```

`trajectoryDistances(D, sites, surveys, distance.type = "DSPD")`

```
## 1 2
## 2 0.7214045
## 3 1.1345910 0.5714490
```

DSPD is a symmetrized distance. To calculate non-symmetric distances one uses:

`trajectoryDistances(D, sites, surveys, distance.type = "DSPD", symmetrization = NULL)`

```
## 1 2 3
## 1 0.0000000 0.7904401 1.2101651
## 2 0.6523689 0.0000000 0.5196723
## 3 1.0590170 0.6232257 0.0000000
```

In this example we analyze the dynamics of 8 permanent forest plots located on slopes of a valley in the New Zealand Alps. The study area is mountainous and centered on the Craigieburn Range (Southern Alps), South Island, New Zealand. Forests plots are almost monospecific, being the mountain beech (*Fuscospora cliffortioides*) the main dominant tree species. Previously forests consisted of largely mature stands, but some of them were affected by different disturbances during the sampling period (1972-2009) which includes 9 surveys. We begin our example by loading the data set, which includes 72 plot observations:

`data("avoca")`

Community data is in form of an object `stratifiedvegdata`

. To account for differences in tree diameter, while emphasizing regeneration, the data contains individual counts to represent tree abundance and trees are classified in 19 quadratic diameter bins (in cm): {(2.25, 4], (4, 6.25], (6.25, 9], … (110.25, 121]}. The data set also includes vectors `avoca_surveys`

and `avoca_sites`

that indicate the survey and forest plot corresponding to each forest state.

Before starting community trajectory analysis, we have to use function `vegdiststruct`

to calculate distances between forest plot states in terms of structure and composition:

`avoca_D_man = vegdiststruct(avoca_strat, method="manhattan", transform = function(x){log(x+1)})`

Distances in `avoca_D_man`

are calculated using the Manhattan metric.

The distance matrix `avoca_D_man`

conforms our definition of \(\Omega\). We use `trajectoryPCoA`

to display the relations between forest plot states in this space and to draw the trajectory of each plot:

```
par(mar=c(4,4,1,1))
trajectoryPCoA(avoca_D_man, avoca_sites, avoca_surveys,
traj.colors = brewer.pal(8,"Accent"),
axes=c(1,2), length=0.1, lwd=2)
legend("topright", bty="n", legend = 1:8, col = brewer.pal(8,"Accent"), lwd=2)
```

Note that in this case, the full \(\Omega\) includes more than two dimensions, and PCoA is representing 43% of total variance (correction for negative eigenvalues is included in the call to `cmdscale`

from `trajectoryPCoA`

), so one has to be careful when interpreting trajectories visually.

Another option is to use mMDS to represent trajectories, which in this case produces a similar result:

```
mMDS = mds(avoca_D_man)
mMDS
```

```
##
## Call:
## mds(delta = avoca_D_man)
##
## Model: Symmetric SMACOF
## Number of objects: 72
## Stress-1 value: 0.114
## Number of iterations: 49
```

```
par(mar=c(4,4,1,1))
trajectoryPlot(mMDS$conf, avoca_sites, avoca_surveys,
traj.colors = brewer.pal(8,"Accent"),
axes=c(1,2), length=0.1, lwd=2)
legend("topright", bty="n", legend = 1:8, col = brewer.pal(8,"Accent"), lwd=2)
```

One can inspect specific trajectories using argument `selection`

in function `trajectoryPCoA`

. This allows getting a better view of particular trajectories, here that of forest plot ‘3’:

```
par(mfrow=c(1,2))
trajectoryPCoA(avoca_D_man, avoca_sites, avoca_surveys,
selection= 3,
length=0.1, lwd=2)
plotTrajDiamDist(3)
```

In the right hand, we added a representation of the change in the mountain beech tree diameter distribution through time for trajectory of forest plot ‘3’. The dynamics of this plot include mostly growth, which results in individuals moving from one diameter class to the other. The whole trajectory looks mostly directional. Let’s now inspect the trajectory of forest plot ‘4’:

```
par(mfrow=c(1,2))
trajectoryPCoA(avoca_D_man, avoca_sites, avoca_surveys,
selection= 4,
length=0.1, lwd=2)
plotTrajDiamDist(4)
```

This second trajectory is less straight and seems to include a turn by the end of the sampling period, corresponding to the recruitment of new saplings.

While trajectory lengths and angles can be inspected visually in ordination diagrams, it is better to calculate them using the full \(\Omega\) space (i.e., from matrix `avoca_D_man`

). Using function `trajectoryLengths`

we can see that the trajectory of forest plot ‘4’ is lengthier than that of plot ‘3’, mostly because includes a lengthier last segment (i.e. the recruitment of new individuals):

`trajectoryLengths(avoca_D_man, avoca_sites, avoca_surveys)`

```
## S1 S2 S3 S4 S5 S6 S7
## 1 1.2173214 1.5911988 1.0976965 2.1173501 0.5238760 1.5863283 1.5750365
## 2 0.5971165 1.7469687 0.9866591 0.9413060 1.3687614 0.6280231 1.4565581
## 3 1.1480971 1.2404953 0.6508116 0.4371405 0.5098385 1.2078811 1.6241741
## 4 0.7932307 1.8572629 0.7268623 0.8348635 3.0560437 1.9863939 0.9905892
## 5 1.7769875 0.3367341 0.7283030 0.6515714 1.2660552 0.9544933 1.2605333
## 6 2.1891568 0.5312711 1.0035212 0.4113220 2.1394743 1.0482871 1.4924056
## 7 0.2919002 0.8857645 1.0908604 0.5601649 2.0600208 0.3712067 1.1014563
## 8 0.1909713 1.2000266 2.3211891 0.6539882 2.7807668 0.8170202 1.2072425
## S8 Trajectory
## 1 0.6277940 10.336602
## 2 1.1232798 8.848673
## 3 1.4536390 8.272077
## 4 3.8794520 14.124698
## 5 0.5842314 7.558909
## 6 1.9623777 10.777816
## 7 1.7518489 8.113223
## 8 1.5107357 10.681940
```

If we calculate the angles between consecutive segments (using function `trajectoryLengths`

) we see that indeed the trajectory of ‘3’ is rather directional, but the angles of trajectory of ‘4’ are larger, on aveerage:

```
avoca_ang <- trajectoryAngles(avoca_D_man, avoca_sites, avoca_surveys)
avoca_ang
```

```
## S1-S2 S2-S3 S3-S4 S4-S5 S5-S6 S6-S7
## 1 41.74809 8.669806e+01 7.401875e+01 26.94138 113.40657 100.67068
## 2 68.14891 3.466506e+01 8.537736e-07 0.00000 25.97111 0.00000
## 3 87.24519 3.088828e+01 1.207418e-06 0.00000 50.90743 48.12854
## 4 37.65736 8.537736e-07 8.537736e-07 36.25240 55.53607 74.21065
## 5 42.02156 7.166400e+01 1.207418e-06 49.95436 65.75897 65.82083
## 6 41.69894 4.611675e+01 5.669641e+01 135.84929 0.00000 0.00000
## 7 53.63254 1.152378e+02 6.519921e+01 60.71442 0.00000 56.25733
## 8 180.00000 0.000000e+00 9.213121e+01 132.36445 71.33948 36.43189
## S7-S8 mean sd rho
## 1 1.021996e+02 78.82477 0.5334615 0.8673692
## 2 1.207418e-06 17.61550 0.4257423 0.9133572
## 3 4.568199e+01 37.39925 0.5018798 0.8816663
## 4 4.980332e+01 36.44717 0.4531030 0.9024417
## 5 1.061250e+02 57.89321 0.5239476 0.8717431
## 6 0.000000e+00 34.34150 0.7777287 0.7390195
## 7 1.207418e-06 49.96352 0.6562139 0.8062928
## 8 1.207418e-06 66.68343 1.1601974 0.5101610
```

By calling function `trajectoryDirectionality`

we can confirm that the trajectory for site ‘4’ is less straight than that of site ‘3’:

```
avoca_dir <- trajectoryDirectionality(avoca_D_man, avoca_sites, avoca_surveys)
avoca_dir
```

```
## 1 2 3 4 5 6 7
## 0.6781369 0.6736490 0.8651467 0.5122482 0.6677116 0.7058465 0.7391775
## 8
## 0.5254225
```

The following code displays the relationship between the statistic in `trajectoryDirectionality`

and the mean resultant vector length that uses angular information only and assesses the constancy of angle values:

```
avoca_rho = trajectoryAngles(avoca_D_man, avoca_sites, avoca_surveys, all=TRUE)$rho
par(mar=c(4,4,1,1))
plot(avoca_rho, avoca_dir, xlab = "rho(T)", ylab = "dir(T)", type="n")
text(avoca_rho, avoca_dir, as.character(1:8))
```

We can calculate the resemblance between forest plot trajectories using `trajectoryDistances`

:

```
## 1 2 3 4 5 6 7
## 2 2.405
## 3 6.805 5.773
## 4 6.123 6.646 5.225
## 5 6.020 5.541 3.235 4.966
## 6 9.490 8.866 3.436 6.043 4.505
## 7 4.024 3.291 4.365 4.993 4.389 6.205
## 8 9.797 9.747 5.591 4.611 6.383 5.043 6.949
```

The closest trajectories are those of plots ‘1’ and ‘2’. They looked rather close in *position* in the PCoA ordination of \(\Omega\) with all trajectories, so probably it is position, rather than *shape* which has influenced this low value. The next pair of similar trajectories are those of the ‘3’-‘5’ pair. We can again use mMDS to produce an ordination of resemblances between trajectories:

```
mMDS<-mds(avoca_D_traj_man)
mMDS
```

```
##
## Call:
## mds(delta = avoca_D_traj_man)
##
## Model: Symmetric SMACOF
## Number of objects: 8
## Stress-1 value: 0.091
## Number of iterations: 25
```

```
x<-mMDS$conf[,1]
y<-mMDS$conf[,2]
par(mar=c(4,4,1,1))
plot(x,y, type="p", asp=1, xlab=paste0("Axis 1"),
ylab=paste0("Axis 2"), col="black",
bg= brewer.pal(8,"Accent"), pch=21)
text(x,y, labels=1:8, pos=1)
```

Distances between trajectories can be calculated after centering them (i.e. after bringing all trajectories to the center of the \(\Omega\) space). This is done using function `centerTrajectories`

, which returns a new dissimilarity matrix. Centering removes differences in *position* between trajectories, which is useful in cases where one wants to focus on spatio-temporal interaction while discarding spatial patterns that are constant in time. We do no illustrate centering here, because in is not interesting for this particular example.

Some dissimilarity coefficients that are popular in community ecology, such as the percentage difference (alias Bray-Curtis), have the drawback of being a non-Euclidean (dissimilarity matrices do not allow a representation in a Cartesian space), or even semi-metric (i.e. triangle inequality is not ensured). In order to use these coefficients in multivariate analyses that require these properties a transformation of the original space will normally be in order. In this section, we compare different alternatives and provide some recommendations on this issue.

Here, we use an example of a single synthetic community to illustrate the effect of square root transformation on a community trajectory. We begin by defining the species data of the trajectory itself. The dataset consists of four rows (i.e. surveys) and four columns (species). The dynamics in this example consist in an constant increase in the number of individuals of the first species and a corresponding decrease of the others, while keeping a total abundance of 100 individuals:

```
sites = rep(1,4)
surveys=1:4
spdata = rbind(c(35,30,20,15),
c(50,25,15,10),
c(65,20,10,5),
c(80,15,5,0))
```

We now use function `vegdist`

from package `vegan`

to calculate the Bray-Curtis coefficient:

```
D = vegan::vegdist(spdata, "bray")
is.metric(D)
```

`## [1] TRUE`

`D`

```
## 1 2 3
## 2 0.15
## 3 0.30 0.15
## 4 0.45 0.30 0.15
```

This dissimilarity matrix is a metric, so one would not need any transformation for CTA. However, it is a good example to illustrate the effect of the square root transformation.

If we draw the resemblance space corresponding to this dissimilarity matrix we see a straight trajectory:

`trajectoryPCoA(D,sites,surveys)`

Here we see that Bray-Curtis dissimilarity responds linearly to the proposed sequence of community dynamics. To confirm this geometry, we can calculate the geometric properties of the trajectory (i.e. length, angle between consecutive segments and overall directionality):

`trajectoryLengths(D,sites,surveys)`

```
## S1 S2 S3 Trajectory
## 1 0.15 0.15 0.15 0.45
```

`trajectoryAngles(D,sites,surveys)`

```
## S1-S2 S2-S3 mean sd rho
## 1 0 0 0 0 1
```

`trajectoryDirectionality(D,sites,surveys)`

```
## 1
## 1
```

Angles are 0 degrees and overall directionality is maximum (i.e. 1), in accordance with the plot and the data. We now proceed to take the square root of the dissimilarity values, as would be necessary to achieve a metric (and Euclidean) space in a more complex data set:

```
sqrtD = sqrt(D)
sqrtD
```

```
## 1 2 3
## 2 0.3872983
## 3 0.5477226 0.3872983
## 4 0.6708204 0.5477226 0.3872983
```

The transformation increases all dissimilarity values (because the original values are smaller than 1), but the increase is larger for smaller values, so the ratio between large dissimilarities and small dissimilarities decreases. This has an effect on the overall shape of the trajectory, which surprisingly now looks like:

`trajectoryPCoA(sqrtD,sites,surveys)`

In addition to the distortion observed, the number of dimensions of the data have increased (i.e the sum of variance explained by the two axes is 88% < 100%), so we cannot be sure that the angles are well represented. If we re-calculate the properties of the trajectory taking into account all dimensions we obtain:

`trajectoryLengths(sqrtD,sites,surveys)`

```
## S1 S2 S3 Trajectory
## 1 0.3872983 0.3872983 0.3872983 1.161895
```

`trajectoryAngles(sqrtD,sites,surveys)`

```
## S1-S2 S2-S3 mean sd rho
## 1 90 90 90 0 1
```

`trajectoryAngles(sqrtD,sites,surveys, all=TRUE)`

```
## A1 A2 A3 A4 mean sd rho
## 1 90 90 90 90 90 0 1
```

`trajectoryDirectionality(sqrtD,sites,surveys)`

```
## 1
## 0.5
```

The length of segments and the trajectory have increased, but all segments are of the same length, in agreement with the original trajectory. In contrast, the angles are now 90 degrees and the overall directionality has decreased substantially.

Here we use simulated data to compare four transformation approaches:

- Local transformation of semi-metric dissimilarities (such as the percentage difference) in every triplet when the triangle inequality is required. This is done by default in the CTA functions of package
`vegclust`

. - Global transformation of dissimilarities, using the square root.
- Global transformation of dissimilarities by using Principal Coordinates Analysis (classical multidimensional scaling) with eigenvalue correction.
- Global transformation of dissimilarities by using metric multidimensional scaling (metric MDS).

We use simulated dynamics to build another trajectory with more species (20) and more time steps. We begin by setting the number of time steps (50) and the size of the community (50 individuals):

```
Nsteps = 50
CC = 50
Nreplace <- CC*0.05
```

`Nreplace`

is the number of individuals to be replaced each time step (5%). Now we define the initial community vector and the vector with the probabilities of offspring for each species according to some ecological conditions:

```
x <- c(0, 1, 0, 67, 1, 3, 0, 2, 2, 2, 1, 6, 2, 0, 0, 2, 5, 1, 6, 0)
poffspring <- c(0, 0, 0.002, 0.661 ,0 ,0, 0.037, 0.281, 0, 0, 0, 0.008, 0, 0, 0.005, 0.003, 0, 0, 0, 0)
```

We can now simulate the dynamics by sequentially applying stochastic deaths and recruitment:

```
m <- matrix(0, nrow=Nsteps+1, ncol=length(x))
m[1, ] = x
for(k in 1:Nsteps) {
pdeath <-x/sum(x) #Equal probability of dying
deaths<-rmultinom(1,Nreplace, pdeath)
x <- pmax(x - deaths,0)
offspring = rmultinom(1,Nreplace, as.vector(poffspring))
x <- x + offspring
m[k+1, ]<-x
}
```

Then we decide how frequently (with respect to the simulated step) a sample of the community is taken, here every four steps:

```
Sj <- seq(1,Nsteps+1, by=4) #Sample every four steps
mj <- m[Sj,]
surveys = 1:length(Sj)
sites = rep(1,length(Sj))
```

Now we are ready to calculate the Bray-Curtis dissimilarity:

`D <- vegan::vegdist(mj,"bray")`

In this more complex trajectory, some triangles may not obey the triangle inequality (depending on the simulation). This can be inspected using function `is.metric`

:

`is.metric(D, tol=0.0000001)`

`## [1] TRUE`

Deviations from a metric space, if they exist, will be small, so that local transformation of triangles will be very small.

Local transformations are not possible to display trajectories. When we plot the trajectory using function `trajectoryPCoA`

the global transformation of principal coordinates analysis (PCoA) with negative eigenvalue correction is performed. This is fine to display trajectories, but has problems when measuring angular properties, as we will see.

`pcoa<-trajectoryPCoA(D, sites, surveys, selection=1,length=0.1, axes=c(1,2))`

```
## Warning in cmdscale(D2, eig = TRUE, add = TRUE, k = nrow(as.matrix(D2)) - :
## only 11 of the first 12 eigenvalues are > 0
```

`pcoaD = dist(pcoa$points)`

The trajectory has some twists derived from stochasticity in death and recruitment. Let’s now look at the square root of the Bray-Curtis dissimilarity:

```
sqrtD = sqrt(D)
pcoaSqrt = trajectoryPCoA(sqrtD, sites, surveys, selection=1,length=0.1, axes=c(1,2))
```

Finally, we also transform dissimilarities using metric multidimensional scaling (mMDS), provided by package `smacof`

:

```
res <- mds(D, ndim = length(Sj)-1, type = "interval")
mmdsD <- dist(res$conf)
trajectoryPlot(res$conf, sites, surveys, selection=1,length=0.1, axes=c(1,2))
```

While the three plots look different, the differences are not striking (besides rotation issues). We can compare the stress of the global solutions:

`stress0(D,pcoaSqrt$points, type="interval")`

`## [1] 0.05955285`

`stress0(D,pcoa$points, type="interval")`

`## [1] 0.002693561`

`stress0(D,res$conf, type="interval")`

`## [1] 0.009140085`

Where we see that the square root leads to the strongest alteration of original dissimilarities. If we calculate geometric properties we are not limited by ordination plots and we can take into account all dimensions.

```
anglesD = trajectoryAngles(D,sites,surveys)
anglesSqrtD = trajectoryAngles(sqrtD,sites,surveys)
anglesPcoaD = trajectoryAngles(pcoaD,sites,surveys)
anglesmmdsD = trajectoryAngles(mmdsD,sites,surveys)
df<-as.data.frame(rbind(anglesD, anglesSqrtD, anglesPcoaD, anglesmmdsD))
row.names(df)<-c("local", "global.sqrt", "global.pcoa", "global.mmds")
round(df,2)
```

```
## S1-S2 S2-S3 S3-S4 S4-S5 S5-S6 S6-S7 S7-S8 S8-S9 S9-S10
## local 0.00 0.00 71.79 82.82 82.82 109.47 109.47 82.82 75.52
## global.sqrt 90.00 90.00 100.18 104.48 104.48 114.09 114.09 104.48 101.78
## global.pcoa 80.59 63.56 91.84 101.21 101.21 113.20 113.20 101.21 95.51
## global.mmds 1.99 11.31 69.35 90.95 83.41 109.04 104.00 91.32 81.25
## S10-S11 S11-S12 mean sd rho
## local 0.00 67.11 64.58 0.71 0.78
## global.sqrt 90.00 99.59 101.20 0.14 0.99
## global.pcoa 70.17 89.17 92.90 0.27 0.96
## global.mmds 42.09 71.40 70.63 0.60 0.84
```

The first call to `trajectoryAngles`

with matrix `D`

represents the default strategy of transforming triangles locally, which involves the weakest transformation of all and can be taken as reference. Both the square root and PCoA with negative eigenvalue correction induce a strong transformation of trajectory angles. The global solution of metric MDS leads to angles that are more similar to those of the local transformation strategy.

If we inspect the overall directionality, the global solution of metric MDS provides a value that is again closer to that of local transformation, compared to PCoA and the square root:

`trajectoryDirectionality(D,sites,surveys)`

```
## 1
## 0.854166
```

`trajectoryDirectionality(sqrtD,sites,surveys)`

```
## 1
## 0.4845857
```

`trajectoryDirectionality(pcoaD,sites,surveys)`

```
## 1
## 0.6480544
```

`trajectoryDirectionality(mmdsD,sites,surveys)`

```
## 1
## 0.7818728
```

In this small study we compared the effect of different solutions to violation of the triangle inequality. If function `is.metric`

returns TRUE for a given data set one should not worry about violations of the triangle inequality. Local solutions are those that imply the smallest number of changes, but these will not be consistent across triplets, so users may desire to apply a global transformation that produces euclidean spaces. This should be done with care. We have shown how the square root transformation distorts the angles and overall directionality of trajectories on the space defined by the percentage difference (alias Bray-Curtis) dissimilarity index. We suspect that this negative effect of square root transformation on angles happens no matter which coefficient is used to derive the initial distance matrix. Therefore, we advocate for avoiding its use when conducting community trajectory analysis (in particular for angles). The global transformation consisting in the application of PCoA with negative eigenvalue correction is less strong than the square root, but it still strongly change the angles between segments. Perhaps, the less harmful global transformation is provided by metric Multidimensional Scaling, but the need to embed distances in an Euclidean space of all three transformations implies a stronger requirement than being a metric, and results in distortions.