Time series come with a strict temporal order that dictates the type of operations that can be performed. One example is the moving average, where a window slides along the time order and the response is averaged over each temporal subset. The tsibble package provides three families of moving window operations, called verbs, that operate on temporal data objects (nouns):

- `slide()`/`slide2()`/`pslide()`: sliding window with overlapping observations.
- `tile()`/`tile2()`/`ptile()`: tiling window without overlapping observations.
- `stretch()`/`stretch2()`/`pstretch()`: fixing an initial window and expanding to include more observations.
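To make the difference between the three window types concrete, here is a minimal base R sketch that lists which positions each kind of window would cover. It does not call tsibble; the values of `n` and `size` and all object names are illustrative assumptions, and how tsibble treats a trailing incomplete tile is not something this sketch claims to reproduce.

```r
n <- 7L      # number of observations (illustrative)
size <- 3L   # window size (illustrative)

# Sliding: the window moves one position at a time, so windows overlap.
sliding <- lapply(seq_len(n - size + 1L), function(i) i:(i + size - 1L))

# Tiling: the window jumps by its own size, so windows never overlap.
tile_starts <- seq(1L, n, by = size)
tiling <- lapply(tile_starts, function(s) s:min(s + size - 1L, n))

# Stretching: the start is fixed and the window grows by one each step.
stretching <- lapply(size:n, function(end) 1:end)

print(lengths(sliding))     # 3 3 3 3 3
print(lengths(tiling))      # 3 3 1
print(lengths(stretching))  # 3 4 5 6 7
```

Every position appears in exactly one tile, in up to `size` sliding windows, and in every stretching window that ends at or after it.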

These functions handle all sorts of objects and feature a *purrr*-like interface. In this vignette, I will walk you through `slide()` and its variants, but the example snippets are also applicable to `tile()` and `stretch()`.

In the spirit of `purrr::map()`, `slide()` accepts one input, `slide2()` two inputs, and `pslide()` multiple inputs, all of which always return lists for the sake of type stability. Other variants, including `slide_lgl()`, `slide_int()`, `slide_dbl()`, and `slide_chr()`, return vectors of the corresponding type, while `slide_dfr()` and `slide_dfc()` row-bind and column-bind data frames respectively. This full-fledged window family lets users build window-related workflows in many ways, from fixed window sizes to calendar periods, and from moving averages to model fitting.
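The list-versus-typed-vector distinction can be illustrated with base R alone. The sketch below is only an analogy, not tsibble code: `lapply()` stands in for a list-returning function such as `slide()`, and `vapply()` for a typed variant such as `slide_dbl()` or `slide_chr()`. The `windows` object is a made-up set of three already-extracted windows.

```r
# Three hypothetical windows of size 3, already extracted from a series.
windows <- list(c(1, 2, 3), c(2, 3, 4), c(3, 4, 5))

# List output: one element per window, whatever the function returns.
as_list <- lapply(windows, mean)

# Typed output: each result must be a single double, or vapply() errors.
as_dbl <- vapply(windows, mean, numeric(1))

# Typed output of another type: each result must be a single string.
as_chr <- vapply(windows, function(w) paste(w, collapse = "-"), character(1))

print(as_dbl)  # 2 3 4
print(as_chr)  # "1-2-3" "2-3-4" "3-4-5"
```

The typed forms fail early when a window yields something of the wrong type or length, which is the practical benefit of choosing them over the list form.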

The `pedestrian` dataset includes hourly pedestrian counts in the city of Melbourne, with `Sensor` as key and `Date_Time` as index. To tackle general problems, these window functions roll over positions in the input rather than over the time index. Implicit missing values are therefore made explicit using `fill_gaps()`, and `.full = TRUE` ensures that every sensor spans the same time length. This prepares the data inputs in the expected order.

```
library(tsibble)
library(dplyr)
pedestrian_full <- pedestrian %>%
  fill_gaps(.full = TRUE)
pedestrian_full
#> # A tsibble: 70,176 x 5 [1h] <Australia/Melbourne>
#> # Key: Sensor [4]
#> Sensor Date_Time Date Time Count
#> <chr> <dttm> <date> <int> <int>
#> 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630
#> 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01 1 826
#> 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01 2 567
#> 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01 3 264
#> 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01 4 139
#> # … with 7.017e+04 more rows
```

The moving average is one of the common techniques for smoothing time series. We can easily apply a daily window smoother (a fixed window size of 24) to each sensor. With `.fill = NA` (the default), `slide()` returns an output of the same length as the input, and with `.align = "center-left"` the `NA` padding falls at both sides of the data range, so that the result fits neatly into `mutate()`. `slide_dbl()` collects the values returned by `mean()` into a numeric vector.

```
pedestrian_full %>%
  group_by(Sensor) %>%
  mutate(Daily_MA = slide_dbl(Count,
    mean, na.rm = TRUE, .size = 24, .align = "center-left"
  ))
#> # A tsibble: 70,176 x 6 [1h] <Australia/Melbourne>
#> # Key: Sensor [4]
#> # Groups: Sensor [4]
#> Sensor Date_Time Date Time Count Daily_MA
#> <chr> <dttm> <date> <int> <int> <dbl>
#> 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630 NA
#> 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01 1 826 NA
#> 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01 2 567 NA
#> 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01 3 264 NA
#> 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01 4 139 NA
#> # … with 7.017e+04 more rows
```

To make this even-order moving average symmetric, a second moving average with `.size = 2` should be applied to `Daily_MA`.
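Why a second moving average of order 2 restores symmetry can be checked by hand. The base R sketch below uses an order-4 window instead of 24 to keep the arithmetic readable; the data vector and all names are illustrative assumptions, and the loops are written out rather than calling tsibble. It compares the two-step result with a single weighted average whose weights are symmetric around the current position.

```r
x <- c(5, 3, 8, 6, 2, 9, 4, 7, 1, 6)  # made-up series
n <- length(x)

# Step 1: order-4 moving average over positions (t - 2):(t + 1).
# An even window cannot be centred, so it leans one step to the left.
ma4 <- rep(NA_real_, n)
for (t in 3:(n - 1)) ma4[t] <- mean(x[(t - 2):(t + 1)])

# Step 2: order-2 moving average of ma4 over positions t:(t + 1).
ma2x4 <- rep(NA_real_, n)
for (t in 1:(n - 1)) ma2x4[t] <- mean(ma4[t:(t + 1)])

# Equivalent single pass: symmetric weights 1/8, 1/4, 1/4, 1/4, 1/8
# applied to positions (t - 2):(t + 2).
w <- c(1, 2, 2, 2, 1) / 8
direct <- as.numeric(stats::filter(x, w, sides = 2))

print(all.equal(ma2x4, direct))  # TRUE
```

The same reasoning carries over to a window of 24: averaging two adjacent order-24 averages gives half weight to the two outermost hours and equal weight to the 23 in between.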

What if the time period we’d like to slide over is not a fixed window size, for example sliding over three months? The preprocessing step is to wrap observations into monthly subsets (a list of tsibbles) using `nest()`.

```
pedestrian_mth <- pedestrian_full %>%
  mutate(YrMth = yearmonth(Date_Time)) %>%
  nest(-Sensor, -YrMth)
pedestrian_mth
#> # A tibble: 96 x 3
#> Sensor YrMth data
#> <chr> <mth> <list>
#> 1 Birrarung Marr 2015 Jan <tsibble [744 × 4]>
#> 2 Birrarung Marr 2015 Feb <tsibble [672 × 4]>
#> 3 Birrarung Marr 2015 Mar <tsibble [744 × 4]>
#> 4 Birrarung Marr 2015 Apr <tsibble [721 × 4]>
#> 5 Birrarung Marr 2015 May <tsibble [744 × 4]>
#> # … with 91 more rows
```

Now it’s ready to (rock and) roll. When setting `.size = 1` in `slide()`, it behaves exactly the same as `purrr::map()`, mapping over each element in the object. To average over three months, however, either `(1)` a bundle of 3 subsets (`.size = 3`) needs to be bound together first before the average count is computed; or `(2)` `.bind = TRUE` takes care of binding the data frames by row. Gluing simple operations together like this makes complex tasks easier to comprehend.

```
pedestrian_mth %>%
  group_by(Sensor) %>%
  # (1)
  # mutate(Monthly_MA = slide_dbl(data,
  #   ~ mean(bind_rows(.)$Count, na.rm = TRUE), .size = 3, .align = "center"
  # ))
  # (2) equivalent to (1)
  mutate(Monthly_MA = slide_dbl(data,
    ~ mean(.$Count, na.rm = TRUE), .size = 3, .align = "center", .bind = TRUE
  ))
#> # A tibble: 96 x 4
#> # Groups: Sensor [4]
#> Sensor YrMth data Monthly_MA
#> <chr> <mth> <list> <dbl>
#> 1 Birrarung Marr 2015 Jan <tsibble [744 × 4]> NA
#> 2 Birrarung Marr 2015 Feb <tsibble [672 × 4]> 634.
#> 3 Birrarung Marr 2015 Mar <tsibble [744 × 4]> 546.
#> 4 Birrarung Marr 2015 Apr <tsibble [721 × 4]> 554.
#> 5 Birrarung Marr 2015 May <tsibble [744 × 4]> 397.
#> # … with 91 more rows
```
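The bind-then-average step can be spelled out in base R on toy data. This is a hedged sketch of the idea rather than of tsibble's internals: `months` is a made-up list of four monthly data frames of unequal length, and the loop mimics a centred window of three list elements with `NA` at both ends.

```r
# Four hypothetical monthly subsets with different numbers of rows.
months <- list(
  data.frame(Count = c(10, 20)),
  data.frame(Count = c(30, 40, 50)),
  data.frame(Count = 60),
  data.frame(Count = c(70, 80))
)

n <- length(months)
monthly_ma <- rep(NA_real_, n)
for (i in 2:(n - 1)) {
  # Bind the previous, current and next month by row, then average.
  block <- do.call(rbind, months[(i - 1):(i + 1)])
  monthly_ma[i] <- mean(block$Count, na.rm = TRUE)
}

print(monthly_ma)  # NA 35 55 NA
```

Binding first means every observation counts equally, so longer months weigh more than shorter ones; averaging the three monthly means instead would generally give a different number.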

We have had a glimpse of the row-oriented workflow for sliding over consecutive months using `nest()` in the preceding example. To leverage this workflow further, we can fit a linear model for each sensor simultaneously but independently, and in turn obtain its fitted values and residuals over weekly rolling windows. This is where `pslide()` comes into play. It takes a list or a data frame (multiple inputs) and applies the custom function `my_diag()` to every rolling block. We start with a tsibble and end up with a considerably larger diagnostic tibble.

```
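# Fit a linear model on one rolling block and return its diagnostics
my_diag <- function(...) {
  data <- tibble(...)
  fit <- lm(Count ~ Time, data = data)
  list(fitted = fitted(fit), resid = residuals(fit))
}
pedestrian %>%
  filter_index(~ "2015-03") %>%
  nest(-Sensor) %>%
  # slide row-wise over all columns of each sensor's data, one week at a time
  mutate(diag = purrr::map(data, ~ pslide_dfr(., my_diag, .size = 24 * 7)))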
#> # A tibble: 4 x 3
#> Sensor data diag
#> <chr> <list> <list>
#> 1 Birrarung Marr <tsibble [2,160 × 4]> <tibble [334,825 × 2…
#> 2 Bourke Street Mall (North) <tsibble [1,032 × 4]> <tibble [145,321 × 2…
#> 3 QV Market-Elizabeth St (West) <tsibble [2,160 × 4]> <tibble [334,825 × 2…
#> 4 Southern Cross Station <tsibble [2,160 × 4]> <tibble [334,825 × 2…
```

Why does `slide()` not work in this case? It is intended to work with a list, i.e. column by column for a data frame. Here, however, we slide row-wise over multiple columns of a data frame at once, and `pslide()` does the job of handling multiple lists. Is this example taking a while to run? Time to bring in **furrr** for parallel processing. The multiprocessing counterparts of these functions are all prefixed with `future_`.
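Row-wise sliding over several columns at once can be sketched in base R as follows. This is an illustration under stated assumptions, not tsibble's implementation: the data frame `df`, the window size of 6 and the helper name `roll_diag` are all made up, and no padding is added for incomplete windows.

```r
# Made-up data: 12 rows with the two columns the model needs.
df <- data.frame(
  Time = rep(0:3, 3),
  Count = c(12, 15, 11, 18, 20, 17, 25, 22, 19, 27, 30, 26)
)
size <- 6

# Hypothetical helper: fit the model on rows s:(s + size - 1) and
# return fitted values and residuals as a data frame.
roll_diag <- function(s) {
  block <- df[s:(s + size - 1), ]
  fit <- lm(Count ~ Time, data = block)
  data.frame(fitted = unname(fitted(fit)), resid = unname(residuals(fit)))
}

# One block of `size` rows per complete window, bound by row.
starts <- seq_len(nrow(df) - size + 1)
diag <- do.call(rbind, lapply(starts, roll_diag))

print(nrow(diag))  # (12 - 6 + 1) * 6 = 42
```

Each window contributes as many rows as it contains, which is why the diagnostic table grows far beyond the size of the input.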

The `slide()` examples above default to sliding over complete windows. In some cases, you may find partial sliding more appropriate, which can be enabled with `.partial = TRUE`. Additionally, whereas a positive `.size` moves the window forward, a negative one moves it backward.
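The contrast between complete and partial windows can be shown with a small base R helper. This is one plausible reading of partial sliding, written from scratch: `slide_mean()` and its `partial` argument are hypothetical names, the window is right-aligned by assumption, and tsibble's own treatment of incomplete windows (for instance, whether fill values are passed to the function) may differ.

```r
# Hypothetical helper: right-aligned rolling mean over `size` positions.
# With partial = FALSE, incomplete windows at the start yield NA;
# with partial = TRUE, they are averaged over the values available.
slide_mean <- function(x, size, partial = FALSE) {
  vapply(seq_along(x), function(i) {
    start <- i - size + 1
    if (start < 1 && !partial) return(NA_real_)
    mean(x[max(start, 1):i])
  }, numeric(1))
}

x <- c(2, 4, 6, 8, 10)
print(slide_mean(x, 3))                  # NA NA 4 6 8
print(slide_mean(x, 3, partial = TRUE))  # 2 3 4 6 8
```

Partial windows avoid losing the edges of the series, at the cost of edge values being based on fewer observations than the rest.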