**NOTE: The rolling window family will be deprecated in the future. Please consider using the slide package instead.**

Time series come with a strict temporal order that dictates the types of operations that can be performed. One example is the moving average, where a window slides along the time order and the average of the response is computed over each subset. The tsibble package provides three moving window operations:

* `slide()`/`slide2()`/`pslide()`: sliding window with overlapping observations.
* `tile()`/`tile2()`/`ptile()`: tiling window without overlapping observations.
* `stretch()`/`stretch2()`/`pstretch()`: fixing an initial window and expanding to include more observations.
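The three window shapes can be sketched in base R on a short vector. This is only an illustration of the idea, not tsibble's implementation; the object names and the choice of an initial stretch window of 2 are assumptions made for the example.

```r
# Base R sketch of the three window shapes (object names are hypothetical).
x <- 1:5
size <- 2

# Sliding: overlapping windows that advance one position at a time.
slide_windows <- lapply(
  seq_len(length(x) - size + 1),
  function(i) x[i:(i + size - 1)]
)

# Tiling: consecutive blocks that do not overlap; the last block may be shorter.
tile_windows <- unname(split(x, ceiling(seq_along(x) / size)))

# Stretching: start from an initial window (here of length `size`, an assumption)
# and add one more observation at each step.
stretch_windows <- lapply(size:length(x), function(n) x[seq_len(n)])

str(slide_windows)
str(tile_windows)
str(stretch_windows)
```

Sliding yields four overlapping pairs, tiling yields three disjoint blocks, and stretching yields four windows that all start at the first observation.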

These functions handle all sorts of inputs (not limited to tsibble) and feature a *purrr*-like interface. In this vignette, I will walk you through `slide()` and its variants, but the example snippets are also applicable to `tile()` and `stretch()`.

In the spirit of the `purrr::map()` family, `slide()` accepts one input, `slide2()` two inputs, and `pslide()` multiple inputs, all of which always return lists for the sake of type stability. Other variants, including `slide_lgl()`, `slide_int()`, `slide_dbl()`, and `slide_chr()`, return vectors of the corresponding type, while `slide_dfr()` and `slide_dfc()` row-bind and column-bind data frames respectively. This full-fledged window family empowers users to build window-related workflows in all sorts of ways, from fixed window sizes to calendar periods, and from moving averages to model fitting.
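The difference between a list-returning mapper and a typed variant can be illustrated with base R's `lapply()` and `vapply()`. This is an analogy only, assuming a window size of 3 over complete windows; it does not call the tsibble functions.

```r
# Base R analogy for list-returning versus typed window functions.
x <- c(3, 1, 4, 1, 5)
size <- 3
starts <- seq_len(length(x) - size + 1)

# List output: one element per complete window, whatever the function returns.
as_list <- lapply(starts, function(i) mean(x[i:(i + size - 1)]))

# Typed output: vapply() insists that every result is a single double,
# and fails loudly otherwise.
as_dbl <- vapply(starts, function(i) mean(x[i:(i + size - 1)]), numeric(1))

str(as_list)
str(as_dbl)
```

The list form is the safe default because it accepts any return value; the typed form is more convenient when the result is known to be one number per window.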

The `pedestrian` dataset includes hourly pedestrian counts in the city of Melbourne, with `Sensor` as key and `Date_Time` as index. These window functions roll over the positions of observations rather than over the time index itself, which keeps them general-purpose. Implicit missing values are therefore made explicit using `fill_gaps()`, and `.full = TRUE` ensures that every sensor spans the same time length. This prepares the data inputs in the expected order.

```
library(dplyr)
library(tidyr)
library(tsibble)
pedestrian_full <- pedestrian %>%
  fill_gaps(.full = TRUE)
pedestrian_full
#> # A tsibble: 70,176 x 5 [1h] <Australia/Melbourne>
#> # Key: Sensor [4]
#> Sensor Date_Time Date Time Count
#> <chr> <dttm> <date> <int> <int>
#> 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630
#> 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01 1 826
#> 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01 2 567
#> 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01 3 264
#> 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01 4 139
#> # … with 7.017e+04 more rows
```
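Why the gaps matter for position-based windows can be seen in a small base R sketch: a window of 24 rows only covers a day if no hourly row is absent. The toy data frame and its column names below are made up for illustration, and `merge()` stands in for the gap-filling step.

```r
# Hourly readings where the 02:00 observation is absent (an implicit gap).
obs <- data.frame(
  time = as.POSIXct(
    c("2015-01-01 00:00", "2015-01-01 01:00", "2015-01-01 03:00"),
    tz = "UTC"
  ),
  count = c(10L, 20L, 40L)
)

# Build the complete hourly sequence, then left-join the observations onto it,
# so that the missing hour shows up as an explicit NA row.
full <- data.frame(time = seq(min(obs$time), max(obs$time), by = "1 hour"))
filled <- merge(full, obs, by = "time", all.x = TRUE)

filled
```

After filling, row positions and elapsed hours line up again, so a fixed number of rows corresponds to a fixed span of time.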

The moving average is one of the most common techniques for smoothing time series. We can easily apply a daily window smoother (a fixed window size of 24) to each sensor. `slide()` returns an output of the same length as the input: with `.align = "center-left"`, the incomplete windows at both ends of the data range are padded with `.fill = NA` (the default), so that the result fits into `mutate()` in harmony. `slide_dbl()` collects the values returned by `mean()` into a numeric vector.

```
pedestrian_full %>%
  group_by_key() %>%
  mutate(Daily_MA = slide_dbl(Count,
    mean, na.rm = TRUE, .size = 24, .align = "center-left"
  ))
#> # A tsibble: 70,176 x 6 [1h] <Australia/Melbourne>
#> # Key: Sensor [4]
#> # Groups: Sensor [4]
#> Sensor Date_Time Date Time Count Daily_MA
#> <chr> <dttm> <date> <int> <int> <dbl>
#> 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630 NA
#> 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01 1 826 NA
#> 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01 2 567 NA
#> 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01 3 264 NA
#> 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01 4 139 NA
#> # … with 7.017e+04 more rows
```

To make this even-order moving average symmetric, a second moving average with `.size = 2` should be applied to `Daily_MA`.
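The two-pass idea can be checked by hand in base R with a window of 4 instead of 24. The sketch assumes that "center-left" places each result at the left of the two middle positions of its window, and that the second pass is right-aligned; both are assumptions of this example rather than statements about tsibble internals.

```r
x <- c(2, 4, 6, 8, 10, 12)
size <- 4
n <- length(x)

# First pass: mean of every complete window of `size` observations.
ma <- vapply(
  seq_len(n - size + 1),
  function(i) mean(x[i:(i + size - 1)]),
  numeric(1)
)

# Assumed "center-left" placement: pad floor((size - 1) / 2) NAs on the left
# and ceiling((size - 1) / 2) NAs on the right.
ma4 <- c(
  rep(NA_real_, floor((size - 1) / 2)),
  ma,
  rep(NA_real_, ceiling((size - 1) / 2))
)

# Second pass: right-aligned average of each value and its predecessor,
# which re-centres the even-order average on an actual observation.
ma2x4 <- c(NA_real_, (ma4[-n] + ma4[-1]) / 2)

rbind(x, ma4, ma2x4)
```

Because `x` is linear, the symmetric 2x4 average reproduces `x` at the interior positions 3 and 4, which is a quick sanity check that the result is centred.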

What if the time period we’d like to slide over is not a fixed window size, for example sliding over three months? The preprocessing step is to wrap observations into monthly subsets (a list of tsibbles) using `nest()`.

```
pedestrian_mth <- pedestrian_full %>%
  mutate(YrMth = yearmonth(Date_Time)) %>%
  nest(data = c(-Sensor, -YrMth))
pedestrian_mth
#> # A tibble: 96 x 3
#> Sensor YrMth data
#> <chr> <mth> <list>
#> 1 Birrarung Marr 2015 Jan <tsibble [744 × 4]>
#> 2 Birrarung Marr 2015 Feb <tsibble [672 × 4]>
#> 3 Birrarung Marr 2015 Mar <tsibble [744 × 4]>
#> 4 Birrarung Marr 2015 Apr <tsibble [721 × 4]>
#> 5 Birrarung Marr 2015 May <tsibble [744 × 4]>
#> # … with 91 more rows
```

Now it’s ready to (rock and) roll. When setting `.size = 1` in `slide()`, it behaves exactly the same as `purrr::map()`, mapping over each element in the object. Here, however, `(1)` a bundle of 3 subsets (`.size = 3`) needs to be bound first, and then the average count is computed; `(2)` alternatively, `.bind = TRUE` takes care of binding data frames by row. Gluing simple operations together in this way makes complex tasks easier to comprehend.

```
pedestrian_mth %>%
  group_by(Sensor) %>%
  # (1)
  # mutate(Monthly_MA = slide_dbl(data,
  #   ~ mean(bind_rows(.)$Count, na.rm = TRUE), .size = 3, .align = "center"
  # ))
  # (2) equivalent to (1)
  mutate(Monthly_MA = slide_dbl(data,
    ~ mean(.$Count, na.rm = TRUE), .size = 3, .align = "center", .bind = TRUE
  ))
#> # A tibble: 96 x 4
#> # Groups: Sensor [4]
#> Sensor YrMth data Monthly_MA
#> <chr> <mth> <list> <dbl>
#> 1 Birrarung Marr 2015 Jan <tsibble [744 × 4]> NA
#> 2 Birrarung Marr 2015 Feb <tsibble [672 × 4]> 634.
#> 3 Birrarung Marr 2015 Mar <tsibble [744 × 4]> 546.
#> 4 Birrarung Marr 2015 Apr <tsibble [721 × 4]> 554.
#> 5 Birrarung Marr 2015 May <tsibble [744 × 4]> 397.
#> # … with 91 more rows
```
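Binding the subsets before averaging matters because calendar months differ in length. A base R sketch on made-up monthly data frames (the names and values are hypothetical) shows the pooled three-month mean with centre alignment.

```r
# Four "months" with different numbers of observations (toy data).
monthly <- list(
  Jan = data.frame(Count = c(1, 2, 3)),
  Feb = data.frame(Count = c(4, 6)),
  Mar = data.frame(Count = 8),
  Apr = data.frame(Count = c(10, 12))
)
size <- 3
n <- length(monthly)

# For each run of three consecutive months, row-bind the data frames first,
# then average all the pooled counts.
pooled <- vapply(seq_len(n - size + 1), function(i) {
  block <- do.call(rbind, monthly[i:(i + size - 1)])
  mean(block$Count)
}, numeric(1))

# Centre alignment for an odd window: one NA of padding on each side.
monthly_ma <- c(NA_real_, pooled, NA_real_)
monthly_ma
```

For the first window the pooled mean is 4, whereas averaging the three monthly means (2, 5 and 8) would give 5: pooling weights every observation equally instead of every month equally.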

We have had a glimpse of the row-oriented workflow for sliding over consecutive months using `nest()` in the preceding example. To leverage this workflow further, we can fit a linear model for each sensor simultaneously but independently, and in turn obtain its fitted values and residuals over weekly rolling windows. This is where `pslide()` comes into play. It takes a list or a data frame (multiple inputs) and applies the custom function `my_diag()` to every rolling block. We start with a tsibble and end up with a diagnostic tibble of considerably larger size.

```
my_diag <- function(...) {
  data <- tibble(...)
  fit <- lm(Count ~ Time, data = data)
  list(fitted = fitted(fit), resid = residuals(fit))
}
pedestrian %>%
  filter_index(~ "2015-03") %>%
  group_by_key() %>%
  nest() %>%
  mutate(diag = purrr::map(data, ~ pslide_dfr(., my_diag, .size = 24 * 7)))
#> # A tibble: 4 x 3
#> # Groups: Sensor [4]
#> Sensor data diag
#> <chr> <list> <list>
#> 1 Birrarung Marr <tsibble [2,160 × 4]> <tibble [334,825 × 2…
#> 2 Bourke Street Mall (North) <tsibble [1,032 × 4]> <tibble [145,321 × 2…
#> 3 QV Market-Elizabeth St (West) <tsibble [2,160 × 4]> <tibble [334,825 × 2…
#> 4 Southern Cross Station <tsibble [2,160 × 4]> <tibble [334,825 × 2…
```

Why does `slide()` not work in this case? It is intended to work with a list, i.e. column by column for a data frame. Here, however, we slide row-wise over multiple columns of a data frame at once, and `pslide()` does the job of handling multiple lists. Is this example running a bit long? Time to kick **furrr** in for parallel processing: the multiprocessing counterparts are all prefixed with `future_`.
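Row-wise sliding over several columns can be sketched in base R by slicing blocks of rows and fitting a model to each block. The toy data frame and the window size of 4 are assumptions for illustration; the sketch mirrors the idea of the example above without using `pslide_dfr()`.

```r
# Toy data with the two columns the model needs.
df <- data.frame(Time = 1:6, Count = c(2, 4, 6, 9, 10, 12))
size <- 4
starts <- seq_len(nrow(df) - size + 1)

# Each window is a block of consecutive rows across all columns; the model is
# refitted on every block and its diagnostics are stacked by row.
diag <- do.call(rbind, lapply(starts, function(i) {
  block <- df[i:(i + size - 1), ]
  fit <- lm(Count ~ Time, data = block)
  data.frame(
    window = i,
    fitted = unname(fitted(fit)),
    resid = unname(residuals(fit))
  )
}))

nrow(diag)
head(diag)
```

Each window contributes `size` rows, so the stacked result has roughly (number of windows) x (window size) rows, which is why the diagnostic table grows much larger than the input.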

The `slide()` examples default to sliding over complete sets. In some cases, you may find partial sliding more appropriate, which can be enabled by `.partial = TRUE`. Additionally, whereas a positive `.size` moves the window forward, a negative one moves it backward.
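The contrast between complete and partial windows can be sketched in base R. This shows one reading of partial sliding, in which the leading windows are simply computed on fewer observations; it assumes right alignment and is not meant to reproduce the exact behaviour of `.partial = TRUE`.

```r
x <- c(1, 2, 3, 4, 5)
size <- 3

# Complete windows only: positions without `size` observations behind them get NA.
complete <- vapply(seq_along(x), function(i) {
  if (i < size) NA_real_ else mean(x[(i - size + 1):i])
}, numeric(1))

# Partial windows (assumed semantics): use whatever observations are available.
partial <- vapply(seq_along(x), function(i) {
  mean(x[max(1, i - size + 1):i])
}, numeric(1))

rbind(complete, partial)
```

The two results agree wherever a full window exists and differ only at the leading edge, where the partial version trades padding for estimates based on less data.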