It may be useful to retrieve multiple or all of the resources for a package. For example, the TTC Subway Delay Data is released monthly, with a new file each month.
library(opendatatoronto)
ttc_subway_delays_resources <- list_package_resources("https://open.toronto.ca/dataset/ttc-subway-delay-data/")
ttc_subway_delays_resources
#> # A tibble: 33 x 4
#> name id format last_modified
#> <chr> <chr> <chr> <date>
#> 1 ttc-subway-delay-codes fece136b-224a-412a-b191-… XLSX 2019-08-15
#> 2 ttc-subway-delay-jan-201… 6664420f-316f-4f94-9ba4-… XLSX 2019-08-15
#> 3 ttc-subway-delay-may-2017 d1159888-0035-45a0-b238-… XLSX 2019-08-15
#> 4 ttc-subway-delay-june-20… 240d8e8c-d300-4f91-b94f-… XLSX 2019-08-15
#> 5 ttc-subway-delay-july-20… 98d4ac77-aa9f-40a3-97ee-… XLSX 2019-08-15
#> 6 ttc-subway-delay-august-… 92e7649a-cf2f-4ac7-9802-… XLSX 2019-08-15
#> 7 ttc-subway-delay-septemb… 61412f10-656b-4992-9a1a-… XLSX 2019-08-15
#> 8 ttc-subway-delay-october… 69a6db37-7982-49c7-8dbc-… XLSX 2019-08-15
#> 9 ttc-subway-delay-novembe… 10080217-8022-41c0-a8ba-… XLSX 2019-08-15
#> 10 ttc-subway-delay-decembe… a731c4bb-630a-4530-b590-… XLSX 2019-08-15
#> # … with 23 more rows
It would be cumbersome to retrieve them one at a time, e.g. via
library(dplyr)
delays_jan_2019 <- ttc_subway_delays_resources %>%
filter(name == "ttc-subway-delay-january-2019") %>%
get_resource()
delays_feb_2019 <- ttc_subway_delays_resources %>%
filter(name == "ttc-subway-delay-february-2019") %>%
get_resource()
and so on.
Instead, you can iterate through the different resources using the purrr
package and get them in a nested tibble:
library(purrr)
ttc_subway_delays_2019 <- ttc_subway_delays_resources %>%
filter(grepl("2019", name)) %>%
mutate(data = map(id, get_resource))
ttc_subway_delays_2019
#> # A tibble: 10 x 5
#> name id format last_modified data
#> <chr> <chr> <chr> <date> <list>
#> 1 ttc-subway-delay-… 9a824dba-20cc-40b1… XLSX 2019-08-15 <tibble [1,…
#> 2 ttc-subway-delay-… e6bac74e-2da2-4429… XLSX 2019-08-15 <tibble [1,…
#> 3 ttc-subway-delay-… 0511879f-3233-4a42… XLSX 2019-08-15 <tibble [1,…
#> 4 ttc-subway-delay-… 447b4a5a-f696-4f05… XLSX 2019-08-15 <tibble [1,…
#> 5 ttc-subway-delay-… a302fcab-81a1-4142… XLSX 2019-08-15 <tibble [1,…
#> 6 ttc-subway-delay-… 655a138c-d381-4fe7… XLSX 2019-08-15 <tibble [1,…
#> 7 ttc-subway-delay-… 34d9619f-0239-4dad… XLSX 2019-10-07 <tibble [1,…
#> 8 ttc-subway-delay-… b6557580-a0f4-4c96… XLSX 2019-10-07 <tibble [1,…
#> 9 ttc-subway-delay-… e2a5e386-ddf7-4416… XLSX 2019-10-07 <tibble [1,…
#> 10 "ttc-subway-delay… fd837bd2-85ed-485e… XLSX 2019-11-08 <tibble [1,…
All of these resources contain equivalent data, with the same column names
map(ttc_subway_delays_2019[["data"]], colnames)
#> [[1]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[2]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[3]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[4]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[5]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[6]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[7]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[8]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[9]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
#>
#> [[10]]
#> [1] "Date" "Time" "Day" "Station" "Code"
#> [6] "Min Delay" "Min Gap" "Bound" "Line" "Vehicle"
so the data can easily be unnested and combined:
bind_rows(ttc_subway_delays_2019[["data"]])
#> # A tibble: 16,185 x 10
#> Date Time Day Station Code `Min Delay` `Min Gap`
#> <dttm> <chr> <chr> <chr> <chr> <dbl> <dbl>
#> 1 2019-01-01 00:00:00 01:08 Tues… YORK M… PUSI 0 0
#> 2 2019-01-01 00:00:00 02:14 Tues… ST AND… PUMST 0 0
#> 3 2019-01-01 00:00:00 02:16 Tues… JANE S… TUSC 0 0
#> 4 2019-01-01 00:00:00 02:27 Tues… BLOOR … SUO 0 0
#> 5 2019-01-01 00:00:00 03:03 Tues… DUPONT… MUATC 11 16
#> 6 2019-01-01 00:00:00 03:08 Tues… EGLINT… EUATC 11 16
#> 7 2019-01-01 00:00:00 03:09 Tues… DUPONT… EUATC 6 11
#> 8 2019-01-01 00:00:00 03:26 Tues… ST CLA… EUATC 4 9
#> 9 2019-01-01 00:00:00 03:37 Tues… KENNED… TUMVS 0 0
#> 10 2019-01-01 00:00:00 08:04 Tues… DAVISV… MUNOA 5 10
#> # … with 16,175 more rows, and 3 more variables: Bound <chr>, Line <chr>,
#> # Vehicle <dbl>