# Using riding_binplot to create tile grid maps

## Description

The riding_binplot() function takes data at the federal riding level as its input and creates a tile grid map to visualize these data with ggplot2.

## Overview

Although Canada is a very large country, around 80% of its population is concentrated in urban areas. Because of this, the majority of federal electoral ridings are small and urban while a much smaller number of ridings in sparsely-populated regions account for the majority of Canada’s landmass. Because of this, visualizing statistics at the riding level with standard cloropeth maps is not ideal. The riding_binplot() function of the mapcan package offers an alternative: the tile grid map. Inspired by the statebins package (Rudis 2017), federal riding tile grid maps are a way of visualizing riding-level data when it is not neccessary to accurately represent the ridings geographically.

## Using riding_binplot()

### Categorical variables

#### Preparing the data

riding_binplot() requires a data frame that includes a riding-level variable as well as numeric riding code variable. Specify the riding-level variable of interest in the value_col argument. Specify the numeric riding code variable in the riding_col argument. For instance, we may want to create a tile plot for the seat distribution of the 2015 federal election.

Luckily, mapcan comes with built-in federal election data (the federal_election_results data frame) that dates back to 1996 with variables for riding codes/names, provinces, population, voter turnout and the winning party for each riding. Let’s restrict federal_election_results to include only observations (ridings) from the 2015 election. Because we only need (1) a riding characteristic variable and (2) a riding code variable, we can select only these columns from the data.

fed_2015 <- federal_election_results %>%
filter(election_year == 2015) %>%
dplyr::select(riding_code, party)

Note: We remove all irrelevant variables from the data frame to illustrate more clearly how riding_binplot operates. They do not need to be removed in practice.

This new data frame has the two columns that we need: one to specify the riding code (riding_code) and another to specify a riding-level characteristic of interest (party). Let’s look at a sample:

fed_2015 %>%
sample_n(10)
#>    riding_code   party
#> 1        24024 Liberal
#> 2        59009     NDP
#> 3        24072 Liberal
#> 4        59008 Liberal
#> 5        10005 Liberal
#> 6        24026 Liberal
#> 7        35007 Liberal
#> 8        35042 Liberal
#> 9        35052 Liberal
#> 10       48014 Liberal

Looking good.

#### Plotting the data

Use continuous = TRUE for continuous variables and continuous = FALSE for categorical variables. party is a categorical variable, so we will specify continuous = FALSE:

fed_2015_bins <- fed_2015 %>%
riding_binplot(value_col = party,
continuous = FALSE) +
ggtitle("Tile grid map of 2015 federal election results")
fed_2015_bins

You will notice that the axis text has no substantive significance (it just the coordinates of the tiles). You can remove it, along with the axis ticks and background grid using theme_mapcan function, a ggplot theme that is part of the mapcan package.

fed_2015_bins <- fed_2015_bins +
theme_mapcan()

fed_2015_bins

You will also notice that the colours provided by default from riding_binplot are not ideal for a tile plot with Canadian political parties. Because riding_binplot simply creates a ggplot object, we can override the riding_binplot scale and add our own:

fed_2015_bins +
scale_fill_manual(name = "Party",
values = c("mediumturquoise", "blue", "springgreen3", "red", "orange"))
#> Scale for 'fill' is already present. Adding another scale for 'fill',
#> which will replace the existing scale.

In this tile grid map, tiles are arranged by province yet, because seats are highly concentrated in urban areas, each tile only roughly corresponds to geographic location of the riding it represents. The arrange = TRUE option provides a better representation of the disribution of variables within provinces when the exact location of the riding is not relevant.

fed_2015 %>%
riding_binplot(value = party,
continuous = FALSE,
arrange = TRUE) +
theme_mapcan() +
scale_fill_manual(name = "Party",
values = c("mediumturquoise", "blue", "springgreen3", "red", "orange")) +
ggtitle("Tile grid map of 2015 federal election results")
#> Scale for 'fill' is already present. Adding another scale for 'fill',
#> which will replace the existing scale.

Compared to a standard cloropeth map, the tile grid map gives us a better sense of how the 2015 federal election shaped up:

fed_2015_cloropleth <- mapcan(boundaries = ridings, type = standard)

left_join(fed_2015_cloropleth, fed_2015, by = "riding_code") %>%
ggplot(aes(long, lat, group = group, fill = party)) +
geom_polygon() +
coord_fixed() +
scale_fill_manual(name = "Party",
values = c("mediumturquoise", "blue", "springgreen3", "red", "orange")) +
ggtitle("Tile grid map of 2015 federal election results") +
theme_mapcan()

### Continuous variables

Let’s make a riding tile plot with a continuous variable: voter turnout. This variable is included in the federal_election_results dataset, which is part of the mapcan package. Specify continuous = TRUE in riding_binplot so that ggplot uses a continuous fill scale. The default scale is scale_fill_viridis_c() from the viridis package. Though, as demonstrated above, this can be changed easily by overriding the default scale.

Note that the data can be tidied up using dplyr and piped directly into riding_binplot():

federal_election_results %>%
filter(election_year == 2015) %>%
riding_binplot(value = voter_turnout,
continuous = TRUE) +
ggtitle("Tile grid map of 2015 federal election voter turnout")

Once again, because we care little about the axis grid and axis values, we can use the theme_mapcan() theme:

federal_election_results %>%
filter(election_year == 2015) %>%
riding_binplot(value = voter_turnout,
continuous = TRUE) +
theme_mapcan() +
ggtitle("Tile grid map of 2015 federal election voter turnout")

Like with the election results that are plotted above, we may want to arrange the values of voter turnout within provinces. Fortunately, we can also use the arrange = TRUE argument when the riding-level variable is continuous.

federal_election_results %>%
filter(election_year == 2015) %>%
riding_binplot(value = voter_turnout,
continuous = TRUE,
arrange = TRUE) +
theme_mapcan() +
ggtitle("Tile grid map of 2015 federal election voter turnout")

If squares are not your jam, you can also use hexagons with the shape = hexagon argument. Note that the default is shape = square.

federal_election_results %>%
filter(election_year == 2015) %>%
riding_binplot(value = voter_turnout,
continuous = TRUE,
arrange = TRUE,
shape = "hexagon") +
theme_mapcan() +
ggtitle("Hexagon grid map of 2015 federal election voter turnout")

### Provincial ridings

Currently, only Quebec provincial ridings can be plotted with riding_binplot(). In the near future, riding_binplot() will be able to create bin plots for the other provinces.

To create a provincial riding plot, specify provincial = TRUE (to let riding_binplot know that it is provincial ridings you wish to plot) and province = QC for Quebec.

riding_binplot(quebec_provincial_results,
value_col = party,
riding_col = riding_code,
continuous = FALSE,
provincial = TRUE,
province = QC,
shape = "hexagon") +
theme_mapcan() +
scale_fill_manual(name = "Winning party",
values = c("deepskyblue1", "red","royalblue4",  "orange")) +
ggtitle("Hexagon grid map of 2018 Quebec provincial election results")
#> Scale for 'fill' is already present. Adding another scale for 'fill',
#> which will replace the existing scale.

## Reference

Rudis, Bob. 2017. Statebins: Create ’U.s.’ Uniform Square State Cartogram Heatmaps. https://github.com/hrbrmstr/statebins.