ggforce: Visual Guide

Thomas Lin Pedersen

2019-04-23

Introduction

This document serves as the main overview of the ggforce package, and will try to explain the hows and whys of the different extensions along with clear visual examples. It will try to link back to relevant academic articles describing the different visualization types in more detail - both for the benefit of the reader and to give credit to the people who thought long and hard about how to best present your data.

Geom versions

Some of the geoms presented below come in two or more flavors, potentially suffixed with 0 or 2, such as geom_bezier, which also comes in the versions geom_bezier0 and geom_bezier2. This pattern is mainly used for line drawings such as splines, arcs and beziers, and has been adopted for edge drawing in the ggraph package as well.

In all cases the base version (no suffix) has been implemented efficiently in C++ and produces a set of points along the line that can be traced using a path. The benefit of this is that the detail level can be chosen, giving the user control over the rendering time. On top of that, an additional column is added to the data with the position along the path, which can be mapped to e.g. an opacity gradient. For the base version each line is encoded in one row using x, y, xend, and yend, in the same manner as for geom_segment.

The same input format is used for the 0-version, but this version maps directly to native grid grobs. While there is seldom a performance reason to use the native grobs, these versions do ensure that the path is always smooth (for the base versions this depends on the number of points calculated). The 0-versions do not allow mapping of gradients to the path.

The 2-version changes the input format to encode the start and end points on different rows, in the same manner as for geom_path. The benefit of this is that different aesthetic values can be defined for the start and end, e.g. colour, and these versions will interpolate that aesthetic along the path, so you can get e.g. a smooth transition of size, colour, and opacity along a spline.
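The three flavors can be compared on the simplest line geom in ggforce, geom_link. The following is a minimal sketch with made-up coordinates; the data frames `segments` and `points` are illustrative assumptions, and `after_stat()` assumes a reasonably recent ggplot2 (older versions used `stat()` for the same purpose).

```r
library(ggplot2)
library(ggforce)

# Base and 0-version input: one row per line (made-up coordinates)
segments <- data.frame(
  x = c(0, 0), y = c(0, 1),
  xend = c(1, 1), yend = c(1, 0)
)

# Base version: points are computed along each line, and the computed
# `index` column (position along the path) can drive a gradient
p_base <- ggplot(segments) +
  geom_link(aes(x = x, y = y, xend = xend, yend = yend,
                alpha = after_stat(index)))

# 0-version: same input format, drawn with a native grob, no gradient
p_zero <- ggplot(segments) +
  geom_link0(aes(x = x, y = y, xend = xend, yend = yend))

# 2-version: one row per point, so each point can carry its own
# aesthetic value, which is interpolated along the path
points <- data.frame(x = c(0, 1, 2), y = c(0, 1, 0), value = c(1, 5, 3))
p_two <- ggplot(points) +
  geom_link2(aes(x = x, y = y, colour = value, group = 1))
```

Printing `p_base`, `p_zero`, and `p_two` in turn shows the fading lines, the plain lines, and the colour-interpolated path respectively.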

Layers

This section shows the extensions to ggplot2’s geoms and stats. It rarely makes sense to talk about one and not the other, so they are grouped together here. Often the focus will be on the geoms, unless a new stat does not have an accompanying geom, in which case the stat will be discussed along with which geoms it should be used with.

Shapes

Most area based geoms in ggplot2 use geom_polygon() underneath in order to draw their shapes. ggforce offers a more powerful version of this functionality in the form of geom_shape(), which is used in all the area based geoms in ggforce and can be dropped in everywhere geom_polygon() is used, in ggplot2 and elsewhere. The difference between geom_shape() and geom_polygon() lies in the ability of geom_shape() to round its corners as well as expand and contract itself by absolute amounts (i.e. not relative to the plot dimensions). All of these abilities are automatically transferred to all the other geoms that depend on geom_shape().
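As a minimal sketch of these abilities, the polygon below (made-up coordinates) is drawn once with rounded corners and once expanded by an absolute amount behind the unmodified polygon. The `radius` and `expand` arguments take grid units; the 5 mm values are arbitrary choices for illustration.

```r
library(ggplot2)
library(ggforce)

# A made-up five-sided polygon
shape <- data.frame(
  x = c(0.5, 1, 0.75, 0.25, 0),
  y = c(0, 0.5, 1, 0.75, 0.25)
)

# Round the corners by an absolute radius
p_round <- ggplot(shape, aes(x = x, y = y)) +
  geom_shape(radius = unit(5, "mm"))

# Expand the shape by an absolute amount and draw the original on top
p_expand <- ggplot(shape, aes(x = x, y = y)) +
  geom_shape(expand = unit(5, "mm"), fill = "grey") +
  geom_polygon()
```

A negative `expand` value contracts the shape instead.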

Arcs

Arcs are segments of a circle and are defined by a centre point, a radius, and a start and end angle. In ggforce arcs come in two flavors: arc and arc_bar, where the former draws an arc with a single line and the latter draws it as a polygon that can have a fill and outline. A wedge is a special case of arc_bar where the innermost radius is 0. The most well known use of arcs in plotting is the much loathed pie chart (and its cousin the donut chart). The hatred of pie charts is justified and stems from the fact that humans are much better at comparing heights than angles. Because of this a bar chart will always communicate your data better than a pie chart. Donut charts are a little better, as the hole in the middle forces the eye to compare arc spans rather than angles, but it is still better to use a bar chart. Arcs, being a fundamental visual element, can be used for other things though, such as sunburst plots or annotating radial visualizations.

As pie charts are the most well known, we’ll start by upsetting all visualization experts and producing one:
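A minimal sketch of such a pie chart, using geom_arc_bar() with explicit start and end angles in radians. The `pie` data frame, its categories, and its amounts are made up for illustration.

```r
library(ggplot2)
library(ggforce)

# Made-up data: the categories and amounts are purely illustrative
pie <- data.frame(
  state = c("eaten", "cat took it", "for tonight", "will decompose slowly"),
  amount = c(4, 1, 1.5, 6)
)

# Convert amounts to start and end angles (radians) covering the full circle
pie$end <- 2 * pi * cumsum(pie$amount) / sum(pie$amount)
pie$start <- c(0, head(pie$end, -1))

# r0 = 0 gives wedges (a pie); r0 > 0 would give a donut
p_pie <- ggplot(pie) +
  geom_arc_bar(aes(x0 = 0, y0 = 0, r0 = 0, r = 1,
                   start = start, end = end, fill = state)) +
  coord_fixed() +
  theme_no_axes()
```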

While the above produces one of the most hated plot types in the world, it does showcase the use of arcs in plotting. Arcs can be used in many different visualization types to annotate radial position etc., as in e.g. chord diagrams.

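Using arc is just like using arc_bar, except that it does not take an r0 argument and does not have any fill. Furthermore, the arc geom also comes in 0 and 2 versions: the base version makes gradients along the arc possible, and the 2-version makes interpolation of aesthetics between the end points possible.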
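The sketch below illustrates both with made-up arcs. In geom_arc2() each end of an arc is given on its own row, with the angle passed through the `end` aesthetic and rows paired by group; the `id` and `value` columns are illustrative assumptions.

```r
library(ggplot2)
library(ggforce)

# Two made-up arcs, one row per arc
arcs <- data.frame(start = c(0, pi), end = c(pi / 2, 1.5 * pi), r = c(1, 2))

# Base version: fade each arc along its length using the computed index
p_arc <- ggplot(arcs) +
  geom_arc(aes(x0 = 0, y0 = 0, r = r, start = start, end = end,
               alpha = after_stat(index))) +
  coord_fixed()

# 2-version: one row per arc end, so each end can have its own value
arcs2 <- data.frame(
  angle = c(arcs$start, arcs$end),
  r = rep(arcs$r, 2),
  id = rep(1:2, 2),
  value = c(1, 2, 3, 4)
)
p_arc2 <- ggplot(arcs2) +
  geom_arc2(aes(x0 = 0, y0 = 0, r = r, end = angle,
                group = id, colour = value)) +
  coord_fixed()
```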

Circles

Standard ggplot2 generally has you covered when it comes to drawing circles through the point geom, but it does not make it possible to draw circles where the radius is related to the coordinate system. The geom_circle from ggforce is meant for precisely that. It generates a polygon resembling a circle based on a center point and a radius, making the radius directly readable from the axes. The geom is mainly intended to make it possible to draw circles with fine grained control, but will often not have much utility in itself. An exception would be plotting trees as enclosure diagrams using circles, where fine control over the radius is necessary.
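A minimal sketch with made-up centres and radii; coord_fixed() is added so that one unit is the same length on both axes and the circles are not distorted into ellipses.

```r
library(ggplot2)
library(ggforce)

# Made-up circles: centre (x0, y0) and radius r in data units
circles <- data.frame(
  x0 = rep(1:3, 2),
  y0 = rep(1:2, each = 3),
  r = seq(0.1, 1, length.out = 6)
)

p_circle <- ggplot(circles) +
  geom_circle(aes(x0 = x0, y0 = y0, r = r, fill = r)) +
  coord_fixed()
```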

Ellipse

As with circles it is possible to draw ellipses according to the coordinate system. It requires some more parameters than geom_circle, namely two radii and an angle:
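A minimal sketch, where `a` and `b` are the two radii and `angle` is the rotation in radians; the particular values are arbitrary.

```r
library(ggplot2)
library(ggforce)

# One ellipse centred at the origin, rotated by 45 degrees
p_ellipse <- ggplot() +
  geom_ellipse(aes(x0 = 0, y0 = 0, a = 10, b = 3, angle = pi / 4)) +
  coord_fixed()
```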

Be aware that ggplot2 contains stat_ellipse, which estimates uncertainty ellipses for a set of points.

Beziers

A bezier is a smooth curve defined by its end points and one or two control points. It is well known from vector drawing software such as Adobe Illustrator, where the control points provide an intuitive way to manipulate the curve. In essence the control points define the direction and force with which the curve leaves the end point - the more distant the control point is from the end point, the longer the curve travels in the direction of the control point before beginning to move towards the other end point.

There is no succinct way to describe a bezier in a single row, so all the versions use multiple rows to describe the bezier, grouped by the group aesthetic. The first row is the start point, followed by one or two control points and then the end point. As bezierGrob from grid only supports cubic beziers (two control points), the 0-version approximates a quadratic bezier by placing the two control points on top of each other.
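The row layout can be sketched as follows, with one quadratic bezier (start, one control point, end) and one cubic bezier (start, two control points, end). The coordinates and the `type` column are made up for illustration.

```r
library(ggplot2)
library(ggforce)

# Rows are ordered start point, control point(s), end point within a group
beziers <- data.frame(
  x = c(1, 2, 3, 4, 4, 6, 6),
  y = c(0, 2, 0, 0, 2, 2, 0),
  type = rep(c("quadratic", "cubic"), c(3, 4)),
  point = c("end", "control", "end",
            "end", "control", "control", "end")
)

# Draw the curves, with the defining points overlaid for reference
p_bezier <- ggplot(beziers, aes(x = x, y = y)) +
  geom_bezier(aes(group = type, linetype = type)) +
  geom_point(aes(shape = point))
```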

Diagonals

In visualization parlance a diagonal is a path connecting two points through a smooth curve that starts perpendicular to either the x or y axis and gradually bends to meet the other end. It is often implemented as a cubic bezier curve with the control points of each end extending perpendicular from the end points. Diagonals are often used in visualising trees (see geom_edge_diagonal() in ggraph) but are provided here as a general construct.
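A minimal sketch, assuming a ggforce version in which geom_diagonal() takes one row per diagonal with x, y, xend, and yend; the coordinates are made up and resemble a small tree fanning out from a single root.

```r
library(ggplot2)
library(ggforce)

# Three made-up diagonals from a common root at (0, 0)
links <- data.frame(
  x = 0, y = 0,
  xend = 1, yend = c(-1, 0, 1)
)

p_diagonal <- ggplot(links) +
  geom_diagonal(aes(x = x, y = y, xend = xend, yend = yend))
```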