Using nVennR to generate and explore n-dimensional, quasi-proportional Venn diagrams

Victor Quesada

2018-05-11

nVennR provides an R interface to the nVenn algorithm. This vignette illustrates three basic uses of nVennR:

Create diagrams

There are two ways to create a Venn diagram object (nVennObj), referred to here as high-level (from a set of intersecting lists) and low-level (from scratch).

High-level

The most common use for a package like nVennR is to depict the relationships between several intersecting lists. The main function for this task is plotVenn. Its input is a list of vectors (or lists), one per set. The name of each inner vector is used as the label of its set. If the inner vectors are not named, labels can be provided as a vector with sNames. Any names left empty are filled in as GroupN. Examples:

library(nVennR)
exampledf
#>    Employee SAS Python R
#> 1      A001   Y      Y Y
#> 2      A002   N      Y Y
#> 3      A003   Y      Y N
#> 4      A004   Y      Y Y
#> 5      A005   Y      N N
#> 6      A006   Y      N Y
#> 7      A007   N      N N
#> 8      A008   Y      N N
#> 9      A009   N      N Y
#> 10     A010   N      N Y
#> 11     A011   Y      Y Y
#> 12     A012   Y      Y Y
#> 13     A013   Y      N Y
#> 14     A014   Y      N Y
#> 15     A015   N      N Y
#> 16     A016   N      N Y
#> 17     A017   N      Y N
#> 18     A018   N      Y N
sas <- subset(exampledf, SAS == "Y")$Employee
python <- subset(exampledf, Python == "Y")$Employee
rr <- subset(exampledf, R == "Y")$Employee
myV <- plotVenn(list(SAS=sas, PYTHON=python, R=rr), nCycles = 2000)
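As a cross-check on the diagram, the size of each region can be tabulated with base R set operations. The sketch below is not part of nVennR. It re-enters the Y/N columns of the table printed above by hand, so it does not depend on the bundled exampledf object, and its variable names are illustrative only.

```r
# Hand-typed copy of the table printed above (assumption: transcribed
# correctly from the listing; the package's own exampledf is not used here).
employee    <- sprintf("A%03d", 1:18)
sas_flag    <- c("Y","N","Y","Y","Y","Y","N","Y","N","N","Y","Y","Y","Y","N","N","N","N")
python_flag <- c("Y","Y","Y","Y","N","N","N","N","N","N","Y","Y","N","N","N","N","Y","Y")
r_flag      <- c("Y","Y","N","Y","N","Y","N","N","Y","Y","Y","Y","Y","Y","Y","Y","N","N")

# The same three sets that are passed to plotVenn in the example above.
sets <- list(
  SAS    = employee[sas_flag == "Y"],
  PYTHON = employee[python_flag == "Y"],
  R      = employee[r_flag == "Y"]
)

# Logical matrix: one row per employee, one column per set.
membership <- sapply(sets, function(s) employee %in% s)

# Label each employee with the exact combination of sets it belongs to.
region <- apply(membership, 1, function(x) {
  paste(colnames(membership)[x], collapse = "&")
})
region[region == ""] <- "(none)"

# Number of employees in each region of the three-set diagram.
region_sizes <- table(region)
print(region_sizes)
```

Each count corresponds to one region of the diagram, so a region that looks too large or too small in the plot can be checked against this table.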

The number of sets is arbitrary. For more than five sets, the default of 7000 simulation cycles may not be enough. You can set a different number of cycles with nCycles, or you can continue the simulation by passing the returned nVennObj back to plotVenn. Repeated shorter runs are encouraged, because long simulations are resource-intensive. Also, the nVenn algorithm lowers the speed of the simulation whenever the topology of the diagram fails. Running it a second time, as shown below, resets the speed of the simulation and sometimes makes it significantly faster.

myV2 <- plotVenn(list(
  SAS = sas, PYTHON = python, R = rr,
  c("A006", "A008", "A011", "Unk"),
  c("A011", "Unk", "A101", "A006", "A000"),
  c("A101", "A006", "A008")
))
myV2 <- plotVenn(nVennObj = myV2)
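Repeated execution can also be scripted. The loop below is a sketch of that pattern rather than a procedure recommended by the package: the number of batches and the nCycles value are arbitrary, the first three sets are copied by hand from the table above, and it assumes that nCycles is honoured when an existing nVennObj is passed back in. The nVennR calls are guarded so that the script still runs where the package is not installed.

```r
# Six sets: the first three are named, the last three are not, so their
# names are empty strings and plotVenn is expected to label them GroupN.
sets <- list(
  SAS    = c("A001", "A003", "A004", "A005", "A006",
             "A008", "A011", "A012", "A013", "A014"),
  PYTHON = c("A001", "A002", "A003", "A004",
             "A011", "A012", "A017", "A018"),
  R      = c("A001", "A002", "A004", "A006", "A009", "A010",
             "A011", "A012", "A013", "A014", "A015", "A016"),
  c("A006", "A008", "A011", "Unk"),
  c("A011", "Unk", "A101", "A006", "A000"),
  c("A101", "A006", "A008")
)
unnamed <- which(names(sets) == "")  # positions of the unnamed sets

n_batches        <- 3     # arbitrary, for illustration only
cycles_per_batch <- 2000  # arbitrary, for illustration only

if (requireNamespace("nVennR", quietly = TRUE)) {
  # First batch starts from the sets; later batches resume from the object.
  v <- nVennR::plotVenn(sets, nCycles = cycles_per_batch)
  for (i in seq_len(n_batches - 1)) {
    v <- nVennR::plotVenn(nVennObj = v, nCycles = cycles_per_batch)
  }
}
```

Splitting the work this way makes it possible to inspect the diagram between batches and stop as soon as it looks compact enough.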

Low-level

Users can also build an nVennObj from scratch. Most of the time this will not be useful, but it may have some theoretical applications. For instance, suppose we want a true five-set Venn diagram (in a Venn diagram, every possible region is shown). With the high-level procedure, we would need five sets that produce every possible intersection. Instead, we can use createVennObj:

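# Five sets, with all 32 regions given the same size
myV3 <- createVennObj(nSets = 5, sSizes = rep(1, 32))
# Run the simulation twice, feeding the result back in
myV3 <- plotVenn(nVennObj = myV3)
myV3 <- plotVenn(nVennObj = myV3)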
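The length of sSizes follows from the number of sets: n sets give 2^n membership combinations, so five sets need the 32 values used above (a count that includes the combination belonging to no set). The base R sketch below only enumerates those combinations. Its row order is whatever expand.grid produces and is not claimed to match the order createVennObj expects, and the SetN column labels are illustrative.

```r
n_sets <- 5

# One column per set, one row per membership combination (0 = out, 1 = in).
combos <- expand.grid(rep(list(0:1), n_sets))
names(combos) <- paste0("Set", seq_len(n_sets))  # illustrative labels only

n_regions <- nrow(combos)                  # number of combinations, 2^n_sets
outside   <- sum(rowSums(combos) == 0)     # combinations belonging to no set

# A vector of equal sizes with one entry per combination,
# the same length as rep(1, 32) in the call above.
s_sizes <- rep(1, n_regions)

print(c(regions = n_regions, outside = outside))
```

Changing n_sets shows how quickly the number of regions grows, which is one reason larger diagrams tend to need more simulation cycles.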