`BayesMallows`

: An R Package for Probabilistic Preference Learning with the Mallows Rank ModelThe `BayesMallows`

package implements methods for Bayesian preference learning with the Mallows rank model, as originally described in Vitelli et al. (2018), and further developed in Asfaw et al. (2016) and Crispino et al. (2018). This vignette describes the usage of the package, starting from the complete data cases, through top-\(k\) rankings, pairwise comparisons, and finally clustering. We refer to the above mentioned papers, as well as the review Liu et al. (2018) for a thorough description of the methods. The necessary methods for data preprocessing, tuning of algorithms, and assessment of the posterior distributions will be described along the way.

We also need some packages for plotting and data wrangling.

Here is an overview of the most used functions. You can read their documentation with `?function_name`

, or search for an example in this vignette.

Function Name | Description |
---|---|

`compute_mallows` |
Compute the posterior distribution of the Bayesian Mallows model. This is the main function of the package. Returns an object of class `BayesMallows` . |

`compute_mallows_mixtures` |
Compute multiple Mallows models with different number of mixture components. This is a convenience function for determining the number of mixtures to use. |

`sample_mallows` |
Sample from the Mallows model. |

`plot.BayesMallows` |
Quick plots of the posterior densities of parameters of the Mallows model. |

`assess_convergence` |
Study the convergence of the Markov chain, in order to determine burnin and other algorithm parameters. |

`plot_elbow` |
Create an elbow plot for comparing models with different number of clusters. |

`plot_top_k` |
Plot the top-\(k\) rankings. Particularly relevant when the data is in the form of pairwise comparisons. |

`assign_cluster` |
Compute the cluster assignment of assessors. |

`compute_consensus` Com |
pute the CP or MAP consensus ranking of the latent ranks. |

`compute_posterior_intervals` |
Compute Bayesian posterior intervals for the parameters. |

`generate_initial_ranking` |
Generate an initial ranking, for the case of missing data or pairwise comparisons. |

`generate_transitive_closure` |
Generate the transitive closure for a set of pairwise comparisons. |

`estimate_partition_function` |
Estimate the partition function of the Mallows model using either importance sampling or an asymptotic approximation. |

Here is an overview of the example datasets in `BayesMallows`

. You can read their documentation with `?dataset_name`

, or search for an example in this vignette.

Dataset Name | Description |
---|---|

`beach_preferences` |
Stated pairwise preferences between random subsets of 15 images of beaches, by 60 assessors. |

`sushi_rankings` |
Complete rankings of 10 types of sushi by 5000 assessors. |

`potato_visual` |
Complete rankings of 20 potatoes by weight, based on visual inspection, by 12 assessors. |

`potato_weighing` |
Complete rankings of 20 potatoes by weight, where the assessors were allowed to weigh the potatoes in their hands, by 12 assessors. |

`potato_true_ranking` |
Vector of true weight rankings for the 20 potatoes in the example datasets `potato_visual` and `potato_weighing` . |

We here give an informal review of the Mallows model (Mallows (1957)). The distribution of a ranking \(r \in \mathcal{P}_{n}\) of \(n\) items is modeled as

\[ P(r | \alpha, \rho) = Z_{n}(\alpha)^{-1} \exp\left\{-\frac{\alpha}{n} d(r, \rho)\right\} 1_{\mathcal{P}_{n}}(r), \]

where \(\mathcal{P}_{n}\) is the set of all permutations of \(1, \dots, n\), \(\alpha\) is a scale parameter, \(\rho \in \mathcal{P}_{n}\) is a latent consensus ranking, \(d(\cdot, \cdot)\) is a distance measure, \(1_{S}(\cdot)\) is the indicator function for the set \(S\), and \(Z_{n}(\alpha)\) is the partition function, or normalizing constant.

Given \(N\) observed rankings, \(R_{1}, \dots, R_{N}\), the likelihood of the model is

\[ P(R_{1}, \dots, R_{N} | \alpha, \rho) = Z_{n}(\alpha)^{-N} \exp\left\{-\frac{\alpha}{n} \sum_{j=1}^{N}d(R_{j}, \rho)\right\} \prod_{j=1}^{N} \left\{1_{\mathcal{P}_{n}}(R_{j}) \right\}. \]

The `rankings`

argument to `compute_mallows`

is assumed to be a matrix of the form \((R_{1}, R_{2}, \dots, R_{N})^{T}\), i.e., each row contains a ranking and each column is an item.

For \(\alpha\) we use an exponential prior, with density \(\pi(\alpha | \lambda) = \lambda \exp(-\lambda \alpha).\) The rate parameter \(\lambda\) can be set by the user with the `lambda`

argument to `compute_mallows`

. For \(\rho\) we assume a uniform prior distribution on \(\mathcal{P}_{n}\), with density \(\pi(\rho) = 1_{\mathcal{P}_{n}}(\rho) /n!.\)

We use a Metropolis-Hastings algorithm for computing the posterior distributions of \(\alpha\) and \(\rho\). We propose \(\alpha\) from a lognormal distribution \(\log \mathcal{N}(\log(\alpha), \sigma_{\alpha}^{2})\). We propose \(\rho\) with a leap-and-shift algorithm, described in detail in Vitelli et al. (2018). The standard deviation \(\sigma_{\alpha}^{2}\) and the leap size can be set by the user with the arguments `alpha_prop_sd`

and `leap_size`

to `compute_mallows`

.

If each assessor \(j\) has ranked a subset of the items \(\mathcal{A}_{j}\), we use data augmentation to fill in the missing ranks. We define the augmented data vectors \(\tilde{R}_{1}, \dots, \tilde{R}_{N}\), and use a uniform prior for each assessor with support \(\mathcal(P)_{n} \setminus R_{j}\), i.e., the set of rankings not already chosen. The Metropolis-Hastings algorithm now alternates between sampling \(\tilde{R}_{1}, \dots, \tilde{R}_{N}\) given the current \(\alpha\) and \(\rho\), and sampling \(\alpha\) and \(\rho\) given the current \(\tilde{R}_{1}, \dots, \tilde{R}_{N}\).

When the assessors have stated a set of pairwise comparisons, rather than rankings, we use the same data augmentation ideas as for partial rankings, but the proposal distribution is slightly more complicated in order to ensure that the proposed ranking is compliant with the ordering implied by the pairwise comparisons. In addition, the transitive closure of the stated ordering has to be computed, in order to find all *implied orderings*. From a user perspective, no new algorithm parameters need to be considered.

When the assessor pool is heterogeneous, one might assume that there exist several latent consensus rankings, \(\rho_{1}, \dots, \rho_{C}\), one for each cluster of assessors. Letting \(z_{1}, \dots, z_{N} \in\{1, \dots, C\}\) assign each assessor to each cluster, the likelihood of the observed rankings is

\[P(R_{1}, \dots, R_{N} | \{\alpha_{c}, \rho_{c}\}_{c=1,\dots,C}, z_{1},\dots,z_{N}) = \prod_{j=1}^{N}\frac{1_{\mathcal{P}_{n}}(\rho)}{Z_{n}(\alpha_{z_{j}})}\exp\left\{ -\frac{\alpha_{z_{j}}}{n} d(R_{j}, \rho_{z_{j}})\right\}. \] For the scale parameters \(\alpha_{1}, \dots, \alpha_{C}\) we assume the exponential prior as before, all with the same rate parameter \(\lambda\). We assume that the cluster labels are a priori distributed according to \(P(z_{1}, \dots, z_{N} | \tau_{1}, \dots, \tau_{C}) = \prod_{j=1}^{N} \tau_{z_{j}}\), where \(\tau_{c}\) is the a priori probability that an assessor belongs to cluster \(c\). For \(\tau_{1}, \dots, \tau_{C}\) we assume the Dirichlet prior \(\pi(1, \dots, C) = \Gamma(\psi C)\Gamma(\psi)^{-C}\prod_{c=1}^{C}\tau_{c}^{\psi - 1}\), where \(\Gamma(\cdot)\) is the gamma function. The user can control the value of \(\psi\) with the `psi`

argument to `compute_mallows`

.

Having described the model, we now show some case studies demonstrating the use of the `BayesMallows`

package. To reduce the computing time, we have typically taken a fairly small number of posterior samples, and rather focused on demonstrating the different methods. In general, we recommend taking a larger number of samples than what we do here, in order to obtain good approximations of the posterior distributions.

`BayesMallows`

comes with example data described in Liu et al. (2018). A total of 12 assessors were asked to rank 20 potatoes based on their weight. In the first round, the assessors were only allowed to study the potatoes visually, while in the second round, the assessors were also allowed to hold the potatoes in their hands in order to compare them. The data sets are named `potato_visual`

and `potato_weighing`

, respectively. The true ordering of the potatoes’ weights are stored in the vector `potato_true_ranking`

.

The `potato_visual`

dataset is shown below. The column names P1, …, P20 represent potatoes, and the row names A1, …, A12 represent assessors. The `potato_weighing`

dataset has a similar structure.

P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | P15 | P16 | P17 | P18 | P19 | P20 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

A1 | 10 | 18 | 19 | 15 | 6 | 16 | 4 | 20 | 3 | 5 | 12 | 1 | 2 | 9 | 17 | 8 | 7 | 14 | 13 | 11 |

A2 | 10 | 18 | 19 | 17 | 11 | 15 | 6 | 20 | 4 | 3 | 13 | 1 | 2 | 7 | 16 | 8 | 5 | 12 | 9 | 14 |

A3 | 12 | 15 | 18 | 16 | 13 | 11 | 7 | 20 | 6 | 3 | 8 | 2 | 1 | 4 | 19 | 5 | 9 | 14 | 10 | 17 |

A4 | 9 | 17 | 19 | 16 | 10 | 15 | 5 | 20 | 3 | 4 | 8 | 1 | 2 | 7 | 18 | 11 | 6 | 13 | 14 | 12 |

A5 | 12 | 17 | 19 | 15 | 7 | 16 | 2 | 20 | 3 | 9 | 13 | 1 | 4 | 5 | 18 | 11 | 6 | 8 | 10 | 14 |

A6 | 10 | 15 | 19 | 16 | 8 | 18 | 6 | 20 | 3 | 7 | 11 | 1 | 2 | 4 | 17 | 9 | 5 | 13 | 12 | 14 |

A7 | 9 | 16 | 19 | 17 | 10 | 15 | 5 | 20 | 3 | 8 | 11 | 1 | 2 | 6 | 18 | 7 | 4 | 14 | 12 | 13 |

A8 | 14 | 18 | 20 | 19 | 11 | 15 | 6 | 17 | 4 | 3 | 10 | 1 | 2 | 7 | 16 | 8 | 5 | 12 | 9 | 13 |

A9 | 8 | 16 | 18 | 19 | 12 | 13 | 6 | 20 | 5 | 3 | 7 | 1 | 4 | 2 | 17 | 10 | 9 | 15 | 14 | 11 |

A10 | 7 | 17 | 19 | 18 | 9 | 15 | 5 | 20 | 3 | 10 | 11 | 1 | 2 | 6 | 16 | 8 | 4 | 13 | 12 | 14 |

A11 | 12 | 16 | 19 | 15 | 13 | 18 | 7 | 20 | 3 | 5 | 11 | 1 | 2 | 6 | 17 | 10 | 4 | 14 | 8 | 9 |

A12 | 14 | 15 | 19 | 16 | 12 | 18 | 8 | 20 | 3 | 4 | 9 | 1 | 2 | 7 | 17 | 6 | 5 | 13 | 10 | 11 |

The `compute_mallows`

function is the workhorse of `BayesMallows`

. It runs the Metropolis-Hastings algorithm and returns the posterior distribution of the scale parameter \(\alpha\) and the latent ranks \(\rho\) of the Mallows model. To see all its arguments, please run `?compute_mallows`

in the console.

We start by using all the default values of the parameters, so we only need to supply the matrix of ranked items. We use the `potato_visual`

data printed above.

The argument returned is a list object of class `BayesMallows`

, which contains a whole lot of information about the MCMC run.

The function `assess_convergence`

produces plots for visual convergence assessment. We start by studying \(\alpha\), which is the default. The plot is shown below, and looks good enough, at least to begin with.

Next, we study the convergence of \(\rho\). To avoid too complicated plots, we pick 5 items to plot. Again, you can read more about this function by running `?assess_convergence`

in the console.

Based on these plots, it looks like the algorithm starts to converge after around 1000 iterations. Discarding the first 1000 iterations as burn-in hence seems like a safe choice. We create a new list element for `model_fit`

called `burnin`

, in which this choice is saved. Having done this, we do not have to provide a `burnin`

argument to subsequent analyses of the posterior distributions.

Once we are confident that the algorithm parameters are reasonable, we can study the posterior distributions of the model parameters using the generic function `plot.BayesMallows`

.

With a burnin of 1000, the original `model_fit`

object from the previous subsection has 1000 MCMC samples. The default parameter of `plot.BayesMallows`

is \(alpha\), so we can study the posterior distribution with the simple statement below.

We can also get the posterior credible intervals for \(\alpha\):

parameter | mean | median | conf_level | hpdi | central_interval |
---|---|---|---|---|---|

alpha | 10.95 | 10.908 | 95 % | [9.759,12.241] | [9.718,12.229] |

Obtaining posterior samples from \(\rho\) is in general harder than for \(\alpha\). Some items tend to be very sticky. We start by plotting the `model_fit`

object from above. We now have to tell `plot.BayesMallows`

that we want a plot of `parameter = "rho"`

and all the items. This gives us posterior the posterior density of all the items.

We can also find the posterior intervals of the latent ranks:

item | parameter | mean | median | conf_level | hpdi | central_interval |
---|---|---|---|---|---|---|

P1 | rho | 10 | 10 | 95 % | [9,11] | [9,11] |

P2 | rho | 17 | 17 | 95 % | [16,18] | [16,18] |

P3 | rho | 19 | 19 | 95 % | [19] | [19] |

P4 | rho | 16 | 16 | 95 % | [16,18] | [16,18] |

P5 | rho | 10 | 10 | 95 % | [9,12] | [9,12] |

P6 | rho | 15 | 15 | 95 % | [15] | [15] |

P7 | rho | 6 | 6 | 95 % | [5,7] | [5,7] |

P8 | rho | 20 | 20 | 95 % | [20] | [20] |

P9 | rho | 3 | 3 | 95 % | [3] | [3] |

P10 | rho | 4 | 4 | 95 % | [4] | [4] |

P11 | rho | 11 | 11 | 95 % | [9,11] | [9,11] |

P12 | rho | 1 | 1 | 95 % | [1] | [1] |

P13 | rho | 2 | 2 | 95 % | [2] | [2] |

P14 | rho | 7 | 7 | 95 % | [6,7] | [6,7] |

P15 | rho | 18 | 18 | 95 % | [17,18] | [17,18] |

P16 | rho | 8 | 8 | 95 % | [8] | [8] |

P17 | rho | 5 | 5 | 95 % | [5,6] | [5,6] |

P18 | rho | 14 | 14 | 95 % | [13,14] | [13,14] |

P19 | rho | 12 | 12 | 95 % | [10],[12] | [10,12] |

P20 | rho | 13 | 13 | 95 % | [13,14] | [13,14] |

Updating \(\alpha\) in each step may not be necessary. With the `alpha_jump`

argument, we can tell the MCMC algorithm to update \(\alpha\) only every `alpha_jump`

th iteration.

We should assess convergence again:

Saving a large number of iterations of \(\rho\) gets quite expensive, so `compute_mallows`

has a `rho_thinning`

parameter. It specifies that only each `rho_thinning`

th iteration of \(\rho\) should be saved to memory. We double the number of iterations, while setting `rho_thinning = 2`

. This gives us the same number of posterior samples.

Please be careful with thinning. In this small data example it is definitely wasteful! Running the same number of iterations without thinning always gives a better approximation of the posterior distribution. Thinning might be useful when you need to run a large number of iterations to explore the space of latent ranks, and the latent ranks from all iterations do not fit in memory. (See, e.g., Gelman et al. (2004) for a discussion of thinning).

We can try to use the Kendall distance instead of the footrule distance.

It is also worth pointing out that `compute_mallows`

checks if the input data are indeed ranks. Let us try this by manipulating the first row of `potato_visual`

, giving rank 1 to the first two items:

Now imagine that the assessors in the potato experiment were asked to rank only the top-five heaviest potatoes. We generate these data by retaining only ranks 5 or higher in `potato_visual`

, setting the rest to `NA`

.

In Vitelli et al. (2018) it is shown that the unranked items do not effect the MAP estimates of the ranked items in this top-k setting. In this case, there are 8 potatoes which have been ranked, and so the unranked potatoes should have uniform posterior distributions between 9 and 20. However, arriving at these uniform posteriors require a large number of MCMC iterations, so we instead remove these items:

```
item_ranked <- apply(potato_top, 2, function(x) !all(is.na(x)))
potato_top <- potato_top[, item_ranked, drop = FALSE]
```

We are now left with this 12 by 8 matrix:

P7 | P9 | P10 | P12 | P13 | P14 | P16 | P17 | |
---|---|---|---|---|---|---|---|---|

A1 | 4 | 3 | 5 | 1 | 2 | NA | NA | NA |

A2 | NA | 4 | 3 | 1 | 2 | NA | NA | 5 |

A3 | NA | NA | 3 | 2 | 1 | 4 | 5 | NA |

A4 | 5 | 3 | 4 | 1 | 2 | NA | NA | NA |

A5 | 2 | 3 | NA | 1 | 4 | 5 | NA | NA |

A6 | NA | 3 | NA | 1 | 2 | 4 | NA | 5 |

A7 | 5 | 3 | NA | 1 | 2 | NA | NA | 4 |

A8 | NA | 4 | 3 | 1 | 2 | NA | NA | 5 |

A9 | NA | 5 | 3 | 1 | 4 | 2 | NA | NA |

A10 | 5 | 3 | NA | 1 | 2 | NA | NA | 4 |

A11 | NA | 3 | 5 | 1 | 2 | NA | NA | 4 |

A12 | NA | 3 | 4 | 1 | 2 | NA | NA | 5 |

The `compute_mallows`

function automatically recognizes the `NA`

values as missing ranks, and augments the data, as described in Section 4.1 of Vitelli et al. (2018). Let us try:

Looking at the returned object, we see that `any_missing`

is `TRUE`

, so `compute_mallows`

has correctly detected that there are missing values. We set `save_aug = TRUE`

so we can get a diagnostic plot for the convergence of the data augmentation. After you are confident that the algorithm converges, you might want to use the default `save_aug = FALSE`

, because saving the data takes memory.

`model_fit`

also has assessor-wise acceptance rates for the augmentation.

assessor | acceptance_rate |
---|---|

1 | 0.56 |

2 | 0.56 |

3 | 0.90 |

4 | 0.58 |

5 | 0.59 |

6 | 0.59 |

7 | 0.58 |

8 | 0.54 |

9 | 0.56 |

10 | 0.56 |

11 | 0.51 |

12 | 0.55 |

The tuning of \(\alpha\) and \(\rho\) can be done exactly as described above for the full potato data. Let us now study the convergence of the data augmentation.

Looking back, we see that assessor 1 does not have values for potatoes 14, 16, and 17. Hence, we expect these to jump between ranks 6, 7, and 8, since the 5 ranked items are fixed to values 1,…,5. Below we have plotted potato 14, which fluctuates between 6 and 8. You can confirm that these same happens to 16 and 17.

Assessor 1 has ranks for potatoes 7, 9, 10, 12, and 13, which are fixed. You can confirm that they are fixed by running the following command.

We can study the posterior distributions of \(\alpha\) and \(\rho\) just like we did above.

Like for the complete ranks, we can vary the distance measure used in the Mallows model. Here are command for running with Cayley, Kendall, and Spearman distance.

If the ranks are missing at random, we cannot remove the unranked items as we did for top-\(k\) rankings above. Let us assume that 10 % of the data in `potato_visual`

have disappeared due to a disk failure. We generate these in the code chunk below:

```
missing_indicator <- if_else(
runif(nrow(potato_visual) * ncol(potato_visual)) < 0.1,
NA_real_, 1)
potato_missing <- potato_visual * missing_indicator
```

The data now look like the following:

P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | P15 | P16 | P17 | P18 | P19 | P20 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

A1 | 10 | NA | 19 | NA | 6 | 16 | 4 | 20 | 3 | 5 | 12 | 1 | 2 | 9 | NA | 8 | 7 | 14 | 13 | 11 |

A2 | 10 | NA | 19 | 17 | 11 | NA | 6 | 20 | 4 | 3 | 13 | 1 | 2 | 7 | 16 | 8 | 5 | 12 | 9 | 14 |

A3 | 12 | 15 | 18 | 16 | 13 | 11 | 7 | 20 | 6 | 3 | 8 | 2 | 1 | 4 | 19 | 5 | 9 | 14 | 10 | 17 |

A4 | 9 | 17 | 19 | 16 | 10 | 15 | 5 | 20 | 3 | 4 | 8 | 1 | 2 | 7 | 18 | 11 | 6 | 13 | NA | 12 |

A5 | 12 | 17 | 19 | NA | 7 | 16 | 2 | 20 | 3 | 9 | 13 | 1 | 4 | 5 | 18 | 11 | 6 | 8 | 10 | 14 |

A6 | NA | 15 | 19 | 16 | 8 | 18 | 6 | 20 | 3 | 7 | 11 | 1 | 2 | 4 | 17 | 9 | 5 | 13 | 12 | 14 |

A7 | 9 | 16 | 19 | 17 | 10 | 15 | 5 | 20 | 3 | 8 | 11 | NA | 2 | 6 | 18 | 7 | 4 | 14 | 12 | NA |

A8 | 14 | 18 | 20 | 19 | 11 | 15 | 6 | 17 | 4 | 3 | 10 | 1 | 2 | NA | 16 | 8 | 5 | 12 | 9 | 13 |

A9 | 8 | NA | 18 | NA | 12 | 13 | 6 | 20 | 5 | 3 | NA | 1 | 4 | NA | 17 | 10 | 9 | 15 | 14 | 11 |

A10 | 7 | 17 | 19 | 18 | NA | 15 | NA | 20 | 3 | 10 | 11 | 1 | 2 | 6 | 16 | 8 | NA | 13 | 12 | 14 |

A11 | NA | 16 | 19 | 15 | NA | 18 | 7 | 20 | 3 | NA | 11 | NA | 2 | 6 | 17 | 10 | 4 | NA | 8 | 9 |

A12 | 14 | 15 | 19 | 16 | 12 | 18 | 8 | 20 | 3 | 4 | 9 | 1 | 2 | 7 | 17 | 6 | 5 | NA | 10 | NA |

We supply `potato_missing`

to `compute_mallows`

as before:

We check the convergence with the following commands. The plots (not shown) would show good convergence before 1,000 iterations.

Again, we can look at the acceptance rates for the augmented data. Now let us compare it to the number of missing ranks per assessor.

```
bind_cols(model_fit$aug_acceptance,
n_missing = apply(potato_missing, 1, function(x) sum(is.na(x))))
```

assessor | acceptance_rate | n_missing |
---|---|---|

1 | 0.56 | 3 |

2 | 0.68 | 2 |

3 | 1.00 | 0 |

4 | 1.00 | 1 |

5 | 1.00 | 1 |

6 | 1.00 | 1 |

7 | 0.53 | 2 |

8 | 1.00 | 1 |

9 | 0.16 | 4 |

10 | 0.41 | 3 |

11 | 0.08 | 5 |

12 | 0.98 | 2 |

We see that for assessors who have either 0 or 1 missing ranks, their augmentations are always accepted. In the case of 0 missing ranks, there is in fact no augmentation going on at all, but we use the convention that the assessor’s own complete data is accepted.

The posterior distributions can be studied as shown above.

Handling of pairwise preferences in the Mallows rank model is described in Section 4.2 of Vitelli et al. (2018).

Let us start by considering a toy example with two assessors and five items. Assessor 1 has stated a set of preferences \[ \mathcal{B}_{1} = \left\{A_{1} \prec A_{2}, A_{2} \prec A_{5}, A_{4} \prec A_{5} \right\} \] and assessor 2 has the set of preferences \[ \mathcal{B}_{2} = \left\{ A_{1} \prec A_{2}, A_{2} \prec A_{3}, A_{3} \prec A_{4} \right\}. \]

Each time an assessor is asked to compare two objects, a measurement is made. Therefore, in order to keep the data *tidy* (Wickham (2014)), we define a dataframe in which each row corresponds to a pairwise comparison. The columns (variables) are *the assessor*, *the bottom item*, and *the top item*.

In the code snippet below, we define such a dataframe for the toy example presented above:

```
pair_comp <- tribble(
~assessor, ~bottom_item, ~top_item,
1, 1, 2,
1, 2, 5,
1, 4, 5,
2, 1, 2,
2, 2, 3,
2, 3, 4
)
```

assessor | bottom_item | top_item |
---|---|---|

1 | 1 | 2 |

1 | 2 | 5 |

1 | 4 | 5 |

2 | 1 | 2 |

2 | 2 | 3 |

2 | 3 | 4 |

Next, we need to find the transitive closure for the set of pairwise comparisons given by each user. `BayesMallows`

comes with a function `generate_transitive_closure`

to do just this.

As we can see, `pair_comp_tc`

has an additional row containing the relation \(A_{4} \prec A_{5}\) for assessor 1. For assessor 2, \[\text{tc}(\mathcal{B}_{2}) = \mathcal{B}_{2} \cup \left\{ A_{1} \prec A_{3}, A_{1} \prec A_{4}, A_{2} \prec A_{4}\right\},\] so three new rows have been added.

assessor | bottom_item | top_item |
---|---|---|

1 | 1 | 2 |

1 | 1 | 5 |

1 | 2 | 5 |

1 | 4 | 5 |

2 | 1 | 2 |

2 | 1 | 3 |

2 | 2 | 3 |

2 | 1 | 4 |

2 | 2 | 4 |

2 | 3 | 4 |

The dataframe returned by `generate_transitive_closure`

inherits from `tibble`

, but has subclass `BayesMallowsTC`

. The `compute_mallows`

function uses this information to ensure that the object provided has been through the `generate_transitive_closure`

function. If it has not, `compute_mallows`

will do it for us, but this may lead to additional computing time when running several diagnostic runs and trying out different parameters, since the transitive closure will be recomputed each time.

We can also generate an initial ranking, consistent with the pairwise comparisons. Again, `compute_mallows`

will do it for us, but we may save time by computing it once and for all before we starting running the algorithms.

Rather than digging deeper into this toy example, we go on with a real application.

The beach preference dataset is described in Section 6.2 of Vitelli et al. (2018), and is available in the dataframe `beach_preferences`

in `BayesMallows`

. In short, \(60\) assessors were each asked to perform a random set of pairwise comparisons between pictures of \(15\) beaches. The first few rows are listed below.

assessor | bottom_item | top_item |
---|---|---|

1 | 2 | 15 |

1 | 5 | 3 |

1 | 13 | 3 |

1 | 4 | 7 |

1 | 5 | 15 |

1 | 12 | 6 |

We start by generating the transitive closure of the preferences.

We can compare the dataframes before and after. We see that the number of rows has been approximately doubled, and that `beach_tc`

has subclass `BayesMallowsTC`

has it should.

```
str(beach_preferences)
#> Classes 'tbl_df', 'tbl' and 'data.frame': 1442 obs. of 3 variables:
#> $ assessor : num 1 1 1 1 1 1 1 1 1 1 ...
#> $ bottom_item: num 2 5 13 4 5 12 14 7 8 14 ...
#> $ top_item : num 15 3 3 7 15 6 6 9 10 9 ...
str(beach_tc)
#> Classes 'BayesMallowsTC', 'tbl_df', 'tbl' and 'data.frame': 2921 obs. of 3 variables:
#> $ assessor : num 1 1 1 1 1 1 1 1 1 1 ...
#> $ bottom_item: int 4 5 7 4 5 7 8 9 13 14 ...
#> $ top_item : int 1 1 1 3 3 3 3 3 3 3 ...
```

Next, we generate an initial ranking.

We can also take a look at the first 6 rows in it.

1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

7 | 15 | 3 | 14 | 13 | 2 | 10 | 12 | 6 | 4 | 1 | 5 | 8 | 11 | 9 |

5 | 15 | 14 | 6 | 2 | 13 | 12 | 11 | 1 | 7 | 4 | 10 | 9 | 8 | 3 |

3 | 9 | 7 | 15 | 14 | 2 | 6 | 13 | 1 | 8 | 12 | 11 | 10 | 5 | 4 |

5 | 15 | 2 | 14 | 11 | 8 | 9 | 10 | 1 | 4 | 3 | 13 | 7 | 12 | 6 |

9 | 15 | 7 | 5 | 6 | 1 | 14 | 13 | 4 | 3 | 2 | 12 | 11 | 8 | 10 |

9 | 13 | 5 | 12 | 15 | 3 | 7 | 11 | 1 | 6 | 4 | 14 | 8 | 10 | 2 |

Let us add column names to the initial ranking. This will make the plots more informative. But be aware that if the names are too long, some plots will be filled with text. We therefore name them B1, B2, …, B15.

We can now check the convergence, using the same tools as before.

```
test_run <- compute_mallows(
rankings = beach_init_rank,
preferences = beach_tc,
save_aug = TRUE
)
#> Using exact partition function.
```

The trace plots for \(\alpha\) and \(\rho\) show good convergence.

When using `assess_convergence`

, we can set `parameter = "Rtilde"`

and choose `assessors`

and `items`

. We now illustrate this by going a bit deeper into some assessors. Referring to the mathematical notation in the top of the vignette, `Rtilde`

corresponds to the augmented rankings \(\tilde{R}_{1}, \dots, \tilde{R}_{N}\).

Let us look at Beach 2 for assessor 1.

assessor | bottom_item | top_item |
---|---|---|

1 | 2 | 6 |

1 | 2 | 15 |

It is implied by the preferences of assessor 1 that \(\{B_{2} \prec B_{6}\}\) and \(\{B_{2} \prec B_{15}\}\).

This seems correct.

Next, no ordering is implied between beach 2 and 4 for assessor 1.

```
beach_tc %>%
filter(assessor == 1, bottom_item %in% c(2, 4), top_item %in% c(2, 4)) %>%
nrow()
#> [1] 0
```

The traces of item 2 and 4 do indeed cross.

No ordering is implied between beach 5 and beach 8 either:

```
beach_tc %>%
filter(assessor == 1, bottom_item %in% c(5, 8), top_item %in% c(5, 8)) %>%
nrow()
#> [1] 0
```

The traces of beaches 5 and 8 do indeed cross.

We go on with assessor 2. Let us take a look at, say, beach 10.

assessor | bottom_item | top_item |
---|---|---|

2 | 10 | 5 |

2 | 7 | 10 |

Beach 10 is preferred to beach 7, but disfavored to beach 5. The trace plot does indeed show this:

No ordering is implied between beach 1 and beach 15 for assessor 2.

```
beach_tc %>%
filter(assessor == 1, bottom_item %in% c(1, 15), top_item %in% c(1, 15)) %>%
nrow()
#> [1] 0
```

Their traces do indeed cross.

We delete the test run before going on.

Based on the convergence diagnostics, and being fairly conservative, we set `burnin = 1000`

, and take an additional 5,000 samples. We still save the augmented data, because we are going to call the function `plot_top_k`

below.

```
model_fit <- compute_mallows(
rankings = beach_init_rank,
preferences = beach_tc,
nmc = 6000,
save_aug = TRUE
)
#> Using exact partition function.
model_fit$burnin <- 1000
```

The posterior densities of \(\alpha\) and \(\rho\) can be studied as shown above.

We can rank the beaches according to the cumulative probability (CP) consensus. This functionality is provided by `compute_consensus`

, which returns a dataframe.

ranking | item | cumprob |
---|---|---|

1 | B9 | 0.87 |

2 | B6 | 1.00 |

3 | B3 | 0.57 |

4 | B11 | 0.99 |

5 | B15 | 1.00 |

6 | B10 | 0.94 |

7 | B1 | 1.00 |

8 | B5 | 0.76 |

9 | B7 | 0.63 |

10 | B13 | 1.00 |

11 | B8 | 0.66 |

12 | B4 | 0.60 |

13 | B12 | 0.87 |

14 | B14 | 0.95 |

15 | B2 | 1.00 |

The maximum a posterior (MAP) consensus ranking is the a posterior most likely value of \(\rho\).

probability | item | map_ranking |
---|---|---|

0.12 | B9 | 1 |

0.12 | B6 | 2 |

0.12 | B3 | 3 |

0.12 | B11 | 4 |

0.12 | B15 | 5 |

0.12 | B10 | 6 |

0.12 | B1 | 7 |

0.12 | B5 | 8 |

0.12 | B13 | 9 |

0.12 | B7 | 10 |

0.12 | B8 | 11 |

0.12 | B12 | 12 |

0.12 | B4 | 13 |

0.12 | B14 | 14 |

0.12 | B2 | 15 |

We can compute posterior intervals for several parameters. Here we do it for \(\rho\).

item | parameter | mean | median | conf_level | hpdi | central_interval |
---|---|---|---|---|---|---|

B1 | rho | 7 | 7 | 95 % | [6,7] | [6,7] |

B2 | rho | 15 | 15 | 95 % | [14,15] | [14,15] |

B3 | rho | 3 | 3 | 95 % | [3,4] | [3,4] |

B4 | rho | 12 | 12 | 95 % | [11,13] | [11,13] |

B5 | rho | 8 | 8 | 95 % | [8,10] | [8,10] |

B6 | rho | 2 | 2 | 95 % | [1,2] | [1,2] |

B7 | rho | 9 | 9 | 95 % | [8,10] | [8,10] |

B8 | rho | 11 | 11 | 95 % | [11,13] | [11,13] |

B9 | rho | 1 | 1 | 95 % | [1,2] | [1,2] |

B10 | rho | 6 | 6 | 95 % | [6,7] | [6,7] |

B11 | rho | 4 | 4 | 95 % | [3,4] | [3,4] |

B12 | rho | 13 | 13 | 95 % | [12,14] | [12,15] |

B13 | rho | 9 | 10 | 95 % | [8,10] | [8,10] |

B14 | rho | 14 | 14 | 95 % | [13,15] | [13,15] |

B15 | rho | 5 | 5 | 95 % | [5] | [4,5] |

We can also find the posterior probability for each beach of being ranked top-\(k\), both in \(\rho\) and among the assessors.

We can do a top-\(k\) plot with `plot_top_k`

. By default, `k = 3`

. It may be necessary to do some experimentation with the `rel_widths`

argument to get a good looking plot.

In many situations, it is interesting to divide the assessors into clusters, in which each the assessors in each cluster have similar preferences. In Vitelli et al. (2018), Section 4.3, it is shown how the Bayesian Mallows model can be extended to do exactly this, with a mixture approach. `BayesMallows`

supports this through the optional argument `n_clusters`

to `compute_mallows`

.

To illustrate how to perform clustering, let us create some clusters in the `potato_visual`

dataset. This dataset has 12 assessors. We leave the first 6 assessors as is, but for the last 6, we revert the rankings.

P1 | P2 | P3 | P4 | P5 | P6 | P7 | P8 | P9 | P10 | P11 | P12 | P13 | P14 | P15 | P16 | P17 | P18 | P19 | P20 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|

A1 | 10 | 18 | 19 | 15 | 6 | 16 | 4 | 20 | 3 | 5 | 12 | 1 | 2 | 9 | 17 | 8 | 7 | 14 | 13 | 11 |

A2 | 10 | 18 | 19 | 17 | 11 | 15 | 6 | 20 | 4 | 3 | 13 | 1 | 2 | 7 | 16 | 8 | 5 | 12 | 9 | 14 |

A3 | 12 | 15 | 18 | 16 | 13 | 11 | 7 | 20 | 6 | 3 | 8 | 2 | 1 | 4 | 19 | 5 | 9 | 14 | 10 | 17 |

A4 | 9 | 17 | 19 | 16 | 10 | 15 | 5 | 20 | 3 | 4 | 8 | 1 | 2 | 7 | 18 | 11 | 6 | 13 | 14 | 12 |

A5 | 12 | 17 | 19 | 15 | 7 | 16 | 2 | 20 | 3 | 9 | 13 | 1 | 4 | 5 | 18 | 11 | 6 | 8 | 10 | 14 |

A6 | 10 | 15 | 19 | 16 | 8 | 18 | 6 | 20 | 3 | 7 | 11 | 1 | 2 | 4 | 17 | 9 | 5 | 13 | 12 | 14 |

A7 | 12 | 5 | 2 | 4 | 11 | 6 | 16 | 1 | 18 | 13 | 10 | 20 | 19 | 15 | 3 | 14 | 17 | 7 | 9 | 8 |

A8 | 7 | 3 | 1 | 2 | 10 | 6 | 15 | 4 | 17 | 18 | 11 | 20 | 19 | 14 | 5 | 13 | 16 | 9 | 12 | 8 |

A9 | 13 | 5 | 3 | 2 | 9 | 8 | 15 | 1 | 16 | 18 | 14 | 20 | 17 | 19 | 4 | 11 | 12 | 6 | 7 | 10 |

A10 | 14 | 4 | 2 | 3 | 12 | 6 | 16 | 1 | 18 | 11 | 10 | 20 | 19 | 15 | 5 | 13 | 17 | 8 | 9 | 7 |

A11 | 9 | 5 | 2 | 6 | 8 | 3 | 14 | 1 | 18 | 16 | 10 | 20 | 19 | 15 | 4 | 11 | 17 | 7 | 13 | 12 |

A12 | 7 | 6 | 2 | 5 | 9 | 3 | 13 | 1 | 18 | 17 | 12 | 20 | 19 | 14 | 4 | 15 | 16 | 8 | 11 | 10 |

Let us try first without clustering.

Looking at the convergence of \(\alpha\), we see that this model does not settle on a consensus ranking.

Let us instead introduce clustering:

```
model_fit <- compute_mallows(
rankings = potato_manipulated,
n_clusters = 2
)
#> Using exact partition function.
```

We can assess convergence in the usual way. Let us start with `alpha`

. The `assess_convergence`

method now shows one trace per cluster.

We plot the potato that is most often ranked heaviest (P8), and the potato that is most often ranked lightest (P12).