FACT (Feature Attributions for Clustering)

To get value from a clustering algorithm, it is important to understand how the algorithm assigns instances to clusters. FACT is an algorithm-agnostic framework that provides feature attributions while preserving the integrity of the data.



We aim to divide American states into 3 clusters based on their standardized crime rates.

attributes_scale = attributes(scale(USArrests))
           Murder Assault UrbanPop  Rape
Alabama      1.24    0.78    -0.52  0.00
Alaska       0.51    1.11    -1.21  2.48
Arizona      0.07    1.48     1.00  1.04
Arkansas     0.23    0.23    -1.07 -0.18
California   0.28    1.26     1.76  2.07
Colorado     0.03    0.40     0.86  1.86

USArrests Data Set
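
For reference, the standardization behind this table can be reproduced by hand. The sketch below assumes that `scale()` is called with its default arguments (center on the column mean, divide by the column standard deviation); `col_means`, `col_sds`, `z_manual` and `z_scale` are names introduced here for illustration only.

```r
# Standardize USArrests manually: subtract each column's mean, divide by its SD.
# USArrests ships with base R (datasets package), so no extra packages are needed.
col_means = colMeans(USArrests)
col_sds = apply(USArrests, 2, sd)
z_manual = sweep(sweep(as.matrix(USArrests), 2, col_means, "-"), 2, col_sds, "/")

# scale() with default arguments yields the same values; it also stores the
# constants as the attributes "scaled:center" and "scaled:scale".
z_scale = scale(USArrests)
all.equal(z_manual, z_scale, check.attributes = FALSE)

# First six rows, rounded to two decimals
head(round(z_scale, 2))
```

After standardization every column has mean 0 and standard deviation 1, so no single crime rate dominates the distance computations of the clustering algorithm merely because of its unit.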

To do so, we use the c-means algorithm from mlr3cluster.

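library(mlr3cluster)  # provides TaskClust and the clust.cmeans learner
tsk_usa = TaskClust$new(id = "usarest", backend = data.frame(scale(USArrests)))
c_lrn = lrn("clust.cmeans", centers = 3, predict_type = "prob")
c_lrn$train(tsk_usa)  # fit the model so that c_lrn$model$membership is available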
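
With `predict_type = "prob"`, the learner returns soft labels: one membership degree per cluster for every state. As a rough illustration of what such soft labels look like, the base R sketch below applies the standard fuzzy c-means membership formula to fixed cluster centers. It rests on several assumptions: `fuzzy_membership()` is a hypothetical helper, the fuzzifier is set to `m = 2`, and the centers come from `kmeans()` as a stand-in. It is not the code that mlr3cluster runs internally.

```r
# Soft cluster memberships for fixed centers, following the standard fuzzy
# c-means formula u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
# fuzzy_membership() is a hypothetical helper, not part of mlr3cluster or FACT.
fuzzy_membership = function(x, centers, m = 2) {
  x = as.matrix(x)
  # Euclidean distance of every observation (rows) to every center (columns)
  d = sapply(seq_len(nrow(centers)), function(k) {
    sqrt(rowSums(sweep(x, 2, centers[k, ], "-")^2))
  })
  d = pmax(d, 1e-12)      # guard against division by zero
  w = d^(-2 / (m - 1))    # inverse-distance weights
  w / rowSums(w)          # normalize so that each row sums to 1
}

x = scale(USArrests)
set.seed(1)
# Assumption: k-means centers stand in for the centers a c-means fit would find
centers = kmeans(x, centers = 3, nstart = 10)$centers
u = fuzzy_membership(x, centers)

head(round(u, 2))   # one row per state, one column per cluster
range(rowSums(u))   # memberships of each state sum to 1
```

A state close to one center gets a membership near 1 for that cluster, while a state between centers gets its membership spread across several clusters.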

Then, we create a ClustPredictor that wraps the information needed for our methods.

library(FACT)
predictor = ClustPredictor$new(c_lrn, data = tsk_usa$data(), y = c_lrn$model$membership)

How does Assault affect the partitions created by c-means clustering?

The sIDEA plot shows:

idea_assault = IDEA$new(predictor, "Assault", grid.size = 50)
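
To give an intuition for what an isolated effect of a feature on the cluster assignment means, the base R sketch below varies Assault over a grid while holding all other features fixed, recomputes soft labels, and averages them over the states. This is only an illustration under stated assumptions and not FACT's implementation: `soft_labels()` is a hypothetical stand-in for a trained soft-label clusterer (fuzzy memberships with fuzzifier `m = 2` for fixed centers taken from `kmeans()`), and `grid` and `global_curves` are names introduced here.

```r
# Hypothetical stand-in for a trained soft-label clusterer:
# fuzzy memberships for fixed centers (not part of FACT or mlr3cluster).
soft_labels = function(x, centers, m = 2) {
  x = as.matrix(x)
  d = sapply(seq_len(nrow(centers)), function(k) {
    sqrt(rowSums(sweep(x, 2, centers[k, ], "-")^2))
  })
  w = pmax(d, 1e-12)^(-2 / (m - 1))
  w / rowSums(w)
}

x = scale(USArrests)
set.seed(1)
centers = kmeans(x, centers = 3, nstart = 10)$centers

# 50 grid points over the observed range of Assault (mirrors grid.size = 50)
grid = seq(min(x[, "Assault"]), max(x[, "Assault"]), length.out = 50)

# For each grid value: set Assault to that value for all states, recompute the
# soft labels, and average over states. The result has one row per grid value
# and one column per cluster.
global_curves = t(sapply(grid, function(g) {
  x_mod = x
  x_mod[, "Assault"] = g
  colMeans(soft_labels(x_mod, centers))
}))

head(round(cbind(Assault = grid, global_curves), 2))
```

Each column of `global_curves` traces how the average soft label for one cluster changes as Assault alone is varied. Skipping the averaging step would give one such curve per state instead.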

Short Interpretation:


If you use FACT in a scientific publication, please cite it as:

Scholbeck, C.A., Funk, H., Casalicchio, G. (2023). Algorithm-Agnostic Feature Attributions for Clustering. In: Longo, L. (eds) Explainable Artificial Intelligence. xAI 2023. Communications in Computer and Information Science, vol 1901. Springer, Cham. https://doi.org/10.1007/978-3-031-44064-9_13


