Title: Random partitions beyond exchangeability
Abstract:
Species sampling models provide a structural framework for understanding random discrete distributions and random partitions under exchangeability assumptions. However, they do not encompass more general symmetry assumptions, such as partial exchangeability, which is often assumed when dealing with heterogeneous data collected from related sources. Additionally, they do not address separate and joint exchangeability, typically used to model matrix observations.
To overcome these limitations, we introduce multivariate species sampling models (mSSMs), a general class of models characterized by their partially exchangeable partition probability function. mSSMs encompass most existing Bayesian nonparametric models for partially exchangeable data and help to elucidate their core distributional properties and the learning mechanisms they induce. Specifically, mSSMs facilitate the study of general properties of random partitions under partial exchangeability assumptions. We demonstrate that the dependence structure is determined by the probability of ties occurring across groups, with independence across sources corresponding to a zero probability of such ties. The results presented provide a comprehensive understanding of the dependence structure induced by a wide range of nonparametric models under partial exchangeability assumptions.
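The link between cross-group ties and dependence can be illustrated with a small simulation. The sketch below is not the seminar's construction: it assumes a hierarchical Dirichlet process as one familiar model for partially exchangeable data, and the names `SharedUrn`, `polya_urn_sample`, and `tie_frequency` are hypothetical helpers introduced only for illustration. Two groups are sampled either from independent Dirichlet processes with a continuous base measure, or from Dirichlet processes that share a discrete random base measure.

```python
import random


class SharedUrn:
    """Top-level Polya urn: a shared, almost surely discrete base measure (assumed HDP setup)."""

    def __init__(self, gamma, rng):
        self.gamma = gamma
        self.rng = rng
        self.draws = []

    def __call__(self):
        m = len(self.draws)
        if self.rng.random() < self.gamma / (self.gamma + m):
            value = self.rng.random()  # new atom from a continuous base
        else:
            value = self.rng.choice(self.draws)  # reuse an atom already drawn
        self.draws.append(value)
        return value


def polya_urn_sample(n, alpha, base_draw, rng):
    """Draw n values from a Dirichlet process via the Polya urn scheme."""
    values = []
    for i in range(n):
        # New value with probability alpha / (alpha + i), else copy a past value.
        if rng.random() < alpha / (alpha + i):
            values.append(base_draw())
        else:
            values.append(rng.choice(values))
    return values


def tie_frequency(hierarchical, reps=200, n=20, alpha=1.0, gamma=1.0, seed=0):
    """Fraction of replicates in which the two groups share at least one value."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        # Shared discrete base (dependent groups) or continuous base (independent groups).
        base = SharedUrn(gamma, rng) if hierarchical else rng.random
        group_1 = polya_urn_sample(n, alpha, base, rng)
        group_2 = polya_urn_sample(n, alpha, base, rng)
        if set(group_1) & set(group_2):
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("independent groups:", tie_frequency(hierarchical=False))
    print("hierarchical groups:", tie_frequency(hierarchical=True))
```

Under these assumptions, the independent case should show no ties across groups, while the hierarchical case shows them in a positive fraction of replicates, which is the qualitative behavior the abstract describes.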
Furthermore, this structural framework allows dependent random partition models to be defined under various symmetry assumptions, including ones beyond partial exchangeability. This makes it possible to address challenges posed by dynamic and multi-view clustering. In particular, we introduce conditional partial exchangeability (CPE), a unifying framework for symmetry assumptions in dependent partitions of the same objects. CPE differs from traditional partial exchangeability through its conditional nature and its requirement for marginal invariance. Together, these conditions ensure local dependence among partitions.
The practical implications of these theoretical and modeling advances are demonstrated through two applications: a multi-armed bandit problem aimed at maximizing species discovery when sampling sequentially across multiple sites, and a community detection problem in multiplex network data.
The seminar is based on the following works:
Franzolini, B., Lijoi, A., Prünster, I., Rebaudo, G. (2025). Multivariate species sampling models. Working Paper
Franzolini, B., De Iorio, M., Eriksson, J.G. (2025). Conditional partial exchangeability: a probabilistic framework for multi-view clustering. arXiv:2307.01152
Ghidini, V., Franzolini, B., Durante, D. (2025). Hierarchically extended stochastic block models for multiplex networks. Working Paper