Bayesian analysis of (extended) feature allocation models: predictions, sufficientness, and applications

We introduce the class of extended feature allocation models, which generalizes standard feature allocations by allowing dependence across features' labels. In particular, we study a Bayesian nonparametric model where data are conditionally i.i.d. from a Bernoulli product model, and the random parameter of interest is an almost surely discrete random measure viewed as a functional of a point process. We obtain general expressions for the marginal, posterior, and predictive laws and use them to derive two sufficientness postulates. Given a sample of size $n$, we characterize the models in which the distribution of the new features in an additional sample depends either on the sample size alone, or on the sample size together with the number of distinct features displayed in the sample. We illustrate the general methodology with an engagement prediction problem in large-scale A/B testing and with an application in spatial statistics.

Based on joint work with Federico Camerlenghi, Stefano Favaro, Lorenzo Ghilotti, Lorenzo Masoero, and Thomas Richardson.