A Bayesian nonparametric approach for the analysis of multiple categorical item responses

Andrew Waters, Kassandra Fronczyk, Michele Guindani, Richard G. Baraniuk, Marina Vannucci

Research output: Contribution to journal › Article › peer-review

2 Scopus citations


We develop a modeling framework for joint factor and cluster analysis of datasets where multiple categorical response items are collected on a heterogeneous population of individuals. We introduce a latent factor multinomial probit model and employ prior constructions that allow inference on the number of factors as well as clustering of the subjects into homogeneous groups according to their relevant factors. Clustering, in particular, allows us to borrow strength across subjects, which aids estimation of the model parameters, particularly when the number of observations is small. We employ Markov chain Monte Carlo techniques and obtain tractable posterior inference for our objectives, including sampling of missing data. We demonstrate the effectiveness of our method on simulated data. We also analyze two real-world educational datasets and show that our method outperforms state-of-the-art methods. In the analysis of the real-world data, we uncover hidden relationships between the questions and the underlying educational concepts, while simultaneously partitioning the students into groups of similar educational mastery.
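To make the structure described in the abstract concrete, the following is a minimal generative sketch, not the authors' implementation. It assumes a Chinese restaurant process for the subject partition (the abstract says only that the approach is Bayesian nonparametric), standard normal priors on all parameters, and a fixed number of factors. It also ignores the identifiability constraints a real multinomial probit model needs and performs no posterior inference. The function name `simulate_responses` and all of its parameters are hypothetical.

```python
import random


def simulate_responses(n_subjects=30, n_items=8, n_categories=3,
                       n_factors=2, alpha=1.0, seed=0):
    """Simulate categorical item responses from a clustered latent factor
    multinomial probit model (illustrative assumptions only)."""
    rng = random.Random(seed)

    # Assumed prior on the partition: Chinese restaurant process with
    # concentration alpha. Each subject joins an existing cluster with
    # probability proportional to its size, or opens a new one with
    # probability proportional to alpha.
    labels, counts = [], []
    for _ in range(n_subjects):
        k = rng.choices(range(len(counts) + 1), weights=counts + [alpha])[0]
        if k == len(counts):
            counts.append(0)
        counts[k] += 1
        labels.append(k)

    # Subjects in the same cluster share one latent factor vector.
    cluster_factors = [[rng.gauss(0, 1) for _ in range(n_factors)]
                       for _ in counts]

    # Item- and category-specific loadings and intercepts (assumed N(0, 1)).
    loadings = [[[rng.gauss(0, 1) for _ in range(n_factors)]
                 for _ in range(n_categories)] for _ in range(n_items)]
    intercepts = [[rng.gauss(0, 1) for _ in range(n_categories)]
                  for _ in range(n_items)]

    # Multinomial probit link: each category gets a Gaussian latent utility,
    # and the observed response is the category with the largest utility.
    responses = []
    for i in range(n_subjects):
        theta = cluster_factors[labels[i]]
        row = []
        for j in range(n_items):
            utilities = [
                intercepts[j][c]
                + sum(l * t for l, t in zip(loadings[j][c], theta))
                + rng.gauss(0, 1)
                for c in range(n_categories)
            ]
            row.append(max(range(n_categories), key=utilities.__getitem__))
        responses.append(row)
    return responses, labels
```

Calling `simulate_responses()` returns a subjects-by-items table of category indices together with each subject's cluster label. Because subjects in a cluster share a factor vector, their response patterns tend to be similar, which is the sense in which clustering lets a model borrow strength across subjects.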

Original language: English (US)
Pages (from-to): 52-66
Number of pages: 15
Journal: Journal of Statistical Planning and Inference
State: Published - Nov 1 2015


Keywords

  • Bayesian nonparametrics
  • Cluster analysis
  • Factor analysis
  • Learning analytics
  • Multinomial probit model

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Applied Mathematics


