Greedy feature selection for subspace clustering

Eva L. Dyer, Aswin C. Sankaranarayanan, Richard G. Baraniuk

Research output: Contribution to journalArticlepeer-review

125 Scopus citations

Abstract

Unions of subspaces provide a powerful generalization of single subspace models for collections of high-dimensional data; however, learning multiple subspaces from data is challenging due to the fact that segmentation - the identification of points that live in the same subspace - and subspace estimation must be performed simultaneously. Recently, sparse recovery methods were shown to provide a provable and robust strategy for exact feature selection (EFS)-recovering subsets of points from the ensemble that live in the same subspace. In parallel with recent studies of EFS with l1-minimization, in this paper, we develop sufficient conditions for EFS with a greedy method for sparse signal recovery known as orthogonal matching pursuit (OMP). Following our analysis, we provide an empirical study of feature selection strategies for signals living on unions of subspaces and characterize the gap between sparse recovery methods and nearest neighbor (NN)-based approaches. In particular, we demonstrate that sparse recovery methods provide significant advantages over NN methods and that the gap between the two approaches is particularly pronounced when the sampling of subspaces in the data set is sparse. Our results suggest that OMP may be employed to reliably recover exact feature sets in a number of regimes where NN approaches fail to reveal the subspace membership of points in the ensemble.

Original languageEnglish (US)
Pages (from-to)2487-2517
Number of pages31
JournalJournal of Machine Learning Research
Volume14
StatePublished - Sep 1 2013

Keywords

  • Hybrid linear models
  • Low-rank approximation
  • Nearest neighbors
  • Sparse approximation
  • Structured sparsity
  • Subspace clustering
  • Unions of subspaces

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability

Fingerprint

Dive into the research topics of 'Greedy feature selection for subspace clustering'. Together they form a unique fingerprint.

Cite this