TY - GEN
T1 - Deterministic column sampling for low-rank matrix approximation
T2 - 16th SIAM International Conference on Data Mining 2016, SDM 2016
AU - Patel, Raajen
AU - Goldstein, Tom
AU - Dyer, Eva
AU - Mirhoseini, Azalia
AU - Baraniuk, Richard
N1 - Funding Information:
This work was made possible by the generous support of the National Science Foundation (CCF-1535902 & CCF-1527501), the Office of Naval Research (N00014-15-1-2676), Air Force Office of Scientific Research (FA9550-14-1-0088), and Army Research Office (W911NF-15-1-0316).
Publisher Copyright:
Copyright © by SIAM.
PY - 2016
Y1 - 2016
N2 - Kernel matrices that encode the distance (or similarity) between data points are widely used throughout the computational sciences for classification, clustering, and dimensionality reduction. For large datasets, the cost of computing and factorizing such matrices becomes intractable. Thus instead of operating on the entire matrix, approximate methods such as the Nystrom method and the incomplete Cholesky decomposition (ICD) generate a low rank matrix factorization using only a subset of the matrix columns (or rows). Here, we present an adaptive column sampling strategy for the Nystrom method that we dub Accelerated Sequential Incoherence Selection (oASIS). This sampling strategy reveals a missing link between Nystrom methods and ICD: we demonstrate that ICD is actually a special case of the Nystrom method with the oASIS adaptive sampling rule. Numerical experiments and theoretical results suggest that oASIS achieves performance comparable to state-of-the-art greedy Nystrom methods but with shorter runtimes and less memory consumption.
AB - Kernel matrices that encode the distance (or similarity) between data points are widely used throughout the computational sciences for classification, clustering, and dimensionality reduction. For large datasets, the cost of computing and factorizing such matrices becomes intractable. Thus instead of operating on the entire matrix, approximate methods such as the Nystrom method and the incomplete Cholesky decomposition (ICD) generate a low rank matrix factorization using only a subset of the matrix columns (or rows). Here, we present an adaptive column sampling strategy for the Nystrom method that we dub Accelerated Sequential Incoherence Selection (oASIS). This sampling strategy reveals a missing link between Nystrom methods and ICD: we demonstrate that ICD is actually a special case of the Nystrom method with the oASIS adaptive sampling rule. Numerical experiments and theoretical results suggest that oASIS achieves performance comparable to state-of-the-art greedy Nystrom methods but with shorter runtimes and less memory consumption.
UR - http://www.scopus.com/inward/record.url?scp=84991666905&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84991666905&partnerID=8YFLogxK
U2 - 10.1137/1.9781611974348.67
DO - 10.1137/1.9781611974348.67
M3 - Conference contribution
AN - SCOPUS:84991666905
T3 - 16th SIAM International Conference on Data Mining 2016, SDM 2016
SP - 594
EP - 602
BT - 16th SIAM International Conference on Data Mining 2016, SDM 2016
A2 - Venkatasubramanian, Sanjay Chawla
A2 - Meira, Wagner
PB - Society for Industrial and Applied Mathematics Publications
Y2 - 5 May 2016 through 7 May 2016
ER -