Abstract
Inferring appropriate information from large datasets has become important. In particular, identifying relationships among variables in these datasets has far-reaching impacts. In this article, we introduce the uniform information coefficient (UIC), which measures the amount of dependence between two multidimensional variables and is able to detect both linear and non-linear associations. Our proposed UIC is inspired by the maximal information coefficient (MIC) [1].; however, the MIC was originally designed to measure dependence between two one-dimensional variables. Unlike the MIC calculation that depends on the type of association between two variables, we show that the UIC calculation is less computationally expensive and more robust to the type of association between two variables. The UIC achieves this by replacing the dynamic programming step in the MIC calculation with a simpler technique based on the uniform partitioning of the data grid. This computational efficiency comes at the cost of not maximizing the information coefficient as done by the MIC algorithm. We present theoretical guarantees for the performance of the UIC and a variety of experiments to demonstrate its quality in detecting associations.
Original language | English (US) |
---|---|
Pages (from-to) | 1098-1107 |
Number of pages | 10 |
Journal | IEEE Transactions on Pattern Analysis and Machine Intelligence |
Volume | 44 |
Issue number | 2 |
DOIs | |
State | Published - Feb 1 2022 |
Keywords
- Detection
- correlation
- estimation
- k-nearest neighbor
- mutual information
ASJC Scopus subject areas
- Software
- Computer Vision and Pattern Recognition
- Computational Theory and Mathematics
- Artificial Intelligence
- Applied Mathematics