TY - GEN
T1 - MINER: Multiscale Implicit Neural Representation
T2 - 17th European Conference on Computer Vision, ECCV 2022
AU - Saragadam, Vishwanath
AU - Tan, Jasper
AU - Balakrishnan, Guha
AU - Baraniuk, Richard G.
AU - Veeraraghavan, Ashok
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
PY - 2022
Y1 - 2022
N2 - We introduce a new neural signal model designed for efficient high-resolution representation of large-scale signals. The key innovation in our multiscale implicit neural representation (MINER) is an internal representation via a Laplacian pyramid, which provides a sparse multiscale decomposition of the signal that captures orthogonal parts of the signal across scales. We leverage the advantages of the Laplacian pyramid by representing small disjoint patches of the pyramid at each scale with a small MLP. This enables the capacity of the network to adaptively increase from coarse to fine scales, and to represent only the parts of the signal with strong energy. The parameters of each MLP are optimized from coarse to fine scale, which yields faster approximations at coarser scales and ultimately an extremely fast training process. We apply MINER to a range of large-scale signal representation tasks, including gigapixel images and very large point clouds, and demonstrate that it requires fewer than 25% of the parameters, 33% of the memory footprint, and 10% of the computation time of competing techniques such as ACORN to reach the same representation accuracy. A fast implementation of MINER for images and 3D volumes is available at https://vishwa91.github.io/miner.
AB - We introduce a new neural signal model designed for efficient high-resolution representation of large-scale signals. The key innovation in our multiscale implicit neural representation (MINER) is an internal representation via a Laplacian pyramid, which provides a sparse multiscale decomposition of the signal that captures orthogonal parts of the signal across scales. We leverage the advantages of the Laplacian pyramid by representing small disjoint patches of the pyramid at each scale with a small MLP. This enables the capacity of the network to adaptively increase from coarse to fine scales, and to represent only the parts of the signal with strong energy. The parameters of each MLP are optimized from coarse to fine scale, which yields faster approximations at coarser scales and ultimately an extremely fast training process. We apply MINER to a range of large-scale signal representation tasks, including gigapixel images and very large point clouds, and demonstrate that it requires fewer than 25% of the parameters, 33% of the memory footprint, and 10% of the computation time of competing techniques such as ACORN to reach the same representation accuracy. A fast implementation of MINER for images and 3D volumes is available at https://vishwa91.github.io/miner.
UR - http://www.scopus.com/inward/record.url?scp=85142726074&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142726074&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-20050-2_19
DO - 10.1007/978-3-031-20050-2_19
M3 - Conference contribution
AN - SCOPUS:85142726074
SN - 9783031200496
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 318
EP - 333
BT - Computer Vision – ECCV 2022 – 17th European Conference, 2022, Proceedings
A2 - Avidan, Shai
A2 - Brostow, Gabriel
A2 - Cissé, Moustapha
A2 - Farinella, Giovanni Maria
A2 - Hassner, Tal
PB - Springer Science and Business Media Deutschland GmbH
Y2 - 23 October 2022 through 27 October 2022
ER -