Robust 2D and 3D registration with deep neural networks

Document record

Author: Z Wang
Document type: Thesis
Organization: Oxford University
Cite this document

Z Wang, "Robust 2D and 3D registration with deep neural networks", Oxford Research Archive, ID: 10670/1.4a6e75...



Abstract

Recovering 3D geometry is a crucial task in computer vision, essential for accurate world reconstruction and perception. Modern applications in AR, VR, autonomous driving, and medical imaging rely heavily on 3D and 4D reconstruction techniques. This thesis aims to enhance registration methods, which play a key role in reconstruction, by fusing classical multi-view geometry with deep neural networks. We explore this theme in three primary directions, each distinguished by registration dimensionality: 3D–3D, 3D–2D, and 2D–2D.

First, we focus on improving the alignment of 3D point clouds in both rigid and non-rigid scenarios. In non-rigid 3D registration, traditional methods directly optimize a motion field between a source and a target surface, which often converges slowly and becomes trapped in local minima. We introduce a neural-network-based scene flow to initialize the optimization, providing a more efficient and robust solution. Additionally, we present a novel surface normal estimation technique that aids both rigid and non-rigid registration. Unlike conventional methods that use a fixed global neighborhood parameter, our approach employs a self-attention mechanism to adapt to local geometry variations.

Second, we address the challenge of registering 2D images to 3D Neural Radiance Fields (NeRF) through joint optimization of NeRF and camera parameters. Original NeRF training mandates pre-processed camera parameters, creating a bottleneck in the workflow. Our approach allows end-to-end camera parameter estimation during NeRF training while reusing the existing photometric loss in NeRF. We further extend this to handle larger camera movements by incorporating a monocular depth prior.

Lastly, we propose a method for interest point discovery, which benefits 2D image registration. Unlike existing interest point identification methods, which degrade under significant viewpoint changes and at occlusion boundaries, our multi-view interest point discovery approach addresses these limitations. The method is trained in a self-supervised fashion with purely geometric constraints that encourage repeatability, sparsity, and multi-view consistency of the identified points.

In summary, this thesis explores the fusion of traditional multi-view geometry concepts with deep learning priors across various registration tasks, including point cloud registration, image-to-NeRF registration, and image-to-image registration.
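To give a flavor of the joint optimization of NeRF and camera parameters driven purely by a photometric loss, the toy sketch below jointly fits a "scene" parameter and a "camera" parameter by gradient descent on image reconstruction error. It is a deliberately minimal analogy, not the thesis method: the one-dimensional Gaussian renderer, the finite-difference gradients, and all constants are illustrative choices.

```python
import numpy as np

# Toy analogy of joint scene + camera optimization under a photometric loss.
# All names and numbers here are illustrative, not from the thesis.

xs = np.linspace(-2.0, 2.0, 200)

def render(amplitude, shift):
    """Stand-in 'renderer': a Gaussian bump scaled by a scene parameter
    (amplitude) and displaced by a camera parameter (shift)."""
    return amplitude * np.exp(-(xs - shift) ** 2)

# Observed image produced by unknown ground-truth parameters (1.5, 0.3).
target = render(1.5, 0.3)

def photometric_loss(params):
    amplitude, shift = params
    return np.mean((render(amplitude, shift) - target) ** 2)

def grad(params, eps=1e-5):
    """Finite-difference gradient of the photometric loss."""
    g = np.zeros_like(params)
    for i in range(len(params)):
        p_hi, p_lo = params.copy(), params.copy()
        p_hi[i] += eps
        p_lo[i] -= eps
        g[i] = (photometric_loss(p_hi) - photometric_loss(p_lo)) / (2 * eps)
    return g

# Jointly descend on scene and camera parameters from a rough initialization.
params = np.array([1.0, 0.0])  # [amplitude, shift]
for _ in range(2000):
    params -= 0.5 * grad(params)

print(params)  # should approach the ground truth [1.5, 0.3]
```

The point of the sketch is that a single photometric objective supplies gradients to both the scene model and the camera parameters at once; the thesis extends this idea with a monocular depth prior when camera motion is large, which the toy above does not model.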
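The repeatability constraint mentioned for interest point discovery can be illustrated with a small sketch: points detected in one view should reappear, after mapping through the known geometric transform, near points detected in the other view. The detector, the pure-translation transform, and the pixel threshold below are all hypothetical stand-ins, used here only to show repeatability as a measurable quantity rather than the self-supervised training losses themselves.

```python
import numpy as np

def detect(image, k=5):
    """Toy detector: return (row, col) of the k strongest responses."""
    flat = np.argsort(image, axis=None)[-k:]
    return np.stack(np.unravel_index(flat, image.shape), axis=1).astype(float)

def repeatability(pts_a, pts_b, transform, thresh=2.0):
    """Fraction of points in view A that land within `thresh` pixels of
    some detection in view B once mapped through the A->B transform."""
    mapped = np.array([transform(p) for p in pts_a])
    dists = np.linalg.norm(mapped[:, None, :] - pts_b[None, :, :], axis=2)
    return float(np.mean(dists.min(axis=1) < thresh))

# Two synthetic "views" related by a 3-pixel horizontal shift.
rng = np.random.default_rng(0)
view_a = rng.random((32, 32))
view_b = np.roll(view_a, 3, axis=1)

pts_a = detect(view_a)
pts_b = detect(view_b)
score = repeatability(pts_a, pts_b, lambda p: p + np.array([0.0, 3.0]))
print(score)
```

In a self-supervised setup of the kind the abstract describes, a differentiable analogue of this score would be maximized during training, alongside terms encouraging sparsity and multi-view consistency; here the wrap-around at the image border from `np.roll` means a detection near the right edge may not be recovered, so the score need not be exactly 1.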
