Developing autonomous agents requires them to understand and interact with a 3D environment. The capability to infer, model, and apply 3D representations is crucial in AI for a range of applications, including robotic manipulation, autonomous driving, virtual reality, and photo editing. The long-standing goal of understanding 3D spaces in computer vision has made significant strides with the advent of deep learning methods. This area explores the integration of 3D vision with learning-based approaches, highlighting recent breakthroughs in the field.
It covers topics including:
- Explicit, Implicit, and Neural 3D Representations
- Differentiable Rendering
- Single-view 3D Prediction: Objects, Scenes, and Humans
- Neural Rendering
- Multi-view 3D Inference: Radiance Fields, Multi-plane Images, Implicit Surfaces, etc.
- Generative 3D Models
- Shape Abstraction
- Mesh and Point Cloud Processing
Hands-on work includes:
1. Learning the basics of rendering with PyTorch3D, exploring 3D representations, and practicing the construction of simple geometry (see the rendering sketch after this list)
2. Exploring loss functions and decoder architectures for regressing voxel, point cloud, and mesh representations from single-view RGB input (see the loss sketch after this list)
3. Volume Rendering and Neural Radiance Fields (see the volume-rendering sketch after this list)
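A minimal sketch of the rendering setup referenced in item 1, assuming PyTorch3D's standard mesh rendering pipeline and a placeholder `cow.obj` mesh file:

```python
import torch
from pytorch3d.io import load_objs_as_meshes
from pytorch3d.renderer import (
    FoVPerspectiveCameras, RasterizationSettings, MeshRenderer,
    MeshRasterizer, SoftPhongShader, PointLights, look_at_view_transform,
)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Load a mesh (the .obj path here is a placeholder).
meshes = load_objs_as_meshes(["cow.obj"], device=device)

# Place a camera on a circle around the object.
R, T = look_at_view_transform(dist=3.0, elev=10.0, azim=45.0)
cameras = FoVPerspectiveCameras(R=R, T=T, device=device)

# Rasterize at 256x256 and shade with a simple Phong shader.
raster_settings = RasterizationSettings(image_size=256)
lights = PointLights(location=[[0.0, 2.0, -2.0]], device=device)
renderer = MeshRenderer(
    rasterizer=MeshRasterizer(cameras=cameras, raster_settings=raster_settings),
    shader=SoftPhongShader(cameras=cameras, lights=lights, device=device),
)

image = renderer(meshes)  # (1, 256, 256, 4) RGBA image tensor
```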
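For item 2, the fitting losses typically look something like the following sketch; the function names and the smoothness weight are illustrative, while the Chamfer and Laplacian terms come from `pytorch3d.loss`:

```python
import torch
from pytorch3d.loss import chamfer_distance, mesh_laplacian_smoothing
from pytorch3d.ops import sample_points_from_meshes

# Voxel loss: binary cross-entropy between predicted occupancy logits
# and ground-truth occupancy grids.
def voxel_loss(pred_logits, gt_occupancy):
    return torch.nn.functional.binary_cross_entropy_with_logits(pred_logits, gt_occupancy)

# Point cloud loss: symmetric Chamfer distance between point sets of shape (B, P, 3).
def point_loss(pred_points, gt_points):
    loss, _ = chamfer_distance(pred_points, gt_points)
    return loss

# Mesh loss: Chamfer distance on points sampled from the predicted mesh,
# plus a Laplacian smoothness regularizer.
def mesh_loss(pred_mesh, gt_points, n_samples=5000, w_smooth=0.1):
    pred_points = sample_points_from_meshes(pred_mesh, n_samples)
    chamfer, _ = chamfer_distance(pred_points, gt_points)
    return chamfer + w_smooth * mesh_laplacian_smoothing(pred_mesh)
```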
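And for item 3, the core of NeRF-style volume rendering is the discretized compositing of per-sample densities and colors along each ray; a minimal sketch (tensor shapes are assumptions):

```python
import torch

def volume_render(sigmas, colors, deltas):
    """Composite per-sample densities and colors along each ray.

    sigmas: (R, S)    non-negative densities at S samples per ray
    colors: (R, S, 3) RGB at each sample
    deltas: (R, S)    distances between adjacent samples
    """
    alphas = 1.0 - torch.exp(-sigmas * deltas)            # opacity of each segment
    trans = torch.cumprod(1.0 - alphas + 1e-10, dim=-1)   # transmittance through sample i
    trans = torch.cat([torch.ones_like(trans[:, :1]), trans[:, :-1]], dim=-1)
    weights = alphas * trans                               # contribution of each sample
    rgb = (weights.unsqueeze(-1) * colors).sum(dim=-2)     # (R, 3) rendered color
    return rgb, weights
```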
1. Implementing sphere tracing for rendering an SDF (see the sphere-tracing sketch after this list)
2. Implementing an MLP architecture for a neural SDF, and training this neural SDF on point cloud data (see the neural SDF sketch after this list)
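A sketch of the sphere-tracing loop from item 1 above, assuming `sdf` is a callable that returns signed distances for a batch of points:

```python
import torch

def sphere_trace(sdf, origins, directions, n_iters=64, eps=1e-5, far=10.0):
    """March each ray forward by the SDF value until it (nearly) hits the surface.

    sdf:        callable mapping (N, 3) points to (N, 1) signed distances
    origins:    (N, 3) ray origins
    directions: (N, 3) unit ray directions
    """
    t = torch.zeros(origins.shape[0], 1, device=origins.device)
    hit = torch.zeros(origins.shape[0], 1, dtype=torch.bool, device=origins.device)
    for _ in range(n_iters):
        points = origins + t * directions
        dist = sdf(points)
        hit = hit | (dist.abs() < eps)
        # Only advance rays that have not converged and are still in bounds.
        t = torch.where(hit | (t > far), t, t + dist)
    return origins + t * directions, hit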
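And a sketch of the neural SDF from item 2: a plain coordinate MLP (positional encoding omitted) trained with a surface loss plus an eikonal regularizer; the loss weights and sampling strategy are assumptions:

```python
import torch
import torch.nn as nn

class NeuralSDF(nn.Module):
    """MLP mapping 3D points to signed distances."""
    def __init__(self, hidden=256, n_layers=6):
        super().__init__()
        layers, dim = [], 3
        for _ in range(n_layers):
            layers += [nn.Linear(dim, hidden), nn.ReLU(inplace=True)]
            dim = hidden
        layers += [nn.Linear(dim, 1)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

def sdf_losses(model, surface_points, box_points, w_eik=0.1):
    # Points on the scanned surface should have zero signed distance.
    surf_loss = model(surface_points).abs().mean()
    # Eikonal term: the gradient of a valid SDF has unit norm everywhere.
    box_points = box_points.requires_grad_(True)
    d = model(box_points)
    grad = torch.autograd.grad(d.sum(), box_points, create_graph=True)[0]
    eik_loss = ((grad.norm(dim=-1) - 1.0) ** 2).mean()
    return surf_loss + w_eik * eik_loss
```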
VolSDF
1. Implementing a function for converting an SDF into volume density (see the sketch below)
2. Extending the NeuralSurface class to predict color
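VolSDF converts a signed distance into density via the CDF of a zero-mean Laplace distribution, σ(x) = α Ψ_β(−d(x)); a sketch with illustrative α and β values:

```python
import torch

def sdf_to_density(signed_distance, alpha=10.0, beta=0.05):
    """VolSDF-style density: sigma = alpha * Psi_beta(-sdf), where Psi_beta is
    the CDF of a zero-mean Laplace distribution with scale beta."""
    s = -signed_distance
    psi = torch.where(
        s <= 0,
        0.5 * torch.exp(s / beta),
        1.0 - 0.5 * torch.exp(-s / beta),
    )
    return alpha * psi
```

The color prediction in item 2 is typically a second MLP head queried at the same sample points, so that volume rendering composites both the density and the predicted RGB along each ray.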
Phong Relighting
Implementing the Phong reflection model to render the trained SDF volume under different lighting conditions (see the sketch below)
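A sketch of the Phong reflection model (ambient + diffuse + specular); the per-point material coefficients and light setup here are assumptions:

```python
import torch

def phong_shading(normals, view_dirs, light_dirs, light_color,
                  ambient, diffuse, specular, shininess=32.0):
    """Per-point Phong reflection: ambient + diffuse + specular terms.

    Direction tensors are (N, 3) and assumed normalized; the material
    coefficients are (N, 3) or broadcastable color tensors.
    """
    # Diffuse: Lambertian cosine term.
    n_dot_l = (normals * light_dirs).sum(-1, keepdim=True).clamp(min=0.0)
    # Specular: reflect the light direction about the normal.
    reflect = 2.0 * n_dot_l * normals - light_dirs
    r_dot_v = (reflect * view_dirs).sum(-1, keepdim=True).clamp(min=0.0)
    return ambient + light_color * (diffuse * n_dot_l
                                    + specular * r_dot_v ** shininess)
```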
Point Cloud Classification
Implementing a classification model that classifies point clouds across three classes (chairs, vases, and lamps; see the sketch below)
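A minimal PointNet-style classifier sketch: a shared per-point MLP, a permutation-invariant max pool, and a small classification head (layer widths are illustrative):

```python
import torch
import torch.nn as nn

class PointNetClassifier(nn.Module):
    """PointNet-style classifier: per-point MLP, max pool, then an MLP head."""
    def __init__(self, n_classes=3):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(inplace=True),
            nn.Linear(64, 128), nn.ReLU(inplace=True),
            nn.Linear(128, 1024), nn.ReLU(inplace=True),
        )
        self.head = nn.Sequential(
            nn.Linear(1024, 256), nn.ReLU(inplace=True),
            nn.Linear(256, n_classes),
        )

    def forward(self, points):                    # points: (B, N, 3)
        feats = self.point_mlp(points)            # (B, N, 1024) per-point features
        global_feat = feats.max(dim=1).values     # permutation-invariant pooling
        return self.head(global_feat)             # (B, n_classes) logits
```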
Point Cloud Segmentation
Implementing a segmentation model that segments the points of chair objects into six semantic classes (see the sketch below)
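The segmentation variant keeps per-point features and concatenates the pooled global feature back onto each point before predicting per-point logits; again a sketch with illustrative widths:

```python
import torch
import torch.nn as nn

class PointNetSegmenter(nn.Module):
    """Per-point segmentation: fuse local features with a pooled global feature."""
    def __init__(self, n_classes=6):
        super().__init__()
        self.point_mlp = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(inplace=True),
            nn.Linear(64, 128), nn.ReLU(inplace=True),
        )
        self.seg_head = nn.Sequential(
            nn.Linear(128 + 128, 128), nn.ReLU(inplace=True),
            nn.Linear(128, n_classes),
        )

    def forward(self, points):                               # (B, N, 3)
        feats = self.point_mlp(points)                       # (B, N, 128)
        global_feat = feats.max(dim=1, keepdim=True).values  # (B, 1, 128)
        global_feat = global_feat.expand(-1, feats.shape[1], -1)
        return self.seg_head(torch.cat([feats, global_feat], dim=-1))  # (B, N, n_classes)
```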
Structure from Motion
MVS and SLAM