Signed Distance Function

An overview of SDF

Outline

  • What is SDF?
    • Mathematical Definition
    • Characteristics
    • Pros and Cons
  • SDF in 3D reconstruction
    • NeuS (NeurIPS 2021, Wang et al.)
    • Neuralangelo (CVPR 2023, Li et al.)
  • SDF in 3D generation
    • One-2-3-45 (NeurIPS 2023, Liu et al.)
    • BlockFusion (arXiv 2024, Wu et al.)

SDF

Original Definition (positive inside \Omega)

f(x) = \begin{cases} d(x, \partial\Omega) & \text{if } x \in \Omega \\ -d(x, \partial\Omega) & \text{if } x \notin \Omega \end{cases}

Definition in NeuS (negative inside \Omega)

f(x) = \begin{cases} -d(x, \partial\Omega) & \text{if } x \in \Omega \\ d(x, \partial\Omega) & \text{if } x \notin \Omega \end{cases}

\partial\Omega: \text{Boundary}
d(x, \partial\Omega) = \inf_{y \in \partial\Omega} \|x - y\|: \text{Distance from } x \text{ to the boundary}

SDF: Signed Distance Function

The shortest distance from a given point to the nearest surface

SDF

Characteristic

\text{SDF}: f(x, y, z)

Surface

\{(x, y, z) \mid f(x, y, z) = 0\}

Surface Normal

\nabla f

Eikonal Equation

|\nabla f| = 1

SDF

Characteristic

Example: the unit circle in 2D, with (x,y) an arbitrary query point
f(x,y)=\sqrt{x^2+y^2}-1

\begin{align*} \nabla f &= \left( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right) \\ &= \left(\frac{2x}{2\sqrt{x^2+y^2}}, \frac{2y}{2\sqrt{x^2+y^2}}\right) \\ &= \left(\frac{x}{\sqrt{x^2+y^2}}, \frac{y}{\sqrt{x^2+y^2}}\right) \end{align*}
|\nabla f| = \sqrt{\frac{x^2+y^2}{x^2+y^2}} = 1 \quad \text{for } (x,y) \neq (0,0)
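The unit-gradient property derived above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`circle_sdf`, `numerical_gradient`, `gradient_norm`) and the sample points are illustrative choices, not part of the original slides.

```python
import math

def circle_sdf(x, y, radius=1.0):
    """Signed distance to a circle centred at the origin (negative inside)."""
    return math.hypot(x, y) - radius

def numerical_gradient(f, x, y, eps=1e-5):
    """Central-difference estimate of (df/dx, df/dy)."""
    dfdx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    dfdy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return dfdx, dfdy

def gradient_norm(f, x, y):
    """Euclidean norm of the estimated gradient."""
    gx, gy = numerical_gradient(f, x, y)
    return math.hypot(gx, gy)

# Arbitrary query points inside, on, and outside the circle (none at the origin).
sample_points = [(0.3, 0.4), (1.0, 0.0), (-2.0, 1.5), (0.1, -3.0)]
norms = [gradient_norm(circle_sdf, x, y) for x, y in sample_points]

for (x, y), n in zip(sample_points, norms):
    print(f"({x:+.1f}, {y:+.1f}): f = {circle_sdf(x, y):+.3f}, |grad f| = {n:.6f}")
```

Every printed gradient norm should be close to 1, regardless of whether the point lies inside or outside the circle.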

SDF

Pros and Cons

Pros

  • A high-precision, continuous surface representation
  • Better surface quality
  • Easy ray casting and collision detection

Cons

  • Memory consumption
  • Cannot be edited directly
  • Computationally intensive
  • How to perform volume rendering?
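The "easy ray casting" point can be illustrated with sphere tracing: because the SDF value is the distance to the nearest surface, a ray can always advance by that amount without overshooting. This is a minimal sketch under assumed settings; the scene (one sphere), the function names, and the tolerances are hypothetical and not taken from the slides.

```python
import math

def sphere_sdf(p, center=(0.0, 0.0, 3.0), radius=1.0):
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius

def sphere_trace(sdf, origin, direction, max_steps=64, hit_eps=1e-6, t_max=100.0):
    """March along a unit-length direction; return the hit depth, or None on a miss."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if dist < hit_eps:
            return t          # close enough to the surface
        t += dist             # safe step: nothing is nearer than dist
        if t > t_max:
            return None       # left the region of interest
    return None

hit_depth = sphere_trace(sphere_sdf, (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
miss = sphere_trace(sphere_sdf, (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(hit_depth, miss)
```

The first ray points at the sphere and reports the depth of its front face; the second ray points away and reports a miss. Collision detection follows the same idea: a point is inside an object when its SDF value is negative.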

3D reconstruction

Directly use 2D images as supervision

  • NeuS (NeurIPS 2021, Wang et al.)
  • Neuralangelo (CVPR 2023, Li et al.)

NeuS (NeurIPS 2021, Wang et al.)

Author of F2-NeRF

NeuS (NeurIPS 2021, Wang et al.)

Recap: NeRF

MLP: (\mathbf{r}(t), \mathbf{d}) \rightarrow (c, \sigma)

Volume Rendering

C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\sigma(\mathbf{r}(t))c(\mathbf{r}(t), \mathbf{d})dt
T(t) = \exp\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))ds\right)
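In practice the integral C(r) is approximated from a finite number of samples along the ray. The sketch below uses the usual alpha-compositing quadrature, with alpha_i = 1 - exp(-sigma_i * delta_i) and T_i the product of (1 - alpha_j) over earlier samples. The toy densities, colors, and the function name `render_ray` are assumptions for illustration only.

```python
import math

def render_ray(sigmas, colors, deltas):
    """Approximate C(r) as sum_i T_i * alpha_i * c_i over samples along one ray."""
    transmittance = 1.0              # T_i: fraction of light not yet absorbed
    pixel = [0.0, 0.0, 0.0]
    weights = []
    for sigma, color, delta in zip(sigmas, colors, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)   # opacity of this segment
        weight = transmittance * alpha           # w_i = T_i * alpha_i
        weights.append(weight)
        pixel = [acc + weight * ch for acc, ch in zip(pixel, color)]
        transmittance *= 1.0 - alpha
    return pixel, weights

# Toy ray: empty space, then a dense red slab, then a blue sample hidden behind it.
sigmas = [0.0, 0.0, 50.0, 50.0, 50.0]
colors = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
deltas = [0.2] * 5

pixel, weights = render_ray(sigmas, colors, deltas)
print(pixel, weights)
```

The rendered pixel is almost pure red: the blue sample receives almost no weight because the transmittance has already dropped to nearly zero by the time the ray reaches it.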

NeuS (NeurIPS 2021, Wang et al.)

MLP: (\mathbf{r}(t), \mathbf{d}) \rightarrow (c, f(\mathbf{r}(t)))

Volume Rendering

C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\sigma(\mathbf{r}(t))c(\mathbf{r}(t), \mathbf{d})dt
T(t) = \exp\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))ds\right)

How to perform volume rendering when the MLP outputs an SDF value f instead of a density \sigma?

Can some function g map the SDF value to a density?

f(\mathbf{r}(t))\rightarrow \sigma(\mathbf{r}(t))
g\left(f(\mathbf{r}(t))\right) = \sigma(\mathbf{r}(t))

NeuS (NeurIPS 2021, Wang et al.)

C(\mathbf{r}) = \int_{t_n}^{t_f} \underbrace{w(t)}_\text{weight function} c(\mathbf{r}(t), \mathbf{d})dt = \int_{t_n}^{t_f} T(t)\sigma(\mathbf{r}(t))c(\mathbf{r}(t), \mathbf{d})dt

Volume Rendering

  1. Unbiased: the weight function attains a local maximum at the surface, i.e. where f(\mathbf{r}(t)) = 0
  2. Occlusion-aware: when two points on a ray have the same SDF value, the one closer to the viewpoint should have the larger weight

NeuS (NeurIPS 2021, Wang et al.)

Volume Rendering

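Naive solution: use standard volume rendering with \sigma = \phi_s(f)

w(t) = T(t) \sigma(t)
T(t) = \exp\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))ds\right)
g\left(f(\mathbf{r}(t))\right) = \sigma(\mathbf{r}(t))
g(x) = \phi_s (x) = se^{-sx} / (1 + e^{-sx})^2

\phi_s: logistic density distribution

Biased! The weight reaches its maximum before the ray reaches the surface.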
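The bias of the naive solution can be reproduced numerically. The sketch below assumes a single planar surface hit head-on at depth t* = 2, so that the SDF along the ray is f(t) = -(t - t*), and integrates the transmittance with a simple Riemann sum. The values of s, t*, and the step size are arbitrary choices for illustration.

```python
import math

def logistic_density(x, s):
    """phi_s(x) = s * exp(-s x) / (1 + exp(-s x))^2."""
    e = math.exp(-s * x)
    return s * e / (1.0 + e) ** 2

def naive_weights(ts, sdf_along_ray, s):
    """w(t) = T(t) * sigma(t) with sigma = phi_s(f), using a left Riemann sum for T."""
    dt = ts[1] - ts[0]
    accumulated = 0.0                 # running integral of sigma
    weights = []
    for t in ts:
        sigma = logistic_density(sdf_along_ray(t), s)
        weights.append(math.exp(-accumulated) * sigma)
        accumulated += sigma * dt
    return weights

t_star = 2.0                          # depth of the surface along the ray
s = 10.0                              # sharpness of the logistic density
ts = [i * 0.001 for i in range(4001)]
weights = naive_weights(ts, lambda t: -(t - t_star), s)

t_peak = ts[max(range(len(ts)), key=weights.__getitem__)]
print(f"surface at t = {t_star}, weight peaks at t = {t_peak:.3f}")
```

In this toy setup the peak lands slightly in front of the surface. The reason is that dw/dt = T (\sigma' - \sigma^2), and at the surface \sigma' = 0 while \sigma^2 > 0, so the weight is already decreasing when the ray arrives there.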

NeuS (NeurIPS 2021, Wang et al.)

Volume Rendering

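Direct solution: normalize \phi_s along the ray (\mathbf{p}(t) is the point on the ray at depth t, written \mathbf{r}(t) in the NeRF recap)

w(t) = \frac{\phi_s(f(\mathbf{p}(t)))}{\int_{0}^{+\infty} \phi_s(f(\mathbf{p}(u)))du}
g(x) = \phi_s (x) = se^{-sx} / (1 + e^{-sx})^2

\phi_s: logistic density distribution

Unbiased, but not occlusion-aware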

NeuS (NeurIPS 2021, Wang et al.)

Volume Rendering

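Goal: to derive a weight function that is both unbiased and occlusion-aware

w(t) = T(t) \rho (t)
T(t) = \exp\left(-\int_{0}^{t} \rho(u) du\right)

\rho(t): opaque density, the counterpart of \sigma in standard volume rendering, to be derived from the SDF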

NeuS (NeurIPS 2021, Wang et al.)

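Volume Rendering

Assume the ray crosses a single plane at depth t^*, with \theta the angle between the view direction and the outward surface normal, so that f(\mathbf{p}(t)) = -|\cos(\theta)| \cdot (t - t^*).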

\begin{align*} w(t) &= \lim_{t^* \to +\infty} \frac{\phi_s(f(\mathbf{p}(t)))}{\int_{0}^{+\infty} \phi_s(f(\mathbf{p}(u)))du} \\ &= \lim_{t^* \to +\infty} \frac{\phi_s(f(\mathbf{p}(t)))}{\int_{0}^{+\infty} \phi_s(-|\cos(\theta)|(u - t^*))du} \\ &= \lim_{t^* \to +\infty} \frac{\phi_s(f(\mathbf{p}(t)))}{\int_{-t^*}^{+\infty} \phi_s(-|\cos(\theta)|u^*)du^*} \\ &= \lim_{t^* \to +\infty} \frac{\phi_s(f(\mathbf{p}(t)))}{|\cos(\theta)|^{-1}\int_{-|\cos(\theta)|t^*}^{+\infty} \phi_s(\hat{u})d\hat{u}} \\ &= |\cos(\theta)| \phi_s(f(\mathbf{p}(t))). \end{align*}
w(t) = T(t) \rho (t)
T(t) \rho (t) = |\cos(\theta)| \phi_s(f(\mathbf{p}(t)))

NeuS (NeurIPS 2021, Wang et al.)

Volume Rendering

w(t) = T(t) \rho (t)
T(t) \rho (t) = |\cos(\theta)| \phi_s(f(\mathbf{p}(t)))
T(t) = \exp\left(-\int_{0}^{t} \rho(u) du\right)
T(t) \rho (t) = - \frac{d T}{d t}(t)
f(\mathbf{p}(t)) = -|\cos(\theta)| \cdot (t - t^*)
T(t) \rho (t) = |\cos(\theta)| \phi_s(f(\mathbf{p}(t))) = - \frac{d \Phi_s}{dt} \left(f(\mathbf{p}(t))\right)

NeuS (NeurIPS 2021, Wang et al.)

Volume Rendering

T(t) = \exp\left(-\int_{0}^{t} \rho(u) du\right) \Rightarrow T(t) \rho (t) = - \frac{d T}{d t}(t)
T(t) \rho (t) = |\cos(\theta)| \phi_s(f(\mathbf{p}(t))) = - \frac{d \Phi_s}{dt} \left(f(\mathbf{p}(t))\right)
\Rightarrow \frac{d T}{d t}(t) = \frac{d \Phi_s}{dt} \left(f(\mathbf{p}(t))\right)
\Rightarrow T(t) = \Phi_s \left(f(\mathbf{p}(t))\right)
\Rightarrow \int_{0}^{t} \rho(u) du = - \ln \left(\Phi_s \left(f(\mathbf{p}(t))\right) \right)
\Rightarrow \rho(t) = \frac{-\frac{d\Phi_s}{dt} \left( f(\mathbf{p}(t)) \right)}{\Phi_s(f(\mathbf{p}(t)))}.
w(t) = T(t) \rho (t)

\Phi_s(x) = (1 + e^{-sx})^{-1}: sigmoid function, the integral of \phi_s

NeuS (NeurIPS 2021, Wang et al.)

Volume Rendering

\rho(t) = \max\left(\frac{-\frac{d\Phi_s}{dt} \left( f(\mathbf{p}(t)) \right)}{\Phi_s(f(\mathbf{p}(t)))}, 0\right).
The clipping at 0 handles ray segments where the SDF increases, i.e. where the ray is leaving a surface.
T(t) = \exp\left(-\int_{0}^{t} \rho(u) du\right)
w(t) = T(t) \rho (t)
C(\mathbf{r}) = \int_{t_n}^{t_f} \underbrace{w(t)}_\text{weight function} c(\mathbf{r}(t), \mathbf{d})dt
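A discrete version of this weight can be sketched as follows. It uses the per-segment opacity alpha_i = max((\Phi_s(f_i) - \Phi_s(f_{i+1})) / \Phi_s(f_i), 0), which is how the NeuS paper discretizes \rho, combined with ordinary alpha compositing. The single-plane test ray, the value of s, and the function names are assumptions chosen for illustration.

```python
import math

def sigmoid(x, s):
    """Phi_s(x) = 1 / (1 + exp(-s x))."""
    return 1.0 / (1.0 + math.exp(-s * x))

def neus_weights(sdf_values, s):
    """Per-segment weights w_i = T_i * alpha_i from SDF samples along one ray."""
    transmittance = 1.0
    weights = []
    for f_near, f_far in zip(sdf_values[:-1], sdf_values[1:]):
        cdf_near = sigmoid(f_near, s)
        cdf_far = sigmoid(f_far, s)
        # Clipped at 0 so segments where the SDF increases contribute nothing.
        alpha = max((cdf_near - cdf_far) / cdf_near, 0.0)
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights

t_star = 2.0                                   # depth of the surface along the ray
s = 10.0
ts = [i * 0.001 for i in range(4001)]
sdf_values = [-(t - t_star) for t in ts]       # single plane, hit head-on
weights = neus_weights(sdf_values, s)

peak_index = max(range(len(weights)), key=weights.__getitem__)
t_peak = 0.5 * (ts[peak_index] + ts[peak_index + 1])   # midpoint of the peak segment
print(f"surface at t = {t_star}, weight peaks near t = {t_peak:.4f}")
```

In this toy setup the weight peaks at the surface itself, in contrast to the naive \sigma = \phi_s(f) choice, and a ray segment that moves away from a surface receives zero weight.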

NeuS (NeurIPS 2021, Wang et al.)

Loss Terms

\mathcal{L} = \mathcal{L}_{color} + \lambda\mathcal{L}_{reg} + \beta\mathcal{L}_{mask}
\mathcal{L}_{reg} = \frac{1}{nm} \sum_{k,i} \left( \|\nabla f(\hat{p}_{k,i})\|_2 - 1 \right)^2.
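The regularization term is an Eikonal penalty: it is zero when the gradient of f has unit norm at every sampled point. The sketch below evaluates it for a true SDF and for a field with the same zero level set but a doubled gradient. It uses central differences as a stand-in for automatic differentiation, and all names and sample points are hypothetical.

```python
import math

def numerical_gradient(field, p, eps=1e-4):
    """Central-difference gradient of a scalar field at a 3D point."""
    grad = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append((field(hi) - field(lo)) / (2.0 * eps))
    return grad

def eikonal_loss(field, points):
    """Mean of (||grad f(p)||_2 - 1)^2 over the sampled points."""
    total = 0.0
    for p in points:
        grad_norm = math.hypot(*numerical_gradient(field, p))
        total += (grad_norm - 1.0) ** 2
    return total / len(points)

def true_sdf(p):
    """Exact SDF of the unit sphere."""
    return math.hypot(*p) - 1.0

def scaled_field(p):
    """Same zero level set as the unit sphere, but gradient norm 2 everywhere."""
    return 2.0 * true_sdf(p)

points = [(0.5, 0.2, -0.1), (1.5, 0.0, 0.3), (-0.7, 0.9, 0.4), (0.0, -2.0, 1.0)]
loss_true = eikonal_loss(true_sdf, points)
loss_scaled = eikonal_loss(scaled_field, points)
print(loss_true, loss_scaled)
```

The true SDF gives a loss near 0, while the scaled field gives a loss near 1 even though both describe the same surface. This is why the term is needed: the color loss alone does not force the network output to be a valid distance.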

NeuS (NeurIPS 2021, Wang et al.)

Results

NeuS (NeurIPS 2021, Wang et al.)

Results

Chamfer Distance

Neuralangelo (CVPR 2023, Li et al.)

Author of InstantNGP

Neuralangelo (CVPR 2023, Li et al.)

  • State of the art in SDF-based 3D reconstruction
  • Builds on NeuS
    • MLP outputs SDF and RGB
  • Integrates InstantNGP
    • Multi-resolution hash grid
  • Introduces numerical gradients

Neuralangelo (CVPR 2023, Li et al.)

Recap: InstantNGP

Neuralangelo (CVPR 2023, Li et al.)

Numerical Gradients

Analytical gradient: only updates the hash-grid entries near the sample point

Numerical gradient: the step size eps can be chosen to "smooth" the output
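The smoothing effect of the step size can be seen on a one-dimensional toy function. The sketch below takes a hypothetical distance-like function with a small high-frequency ripple and compares a central difference with a tiny eps against one with a larger eps. The ripple amplitude and frequency are arbitrary assumptions, and this is not the Neuralangelo implementation.

```python
import math

FREQ = 200.0    # ripple frequency (assumed)
AMP = 0.01      # ripple amplitude (assumed)

def bumpy_distance(x):
    """Hypothetical 1D distance f(x) = x with a small high-frequency ripple."""
    return x + AMP * math.sin(FREQ * x)

def central_difference(f, x, eps):
    """Numerical derivative with step size eps."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)

x0 = 0.0
# A tiny step follows the ripple: the estimate is close to 1 + AMP * FREQ = 3.
fine = central_difference(bumpy_distance, x0, eps=1e-5)
# A step spanning half a ripple period on each side averages the ripple out.
coarse = central_difference(bumpy_distance, x0, eps=math.pi / FREQ)
print(fine, coarse)
```

With the larger step the estimate returns to the slope of the underlying smooth function. A coarse-to-fine schedule, starting with a large eps and shrinking it during training, follows the same logic.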

Neuralangelo (CVPR 2023, Li et al.)

Coarse-to-fine

Neuralangelo (CVPR 2023, Li et al.)

Loss Terms

\mathcal{L} = \mathcal{L}_{\text{RGB}} + w_{\text{eik}}\mathcal{L}_{\text{eik}} + w_{\text{curv}}\mathcal{L}_{\text{curv}}
\mathcal{L}_{\text{curv}} = \frac{1}{N} \sum_{i=1}^{N} \left| \nabla^2 f(\mathbf{x}_i) \right|
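The curvature term penalizes the Laplacian of the SDF, which can also be estimated with finite differences. The sketch below compares a sphere of radius 2, whose SDF has Laplacian 2 / \|x\| in 3D, with a flat plane, whose SDF has zero Laplacian. The helper names, step size, and sample points are illustrative assumptions.

```python
import math

def laplacian(field, p, eps=1e-3):
    """Second-order central-difference Laplacian of a scalar field at a 3D point."""
    center = field(p)
    total = 0.0
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        total += (field(hi) - 2.0 * center + field(lo)) / (eps * eps)
    return total

def curvature_loss(field, points):
    """Mean absolute Laplacian over the sampled points."""
    return sum(abs(laplacian(field, p)) for p in points) / len(points)

def sphere_sdf(p):
    """SDF of a sphere of radius 2 centred at the origin."""
    return math.hypot(*p) - 2.0

def plane_sdf(p):
    """SDF of the horizontal plane z = 0.5."""
    return p[2] - 0.5

# Points on the sphere surface, where the Laplacian is 2 / 2 = 1.
surface_points = [(2.0, 0.0, 0.0), (0.0, 1.2, 1.6), (-1.6, 0.0, 1.2)]
loss_sphere = curvature_loss(sphere_sdf, surface_points)
loss_plane = curvature_loss(plane_sdf, surface_points)
print(loss_sphere, loss_plane)
```

Flat geometry incurs no penalty, while curved geometry does, so the term acts as a smoothness prior on the reconstructed surface.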

Neuralangelo (CVPR 2023, Li et al.)

Results

3D Generation

  • One-2-3-45 (NeurIPS 2023, Liu et al.)
    • 3D object generation
  • BlockFusion (arXiv 2024, Wu et al.)
    • 3D scene generation

One-2-3-45 (NeurIPS 2023, Liu et al.)

Goal

One-2-3-45 (NeurIPS 2023, Liu et al.)

Methods

One-2-3-45 (NeurIPS 2023, Liu et al.)

Training

  • Datasets
    • Objaverse-LVIS
    • 46k 3D models in 1156 categories
    • RGB-D
  • Training Time
    • Trained on 2 A10 GPUs
    • 6 days

One-2-3-45 (NeurIPS 2023, Liu et al.)

Results

BlockFusion (arXiv 2024, Wu et al.)

Goal

Generate unbounded 3D scene geometry conditioned on a 2D layout

BlockFusion (arXiv 2024, Wu et al.)

Datasets

  • Room
    • 3D-FRONT and 3D-FUTURE
  • City and Village
    • Designed by artists

BlockFusion (arXiv 2024, Wu et al.)

Methods

BlockFusion (arXiv 2024, Wu et al.)

Methods - Raw Tri-plane Fitting

Figure: a training block and its corresponding tri-plane

|\Omega_0| = 500000 \quad \text{(points sampled on the surface, with normals } n_p \text{)}
|\Omega| = 100000 \quad \text{(points sampled off the surface, with ground-truth distances } d_p \text{)}
\Phi(p): \text{SDF value decoded from the tri-plane at } p
\mathcal{L}_{\text{geo}} = \mathcal{L}_{\text{SDF}} + \mathcal{L}_{\text{Normal}} + \mathcal{L}_{\text{Eikonal}}
\mathcal{L}_{\text{SDF}} = \lambda_1 \sum_{p \in \Omega_0} \left\| \Phi(p) \right\| + \lambda_2 \sum_{p \in \Omega} \left\| \Phi(p) - d_p \right\|
\mathcal{L}_{\text{Normal}} = \lambda_3 \sum_{p \in \Omega_0} \left\| \nabla_p \Phi(p) - n_p \right\|
\mathcal{L}_{\text{Eikonal}} = \lambda_4 \sum_{p \in \Omega_0} \left\| | \nabla_p \Phi(p)| - 1 \right\|
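The three geometry terms can be sketched on a toy example. The code below evaluates them for a candidate field against a handful of hand-made samples from a unit sphere. It assumes absolute values and Euclidean norms for the unspecified norms, sets all \lambda weights to 1, and replaces the tri-plane decoder with plain Python functions; none of these choices come from the BlockFusion paper.

```python
import math

def numerical_gradient(field, p, eps=1e-4):
    """Central-difference gradient of a scalar field at a 3D point."""
    grad = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        grad.append((field(hi) - field(lo)) / (2.0 * eps))
    return grad

def geometry_loss(field, surface_samples, space_samples, lambdas=(1.0, 1.0, 1.0, 1.0)):
    """L_SDF + L_Normal + L_Eikonal, with assumed norms and weights."""
    l1, l2, l3, l4 = lambdas
    # SDF term: zero on the surface, ground-truth distance elsewhere.
    loss = l1 * sum(abs(field(p)) for p, _ in surface_samples)
    loss += l2 * sum(abs(field(p) - d) for p, d in space_samples)
    for p, normal in surface_samples:
        grad = numerical_gradient(field, p)
        loss += l3 * math.dist(grad, normal)           # normal term
        loss += l4 * abs(math.hypot(*grad) - 1.0)      # Eikonal term
    return loss

def unit_sphere(p):
    return math.hypot(*p) - 1.0

def wrong_radius(p):
    return math.hypot(*p) - 1.1

# (point, normal) pairs on the surface and (point, distance) pairs in space.
surface_samples = [((1.0, 0.0, 0.0), (1.0, 0.0, 0.0)),
                   ((0.0, 0.6, 0.8), (0.0, 0.6, 0.8)),
                   ((0.0, 0.0, -1.0), (0.0, 0.0, -1.0))]
space_samples = [((2.0, 0.0, 0.0), 1.0), ((0.0, 0.5, 0.0), -0.5), ((0.0, 3.0, 4.0), 4.0)]

loss_exact = geometry_loss(unit_sphere, surface_samples, space_samples)
loss_wrong = geometry_loss(wrong_radius, surface_samples, space_samples)
print(loss_exact, loss_wrong)
```

The exact field gives a loss near 0, and the field with the wrong radius is penalized through the SDF term even though its normals and gradient norms are still correct.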

BlockFusion (arXiv 2024, Wu et al.)

Methods - Latent Tri-plane

BlockFusion (arXiv 2024, Wu et al.)

Methods

How to generate an unbounded scene?

BlockFusion (arXiv 2024, Wu et al.)

Methods - Latent Tri-plane Extrapolation

BlockFusion (arXiv 2024, Wu et al.)

Time

  • NVIDIA V100
  • Training
    • Tri-plane fitting: 4750 GPU hr.
    • Auto-encoder: 768 GPU hr.
    • Diffusion: 384 GPU hr.
  • Inference
    • 6 minutes per block
    • Large indoor scene: 3 hr.

BlockFusion (arXiv 2024, Wu et al.)

Results

Conclusion

  • SDFs can represent high-quality surfaces and are well suited to obtaining accurate surface normals
  • A mesh can be extracted from an SDF with a marching algorithm (e.g. Marching Cubes)
  • Generative methods based on SDFs are still lacking
  • An SDF represents only geometry, not texture