Stanford-ORB

A Real-World 3D Object Inverse Rendering Benchmark

NeurIPS 2023 Datasets & Benchmarks Track

TL;DR

We present Stanford-ORB, a novel real-world 3D object inverse rendering benchmark for evaluating object inverse rendering methods. The benchmark consists of:

  • 2,795 HDR images of 14 objects captured in 7 in-the-wild scenes (each object is captured in 3 scenes);
  • 418 HDR ground-truth environment maps aligned with the image captures;
  • Studio-captured textured meshes of all objects;
  • A set of comprehensive benchmarks for inverse rendering evaluation.

The benchmark is designed to be plug-and-play: all data has been cleaned and organized into the most common structures (Blender, LLFF, and COLMAP). We also provide scripts for data loading and evaluation, along with results from various state-of-the-art methods. To test your model, check our GitHub page for more details.

Gallery

Scanned Meshes (textures generated by NVDiffRec)

Grogu

Curry

Gnome

Teapot

Car

Pepsi

Image Captures

Image Gallery

Overview


The figure shows the overall data capture pipeline. For each object, Left: we obtain its 3D shape using a 3D scanner and its Physics-Based Rendering (PBR) materials using high-quality light-box images. Middle: we also capture multi-view masked images in 3 different in-the-wild scenes, together with the ground-truth environment maps. Right: we carefully register the camera poses for all images using the scanned mesh and recovered materials, and prepare the data for the evaluation benchmarks. Credit to Maurice Svay for the low-poly camera mesh model.

Benchmark Design

Our benchmark is based on single-scene reconstruction. That is, to test a model, it should be trained on one of our data points (i.e., the images of one object captured in one scene) and evaluated one data point at a time. The evaluation includes:

  • Novel View Synthesis: evaluating the inferred novel views in the same scene;
  • Novel Scene Relighting: evaluating the inferred novel views in novel scenes, given the ground-truth environment maps of those scenes;
  • Geometry Estimation: evaluating the reconstructed 3D geometry.
For more details of our benchmark, please refer to our GitHub page.
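The image metrics below come in two PSNR variants: PSNR-H is computed on the HDR images and PSNR-L on their LDR versions. The following is a minimal sketch of that distinction, assuming a simple clip-and-gamma-2.2 tone map for the LDR conversion; the official evaluation scripts define the exact procedure (e.g., any per-image scale alignment), so treat this only as an illustration.

```python
import numpy as np

def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((pred - gt) ** 2)
    return 10.0 * np.log10(max_val ** 2 / mse)

def tonemap(hdr):
    """Simplified HDR-to-LDR conversion: clip to [0, 1], then gamma 2.2."""
    return np.clip(hdr, 0.0, 1.0) ** (1.0 / 2.2)

def psnr_h_l(pred_hdr, gt_hdr):
    """PSNR-H on raw HDR values; PSNR-L on tone-mapped LDR values."""
    return psnr(pred_hdr, gt_hdr), psnr(tonemap(pred_hdr), tonemap(gt_hdr))

# Demo on random data with a small constant offset as the "prediction error":
rng = np.random.default_rng(0)
gt = rng.uniform(0.0, 1.0, size=(4, 4, 3))
pred = np.clip(gt + 0.01, 0.0, 1.0)
h, l = psnr_h_l(pred, gt)
print(f"PSNR-H={h:.1f} dB, PSNR-L={l:.1f} dB")
```

SSIM and LPIPS are computed with the standard implementations (e.g., the `lpips` package); see the evaluation scripts on our GitHub page.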

Results

Quantitative Results


Columns 3-5 report Geometry Estimation errors (Depth↓, Normal↓, Shape↓); columns 6-9 report Novel Scene Relighting and columns 10-13 Novel View Synthesis (PSNR-H↑, PSNR-L↑, SSIM↑, LPIPS↓ each).

| Method | Training Views | Depth↓ | Normal↓ | Shape↓ | PSNR-H↑ | PSNR-L↑ | SSIM↑ | LPIPS↓ | PSNR-H↑ | PSNR-L↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| *Latest Methods* | | | | | | | | | | | | |
| Neural-PBIR (ICCV 2023) | All | 0.30 | 0.06 | 0.43 | 26.01 | 33.26 | 0.979 | 0.023 | 28.82 | 36.80 | 0.986 | 0.019 |
| IllumiNeRF (NeurIPS 2024) | All | N/A | N/A | N/A | 25.56 | 32.74 | 0.976 | 0.027 | N/A | N/A | N/A | N/A |
| RelitLRM | 6 | N/A | N/A | N/A | 24.67 | 31.52 | 0.969 | 0.032 | N/A | N/A | N/A | N/A |
| *Novel View Synthesis / 3D Reconstruction Methods* | | | | | | | | | | | | |
| IDR | All | 0.35 | 0.05 | 0.30 | N/A | N/A | N/A | N/A | 30.11 | 39.66 | 0.990 | 0.017 |
| NeRF | All | 2.19 | 0.62 | 62.05 | N/A | N/A | N/A | N/A | 26.31 | 33.59 | 0.968 | 0.044 |
| *Material Decomposition Methods* | | | | | | | | | | | | |
| Neural-PIL | All | 0.86 | 0.29 | 4.14 | N/A | N/A | N/A | N/A | 25.79 | 33.35 | 0.963 | 0.051 |
| PhySG | All | 1.90 | 0.17 | 9.28 | 21.81 | 28.11 | 0.960 | 0.055 | 24.24 | 32.15 | 0.974 | 0.047 |
| NVDiffRec | All | 0.31 | 0.06 | 0.62 | 22.91 | 29.72 | 0.963 | 0.039 | 21.94 | 28.44 | 0.969 | 0.030 |
| NeRD | All | 1.39 | 0.28 | 13.70 | 23.29 | 29.65 | 0.957 | 0.059 | 25.83 | 32.61 | 0.963 | 0.054 |
| NeRFactor | All | 0.87 | 0.29 | 9.53 | 23.54 | 30.38 | 0.969 | 0.048 | 26.06 | 33.47 | 0.973 | 0.046 |
| InvRender | All | 0.59 | 0.06 | 0.44 | 23.76 | 30.83 | 0.970 | 0.046 | 25.91 | 34.01 | 0.977 | 0.042 |
| NVDiffRecMC | All | 0.32 | 0.04 | 0.51 | 24.43 | 31.60 | 0.972 | 0.036 | 28.03 | 36.40 | 0.982 | 0.028 |
| *Single-View Prediction Methods* | | | | | | | | | | | | |
| SI-SVBRDF | 1 | 81.48 | 0.29 | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| SIRFS | 1 | N/A | 0.59 | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A |
| *Reference Results* | | | | | | | | | | | | |
| NVDiffRecMC+GT Mesh | All | N/A | N/A | N/A | 25.08 | 32.28 | 0.974 | 0.027 | N/A | N/A | N/A | N/A |
| NVDiffRec+GT Mesh | All | N/A | N/A | N/A | 24.93 | 32.42 | 0.975 | 0.027 | N/A | N/A | N/A | N/A |
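On our reading, the Shape column reports a Chamfer-style distance between points sampled from the reconstructed and ground-truth meshes; the official evaluation scripts define the exact sampling count and scaling. A minimal symmetric Chamfer distance over two point sets looks like this (brute-force, for illustration only):

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).

    Brute-force O(N*M) pairwise distances; real evaluations use KD-trees
    and far more samples.
    """
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)  # (N, M) squared dists
    return np.sqrt(d2.min(axis=1)).mean() + np.sqrt(d2.min(axis=0)).mean()

# Demo: identical point sets have zero distance; a shifted copy does not.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(chamfer_distance(pts, pts))  # 0.0
print(round(chamfer_distance(pts, pts + 0.1), 3))
```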

Qualitative Results

Novel View Synthesis

Download Links

For convenience, we provide separate download links for images organized in different data structures, along with the auxiliary files.

  • blender_LDR.tar.gz (11G): LDR images and camera files (organized in the Blender dataset structure);
  • blender_HDR.tar.gz (72G): HDR images and camera files (organized in the Blender dataset structure);
  • llff_colmap_LDR.tar.gz (11G): LDR images and camera files (organized in the LLFF dataset structure and the COLMAP structure);
  • llff_colmap_HDR.tar.gz (72G): HDR images and camera files (organized in the LLFF dataset structure and the COLMAP structure);
  • ground_truth.tar.gz (4.8G): GT environment maps, 3D meshes, depth maps, normal maps, and pseudo-ground-truth albedo maps.
If your model expects the Blender dataset structure, download blender_LDR.tar.gz, blender_HDR.tar.gz, and ground_truth.tar.gz; otherwise, download llff_colmap_LDR.tar.gz, llff_colmap_HDR.tar.gz, and ground_truth.tar.gz. Note that the HDR dataset is also required for the HDR evaluations even if your model runs on LDR inputs.

Citation

@inproceedings{kuang2023stanfordorb,
  title={Stanford-ORB: a real-world 3D object inverse rendering benchmark},
  author={Kuang, Zhengfei and Zhang, Yunzhi and Yu, Hong-Xing and Agarwala, Samir and Wu, Elliott and Wu, Jiajun and others},
  booktitle={Advances in Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2023}
}

The website template was borrowed from Michaël Gharbi.