A Real-World 3D Object Inverse Rendering Benchmark
NeurIPS 2023 Datasets & Benchmarks Track
We present Stanford-ORB, a new real-world 3D Object inverse Rendering Benchmark for evaluating object inverse rendering methods. The benchmark consists of:
- 2,795 HDR images of 14 objects captured in 7 in-the-wild scenes (each object is captured in 3 scenes);
- 418 HDR ground truth environment maps aligned with image captures;
- Studio-captured textured meshes of all objects;
- A set of comprehensive benchmarks for inverse rendering evaluation.
The benchmark is designed to be plug & play -- all data has been cleaned and organized in the most common structures (i.e. Blender, LLFF, Colmap). We also provide scripts for data loading and evaluation, along with results from various state-of-the-art methods. To test your model, check our GitHub page for more details.
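As an illustration of the Blender-style structure mentioned above, camera poses are conventionally stored in a `transforms*.json` file alongside the images. The filename `transforms_train.json` and the field names below follow the standard Blender/NeRF convention and are assumptions, not confirmed by this page; refer to the official data loading scripts for the authoritative format.

```python
import json
import math
from pathlib import Path


def load_blender_cameras(scene_dir):
    """Load camera poses from a Blender/NeRF-style transforms_train.json.

    Returns a list of (image_path, 4x4 camera-to-world matrix) pairs and
    the horizontal field of view in radians.
    """
    meta = json.loads(Path(scene_dir, "transforms_train.json").read_text())
    fov_x = meta["camera_angle_x"]  # horizontal field of view, in radians
    frames = [
        (Path(scene_dir, frame["file_path"]), frame["transform_matrix"])
        for frame in meta["frames"]
    ]
    return frames, fov_x


def focal_from_fov(width, fov_x):
    """Focal length in pixels, derived from image width and horizontal FoV."""
    return 0.5 * width / math.tan(0.5 * fov_x)
```

The same poses can be converted to LLFF or Colmap conventions; only the on-disk layout differs between the provided downloads.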
Scanned Meshes (Texture generated from NVDiffRec)
The figure shows the overall pipeline of data capture. For each object, Left: we obtain its 3D shape using a 3D scanner and Physics-Based Rendering (PBR) materials using high-quality light box images. Middle: we also capture multi-view masked images in 3 different in-the-wild scenes, together with the ground-truth environment maps. Right: we carefully register the camera poses for all images using the scanned mesh and recovered materials, and prepare the data for the evaluation benchmarks. Credit to Maurice Svay for the low-poly camera mesh model.
Our benchmark is based on single-scene reconstruction. That is, to be tested, a model is trained on one of our data points (i.e. images of one object captured in one scene) and evaluated at a time. The evaluation includes:
- Novel View Synthesis: Evaluating the inferred novel views in the same scene;
- Novel Scene Relighting: Evaluating the inferred novel views in novel scenes, given the ground-truth environment maps;
- Geometry Estimation: Evaluating the reconstructed 3D geometry.
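Benchmarks of this kind typically score rendered views with image-space metrics such as PSNR, and reconstructed geometry with per-pixel errors such as the angular error of surface normals. The two helpers below are illustrative sketches of such metrics, not the official evaluation scripts, which are available on the GitHub page.

```python
import numpy as np


def psnr(pred, gt, max_val=1.0):
    """Peak signal-to-noise ratio between two images with values in [0, max_val]."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)


def normal_angular_error_deg(n_pred, n_gt):
    """Mean angular error in degrees between per-pixel unit normals, shape (H, W, 3)."""
    cos_sim = np.clip(np.sum(n_pred * n_gt, axis=-1), -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_sim)).mean())
```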
| Method | Geometry Estimation | Novel Scene Relighting | Novel View Synthesis |
| --- | --- | --- | --- |
| *Novel View Synthesis / 3D Reconstruction Methods* | | | |
| *Material Decomposition Methods* | | | |
| *Single-View Prediction Methods* | | | |
For convenience, we provide separate download links for images organized in different data structures, as well as the auxiliary files.
- blender_LDR.tar.gz (11G): LDR images and camera files (organized as the Blender Dataset structure);
- blender_HDR.tar.gz (72G): HDR images and camera files (organized as the Blender Dataset structure);
- llff_colmap_LDR.tar.gz (11G): LDR images and camera files (organized as the LLFF Dataset structure and as the Colmap's structure);
- llff_colmap_HDR.tar.gz (72G): HDR images and camera files (organized as the LLFF Dataset structure and as the Colmap's structure);
- ground_truth.tar.gz (4.8G): GT environment maps, 3D meshes, depth maps, normal maps, and pseudo-GT albedo maps.
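The HDR and LDR downloads contain the same captures at different dynamic ranges. If you work from the HDR images but need 8-bit previews, a simple exposure-plus-gamma curve is one common way to tone-map them; the sketch below is an assumption for illustration, and the benchmark's own LDR images may use a different tone-mapping pipeline.

```python
import numpy as np


def hdr_to_ldr(hdr, exposure=1.0, gamma=2.2):
    """Tone-map a linear HDR image (float array) to an 8-bit LDR image.

    Applies exposure scaling, clips to [0, 1], then a gamma curve.
    """
    mapped = np.clip(hdr * exposure, 0.0, 1.0) ** (1.0 / gamma)
    return (mapped * 255.0 + 0.5).astype(np.uint8)
```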
The website template was borrowed from Michaël Gharbi.