PyTorch implementation of the paper "Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields", CVPR 2024.
Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields
Tianqi Liu, Xinyi Ye, Min Shi, Zihao Huang, Zhiyu Pan, Zhan Peng, Zhiguo Cao*
CVPR 2024
project page | paper | poster | model
Generalizable NeRF aims to synthesize novel views for unseen scenes. Common practices involve constructing variance-based cost volumes for geometry reconstruction and encoding 3D descriptors for decoding novel views. However, existing methods show limited generalization ability in challenging conditions due to inaccurate geometry, sub-optimal descriptors, and decoding strategies. We address these issues point by point. First, we find the variance-based cost volume exhibits failure patterns as the features of pixels corresponding to the same point can be inconsistent across different views due to occlusions or reflections. We introduce an Adaptive Cost Aggregation (ACA) approach to amplify the contribution of consistent pixel pairs and suppress inconsistent ones. Unlike previous methods that solely fuse 2D features into descriptors, our approach introduces a Spatial-View Aggregator (SVA) to incorporate 3D context into descriptors through spatial and inter-view interaction. When decoding the descriptors, we observe the two existing decoding strategies excel in different areas, which are complementary. A Consistency-Aware Fusion (CAF) strategy is proposed to leverage the advantages of both. We incorporate the above ACA, SVA, and CAF into a coarse-to-fine framework, termed Geometry-aware Reconstruction and Fusion-refined Rendering (GeFu). GeFu attains state-of-the-art performance across multiple datasets.
git clone https://github.com/TQTQliu/GeFu.git
cd GeFu
conda create -n gefu python=3.8
conda activate gefu
pip install -r requirements.txt
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
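After installation, a quick sanity check can confirm that the environment matches the versions pinned above. The following is a minimal sketch, not part of the repository: the function name `check_environment` and its parameters are hypothetical, and it only reports what it finds rather than enforcing anything.

```python
import importlib.util
import sys


def check_environment(expected_python=(3, 8), expected_torch_prefix="1.9.0"):
    """Report how the current environment compares to the setup commands above.

    Hypothetical helper for illustration; the expected versions are taken
    from the conda/pip commands in this README.
    """
    report = {
        "python_ok": sys.version_info[:2] == expected_python,
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "torch_version_ok": False,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch

        # torch.__version__ looks like "1.9.0+cu111" for the wheel installed above.
        report["torch_version_ok"] = torch.__version__.startswith(expected_torch_prefix)
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If `cuda_available` is reported as False, the CUDA build of PyTorch may not match the installed driver.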
Training data. Download DTU training data and Depth raw. Unzip and organize them as:
mvs_training
    ├── dtu
        ├── Cameras
        ├── Depths
        ├── Depths_raw
        └── Rectified
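Before training, it can help to verify that the unzipped data follows the layout above. The sketch below is an illustration only: `missing_dtu_dirs` is a hypothetical helper, and it assumes you pass the path of the `mvs_training` directory itself. Whether `data_root` in the config should point at `mvs_training` or at `mvs_training/dtu` is not stated here, so check the config before relying on it.

```python
from pathlib import Path

# Subdirectories listed in the layout above.
EXPECTED_SUBDIRS = ("Cameras", "Depths", "Depths_raw", "Rectified")


def missing_dtu_dirs(mvs_training_root):
    """Return the expected DTU subdirectories that are absent under <root>/dtu."""
    dtu_dir = Path(mvs_training_root) / "dtu"
    return [name for name in EXPECTED_SUBDIRS if not (dtu_dir / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dtu_dirs("mvs_training")
    print("layout OK" if not missing else f"missing: {missing}")
```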
Download the NeRF Synthetic and Real Forward-facing datasets and unzip them.
To train a generalizable model from scratch on DTU, first specify data_root in configs/gefu/dtu_pretrain.yaml, then run:
python train_net.py --cfg_file configs/gefu/dtu_pretrain.yaml
Our code also supports multi-GPU training. The released pretrained model was trained with 4 GPUs:
python -m torch.distributed.launch --nproc_per_node=4 train_net.py --cfg_file configs/gefu/dtu_pretrain.yaml distributed True gpus 0,1,2,3
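To fine-tune the pretrained model on a specific scene, copy the pretrained checkpoint into a new directory and train with that scene's config. Taking scan1 of DTU as an example: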
cd ./trained_model/gefu
mkdir dtu_ft_scan1
cp dtu_pretrain/latest.pth dtu_ft_scan1
cd ../..
python train_net.py --cfg_file configs/gefu/dtu/scan1.yaml
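The checkpoint-copy steps above can also be scripted when fine-tuning several scenes. This is a minimal sketch under stated assumptions: `prepare_finetune_dir` is a hypothetical helper, and the `dtu_ft_<scene>` naming pattern is generalized from the single `dtu_ft_scan1` example, so confirm the directory name each scene config expects before relying on it.

```python
import shutil
from pathlib import Path


def prepare_finetune_dir(scene, model_root="trained_model/gefu"):
    """Copy the pretrained checkpoint into a per-scene fine-tuning directory.

    Mirrors the shell steps above:
        mkdir dtu_ft_<scene>
        cp dtu_pretrain/latest.pth dtu_ft_<scene>
    The dtu_ft_<scene> pattern is an assumption based on the scan1 example.
    """
    model_root = Path(model_root)
    src = model_root / "dtu_pretrain" / "latest.pth"
    if not src.is_file():
        raise FileNotFoundError(f"pretrained checkpoint not found: {src}")
    dst_dir = model_root / f"dtu_ft_{scene}"
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / "latest.pth"
    shutil.copy2(src, dst)
    return dst
```

Calling `prepare_finetune_dir("scan1")` from the repository root reproduces the manual steps, after which the fine-tuning command can be run as shown.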
We provide the fine-tuned models for each scene here.
Download the pretrained model and save it as trained_model/gefu/dtu_pretrain/latest.pth.
Use the following command to evaluate the pretrained model on DTU:
python run.py --type evaluate --cfg_file configs/gefu/dtu_pretrain.yaml gefu.eval_depth True
The rendered images will be saved in result/gefu/dtu_pretrain. Append save_video True to the end of the command to also save the rendered videos.
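Similarly, to evaluate the pretrained model on the Real Forward-facing (LLFF) and NeRF Synthetic datasets:
python run.py --type evaluate --cfg_file configs/gefu/llff_eval.yaml
python run.py --type evaluate --cfg_file configs/gefu/nerf_eval.yaml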
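To run all three evaluations in one go, the commands above can be assembled programmatically. The sketch below only builds and prints the command lines; the `EVAL_CONFIGS` mapping and `build_eval_command` are hypothetical names, and the extra options shown are the ones documented above for the DTU command. Whether they apply to the other configs is not stated here.

```python
import shlex

# Config files taken from the evaluation commands above; the short keys are illustrative.
EVAL_CONFIGS = {
    "dtu": "configs/gefu/dtu_pretrain.yaml",
    "llff": "configs/gefu/llff_eval.yaml",
    "nerf": "configs/gefu/nerf_eval.yaml",
}


def build_eval_command(dataset, extra_opts=()):
    """Return the argument list for evaluating on one dataset.

    extra_opts are appended verbatim as trailing "key value" overrides,
    e.g. ("gefu.eval_depth", "True").
    """
    cmd = ["python", "run.py", "--type", "evaluate", "--cfg_file", EVAL_CONFIGS[dataset]]
    cmd.extend(extra_opts)
    return cmd


if __name__ == "__main__":
    print(shlex.join(build_eval_command("dtu", ("gefu.eval_depth", "True"))))
    print(shlex.join(build_eval_command("llff")))
    print(shlex.join(build_eval_command("nerf")))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to execute the evaluation.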
If you find our work useful for your research, please cite our paper.
@InProceedings{Liu_2024_CVPR,
author = {Liu, Tianqi and Ye, Xinyi and Shi, Min and Huang, Zihao and Pan, Zhiyu and Peng, Zhan and Cao, Zhiguo},
title = {Geometry-aware Reconstruction and Fusion-refined Rendering for Generalizable Neural Radiance Fields},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2024},
pages = {7654-7663}
}
- PixelNeRF: Neural Radiance Fields from One or Few Images, CVPR 2021
- IBRNet: Learning Multi-View Image-Based Rendering, CVPR 2021
- MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo, ICCV 2021
- Neural Rays for Occlusion-aware Image-based Rendering, CVPR 2022
- ENeRF: Efficient Neural Radiance Fields for Interactive Free-viewpoint Video, SIGGRAPH Asia 2022
- Is Attention All NeRF Needs?, ICLR 2023
- Explicit Correspondence Matching for Generalizable Neural Radiance Fields, arXiv 2023
The project is mainly based on ENeRF. Many thanks for their excellent contributions! When using our code, please also pay attention to the license of ENeRF.