EvoScene: Self-Evolving 3D Scene Generation from a Single Image

Kaizhi Zheng, Yue Fan, Jing Gu, Zishuo Xu, Xuehai He, Xin Eric Wang

University of California, Santa Cruz; University of California, Santa Barbara

Generating high-quality, textured 3D scenes from a single image remains a fundamental challenge in vision and graphics. Recent image-to-3D generators recover reasonable geometry from single views, but their object-centric training limits generalization to complex, large-scale scenes with faithful structure and texture. We present EvoScene, a self-evolving, training-free framework that progressively reconstructs complete 3D scenes from single images. The key idea is combining the complementary strengths of existing models: geometric reasoning from 3D generation models and visual knowledge from video generation models. Through three iterative stages—Spatial Prior Initialization, Visual-guided 3D Scene Mesh Generation, and Spatial-guided Novel View Generation—EvoScene alternates between 2D and 3D domains, gradually improving both structure and appearance. Experiments on diverse scenes demonstrate that EvoScene achieves superior geometric stability, view-consistent textures, and unseen-region completion compared to strong baselines, producing ready-to-use 3D meshes for practical applications.

Model Architecture

Getting Started

Prerequisites

System: The code is currently tested only on Linux.
Hardware: An NVIDIA GPU with at least 70GB of memory is required. The code has been verified on an NVIDIA H100 GPU. (Video generation is the most VRAM-intensive step.)
Software:
- The CUDA Toolkit is needed to compile certain submodules. The code has been tested with CUDA version 12.8.
- Conda is recommended for managing dependencies.
- Python version 3.8 or higher is required.
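
The prerequisites above can be checked before installation with a short script. This is a minimal sketch rather than part of EvoScene: the function names are hypothetical, and reading "70GB" as 70 GiB (71680 MiB) is an assumption, since the README does not say which unit is meant. GPU memory is queried through `nvidia-smi`.

```python
import platform
import shutil
import subprocess
import sys

# Assumption: the README's "70GB" is read as 70 GiB (70 * 1024 MiB).
MIN_GPU_MEM_MIB = 70 * 1024
MIN_PYTHON = (3, 8)


def find_problems(system, python_version, gpu_mem_mib):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if system != "Linux":
        problems.append(f"untested OS: {system} (only Linux is tested)")
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.8 or higher is required")
    if not gpu_mem_mib:
        problems.append("no NVIDIA GPU detected")
    elif max(gpu_mem_mib) < MIN_GPU_MEM_MIB:
        problems.append(f"largest GPU has {max(gpu_mem_mib)} MiB, below {MIN_GPU_MEM_MIB} MiB")
    return problems


def query_gpu_mem_mib():
    """Ask nvidia-smi for the total memory of each GPU in MiB; return [] if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return []
    return [int(line) for line in result.stdout.splitlines() if line.strip().isdigit()]


if __name__ == "__main__":
    issues = find_problems(platform.system(), sys.version_info, query_gpu_mem_mib())
    print("\n".join(issues) if issues else "All prerequisite checks passed.")
```

The check only looks at the largest single GPU, because the README states the memory requirement per GPU and does not describe multi-GPU execution.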

Installation

Clone the repo

```
git clone https://github.com/eric-ai-lab/EvoScene.git
cd EvoScene
```

Install the dependencies

Before running the following command, there are some things to note:
- By adding --new-env, a new conda environment named evoscene will be created. If you want to use an existing conda environment, please remove this flag.
- By default, the evoscene environment uses PyTorch 2.8.0 with CUDA 12.8. If you want to use a different version of CUDA (e.g., if you have CUDA Toolkit 12.2 installed and do not want to install another 12.8 version for submodule compilation), you can remove the --new-env flag and manually install the required dependencies. Refer to the PyTorch website for the installation command.
- If you have multiple CUDA Toolkit versions installed, PATH should be set to the correct version before running the command. For example, if you have CUDA Toolkit 12.2 and 12.8 installed, you should run export PATH=/usr/local/cuda-12.8/bin:$PATH before running the command.
Create a new conda environment named evoscene and install the dependencies:
```
. ./setup.sh --new-env --basic --xformers --flash-attn --diffoctreerast --spconv --mipgaussian --kaolin --nvdiffrast
```
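
Because the submodules are compiled with whichever `nvcc` comes first on PATH, it can help to confirm the active CUDA Toolkit version before running the setup script. The sketch below is illustrative only: the helper names and the `EXPECTED_CUDA` constant are hypothetical, and it assumes that `nvcc --version` prints a line containing `release <major>.<minor>`.

```python
import re
import shutil
import subprocess

# Version the README reports as tested; adjust if you installed dependencies manually.
EXPECTED_CUDA = "12.8"


def parse_nvcc_release(version_text):
    """Extract the 'major.minor' release from `nvcc --version` output, or None."""
    match = re.search(r"release\s+(\d+\.\d+)", version_text)
    return match.group(1) if match else None


def active_cuda_release():
    """Return (path, release) for the first nvcc on PATH, or (None, None)."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None, None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return nvcc, parse_nvcc_release(result.stdout)


if __name__ == "__main__":
    path, release = active_cuda_release()
    if release is None:
        print("nvcc not found on PATH (or its output could not be parsed)")
    elif release != EXPECTED_CUDA:
        print(f"{path} is CUDA {release}, expected {EXPECTED_CUDA}; check PATH")
    else:
        print(f"{path} is CUDA {release}")
```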

Demo

We provide several example images for the demo. The following command runs main.py on one of them:

```
python3 main.py --image_path_list assets/example_images/scene_0.png --save_folder output/scene_0
```

The results are saved in output/scene_0, and the final mesh is saved as output/scene_0/iteration_2/final_scene.glb.
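
One quick way to confirm that the final mesh was written completely is to read the GLB header. The sketch below relies only on the binary glTF container layout (a 12-byte little-endian header holding the magic `glTF`, a version, and the total file length). The helper names are hypothetical and not part of EvoScene.

```python
import os
import struct


def read_glb_header(path):
    """Return (version, declared_length) from a binary glTF header; raise ValueError if invalid."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        raise ValueError("file is shorter than the 12-byte GLB header")
    magic, version, length = struct.unpack("<4sII", header)
    if magic != b"glTF":
        raise ValueError(f"unexpected magic bytes: {magic!r}")
    return version, length


def looks_complete(path):
    """True if the header is valid and the declared length matches the size on disk."""
    _, length = read_glb_header(path)
    return length == os.path.getsize(path)


if __name__ == "__main__":
    mesh = "output/scene_0/iteration_2/final_scene.glb"
    if os.path.exists(mesh):
        print(read_glb_header(mesh), looks_complete(mesh))
    else:
        print(f"{mesh} not found; run the demo first")
```

A mismatch between the declared length and the file size usually points to an interrupted write. It says nothing about the quality of the mesh itself.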

Note: You can adjust hyperparameters such as inference steps, guidance scale, texture size, and other generation parameters in config.py.
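
To process all example images, the demo command can be repeated once per image. The following is a minimal sketch that uses only the two flags shown in the demo command. It assumes the example images are PNG files in assets/example_images, and the `--run` switch belongs to this sketch, not to main.py. Without `--run` it only prints the commands.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def build_command(image_path, output_root="output"):
    """Build the main.py invocation for one image, mirroring the README's demo command."""
    image_path = Path(image_path)
    save_folder = Path(output_root) / image_path.stem
    return [
        "python3", "main.py",
        "--image_path_list", str(image_path),
        "--save_folder", str(save_folder),
    ]


def build_commands(image_dir="assets/example_images", output_root="output"):
    """One command per PNG in image_dir, in sorted order (the *.png pattern is an assumption)."""
    image_dir = Path(image_dir)
    if not image_dir.is_dir():
        return []
    return [build_command(p, output_root) for p in sorted(image_dir.glob("*.png"))]


if __name__ == "__main__":
    run = "--run" in sys.argv  # hypothetical flag of this sketch, not of main.py
    for cmd in build_commands():
        print(shlex.join(cmd))
        if run:
            # Run one scene at a time; stop on the first failure.
            subprocess.run(cmd, check=True)
```

Running the scenes sequentially keeps peak GPU memory at the single-scene level, which matters given the memory requirement stated in the prerequisites.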

Blender Rendering

We provide a script for re-rendering the scene in Blender. To run it, please download Blender 4.5 first.

```
# Download Blender 4.5
sudo apt-get update && sudo apt-get install -y libsm6 libice6
wget https://download.blender.org/release/Blender4.5/blender-4.5.4-linux-x64.tar.xz
# The archive is xz-compressed, so let tar detect the compression instead of forcing gzip (-z)
tar -xvf blender-4.5.4-linux-x64.tar.xz

# Run the render script
python3 blender_render/render.py
```
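
Before rendering, it may be worth confirming that the extracted Blender binary runs and reports version 4.5. The sketch below assumes the archive was extracted in the current directory and unpacks into a folder named after the tarball, with the `blender` executable at its top level. How render.py locates Blender is not described here, so this check is independent of it.

```python
import re
import subprocess
from pathlib import Path

# Assumption: the archive was extracted in the current directory and unpacks to
# a folder named after the tarball, with the `blender` executable at its top level.
BLENDER_BIN = Path("blender-4.5.4-linux-x64") / "blender"


def parse_blender_version(version_text):
    """Return (major, minor) from `blender --version` output, or None if not found."""
    match = re.search(r"Blender\s+(\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


if __name__ == "__main__":
    if not BLENDER_BIN.is_file():
        print(f"{BLENDER_BIN} not found; extract the archive first")
    else:
        out = subprocess.run([str(BLENDER_BIN), "--version"], capture_output=True, text=True)
        print("Detected Blender version:", parse_blender_version(out.stdout))
```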

If you find EvoScene useful in your research or applications, please cite as below:

```
@article{zheng2025self,
  title={Self-Evolving 3D Scene Generation from a Single Image},
  author={Zheng, Kaizhi and Fan, Yue and Gu, Jing and Xu, Zishuo and He, Xuehai and Wang, Xin Eric},
  journal={arXiv preprint arXiv:2512.08905},
  year={2025}
}
```
