Compositional generation
The compositional nature of language allows users to combine concepts in novel ways and control generation. A template prompt describing a primary object (an armchair or a teapot) is stylized with one of 16 materials: avocado, glacier, orchid, pikachu, brain coral, gourd, peach, rubik's cube, doughnut, hibiscus, peacock, sardines, fossil, lotus root, pig, or strawberry. These prompt templates are sourced from DALL-E.
an armchair in the shape of a ____.
an armchair imitating a ____.
a teapot in the shape of a ____.
a teapot imitating a ____.
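As a rough illustration of how the four templates and 16 materials combine, the following sketch enumerates every prompt. The names `TEMPLATES`, `MATERIALS`, and `fill` are hypothetical and not taken from the Dream Fields code. Switching "a" to "an" before a vowel is also an assumption added here for readability; the templates above use a literal "a ____".

```python
# Hypothetical sketch: enumerate compositional prompts from the templates above.
TEMPLATES = [
    "an armchair in the shape of a ____.",
    "an armchair imitating a ____.",
    "a teapot in the shape of a ____.",
    "a teapot imitating a ____.",
]

MATERIALS = [
    "avocado", "glacier", "orchid", "pikachu", "brain coral", "gourd",
    "peach", "rubik's cube", "doughnut", "hibiscus", "peacock", "sardines",
    "fossil", "lotus root", "pig", "strawberry",
]


def fill(template: str, material: str) -> str:
    """Fill the blank in a template with a material name.

    Assumption: the article is changed to "an" when the material starts
    with a vowel; the source templates do not specify this.
    """
    article = "an" if material[0] in "aeiou" else "a"
    return template.replace("a ____", f"{article} {material}")


# One prompt per (template, material) pair: 4 templates x 16 materials.
prompts = [fill(t, m) for t in TEMPLATES for m in MATERIALS]

if __name__ == "__main__":
    for p in prompts:
        print(p)
```

Each of the resulting 64 strings could then serve as the caption for one generated object.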
Related publications
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
Ajay Jain, Matthew Tancik, Pieter Abbeel. ICCV 2021 (International Conference on Computer Vision).
DietNeRF regularizes Neural Radiance Fields with a CLIP-based loss to improve 3D reconstruction. Given only a few images of an object or scene, we reconstruct its 3D structure & render novel views using prior knowledge contained in large image encoders.
Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
Jonathan T. Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, Pratul Srinivasan. ICCV 2021 (International Conference on Computer Vision).
NeRF is aliased, but we can anti-alias it by casting cones and prefiltering the positional encoding function. Dream Fields combine mip-NeRF's integrated positional encoding with Fourier features.
Citation
Ajay Jain, Ben Mildenhall, Jonathan T. Barron, Pieter Abbeel, Ben Poole. Zero-Shot Text-Guided Object Generation with Dream Fields. arXiv, 2021.
@article{jain2021dreamfields,
author = {Jain, Ajay and Mildenhall, Ben and Barron, Jonathan T. and Abbeel, Pieter and Poole, Ben},
title = {Zero-Shot Text-Guided Object Generation with Dream Fields},
journal = {CVPR},
year = {2022},
}